Captain Harlock

WebGL slow when rendering 10k triangles


I'm using WebGL to render N triangles in 2D.

The triangles' geometry and colors are random, computed once and placed in buffers at startup. My shaders are super simple.

I then render them with a single drawArrays call and animate via requestAnimationFrame. The FPS drops rapidly as N grows (both on my PC and my Mac Pro).

When N = 10k, it's painfully slow.

All over the web I see smooth demos with very large numbers of triangles.

What I'm doing is very simple:

https://jsfiddle.net/CaptainHarlock/mphd96L5/

You can vary the number of triangles and see performance go down.

I don't get it. Isn't WebGL supposed to excel at this type of thing? What am I doing wrong?

Quote

I don't get it. Isn't WebGL supposed to excel at this type of thing? What am I doing wrong?

GPU acceleration is great when you understand how to take advantage of it.

Things I think could help:

- Use an index buffer and put all vertex attribute data in a single buffer. This way you avoid switching between buffers.

- If you are not updating buffers then use gl.STATIC_DRAW.

- I think this is the most relevant for your test. A single draw call doesn't automatically mean better performance. Split your triangles into reasonably sized batches and make multiple draw calls, maybe limiting each batch to 4000 triangles or fewer.

I feel the main issue with your example is memory overhead.

You can look at the code of this tiny canvas implementation I did if you want some reference, it's not the fastest 2D renderer but it uses the stuff I mentioned: https://github.com/bitnenfer/tiny-canvas/blob/master/src/canvas.js

Cheers. 


Hi,

Thanks for your response. I have a few follow up questions:

- I will end up modifying the buffer, but for the sake of the experiment I tried with STATIC_DRAW. I couldn't notice any difference in performance, at least on this example.

- The vertex attribute data is already in one buffer.

- I don't use an index buffer with drawElements. Why would that be better than drawArrays? In my case I never reuse vertices, so the amount of data to transfer with drawElements would be strictly greater than with drawArrays.

- What do you mean by "memory overhead"?


Hi,

I feel the first two points I mentioned should be combined with the last one: once you've split into batches, apply the first two as well. What I mean by memory overhead is that pushing 10k triangles in one go is probably too much for your current hardware.

Also, you are currently using two buffers, one for vertex positions and the other for vertex colors. What I suggest is putting both in the same buffer, which would make its layout something like: [x0, y0, argb0, x1, y1, argb1, ...].

This is how you could use a single VBO to store your vertex data (position and color):

var VBO, aWorldCoordsLocation, aColorLocation;
// vertexSize is the size in bytes of one vertex:
// two 32-bit floats for position plus 4 color bytes,
// (float + float + (byte * 4)) = 12 bytes.
var vertexSize = (4 * 2) + (4);

VBO = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, VBO);

aWorldCoordsLocation = gl.getAttribLocation(program, 'aWorldCoords');
aColorLocation = gl.getAttribLocation(program, 'aColor');

gl.enableVertexAttribArray(aWorldCoordsLocation);
gl.enableVertexAttribArray(aColorLocation);

// Position: 2 floats at offset 0.
gl.vertexAttribPointer(aWorldCoordsLocation, 2, gl.FLOAT, false, vertexSize, 0);

// Color: 4 unsigned bytes, normalized to [0, 1], at offset 8
// (the size of the vec2 position in bytes).
gl.vertexAttribPointer(aColorLocation, 4, gl.UNSIGNED_BYTE, true, vertexSize, 8);

 

Cheers


I'll give the idea of one unique buffer for vertices and colors a try.

But I want to come back to your idea of using multiple batches of "reasonable" sizes.

Suppose I choose k batches, each containing N/k triangles. I'll end up calling drawArrays k times, each with 1/k the amount of data.

Could you explain why this would help? What's the fundamental reason? Isn't the GPU doing the same amount of work? Is parallelization helping? Why does it matter if I make k calls vs 1?

It is certainly counterintuitive to me, as I would expect that the overhead of k calls from javascript would make it worse. Please help me understand.


I might be way off the mark here, but isn't the slowest bit of the process the geometry upload in the first place? 10k triangles is naff all (I'm assuming your GPU isn't total garbage), so maybe it's just that recreating the geometry each frame (are you doing that?) is the slow bit. My MBP, with its fairly average GPU, can get up to about 280k triangles in the Pixi Bunnymark before it starts to drop below 60fps, and that includes appending a new sprite to the VBO each frame (slight assumption here, it's not totally clear how Pixi manages that) and updating transforms for all sprites each frame. I've done a similar quick test using stackGL modules and a bit of hacking. I couldn't get near Pixi's performance (around 200k triangles, though, so decent enough for 2D games), but I'm pretty sure I remember it being far, far slower when uploading new geometry each frame, and it obviously choked very hard as the number of sprites (and thus triangles) rose.

Actually, this might be a naff answer (I'll leave it here in case it helps you towards a proper solution): I used the gl-sprite-batch module and cleared and re-created the batch each frame, though the JS array of sprites it used was simply added to rather than recreated each frame. I'm trying to remember how I managed to kill performance (my initial test had very similar results to yours, much lower than expected or desired); maybe I didn't start with the sprite-batch module and tried doing it manually, and obviously sprite-batch contains some clever stuff.


Seems very weird to me. I just profiled the jsfiddle example, and I've never seen results like this: most of the time shows up as "idle", then "program", and only a tiny bit is your code.
I guess it could be too many pixels, as mentioned; that would kind of explain the idle CPU time, as the GPU is doing too much work and can't keep up.
OK, I also accept that answer :D TIL

