GPU Computing

elessar.perm · January 6, 2015

Hi guys!

I'm working on realistic ocean simulation for a browser game at the moment.

The best-known way to simulate ocean waves is Jerry Tessendorf's method, which uses a statistical model.

I won't paste any formulas here, to keep things simple. The core problem is that the calculations are expensive, and I don't want to compute the water heightmap on the CPU in the browser: the algorithm parallelizes very well, and a GPU can compute the grid much faster.

Is there any way to use GPU computing from babylon.js?

I'm thinking about using a shader with a render target texture to generate the heightmap, then using the results in the physics simulation in JavaScript and passing them to the shader material that renders the water surface.

Is it worth it? Can anyone suggest other methods?

Thanks!

JCPalmer · January 6, 2015

GPGPU for WebGL, welcome to the weeds! Before OpenCL came out, I had tried to use OpenGL 2.0 for GPGPU. I got into an Nvidia developer program to test OpenCL at the first opportunity. It was much easier.

I see major issues using OpenGL ES 2.0 for GPGPU. OpenGL 2.0 was bad enough. Basically, you build a vertex shader with a single orthographic quad, and pass any substantial data up as textures. The main part of the program is one fragment shader, or a series of them, which read the textures. I cannot quite remember how you got the data back to the CPU.

OpenGL ES 2.0 does not support quads, so you'll have to build 2 triangles. That makes basing your calculation on your location in the quad more involved. You probably have to get the vertex shader involved to know which triangle you are in.
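As a rough illustration of the two-triangle layout, here is a minimal sketch in plain JavaScript. The function name `buildScreenQuad` and the field names are hypothetical; the only assumptions are the usual clip-space range of -1 to 1 and UVs running from 0 to 1. If the vertex shader forwards the UV as a varying, the fragment shader gets its location in the quad regardless of which triangle it falls in.

```javascript
// Hypothetical helper: vertex data for a full-screen quad built from two triangles.
function buildScreenQuad() {
  return {
    // x, y, z per vertex, already in clip space (-1..1)
    positions: new Float32Array([
      -1, -1, 0, // 0: bottom-left
       1, -1, 0, // 1: bottom-right
       1,  1, 0, // 2: top-right
      -1,  1, 0, // 3: top-left
    ]),
    // u, v per vertex (0..1), matching the corners above
    uvs: new Float32Array([0, 0, 1, 0, 1, 1, 0, 1]),
    // Two triangles sharing the diagonal from vertex 0 to vertex 2
    indices: new Uint16Array([0, 1, 2, 0, 2, 3]),
  };
}
```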

Then interoperating with BabylonJS seems difficult. Web workers seem like a better way to go, even if they are async. There are virtually no single-core CPUs on the market today, and BabylonJS only uses 1 core. Flooring a separate core seems much more attractive.

julien-moreau · January 6, 2015

Hello elessar.perm !

I love what you're going to do

Something you can do (because we don't have Compute Shaders :'( ) is to create a "Screen Quad" for your height map calculation. You can see how the screen quad vertices are laid out in this file, for example: https://github.com/clbr/MLAA-test-app/blob/master/screenquad.h

Once you have your ScreenQuad mesh, apply a ShaderMaterial that will generate your height map using one or multiple passes into your RTT(s).

Basically, the vertex program of the Screen Quad should look like (to be fullscreen) :

attribute vec3 position;
attribute vec2 uv;
varying vec2 vUV;
void main(void) {
    gl_Position = vec4(position, 1.0);
    vUV = uv;
}

And in your pixel program you'll calculate the height map.

It's only an idea, not sure it will work !

May the force be with you !

elessar.perm · January 6, 2015

Exactly what I wanted to do.

But the main question is whether it will be significantly faster than the web worker version.

julien-moreau · January 6, 2015

Of course it will, because the vertex and pixel operations you want to do are faster on GPUs in almost all cases ^^

You can read the little article I wrote about CPU & GPU computations at : https://medium.com/community-play-3d/computing-your-own-depth-shadow-pass-into-cp3d-439293b36457

There is a performance comparison between both methods at the end.

elessar.perm · January 6, 2015

I mean I'm not sure that babylon.js will perform all the necessary steps fast enough, such as mapping buffers and transferring data to the GPU.

These operations may really be the bottleneck of this method, outweighing the performance gain from the GPU calculations.

But I'll try of course.

Romanichel_2.0 · January 6, 2015

If you don't have to read the data back on the CPU (i.e. the wave simulation is only used as input to other shaders), it should be doable (maybe not in every browser or device).

You have to render a quad and store your simulation data in textures the GPU can both read and write (if I recall, for wave simulations, you need access to frames n-1 and n-2 to compute frame n), then use the data produced in the texture inside your rendering vertex shader (you animate the vertex positions of a grid).
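To show why frames n-1 and n-2 are both needed, here is a small CPU-side sketch of one step of a generic 1D finite-difference wave equation. This is only an illustration of the data dependency, not Tessendorf's method and not necessarily the scheme described above; the names `stepWave` and `rotateBuffers` and the parameter `c2` (squared wave speed times the squared time step over the squared cell size) are made up for the example. On the GPU the three arrays would presumably be three render target textures whose roles rotate each frame.

```javascript
// One explicit time step of a 1D discrete wave equation (illustrative only).
// prev holds frame n-2, cur holds frame n-1, next receives frame n.
function stepWave(prev, cur, next, c2) {
  const n = cur.length;
  // Fixed (zero) boundaries keep the example short.
  next[0] = 0;
  next[n - 1] = 0;
  for (let i = 1; i < n - 1; i++) {
    const laplacian = cur[i - 1] + cur[i + 1] - 2 * cur[i];
    next[i] = 2 * cur[i] - prev[i] + c2 * laplacian;
  }
}

// Rotate the roles instead of copying data:
// the old "cur" becomes "prev", the old "next" becomes "cur",
// and the old "prev" is reused as the scratch target.
function rotateBuffers(buffers) {
  const [prev, cur, next] = buffers;
  return [cur, next, prev];
}
```

Each frame would call `stepWave` and then `rotateBuffers`, so no array is allocated after start-up.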

In the DX9.1 era, with some Nvidia custom extensions, I used to do something quite similar.

elessar.perm · January 6, 2015

Unfortunately I have to read back the data because I need it for ship physics simulation

JCPalmer · January 6, 2015

CPU time need not be less than GPU time + transfer back.

If you can live with your height map being one frame behind, CPU time only has to be less than BJS CPU time + GPU time. This is because you will be running async on an otherwise unused CPU core.

Executing on the GPU will need to be sync, and takes time away from the rest. A better question is: what will give better throughput? Also, in a photo finish, I would always go the less exotic way (web worker). If you need some OpenGL extensions, you could have device issues.

elessar.perm · January 6, 2015

Thank you all for your answers! I think I will try the web worker approach and see whether the speed is sufficient. If not, I will think further.

I will post the results when it's done.

JCPalmer · January 7, 2015

I think it would be wise to actually write an inline version first, which can be adapted for a web worker if required. "Right first, Fast later". You could be fighting too many simultaneous battles by attempting to go straight to a web worker.

Also, I would avoid objects like BABYLON.Vector3. Put the output of your height map inside a Float32Array if possible. 3 reasons:

1. Typed arrays are known for slightly slower initialization, but also slightly faster access. I use them extensively, and found that at least they are not slower. It is difficult to actually measure. You only need to create the array once.
2. Typed arrays are not inside the VM heap, unlike an array of BABYLON.Vector3. If you have to use Vector3, make sure you do NOT create them over and over. Use the InPlace methods (e.g. addInPlace()), or assign .x, .y, .z directly. Throwaway instances put a lot of pressure on the heap, causing more garbage collection.
3. If this were ever to come from the GPU, a Float32Array is how it would come.
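A minimal sketch of what that could look like: one flat Float32Array, created once and overwritten in place each frame. The helper names (`createHeightmap`, `heightIndex`, `fillHeightmap`) and the row-major layout are assumptions made for the example, and `heightAt` stands in for whatever wave function you end up with.

```javascript
// Hypothetical heightmap stored as a single flat Float32Array (row-major).
function createHeightmap(size) {
  return { size, data: new Float32Array(size * size) };
}

// Index of grid cell (x, z) inside the flat array.
function heightIndex(map, x, z) {
  return z * map.size + x;
}

// Overwrites every cell in place; no per-cell objects are allocated,
// so repeated calls do not add garbage-collection pressure.
function fillHeightmap(map, heightAt, time) {
  for (let z = 0; z < map.size; z++) {
    for (let x = 0; x < map.size; x++) {
      map.data[heightIndex(map, x, z)] = heightAt(x, z, time);
    }
  }
  return map;
}
```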

If this were changed to a web worker, you would just create 2 arrays: the current one for Babylon to use, and the future one being updated by the web worker.
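The two-array idea can be sketched roughly as follows. The names `createDoubleBuffer` and `swapBuffers` are hypothetical, and the sketch leaves out the worker itself; how the finished array travels between threads (for example `postMessage` with a transferable ArrayBuffer) is a separate decision.

```javascript
// Hypothetical double buffer for the heightmap.
function createDoubleBuffer(length) {
  return {
    current: new Float32Array(length), // read by the render loop
    future: new Float32Array(length),  // being filled for the next frame
  };
}

// Call this only once the future array has been completely written.
// The old current array becomes the next scratch target.
function swapBuffers(buffers) {
  const finished = buffers.future;
  buffers.future = buffers.current;
  buffers.current = finished;
}
```

Because the render loop only ever reads `current`, it never sees a half-updated heightmap; the price is the one-frame delay already mentioned.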

elessar.perm · January 7, 2015

Thank you for your help. Of course, I will start with a basic version and add parallelization later. But I hadn't thought about typed arrays. Sounds good.

JCPalmer · January 8, 2015

For future search results:

I double-checked returning data from OpenGL ES 2.0, since I knew this was an obvious area to cut back on for mobile. The function gl.readPixels is in ES, but it is hobbled to only return a Uint8Array. OpenGL 2.0 can return 20 different formats.
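One possible way around the byte-only readback, which nobody in this thread has tested, is to spread each height over two 8-bit channels and reassemble it in JavaScript. The sketch below assumes the fragment shader writes a height normalized to the range 0..1 with the high byte in R and the low byte in G; that encoding and both function names are assumptions for the example. `encodeHeight16` only mirrors on the CPU what the shader would have to do.

```javascript
// Encode a height in [0, 1] as two bytes: [high, low].
// Mirrors what the (assumed) fragment shader would write to R and G.
function encodeHeight16(h) {
  const clamped = Math.min(1, Math.max(0, h));
  const v = Math.round(clamped * 65535);
  return [v >> 8, v & 255];
}

// Decode one texel from an RGBA byte array such as the one readPixels fills.
// texelIndex counts texels, so each step is 4 bytes.
function decodeHeight16(pixels, texelIndex) {
  const r = pixels[texelIndex * 4];
  const g = pixels[texelIndex * 4 + 1];
  return (r * 256 + g) / 65535;
}
```

The byte array itself would come from something like gl.readPixels(0, 0, width, height, gl.RGBA, gl.UNSIGNED_BYTE, pixels), with pixels being a Uint8Array of width * height * 4 entries. Sixteen bits may or may not be enough precision for the physics.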

Getting readPixels and your own unrelated shaders to interoperate would probably require pretty close familiarity with the BabylonJS source code.

elessar.perm · January 8, 2015

I just realized something interesting. I don't need the full grid for the physics simulation, because it will only be used for a relatively small number of objects, so for them I can use the CPU.

I need the full grid only for rendering, and I can compute it in the fragment shader.
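For the CPU side, sampling the height only where a ship touches the water could look roughly like this. It is a simplified sketch that treats the ocean as a plain sum of sinusoidal components with the deep-water dispersion relation (omega = sqrt(g * k)); it is not the full statistical spectrum, and the function name and the wave fields (`amplitude`, `kx`, `kz`, `phase`) are made up for the example. The shader would need the same wave list so that both sides agree.

```javascript
const GRAVITY = 9.81; // m/s^2

// waves: array of { amplitude, kx, kz, phase } (hypothetical field names),
// where (kx, kz) is the wave vector in radians per meter.
function waveHeightAt(waves, x, z, time) {
  let height = 0;
  for (const w of waves) {
    const k = Math.hypot(w.kx, w.kz);     // wavenumber
    const omega = Math.sqrt(GRAVITY * k); // deep-water dispersion relation
    height += w.amplitude * Math.cos(w.kx * x + w.kz * z - omega * time + w.phase);
  }
  return height;
}
```

The cost is proportional to the number of wave components times the number of sample points, so it stays small as long as only a handful of hull points per ship are sampled.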
