
Submeshes and rendering performance


AussieKSU

I have observed that sub meshes can solve many issues when dynamically changing the materials of parts of a mesh. By "dynamically" I mean changing the material of parts of the scene content on demand while the application is running.

I understand that, to improve the general performance of the BABYLON engine, one should reduce the total number of meshes in the scene. This has certainly held true in my implementation of BABYLON for our application. However, it becomes problematic when a few large meshes exist and many sub meshes are then created to assign materials to parts of each mesh. As I understand from forum research, empirical analysis, and reviewing the BABYLON source code, the number of sub meshes is directly linked to the number of GL draw calls. If the scene is organized into several large meshes, you will still see inferior FPS if many sub meshes are added to each mesh. This is true even if the sub meshes reference only a small number of materials.

To visualize this problem, I will use the example of a large horse (metaphorically) being the entire contents of the scene. In this case, we will have a single mesh, containing all the vertices to draw this horse. Dynamically, we wish to change this horse to a zebra - where the vertices are identical to the horse, but instead we add black and white stripes to the horse. I have explored two options to accomplish this "zebra-fication".


1) (Non-submesh solution) By manipulating the index and vertex arrays themselves, it is possible to "pull" apart all the white parts of the zebra into a single mesh, and all the black parts into a separate mesh. This performs wonderfully from an FPS perspective, but the array manipulation itself is costly, and it does not use sub meshes at all. The problem with this solution, once again, is not the render performance but the operation of creating the two meshes. Alternatively, once the indices are reorganized so that they are contiguous with respect to their materials, we could retain a single mesh and use 2 sub meshes.

2) (Submesh solution) An elegant solution in terms of performance of the "zebra-fication" is to cut up the horse mesh into white and black stripes with sub meshes. The performance of creating these sub meshes is very good - as expected. The problem with this solution, as I introduced in the beginning paragraph, is that for each "stripe/submesh" we see a separate GL draw call. 
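To make the draw call problem concrete, here is a minimal sketch in plain TypeScript, independent of BABYLON. It assumes one sub mesh per contiguous run of triangles that share a material, which is how I understand option 2; the names `MaterialRun` and `buildMaterialRuns` and the zebra data are hypothetical.

```typescript
// One contiguous run of indices that uses a single material.
interface MaterialRun {
  materialIndex: number;
  indexStart: number; // offset into the index buffer, counted in indices
  indexCount: number; // number of indices covered (3 per triangle)
}

// Run-length encode per-triangle material ids (given in index-buffer order)
// into contiguous runs. Each run would become one sub mesh.
function buildMaterialRuns(triangleMaterials: number[]): MaterialRun[] {
  const runs: MaterialRun[] = [];
  triangleMaterials.forEach((material, triangle) => {
    const last = runs[runs.length - 1];
    if (last !== undefined && last.materialIndex === material) {
      last.indexCount += 3; // same material as the previous triangle: extend the run
    } else {
      runs.push({ materialIndex: material, indexStart: triangle * 3, indexCount: 3 });
    }
  });
  return runs;
}

// Hypothetical zebra: 6 stripes of 2 triangles each, alternating white (0) and black (1).
const zebra: number[] = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1];
const zebraRuns = buildMaterialRuns(zebra);
console.log(zebraRuns.length); // 6 runs, although only 2 materials are in use
```

The number of runs grows with the number of stripes, not with the number of materials, which matches the FPS drop described above.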

My post is intended to firstly inquire if there is a way to bridge the gap between the low cost for the zebra-fication with submeshes while also keeping GL draw calls low and therefore FPS high. As I understand it, I cannot currently have my cake and eat it too.

My post is also intended to inquire about a possible enhancement to the BABYLON engine, which I am more than happy to code, test, and post to the BABYLON source. This would be an optional structure for passing multiple sub meshes that share the same material index to the rendering manager. It should also be noted that these sub meshes would not need to have consecutive start/end indexes. Therefore, one could still enjoy the excellent sub mesh performance while "zebrafying", and also enjoy great FPS. I am more than happy to discuss the design details if this enhancement would be accepted into the source code.

It may also be possible to enhance the submesh class so that non-sequential ranges of indexes could be added. This may be an easier and more elegant solution. With this solution, one could create two submeshes (one white, one black), and specify the various index ranges for each white and black stripe respectively.

Thank you for your time


Have you explored the SPS, which can be used as a mesh subpart API: http://doc.babylonjs.com/overviews/Solid_Particle_System ?

With the SPS, you're not obliged to consider the particles as animated particles... you can simply consider them as subparts of a mesh if you just want to set their UV, colors, scalings or rotations


Jerome, 

Thank you very much for the suggestion. I had not considered SPS. My hour or so of research into the code leads me to come to a few worries/comments:

Please note, my brief research into SPS certainly makes me no expert. Please correct my worries/comments if you think they are not accurate.


1) SPS seems to be a management structure around a collection of meshes. In the end, the meshes that have been added to the scene that are part of the SPS are still rendered via the RenderManager, and will therefore experience the same performance problems in my OP. SPS seems to be a handy structure for collecting meshes in a system, and acting upon these meshes - I just fear that the performance will not change, or will be worse because of point 2.
2) SPS does not like to use merged meshes? As far as I understand from my cursory review of the code, things might go awry if we consolidate the meshes in the SPS via BABYLON.Mesh.MergeMeshes(...). This comment may not be entirely accurate, but it nevertheless weighs into my final complaint below.
3) My last complaint about SPS is not meant to be a sound argument to dissuade the usage of SPS for other BABYLON developers, but it is still a point that many of us devs must take to heart. Our application is rather mature, complete with some cool performance tweaks for loading scenes on web workers from JSON data provided by services, special picking logic, etc, etc. I would very much rather not touch this code, even if the ripple effect of changes is "theoretically" small. Therefore, due to the maturity of our application, I am very hesitant to switch to something dramatically different - SPS - in order to solve my problem. 

Edit : 

1) Jerome has pointed out through demos that this is not true. SPS uses a single draw call for the mesh, and does not use multiple materials, but instead uses color buffers. Performance is very good, regardless of number of colors.
2) Incorrect. SPS could very well use a merged mesh.


Draw calls with non-sequential ranges may not be possible. Looking at how the WebGLRenderingContext works, it seems to support drawing only consecutive indices.

drawElements(mode: number, count: number, type: number, offset: number): void;

I think my proposed solution may be infeasible. I could not draw multiple parts of the bound geometry with a single draw call :(.
I don't know if this invalidates my proposed solution entirely. We could still reduce the need to (re)bind the geometry buffers for each sub mesh, though I am not sure what this would mean for render performance. This would require more analysis before I could comment definitively. Intuitively, it seems that the overhead is in the draw call itself, since sub meshes, after all, do a single bind, and I have already mentioned the degradation of FPS when many of them are used.
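As a rough illustration of why the ranges matter, the sketch below plans the arguments for `drawElements(mode, count, type, offset)` for a set of index ranges without touching a real GL context. The helper names (`IndexRange`, `mergeTouchingRanges`, `planDrawCalls`) and the stripe data are hypothetical. The enum values are the standard WebGL constants, and `offset` is a byte offset into the index buffer.

```typescript
// WebGL enum values as fixed by the WebGL specification.
const GL_TRIANGLES = 0x0004;
const GL_UNSIGNED_SHORT = 0x1403;
const GL_UNSIGNED_INT = 0x1405;

interface IndexRange { start: number; count: number } // counted in indices, not bytes
interface DrawElementsArgs { mode: number; count: number; type: number; offset: number }

// Ranges that touch or overlap can be drawn as one consecutive range.
// Anything with a gap in between cannot.
function mergeTouchingRanges(ranges: IndexRange[]): IndexRange[] {
  const sorted = ranges.slice().sort((a, b) => a.start - b.start);
  const merged: IndexRange[] = [];
  for (const r of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && r.start <= last.start + last.count) {
      const end = Math.max(last.start + last.count, r.start + r.count);
      last.count = end - last.start;
    } else {
      merged.push({ start: r.start, count: r.count });
    }
  }
  return merged;
}

// One drawElements call per consecutive range.
function planDrawCalls(ranges: IndexRange[], use32BitIndices: boolean): DrawElementsArgs[] {
  const type = use32BitIndices ? GL_UNSIGNED_INT : GL_UNSIGNED_SHORT;
  const bytesPerIndex = use32BitIndices ? 4 : 2;
  return mergeTouchingRanges(ranges).map((r) => ({
    mode: GL_TRIANGLES,
    count: r.count,
    type,
    offset: r.start * bytesPerIndex, // drawElements expects bytes here
  }));
}

// Hypothetical white stripes: three index ranges separated by black stripes.
const whiteStripes: IndexRange[] = [
  { start: 0, count: 6 },
  { start: 12, count: 6 },
  { start: 24, count: 6 },
];
const stripeCalls = planDrawCalls(whiteStripes, false);
console.log(stripeCalls.length); // 3: one call per stripe, even with a single material
```

Only when the stripes of one material sit next to each other in the index buffer do they collapse into a single call, which is why reordering the indices helps and a multi-range sub mesh alone does not.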

Edit : Binding the vertex context is not a performance issue. Even so, there is only one bind for the mesh as it stands today, regardless how many sub meshes it contains.


Well, to shorten :

The SPS is a mesh, so it's rendered with one draw call only, so quite fast.

Its subparts, called particles, can be accessed one by one and you can set their color and texture (actually UVs from a global texture; think of a texture atlas). This can be done dynamically, as many times as you need.

The best thing would be to produce a PG example of what you intend to do, or even a simple prototype showing the principle (say, 300 meshes that need to be textured and colored dynamically, or anything else), and we, this lovely community, would be happy to try to solve your problem, or at least show you some leads to follow to fit your needs.


Use option "3", which is submeshes, but not one for each strip.  Just 2 total.  Assign each sub their own material.  When a horse, the diffuseColor will match.  To zebra, change diffuse color of the material(s).

I think where you are messed up is thinking sub-meshes are contiguous vertices.  They are not, but they must be grouped / defined together.  If you built your geometry in Blender, 3D Max, then this ordering game is automatic.  Doing this by hand is near impossible though.  You have to have duplicate vertices on the borders.


have a look at this example : http://www.babylonjs-playground.com/#HDHQN#9

It doesn't matter how the SPS is built: here it is made from a digested torus knot with 128 rings, but the way it's built is not important. What matters is the way you can access its subparts.

In this example a random color is given to each torus ring, then the color is passed each frame to the next ring => one draw call only

the same with 800 rings : http://www.babylonjs-playground.com/#HDHQN#10   (30 fps on my not that good laptop)

and 60 fps if I lower the tubular segment number : http://www.babylonjs-playground.com/#HDHQN#12

or with bigger rings : http://www.babylonjs-playground.com/#HDHQN#13

(change the facetNb value at the line 32)

 

Remember that the torus knot is really a heavy geometry (have a look at the vertex count in the debug layer).
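The ring-coloring idea from these demos can be sketched without the SPS itself. The code below is not the playground code. It assumes a single mesh whose parts (rings) each own a fixed number of consecutive vertices, with one RGBA color per vertex in a shared buffer. `writePartColor`, `passColorsAlong`, and the tiny ring sizes are hypothetical.

```typescript
type RGBA = [number, number, number, number];

// Write one part's color into the shared per-vertex color buffer (4 floats per vertex).
function writePartColor(colors: Float32Array, part: number, vertsPerPart: number, c: RGBA): void {
  const firstVertex = part * vertsPerPart;
  for (let v = firstVertex; v < firstVertex + vertsPerPart; v++) {
    colors.set(c, v * 4);
  }
}

// Pass every part's color on to the next part; the last color wraps around to the first part.
function passColorsAlong(partColors: RGBA[]): RGBA[] {
  return partColors.slice(-1).concat(partColors.slice(0, -1));
}

// Hypothetical tiny "torus": 3 rings of 2 vertices each.
const ringCount = 3;
const vertsPerRing = 2;
let ringColors: RGBA[] = [
  [1, 0, 0, 1], // ring 0: red
  [0, 1, 0, 1], // ring 1: green
  [0, 0, 1, 1], // ring 2: blue
];
const ringColorBuffer = new Float32Array(ringCount * vertsPerRing * 4);

// One simulated frame: shift the colors by one ring, then rewrite the buffer.
ringColors = passColorsAlong(ringColors);
ringColors.forEach((c, ring) => writePartColor(ringColorBuffer, ring, vertsPerRing, c));
console.log(Array.from(ringColorBuffer.slice(0, 4))); // ring 0 now carries blue: [0, 0, 1, 1]
```

However many rings change color, there is still one geometry and one color buffer, so the mesh can be drawn with one call. The cost moves to rewriting and re-uploading the buffer.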


Jerome,

Thank you for the demos. There are a few observations I have after playing with SPS. It is important to note that the size of geometry that is typical for us is 3,000,000 vertices.

I am sure many of these observations are known by you, but I will post and explain them for anyone else reading along:

If I crank your SPS example up to ~3,000,000 vertices (removing the animation as well) - http://www.babylonjs-playground.com/##2AZKAH#0 - we can see the FPS plummet. Not surprising, considering your two demos with different tubular segments.

If I use the same example, but instead, reduce the number of particles to 1, FPS is still poor. http://www.babylonjs-playground.com/#1ZXDGM#5 (6fps on my machine). I assume this is because of the added color weight to the draw call? It seems the number of particles is not a major factor in the final FPS.

Consider using only meshes, and no particle systems at all - the color is only set at the beginning of the draw call. Therefore, the draw data is much lighter for the GPU. Notice the FPS now: http://www.babylonjs-playground.com/#H4PD (60 FPS).

If I would rip apart the geometry for the first example, and build meshes based on similar material, I would see very good FPS again. Forgive me, I am not going to create a demo of this geometry manipulation. I have done it in my application, but this requires quite a bit of code, blood, sweat, and tears.
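For anyone reading along, the "ripping apart" step can be sketched roughly as follows. This is not the code from my application, only a minimal version that assumes positions stored as x, y, z triples and one material id per triangle. `MeshData` and `splitByMaterial` are hypothetical names.

```typescript
interface MeshData {
  positions: number[]; // x, y, z per vertex
  indices: number[]; // 3 per triangle
}

// Build one compact mesh per material. Vertices shared across a material
// border end up duplicated, once in each resulting mesh.
function splitByMaterial(source: MeshData, triangleMaterials: number[]): Map<number, MeshData> {
  if (source.indices.length !== triangleMaterials.length * 3) {
    throw new RangeError("expected exactly one material id per triangle");
  }
  const parts = new Map<number, MeshData>();
  const remaps = new Map<number, Map<number, number>>(); // material -> (old vertex -> new vertex)
  triangleMaterials.forEach((material, triangle) => {
    let part = parts.get(material);
    let remap = remaps.get(material);
    if (part === undefined || remap === undefined) {
      part = { positions: [], indices: [] };
      remap = new Map<number, number>();
      parts.set(material, part);
      remaps.set(material, remap);
    }
    for (let corner = 0; corner < 3; corner++) {
      const oldIndex = source.indices[triangle * 3 + corner];
      let newIndex = remap.get(oldIndex);
      if (newIndex === undefined) {
        // First time this material sees the vertex: copy it over.
        newIndex = part.positions.length / 3;
        part.positions.push(
          source.positions[oldIndex * 3],
          source.positions[oldIndex * 3 + 1],
          source.positions[oldIndex * 3 + 2]
        );
        remap.set(oldIndex, newIndex);
      }
      part.indices.push(newIndex);
    }
  });
  return parts;
}

// Hypothetical quad: two triangles sharing the diagonal 0-2, one white (0), one black (1).
const quad: MeshData = {
  positions: [0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0],
  indices: [0, 1, 2, 0, 2, 3],
};
const quadParts = splitByMaterial(quad, [0, 1]);
console.log(quadParts.size); // 2 meshes, each with 3 vertices
```

A real mesh would also carry normals, UVs, and so on, and each of those arrays needs the same remapping. That is where most of the effort goes.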

Edit: The poor performance in my examples was due to a gaffe on my part. Removing the line sps.setParticles(); resolves the performance issues. Jerome speaks more about this below.


5 hours ago, JCPalmer said:

Use option "3", which is submeshes, but not one for each strip.  Just 2 total.  Assign each sub their own material.  When a horse, the diffuseColor will match.  To zebra, change diffuse color of the material(s).

I think where you are messed up is thinking sub-meshes are contiguous vertices.  They are not, but they must be grouped / defined together.  If you built your geometry in Blender, 3D Max, then this ordering game is automatic.  Doing this by hand is near impossible though.  You have to have duplicate vertices on the borders.

Sub meshes must be consecutive indices. This is what I mean by array manipulation - to reorder the indices so that your option 3 could be implemented. To give an example:

Suppose the horse (for simplicity's sake) is made up of "groupings" of indices 1, 2, 3, 4, 5, 6, where each index "grouping" is an abstraction of references to some vertices. If we want to colorize grouping 1->white, 2->black, 3->white, 4->black, 5->white, 6->black, we would need to define 6 sub meshes. In order to reduce the needed sub meshes to 2, we would need to reorder the groupings to 1,3,5,2,4,6.
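The reordering in this example can be sketched as below. It is a simplified illustration that assumes each grouping is just a list of indices tagged with a material name. `Grouping`, `ContiguousRange`, and `reorderByMaterial` are hypothetical, and each grouping holds a single made-up triangle.

```typescript
interface Grouping { id: number; material: string; indices: number[] }
interface ContiguousRange { material: string; indexStart: number; indexCount: number }
interface Reordered { order: number[]; indices: number[]; ranges: ContiguousRange[] }

// Stable reorder: keep groupings in their original relative order, but place
// all groupings of the same material next to each other in the index buffer.
function reorderByMaterial(groupings: Grouping[]): Reordered {
  const materials: string[] = [];
  for (const g of groupings) {
    if (materials.indexOf(g.material) === -1) materials.push(g.material);
  }
  const result: Reordered = { order: [], indices: [], ranges: [] };
  for (const material of materials) {
    const indexStart = result.indices.length;
    for (const g of groupings) {
      if (g.material !== material) continue;
      result.order.push(g.id);
      result.indices.push(...g.indices);
    }
    result.ranges.push({ material, indexStart, indexCount: result.indices.length - indexStart });
  }
  return result;
}

// Groupings 1..6, alternating white and black, one hypothetical triangle each.
const horse: Grouping[] = [1, 2, 3, 4, 5, 6].map((id) => ({
  id,
  material: id % 2 === 1 ? "white" : "black",
  indices: [3 * (id - 1), 3 * (id - 1) + 1, 3 * (id - 1) + 2],
}));
const zebraOrder = reorderByMaterial(horse);
console.log(zebraOrder.order.join(",")); // 1,3,5,2,4,6
console.log(zebraOrder.ranges.length); // 2 contiguous ranges, so 2 sub meshes
```

Only the index buffer is rewritten here. The vertex data stays where it is, so this is cheaper than building separate meshes, but the whole index array still has to be rebuilt each time the assignment of materials changes.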


I'm not sure I was really clear when explaining how to use the SPS, because you seem confused and are duplicating many SPS, which wasn't the initial goal.

If you duplicate many SPS without any animation each frame, it can still be as fast as many meshes: http://www.babylonjs-playground.com/#1ZXDGM#7

or http://www.babylonjs-playground.com/#2AZKAH#2

This has nothing to do with the number of colors or particles; it's related to how many vertices you try to update each frame.

The SPS is just a mesh. If you update, say, 50 meshes or submeshes each frame, it will have a cost... and I guess that a scene with 3M updatable vertices will be heavy anyway for the JS/WebGL couple, whatever way you intend to deal with it.

The SPS was useful here to gather everything you needed to update into a single pool (and one draw call) and to use its API as a "subpart API", which is generally easier than accessing consecutive indices in submeshes: it's designed for this purpose.

 

[EDIT] have a look at this SPS with more than 3M vertices : http://www.babylonjs-playground.com/#HDHQN#43

It's not updated each frame, so the FPS is better... even though I still set the colors of the 3M vertices on each update. But maybe you don't need to update ALL the vertices each frame, only some of them at certain moments, which could then be far faster.
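The "update only some of them" idea can be sketched independently of the SPS. The class below is a hypothetical helper, not part of BABYLON. It keeps one RGBA color per vertex, remembers which span of parts was touched since the last flush, and reports only that span as needing a re-upload.

```typescript
class PartColorBuffer {
  readonly data: Float32Array;
  private dirtyFirst = Number.POSITIVE_INFINITY;
  private dirtyLast = -1;

  constructor(private readonly partCount: number, private readonly vertsPerPart: number) {
    this.data = new Float32Array(partCount * vertsPerPart * 4);
  }

  // Set the color of every vertex of one part and remember that it changed.
  setPartColor(part: number, r: number, g: number, b: number, a: number): void {
    if (part < 0 || part >= this.partCount) throw new RangeError(`no such part: ${part}`);
    const firstVertex = part * this.vertsPerPart;
    for (let v = firstVertex; v < firstVertex + this.vertsPerPart; v++) {
      this.data.set([r, g, b, a], v * 4);
    }
    this.dirtyFirst = Math.min(this.dirtyFirst, part);
    this.dirtyLast = Math.max(this.dirtyLast, part);
  }

  // Return the span of floats touched since the last call (or null if nothing
  // changed) and reset the tracking. Untouched parts inside the span are included.
  takeDirtyRange(): { offset: number; length: number } | null {
    if (this.dirtyLast < 0) return null;
    const floatsPerPart = this.vertsPerPart * 4;
    const range = {
      offset: this.dirtyFirst * floatsPerPart,
      length: (this.dirtyLast - this.dirtyFirst + 1) * floatsPerPart,
    };
    this.dirtyFirst = Number.POSITIVE_INFINITY;
    this.dirtyLast = -1;
    return range;
  }
}

// Hypothetical mesh with 1000 parts of 3 vertices each; recolor two of them.
const stripeColors = new PartColorBuffer(1000, 3);
stripeColors.setPartColor(10, 0, 0, 0, 1);
stripeColors.setPartColor(12, 1, 1, 1, 1);
const dirtySpan = stripeColors.takeDirtyRange();
console.log(dirtySpan); // { offset: 120, length: 36 } out of 12000 floats
```

Whether the engine can then upload just that span depends on the buffer update calls it exposes, so treat this as the bookkeeping half of the idea only.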


Jerome,

Thank you very much for the information. I think SPS is a very nice solution for the problem as I have described it.

SPS however, is not capable of setting a material per particle? Instead, only a color may be applied to each particle. My zebra example was not entirely accurate for my purposes - each stripe should be considered a separate material, not simply a separate rgb value. I guess getting to have my carrot cake, and eat it too is better than no cake :D.
 

 


No, you can't set a material per particle... but you could define one material for the whole SPS with one texture atlas, one file embedding many different images, like this one for instance: spriteAtlas.png

From this single texture you can set a different image (subpart) to each particle : http://doc.babylonjs.com/overviews/Solid_Particle_System#uvs

 

ex : http://www.babylonjs-playground.com/#2KSQ1R#38

Of course, you can change the particle UVs (texture) dynamically: http://www.babylonjs-playground.com/#2KSQ1R#113

[EDIT] The principle detailed in these older examples is still the right one, but the code is really old and uses the prototype of the SPS, not the far more optimized BJS-core one.

Note : particle colors and material colors are mixed together, if any.

In brief: one mesh, one material, one texture (so one draw call, that's its purpose), but a way to access each subpart and to set its color and texture individually (and position, rotation, velocity also).
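To illustrate the atlas part, here is a small sketch of how a cell of a regular grid atlas maps to a UV rectangle. It assumes equally sized cells numbered left to right starting from the bottom row, with v = 0 at the bottom. `UvRect` and `atlasCellUv` are hypothetical names. As I understand the SPS documentation linked above, the resulting rectangle is what you would assign to a particle's uvs, but check the docs for the exact form.

```typescript
interface UvRect { u0: number; v0: number; u1: number; v1: number }

// Map a cell number of a columns x rows atlas to its UV rectangle
// (u0, v0) = lower-left corner, (u1, v1) = upper-right corner.
function atlasCellUv(cell: number, columns: number, rows: number): UvRect {
  if (!Number.isInteger(cell) || cell < 0 || cell >= columns * rows) {
    throw new RangeError(`cell ${cell} is outside a ${columns}x${rows} atlas`);
  }
  const col = cell % columns;
  const row = Math.floor(cell / columns);
  return {
    u0: col / columns,
    v0: row / rows,
    u1: (col + 1) / columns,
    v1: (row + 1) / rows,
  };
}

// Hypothetical 4x2 atlas: cell 5 is the second image in the upper row.
const cellUv = atlasCellUv(5, 4, 2);
console.log(cellUv); // { u0: 0.25, v0: 0.5, u1: 0.5, v1: 1 }
```

Switching a particle from one "material look" to another then amounts to picking a different cell, with no extra material and no extra draw call.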
