
Feature Request - Rendering Perf



I'm getting a little pressed for performance in a game I'm working on. With 6 or more characters I start to drop below 60fps on medium/low-end computers, even though my meshes are not very high poly, my animations don't have a very high sample rate, only one animation plays per character, and there are no particle effects.

My render loop perf breakdown is 4-6ms spent in _animate and 12-20ms spent in _renderForCameras, both inside the top-level render call.
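To put those numbers against the 60fps target, here is a rough back-of-the-envelope sketch. It only adds up the two timings quoted above and compares them with the per-frame budget; the variable names are made up for illustration, and anything else the frame spends time on is ignored.

```typescript
// Frame-time budget for a target frame rate, in milliseconds.
function frameBudgetMs(targetFps: number): number {
    return 1000 / targetFps;
}

// Timings quoted in the post (milliseconds per frame).
const animateMs = { min: 4, max: 6 };
const renderForCamerasMs = { min: 12, max: 20 };

const budgetMs = frameBudgetMs(60);                            // about 16.67 ms
const bestCaseMs = animateMs.min + renderForCamerasMs.min;     // 16 ms
const worstCaseMs = animateMs.max + renderForCamerasMs.max;    // 26 ms

// Positive means over budget, negative means headroom left.
const bestOverrunMs = bestCaseMs - budgetMs;
const worstOverrunMs = worstCaseMs - budgetMs;

console.log(`budget ${budgetMs.toFixed(2)} ms, best ${bestCaseMs} ms, worst ${worstCaseMs} ms`);
```

So even the best case leaves well under a millisecond of headroom, and the worst case is more than 9ms over.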

I'm investigating ways to optimize the loop, but it's really difficult to find big chunks that would suit either:

1) wasm 

a) where the chunk is easy to black box

b) where the chunk is big enough that it's worth paying the high bridge-latency-cost of going in and out of a wasm module


or 2) web workers

a) where work can be done in parallel with other work / where the order of costly calculations doesn't matter

b) where we don't need to serialize / deserialize too much info if we were to send it to a worker & get a response back

c) where (like wasm) the chunk is big enough that it's worth paying the cost of messaging to and from the worker


Wasm feels to me like an all-or-nothing sort of thing because of how intertwined so many things are in Babylon.

Web workers might be easier to work in because they can use existing javascript code.
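As a rough illustration of keeping the serialization cost in 2b) down, the sketch below packs per-character data into a single Float32Array, which a worker could receive as one transferred buffer rather than a cloned object graph. The CharacterState shape and the function names are hypothetical, not anything from Babylon.

```typescript
// Hypothetical per-character payload; field names are illustrative, not Babylon.js types.
interface CharacterState {
    id: number;
    x: number;
    y: number;
    z: number;
}

const FLOATS_PER_CHARACTER = 4;

// Pack every character into one Float32Array so the batch travels as a single buffer.
// Note: ids are stored as 32-bit floats, so ids above 2^24 would lose precision.
function packStates(states: CharacterState[]): Float32Array {
    const packed = new Float32Array(states.length * FLOATS_PER_CHARACTER);
    states.forEach((s, i) => {
        packed.set([s.id, s.x, s.y, s.z], i * FLOATS_PER_CHARACTER);
    });
    return packed;
}

// Inverse of packStates, as the receiving side would run it.
function unpackStates(packed: Float32Array): CharacterState[] {
    const states: CharacterState[] = [];
    for (let o = 0; o + FLOATS_PER_CHARACTER <= packed.length; o += FLOATS_PER_CHARACTER) {
        states.push({ id: packed[o], x: packed[o + 1], y: packed[o + 2], z: packed[o + 3] });
    }
    return states;
}

// On the main thread the buffer would then be transferred instead of cloned:
//   worker.postMessage(packed.buffer, [packed.buffer]);
```

Transferring hands the buffer over without copying, but the sender can no longer use it afterwards, so this only helps where the main thread doesn't need the same data while the worker is busy.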


Do other people have ideas / opinions / specialty knowledge? Do you guys spot some large areas that would be easy to isolate & either execute asynchronously or just optimize with wasm? 



1) Yes, this would require rebuilding most of the engine in wasm.

2) Unfortunately, as long as SharedArrayBuffer is disabled due to Spectre and Meltdown, we won't be able to rely on workers either.
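Code that wants shared memory therefore has to cope with it being absent. A minimal sketch of that check, assuming a hypothetical allocateBuffer helper (not part of any library): it uses SharedArrayBuffer when the runtime exposes it and falls back to a plain ArrayBuffer otherwise, in which case data has to be copied or transferred between threads.

```typescript
// Allocate shared memory when the runtime exposes SharedArrayBuffer,
// otherwise fall back to a plain (non-shared) ArrayBuffer.
function allocateBuffer(byteLength: number): { buffer: ArrayBufferLike; shared: boolean } {
    if (typeof SharedArrayBuffer !== "undefined") {
        return { buffer: new SharedArrayBuffer(byteLength), shared: true };
    }
    return { buffer: new ArrayBuffer(byteLength), shared: false };
}

// Room for one 4x4 matrix of 32-bit floats.
const worldMatrixStorage = allocateBuffer(16 * Float32Array.BYTES_PER_ELEMENT);
const worldMatrixView = new Float32Array(worldMatrixStorage.buffer);

console.log(`shared memory available: ${worldMatrixStorage.shared}`);
```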

There are usually plenty of other tricks that can improve performance, and the community can definitely help address some of your issues.


\o/  for SAB !

That said, using web workers with SAB is not that simple because of their synchronization. Please have a read of this experiment from last year:



As sebavan said about wasm, the best implementation would port the major parts of the framework to wasm AND have the user logic coded in wasm too, to avoid the communication issues (data passing back and forth, multiple wasm calls from JS, etc.). Note that multi-threading on the wasm side is intended to be implemented some day, and the question of the performance gain will then be crucial.

Meanwhile, maybe there are other ways to improve your particular case? Reducing the number of draw calls, reducing the overall computation required (don't compute what's not visible, what's not relevant, etc.), reducing the number of loops, iterations, the amount of data? These are just leads, as we don't really know what you're trying to achieve.


Looking at your screenshot, _evaluateActiveMeshes (26.9% total) is where computeWorldMatrix (16.7% total) gets called. While this is required if the scaling, location, or rotation changed since the last frame, it can be eliminated for meshes you know are never going to move. No amount of automated optimization can ever know that. Calling freezeWorldMatrix on background meshes removes the kind of overhead that can be taken out without sacrificing or redesigning the scene.
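A minimal sketch of what that could look like. To keep it runnable without the engine, it declares a small FreezableMesh interface as a stand-in for the part of a Babylon mesh it touches; freezeStaticMeshes and the isStatic callback are hypothetical names, and only you can decide which meshes really never move.

```typescript
// Minimal structural stand-in for the part of a Babylon.js mesh used here,
// so the sketch runs without the engine loaded.
interface FreezableMesh {
    name: string;
    freezeWorldMatrix(): void;
}

// Freeze every mesh the caller declares static; returns how many were frozen.
function freezeStaticMeshes<T extends FreezableMesh>(
    meshes: T[],
    isStatic: (mesh: T) => boolean
): number {
    let frozen = 0;
    for (const mesh of meshes) {
        if (isStatic(mesh)) {
            mesh.freezeWorldMatrix();
            frozen++;
        }
    }
    return frozen;
}
```

With a real scene this might be called once after loading, for example freezeStaticMeshes(scene.meshes, m => m.name.startsWith("bg_")), where the "bg_" naming convention is just an assumption for the example.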

One other area without sacrifice / redesign is merging meshes of the same material which also do not move, scale or rotate. After that, much of the low-hanging fruit has been picked.
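Finding the merge candidates is mostly bookkeeping. The sketch below groups meshes by the material they share; MeshLike and MaterialLike are stand-in interfaces so it runs on its own, and groupByMaterial is a hypothetical helper, not an engine API. It assumes the caller has already filtered the list down to meshes that never move.

```typescript
// Stand-in shapes so the sketch runs without the engine loaded.
interface MaterialLike {
    name: string;
}

interface MeshLike {
    name: string;
    material: MaterialLike | null;
}

// Group meshes by the material instance they share; meshes without a material are skipped.
function groupByMaterial<T extends MeshLike>(meshes: T[]): Map<MaterialLike, T[]> {
    const groups = new Map<MaterialLike, T[]>();
    for (const mesh of meshes) {
        if (mesh.material === null) {
            continue;
        }
        const group = groups.get(mesh.material);
        if (group) {
            group.push(mesh);
        } else {
            groups.set(mesh.material, [mesh]);
        }
    }
    return groups;
}

// With Babylon.js loaded, each group holding more than one mesh could then be
// handed to Mesh.MergeMeshes(group).
```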


  • 2 weeks later...

@JCPalmer if you are working on an extension, please let us all know. Otherwise, perhaps this is an extension we need to take on specifically for the BJS framework. I'm actually not sure why I'm writing this post, as all the info is available in the browser dev tools. But everyone wants the EASY way. I wish I had it so easy in the early days of making console games.



I am not working in this area. Merging meshes of the same material already works with both meshes & clones. One thing which might make it easier is doing the merge per material. It is just a loop through scene.meshes, but why should everyone need to write it?

Maybe a method on Material: mergeMeshes(exclude? : Array<Mesh>) : void.

If you had a bunch of tree meshes or clones, call it like: scene.getMaterialByName("tree_material").mergeMeshes();

If no one uses it, then it would be more bloat.

Strawman, not validated:

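public mergeMeshes(exclude : Array<Mesh> = []) : void {
    // Collect every mesh in the scene that uses this material and is not excluded.
    const selected = new Array<Mesh>();
    for (const m of this._scene.meshes) {
        if (m.material === this) {
            let ignore = false;
            for (const e of exclude) {
                if (m === e) {
                    ignore = true;
                    break;
                }
            }
            if (!ignore) selected.push(m);
        }
    }
    // ...then merge 'selected' (this step is not shown in the strawman).
}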


