Meshes Memory Usage


royibernthal

mesh = 3D model created in Blender, around 1MB (in practice less, but for the sake of discussing the worst-case scenario)

I have 25 "mesh displayers" on screen simultaneously - each one can display one out of 12 different meshes. Each one can change the mesh it's displaying at any given time. Each mesh displayer has its own array (length: 12) of clones of the original loaded meshes.

Would having 300 mesh clones occupying memory from the moment the game is loaded be too much? Although only 25 of them are displayed at any given time, all 300 will still be stored in memory.

I was thinking of creating a mesh pool and only creating new mesh clones when they have to be displayed (and, by definition of an object pool, putting them back in the pool when they no longer need to be displayed) - it'll probably reduce the number of meshes created from 300 to 50-100.

On the downside, and this is slightly a guess as I'm not familiar with how many resources are required for the task - cloning new meshes during gameplay can possibly slow things down and damage the experience, most likely more so on mobile devices. Correct me if I'm wrong.

There is of course the mixed strategy of creating an objects pool with 50-100 meshes from the start, which could reduce the number of times in which meshes would have to be cloned during gameplay.

Is there even a memory issue here or will all options run extremely fast? If there is an issue, which strategy would be better in your opinion?
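For readers weighing the options above, here is a minimal sketch of the "mixed strategy": a pool that is pre-filled at load time and only creates more objects when it runs dry. The `Pool` class, its `create` factory and the `{ id }` stand-in object are all hypothetical names for illustration; in a real scene the factory would wrap whatever clone or instance call you use.

```typescript
// Generic object pool: reuse released objects instead of creating new ones.
// `create` is a caller-supplied factory (hypothetical; for meshes it might wrap a clone call).
class Pool<T> {
    private readonly free: T[] = [];
    private readonly create: () => T;
    public created = 0;

    constructor(create: () => T, prewarm = 0) {
        this.create = create;
        // Pre-fill the pool so early acquire() calls do not allocate during gameplay.
        for (let i = 0; i < prewarm; i++) {
            this.free.push(this.make());
        }
    }

    private make(): T {
        this.created++;
        return this.create();
    }

    // Take an object from the pool, creating one only if the pool is empty.
    acquire(): T {
        const item = this.free.pop();
        return item !== undefined ? item : this.make();
    }

    // Hand an object back so a later acquire() can reuse it.
    release(item: T): void {
        this.free.push(item);
    }
}

// Stand-in for a mesh clone so the sketch runs without an engine.
let nextId = 0;
const pool = new Pool(() => ({ id: nextId++ }), 2);

const first = pool.acquire();
const second = pool.acquire();
const third = pool.acquire(); // pool was empty, so this one is created on demand
pool.release(first);
const reused = pool.acquire(); // reuses `first` instead of creating a fourth object
```

The `prewarm` argument is the knob between the two extremes in the question: `0` creates everything lazily, while a large value moves all creation cost to load time.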


Hi Diaco :) Thanks for the article, I read it.

I don't think it answers my original question, but it does raise 2 more questions:

1) Are 25 different meshes with different vertex data displayed at the same time too much? Taking into consideration that the actual meshes displayed change frequently.

2) I have 2 "mesh displayers", each can display its own clones of meshA and meshB.

DisplayerA at first displays its own clone of meshA

DisplayerB at first displays its own clone of meshB

afterwards they switch -

DisplayerA - meshB

DisplayerB - meshA

remember that for instance the meshA used in displayerB is not identical to the meshA used in displayerA, it's a different clone of the same source mesh

In that case would the meshes need to be re-uploaded to the GPU? Or would bjs recognize that the same vertex data is being used?


Hi @royibernthal, in my projects I ran into performance trouble when Babylon had a lot of draw calls. In the debug panel you can see how many are currently being issued. I could reduce them from over 400 down to 50 by merging meshes with the same materials. The polygon count was only a secondary factor and was not the problem in that case.

Your second solution (cloning at runtime when the meshes are needed) sounds good to me. I also do it that way, but I'm not making a game where everything depends on fluid gameplay. So I don't know exactly if there will be performance lags, but I would suggest just trying it out. I think it also depends on further properties like the polygon count, the kind of materials, shadows and so on ...

And try to use instances instead of clones also :-)


Hi @jellix

I'm calling scene.debugLayer.show() and for some reason it is very small and unreadable, I tried playing with scene.debugLayer.axisRatio but it doesn't change anything. What should I do to make it bigger?

Regardless, at the moment I have 84 draw calls for 27 active meshes. Oddly when I make 25 of the meshes to be clones of the same source the draw calls go up to 102, how come?

Unfortunately merging meshes isn't an option in my case, I need to be able to "play" with each mesh separately.

Do instances have substantially better performance than clones?

 

 


Out of curiosity, do you know exactly why instances result in better performance?

What about this question?

1 hour ago, royibernthal said:

at the moment I have 84 draw calls for 27 active meshes. Oddly when I make 25 of the meshes to be clones of the same source the draw calls go up to 102, how come?


:) I doubt you will get answers to "is it too much" questions.  Too subjective.  Is 4 FPS too slow?  I say "nah", but you might not agree.  :)

I rarely put meshes in an array.  Scene.meshes already has it handled.  I just setEnabled(true/false).  By doing that, it turns on/off all sorts of servicing of the mesh, no matter clones, instances, or masters.  "mesh displayers" is the wrong approach, imho.  Babylon engine is already a mesh displayer.  Why build another one?

Now, if you call these "mesh displayers" something different... it makes your brain get flexible again.

What if you called it a "SetEnabled Manager"?  Sure you can maintain refs to your mesh-groups in an array, but... you know... always use enables and disables.  That is the performance secret.  Clone-up a thousand, or instance-up a thousand... don't worry about it.  Only enable the needed ones.  Keep your camera.maxZ nice and small.  :)

It is my opinion that you can just relax, code away, and have a good time... not worrying about performance... IF you manage your enabled/disabled wisely. 
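A rough sketch of the "SetEnabled Manager" idea, for anyone who wants something concrete: keep references to a group of meshes and make sure exactly one is enabled. It only assumes each object has a `setEnabled(boolean)` method, which Babylon.js meshes do; `Toggleable`, `EnabledSwitcher` and `FakeMesh` are hypothetical names made up for this example.

```typescript
// Minimal shape this sketch needs; Babylon.js meshes expose setEnabled(boolean).
interface Toggleable {
    setEnabled(value: boolean): void;
}

// Holds a fixed set of candidates and keeps exactly one of them enabled.
class EnabledSwitcher<T extends Toggleable> {
    private readonly items: T[];
    private current: number;

    constructor(items: T[], initial = 0) {
        this.items = items;
        this.current = initial;
        // Enable only the initial item, disable everything else.
        items.forEach((item, i) => item.setEnabled(i === initial));
    }

    // Disable the currently shown item and enable the one at `index`.
    show(index: number): void {
        const previous = this.items[this.current];
        const next = this.items[index];
        if (previous === undefined || next === undefined) {
            throw new RangeError("index out of range: " + index);
        }
        if (index === this.current) return;
        previous.setEnabled(false);
        next.setEnabled(true);
        this.current = index;
    }
}

// Stub standing in for a mesh so the sketch runs without a scene.
class FakeMesh implements Toggleable {
    enabled = true;
    setEnabled(value: boolean): void {
        this.enabled = value;
    }
}

const frames = [new FakeMesh(), new FakeMesh(), new FakeMesh()];
const switcher = new EnabledSwitcher(frames);
switcher.show(2);
const enabledStates = frames.map(f => f.enabled);
```

With real meshes, `frames` would be the clones or instances already stored in the scene; nothing is created or destroyed when switching.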

1. Is there even a memory issue here or will all options run extremely fast?  Test and see.  Playground-making time - performance comparisons.  Fun!
2. If there is an issue, which strategy would be better in your opinion?  Compare all methods and see.  Make playground tests, look at the numbers, then we discuss optimizing... after we can see what you are trying.  English description of what you seek... SUCKS, compared to playground examples.
3. Are 25 different meshes with different vertex data displayed at the same time too much?  Too much for which platforms?  How many verts in the mesh?  Textures?  Shadows?  Alpha testing/blending needed?  Seen the train demo on main site?  Huge... and still very fast.
4. I have two mesh displayers... blah blah blah, snore.  Don't use mesh displayers.  Store clones and instances in scene... disabled or enabled.  Swap mesh vertexdata when needed.... but you won't need to.  ;) (Let Babylon do what it does.  Avoid micro-managing.  Search for playgrounds with keywords 'createInstance' and 'clone'.  See what others have done.  BJS is already super-optimized.  No need for you to concern yourself.)
5. Do instances have substantially better performance than clones?   Yes.  Instances always have same material as master.  Clones, not so.  Or vice versa.  Test.  :)
6. Out of curiosity, do you know exactly why instances result in better performance?  Yes.  Less service overhead;)
7. Hey Deltakosh, do you perhaps have a moment to look into my question?  Nicely asked!  He probably does, but he needs to be able to FIND the questions amongst the yap. Take it from me, a professional yapper.

I don't know if any of my answers are correct or good, but at least we have gathered all your questions into one pile... for easier management. :)  NOW Deltakosh or other big dogs might visit and comment.  There's a better chance, now.  Hope ya don't mind my consolidating and adding my potentially wrong answers.


Thanks for the answers :) I'll try to be more objective from now on.

 

MeshDisplayer is mainly for game logic, it's not really to try and replace in any way bjs functionality or try to improve performance. It consists of an array of meshes which are bundled for game logic reasons, so that they can be added / removed in a comfortable way according to game needs. Each MeshDisplayer displays one mesh at a time, which is taken from a collection of around 12 different meshes, you can look at each mesh as a frame if it helps, while the container is sort of a placeholder.

With that said, I AM storing mesh instances, "MeshDisplayer" doesn't contradict that, it just serves as a way to easily swap between meshes according to game circumstances. Feel free to call it whatever you want if you're not comfortable with MeshDisplayer :) The reason I introduced it in the first place was to help you better understand my examples, but it seems I didn't explain it well and caused a lot of confusion.

 

I'm asking these questions instead of doing performance tests in order to understand better how bjs works. Doing performance tests by trial and error can be very exhausting and much less productive in my opinion.

 

I'm aiming for 60 FPS on all devices, hopefully it's doable. By too much I refer to a situation that'll cause the frame rate to drop from the average 60 FPS on mobile for instance.

Following all the suggestions so far I changed my code to use mesh.createInstance() instead of mesh.clone(), I created a mesh instances pool.

 

Here're the questions I have left, written in a straightforward way with no examples.

It'd probably be discouraging to start answering all of them, sorry about that.

 

1) Is there a performance difference if I use scene.addMesh and scene.removeMesh instead of mesh.setEnabled?

 

2) Is it ideal to store 300 mesh instances (instances of a 1MB source mesh) that are not displayed in the scene?

Do they take up a considerable amount of memory to store?

If I understand correctly, they shouldn't because they re-use the vertex data and material of the source mesh. Is that right?

 

3)  Are 25 different meshes with different vertex data displayed at the same time likely to perform well at 60 FPS on a mobile device?

 

4) Does adding / removing a mesh (or enabling / disabling it) require a substantial amount of memory? I'm adding / removing meshes from the scene very frequently. In the extreme case I'm removing from the scene 5 meshes and adding 5 other meshes instead every 10 frames.

 

5) At the moment I have 84 draw calls for 27 active meshes according to debug.

Oddly when I make 25 of the meshes to be instances of the same source in order to batch them the draw calls go up to 102.

Is there some logic I should know about batching meshes with the same vertex data and materials or is bjs supposed to handle it automatically?

 

6) I'm calling scene.debugLayer.show() and for some reason it is very small and unreadable, I tried playing with scene.debugLayer.axisRatio but it doesn't change anything. What should I do to make it bigger?

 

7) For this one I have to write an example.

I have a mesh - let's call it A.

I create from it 2 mesh instances:

A_1, A_2

I removed both instances from the scene.

I add A_1 to the scene.

I remove A_1 from the scene and add A_2 to the scene.

At that moment, is the mesh re-uploaded to the GPU or does bjs recognize that A_2 uses the same vertex data and material as A_1 and simply re-uses them?
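To make the scenario in question 7 concrete, here is a small sketch of the A_1 / A_2 swap using `createInstance`, `addMesh` and `removeMesh`. These three method names exist in Babylon.js, but everything else (`SceneLike`, `SourceLike`, `swapInScene`, and the fake scene and source objects) is a hypothetical stand-in so the code runs without an engine. It shows the bookkeeping only and says nothing about what happens on the GPU.

```typescript
// Structural stand-ins for the Babylon.js calls used here:
// Mesh.createInstance(name), Scene.addMesh(mesh) and Scene.removeMesh(mesh).
interface SceneLike<M> {
    addMesh(mesh: M): void;
    removeMesh(mesh: M): unknown;
}

interface SourceLike<M> {
    createInstance(name: string): M;
}

// Replace whatever is currently shown with `incoming`; returns the new current mesh.
function swapInScene<M>(scene: SceneLike<M>, outgoing: M | null, incoming: M): M {
    if (outgoing !== null) {
        scene.removeMesh(outgoing);
    }
    scene.addMesh(incoming);
    return incoming;
}

// Fakes so the sketch runs without a real engine.
interface FakeInstance {
    name: string;
}

const fakeScene = {
    meshes: [] as FakeInstance[],
    addMesh(mesh: FakeInstance): void {
        if (this.meshes.indexOf(mesh) === -1) this.meshes.push(mesh);
    },
    removeMesh(mesh: FakeInstance): number {
        const index = this.meshes.indexOf(mesh);
        if (index !== -1) this.meshes.splice(index, 1);
        return index;
    },
};

const sourceA: SourceLike<FakeInstance> = {
    createInstance: (name: string): FakeInstance => ({ name }),
};

const a1 = sourceA.createInstance("A_1");
const a2 = sourceA.createInstance("A_2");

let shown = swapInScene(fakeScene, null, a1); // scene shows A_1
shown = swapInScene(fakeScene, shown, a2);    // A_1 removed, A_2 added
const namesInScene = fakeScene.meshes.map(m => m.name);
```

Swapping `addMesh`/`removeMesh` for `setEnabled(true)`/`setEnabled(false)` gives the other variant asked about in question 1; the calling code stays the same.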


Oh c'mon you pros/Gods.  If you have time to LIKE my post, then you have a moment to answer Royi's questions, or at least give your opinions.  :)

Poor Royi is gonna have a nervous breakdown.  hehe.  Let's put some effort into these questions that are essentially over my head.  Pretty-please?  Thanks!

Sorry Royi.  We've consolidated the questions, we've added lots of white space... you used all the tricks-of-the-trade (approved methods) that I suggested to you in PM.  I dunno. 

Do your pits stink?  Do mine?  Maybe we should try some nice cologne, next?  I suppose we could sprinkle some gold dust on the thread floor.  I've never seen expert-luring that was THIS DIFFICULT.  :D

I'm thinkin' naked girls and a grilled lobster feed, next.  heh


6 hours ago, royibernthal said:

5) At the moment I have 84 draw calls for 27 active meshes according to debug.

Oddly when I make 25 of the meshes to be instances of the same source in order to batch them the draw calls go up to 102.

Is there some logic I should know about batching meshes with the same vertex data and materials or is bjs supposed to handle it automatically?

Can you recreate this in the PG?

6 hours ago, royibernthal said:

1) Is there a performance difference if I use scene.addMesh and scene.removeMesh instead of mesh.setEnabled?

http://www.babylonjs-playground.com/#1LV6QX#0

6 hours ago, royibernthal said:

4) Does adding / removing a mesh (or enabling / disabling it) require a substantial amount of memory? I'm adding / removing meshes from the scene very frequently. In the extreme case I'm removing from the scene 5 meshes and adding 5 other meshes instead every 10 frames.

http://www.babylonjs-playground.com/#1LV6QX#4

http://www.babylonjs-playground.com/#1LV6QX#5

 

The PG is your friend.  Ask it a question and it answers immediately.

 


Thanks @adam  @royibernthal has a few years of 2D games under his belt, but ol' Wingnut could use even more assistance.  Would we use the browser's f12 dev tools performance monitor... to do comparisons with these playgrounds (pre #6)?  Or Just watch the FPS rates... and be able to learn doing that?  Do some stop-watch checks?  I have never used the (firefox) f12 performance monitor.  It looks complicated.  :)  But I can learn it... maybe.  Thoughts, anyone?  thx.
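One low-tech way to do the "stop-watch checks" mentioned above is to average frame times yourself. The sketch below is a hypothetical `FpsMeter` helper (not part of Babylon.js) that takes one timestamp per rendered frame and reports the average frames per second over a sliding window. The timestamps here are synthetic so the example is deterministic.

```typescript
// Sliding-window FPS meter: feed it one timestamp (in ms) per rendered frame.
class FpsMeter {
    private readonly stamps: number[] = [];
    private readonly windowSize: number;

    constructor(windowSize = 60) {
        this.windowSize = windowSize;
    }

    // Record the time of the current frame, keeping only the newest `windowSize` stamps.
    tick(nowMs: number): void {
        this.stamps.push(nowMs);
        if (this.stamps.length > this.windowSize) {
            this.stamps.shift();
        }
    }

    // Average frames per second across the window, or 0 if there is not enough data.
    fps(): number {
        const first = this.stamps[0];
        const last = this.stamps[this.stamps.length - 1];
        if (first === undefined || last === undefined || last <= first) {
            return 0;
        }
        return ((this.stamps.length - 1) * 1000) / (last - first);
    }
}

// Synthetic run: 30 frames spaced exactly 20 ms apart.
const meter = new FpsMeter(10);
for (let frame = 0; frame < 30; frame++) {
    meter.tick(frame * 20);
}
const measuredFps = meter.fps();
```

In a browser you would call `meter.tick(performance.now())` once per frame from the render loop, run each playground variant for a while, and compare the numbers. Babylon.js also reports a frame rate itself through `engine.getFps()`.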


1. Yes..removeMesh is better as the mesh will no longer be enumerated at all

2. You are correct: they do not use a lot of space (mostly just a matrix). If you do not want to display them at all, it is safe to remove them from the scene

3. 25 boxes: for sure. 25 dinosaurs with 1 million polygons: not sure ;) This really depends on mesh complexity

4. Nope..totally free

5. As Adam mentioned, please provide a PG. Using instances should drastically reduce draw calls (Example here: http://www.babylonjs.com/Demos/InstancedBones/ 100 meshes and only 13 draw calls)

6. You must have a 4k monitor or something like that. debuglayer is pure html that you can style with css: http://doc.babylonjs.com/tutorials/Using_the_Debug_Layer#controlling-the-debug-layer-by-code (see last part at the bottom)

7. babylon.js is smart enough to recognize them. So you're good
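As a back-of-the-envelope check on answer 2 ("just a matrix mostly"), the arithmetic below compares 300 instances that each carry one 4x4 matrix against the hypothetical worst case of 300 full copies of a 1MB mesh. It assumes 32-bit floats for the matrix and ignores per-object JavaScript overhead, so treat the result as an order of magnitude rather than a measurement.

```typescript
// Assumptions: a 4x4 transform matrix stored as 32-bit floats, nothing else counted.
const FLOATS_PER_MATRIX = 16;
const BYTES_PER_FLOAT32 = 4;

const instanceCount = 300;       // figure from the first post
const sourceMeshBytes = 1000000; // worst-case 1MB mesh from the first post

// Per-instance cost if only the matrix is unique to each instance.
const bytesPerInstance = FLOATS_PER_MATRIX * BYTES_PER_FLOAT32; // 64 bytes
const matrixBytes = instanceCount * bytesPerInstance;           // 19,200 bytes

// Hypothetical worst case: every one of the 300 objects holds its own full copy.
const fullCopyBytes = instanceCount * sourceMeshBytes;          // 300,000,000 bytes

const ratio = fullCopyBytes / matrixBytes;
```

Under these assumptions the instances cost roughly 19 KB in total, against about 300 MB for full copies.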


@Deltakosh Thanks for the short and on point answers :)

 

3. What about this info for the whole scene?


Total meshes: 85
Total lights: 2
Total vertices: 102217
Total materials: 13
Total textures: 69
Active meshes: 28
Active indices: 358614
Active bones: 0
Active particles: 0
Draw calls: 12

 

5. I am using mesh instances. I can create a PG, but let me simplify this (question in the next line and example in the following 3 lines) -

Is there some logic I should follow for batching mesh instances of the same source? For instance the order in which they are added to the scene? Or should they be batched automatically?

For instance, if I have meshes A, B and C, and I'm adding to the scene their mesh instances in the following order:

A B B B C B B A C A A B C C C

Will all mesh instances be grouped and batched according to their source mesh? e.g. all mesh instances of A batched, same for B, and C

 

6. 1920x1080 resolution, I'll look into modifying the css, but isn't scene.debugLayer.axisRatio supposed to take care of that? How come changing it doesn't affect anything?

 

8. Is there a future for bjs on mobile devices? Will 60 fps ever be feasible on a mobile device for the scene mentioned in point 3? How far are bjs and the devices from getting there?

 

@adam Thanks for the answers, I'm familiar with the PG but I was looking to understand how bjs works rather than blind trial and error performance tests. See Deltakosh's answers :)

Also, since I'm using typescript spread across tens of files, me and the PG are not very good friends, it requires me to re-write my code for the tests, many times from scratch.

Regarding your mobile test - That is extremely low for a very simple scene. So I'm guessing bjs isn't really meant for mobile after all?

Although it should probably be much faster on newer devices (like iphone 6 and especially 7), what do you think? I would've run some tests for this question but unfortunately I still own a poor iphone 4.


:)  You keep wording things a bit weird, Royi.  Truth is, mobile devices are not ready for full-power webGL, no matter the webGL framework.  BJS only tries to expose the webGL API in a thin-layer way.  BJS does what it needs to do... to expose most of the API.  Can a mobile device handle that?  Rarely.  I wanted to get ya cleaned-up on this point.  :)  BJS IS meant for mobile... in every way.  Mobile isn't ready to handle the necessary requirements for full webGL power.  (imho)

#5 - You have 4 questions in #5.  I'm pretty sure that this is NOT one of the suggestions in "Wingnut's Posting Guide for Getting Attention from Big-Dog Coding Gods"  heh

Let me try a reword of #5.  Does the "order" of instantiations/clonings affect the performance of the engine's updating/servicing OF those instances/clones?

How was that?  :)  btw, Deltakosh is in Texas, ropin', roundin' wranglin', brandin', and drivin' the cattle herd to market, at the moment.

One other point:  Is anyone besides me... having troubles with Royi's term "batching"?  Is "batching" the method used to fill these mesh-pool arrays you once spoke-of?  (thx for clarification there)

#3 - subjective again.  You know as well as I... that you "should" build the scene using simple models, and then do perf tests... and see for yourself.  There is no way for anyone to answer #3.  Nobody knows your hopes and dreams, and showing stats won't help, either.  It is ALL dependent upon the size/amount of your mesh, your materials, your alpha-blending needs, your shadow needs, your post-processing needs, and which platforms you wish to have this thing be "tolerable" upon.  So many variables. That's a rough one (imho).

(aside) Royi and I have been doing a few PM's, so I know he can handle my verbal beatings.  :)  But only I get to beat him up in public. heh

Royi... I wish I had some actual answers for you, though.  What you ask... I can understand WHY you ask it.  But... it's difficult to answer without doing benchmark tests, friend... and I think you realize this, too.  And what is "handle it"?  Half-speed FPS okay?  What is "mobile device"?  Cray XMP3 in a wheelbarrow with two car batteries powering it?  ;)  (Boy, am I a goofball today, huh?)


I see, it'd be nice to hear your thoughts ( @Wingnut @Deltakosh @adam ) of when mobile will be able to fully handle WebGL. Is it already happening in this generation of devices or will it take a few more years in your opinion?

 

 

#5 -

I'm giving examples with seemingly more questions to help everybody understand better the main question (a bad habit?). If you want a straightforward question, it would be this one:

Is there some logic I should follow for batching mesh instances of the same mesh source?

 

#3 -

The statistics are supposed to help @Deltakosh deduce the complexity of my 25 active meshes, since I'm not sure how to find out the polygons per mesh. I should've probably said that to avoid confusion.

I'm giving full details on an existing scene and talking about optimizations of a scene of that complexity, not about any other dream scene that nobody knows about.

There's no doubt that benchmark tests will be helpful, and I intend to do them. I do however believe that @Deltakosh, being the creator of this 3d engine, would be able to deduce the scene complexity from my question and stats and understand if it's optimal for the fps and device I mentioned. Perhaps I'm wrong and like you say I'm asking for the impossible.

Here's the history of #3, if you still believe my answers are subjective, feel free to beat me up again :)

 

@royibernthal

3)  Are 25 different meshes with different vertex data displayed at the same time likely to perform well at 60 FPS on a mobile device?

 

@Deltakosh

3. 25 boxes: for sure. 25 dinosaurs with 1 million polygons: not sure ;) This really depends on mesh complexity

 

@royibernthal

What about this info for the whole scene?


Total meshes: 85
Total lights: 2
Total vertices: 102217
Total materials: 13
Total textures: 69
Active meshes: 28
Active indices: 358614
Active bones: 0
Active particles: 0
Draw calls: 12


#3: this is perfectly fine. For example the Sponza demo runs at 60fps on an iPhone or Galaxy S7 (http://www.babylonjs.com/Demos/Sponza/)

#5: (thanks Wingnut for the clarification): Order of instancing is not a problem. babylon.js will always find a way to merge the correct instances together. So no logic is involved; you can batch and instantiate however you want. In your example, the engine will correctly batch all instances of A together, and the same for B and C

#6: scene.debugLayer.axisRatio is for the axes (X, Y, Z) when they are enabled

#8: A large number of our demos (on our homepage) already run at 60fps on high-end devices (like iPhone or Galaxy S)
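To illustrate why the order in answer #5 does not matter, the sketch below groups the example sequence from the question by source mesh. This is a conceptual illustration of grouping only, not Babylon.js's internal code; `countBySource` is a hypothetical helper, and the "one batch per source" reading assumes each source mesh has a single material.

```typescript
// Count how many instances share each source, regardless of the order they were added.
function countBySource(order: string[]): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const source of order) {
        counts[source] = (counts[source] || 0) + 1;
    }
    return counts;
}

// The addition order given in the question.
const additionOrder = "A B B B C B B A C A A B C C C".split(" ");
const groups = countBySource(additionOrder);

// One group per distinct source mesh, however the instances were interleaved.
const batchCount = Object.keys(groups).length;
```

Shuffling `additionOrder` changes nothing in `groups`: the 15 instances still fall into three groups (4 of A, 6 of B, 5 of C).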


WebGL 1.0 is fully implemented on iOS & Android.  Has been for a few years.  WebGL 2.0 spec will hopefully be final this year.  This will allow a few things on mobile to be done that the API limited, like exceeding 23 bones on mobile (would require changes though).  I am also positive that you are going to need an A5 cpu at minimum, so that iPhone 4, with the Samsung cpu, will never run 2.0.

