jerome Posted April 29, 2015

Hi,

After profiling the CPU in several browsers, I noticed that, when morphing meshes, the computeNormals() method called on each frame was the bottleneck. There was a previous topic where this performance issue was discussed with JCPalmer; no real improvement was found. So I decided to hack this method to check whether something could be done. Remember, this method was designed a long time ago, before dynamic morphing was added to BJS, so it wasn't meant to be used in a render loop 60 times per second.

tl;dr:
current computeNormals(): http://www.babylonjs-playground.com/#ZOSGB
local computeNormals(): http://www.babylonjs-playground.com/#ZOSGB#1
If this mesh is too heavy (48 000 vertices!) for your computer, please adjust the mesh size on line 147: var nbPaths = 80;

What did I find? Many inexpensive changes could be made. For instance, the current computeNormals() method:
- makes 3 passes: the first runs positions.length times, the second runs once per face, and the last runs once per vertex times the number of faces sharing that vertex;
- allocates 3 intermediate arrays: one is filled with as many new Vector3 objects as there are vertices, another one (an array of arrays) is filled with one array of face indices per vertex.

This makes the code very readable (if I can understand it at first sight, not being an expert, I consider it very readable), but it can be improved in terms of passes and created objects. This is quite important because I noticed that Chromium, which for now has better general JS execution and rendering performance than FF, starts to spend more and more time in GC after a morphing animation has been running for minutes... up to 70% CPU in some long-running examples. The framerate finally decreases, while FF keeps a constant, lower framerate and a constant GC usage.
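For context, the core per-face operation behind any computeNormals() is the cross product of two triangle edge vectors. Here is a minimal standalone sketch of that operation (a hypothetical helper for illustration, not the actual BJS source):

```javascript
// Sketch: compute one face's unit normal from a flat positions array
// [x0, y0, z0, x1, y1, z1, ...] and the face's three vertex indices.
function faceNormal(positions, i1, i2, i3) {
  // edge vectors p2 - p1 and p3 - p1
  var e1x = positions[i2 * 3]     - positions[i1 * 3];
  var e1y = positions[i2 * 3 + 1] - positions[i1 * 3 + 1];
  var e1z = positions[i2 * 3 + 2] - positions[i1 * 3 + 2];
  var e2x = positions[i3 * 3]     - positions[i1 * 3];
  var e2y = positions[i3 * 3 + 1] - positions[i1 * 3 + 1];
  var e2z = positions[i3 * 3 + 2] - positions[i1 * 3 + 2];
  // cross product e1 x e2 is perpendicular to the face
  var nx = e1y * e2z - e1z * e2y;
  var ny = e1z * e2x - e1x * e2z;
  var nz = e1x * e2y - e1y * e2x;
  var len = Math.sqrt(nx * nx + ny * ny + nz * nz) || 1;
  return [nx / len, ny / len, nz / len];
}
```

Working on the flat arrays directly, as above, is what lets the method avoid creating throwaway Vector3 objects on every frame.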
Well, for now, I have reduced the method to 2 passes only (the first runs once per face, the second once per vertex), and the memory allocation down to 6 Vector3 objects plus one array of Vector3 with one entry per vertex. The 6 intermediate Vector3 could be avoided if I re-implemented the Vector3 methods locally (add, subtract, cross product, normalize), but I don't think it's worth it in terms of GC gain. I would like to eliminate the second pass and the intermediate array, whose only use is to normalize each normal, but I have no idea how for now. The CPU profiler doesn't show that this pass has a noticeable impact, so it may not be worth it either.

That said... how big is this improvement? Well, it depends on the mesh size and on your computer/browser capabilities. So please, in this very heavy example (built on purpose to stress the method), change the mesh size (number of vertices) on line 147 and you'll probably find a value where the improvement is really, really noticeable. On my own computer, with Chromium, I can hold 60 fps (and no GC at all) for minutes with the new function, whereas the legacy one drops down to 28 fps after a few minutes while GC climbs up to 70% CPU.

You can also compare both methods' durations by opening your browser console:
new method: http://www.babylonjs-playground.com/#ZOSGB#2
legacy: http://www.babylonjs-playground.com/#ZOSGB#3

On my computer, with local examples (outside the PG and the PG editor running scripts), I get between a x3 and x5 speed increase! So please run your own tests and let me know what you think about this improvement before I dare a PR.
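The two-pass scheme described above can be sketched like this (a simplified, hypothetical illustration of the idea, not the actual code of the proposed method, which works on BJS Vector3 objects):

```javascript
// Hypothetical two-pass normal computation on flat arrays.
// positions: [x0, y0, z0, x1, ...]; indices: 3 entries per triangular face.
function computeNormalsTwoPass(positions, indices) {
  var normals = new Array(positions.length);
  for (var i = 0; i < normals.length; i++) { normals[i] = 0; }
  // pass 1: once per face -- accumulate each (area-weighted) face normal
  // into the running sums of its three vertices
  for (var f = 0; f < indices.length; f += 3) {
    var i1 = indices[f] * 3, i2 = indices[f + 1] * 3, i3 = indices[f + 2] * 3;
    var e1x = positions[i2]     - positions[i1];
    var e1y = positions[i2 + 1] - positions[i1 + 1];
    var e1z = positions[i2 + 2] - positions[i1 + 2];
    var e2x = positions[i3]     - positions[i1];
    var e2y = positions[i3 + 1] - positions[i1 + 1];
    var e2z = positions[i3 + 2] - positions[i1 + 2];
    var nx = e1y * e2z - e1z * e2y;
    var ny = e1z * e2x - e1x * e2z;
    var nz = e1x * e2y - e1y * e2x;
    normals[i1] += nx; normals[i1 + 1] += ny; normals[i1 + 2] += nz;
    normals[i2] += nx; normals[i2 + 1] += ny; normals[i2 + 2] += nz;
    normals[i3] += nx; normals[i3 + 1] += ny; normals[i3 + 2] += nz;
  }
  // pass 2: once per vertex -- normalize the accumulated sums in place
  for (var v = 0; v < normals.length; v += 3) {
    var len = Math.sqrt(normals[v] * normals[v] +
                        normals[v + 1] * normals[v + 1] +
                        normals[v + 2] * normals[v + 2]) || 1;
    normals[v] /= len; normals[v + 1] /= len; normals[v + 2] /= len;
  }
  return normals;
}
```

Because only one output-sized array is allocated (and it could even be preallocated and reused across frames), a scheme like this gives the garbage collector almost nothing to do in the render loop.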