jerome

Members
  • Content count

    3,196
  • Days Won

    70

jerome last won the day on June 8

jerome had the most liked content!

About jerome

  • Rank
    Advanced Member
  • Birthday 03/17/1970

Contact Methods

  • Website URL
    http://jerome.bousquie.fr/BJS/demos/

Profile Information

  • Gender
    Male
  • Location
    France / Rodez

  1. Or just create your options object first, outside the create call:
       var coneProperties = {height: 10, diameterTop: 5 /* etc. */}; // you can access this one afterwards
       var cone = BABYLON.MeshBuilder.CreateCylinder("cone", coneProperties, scene);
  2. I guess you are confusing me with another Jerome ( https://twitter.com/jerome_etienne ) because I don't know anything about VR ;-)
  3. this way ? https://www.babylonjs-playground.com/#C1M3C7#2
  4. If you want to rotate a camera by setting a rotation angle, just create a mesh (a blank one, no shape), set it as the camera parent, then rotate the mesh.
  5. As you said, the V8 engine does a lot of magic under the hood and we can't easily predict where the gain would be. Nevertheless, when doing "var tm5 = someFloat", the engine has to create a float variable anyway (floats are stored in the heap in JS, not on the stack), because tm5 can then be set to any other value. I'm not that much of an expert, but I spent hours comparing the behavior of ComputeNormals() with and without the temp variables, which were there just for readability reasons, at the time I optimized it (up to x5 faster). The same goes (I spent days there) for all the internal computations (positions, normals, rotations, quaternions, uvs, colors) of the SPS, to try to make it almost as fast as the legacy 2D particle system. Using a 10-year-old laptop to make those comparisons, I can say there is a substantial gain when we deal with more than 8-10K calls per frame. This is obviously an empirical value, but I noticed there was, on every machine, a limit where the CPU has so many things to do within 16 ms that skipping 10K scalar variable allocations (not objects, I'm not even talking about the GC here) per frame could make a real difference. Your case seems to reach this limit because, just counting them, it's about 100K floats stacked and removed from the heap per frame. Even if it's only a part of 8% of the time on my machine, I guess it's worth a try to avoid this, since it can be done. DK is OK with this. Unfortunately I won't do it before the end of August or early September (no code for now). Not sure it's possible in your own case, but did you try the SPS approach? Store your 800 meshes in one SPS (if possible), then move them and compare the perfs ...
  6. And BTW, because of floating-point precision, I would probably compare your floats like this: Math.abs(y - 1.0) < 0.001
  7. If no billboard mode and no parent are used, multiplyToRef() (matrix multiplication) is still called several times in each call to computeWorldMatrix() : https://github.com/BabylonJS/Babylon.js/blob/master/src/Mesh/babylon.abstractMesh.ts#L1157 https://github.com/BabylonJS/Babylon.js/blob/master/src/Mesh/babylon.abstractMesh.ts#L1158 https://github.com/BabylonJS/Babylon.js/blob/master/src/Mesh/babylon.abstractMesh.ts#L1158 https://github.com/BabylonJS/Babylon.js/blob/master/src/Mesh/babylon.abstractMesh.ts#L1158 So, if I'm not wrong, at least 4 times per call to computeWorldMatrix(). This means, in @fenomas's case, 25600 x 4 = 102400 float allocations per frame. This could be avoided. I'll talk to @Deltakosh about this.
  8. No idea which one is the best ... the best is always the one that works
  9. Not sure the octrees are a good option when the meshes move in the World. The profiler says what you say... I will have a look at the reason why computeWorldMatrix() spends this time so weirdly.
[EDIT] When displaying the profile results as a tree (top-down), the percentage of the time used by computeWorldMatrix() during the 7200 ms is "only" 37% of the total time for me ... which still seems a high ratio imho.

     Self time            Total time            Function                       Source
     226.6 ms   6.34 %    1322.5 ms  36.99 %    i.computeWorldMatrix           babylon.js:7
     101.3 ms   2.83 %     198.7 ms   5.56 %    t.isSynchronized               babylon.js:6
      80.6 ms   2.25 %     370.1 ms  10.35 %    t.multiplyToRef                babylon.js:2
      48.2 ms   1.35 %      48.2 ms   1.35 %    t.copyFrom                     babylon.js:2
      31.4 ms   0.88 %      31.4 ms   0.88 %    i.copyFrom                     babylon.js:1
      21.0 ms   0.59 %      36.0 ms   1.01 %    t.RotationYawPitchRollToRef    babylon.js:2
      14.0 ms   0.39 %      14.0 ms   0.39 %    t.ScalingToRef                 babylon.js:2
      12.6 ms   0.35 %      12.6 ms   0.35 %    get                            babylon.js:6
      10.7 ms   0.30 %      10.7 ms   0.30 %    t.TranslationToRef             babylon.js:2
      10.3 ms   0.29 %      10.3 ms   0.29 %    t.getScene                     babylon.js:6
       4.8 ms   0.13 %       4.8 ms   0.13 %    get                            babylon.js:7
       4.1 ms   0.12 %       4.1 ms   0.12 %    get                            babylon.js:7
       3.4 ms   0.10 %       3.4 ms   0.10 %    get                            babylon.js:7
       1.0 ms   0.03 %       1.0 ms   0.03 %    r.getRenderId                  babylon.js:9
       0.9 ms   0.02 %       0.9 ms   0.02 %    get                            babylon.js:6
         0 ms      0 %     349.8 ms   9.78 %    i._updateBoundingInfo          babylon.js:0

Most of the time in computeWorldMatrix() is then spent in multiplyToRef() (10.35%) and in _updateBoundingInfo() (9.78%).
Matrix.multiplyToRef() then calls Matrix.multiplyToArray(), which consumes 8% of the total time: https://github.com/BabylonJS/Babylon.js/blob/master/src/Math/babylon.math.ts#L3376 It's 32 float allocations and 16 linear operations per call ... so for you 32 x 800 float allocations = 25600 each time multiplyToRef() is called! I guess we could get rid of the float allocations, since we can't skip the linear operations. I used to make this kind of little optimization for ComputeNormals() or the SPS. Dozens of float allocations per frame don't really matter, but tens of thousands really start to matter.
For the bInfo update, most of the time (9.22 %) is spent in bBox._update(), in no particular sub call: https://github.com/BabylonJS/Babylon.js/blob/master/src/Culling/babylon.boundingBox.ts#L66 Well, it's just that we do 800 x 8 box vertex computations and checks to localize them in the World.
  10. Still investigating, now here : https://github.com/BabylonJS/Babylon.js/pull/2232/files btw, since we can't be sure the approximation in the normalization computation will give integer numbers, I would rather have written ">0" or "<0" instead of "==1" and "==-1" here : https://github.com/BabylonJS/Babylon.js/blob/master/src/Mesh/babylon.mesh.vertexData.ts#L1653 and here : https://github.com/BabylonJS/Babylon.js/blob/master/src/Mesh/babylon.mesh.vertexData.ts#L1657 or used something like EqualsWithEpsilon()
  11. mmhh... very weird, because I haven't touched anything since the frontUVs and backUVs addition, so for quite a while, here : https://github.com/BabylonJS/Babylon.js/pull/2212/files on June 1st. Weirder : I can't see anything in the code of VertexData._ComputeSides() that would be different for the Polygon mesh than for the others ... since it's the same code called by any double sided mesh.
  12. The LOD is really nice and the algo implementation is great ... but the allocations/deletions start to be noticeable regarding the FPS and the GC activity, I suppose ...
  13. http://doc.babylonjs.com/overviews/how_rotations_and_translations_work#generating-a-rotation-from-a-target-system
  14. mmmmh... I'm afraid you can't skip the WorldMatrix computation that easily because this matrix is passed to the GPU so that it can compute all the final positions of the mesh vertices and then all the projections to the screen. Computing your own WM from a simple translation matrix (+ freezing it) should work too. It's worth a try... not sure the gain is really high though, because computing 800 quaternions (for the complete WM) is really fast actually. updateBoundingInfo() should be quite fast also, as it updates only the 8 bounding box vertex positions (+ the bbox center). You can update your own bInfo and then lock it to skip the automatic computation with : http://doc.babylonjs.com/classes/3.0/boundinginfo#islocked-boolean
Usually, evaluateActiveMeshes() spends most of the time in the call to isInFrustum() : culling. btw, I tried some weeks ago to implement a faster culling algo but I wasn't satisfied by the results : http://jerome.bousquie.fr/BJS/test/frustum.html (fast duration = experimental algo, frustum duration = legacy algo)
If you're sure (I'm pretty sure you are because I know you're a profiler pro) that the time is spent in the WorldMatrix and bInfo computations, maybe you might think about other approaches : check if you can compute some logical pre-culling (so set some meshes as inactive from your game logic before the camera has to evaluate them), freeze/unfreeze the world matrix in turn for the meshes you know didn't move for some frames, force the selection for the meshes you know are almost always in the frustum, etc.
Maybe using a SPS holding all these meshes (or most of them, even if it's a different model for each solid particle) could help, as the SPS computes only one WM and each particle bInfo within the particle loop (so faster)... but it has a global level culling (so less accurate : all the particles are culled or none).
Usually one draw call, even with false positives (things passed to the GPU that won't finally be rendered because they are off screen), is faster than more pre-computations. This must be tested on your very specific case to check what could be the best solution.
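
The float comparison suggested in post 6, and the EqualsWithEpsilon() idea mentioned in post 10, could look roughly like the sketch below. The helper name equalsWithEpsilon and its default tolerance are hypothetical, chosen only for illustration; this is not the BJS implementation.

```javascript
// Hypothetical helper (illustration only, not the Babylon.js code):
// two floats are considered equal when they differ by less than epsilon.
function equalsWithEpsilon(a, b, epsilon = 0.001) {
    return Math.abs(a - b) < epsilon;
}

// Classic rounding case: the sum is not strictly equal to 0.3.
const sum = 0.1 + 0.2;
const strictlyEqual = (sum === 0.3);              // strict test fails
const nearlyEqual = equalsWithEpsilon(sum, 0.3);  // tolerant test passes

// Same pattern as in post 6: is a component close enough to 1.0 ?
const y = 0.9996;
const yIsOne = equalsWithEpsilon(y, 1.0);
```

The tolerance has to be chosen for the data at hand: 0.001 is the value used in post 6, not a general recommendation.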
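
To illustrate the idea from posts 5, 7 and 9 (getting rid of the intermediate float variables in the matrix multiplication), here is a minimal sketch. It is not the actual BJS code: the function name multiplyToArraySketch is hypothetical, and a row-major flat array of 16 floats is assumed for each matrix. It reads the source arrays directly instead of first copying the 32 components into local variables. Whether this is really faster depends on the JS engine and must be measured.

```javascript
// Hypothetical sketch (not the Babylon.js implementation).
// a, b   : 4x4 matrices as flat arrays of 16 floats, row-major (assumption)
// result : target array, must not be the same array as a or b
// offset : index in result where the 16 values of the product are written
function multiplyToArraySketch(a, b, result, offset = 0) {
    for (let row = 0; row < 4; row++) {
        const r = row * 4;
        for (let col = 0; col < 4; col++) {
            // no per-component temp variables: read a and b in place
            result[offset + r + col] =
                a[r]     * b[col] +
                a[r + 1] * b[4 + col] +
                a[r + 2] * b[8 + col] +
                a[r + 3] * b[12 + col];
        }
    }
    return result;
}

// Two translation matrices (translation stored in the last row).
const t1 = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  1, 2, 3, 1];
const t2 = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  4, 5, 6, 1];

// The result array is allocated once and reused on every call/frame.
const product = new Float32Array(16);
multiplyToArraySketch(t1, t2, product);

// Allocation counts quoted in posts 7 and 9.
const floatsPerMultiply = 32 * 800;          // 32 temps x 800 meshes
const floatsPerFrame = floatsPerMultiply * 4; // 4 multiplications per mesh
```

Composing the two translations gives the translation (5, 7, 9) in the last row of product. Writing into a preallocated result array follows the same "ToRef" spirit as the posts above: nothing new is created inside the per-frame loop.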