
Avatar Animation via Kinect v2


Polpatch

Hello everyone!
I plan to use the Kinect v2 (the one for Xbox One) to move an avatar in my scene.
My idea was to take the quaternions from the jointOrientations and then update the appropriate bone matrices.
 
However, the jointOrientations are expressed as global rotations (each jointOrientation gives the direction of the joint in absolute coordinates), while, if I understand correctly, I can only modify the local matrix of a skeleton bone.

So I am trying to convert the global jointOrientation into a local rotation:
var joint;    // the Kinect joint
var parent = joint.Parent();
var localOrientation = BABYLON.Quaternion.Inverse(parent.Orientation).multiply(joint.Orientation);
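
To show what I mean, this is roughly how I intend to push that local rotation into the corresponding Babylon bone (only a sketch, not working code; it assumes the Kinect orientations are already wrapped in BABYLON.Quaternion and that each Kinect joint is mapped to a BABYLON.Bone):

// Sketch: write a joint's parent-relative rotation into the matching Babylon bone.
// joint.Orientation / joint.Parent().Orientation are assumed to be BABYLON.Quaternion
// instances built from the Kinect data; `bone` is the corresponding BABYLON.Bone.
function applyJointToBone(joint, bone) {
    var parent = joint.Parent();
    var localOrientation = BABYLON.Quaternion.Inverse(parent.Orientation)
                                             .multiply(joint.Orientation);

    // Keep the bone's existing local translation and only replace the rotation part.
    var translation = bone.getLocalMatrix().getTranslation();
    var localMatrix = BABYLON.Matrix.Compose(new BABYLON.Vector3(1, 1, 1),
                                             localOrientation,
                                             translation);
    bone.getLocalMatrix().copyFrom(localMatrix);
    bone.markAsDirty();
}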

But I'm having trouble with the transformation of the reference coordinates between the Kinect joints and the avatar bones in the Babylon scene...

I tried to change the axes by swapping the (x, y, z) values, but I'm probably doing it wrong:

var kinectOrientation;    // orientation of the joint expressed as a quaternion
return new BABYLON.Quaternion(kinectOrientation.y,
                              kinectOrientation.x,
                              kinectOrientation.z,
                              kinectOrientation.w);    // just one example, I have tried several combinations

Do you have any advice?

Thanks in advance


The Kinect documentation is rather poor, but looking around I found that the Kinect uses a right-handed coordinate system.
I tried your combination on a single bone of the avatar (the right forearm), but I did not get a correct result.
 
I tried again with this new combination, and the forearm now moves correctly:
var orient = new BABYLON.Quaternion( kinectOrientation.x,
                                    -kinectOrientation.y,
                                    -kinectOrientation.z,
                                     kinectOrientation.w);
Perhaps this is due to the different local reference axes of each bone with respect to the scene.
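
For now I have wrapped that conversion in a small helper (just a sketch; as I say above, the sign flips probably differ from bone to bone):

// Sketch: Kinect joint orientation -> Babylon quaternion for one specific bone.
// This sign pattern is the one that happens to work for the right forearm;
// other bones may need a different combination.
function kinectToBabylonQuaternion(kinectOrientation) {
    return new BABYLON.Quaternion( kinectOrientation.x,
                                  -kinectOrientation.y,
                                  -kinectOrientation.z,
                                   kinectOrientation.w);
}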

 

However, your answer made me realize that my reasoning was correct. :D


I am struggling with this as well, trying to read motion capture files directly.  The definition I see everywhere for left- versus right-handed is about which way is the positive direction of the Z axis (https://www.evl.uic.edu/ralph/508S98/coordinates.html), not an inversion of axes like Blender.  Left-handed, Z is positive going into the scene.  Right-handed, Z is positive coming out of the scene toward the viewer.  Wouldn't that involve just switching the sign of Z?
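
If it really is only the direction of Z that differs, then my understanding is: positions just flip their z, while rotation quaternions flip x and y (mirroring an axis also reverses the apparent rotation direction). Something like this, as an untested sketch in Babylon terms:

// Sketch (untested): convert a right-handed pose (Z toward the viewer) into
// Babylon's left-handed space (Z into the scene) by mirroring the Z axis.
function rhToLhPosition(p) {
    return new BABYLON.Vector3(p.x, p.y, -p.z);
}
function rhToLhQuaternion(q) {
    // Negating x and y is equivalent to negating z and w (q and -q are the same rotation).
    return new BABYLON.Quaternion(-q.x, -q.y, q.z, q.w);
}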

 

I clearly have multiple problems trying to test variable influencers, MakeHuman, CMU mocaps, and a before-render-based skeleton interpolator simultaneously.  I loaded one of the Acclaim skeletons into Blender, and it is parallel to the ground.  The picture below is in top view.  It looks like I should not have to do axis swapping.

post-8492-0-71431800-1449245514.png

 

I added a mocap animation to the skeleton, and exported it to compare against what I generate (Blend file: https://drive.google.com/file/d/0B6-s6ZjHyEwUUW96Zm1hbklNMDA/view?usp=sharing), if I can get the same frame for both (don't ask).  I have so many problems right now, I hope that after I fix some of them it will become obvious what I need to do in this area.


The very old Acclaim format uses earth's magnetic coordinate system, where Y and Z are flipped.  I recommend always using the .fbx format, which will solve the orientation issue - depending on what program you export from, as the older formats have legacy issues.  If you really want to solve all of your mocap issues: I stream the Kinect V2 (it doesn't matter whether it's right- or left-handed) live into Motionbuilder and record there - then simply export a native FBX file.  This has been working for me since I purchased my V2 a year ago.


Yeah, I am starting to go off Acclaim.  The original Carnegie Mellon database seems to have a lot of bones flipped too.  Here is an original, processed by the Acclaim importer for Blender, then exported and played as a traditional animation.  The wireframe is the ground.  It is oriented wrong, and the hip is backwards.

post-8492-0-34306900-1449274069.png

 

I got a translation to .bvh, then "corrected", from another website, and loaded it using MakeWalk.  It ran perfectly.  Testing of combined variable influencers and vertex optimization with a skeleton is now complete.  Using separate meshes, except for those which have shape keys, allowed the feet to have 3, the hair 5, the eyes 1, the clothes 7, and the body 7 influencers.  I am putting mocap on hold for a bit.  I want to see whether, if I specified an even lower limit, I could determine which influencers are the least harmful to exclude for those verts that exceed the limit.  Right now it is first found, first kept.

 

When I say perfectly, the left shoulder seems wrong, but that is not the new vertex shader's fault.  If I morph a shape key before the animation though, the face moves far away from the body - but not in the un-fixed Acclaim version (the initial translation was another fix).  I will look into saving out a mesh not in pose mode.  That should still allow you to have an animation in the export itself at index 0.  Translation is OK in the animation, but the root's translation will be converted to a position change.

post-8492-0-48160700-1449275939.png

Will look into .fbx.  Any other format is better than Acclaim for getting off the HD using a chooser, since the Acclaim format consists of two files, not one.  I do not have my own Kinect or Motionbuilder, but this could be a benefit later.  Putting your animation in via another system and then exporting is such a terrible workflow.  I am going to put a stop to it for the QueuedInterpolator extension.


If your morphing is translating, it is always a local center problem.  If you haven't seen this before, then make certain all of your morph target centers are aligned before setting any shape keys, both locally and in world space - and it appears you're already setting a default morph target keyframe at frame 0 or 1 - I prefer 0.  You obviously understand that the .bvh file is a hierarchy animation file and not simply a file containing transformations.  It has always been too problematic to deal with these, but if you look at the ascii .fbx format, it's much simpler to edit if you need to (which I haven't had to do for many years, as it is compatible with every program available these days.)  I believe that .fbx is the gold standard to use for all scene elements prior to Blender import and .babylon export.


Hi guys!  If I may, I will chime in on this very interesting subject.  First of all, I think that using the Kinect v2 for mocap is way more complex than using the v1 (less literature, fewer programs using it, very heavy on resources, etc.).  I have used both, and I must say that for motion capture accuracy (for my work) and avateering, there is not a lot of difference.  I am using the Kinect v1 because I am working with high school students, it is way cheaper, and in a nutshell it fits my bill.

This year I have worked a lot with the Kinect v1 and had awesome results with a little freeware called MikuMikuCapture.  It captures motion to .bvh, which works perfectly with the famous MakeWalk load-and-retarget addon from MakeHuman that is used with Blender.  So in less than a minute I have a very good motion capture running in Blender on almost any biped using the MakeWalk addon.

Now I am facing a real dilemma.  When I export my animated character to Babylon.js, it takes 10 minutes to be converted to a .babylon file.  But when I use Blend4Web for the same task, I have an HTML file of my animation at the push of a button (less than 2 seconds) and I am in WebGL land.  So I can have a homemade mocap animation in less than 2 minutes...  But Babylon.js is fun to work with.  If you want to try motion capture and avateering with an .fbx file, I strongly suggest you take a look at the free version of Live Animation Studio; this little software works well with .fbx.

Maybe I am a little bit old school, running on a very small budget, efficiency, and timeframe, but after all my extensive work in this field: MikuMikuCapture with the Kinect v1 + MakeWalk (load and retarget in Blender) + Blend4Web is a good solution for me.  But if somebody is capable of doing all three of those steps in Babylon.js, I am all in to be part of it, testing it and using it...

 

Thanx,

 

benoit


MikuMikuCapture is a fine pipeline for the V1 - but it doesn't work with the V2, unless something has changed recently.  However, for capture, you are very right in pointing out that there is little difference in quality.  I use my V2 for scanning at higher resolution, and it's much faster, so I use Brekel Body V2 as a plugin to Motionbuilder.  But for those people on the V1, the pipeline you laid out works fine, as I believe we discussed earlier this year.  As for the BVH file format, it is being corrected in your pipeline to orient the transforms correctly.  I only recommend the Motionbuilder pipeline because there are so many tools that I only need to work in a single piece of software - except for export from software that supports the .babylon format.  However, there are standalone converters to do the job as well.  So I'm able to capture using the V2 and retarget in real time onto fully rendered characters - as well as puppeteer facial and other morph targets in real time.

 

Thanks for pointing this out, as both pipelines work depending on what version of the Kinect you are using and what else you might need to support in your production.


Yeah, you're right!  My workflow only uses the Kinect v1, because in a high school environment we cannot use the v2 due to the price, and because you need a computer with USB 3 and at least 16 GB of RAM!  I must say that for personal use I sometimes use the awesome Brekel software.  But as a teacher and designer, I am very surprised that the coder ninjas in JavaScript/WebGL aren't really into motion capture and avateering with the Kinect 1 or 2.  I would much prefer to do everything with WebGL, but for right now I think not everything can be done directly in WebGL... so we have to use some offline technology.


Thank you all for the answers!!  :D  :D

But I'm working on a specific project and I cannot use background programs for real-time animation; the Kinect server for frame acquisition is already an important compromise... sigh

I have to move my avatar using only the information contained in the Kinect frames.


@Polpatch - Sorry for hijacking your thread, but I thought you were done with it.

 

@DB - You are right, I think, but your explanation was a little mathy and does not closely resemble the BJS pipeline.  If I may, I will describe how I am achieving morphing while also using a skeleton, so it will be more recognizable for most.

 

Positions / normals / UVs / etc. should be expressed relative to the mesh's local origin.  They are loaded into both a Float32Array on the CPU and a GPU buffer.  The vertex shader is passed the current matrix of each bone on every call.  On each call of the vertex shader, a vertex gets its defined starting position from the buffer and applies the influencers it has for itself.  This happens whether it is animating or not, so it is fixed overhead.

 

When morphing, I am changing the Float32Array and refreshing the buffer.  This happens before either GPU or CPU skinning, so I am morphing with locally and globally centered input to the skinning pipeline.
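
In Babylon terms the refresh is just pushing the modified Float32Array back to the position buffer, something along these lines (a simplified sketch, not my actual morphing code; the names are illustrative):

// Sketch: apply morph deltas to the CPU-side positions, then refresh the GPU buffer.
// basePositions is the unmodified Float32Array; deltas & weight come from the shape key.
function applyMorph(mesh, basePositions, deltas, weight) {
    var positions = new Float32Array(basePositions); // start from the rest shape
    for (var i = 0; i < deltas.length; i++) {
        positions[i] += deltas[i] * weight;
    }
    // Refresh the vertex buffer; skinning (gpu or cpu) then runs on the morphed data.
    mesh.updateVerticesData(BABYLON.VertexBuffer.PositionKind, positions);
}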

 

What happened when I used MakeWalk to load a .bvh was that it changed the vertices with the translation before export, so the base vertex data is corrupted.  See how the origin is now way in front of the mesh; it was right underneath it before.  Not sure if I can fix this in the exporter.  I guess: just do not do that.

post-8492-0-45140800-1449337163.png

 

@DB & benoit - I am just wondering whether it is possible to use a Kinect V1 with the Motionbuilder workflow, saving as an ascii .fbx?  I do not have USB 3, and do not need it for anything except capture.  Hope they still sell the V1.  It would be nice to have the workflow be as independent of hardware as possible.

 

As far as the export taking too long, or Motionbuilder not exporting a .babylon: that is what I am talking about when I say the workflow is bad.  I am building tools which bring parts of the BJS development process into BJS, and pull them out of the export process.  So you make your mesh, assign its materials, skeleton, and shape keys, then export it.  You are done there.

 

The voicesync or mocap app will either define your Arpabet+ strings & audio, or read your .fbx file and write out something that can be directly used in BJS, without going back to the export system.  The mocap tool will be something like MakeWalk.  MakeWalk only works with one input, a .bvh.  I am in the process of deciding what my input format(s) will be.  Both will have a stock character so you can see it actually operating in BJS as you develop.  It may not be as good as MakeWalk in the first release, but you have to start somewhere.


@JC and @benoit-1832: You are both right from my experience in your approach and understanding.  Except JCPalmer appears to have skills beyond most, and we're all glad he's working out these issues for the community.  As for the Brekel Motionbuilder pipeline, Kinect V1 works fine - especially if you only require 30fps for animation - which is all that is required right now anyway.  Surface scanning using Kscan or another scanning program is much better with the V2, however, I don't know anyone else on this forum who requires this.

 

As for .bvh files, it's just important to know that this is an almost 30-year-old format, where Biovision made choices for compatibility in the late 1980s.  But it was changed by the programmers at Giant Studios, before I worked with them on films, to make the .bvh compatible with the Cartesian space we work in.

 

As for mesh morphing, it is all about centers and initial registered transforms, as JCPalmer makes clear, and in my opinion the .bvh format has not been adapted well for vertex transforms - except again by the developers at Giant Studios, which they won't share unless you know and work with them.  But for now, this is what we have - which is why I hope more people begin to look at .fbx, as this format was designed by Kaydara from the very beginning to replace the IGES format as a global standard.  If you want to look at an .fbx file, there are plenty available to review - and of course the binary .fbx is not readable as editable text.

 

I look forward to what you guys develop for the community in the coming months.


.fbx also holds a lot more than just skeletal animation, I see.  Also, I do not know what the FBX exporter in the repo is.  Could that be plugged into Motionbuilder?  I have not got a clue.  If it is a standalone executable, it would seem it is a Windows program, maybe to work with an SDK.  Wish it were a TypeScript parser I could steal.

 

I am only concerned with the skeleton part of the file.

 

BTW, might not another difference between V1 & V2 be the # of bones tracked per mesh?  What skeleton definition is output?  Is there any .fbx known to be specifically made from a V2 stream somewhere?


Hey All,

 

I would also like to ask @Polpatch (who created this topic) if he believes these current discussions are relevant to his initial post.  I believe they are, but wanted to make sure we've answered his questions first - so @Polpatch, please let us know if this has been helpful, and if you still have issues which remain to be solved or understood.

 

As for myself, I feel a broader discussion on this topic is good for the community - and it's providing me with a lot to consider, as well as probably the next series of posts on the pinned topic "Spaces (world, parent, pivot, local)", which was initially created by gwenael and is still read by a great # of users and visitors.  However, it has not been updated since July :( , and this current topic on Kinect v2 animation provides me with quite a lot of info to post - which is most likely good knowledge to reveal to the BJS community, and keeps the info as a resource for everyone.

 

So, having said that -

 

@JCPalmer - It sounds as though you've opened an ascii .fbx file in a text editor, which if true, you can see that it holds an entire scene's info, including animation play controls and settings - and anything else you would find in practically any 3D software package.  It was developed by Andre Gauthier and his team at Kaydara as a new file format designed to (and it now does) fully support most 3D software production applications, allowing the import/export of practically all scene elements between applications.  On a very basic level, you are able to export an .fbx file from Blender, and then open that .fbx file containing the scene in most every other software application supporting the .fbx file format - and the initial Blender scene will function just as it did in Blender, including most surfaces and complex objects, image textures, shaders (dependent), cameras, lights, skeletons, animations, vertex-to-bone binding - basically everything (for the most part).

 

Now, if you're inquiring about the FbxExporter.exe on the GitHub repo from the BabylonFBXNative project, it IS a standalone command-line .exe.  But I don't imagine this is what you are referring to, as it is open source, and you'd potentially be able to use whatever elements, functions, operations, etc. you might choose from it.  I have some ideas as to what you might use this for; however, it's not entirely clear to me what your ultimate goal might be here.  But you have my complete attention at this time, so perhaps sharing your thoughts and/or plans would be quite welcome - as I've seen what you are capable of, and perhaps you're just getting started.  In some of these areas, I'm certain I could be of assistance. ;)

 

 

As for the Kinect V1 versus the Kinect V2, here's a list of the key differences between the two models:

 

 

 

QUICK REFERENCE: KINECT 1 VS KINECT 2
Posted on March 5, 2014 by James Ashley

Feature                          Kinect for Windows 1      Kinect for Windows 2
Color Camera                     640 x 480 @ 30 fps        1920 x 1080 @ 30 fps
Depth Camera                     320 x 240                 512 x 424
Max Depth Distance               ~4.5 m                    8 m
Min Depth Distance               40 cm (near mode)         50 cm
Depth Horizontal Field of View   57 degrees                70 degrees
Depth Vertical Field of View     43 degrees                60 degrees
Tilt Motor                       yes                       no
Skeleton Joints Defined          20 joints                 25 joints
Full Skeletons Tracked           2                         6
USB Standard                     2.0                       3.0
Supported OS                     Win 7, Win 8              Win 8
Price                            $249                      $199
 
Additional Information:

 

“The Kinect v2 face recognition, motion tracking, and resolution are much more precise than the Kinect v1. Kinect v2 uses “time of flight” technology to determine the features and motion of certain objects. IGN summarized this technology well by comparing it to sonar technology, except that this is a large improvement and more accurate. By using this technology, the Kinect v2 can see just as well in a completely dark room as in a well lit room. Although the first Kinect used similar technology, the Kinect v2 has greatly improved upon it.  The Kinect v2 has 1080 resolution (HD), and from the picture below you can see the difference between images.

Kinect-1-and-Kinect-2-Resolution-Compari

Kinect v2 can process 2 gigabytes of data per second, USB 3 provides almost 10x faster broadband for the data transfer, 60% wider field of vision, and can detect and track 20 joints from 6 people’s bodies including thumbs. In comparison, the Kinect v1 could only track 20 joints from 2 people. On top of this, when using Kinect v2 we are capable of detecting heart rates, facial expressions and weights on limbs, along with much more extremely valuable biometric data. The Kinect v1.0 device doesn’t have the fidelity to individually track fingers and stretching and shrinking with hands and arms but the Kinect v2 has these capabilities. It’s clear that this technology is certainly much, much, more powerful and complex than the first generation of Kinect.”

 

 

I had to reformat this stinkin' chart twice - many, many minutes of my life just gone. Copy/paste text simply blows. :angry:

 

Again, I hope we hear from @Polpatch soon.  But the above info is good for everyone reading this, as I see more and more users of the Kinect enter into the BJS pipeline weekly.

 

Cheers,

 

DB


Hi Polpatch,

 

I was re-reading your posts to try and understand what your goal actually is.  It appears that you want to simply animate your avatar using the Kinect in real time, as the avatar mesh is rendered in real time.  If I'm not correct in my assumptions, please let me know.  But if I am, then obviously your biggest problem is that this is nowhere near simple to accomplish.

 

What I don't understand is the following...

 

on 03 December 2015, Polpatch wrote:

 

I plan to use the Kinect v2 (the one for Xbox One) to move an avatar in my scene.

My idea was to take the quaternions from the jointOrientations and then update the appropriate bone matrices.

However, the jointOrientations are expressed as global rotations (each jointOrientation gives the direction of the joint in absolute coordinates), while, if I understand correctly, I can only modify the local matrix of a skeleton bone.

So I am trying to convert the global jointOrientation into a local rotation:

var joint;    // the Kinect joint
var parent = joint.Parent();
var localOrientation = BABYLON.Quaternion.Inverse(parent.Orientation).multiply(joint.Orientation);

But I'm having trouble with the transformation of the reference coordinates between the Kinect joints and the avatar bones in the Babylon scene...

I tried to change the axes by swapping the (x, y, z) values, but I'm probably doing it wrong.

 

It would be very helpful to understand all of the elements you have in your scene at this time.  Are you currently reading each bone center's orientation (x,y,z) from the Kinect as it runs in real time and attempting to map this onto a separate bone (retargeting)?  If so, there are far too many issues with this to even list in this thread.  Without more code available to review, as well as a list of scene elements and a more thorough description, it is practically impossible for me to suggest where you might begin to run some tests to identify your problems - which I would have to assume are many at this time.

 

I have been working with motion capture on high-profile productions for more than 20 years, and am sure I have faced exactly what you are up against now.  Do you have much experience in motion capture, and if so, what systems are you proficient with and/or have used in the past?  This is important, as each system has its own methods of acquisition and retargeting onto a secondary skeleton.  If you have little or no experience in mocap, then there may be layers of problems you are currently facing.  So knowing all of the above is the best way for me to personally assist.

 

But I'll at least try to provide some suggestions until we have more information.  Have you been able to log - for example with console.log() - the (x,y,z) positions and rotations of a single Kinect bone's center at a decreased fps, such as 5 or 10 fps?  If it were me, I would begin by writing a function to do this first, to compare these values against the values reported by the Kinect SDK.  This should tell you that at least the transforms for that single bone are read correctly into the BJS scene in real time.
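
Something as simple as this would do for that first logging step (a sketch; getKinectJoint() and the field names are placeholders for however your socket code exposes the frame data):

// Sketch: log one joint's position and orientation at roughly 5 fps instead of 30.
var lastLog = 0;
scene.registerBeforeRender(function () {
    var now = Date.now();
    if (now - lastLog < 200) { return; }          // ~5 fps
    lastLog = now;
    var joint = getKinectJoint("ElbowRight");     // hypothetical accessor into your frame data
    console.log(joint.cameraX, joint.cameraY, joint.cameraZ,
                joint.orientationX, joint.orientationY, joint.orientationZ, joint.orientationW);
});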

 

If the values appear the same, then I would probably create a cube in BJS and constrain the cube's local center position to the single Kinect bone's (x,y,z) position.  But matching the orientation of the cube to the exact orientation of the Kinect bone may or may not be easy.  However, I would first just try to animate the cube's rotation using a version of the previous function which attenuates the cube's update rate from 30 fps to 5 fps - and simply try to rotate the cube (animate its rotation in real time) using the (x,y,z) rotations from the single bone of the Kinect skeleton.  Also, you need to set a distinct texture on the cube so that you will clearly be able to see how the cube is rotating, as well as any undesired behaviors (transforms).
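
The cube test is no more involved than this (again only a sketch with the same placeholder accessor):

// Sketch: drive a textured cube from a single Kinect joint, attenuated to ~5 fps.
var cube = BABYLON.Mesh.CreateBox("jointCube", 1, scene);
var mat = new BABYLON.StandardMaterial("jointCubeMat", scene);
mat.diffuseTexture = new BABYLON.Texture("checker.png", scene);  // any distinct texture
cube.material = mat;
cube.rotationQuaternion = BABYLON.Quaternion.Identity();

var lastUpdate = 0;
scene.registerBeforeRender(function () {
    var now = Date.now();
    if (now - lastUpdate < 200) { return; }       // attenuate 30 fps -> ~5 fps
    lastUpdate = now;
    var joint = getKinectJoint("ElbowRight");     // hypothetical accessor
    cube.position.copyFromFloats(joint.cameraX, joint.cameraY, joint.cameraZ);
    cube.rotationQuaternion.copyFromFloats(joint.orientationX, joint.orientationY,
                                           joint.orientationZ, joint.orientationW);
});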

 

I'm guessing that you are already way past this point and know what numeric values are being read into your BJS scene currently.  I'm guessing this because you appear to have already seen a delta in your bone rotations, since it appears from your brief sample code that you were attempting to re-order your axis matrices (x,y,z) on your bones in BJS already.  But I can't assume anything unless I have a whole lot more info.  So I'm advising as though you just began to attempt streaming your mocap data into your BJS scene, and have little or no experience using mocap at all.  If we were working in the same room together, and providing you are already streaming the Kinect data into a Babylon.js scene, I'm confident that we could solve this in a few hours at most, since this was much more difficult to accomplish when I first faced this problem in the mid 90s.  It should be a walk in the park today.

 

So as I'm writing this, again, I'll guess that you do have some experience in mocap, you know the (x,y,z) values of the data streaming from the Kinect into your BJS scene, you have already imported a copy of your Kinect skeleton into your Babylon scene, and, using some function, have attempted to directly map the Kinect data onto the copy of the skeleton - otherwise, why would you want to re-order your bones' matrices?

 

So, I've given you both extremes - one in which you are a most ambitious person attempting this with little experience; and the other extreme, where you already have an exact copy of your Kinect skeleton rendering in a Babylon scene and are viewing an incorrect re-targeting of the Kinect data in real time.  I'm guessing the latter; however, I still can assume nothing.  If you are already driving a skeleton with your Kinect V2 in real time in BJS, then it may be as simple as freezing all transforms on your Babylon skeleton and applying offsets (probably not that simple, but not that difficult either).  So I and others can certainly help you figure this out either way; however, I would need considerably more information on all fronts to assist.  If I were to receive the info requested, and you do have the experience I believe you might have, then you will be the first to accomplish this, and it shouldn't be a huge effort to do so.  I hope I'm right.

 

Cheers,

 

DB


@JCPalmer no problem ;) ahahah

Hi Dbawel,

First of all, I want to thank you for the great help you are giving me.
I have almost no experience; it is only in these last months that I have been dipping into (or drowning in, hahahah) this area.

I use a local server to send Kinect frames to my project via a socket.
For each frame, I store the original information of each joint in a dedicated object. This object provides the mapping between the Kinect joints and the Babylon skeleton bones, the hierarchy of the Kinect skeleton, and the rotation/position of the joints (I use only the position of the spineBase joint for the global position).

The "pseudo-code" you mentioned has the task of getting the local rotation to apply to the bone of my skeleton, since I cannot apply global changes to bones in Babylon.js (or at least I have not found a way).

I tried to study the behavior of a joint orientation in order to change the reference coordinates and then use the data directly in the bone (with the appropriate corrections for the offset between the two skeletons). But for every bone I am obliged to further modify the axes, since the rotations did not correspond to the rotations of the Kinect joint.
I have now found the right combination to match the right forearms of the two skeletons, but over the weekend I was not able to work on finding the combinations for the other bones. I will try this afternoon  :D
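
The mapping I am maintaining for this looks roughly like the following (just a sketch; the bone names and the corrections are examples only):

// Sketch: per-joint mapping from Kinect joints to Babylon bones, plus the per-bone
// axis correction I am still hunting for. Entries here are examples only.
var jointToBone = {
    ElbowRight: {
        boneName: "forearm.R",                    // name in the Babylon skeleton
        correct: function (q) {                   // found experimentally for this bone
            return new BABYLON.Quaternion(q.x, -q.y, -q.z, q.w);
        }
    }
    // ... one entry per tracked joint, each possibly with its own correction
};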

@db, I just happened to see the FBX exporter problem thread.  I had no idea what it did, and saw it was not in TypeScript (meh), but thought you might use it, if you did not know about it.

 

I have not actually seen a text-based FBX; I just inferred it was more than a mocap file based on Wikipedia and some unofficial Blender documentation on the FBX format.  I am seeing all binary .fbx files, at least for the CMU skeleton.  Now that you mention Blender can write a .fbx, I thought I could use it to get a .fbx from the .bvh to test with, but Blender generates a binary .fbx.

 

I was already doing a bone name translation from CMU names to MakeHuman's for Acclaim.  I think anything more than a rename is going to be too much to do.  Remember, I am not just running this using scene.animate().  There are too many places for problems to occur not to have a 1:1 bone match-up right now.  A binary .fbx is also doable, but not really a good choice until the process is debugged with a text format.

 

I will think about it as I start putting together a release of the Blender exporter.  I have many changes there, and want to finish something.  I already have multiple classes (Mocap separate from the input classes).  I think .bvh could be parsed in about 8 hours.  That could be the easiest way to get this over the hump.

 

The Kinect data is interesting; the V2 costs less, not more.  My concern here is not cutting myself off now from something I am likely to want to do in the future.


@JCPalmer - I'm actually writing this bit last, as I NEVER expected to get into the explanation I delivered below. So as my memory is blown now, I'll read your post again tomorrow - but I did want to tell you that I can write out an ascii .fbx file for you if you like. I have Motionbuilder, so it is no problem for me to do so. Let me know.

 

 

 

@Polpatch - OK, your last response tells me a lot; but there are still many things to understand.

 

I use a local server to send Kinect frames to my project via a socket.
For each frame, I store the original information of each joint in a dedicated object. This object provides the mapping between the Kinect joints and the Babylon skeleton bones, the hierarchy of the Kinect skeleton, and the rotation/position of the joints (I use only the position of the spineBase joint for the global position).

 

1. I'd love to see the code using "socket" (WebsocketIO - I assume) which stores the joint info into an object - and what type of object is this? A null center?

2. Where did you generate the target skeleton from - Blender, Max, Maya, another program? Does your target skeleton contain the same # of bones and the same topography as the Kinect skeleton?

 

I have many more questions; however, I can tell you now that you will have many, many problems to overcome in retargeting mocap data in the way you are trying to do at this time. I can also say that you are very brave and bold to attempt this with little experience in motion capture. The approach to setting up reliable retargeting will be entirely in your pipeline to prepare your target skeleton to receive real-time animation from a source - which is as much about animation as it is about the initial transforms of the bones and both skeletons' hierarchies.

 

In order to minimize all of the problems you'll definitely encounter, for your first tests your target skeleton must be an exact copy of the Kinect skeleton in all attributes - hierarchy, bone topography, bone position, bone orientation (rotations), and bone scaling (which must be a value of 1.0 on all axes.) This is why I asked how/where you generated your target skeleton. As I had asked what the object is in which you store the Kinect data, I would ask someone like DeltaKosh if it's possible to generate bones in BJS from nothing, as this would be the very best of solutions. I read a post in the past where DK appeared to say that this was possible:

http://www.html5gamedevs.com/topic/16326-how-to-create-skeleton-and-bone-with-out-using-blender/

 

However, I have yet to find a way to do this myself. But if it is possible, then it might be valuable to the process. However, in the long run, the end result must be that you are able to drive a skeleton with a mesh weighted to the bones; and that is where the challenge truly is. So right now, I would only focus on driving an exact copy of the Kinect skeleton in Babylon and make certain that the process is completely reliable prior to attempting any retargeting to a skeleton with a mesh attached and/or a different skeleton. One step at a time, right?

When you have this completely debugged, then look at offsetting orientation, position, and topography - all three of which are going to be huge hurdles to achieve.

 

Also, before there ever existed any retargeting functions or applications (20 years ago - I'm really dating myself now), I would accomplish this through parenting and constraints. I recommend using this process yourself, as it will provide you with the very best results using what is currently available in BJS.

 

First, we should talk about the "Neutral Pose." In order for any of this to work (at least without a whole bunch of other functions,) you must have already set your target skeleton at a unique stationary stance or pose.  This is normally with the target skeleton facing -Z in world space as the Kinect skeleton should be. But just make certain that the target skeleton is facing the very same direction in world space as the Kinect skeleton is in world space. Now, it's important that when you create or export your target skeleton, that it is in a specific body pose. This would be with the body completely standing straight, both legs straight and together - but not touching, as the feet should be approximately 6 - 8 inches apart and facing straight from toe to heel in the same direction as the body (of course.) Then the arms must be straight out from the sides at a 90 degree angle and the palms of the hands also flush to the ground and straight out from the arms. This is the Neutral Pose. It is key to making the entire system work for you, and is still used in every motion capture session today. Don't be confused by the Kinect not requiring this, as Microsoft uses algorithms based upon human physiology to avoid making the user do this. But your users must do this - at least for now.

 

Next step - create boxes (cubes) at the center of each bone of your target skeleton. Give these a consistent name so that you know what they are - such as "source_jointname", where jointname is the name of the target bone. Copy each one of these boxes, and name these "target_jointname", making certain that these target boxes are in the exact position of the source boxes which sit on the target skeleton. Then you must change (set) the rotation of each of the target boxes to match the rotation of each corresponding joint (where each of these boxes was created).
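
In BJS, creating those box pairs could look something like the following (only a sketch under my assumptions; skinnedMesh is the mesh bound to your target skeleton, and the box size is arbitrary):

// Sketch: create a "source_" and "target_" box at each bone of the target skeleton.
// The target box also copies the bone's current (neutral pose) rotation.
var sourceBoxes = {}, targetBoxes = {};
skeleton.bones.forEach(function (bone) {
    var worldMatrix = bone.getAbsoluteTransform().multiply(skinnedMesh.getWorldMatrix());
    var scale = new BABYLON.Vector3(), rotation = new BABYLON.Quaternion(), position = new BABYLON.Vector3();
    worldMatrix.decompose(scale, rotation, position);

    var src = BABYLON.Mesh.CreateBox("source_" + bone.name, 0.05, scene);
    src.position.copyFrom(position);

    var tgt = BABYLON.Mesh.CreateBox("target_" + bone.name, 0.05, scene);
    tgt.position.copyFrom(position);
    tgt.rotationQuaternion = rotation;

    sourceBoxes[bone.name] = src;
    targetBoxes[bone.name] = tgt;
});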

 

So, now you have a target skeleton in a neutral pose with two boxes at each bone's joint center - and the "target" boxes have an (x,y,z) rotation value matching each corresponding joint's rotation exactly.

Stay with me now.

The next step is to capture a single frame of your source skeleton - or, with the Kinect, you could record this ahead of time, as the Kinect will be consistent per user, but not universally. You must use a single frame from the source skeleton recorded for each user, so it would be best to build this into a function using the Kinect SDK to capture a single frame, which must then be used to set up the next step in this process to correctly retarget mocap data on the most basic level. Of course, when this single frame is captured, the person must be in the "Neutral Pose", just as the target skeleton is set.

 

Once you have the single frame of data from the Kinect which represents your neutral pose from the source skeleton, it must be accessible in your running Babylon scene. What must be done now is to change the rotation of every "source" box to the (x,y,z) rotation value from the source skeleton at each corresponding bone (joint.) Now you have two sets of boxes: both sets at the position of each corresponding joint of the target skeleton, with the rotation value of each "source" box the same as the rotation value of each corresponding joint on the source skeleton, and the rotation value of each "target" box the same as the rotation value of each corresponding joint on the target skeleton. We're almost there.

 

Now before we can begin driving the target skeleton with the data from the Kinect, you must parent and constrain objects in the scene exactly in the following manner:

1. Make each "target" box a child of the "source" box at its location. And make certain that you don't place these steps anywhere else in the process. It is the parenting that will now very accurately maintain the correct offset in rotations between your source skeleton and target skeleton - and without any further computations. However, since these boxes are children, their position and rotation values can later be changed to provide additional local offsets for the data driving each bone at the source boxes. This will make more sense once you completely understand this entire process.

2. Constrain both the position and orientation of each bone in the target skeleton to the corresponding "target" box at each bone's position. As each bone has the exact same position and rotation values as its corresponding target box, there should be no change in transforms on the target skeleton whatsoever. If you see that there is, then you must have made a mistake at some point in the process.

3. Now, in the next step (really many steps), you need to have a complete understanding of this entire process to avoid making a mistake and finding complete failure. The goal is to drive the position and rotation of each source box by the data from each corresponding joint/bone in the Kinect skeleton. For this system to work, you must create a hierarchy within your "source" boxes which EXACTLY matches the hierarchy of your source skeleton. And we must also assume that the hierarchy of your target skeleton already matches the hierarchy of your source skeleton as well - which needs to be the case in any successful retargeting of data using any application. The hierarchy of your target skeleton needs to have already been set in an external application - as we'll assume the skeleton was exported and converted into a .babylon file. So be careful to make sure you know the parenting in your source (Kinect) skeleton very well before you begin setting up this entire process.

So for your "source" boxes, as common sense dictates, wherever the root parent is in the Kinect skeleton, you must now create an additional box which we will name "source_root". This "source_root" box needs to be the top parent of all of the source boxes, and this "source_root" box will only be used for translation, not rotation. I haven't looked at the Kinect skeleton in some time, but if its root parent is at the base of the spine - or on the pelvis - set the source_root box position to be exactly wherever the root of your source skeleton is located in world space. Then parent all other source boxes to their corresponding parent as represented in the Kinect skeleton.

 

So with this set up, you should have everything in place to drive your retargeted skeleton. Send the root position data from the object which holds the Kinect root transforms to the "source_root" box, and only the position data. For the rest of the skeleton, you will only send rotation data from each object which holds the transforms for each corresponding bone in the Kinect skeleton to its corresponding "source" box - and not any positional data. Using this setup allows a "zero" offset between the source boxes and the target boxes, which means that the data is automatically offset without you having to make any adjustments. This also solves flipping issues in your target skeleton, which is always a huge issue and normally practically impossible to solve, because the bones in your target skeleton don't reach rotational values >179 or < -179, providing you freeze all transforms on your target skeleton before you export your skeleton to a .babylon file. Always freeze your transforms on all matrices for any skeleton you export - for mocap or any animation.
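
As a rough illustration of that per-frame flow (a sketch only; BJS has no built-in constraint system, so the "constraint" is simply copying the target box's transform back onto the bone every frame, and boneToJoint, sourceRoot, and updateBoneFromWorldMatrix() are placeholders for your own mapping, root box, and write-back helper):

// Sketch: drive the source boxes from the incoming Kinect frame, then let the bones
// follow their (parented, already offset) target boxes.
function onKinectFrame(frame) {
    // Root: position only.
    var root = frame.joints["SpineBase"];                   // hypothetical lookup
    sourceRoot.position.copyFromFloats(root.cameraX, root.cameraY, root.cameraZ);

    // Every other joint: rotation only.
    skeleton.bones.forEach(function (bone) {
        var joint = frame.joints[boneToJoint[bone.name]];   // your own name mapping
        if (!joint) { return; }
        sourceBoxes[bone.name].rotationQuaternion =
            new BABYLON.Quaternion(joint.orientationX, joint.orientationY,
                                   joint.orientationZ, joint.orientationW);
    });
}

// "Constraint" step, run every render: copy each target box's transform onto its bone.
scene.registerBeforeRender(function () {
    skeleton.bones.forEach(function (bone) {
        var tgt = targetBoxes[bone.name];
        tgt.computeWorldMatrix(true);
        updateBoneFromWorldMatrix(bone, tgt.getWorldMatrix()); // hypothetical write-back helper
    });
});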

An additional benefit is that you will also be able to set additional offsets to correct any undesired behaviors in your target skeleton by rotating and/or translating your target cubes into a different position or orientation from their "zero offset" initial transforms - and this can be done in real time, dynamically, and conditionally if necessary or desired - although do this with great caution, and in very small amounts.

 

I still suggest you begin incrementally, and make certain that you can drive a target skeleton exported straight from the Kinect with the exact transforms of the Kinect skeleton (it's not that straightforward, but I hope you follow what I'm suggesting.)  And I certainly don't suggest that you build this as a first attempt, as it will take time to fully understand this setup and why it works - as well as all of the benefits of using this method. This is very similar to what any retargeting software on the market today is doing. It's simply that the setup and the functions driving it are hidden from the user and not really accessible. Of course, once you're familiar with the process, all of these steps can be scripted, so the process can be fast and optimized considerably.

 

I certainly didn't expect to get into this in this post, as it will take a great deal of thought on everyone's part to really comprehend all of the relationships at work in the process I've described above - but consider that I had to figure this out when no one had yet heard of motion capture, and there were no tools to even make use of the data. So for those of you who actually read this and can understand the process, I hope it will also provide greater insight into other aspects of animation and center axis transforms. And next time you use Motionbuilder, you'll know what's going on under the hood. It was first named Filmbox, and I bought the very first license prior to this, when it was expensive stage lighting control software - and Kaydara added objects so that we could control objects with channel animation - so I hope you get the idea.

 

I think that's enough.

 

Cheers,

 

DB

 

 

OK, I thought I might add that the only way to really grasp this without driving yourself mad is to diagram it out BEFORE you attempt putting this process into place. There is never a replacement for pencil and paper.  ;)

 

 

 

 

 


@JCPalmer - Great news! I can't believe that I didn't know about this tool, but Autodesk has a free tool called FBX Converter, and I believe that v2013.3 is the latest version. I downloaded and installed it, and realized there are many nice features in this application. It has a Windows-based UI to convert several file formats to .fbx files - both binary and ascii. It also allows you to convert binary .fbx files to ascii .fbx files and in reverse - so this will let you easily convert to ascii .fbx and edit. The main limitation is that it supports few other file formats; however, there are many converters available - so the real value here is that you can use the FBX Converter to produce an ascii .fbx to edit or use as text, and then convert back to binary, as Blender and other applications primarily support only the binary .fbx format.

The reason for this is that the binary format is very quick to parse and load, while the ascii format carries so much information (the FBX format supports practically EVERY aspect of a scene) that the files are large and take longer to load.

 

However, in addition there are other great features such as a viewer, where you can quickly view your fully rendered scene with shaders and textures turned on - or turn on and off practically every aspect of shading and model, as well as control elements such as skeletons. The viewer also accepts all channel inputs in the scene such as controllers and any device supported by Motionbuilder - which are many. This allows you to puppeteer in real time without the need to launch the overhead of Motionbuilder on your system. And again, it's free.

 

So as I've mentioned, open an ascii .fbx file (the converter installs with several sample scenes) and you'll see how vast the support for scene elements is - as well as how easy it is to read and edit. Editing FBX files in past productions has saved me countless hours of work, as well as letting me script repetitive edits - and it has allowed me to do things I could not do in my 3D software application itself. Well worth checking out.


Yea, looks good.  I am done for the day here on the east coast.  In my efforts to get the next version of the Blender exporters ready for production, this was out of my consciousness.  I usually know shortly what to do after working on something else for a while (power nap, when really tight for time).  I figured out I should be able to find at least one .bvh using the CMU skeleton with no translation baked into the vertices:

  1. I can load it with MakeWalk,
  2. then delete the meshes,
  3. and export to binary .fbx.

I was then going to ask you if you could convert it to ascii.  Looks like I can do it myself.  Thanks.  I have both Mac & Linux machines; one should work.


Thanks for all of your work on the exporter. The whole community, including me, uses it almost every day. I'm with you - my brain fries at some point each day - I don't know how you put so much effort into this after other work (I assume).

 

I hope you find some good info and usage from the .fbx format - definitely worth looking at, since it holds so much scene info and is compatible across so many platforms.


Hi @dbawel and thank you for all the time you are devoting to me!
Regarding your questions:
 
1. Yes, the Kinect server (GitHub: https://github.com/wouterverweirder/kinect2) does indeed use WebSocketIO.
The server does not expose all the features of the native Kinect SDK (e.g. the joint tracking state is missing), but I can easily make changes.
 
The server sends the following frame types:
Kinect2.FrameType = {
    none                          : 0,
    infrared                      : 0x2,    // Not Implemented Yet
    longExposureInfrared          : 0x4,    // Not Implemented Yet
    depth                         : 0x8,
    bodyIndex                     : 0x10,   // Not Implemented Yet
    body                          : 0x20,
    audio                         : 0x40,   // Not Implemented Yet
    bodyIndexColor                : 0x80,
    bodyIndexDepth                : 0x10,   // Same as BodyIndex
    bodyIndexInfrared             : 0x100,  // Not Implemented Yet
    bodyIndexLongExposureInfrared : 0x200,  // Not Implemented Yet
    rawDepth                      : 0x400,
    depthColor                    : 0x800
};
 
In particular, the structure of the body frame type (and all its components) is:
// bodyFrame
bodies: Array[6];
floorClipPlane: Quaternion;

// body object
bodyIndex: int;
joints: Array[25];
leftHandState: int;
rightHandState: int;
tracked: bool;
trackingId: int;

// joint object
cameraX: double;
cameraY: double;
cameraZ: double;
colorX: double;
colorY: double;
depthX: double;
depthY: double;
orientationX: double;
orientationY: double;
orientationZ: double;
orientationW: double;
Currently I just store (orientationX, orientationY, orientationZ, orientationW) for each joint.
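
Roughly like this (a sketch of my reading loop; it assumes the joints array is indexed by the SDK's JointType values, with SpineBase = 0):

// Sketch: on each body frame from the kinect2 server, keep only the orientation
// quaternion of every joint, plus SpineBase's camera-space position for the root.
var jointOrientations = {};
var rootPosition = new BABYLON.Vector3();

function onBodyFrame(bodyFrame) {
    bodyFrame.bodies.forEach(function (body) {
        if (!body.tracked) { return; }
        body.joints.forEach(function (joint, jointType) {
            jointOrientations[jointType] = new BABYLON.Quaternion(
                joint.orientationX, joint.orientationY,
                joint.orientationZ, joint.orientationW);
        });
        var spineBase = body.joints[0];                     // SpineBase, assuming JointType indexing
        rootPosition.copyFromFloats(spineBase.cameraX, spineBase.cameraY, spineBase.cameraZ);
    });
}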
 
2. The avatar I use is native to Blender (http://blog.machinimatrix.org/avatar-workbench/), and the initial pose (luckily) coincides with the one you described.
avatar_BJS.png
 
I think I understand the process you explained in outline; now I have to study it thoroughly.
Just one thing,

 

 

Next step - create boxes (cubes) at the center of each bone of your target skeleton. 

Taking the forearm as an example: by "at the center of each bone" do you mean that the cube should be placed between the elbow and the wrist, or exactly at the corresponding joint (in the Kinect's case, the wrist)?

 
In my ignorance I thought it was enough to read the orientation of the Kinect joint and apply it (by brute force, hahaha) to the corresponding bone, with the necessary corrections to the rotation.
I will undoubtedly follow your advice.
 
Thank you so much, I will keep you updated in the coming days!! I owe you a  lot of coffee :D
