JCPalmer

Talkies, finally! (Improved)


The QueueInterpolation animation system has long had the potential to do speech using shape keys.  The difference between potential & actual ended up being a couple of years, though.  The "Talk" button on the Automaton QA scene now says a half dozen sentences.

On 9/5/2017 at 11:44 PM, jerome said:

so funny ! :D

Ah, I think I am going to need to credit Robin Williams for the "I'm melting.  I'm melting. / Clean up, Aisle 4." sequence.  I was trying to come up with a way to exercise different expressions.  Completely random stuff is not as good as some kind of a theme across sentences.  Those 2 popped into my head.  I cannot prove he said it, but it fits.

Thanks, @hunts. I like negative cheeks high.  I have actually updated LAUGH & HAPPY (not published yet) to use this.  They really look great now.  I could add grumpy as a stock expression after a clean up (no tongue), or may just review my ANGRY to see if it might be improved, influenced by your settings.

@Jim U, glad this got you to register.

I have spent another day on this, and got some real improvements (will adjust topic title when pushed up).  A couple of visemes were fine tuned.  The thing that really improved is being able to talk fast, and not have it be just some violent chopping.  I found a way to not always deform for every viseme.  Having fewer deforms in itself makes it smoother.  The thing is to know what can be discarded.  The Arpabet database I converted to Javascript has vowel stresses of 1 - primary, 2 - secondary, and none for all of its 10k words.  I now discard vowels with no stress.  I am now able to say "get me the hell out of here" without it being slow, overly enunciated, & wooden.
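The discard rule can be sketched roughly like this.  It is only an illustration, assuming phonemes are stored CMUdict-style as strings where a vowel carries a trailing stress digit (`1` primary, `2` secondary) and an unstressed vowel carries `0` or no digit.  The name `dropUnstressedVowels` and the data layout are hypothetical, not the actual QueueInterpolation code.

```javascript
// Arpabet vowel symbols (the usual 15-vowel set).
const ARPABET_VOWELS = new Set([
  "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
  "EY", "IH", "IY", "OW", "OY", "UH", "UW",
]);

// Hypothetical helper: keep consonants and stressed vowels, drop vowels
// with no stress, so fewer visemes have to be deformed.
function dropUnstressedVowels(phonemes) {
  return phonemes.filter((phoneme) => {
    const match = /^([A-Z]+)([0-2])?$/.exec(phoneme);
    if (!match) return true;                        // unknown token: leave it alone
    const [, symbol, stress] = match;
    if (!ARPABET_VOWELS.has(symbol)) return true;   // consonants are always kept
    return stress === "1" || stress === "2";        // only stressed vowels survive
  });
}

// Illustrative CMUdict-style transcription of "hello".
console.log(dropUnstressedVowels(["HH", "AH0", "L", "OW1"]));
```

A real implementation would likely need exceptions, for example when dropping a vowel would leave a word with no vowel at all.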

Guess I am trapped in a "continuous improvement cycle".  Just one more day.


Ok, the talking is now set.  I could run before & now side by side, and it looks much more lifelike.

In addition to the speech itself, I improved some of the expressions & added GRUMPY.  Also, there needs to be a talking version of each expression, see dropdown.  These are computer built from the normal one.  Before, I just removed all the MOUTH deforms.  Now I am retaining more.  While MOUTH_OPEN is fully removed, other mouth deforms are just reduced by 50%.  This gives much more expressive talking.

About half the sentences had a vowel or 2 discarded from before, and they are all less tortured for it.  Some of the beginning sentences were re-recorded going much faster.
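Deriving the talking version of an expression could look something like the sketch below.  The expression format (a plain map of deform name to amount), the assumption that mouth deforms can be recognised by a `MOUTH` name prefix, and every deform name other than `MOUTH_OPEN` are hypothetical, made up for illustration.

```javascript
// Hypothetical helper: build the talking variant of a normal expression.
// MOUTH_OPEN is dropped entirely (speech controls how open the mouth is),
// other mouth deforms are halved, everything else is copied unchanged.
function toTalkingExpression(expression) {
  const talking = {};
  for (const [name, amount] of Object.entries(expression)) {
    if (name === "MOUTH_OPEN") continue;
    talking[name] = name.startsWith("MOUTH") ? amount * 0.5 : amount;
  }
  return talking;
}

// Made-up expression values, purely for demonstration.
const happy = { CHEEKS_HIGH: 0.8, MOUTH_OPEN: 0.4, MOUTH_CORNERS_UP: 1.0 };
console.log(toTalkingExpression(happy));
```

Building the talking variants by rule means each stock expression only has to be tuned once.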

There are some problems with the characters, but I am going to call the talking done.


OMG this is totally AWESOME ! I've been looking around for this feature.
It sells for $100 on the Unity store ! https://www.assetstore.unity3d.com/en/#!/content/3021

I'm so gonna look into it ! 
Hope it's not too complicated ...

I'd be glad if you had the time/intention to write a guide for noobs ^_^

Also very curious on how you tweaked the audio frequencies to interpret facial movement !

This is great !


That Unity add-on you reference is based on bones.  That kind of a skeleton in WebGL is not advised.  Not sure if you were going to buy or were just referencing it, but exporting the output to BJS could have issues, especially on iOS.  I achieve this without bones, using morphing.

The early workflow from MakeHuman to Blender to export does have a readme.md linked in the References dropdown of the Automaton Test Scene.  FYI, many additional parts are currently required in both MakeHuman & Blender, but the extra stuff for MakeHuman will be included in the normal install with the next release (soon I think).

I actually do not directly take into account the audio track to determine either the deforms to perform or when to do them.  The tool I use to write the animation sequence is not going to be public at this time.  I am not sure where it is going, but once out I cannot get it back.  FYI, all the expressions and visemes are public, so a casual dev could maybe do something by hand.


Thx for your answer !

I've seen your documentation after posting here so I see you did share your method ! Thank you very much for that !
So you mean you animated the lipsync by 'hand' ??? That's a lot of work tweaking and fine-tuning !

My first intention was to create a kind of "next gen music video" with BABYLON.js, where a 3D character would sing the song.
One solution would be to use a videoTexture of a face/mouth singing but that's really not ideal because file size would be huge and the result aesthetically doubtful.
Then the morph target feature was released and I thought : that + audio analyser would be the way to go for automatic lipsync !
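As a very rough illustration of that idea (and only of that idea), the sketch below turns analyser output into a single "mouth open" influence.  It assumes a Web Audio `AnalyserNode` and a Babylon.js morph target are already set up elsewhere.  The function name `mouthOpenFromSpectrum` and the `gain` parameter are made up.  This only flaps the mouth with loudness; it does not detect visemes, so it is nowhere near real lipsync.

```javascript
// Map analyser output (one byte, 0..255, per frequency bin) to a 0..1 influence.
function mouthOpenFromSpectrum(bins, gain = 2) {
  if (bins.length === 0) return 0;
  let sum = 0;
  for (const value of bins) sum += value;
  const average = sum / bins.length / 255;  // normalise to 0..1
  return Math.min(1, average * gain);       // boost quiet audio, clamp at 1
}

// Browser usage sketch (analyser: AnalyserNode, mouthOpen: BABYLON.MorphTarget,
// scene: BABYLON.Scene, all assumed to exist already):
//   const bins = new Uint8Array(analyser.frequencyBinCount);
//   scene.registerBeforeRender(() => {
//     analyser.getByteFrequencyData(bins);
//     mouthOpen.influence = mouthOpenFromSpectrum(bins);
//   });

console.log(mouthOpenFromSpectrum(new Uint8Array([0, 128, 255, 64])));
```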

But that's just the theory and it would take a scientist/engineer to achieve what I want :-/ I'm just a noob using Babylon end-user methods.

Your scene offers a glimpse of endless possibilities ! Love it !

(In the meantime, a "next gen music video" already exists now: see a video rendition of an Oculus clip; the embedded video is not preserved here.)


BTW, for any demo that involves MakeHuman, I usually make a cross post for them.  Rarely does this lead to discussion, but in this case, someone pointed me to a C-based repo that does speech generation.  I ended up coding a 28-line HTML file to test how consistent speech rate was cross-browser for the Web Speech API.  Not very, unfortunately.  In case anyone cares.
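For anyone wanting to repeat that kind of check, a minimal sketch follows.  It is not the 28-line file mentioned above, just one way to time an utterance with the Web Speech API.  The names `wordsPerMinute` and `measureSpeech` are made up, and the measuring function only works in a browser that supports speech synthesis.

```javascript
// Pure helper: words per minute for a sentence that took elapsedMs to speak.
function wordsPerMinute(text, elapsedMs) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return (words * 60000) / elapsedMs;
}

// Browser only: speak the text and resolve with the measured duration in ms.
// Defined here but not called, since it needs window.speechSynthesis.
function measureSpeech(text, rate = 1) {
  return new Promise((resolve, reject) => {
    const utterance = new SpeechSynthesisUtterance(text);
    utterance.rate = rate;
    let startedAt = 0;
    utterance.onstart = () => { startedAt = performance.now(); };
    utterance.onend = () => resolve(performance.now() - startedAt);
    utterance.onerror = (event) => reject(new Error(event.error));
    window.speechSynthesis.speak(utterance);
  });
}

// Usage in a page:
//   const text = "get me the hell out of here";
//   measureSpeech(text).then((ms) => console.log(ms, wordsPerMinute(text, ms)));
```

Running the same sentence at the same `rate` in several browsers and comparing the durations shows how far they drift apart.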

