lancsDavid

generating syllable timing from audio track


dear anyone,

I want to create a track of timing data that corresponds to the spoken syllables in an audio track. I.e. for a given audio track (of speech) I'd like to create a track with one marker per syllable, so 'video' (for instance) would get markers matching the Vid-Ee-O sounds. It's for creating kinetic typography.

Firstly, if anyone knows of an audio utility (or GitHub project) that does something like this, that would be great. If not, I might have to make my own. I was hoping I could make an HTML5 app that does it, roughly as follows:

>> Get the app to display the waveform data on the screen and play through it, ideally with a checkbox for changing the playback speed (to slow it down).

>> Underneath, have another 'track' that, while the audio track above is playing, creates a marker every time the user presses a key (playback otherwise keeps going). Hopefully with the ability to slide these markers around later (à la editing), but I can probably figure that bit out myself.
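The marker track described above could be sketched roughly like this. It is only a minimal sketch: `Marker` and `MarkerTrack` are made-up names for illustration, and the drawing/waveform side is left out entirely.

```typescript
// A marker is just a time offset (in seconds) into the audio track.
type Marker = { time: number };

class MarkerTrack {
  private markers: Marker[] = [];

  // Record a marker at the given playback time, keeping the list sorted.
  add(time: number): Marker {
    const marker: Marker = { time };
    this.markers.push(marker);
    this.markers.sort((a, b) => a.time - b.time);
    return marker;
  }

  // Slide an existing marker to a new time (the later "editing" step).
  // Times are clamped so a marker cannot go before the start of the audio.
  move(marker: Marker, newTime: number): void {
    marker.time = Math.max(0, newTime);
    this.markers.sort((a, b) => a.time - b.time);
  }

  // Marker times in ascending order, e.g. for export or drawing.
  times(): number[] {
    return this.markers.map((m) => m.time);
  }
}
```

In the browser, this would presumably be wired to a `keydown` listener that calls `track.add(audio.currentTime)` on an `<audio>` element, with `audio.playbackRate = 0.5` (or similar) for the slow-playback option.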

the attached image is a rough idea of how it might look in the browser (blue = marker, red = playhead)

If anyone knows whether this is easy to do with the Web Audio API, or in any other way, any suggestions are very welcome.

david

 

sample_app_playhead.jpg


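I'd suggest doing it offline: run the audio files through a specialist tool that extracts timing data into a cue sheet. The one below, for example, is a lip-sync tool that outputs timed mouth-shape cues, which is not quite syllables but close: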

https://github.com/DanielSWolf/rhubarb-lip-sync

Then process that sheet as your audio plays, synchronising the audio and visuals as necessary. Main benefits: 1) a higher level of audio analysis (using other people's expertise), including phonetics, 2) not bogging down the browser at runtime with what would otherwise be constant analysis.
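As a rough sketch of the "process that sheet as your audio plays" step: the code below assumes the cue sheet is plain text with one `time<TAB>label` pair per line. That layout is an assumption for illustration, so check what your tool actually exports. `parseCueSheet` and `activeCue` are hypothetical helper names.

```typescript
type Cue = { time: number; label: string };

// Parse an assumed "seconds<TAB>label" cue sheet, skipping malformed lines.
function parseCueSheet(text: string): Cue[] {
  const cues: Cue[] = [];
  for (const line of text.split(/\r?\n/)) {
    const fields = line.trim().split("\t");
    if (fields.length < 2 || fields[0] === "") continue;
    const time = Number(fields[0]);
    if (Number.isNaN(time)) continue;
    cues.push({ time, label: fields[1] });
  }
  return cues.sort((a, b) => a.time - b.time);
}

// Binary search for the last cue that starts at or before `now` (seconds).
function activeCue(cues: Cue[], now: number): Cue | undefined {
  let lo = 0;
  let hi = cues.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (cues[mid].time <= now) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found >= 0 ? cues[found] : undefined;
}
```

At runtime you would call something like `activeCue(cues, audio.currentTime)` once per animation frame and update the visuals whenever the returned cue changes.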

Or, if you're looking to do things dynamically, then a rudimentary peak analysis might suffice for syllables (e.g. average the amplitude over 100 ms windows and assume a new syllable whenever the value jumps by 75% or more from one window to the next)?
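Something along these lines might do for the rudimentary version. It's a sketch only: it uses the RMS level per window as the "amplitude", and the `floor` parameter (to ignore near-silence) is an extra assumption on top of the 100 ms / +75% rule of thumb. `detectOnsets` is a hypothetical name.

```typescript
// Returns the start times (in seconds) of windows whose RMS level is at
// least (1 + riseRatio) times the previous window's level.
function detectOnsets(
  samples: ArrayLike<number>,
  sampleRate: number,
  windowSeconds = 0.1, // 100 ms windows
  riseRatio = 0.75,    // +75% jump counts as a new syllable
  floor = 0.01         // assumed noise floor; tune for your recordings
): number[] {
  const windowSize = Math.max(1, Math.round(sampleRate * windowSeconds));
  const onsets: number[] = [];
  let previous = 0;
  for (let start = 0; start + windowSize <= samples.length; start += windowSize) {
    // RMS level of this window.
    let sumSquares = 0;
    for (let i = start; i < start + windowSize; i++) {
      sumSquares += samples[i] * samples[i];
    }
    const level = Math.sqrt(sumSquares / windowSize);
    if (level > floor && level >= previous * (1 + riseRatio)) {
      onsets.push(start / sampleRate);
    }
    previous = level;
  }
  return onsets;
}
```

In the browser the samples could come from `AudioBuffer.getChannelData(0)` after decoding the file with `AudioContext.decodeAudioData`. Expect false hits and misses on real speech; the thresholds will need tuning by ear.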
