Acoustic Models for HUD

Siegfried-Angel Gevatter Pujals siegfried at
Mon Feb 25 18:13:43 UTC 2013

Hi Ted,

It's great to hear that voice recognition in Ubuntu is finally getting some
love :).

The English Voxforge models are currently packaged in julius-voxforge.
I went with the nightly builds there since, in addition to the build time
and disk size (which IMHO is already enough of a reason), building them
needs HTK, which is not redistributable. I'd also be interested in more
opinions though.

Out of curiosity, what's the plan for voice recognition in Ubuntu?



On Monday, 25 February 2013, Ted Gould wrote:

> Howdy,
> As some folks may have noticed we're working on a voice input feature in
> HUD.  Part of what that requires is acoustic models to be available to
> understand the speech coming in.  Currently in Ubuntu there are a couple of
> these, but we need to get to the point of providing for various languages
> and having a way to update these continuously as the data gets better.
> So that leads to the question: How do we want these to look in Ubuntu?
> The best open source for training data appears to be Voxforge,
> a collection of samples based on known text.  These samples can then be
> used to compile the acoustical model that the various libraries need.  This
> takes significant amounts of CPU time.  Their most complete language is
> English, which has about 100 hours of audio, and takes about 10 CPU hours
> to compile the models that Sphinx needs.  While English is the most
> complete, I think it's important to realize that the best/worst case
> scenario that supports all languages well could result in easily over a
> thousand hours of CPU time.
> So if we think of things in the classic source vs. binary split, it seems
> like the Voxforge data is the source and we should make a source package
> that then builds these binary models.  But, at some level, we're just
> exchanging binary data (sound files) for different binary files (acoustic
> models).  Would it make more sense to package something like the Voxforge
> nightly builds for use in Ubuntu?
> I'd love to hear people's thoughts on this.  I'm leaning towards putting
> the Voxforge data as a source package, as it is our source, but I'm worried
> about the impact it may have on rebuilding the archive.
> Thanks,
> Ted

More information about the ubuntu-devel mailing list