Acoustic Models for HUD

Ted Gould ted at
Mon Feb 25 17:13:34 UTC 2013


As some folks may have noticed, we're working on a voice input feature in
HUD.  Part of what that requires is acoustic models being available to
understand the incoming speech.  Currently in Ubuntu there are a couple
of these, but we need to get to the point of providing them for various
languages and having a way to update them continuously as the data gets
better.

So that leads to the question: How do we want these to look in Ubuntu?

The best open source of training data appears to be Voxforge, a
collection of samples based on known text.  These samples can then be
used to compile the acoustic models that the various libraries need.
Compiling takes a significant amount of CPU time.  Voxforge's most
complete language is English, which has about 100 hours of audio and
takes about 10 CPU hours to compile into the models that Sphinx needs.
While English is the most complete, I think it's important to realize
that the best/worst case scenario, one that supports all languages well,
could easily result in over a thousand hours of CPU time.
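
To make that estimate concrete, here is a rough back-of-the-envelope
sketch.  It assumes compile cost scales linearly with the amount of
audio, which is only a guess extrapolated from the English figures
above; the language count and per-language hours are hypothetical
placeholders, not Voxforge numbers.

```python
# Figures quoted above: ~100 hours of English audio, ~10 CPU hours to compile.
ENGLISH_AUDIO_HOURS = 100
ENGLISH_CPU_HOURS = 10


def estimate_cpu_hours(audio_hours_by_language):
    """Estimate total CPU hours to compile acoustic models.

    ASSUMPTION: cost scales linearly with hours of audio, using the
    English ratio as the only data point.
    """
    return sum(
        hours * ENGLISH_CPU_HOURS / ENGLISH_AUDIO_HOURS
        for hours in audio_hours_by_language.values()
    )


# Hypothetical scenario: 100 languages, each as complete as English is today.
hypothetical = {f"lang{i:03d}": ENGLISH_AUDIO_HOURS for i in range(100)}
print(estimate_cpu_hours(hypothetical))
```

Under those assumptions the total reaches the thousand-hour mark, which
is the scale of rebuild cost I'm worried about.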

So if we think of things in the classic source vs. binary split, it
seems like the Voxforge data is the source and we should make a source
package that then builds these binary models.  But, at some level, we're
just exchanging binary data (sound files) for different binary files
(acoustic models).  Would it make more sense to package something like
the Voxforge nightly builds for use in Ubuntu?

I'd love to hear people's thoughts on this.  I'm leaning towards putting
the Voxforge data as a source package, as it is our source, but I'm
worried about the impact it may have on rebuilding the archive.



More information about the ubuntu-devel mailing list