Voice recognition software
braddock
ulist at gs1.ubuntuforums.org
Fri May 5 03:46:15 UTC 2006
Brian Astill Wrote:
> On Thu, 13 Apr 2006 04:16 am, da5id wrote:
> > IBM sunk a lot of money into speech recognition only to end up
> > as an also-ran. -- Gene
>
> It's frustrating. IBM has continued some development of speech
> recognition, but only for large commercial use and that isn't
> operating yet, anyway, SFAIK.
>
IBM remains the leader in the field of speech recognition. Why don't
you see an IBM SR product? Because that is not where the money is
coming from.
I used to work IT at Hopkins in the lab of Dr. Jelinek (one of the
founders of the field); most of his grad students get snatched up by IBM's
speech recog lab. The Center for Language and Speech Processing at
Hopkins is one of the best funded labs in the Hopkins Engineering
School.
There are three things to keep in mind to understand the situation.
1: The funding for almost all SR work comes from US intelligence, just
as most machine translation research funding did (and still does). Any
non-government commercial applications are mere afterthoughts. This is
why IBM continues an intense research program with no (visible)
product.
2: Speech recognition is HARD. World-class speaker-independent
recognition error rates, even with unlimited processing time, are still
around 30% (and if you come up with a technique that improves on that by
half a percent, you publish).
3: What is important to the government customers driving the technology
is not necessarily what is important for commercial accessibility
applications. NaturallySpeaking, for example, uses a trained,
limited-vocabulary model with high-quality mics in a controlled environment to
get, what, 95%+ accuracy.
But those are not "real world" conditions for intelligence
applications, which are more concerned with things like improved
speaker-independent (no training) keyword extraction and speaker
identification on telephone-quality links (one of the most standard
datasets used is called "Switchboard"
<http://ucsu.colorado.edu/~francish/swbd.html>,
to give you some idea).
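To make the error-rate figures in point 2 concrete, here is a minimal sketch of how a recognition error rate is commonly scored. I am assuming "error rate" means word error rate (substitutions + deletions + insertions, divided by the number of reference words); the post itself does not name the metric. The function name and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: whitespace tokenization and exact string match per word.
    Real scoring tools normalize case, punctuation, and so on first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word dropped (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: 10 reference words, 2 substitutions + 1 deletion.
reference = "call the main office before noon on friday this week"
hypothesis = "call the mane office before noon friday this weak"
print(word_error_rate(reference, hypothesis))  # 0.3
```

By this measure a 30% error rate means roughly three words in ten come out wrong, and "95%+ accuracy" corresponds to an error rate under 5%.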
Yes, it is unfortunate, and more than a little spooky. And I would
agree with Eric that $10 million for development is a low, high-risk
estimate. The good news is that most of the research and a fair amount
of code is openly available and could be leveraged, although I don't
want to think about what the patent situation might be.
--
braddock