Voice Recognition for Linux
Marvin Raaijmakers
marvin.nospam at gmail.com
Tue Feb 20 13:12:30 GMT 2007
Well I have a little experience with Sphinx2. A few years ago I played a
bit with perlbox voice (http://www.perlbox.org/). This application uses
sphinx2 for launching applications by voice commands. That worked quite
well, but it isn't the thing you're looking for (= using voice
recognition for writing texts). But maybe such an application can be
built using sphinx.
- Marvin Raaijmakers
On Tue, 2007-02-20 at 12:38 +0000, Chris Hayes wrote:
> Thanks for the feedback, Eric. Is it really this hopeless? You talked
> about the Sphinx projects being okay - but not ready for normal users.
> To what extent are they capable? I'd really love to know if you or
> anyone else has tried them.
>
> I have looked into them but, not being a very capable technical user,
> haven't had the time to get them going, or to get them going nicely.
> If I knew how well they worked, I'd probably be more inclined to use
> the time I don't have getting them working.
>
> Chris Hayes
>
>
> On 19/02/07, Eric S. Johansson <esj at harvee.org> wrote:
> Chris Hayes wrote:
> > Hi - I was wondering whether anyone here might know about what voice
> > recognition software is currently available for Linux.
>
> (warning, I am an unrepentant curmudgeon and negative filter. Interpret
> the following accordingly. If I'm wrong on any points, and someone
> wants to correct me, I will gladly learn.)
>
> In a nutshell, not much. With Sphinx 4, and others of its family, you
> have some fairly decent recognition systems. However, they are not
> ready for prime time because if they were, people would be using them
> for desktop recognition. While the recognition engines may work well, a
> lot of the ancillary pieces such as training, dealing with microphone
> switching, dictionary management etc. are not quite there yet. On the
> other hand, the same shortcomings can be laid at the feet of Linux and
> Windows audio subsystems.
>
> From my perspective, the only usable speech recognition for end users
> is NaturallySpeaking. There may be something on a Macintosh but I don't
> have any experience there. The reason I say NaturallySpeaking is the
> only usable one is because it's a large vocabulary continuous speech
> recognition system that people use to get work done. The recognition
> engine, language model, sound system interface, etc. have had many
> years to evolve. Nuance has had a couple of years to screw it up and
> they've done a wonderful job at it. I think the only positive
> contribution they have made during their stewardship of the product is
> the addition of a Bluetooth microphone audio model.
>
> The only way to get good speech recognition on Linux is for someone to
> drop a small number of millions of dollars into Nuance's lap and pray.
> Not a good solution.
>
> I've been thinking about an alternative model for a couple of years in
> between other projects but I do believe the best solution (best
> defined as getting handicapped people working) would be to make use of
> Windows and Linux via virtual machines. Since virtual machines do
> horrible things to sound systems, I would recommend using Windows as a
> host OS with speech recognition, a mediator to transfer
> characters/commands/keystrokes to the Linux environment, and a mediator
> to return window state information such as screen content, application
> running, etc.
>
> There has been a primitive instance (which has since been taken off
> the net) to show the technique is fundamentally sound. A full-function
> mediator, while difficult, is a couple of orders of magnitude or more
> easier to build than moving a large and complicated Windows application
> to Linux.
>
> In the short term, run Linux on a virtual machine, display apps via an
> X11 server, and use something like natpython and one of its macro
> packages to build commands for Linux applications. nattext will still
> bite you in the ass with all the random characters and inserts in
> applications but that's Nuance's contribution.
>
> ---eric
>
> --
> Speech-recognition in use. It makes mistakes, I correct some.
>
More information about the Ubuntu-accessibility mailing list