Back to Windows...

Eric S. Johansson esj at harvee.org
Sat Oct 28 18:02:28 UTC 2006


Andy wrote:
> On 28/10/06, Eric S. Johansson <esj at harvee.org> wrote:
>>  If you do careful analysis, you will find there are
>> missing words, in proper tense, and things that sound right when you
>> read them out loud but aren't the right words (see in proper tense).
> Luckily the human brain is smart, it can correct things as I read them.
> I have seen much much worse on a mailing list and I wouldn't have
> guessed it was done with speech recognition.

:-)  try correcting your own writings when you have seen the same 
mistake *lots* of times.  especially when writing for a customer.

>> I also have something in the ballpark of 6-7 thousand lines of Python
>> code that I have written using speech recognition over the past five
>> years.  That's just my open source project work and no, I don't code for
>> a living otherwise I would not be able to speak anymore.
> Wow, I couldn't imagine writing code through speech recognition, is it
> adapted for programming, i.e. does saying 'if a equals b' generate 'if
> (a == b):', or do you need to say all the individual symbols each
> time?
> It must be extremely difficult, the fact that programming is so picky
> with a single punctuation character being wrong causing a compiler
> error.

that is why python.  spacing is not as critical and most names in the 
libraries are lowercase.

I have some adaptations:

  ==   sounds like "predicate equals" in my world
  ()   is matched parens
  (^)  is between parens (^ is the cursor position)
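
a rough sketch of how adaptations like these could be modelled in python. this is not my actual macro setup; the table, the spoken phrase names and the function names (MACROS, expand, insert_at) are made up for illustration, and it only assumes that ^ marks where the cursor should land:

```python
# Hypothetical voice-macro table: spoken phrase -> text template.
# A "^" in a template marks where the cursor should end up.
# The phrase names are assumptions based on the descriptions above.
MACROS = {
    "predicate equals": "==",
    "matched parens": "()",
    "between parens": "(^)",
}


def expand(phrase):
    """Return (text, cursor_offset) for a spoken phrase.

    cursor_offset is the index inside text where the cursor should sit.
    Templates without a "^" marker leave the cursor at the end.
    """
    template = MACROS[phrase]
    marker = template.find("^")
    if marker == -1:
        return template, len(template)
    # Drop the marker and remember where it was.
    text = template[:marker] + template[marker + 1:]
    return text, marker


def insert_at(buffer, cursor, phrase):
    """Insert the expansion of phrase into buffer at cursor.

    Returns the new buffer and the new cursor position.
    """
    text, offset = expand(phrase)
    return buffer[:cursor] + text + buffer[cursor:], cursor + offset
```

with this, saying "between parens" right after typing print would give print() with the cursor sitting between the parens, ready for the argument.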

the programming by voice project has a much better system that I need to 
try out.


> Unfortunately I don't know python but when I get time I really
> should learn it.
> 
>> But in the case of speech recognition,
>> expect to drop something on the order of $10 mil - $20 mil and five to
>> 10 years assuming you can navigate the patent law minefield for speech
>> recognition.
> Ah software patents, at least there are places without them.
> 
> 
>> I believe that path is creating a bridge
>> application so that a speech recognition engine running on Windows can
>> communicate with a gnome environment.

> Running windows in the background to handle this sounds like overkill,

yes it is but overkill is frequently the shortest path.  make the system 
work first, then optimize.

> wouldn't it require some kind of virtual machine technology? and

yes.  that is what I do here.  I have 6-7 virtual machines I use for 
different tasks.  not all at once mind you :-)

> running the whole of windows for that one task is a bit extreme.
> Maybe some kind of emulation for the specific application windows uses
> for speech recognition would be better, extracting that code might
> have huge legal problems though.

wine is about 6-18 months away from the most basic support.  i.e. 
dictation only to NaturallySpeaking's editor.  major problems are:

1) installer
2) installer
3) linux audio support
4) linux audio support
5) linux audio support
6) can't dictate into other applications
7) linux audio support
8) linux audio support

  and did I mention linux audio support?

all we need is working usb audio (both directions) and bluetooth (I'm 
not giving up my wireless headset without a fight)

so for the next 18 months at least, windows has the edge wrt audio and 
ease of running speech recognition.  I and others are willing to pay the 
price of running windows as the speech enabled host os and linux as the 
guest.  it is a model that works today without speech recognition.

> I know absolutely nothing about speech recognition, I am therefore not
> the right person to provide that solution. I am still a student and
> therefore still learning, maybe once I graduate I will know enough to
> be of some productive use.

you don't need to know anything about speech reco to help.  that is what 
people like me are for.  we need coders, people with hands to fix the 
problems.  you will learn by doing.

> Not got too much free time atm. Working on a University group project,
> (spookily some of the things we considered involved speech
> recognition, but we were warned against this, the chips we had were
> old and the atmosphere they were going to be used in would degrade
> performance yet more).

modern svsi systems handle noise better.  after all, they want to be 
able to recognize your CC number over a bad cell connection.

> Ah money, unfortunately that's what most things come down to.
> Maybe the government should do something about this?
> They've got the money to actually make this happen.
> If its made platform neutral it could help a lot of people worldwide,
> ah but the government prefers to spend money on useless projects that
> help no one

we are probably better off seeking grants. ossri (website is down atm 
and I need to fix it) is getting nonprofit status so we can collect tax 
deductible donations and act as a clearing house for OSS speech related 
code.
> 
> I also have another problem with speech recognition, my microphone
> won't work under Ubuntu,

hmm. I think I said something about linux audio support above

--- eric





More information about the ubuntu-users mailing list