Speech recognition

Billie Walsh bilwalsh at swbell.net
Fri Jul 4 18:57:40 BST 2008

Nigel Henry wrote:
> On Thursday 03 July 2008 16:44, Derek Broughton wrote:
>> Dotan Cohen wrote:
>>> 2008/7/3 David Fletcher <kubuntu-users at thefletchers.net>:
>>>> I suspect this is a long shot:-
>>>> Does anybody know of any speech recognition software that can be used
>>>> with Kubuntu?
>>>> I have an acquaintance who has never owned a computer, but wants to use
>>>> one to write a book. He can be nudged in the direction of using Kubuntu
>>>> and LaTeX I think, but he says he's not got time to type it. He wants to
>>>> dictate it directly into the text editor.
>>>> Does anybody know of any applications that can do this please? I've
>>>> tried all the search strings I can think of in the Adept installer, and
>>>> Google has turned up nothing for me.
>>> He wants a secretary, not a computer. Speech recognition is used for
>>> sending preconfigured commands to the computer, not for dictating
>>> arbitrary text.
>> With current state of the art, it _should_ be possible to dictate
>> reasonably arbitrary text, but I really can't see how anybody who can't
>> take the time to _write_ a book, will ever get a book written.
>>> For instance, while your friend might dictate one
>>> character discussing a new display with another character, the reader
>>> might be surprised to see the characters discussing a nudist play.
>> Right.  Just like OCR, however accurate it is, you'll spend a lot of time
>> fixing the transcription errors.
>> --
>> derek
> I somehow think that we're still a long way from getting accurate text from speech. 
> It works well on Star Trek, where you can have a conversation with the 
> computer, and Capt. Picard can submit his log. I remember one Star Trek film, 
> where they went back in time, and Scotty tried to talk to an ancient 
> computer, then realised that you had to type your request on a keyboard "of 
> all things".
> Text to speech works ok, but there you have a synthesizer (or is that 
> synthesiser), which changes the text to speech. Words like "there", "their", 
> and "they're", all sound the same, but have different meanings, but from the 
> listener's viewpoint, the context tells you which is which.
> Going the other way, speech to text, it's a whole different ballgame. Looking 
> at the example above, and assuming that everyone spoke exactly the same way 
> (no problems with different dialects), the computer would still need to 
> understand the context of what was being dictated, so as to print "there", 
> "their", or "they're". Of course when you bring different dialects into the 
> equation, it gets really complex.
> The differences I've found, are usually with the pronunciation of vowels, 
> which can very often, at first, make it difficult to understand what someone 
> is saying, but you sort of get tuned in after a while, but we are human, and 
> not computers.
> In the UK there are some strong dialects, Geordie, Glaswegian (in Scotland), 
> and many others. Looking at 2 examples from Wales, and Northern Ireland, the 
> word tongue in Wales is pronounced as tong, and the word film in Northern 
> Ireland is pronounced as filim, and the name of the actor who plays Chief 
> O'Brien in Star Trek, whose first name is Colm, is pronounced Colim.
> At this point in time, I personally see a problem in computers converting 
> speech to text.
> I recently listened to a broadcast on the BBC's world service "Digital 
> Planet", and Amtrak in the US seem to be using speech communication to a 
> computer to get info for train times, etc.
> I recently had a problem with a parcel not being delivered in France, and 
> contacting Chronopost by telephone, was asked to speak my parcel reference No 
> into the machine. On the premise that you ask, so I do, I spoke each letter, 
> and number into the phone. Nothing. Then I'm asked to repeat the parcel 
> reference, which I do, but still nothing. To be fair, I'm English, and 
> perhaps the computer has some problem with my pronunciation. Now I appreciate 
> that this was direct communication by speech with another machine, but I 
> believe that accurate speech to text is going to take quite some time to 
> achieve.
> Just some observations, and comments.
> Nigel.
Please excuse the lack of <snip>'s. I just couldn't decide where to 
<snip> to reply.

A few years ago I watched a demo of speech recognition software. I don't 
remember which one. There was a very nice looking young lady with a 
headset mic on, speaking to the audience and the computer behind her. 
About 99.9% of the time the computer got every word perfectly. When it 
made a mistake she simply told the software to go back and corrected it. 
I was thoroughly impressed.

On the flip side, she spent weeks training the software on 
how she spoke, and learning how to work the software for the demo. For 
her it would work with virtually no hiccups. If someone else took the 
mic about all they would get is garbage. Every word had to be spoken in 
just a certain way by one person's voice. Plus it was a planned script 
that she was using. On the surface it looked fantastic. But when you 
started digging into the nuts and bolts it sort of fell apart.

Life is what happens while you're busy making other plans.
