eSpeak in Norwegian, part 1
Henrik Nilsen Omma
henrik at ubuntu.com
Wed Dec 6 18:37:35 GMT 2006
Hi all,
Jonathan kindly generated some basic Norwegian voice files for eSpeak so
I could start testing and giving feedback. He and I have exchanged a few
emails about these files, but I'll take it to the open list now so that
others can follow the process of figuring out how to do this.
While working on the Norwegian voice I'm also trying to work out a more
streamlined procedure for doing this so that it will be easier for
others to contribute to other languages later.
First, I think it's good to have a standard reference text for each
language. I've selected the Wikipedia article on language
(http://en.wikipedia.org/wiki/Language), which is itself available in
many languages. It's far from identical in all languages, but it only
needs to be an internal reference for each language. Ideally when a text
file is made from that page it should be frozen so that it will be a
reliable reference for discussion.
Next I cleaned up the text a bit, removing some mark-up, the table of
contents and bullet points (it even helps to check the spelling in the
original text ...). I then split the article up into individual files,
one for each paragraph, and stored them in my home directory in the
following structure:
~/espeak/text/no/no-lang01.txt, no-lang02.txt, etc.
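For anyone who wants to automate the splitting step, here is a rough
sketch in Python. It is not the script from my tarball; the function
names are made up for illustration, and it assumes that paragraphs in
the cleaned-up text are separated by blank lines and that the files
should follow the <lang>-langNN.txt naming shown above.

```python
import re
from pathlib import Path


def split_paragraphs(text):
    """Return the non-empty paragraphs of text, one string per paragraph.

    Assumption: paragraphs are separated by one or more blank lines.
    Line breaks inside a paragraph are collapsed to single spaces.
    """
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n"))
    return [" ".join(block.split()) for block in blocks if block.strip()]


def write_paragraph_files(text, out_dir, lang="no"):
    """Write each paragraph to <out_dir>/<lang>-langNN.txt and return the paths."""
    out = Path(out_dir).expanduser()
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, paragraph in enumerate(split_paragraphs(text), start=1):
        path = out / f"{lang}-lang{number:02d}.txt"
        path.write_text(paragraph + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_paragraph_files(article_text, "~/espeak/text/no") would
then produce no-lang01.txt, no-lang02.txt and so on, where article_text
is the frozen reference text read from disk.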
I then generated .wav files of each text file using the initial voice
files. I first made files at the standard speed of -s160, but found that
slower files were easier to analyse and settled on -s100. It may be
different for other languages or listeners though (and most people will
set a much higher speed when actually using the voices).
While I was at it I did the same for Spanish, Polish and Swedish (it's
good to have some competition among neighbours!). The .wav files are
rather large so I ran 'oggenc *' to compress them to ogg.
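The .wav generation can be scripted along these lines. This is only a
sketch, not the script in my tarball: the helper names are invented, and
the voice name "no" is an assumption about what the Norwegian voice file
is called in your espeak-data. It relies on the espeak command line
options -v (voice), -s (speed), -f (input text file) and -w (output
.wav file).

```python
import subprocess
from pathlib import Path


def espeak_command(text_file, voice="no", speed=100):
    """Build the espeak command that renders text_file to a .wav beside it.

    Assumption: 'voice' matches a voice name known to the installed espeak.
    """
    text_file = Path(text_file)
    wav_file = text_file.with_suffix(".wav")
    return [
        "espeak",
        "-v", voice,           # voice to use
        "-s", str(speed),      # speed (160 is what I started with, 100 is slower)
        "-f", str(text_file),  # read the text from this file
        "-w", str(wav_file),   # write audio to this file instead of playing it
    ]


def generate_wavs(text_dir, voice="no", speed=100):
    """Run espeak once for every .txt file in text_dir."""
    for text_file in sorted(Path(text_dir).expanduser().glob("*.txt")):
        subprocess.run(espeak_command(text_file, voice, speed), check=True)
```

For example, generate_wavs("~/espeak/text/no") renders every paragraph
file at -s100; running 'oggenc *.wav' in that directory afterwards
compresses the results.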
I should also say a few words about setting up eSpeak at this point. I
downloaded the latest version which Jonathan provided from here:
http://espeak.sourceforge.net/test/espeak-1.17k.zip
eSpeak was already installed on my system and I didn't want to play too
much with that. I just put in a quick hack to use the old application
with the new data files. I unzipped the new espeak in my home directory
and placed the data files in ~/espeak/
In /usr/share I did:
sudo mv espeak-data/ espeak-data-orig
sudo ln -s /home/henrik/espeak/espeak-data/ espeak-data
I'm sure someone can come up with a better way to do this :)
So, after generating the .ogg files it's time to start debugging them. I
found using pre-generated sound files to be quite handy because then you
can pause and rewind (unfortunately seeking is degraded in the ogg
compression step). We could also get native speakers who are not yet
using Linux to listen to the files and report back.
Which raises the next question: What is the most useful form I can
provide feedback in? I've made some comments on individual words below,
mostly vowel sounds, but I suspect a more informed comment about the
phonemes might be better. I guess having the native listener tweak the
language files directly would be ideal but I'll need to grok more of the
eSpeak toolchain to do that.
I've tarred up that directory I was working on and uploaded it here:
http://people.ubuntu.com/~henrik/espeak/espeak-files-heno.tar.gz
The ogg files are not included; I've placed them separately here:
http://people.ubuntu.com/~henrik/espeak/ogg/
There is also a simple python script in there to help with the .wav
generation, though that could be much improved.
The results from my first listening test:
--------------
Språk (ubestemt) betegner menneskenes[1] måter[2] å kommunisere[3] på.
Bevisst[4] kommunikasjon skjer[5] ved hjelp av lydspråk[6], tegnspråk og
skriftspråk, ubevisst kommunikasjon for eksempel ved kroppsspråk.
Språkvitenskap[7] betegnes som lingvistikk[8].
[1] The 3rd 'e' is too long
[2] 'r' needs to be more pronounced
[3] The last 'e' has the wrong tone/flavour. Sounds like an æ, should be
like 'Long E' on [*]
[4] The 'e' is too long/too much emphasis, and the i should be very
short (double consonant rule)
[5] needs a longer 'e', more like 'Long E' on [*]
[6] the 'y' sounds like the 'ee' in Leeds, but should be like 'Long Y'
in [*]
[7] 'å' needs to be longer like 'Long Å' on [*]
[8] needs a shorter 'i'
[*] http://frodo.bruderhof.com/norskklassen/sounds-g.htm
-------------
Please try this approach if there is some basic language support for your
native language in eSpeak so we can streamline the process further. Thanks!
Henrik