eSpeak in Norwegian, part 1

Henrik Nilsen Omma henrik at ubuntu.com
Fri Dec 8 11:11:08 GMT 2006


Jonathan Duddington wrote:
> I wouldn't want to advertise eSpeak's Norwegian, Swedish, or Polish
> voices at this stage!  Anyone who might be interested enough to help to
> improve a language will download eSpeak to try it.
>   

Perhaps we should find a way to distinguish voices that are ready for 
use from those that are only ready for testing and tweaking? Call them 
no-pre, no-test or something?

> eSpeak will pick up the espeak-data directory in the user's home
> directory (if it exists) in preference to the one in /usr/share.
>   
Cool.
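
To make that lookup order concrete, here is a minimal sketch of it as I read it from the description above. The function name and its two optional parameters are my own illustration, not part of eSpeak, and the directory names (espeak-data under the home directory, then under /usr/share) are taken from the quoted text rather than checked against the source.

```shell
# Sketch only: report which espeak-data directory would be preferred,
# following the order described in the thread (home first, then system).
# pick_espeak_data and its parameters are hypothetical names.
pick_espeak_data() {
    home_dir=${1:-$HOME}          # where to look first
    system_dir=${2:-/usr/share}   # fallback location
    if [ -d "$home_dir/espeak-data" ]; then
        printf '%s\n' "$home_dir/espeak-data"
    elif [ -d "$system_dir/espeak-data" ]; then
        printf '%s\n' "$system_dir/espeak-data"
    else
        return 1                  # neither directory exists
    fi
}
```

Calling pick_espeak_data with no arguments checks the real locations; passing two directories makes it easy to try out on a scratch copy.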
> It's important to use the correct version of the espeak-data with the
> compatible version of the program.  Sometimes there are changes in the
> format of the compiled data files.
>   
Yes, so we should figure out a better way of installing the latest 
version on Ubuntu (or of running the latest anyway).
>   
> It's useful if you can determine whether an error is in the
> spelling-to-phoneme conversion (i.e. in the no_rules and no_list
> files), or something else (such as a bad implementation of a phoneme
> sound).
>
> So it's useful to understand:
>
> 1.  The no_rules and no_list files (see eSpeak's docs/dictionary.html
> file).
>
> 2.  The -x and -X command line options.
>
> 3.  How to test a pronunciation using explicit phoneme mnemonics eg:
>       speak -vno "[[spr'o:kvi:t@nskA:p]]"
>
> 4.  speak --compile=no  to re-compile the espeak-data/no-dict file after
>     making changes to no_rules and no_list.
>
>   
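
The four points above amount to a small edit, compile, inspect cycle. The dry-run sketch below only prints the commands for one word; it uses just the options named in the list (--compile, -x, -X, -v), and the function name and the SPEAK variable are assumptions of mine for illustration.

```shell
# Dry-run sketch: print the commands for one recompile-and-inspect cycle.
# SPEAK defaults to "speak", the binary name used in this thread.
SPEAK=${SPEAK:-speak}

check_word_cmds() {
    lang=$1   # language code, e.g. no
    word=$2   # word to inspect
    # Rebuild the compiled dictionary after editing the rules/list files.
    printf '%s --compile=%s\n' "$SPEAK" "$lang"
    # Show the phoneme mnemonics the word is translated to.
    printf '%s -v%s -x "%s"\n' "$SPEAK" "$lang" "$word"
    # Same, with the more detailed -X output.
    printf '%s -v%s -X "%s"\n' "$SPEAK" "$lang" "$word"
}

check_word_cmds no menneskenes
```

Printing rather than running keeps the sketch harmless; the output can be pasted into a terminal once the commands look right.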

Thanks, I'll answer your specific questions, but I also want to 
experiment with better ways of conducting this process, perhaps by 
writing some simple scripts to help test different alternatives. I need 
to read a bit more about linguistics though ...
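
As a first rough idea of what such a script might look like: the sketch below builds one speak command per candidate pronunciation, so that alternatives can be listened to back to back. The function names and the SPEAK variable are made up for illustration, and the candidate strings are simply ones that appear in this thread.

```shell
# Sketch: build one speak command per candidate pronunciation.
# phoneme_cmd and audition are hypothetical helper names.
SPEAK=${SPEAK:-speak}

phoneme_cmd() {
    # $1 = voice (e.g. no), $2 = phoneme mnemonics without the [[ ]] wrapper
    printf '%s -v%s "[[%s]]"\n' "$SPEAK" "$1" "$2"
}

audition() {
    voice=$1
    shift
    # Remaining arguments are the candidate phoneme strings.
    for candidate in "$@"; do
        phoneme_cmd "$voice" "$candidate"
    done
}

# Print the commands for three candidates taken from this thread.
audition no "S'e:r" "y:" "i:"
```

The output is plain command lines, so it can be reviewed first and then piped to sh to hear the candidates. Strings containing double quotes, backticks or dollar signs would need extra escaping, which this sketch does not attempt.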

>
> [1] "menneskenes". The 3rd 'e' is too long
>
> Because it's followed by a single consonant.  Perhaps a "enes" word
> ending is an exception, [En@s] or [@n@s] rather than [e:n@s] ?
>   

Oddly, "[[m,e:nn@sk'En@s]]" is actually better here.
>
> [3] The last 'e' has the wrong tone/flavour. Sounds like an å, should
> be like 'Long E' on [*]
>
> Yes, I made final "e" an open-schwa sound (for which I used phoneme
> mnemonic [@2]), as spoken in the Introduksjon sections at
> http://www.languageonthe.net/norsk/  (aatte, sproeyte, hjerte, krone).
> Is "kommunisere" an exception?
> [[k'Ommu-:n,i:sare:]] or [[k'Ommu-:n,i:sar@]] rather than
> [[k'Ommu-:n,i:sar@2]].
>
"[[k'Ommu-:n,i:se:r@2]]" sounds better.

> [4] "Bevisst". The 'e' is too long/too much emphasis, and the i should
> be very short (double consonant rule).
>
> It translates as [[b'e:vIsst]], with the short [I]. It sounds like the
> "short i" should be shorter, perhaps shorter than other short vowels?
>  
> Is "be" in "bevisst" an unstressed prefix (like in German)?
>   
Yes, "be" is unstressed, and the I is very short indeed.

> [5] Needs a longer 'e'. More like 'Long E' on [*]
>
> I added a rule according to the comment in [*]:
> "Short e before r is usually pronounced much more like å. e.g. hver
> (every)". Without this rule it would be  [[S'e:r]]
>   

Yes, [[S'e:r]] is better. That rule sounds bogus. It may be a local thing.
 

> [6] the 'y' sounds like the 'ee' in Leeds, but should be like 'Long Y' 
> in [*]
>
> The example word in [*] for long-Y ("ny") does indeed sound like the
> "ee" in Leeds.
> Compare the sounds which I've used for long Y and I
>   speak -vno "[[y:]]"
>   speak -vno "[[i:]]"
> also short Y and I
>   speak -vno "[[y]]"
>   speak -vno "[[I]]"
>   

Hm, none of these really capture the Norwegian Y. The closest I can 
think of in English is the way 'y' is said in 'Kyrie eleison' (which, 
IMO, the default English voice also gets wrong).

> [7] 'å' needs to be longer like 'Long Å' on [*]
>
> It's followed by two consonants, so eSpeak makes it short.  

AFAIK that rule only applies when the two consonants are the same. So it 
applies in 'ikke' and 'denne' but not in 'viktig' or 'fordi'. And, as you 
say, it does not apply in compound words either, so 'Språkkunst' would 
have a long å.
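
To spell out the narrower rule I mean, here is a minimal sketch that tests it on the four example words. It is only an approximation under stated assumptions: lower-case ASCII input, 'y' counted as a vowel, and no handling of æ, ø, å or of compound words (so it says nothing useful about 'Språkkunst'). The function name is made up.

```shell
# Sketch: succeed when a vowel is followed by the same consonant twice,
# which is the only case where I would expect the vowel to be shortened.
# ASCII lower-case only; compounds and the letters æ, ø, å are not handled.
short_vowel_by_doubling() {
    printf '%s\n' "$1" |
        LC_ALL=C grep -q '[aeiouy]\([bcdfghjklmnpqrstvwxz]\)\1'
}

for word in ikke denne viktig fordi; do
    if short_vowel_by_doubling "$word"; then
        echo "$word: short vowel (doubled consonant)"
    else
        echo "$word: not shortened by this rule"
    fi
done
```

The check relies on a back-reference in a basic regular expression, so 'kk' and 'nn' match while 'kt' and 'rd' do not.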


>  no_rules could have a rule so that Språk is
> always [[spr'o:k]].

Yes, that sounds correct.


Henrik


