Text to Speech - translation
jonsd at jsd.clara.co.uk
Sat Nov 18 12:05:30 GMT 2006
Text-to-Speech (speech synthesis) is an important accessibility feature
for visually impaired users, both for reading documents and speaking
eSpeak is a compact open source synthesizer whose small size makes it
suitable for providing multi-lingual TTS on a Live CD. This is proposed
Adding a new language typically adds only 10 to 15 kBytes to the
(uncompressed) size of the eSpeak package, provided its
spelling-to-sound rules are fairly regular. eSpeak aims for clarity and
intelligibility rather than naturalness.
eSpeak currently supports 13 languages. The latest version, 1.17
(17.Nov.2006), is available at:
Some of these languages have been implemented with assistance from
native speakers. Others are just drafts based on a description of the
language found on wikipedia.org and probably sound very wrong until
mistakes are corrected and parameters are tuned to give a better match.
Advice or assistance is needed from native speakers to improve these
languages and to add new ones. This can include:
- Trying out different versions of vowels and other sounds to advise
which sounds the best.
- Working on spelling-to-sound rules, which are trivial in some
languages, but difficult in others (although few are as bad as English).
- Identifying "function" words which are best spoken unstressed (e.g.
"the", "are", "this", "his", "for", etc.), or which are best spoken with
a preceding pause (e.g. conjunctions and some prepositions).
- Adjusting parameters to give a good rhythm/cadence to the speech,
such as lengths of different vowels, and the relative lengths of
stressed and unstressed syllables.
- Providing recordings of words and phrases.
- Just listening to text and noting what sounds wrong.
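To make the spelling-to-sound and function-word items above more concrete, here is a minimal Python sketch. It is only an illustration: the rule table, the phoneme symbols and the function names are all invented for this example, and it does not show eSpeak's actual rule format or data. It assumes a language regular enough that a longest-match letter-group table is sufficient.

```python
# Toy illustration only. The rule format, phoneme symbols and names used
# here are hypothetical; they are NOT eSpeak's real rule files or API.

# Words that are usually best spoken unstressed (examples from the list above).
FUNCTION_WORDS = {"the", "are", "this", "his", "for"}


def to_phonemes(word, rules):
    """Convert a word to phoneme symbols using longest-match-first rules.

    `rules` maps letter groups to phoneme symbols. Letters that no rule
    covers are passed through unchanged.
    """
    word = word.lower()
    longest = max(len(group) for group in rules)
    phonemes = []
    i = 0
    while i < len(word):
        # Try the longest letter group first, then shorter ones.
        for n in range(min(longest, len(word) - i), 0, -1):
            group = word[i:i + n]
            if group in rules:
                phonemes.append(rules[group])
                i += n
                break
        else:
            # No rule matched at this position: keep the letter as-is.
            phonemes.append(word[i])
            i += 1
    return phonemes


def mark_stress(words):
    """Pair each word with True if it should be stressed, False otherwise."""
    return [(w, w.lower() not in FUNCTION_WORDS) for w in words]


if __name__ == "__main__":
    toy_rules = {"sh": "S", "ee": "i:", "p": "p"}  # made-up rule table
    print(to_phonemes("sheep", toy_rules))
    print(mark_stress(["the", "sheep", "are", "here"]))
```

A language with regular spelling needs only a small table like this, which is why it adds so little to the package size. A language like English needs many context-dependent rules and an exceptions list on top.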
Which language should I add next?