[g-a-devel] Happy patch bonanza (more patch bonanza)

Sat Jul 1 14:25:32 BST 2006

Hi All:

After a couple cups of coffee on a jet-lagged Saturday morning, I've  
read and digested all the e-mail on this subject.  Since I'm one of  
those damn self-centered ASCII Americans, I don't always completely  
understand the full impact of all the internationalization and  
localization problems, especially when it comes to this particular mix  
of various processes and libraries.

Let me make sure I understand the proposal here:

1) IN FESTIVAL: Rely on a convention to optionally extend the  
programmatic description of Festival voices directly in the Festival  
voice data itself (i.e., not in gnome-speech).  Based upon precedent  
set by the festival-freebsoft-utils folks, this extension adds a  
"coding" attribute to define the character encoding type of the voice.   
The "coding" attribute is a string acceptable for passing directly to  
g_io_channel_set_encoding.  ISO-8859-1 is implied if "coding" is absent.  
See the "current-voice-coding" description at  
<http://www.freebsoft.org/doc/festival-freebsoft-utils/festival-freebsoft-utils_13.html>  
for more information on the "coding" attribute.

2) IN GNOME-SPEECH: Patch the gnome-speech festival synthesis driver to  
check for the "coding" attribute of a voice description.  If the  
attribute is defined, call g_io_channel_set_encoding with its value.  
If it is not defined, default to ISO-8859-1.
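
To make step 2 concrete, here is a minimal sketch of the fallback rule only.  The VoiceDescription struct and the voice_encoding function are hypothetical names invented for illustration; they are not the real gnome-speech driver types, and the actual patch may be structured quite differently.  The returned string is what would be handed to g_io_channel_set_encoding.

```c
#include <stddef.h>

/* Hypothetical, simplified voice description (an assumption for this
 * sketch, not the real gnome-speech structure). */
typedef struct {
    const char *name;    /* voice name as reported by Festival */
    const char *coding;  /* value of the "coding" attribute, NULL if absent */
} VoiceDescription;

/* Pick the encoding name for a voice.  Falls back to ISO-8859-1 when the
 * voice has no "coding" attribute (or an empty one), as in the proposal. */
static const char *voice_encoding(const VoiceDescription *voice)
{
    if (voice == NULL || voice->coding == NULL || voice->coding[0] == '\0')
        return "ISO-8859-1";
    return voice->coding;
}
```

A driver would then do something along the lines of g_io_channel_set_encoding(channel, voice_encoding(voice), &error) once the voice has been selected.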

This sounds simple enough to me.  I may be misunderstanding something  
in one of the threads on this topic, but it seems that it is implied  
that the user will be setting the character encoding for their desktop  
to be the same as that of their synthesis engine/voice and vice versa.   
Should some sort of transcoding/conversion be attempted if there is a  
detected mismatch, or is this automatically handled by the g_io  
infrastructure?
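
For reference, this is roughly what an explicit conversion step would look like if one turned out to be needed.  It is a sketch under assumptions: it uses POSIX iconv rather than GLib, it assumes the text arrives as UTF-8, and transcode_from_utf8 is a hypothetical helper name.  It does not settle whether the g_io layer already does this once an encoding has been set on the channel.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Convert NUL-terminated UTF-8 text into the encoding named by to_coding.
 * Writes at most out_size - 1 bytes plus a terminating NUL into out.
 * Returns the number of bytes written, or -1 on any failure (unknown
 * encoding, invalid input, or output buffer too small). */
static long transcode_from_utf8(const char *to_coding, const char *text,
                                char *out, size_t out_size)
{
    if (out_size == 0)
        return -1;

    iconv_t cd = iconv_open(to_coding, "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    char *in_ptr = (char *)text;   /* iconv() wants char **, input is not modified */
    size_t in_left = strlen(text);
    char *out_ptr = out;
    size_t out_left = out_size - 1;

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &out_ptr, &out_left);  /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;

    *out_ptr = '\0';
    return (long)(out_ptr - out);
}
```

A caller would pass the voice's "coding" value as to_coding and send the converted buffer to the synthesizer, treating a -1 return as a mismatch that cannot be converted.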

In addition, the obvious impact here is on our Telugu  
(festival-te.sf.net) and other UTF-8 language friends - they would need  
to extend the relevant festival voices to set the "coding" parameter to  
UTF-8 and also help test this.

Please let me know if I'm understanding this correctly.  In addition,  
many, many thanks to both Enrico and Milan for their understanding and  
diligence in this matter.  You definitely help define what "community"  
means.

Will

PS - I'm out of the office for the next several days, but I will  
release a new gnome-speech tarball for the next GNOME 2.15 deadline  
(12-July) if we can quickly reach closure on this.

On Jun 29, 2006, at 12:27 PM, Enrico Zini wrote:

> On Thu, Jun 29, 2006 at 05:53:49PM +0200, Milan Zamazal wrote:
>
>>     EZ>      (coding "ISO-8859-1")))
>> Yes, except that it's probably better to specify the coding without
>> double quotes: (coding ISO-8859-1)
>
> Done.  I'm not proficient with LISP: what is the difference?
>
>>     EZ> Because if we're inventing it right now, then I think I'd prefer
>>     EZ> "encoding".
>> Well, I'm not sure which of the two English terms better fits the
>> context.
>
> 'coding' is ok with me, if it's already used somewhere.
>
>> Yes, this is the right thing to do.
>
> Good!
> People, please review the attached patch for gnome-speech to take
> advantage of the 'coding' attribute.
>
>
> Ciao,
>
> Enrico
>
> -- 
> GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico at debian.org>
> <recode1.patch>