Clean Sheet?

Mikko Virkkilä mvirkkil at cc.hut.fi
Sat Jan 15 17:50:16 CST 2005


Jonathon Blake wrote:

>Peter wrote:
>
>  
>
>>As in the translator can see how others translated the string in his/her language or in a language that he knows 
>>    
>>
>
>If a person is not fluent in both the target and original language,
>then they should not be doing any translation work.  Let them create
>the spellchecker, write user documentation, and similar tasks.
>
>The only time it makes sense for a list of "how word 'x' was
>translated" is when the L10N team is creating a technical vocabulary
>for a language. [Both the KiSwahili Project, and Translate.org.za had
>to do that when they did their first translations.]
>  
>
This is simply not true. Even in English, "folder" and "directory" mean 
the same thing. I think GNOME has standardized on the word "folder", 
while all command line apps use the term "directory". In other 
languages you might have a lot more synonyms for a word (even a 
technical one). Another example is the word "path", which is often used 
as a synonym for "address", which in turn is often interchangeable with 
"URI". Now imagine that those words can have multiple translations, and 
that the translations can have multiple synonyms. It is very important 
to know what words the other translators have used when translating 
parts of the same project (i.e. different GNOME apps).
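
A minimal sketch of such a cross-application lookup, assuming the 
translations are available as (application, source string, translation) 
records. The application names and the sample strings are hypothetical 
and only illustrate the idea of spotting terms that were translated in 
more than one way within one project:

```python
from collections import defaultdict

# Hypothetical sample data: (application, source string, translation).
# The translated strings are illustrative only.
ENTRIES = [
    ("file-manager", "Create folder", "Luo kansio"),
    ("archiver", "Create folder", "Luo hakemisto"),
    ("editor", "Open file", "Avaa tiedosto"),
    ("browser", "Open file", "Avaa tiedosto"),
]


def translations_by_source(entries):
    """Map each source string to the applications using each translation."""
    index = defaultdict(lambda: defaultdict(list))
    for app, source, translation in entries:
        index[source][translation].append(app)
    return index


def inconsistent_sources(entries):
    """Return the source strings that were translated in more than one way."""
    index = translations_by_source(entries)
    return {
        source: dict(variants)
        for source, variants in index.items()
        if len(variants) > 1
    }


# Show every source string with competing translations and who uses them.
for source, variants in inconsistent_sources(ENTRIES).items():
    print(source)
    for translation, apps in variants.items():
        print(f"  {translation!r} used by {', '.join(apps)}")
```

A translator could consult a listing like this before picking a term, 
rather than discovering the mismatch after both applications ship.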

In addition, seeing how it was translated in other languages is of 
assistance when it is not clear what the %s stands for. This happens 
often and I hope that gettext will later be improved in a way that 
forces the programmer to use something like %s_name or %d_zipcode so 
that the translators don't have to go looking for where that particular 
text is in the UI to find out what text is put there.
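
To illustrate the named-placeholder idea: the %s_name syntax above is a 
proposal, not something gettext provides, but Python's own % formatting 
already accepts names in the form %(name)s. The sketch below uses that 
form; the helper functions and the sample strings are hypothetical. It 
checks that a translation keeps the same placeholder names as the 
source string:

```python
import re

# Matches Python-style named placeholders such as %(name)s or %(zipcode)d.
NAMED_PLACEHOLDER = re.compile(r"%\((\w+)\)[sd]")


def placeholder_names(text):
    """Return the set of placeholder names used in a format string."""
    return set(NAMED_PLACEHOLDER.findall(text))


def check_translation(source, translation):
    """Return (names missing from the translation, names it added)."""
    expected = placeholder_names(source)
    found = placeholder_names(translation)
    return expected - found, found - expected


source = "Send the parcel to %(name)s, postal code %(zipcode)d"
# Illustrative translation; the word order may differ freely because
# the placeholders are identified by name, not by position.
translation = "Postinumero %(zipcode)d, vastaanottaja %(name)s"

missing, extra = check_translation(source, translation)
print("missing:", missing, "extra:", extra)
print(translation % {"name": "Mikko", "zipcode": 12345})
```

The names tell the translator what will be substituted without a trip 
through the UI, and they also let the translation reorder the values.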

>Even then, you still need the context.  [ Think "saw" as a noun,
>versus "saw" as a verb, versus "saw" as an adjective.]
>
>  
>
>>Again... I wasn't talking about enforced fully automated memory translations but about "assisted" translations.
>>    
>>
>
>CAT Tools are useful --- provided one has the full context. 
>
>  
>
>>even a crooked translation might help.
>>    
>>
>
>A bad translation won't help anybody.  More to the point, it hinders things.
>
>  
>
A bad translation helps. Often one of the troubles when doing 
translations is remembering what the translation of a word is, not how 
it's used. If I can get a raw automatic translation of a sentence, I 
will easily spot the words that are out of place and fix the grammar. 
However, it often takes a while to remember what the equivalent of the 
word "hue" or "interlace" is.

>>"there is no po file for X language... would you like to create one
>>via automatic translation system?" and the system should create one
>>based on previous translations.
>>    
>>
>
>That is how the errors in the French Localization of OOo were created
>in the first place.  And also why the Afrikaans spell checker omitted
>such common words as "was", and "die".
>
>  
>
I

>>maybe assign fuzzy flag to all strings pending a translator review.
>>    
>>
>
>It would have to assign a fuzzy flag, so somebody can correct all the
>errors that were made.  My working assumption is that all fuzzy string
>translations are grossly incorrect.
>
>  
>
I have the same assumption. In addition, I think that most first-time 
translators make a lot more mistakes in their translations than people 
who have translated a lot. That is why I think it would be important 
that we could mark translations as reviewed, i.e. a translation would 
go through at least three stages: automatic machine translation -> 
human translation -> (possibly fixes by peers) -> review by an 
experienced member of the translation team.
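
The stages above can be sketched as a small state machine. This is only 
an illustration under stated assumptions: the Stage names, the ALLOWED 
table and the advance() helper are hypothetical, and the peer-fix step 
is treated as optional, as in the description:

```python
from enum import Enum


class Stage(Enum):
    MACHINE = "machine translation"
    HUMAN = "human translation"
    PEER_FIXED = "fixed by peers"
    REVIEWED = "reviewed"


# Allowed forward moves. The peer-fix step is optional, so a human
# translation may go straight to review; peers may fix more than once.
ALLOWED = {
    Stage.MACHINE: {Stage.HUMAN},
    Stage.HUMAN: {Stage.PEER_FIXED, Stage.REVIEWED},
    Stage.PEER_FIXED: {Stage.PEER_FIXED, Stage.REVIEWED},
    Stage.REVIEWED: set(),
}


def advance(current, target):
    """Return the new stage, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


stage = Stage.MACHINE
stage = advance(stage, Stage.HUMAN)
stage = advance(stage, Stage.REVIEWED)
print(stage.value)
```

With a table like this, a machine translation can never be marked as 
reviewed without a human having touched it first.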

>>in my view exposure means more eyeballs... this should improve accuracy. 
>>    
>>
>
>Like the accuracy of Wikipedia?  Where articles are not internally
>self-consistent.  The highest quality/most accurate material is
>typically found in pages that have been "edited". [Look at the edit
>page, and pick something between the 25th and 50th edit.  That
>probably will have highest quality/accuracy.  The exact edit to pick
>depends upon whether the people saved their work as they added
>material, or only after completing the material.]
>
>  
>
Wikipedia has long articles, and I understand why you think a long po 
file might be the same. However, this is exactly the reason why we 
should have a translation database where people could check which word 
a particular translation/project uses throughout. It would also help if 
we had comment fields, so that people could explain why they made a 
change. However, the most important part is that, unlike Wikipedia, we 
could mark individual sentences as reviewed/locked, and later entire po 
files.

In Wikipedia an article can always get more info, but a translation is 
something that can actually be finished, and hence should be locked.
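
A minimal sketch of what comment fields and locking could look like in 
such a database. The Entry class and file_is_locked() are hypothetical 
names used only for illustration, not part of any existing tool:

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    source: str
    translation: str
    locked: bool = False
    comments: list = field(default_factory=list)

    def change(self, new_translation, comment):
        """Update the translation and record why, unless it is locked."""
        if self.locked:
            raise PermissionError("entry has been reviewed and locked")
        self.translation = new_translation
        self.comments.append(comment)

    def lock(self):
        """Mark the entry as reviewed so that later edits are refused."""
        self.locked = True


def file_is_locked(entries):
    """A non-empty file counts as locked once every entry is locked."""
    return bool(entries) and all(entry.locked for entry in entries)


entry = Entry("Create folder", "Luo hakemisto")
entry.change("Luo kansio", "use the same term as the rest of the project")
entry.lock()
print(entry.translation, entry.comments, file_is_locked([entry]))
```

Locking per entry first, and per file only when every entry is locked, 
follows the order described above.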

-- Mikko



More information about the rosetta-users mailing list