sorting translation entries
Dafydd Harries
daf at
Mon Jan 31 08:56:46 CST 2005
On 22/01/2005 at 02:01, Valient Gough wrote:
> Dafydd Harries wrote:
> >On 02/01/2005 at 12:52, Valient Gough wrote:
> >
> >>I've been sorting my translation strings (POT file) in rough order by
> >>most desired translations first, so translators will see them first in
> >>Rosetta. I don't know how Rosetta plans on dealing with sorting (if at
> >>all), so I'll describe how I've handled it for my project.
> >>
> >
> >We've thought about adding support for guessing about how difficult a
> >string is to translate, based on things such as how long it is, whether
> >it has plural forms, whether it has variable substitutions etc.
> >
> >The main disadvantage to this is that related messages are often (though
> >not always) near each other in the .pot file, and seeing related
> >messages together often helps in having a consistent translation.
> >
> I see -- it is probably a good assumption that translations are more
> helpful when dealing with more complex sentences.
> But I want to sort based on most-frequently-displayed -- so the strings
> which are displayed most often get priority for translation. My typical
> application has lots of common strings, along with strings which are
> only seen during setup or usage message, and then strings which are only
> seen if something very strange has happened (warning messages, debug
> messages).
Right, this is the sort of information that you can only add manually.
> Those warning messages, when the program detects an unexpected state,
> may be very verbose to try and provide lots of information for
> debugging, but that doesn't mean they are necessarily the best to
> translate because I expect that if my program is working well that
> nobody will ever see the strings at all.
There are a number of approaches you can take with this sort of message:
 - Don't make debugging information translatable. This makes sense if
   the information is likely to be useful only to a developer of the
   software and not to a user.
 - Split out less important messages into a separate translation domain.
   For example, GTK+ does this with descriptions for widget properties
   for use by Glade. These generally don't appear in applications which
   use GTK+, and so they are less important for translation.
> I have a rough ordering of tags right now based on such frequency
> groupings. I don't mind if the tags are re-ordered within a group, but
> I don't want to drop my ordering for an automated grouping from an
> algorithm that knows nothing about my program.
This is another potential approach -- to group messages together using
some form of metadata in the PO template, probably in the comments.
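To make this concrete, here is a minimal sketch that assumes a made-up convention: an extracted comment of the form "#. group: NAME" in front of an entry. Neither that tag nor the group names are part of any existing tool. Entries are ordered by group, and because Python's sorted() is stable, the maintainer's order inside each group is kept.

```python
import re

# Hypothetical convention: an extracted comment such as "#. group: common"
# names the priority group of the entry it belongs to.
GROUP_ORDER = ["common", "setup", "rare"]
GROUP_RE = re.compile(r"^#\.\s*group:\s*(\S+)", re.MULTILINE)


def split_entries(pot_text):
    """Split PO template text into entries separated by blank lines."""
    return [e.strip() for e in re.split(r"\n\s*\n", pot_text) if e.strip()]


def group_of(entry):
    """Return the entry's group name, or None if it carries no tag."""
    match = GROUP_RE.search(entry)
    return match.group(1) if match else None


def sort_by_group(entries):
    """Order entries by group, keeping the original order within a group.

    Untagged entries and unknown group names sort after all known groups.
    """
    def rank(entry):
        group = group_of(entry)
        if group in GROUP_ORDER:
            return GROUP_ORDER.index(group)
        return len(GROUP_ORDER)
    return sorted(entries, key=rank)


SAMPLE = '''#. group: rare
msgid "Unexpected state: %s"
msgstr ""

#. group: common
msgid "Open"
msgstr ""

msgid "Untagged"
msgstr ""

#. group: common
msgid "Save"
msgstr ""
'''

ordered = sort_by_group(split_entries(SAMPLE))
```

This sketch ignores the PO header entry and multi-line strings, which a real implementation would need to handle.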
> If you use an algorithm to sort entries, then you optimize for the
> average or mean case. An individual can do a better job on any
> particular case, so I think the goal should be to either enable directed
> grouping (within user-specified subgroups) or as a bootstrap for
> otherwise unsorted applications (but don't override sorting provided
> later by the user).
Yes, I think this is a case where human judgement will be better than
heuristics. But we should not give up on the idea of using a heuristic
when grouping data is missing.
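A minimal sketch of such a fallback, using the signals mentioned earlier (length, plural forms, variable substitutions). The weights, the tuple layout and the easiest-first ordering are arbitrary assumptions of mine, not anything Rosetta does. The heuristic is applied only when no entry carries a group.

```python
import re

# Matches simple printf-style substitutions such as %s, %d or %(name)s.
# This is a simplification; real format strings allow more variations.
SUBST_RE = re.compile(r"%(?:\([^)]+\))?[sdif]")


def difficulty(msgid, has_plural=False):
    """Guess how hard a string is to translate (arbitrary weights)."""
    score = len(msgid.split())                  # one point per word
    score += 5 * len(SUBST_RE.findall(msgid))   # five per substitution
    if has_plural:
        score += 10                             # plural forms add ten
    return score


def order_entries(entries):
    """Order a list of (msgid, has_plural, group) tuples.

    If any entry has a group, the maintainer's order is returned untouched.
    Otherwise entries are sorted easiest first by the heuristic.
    """
    if any(group is not None for _, _, group in entries):
        return list(entries)
    return sorted(entries, key=lambda e: difficulty(e[0], e[1]))
```

For example, order_entries on two ungrouped entries, "Copied %d files to %s" (with plural forms) and "Open", puts "Open" first, while the same list with a group set on either entry comes back unchanged.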