l10n approach for bzr

Mon Mar 17 21:19:06 GMT 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Barry Warsaw wrote:
> On Mar 17, 2008, at 3:30 PM, John Arbash Meinel wrote:
> 
>> 1) I fully agree that we don't want to leave the strings as is and just
>> happen to correct the English at the appropriate time. So "Branch a
>> bazzar branch" should indeed be fixed directly.
> 
>> 2) Indirecting through IDs does have an appeal. It certainly makes it
>> clearer that you are using ids and where the "correct" place to fix
>> them is.
> 
>> 3) However, IDs make the code a bit harder to read.  You need to use
>> them everywhere, and then you don't actually know what they are saying.
>> One of the key places we will be doing this sort of thing is in
>> bzrlib/errors.py
> 
> Yep.  English strings just make the code much easier to read.  It's
> unfortunate, because I agree that there's an appeal to using message ids.
> 
>> In errors.py, though, I think we might actually want to wait to
>> translate until just before the exception is displayed. So rather than
>> doing:
> 
>> class MyError(BzrError):
> 
>>  _fmt = i18n("My Error says %(foo)s")
> 
> Please, please, please use PEP 292 strings!  You'll save yourselves lots
> of headaches from translators who leave off the trailing 's'.  $foo is
> so much easier for them to get right than %(foo)s.

Certainly the author of that PEP would request it. :)

Seriously, though, we do require Python 2.4, so we could use them.
It would mean switching how all of our exceptions are processed, since
you need to use Template(fmt).substitute(**__dict__) instead of
fmt % __dict__.
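
As a rough sketch of what that switch looks like (PercentError and
TemplateError are made-up names for illustration, not classes in bzrlib,
and the real BzrError does more than this):

```python
from string import Template


class PercentError(Exception):
    """Hypothetical error using the current %-style format."""

    _fmt = "My Error says %(foo)s"

    def __init__(self, **kwargs):
        Exception.__init__(self)
        # Store the keyword arguments so the format string can use them.
        self.__dict__.update(kwargs)

    def __str__(self):
        return self._fmt % self.__dict__


class TemplateError(PercentError):
    """The same error, rewritten with a PEP 292 template."""

    _fmt = "My Error says $foo"

    def __str__(self):
        # substitute() accepts a mapping, so no ** unpacking is required.
        return Template(self._fmt).substitute(self.__dict__)


percent_text = str(PercentError(foo="bar"))
template_text = str(TemplateError(foo="bar"))
```

Both produce the same text; the only change is in the format string and
in how __str__ fills it in.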

We've certainly had some problems with the %(foo)s style. Our current fix is to always
test str(error) for any new errors we define. A simple thing to do would
be to have an automatic test against all classes derived from BzrError
that would give semi-bogus data to the __init__ function and just test
that str() works.
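
Something along these lines, perhaps. This is only a sketch: the BzrError
here is a simplified stand-in that takes keyword arguments, GoodError and
BrokenError are invented examples, and pulling the field names out of
_fmt with a regex is my assumption about how to come up with the bogus
data.

```python
import re


class BzrError(Exception):
    """Simplified stand-in for bzrlib.errors.BzrError."""

    _fmt = "Unprintable error"

    def __init__(self, **kwargs):
        Exception.__init__(self)
        self.__dict__.update(kwargs)

    def __str__(self):
        return self._fmt % self.__dict__


class GoodError(BzrError):
    _fmt = "Not a branch: %(path)s"


class BrokenError(BzrError):
    # Deliberately malformed: the trailing 's' is missing.
    _fmt = "Not a branch: %(path)"


def all_subclasses(cls):
    """Return every direct and indirect subclass of cls."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


def find_unprintable(base):
    """Return the names of subclasses whose str() raises."""
    failures = []
    for cls in all_subclasses(base):
        # Feed every %(name) field in the format a placeholder value.
        names = re.findall(r"%\((\w+)\)", cls._fmt)
        bogus = dict((name, "bogus") for name in names)
        try:
            str(cls(**bogus))
        except (KeyError, TypeError, ValueError):
            failures.append(cls.__name__)
    return failures


failures = find_unprintable(BzrError)
```

In this sketch only BrokenError ends up in failures. The real errors take
positional arguments in __init__, so the bogus-data step would need more
care than this.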

We can certainly consider it, but I'm not sure we can switch at this point.

> 
> BTW, I've been musing about a source string format that went something
> like this:
> 
> mymsg = _('563:User $user is subscribed to list $listname')
> 
> So the extractor matches (?P<msgid>\d+):(?P<english>.*) and of course
> the runtime _() function would do the same.  Maybe you can get the best
> of both worlds that way?
> 
> -Barry
> 

Of course, then you need to work out a numbering scheme that scales in a
distributed system so that you don't get conflicts with 2 people adding
different strings and calling them both 564. :)
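
To make the problem concrete, here is a sketch of the prefix parsing with
a check for reused ids. The helper names (split_message, build_catalog)
and the two "564" strings are invented for the example, and this _() is
only a guess at what the runtime function might do.

```python
import re
from string import Template

# Numeric id, a colon, then the English source text.
MSG_RE = re.compile(r"(?P<msgid>\d+):(?P<english>.*)", re.DOTALL)


def split_message(source):
    """Split '563:text' into (563, 'text')."""
    match = MSG_RE.match(source)
    if match is None:
        raise ValueError("no numeric id prefix: %r" % (source,))
    return int(match.group("msgid")), match.group("english")


def build_catalog(sources):
    """Map ids to English text and report ids used for different strings."""
    catalog = {}
    conflicts = []
    for source in sources:
        msgid, english = split_message(source)
        # setdefault keeps the first string seen for an id.
        if catalog.setdefault(msgid, english) != english:
            conflicts.append(msgid)
    return catalog, conflicts


def _(source, translations=None):
    """Hypothetical runtime lookup; falls back to the English text."""
    msgid, english = split_message(source)
    return (translations or {}).get(msgid, english)


catalog, conflicts = build_catalog([
    "563:User $user is subscribed to list $listname",
    "564:Branch $name is up to date",
    "564:No revisions to send",
])
text = Template(
    _("563:User $user is subscribed to list $listname")
).substitute(user="anne", listname="bazaar")
```

Here conflicts comes back as [564], which is the situation you get when
two branches each add a new string and both pick the next free number.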

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFH3uBKJdeBCYSNAAMRAjthAJ9mjaE0WNWv1kDrJONFnn4t/dYC+gCgnhs1
7dycDgV0Mw8nbURF1OuespA=
=b03p
-----END PGP SIGNATURE-----