Internationalisation of bzr cli

INADA Naoki songofacandy at gmail.com
Fri Apr 29 04:07:08 UTC 2011


Writing a custom message extractor instead of using xgettext may help:

1) There is no need to wrap __doc__ strings in a _() marker.

2) A custom extractor can dedent messages at extraction time rather than at execution time.

3) A custom extractor can split messages by paragraph, so translators don't need to
retranslate the whole message when one sentence of it changes.
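
To make the three points concrete, here is a rough sketch of what such an
extractor could look like. It is only an illustration under my own assumptions,
not Mercurial's tool and not a proposed bzr API: the function name
extract_doc_paragraphs is hypothetical, it only looks at plain
`__doc__ = "..."` assignments, and it treats a blank line as the paragraph
separator. It uses the standard ast module to find the strings and
inspect.cleandoc to dedent them.

```python
import ast
import inspect


def extract_doc_paragraphs(source):
    """Return dedented paragraphs from every `__doc__ = "..."` assignment.

    Hypothetical helper for illustration only.
    """
    paragraphs = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        # Point 1: find __doc__ assignments directly, no _() marker needed.
        is_doc = any(
            isinstance(target, ast.Name) and target.id == "__doc__"
            for target in node.targets
        )
        value = node.value
        if not (is_doc and isinstance(value, ast.Constant)
                and isinstance(value.value, str)):
            continue
        # Point 2: dedent at extraction time, so source indentation
        # does not leak into the message ids.
        text = inspect.cleandoc(value.value)
        # Point 3: one message id per paragraph (blank-line separated).
        paragraphs.extend(p for p in text.split("\n\n") if p.strip())
    return paragraphs
```

With this approach, the two cmd_foo definitions quoted below would yield the
same two message ids, regardless of how deeply the class is nested. At run
time the help code would have to split the docstring the same way before
looking up each paragraph.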

Mercurial uses a custom message extractor and handles i18n well (e.g. text wrapping
for double-width characters).
Can we import some tools from them?

On Fri, Apr 29, 2011 at 10:43 AM, Andrew Bennetts
<andrew.bennetts at canonical.com> wrote:
> jbowtie at amathaine.com wrote:
>> On Fri, Apr 29, 2011 at 12:16 PM, Martin Pool <mbp at canonical.com> wrote:
>> > As soon as you feel you're sufficiently on the right track, please
>> > propose your branch for merging and we can have a look in more detail.
>> >  You don't have to have i18n'd everything, or even one whole command,
>> > just enough to show the approach will basically work.
>> >
>>
>> One of the issues is that docstrings don't interact well with gettext
>> - you need to monkey-patch inspect.getdoc to actually get those to
>> display correctly. Which might be where you start to see performance
>> issues.
>
> Because getdoc strips the indentation of a multiline docstring, but we'd
> be wrapping _() around the literal string including indentation?
>
> It seems to me that, ideally, changing the indentation of a docstring
> wouldn't affect the translations of that docstring.  i.e. these two
> class definitions have effectively the same string, even though the
> second one has more indentation on the non-initial lines:
>
>
> class cmd_foo(Command):
>    __doc__ = """Frobnicates the blatterwidget.
>
>    This is a no-op when rotating widdershins.
>    """
>
>
> if have_frobnicator_module:
>    class cmd_foo(Command):
>        __doc__ = """Frobnicates the blatterwidget.
>
>        This is a no-op when rotating widdershins.
>        """
>
>
> [We use explicit __doc__ assignments for docstrings that need to exist
> always, regardless of whether python -OO is being used.  I'd expect that
> other docstrings don't need to be translated, because they'll never be
> seen by users.]
>
> So I'd hope that whatever _() magic we use considers those two cases to
> be the same string.  If it doesn't it wouldn't be very hard to manually
> adjust any translations we already have for the new indentation as code
> is refactored, but it would be annoying.
>
> I think it'd be possible to cope with this without a performance hit and
> without monkey-patching inspect.getdoc.  To avoid the performance hit we
> just need to arrange for the dedenting to be delayed until we actually
> need to look up a translation for the string, and I think if
> inspect.getdoc is troublesome we can simply ignore it and write our own
> variant, it's a pretty simple function (as your monkey-patch shows).
>
> -Andrew.



-- 
INADA Naoki  <songofacandy at gmail.com>


