Internationalisation of bzr cli
INADA Naoki
songofacandy at gmail.com
Fri Apr 29 05:15:57 UTC 2011
This script finds assignments to __doc__ and calls to 'gettext', '_' and
'N_', and shows the string literals in them.
http://paste.ubuntu.com/600595/
Babel <http://babel.edgewall.org/> can produce a pot file from a custom
message extractor.
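
For readers without access to the paste, here is a minimal sketch of the same
idea. It is not the pasted script: the names MARKERS, _literal and
extract_messages are hypothetical, it assumes Python 3.8 or later (for
ast.Constant), and it only handles plain string literals. It also dedents each
message and splits it by paragraph at extraction time, so indentation changes
in the source do not change the extracted text.

```python
import ast
import inspect
import re

# Call names treated as translation markers (taken from the description above).
MARKERS = {"gettext", "_", "N_"}


def _literal(node):
    """Return the string value if node is a plain string literal, else None."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    return None


def extract_messages(source):
    """Yield (lineno, paragraph) pairs for translatable strings in source.

    Results follow ast.walk order, which is not guaranteed to be source order.
    """
    for node in ast.walk(ast.parse(source)):
        text = None
        if isinstance(node, ast.Assign):
            # __doc__ = "..." is picked up without any _() marker.
            if any(isinstance(t, ast.Name) and t.id == "__doc__"
                   for t in node.targets):
                text = _literal(node.value)
        elif isinstance(node, ast.Call):
            # gettext("..."), _("...") or N_("...") with a literal first argument.
            if (isinstance(node.func, ast.Name)
                    and node.func.id in MARKERS and node.args):
                text = _literal(node.args[0])
        if text is None:
            continue
        # Dedent at extraction time, then split on blank lines so each
        # paragraph becomes its own message.
        for para in re.split(r"\n\s*\n", inspect.cleandoc(text)):
            if para.strip():
                yield node.lineno, para
```

Run over Andrew's two cmd_foo examples quoted below, this should yield the
same two paragraphs for both, whatever the indentation. A real Babel extractor
would wrap something like this in Babel's extraction-method interface, which I
have not done here.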
On Fri, Apr 29, 2011 at 1:07 PM, INADA Naoki <songofacandy at gmail.com> wrote:
> Writing a custom message extractor instead of using xgettext may help.
>
> 1) No need to use the _() marker for __doc__.
>
> 2) A custom extractor can dedent messages at extraction time, not at
> execution time.
>
> 3) A custom extractor can split messages by paragraph. Translators don't
> need to retranslate a whole message when one sentence of it changes.
>
> Mercurial uses a custom message extractor and does i18n well (e.g. text
> wrapping for double-width characters).
> Can we import some tools from them?
>
> On Fri, Apr 29, 2011 at 10:43 AM, Andrew Bennetts
> <andrew.bennetts at canonical.com> wrote:
>> jbowtie at amathaine.com wrote:
>>> On Fri, Apr 29, 2011 at 12:16 PM, Martin Pool <mbp at canonical.com> wrote:
>>> > As soon as you feel you're sufficiently on the right track, please
>>> > propose your branch for merging and we can have a look in more detail.
>>> > You don't have to have i18n'd everything, or even one whole command,
>>> > just enough to show the approach will basically work.
>>> >
>>>
>>> One of the issues is that docstrings don't interact well with gettext
>>> - you need to monkey-patch inspect.getdoc to actually get those to
>>> display correctly. Which might be where you start to see performance
>>> issues.
>>
>> Because getdoc strips the indentation of a multiline docstring, but we'd
>> be wrapping _() around the literal string including indentation?
>>
>> It seems to me that, ideally, changing the indentation of a docstring
>> wouldn't affect the translations of that docstring. i.e. these two
>> class definitions have effectively the same string, even though the
>> second one has more indentation on the non-initial lines:
>>
>>
>> class cmd_foo(Command):
>>     __doc__ = """Frobnicates the blatterwidget.
>>
>>     This is a no-op when rotating widdershins.
>>     """
>>
>>
>> if have_frobnicator_module:
>>     class cmd_foo(Command):
>>         __doc__ = """Frobnicates the blatterwidget.
>>
>>         This is a no-op when rotating widdershins.
>>         """
>>
>>
>> [We use explicit __doc__ assignments for docstrings that need to exist
>> always, regardless of whether python -OO is being used. I'd expect that
>> other docstrings don't need to be translated, because they'll never be
>> seen by users.]
>>
>> So I'd hope that whatever _() magic we use considers those two cases to
>> be the same string. If it doesn't it wouldn't be very hard to manually
>> adjust any translations we already have for the new indentation as code
>> is refactored, but it would be annoying.
>>
>> I think it'd be possible to cope with this without a performance hit and
>> without monkey-patching inspect.getdoc. To avoid the performance hit we
>> just need to arrange for the dedenting to be delayed until we actually
>> need to look up a translation for the string, and I think if
>> inspect.getdoc is troublesome we can simply ignore it and write our own
>> variant, it's a pretty simple function (as your monkey-patch shows).
>>
>> -Andrew.
>>
>>
>>
>
>
>
> --
> INADA Naoki <songofacandy at gmail.com>
>
--
INADA Naoki <songofacandy at gmail.com>