Some details of what happened (was: Re: HEADS-UP! URGENT! Major problem with translations for Hardy and Intrepid.)

Arne Goetje arne at canonical.com
Sat Jan 17 19:12:07 GMT 2009


Milan Bouchet-Valat wrote:
> Would you mind giving us more details about the situation you
> describe and its causes? You're suddenly coming and telling us that
> everything is going to collapse and that we need to solve this horrible
> list of bugs ASAP, without even explaining anything about it.

Sorry for that. At the time of sending the initial mail we only knew that
we had a security problem at hand involving buggy translations that
contain formatting placeholders where they shouldn't. Only now do I have
some information about what happened, which I will relay to you.

However, I'm just the messenger, so please don't shoot me. ;)

> From what I've read and seen in the strings list, we're not in such an
> emergency. Sure, some strings are not correct and can lead to crashes if
> % jokers are present when they shouldn't. But this seems to have been
> the case since the release of Hardy and Intrepid, so no need to stress
> the teams like that. I really can't see your case here: what's new in
> Hardy and Intrepid that can break anything? Where do those new strings
> come from, and why can't they be reverted?

We had a number of bug reports about applications crashing in certain
circumstances in hardy and intrepid. Since these are stable releases,
reports about arbitrary crashes get our attention and we try to fix
those issues. If the issue is a security threat, it needs immediate
attention and a fix ASAP. It was only the bug about libxine crashing that
pointed us in the right direction: buggy translations might be involved.
( https://bugs.launchpad.net/ubuntu/+source/xine-lib/+bug/290768 )

We noticed that if a formatting placeholder is present in a translation
where it shouldn't be, the application will read arbitrary data from the
stack when this message is displayed. Reading arbitrary data from the
stack is a security issue, which needs urgent attention. That's why we
raised the flag.

As a result, we turned on c-format checking in langpack-o-matic when
generating language-packs. This will fail the build if such an error is
present in the data. That's why all the buggy data needs to be fixed in
Launchpad ASAP, or we won't get new language-packs.
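
To illustrate the kind of check meant here, a minimal Python sketch
follows. It is not the code langpack-o-matic runs; the names SPEC,
conversions and is_consistent are made up for this example, and the
pattern is a simplification that ignores numbered arguments such as
%1$s. It only compares which printf-style conversions the original and
the translation consume, in order.

```python
import re

# Simplified printf directive: '%', optional flags, width, precision and
# length modifier, then a conversion character. '%%' is a literal percent.
# Assumption: numbered arguments like '%1$s' are not handled here.
SPEC = re.compile(
    r"%[-+ #0]*(?:\d+|\*)?(?:\.(?:\d+|\*))?"
    r"(?:hh|h|ll|l|L|j|z|t)?([diouxXeEfFgGaAcspn%])"
)

def conversions(text):
    """Return the conversion characters in text, skipping literal '%%'."""
    return [m.group(1) for m in SPEC.finditer(text) if m.group(1) != "%"]

def is_consistent(msgid, msgstr):
    """True if the translation consumes the same arguments as the original."""
    if not msgstr:
        # Untranslated message: the original string is used, nothing to check.
        return True
    return conversions(msgid) == conversions(msgstr)

# A translation that adds a placeholder the program never supplies an
# argument for is the dangerous case described above.
print(is_consistent("Playing %s", "Wiedergabe von %s"))
print(is_consistent("Volume", "Lautstärke %s"))
```

A build step could refuse to produce a language-pack as soon as
is_consistent returns False for any message marked c-format.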

What we know is that these buggy translations came from upstream and got
approved in Launchpad. In some cases later updates of those packages
fixed the broken strings in the translations. However, the fixes show up
as 'suggestions' in Rosetta and need to be approved manually, which
unfortunately has not happened in many cases.

Launchpad does check for c-format errors on translations, but:
 * it seems not to be enough (
https://bugs.launchpad.net/rosetta/+bug/317578 )
 * some buggy translations predated the c-format flag and therefore
didn't have one when they actually needed one
 * in some cases upstream did not set the c-format flag correctly

To catch all possible erroneous translations we enforced the c-format
flag on all messages when doing our analysis. The outcome (
http://people.ubuntu.com/~arne/langpack_errors/ ) therefore contains
some false positives.

[Quote from Danilo to illustrate the problem]
Indeed.  c-format and no-c-format flags come from packaged templates, so
it's up to them to decide on the proper usage (i.e. Launchpad doesn't
have enough knowledge to insert them properly).  Note that any approach
to find every _potential_ problem would give us a lot of
false-positives.

I.e. "Insert % sign" is treated as space-padded "%s" modifier if marked
as c-format string, but is definitely not one.  To properly decide if
any one case is a genuine problem or not, one would have to dive into
the code that uses the string itself.
[/Quote]
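
Danilo's example can be reproduced with a small Python sketch. This is
only an illustration under assumptions: SPEC, forced_specs and
needs_review are hypothetical names, the pattern is a simplified
printf parser, and the German translation is invented for the example.
It shows why treating every message as c-format reports strings that
are harmless.

```python
import re

# Simplified printf directive; the space is a valid flag, so "% s" parses
# as a space-flagged %s. '%%' is a literal percent sign.
SPEC = re.compile(
    r"%[-+ #0]*(?:\d+|\*)?(?:\.(?:\d+|\*))?"
    r"(?:hh|h|ll|l|L|j|z|t)?[diouxXeEfFgGaAcspn%]"
)

def forced_specs(text):
    """Directives found when text is treated as c-format regardless of its flag."""
    return [m.group(0) for m in SPEC.finditer(text) if m.group(0) != "%%"]

def needs_review(msgid, msgstr):
    """True if original and translation differ in their apparent conversions."""
    kinds = lambda text: [spec[-1] for spec in forced_specs(text)]
    return kinds(msgid) != kinds(msgstr)

# The plain-text message looks like it contains a directive ...
print(forced_specs("Insert % sign"))
# ... so a translation without that "directive" is reported, although the
# string is never passed to printf as a format.
print(needs_review("Insert % sign", "Prozentzeichen einfügen"))
```

Whether such a report is a genuine problem can only be decided by
looking at how the program uses the string.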

> Anyway, I think I'd express quite accurately the feeling of many l10n
> teams members if I say we're somewhat tired of those problems. Rosetta
> has allowed people to fork upstream translations when we should only
> have changed Ubuntu-specific strings. This leads to a terrible mess
> where small teams have to manage a dramatically large textual domain
> that they can't really master. Upstream translators work far better than
> we can do on their projects, and avoid the kind of trouble we're now
> facing: downstream-modified strings that don't get fixed when upstream
> updates them. We really need a solution here, like locking translations
> for packages that belong to upstream.

That wouldn't have helped in this case. The buggy translations came from
upstream. I agree that in some cases some locking would be useful. But
on the other hand, if upstream translations have problems, they can be
fixed faster for our users by using Launchpad (especially for stable
releases, which don't receive upstream updates anymore except for
regression and security fixes).

> I'm sorry if this complaint sounds rude, but the tone of your message
> and your way of presenting things isn't fair either. We're mostly
> benevolent people here, and we suffer all the time from Launchpad's
> framework problems I've just described. We're not here only to obey
> Canonical, and I think we deserve more than orders like "please report
> back". I appreciate your work on Ubuntu l10n, but please also understand
> ours. We need to understand what can be done in the future to avoid this
> kind of mess rather than blindly fixing things, waiting for new bugs to
> arise.

I'm sorry if my initial mails sounded rude, that was not my intention.
However, I need to say that I wish to receive more feedback from you
guys, especially when it comes to language-pack testing. Whenever we
prepare new language-packs, they go to -proposed for stable releases and
need to be tested before being released to -updates. Since I started
doing this, I haven't received any feedback on whether those proposed
language-pack updates were actually OK. I ended up testing some
languages I'm roughly familiar with myself (although I actually don't
have the time for that, I'm usually busy with development and bug
fixing). Hence the "please report back" statement.

Since I am largely in charge of everything related to language support
in Ubuntu on the Canonical side, I would really appreciate receiving
feedback from you guys about problems or needed improvements in Ubuntu
with regard to language support (input handling, fonts, rendering and
also translation-related things). I don't have anything to do with Launchpad
though, so complaints about Launchpad need to be directed to the
Launchpad Translation Team via bug reports or questions. (They are
notoriously under-staffed, though.)

Thanks for taking care of the translations, I know you do this
voluntarily and in your free time (I also work on several projects in my
limited free time) and I appreciate it.

Cheers
Arne


