Improving Kubuntu's translations

Danilo Šegan danilo at canonical.com
Fri May 8 20:50:43 BST 2009


Hi Harald,

On Mon, 04.05.2009 at 11:33 +0200, Harald Sitter wrote:

> Within the last couple of days we, at Kubuntu, have been working out a list of 
> major issues we are facing with the current translation process. They are of 
> considerable impact since I know quite some people who actually decided to not 
> translate Kubuntu any longer, or even stopped using it. But not only users are 
> affected, I think if we continue to ship Kubuntu with such poor quality we 
> will also see a decrease in contributors as well. Why would someone who lives 
> in France spend his spare time on Kubuntu when he can't even make his mom or 
> brother use it because they wouldn't understand half the system?

Thanks for writing to the list.  And sorry for not responding earlier.

> First off I'd like to direct your attention to 
> http://www.flickr.com/photos/19616885@N00/sets/72157608562200171/
> Which provides a set of pictures documenting the state of translation in 
> Kubuntu and openSUSE, starting with 8.04. It especially gives a quite good 
> impression of what is going on in-development and how the end result relates 
> to that. Now, I know that especially for language packs there is a certain 
> need for short-term breakage, however it happens way too often for Kubuntu 
> (and I would suppose Ubuntu as well). This prevents sensible QA as well as 
> using the product.

Actually, Ubuntu is never as badly affected as Kubuntu: KDE uses a
completely different setup for translations, and KDE4 in particular
introduced a huge number of changes to the layout of templates.  They
were all changes that neither Launchpad nor the language pack
infrastructure was ready to handle.

This is not only about the majority of Ubuntu/Launchpad developers being
GNOME-oriented (though that plays a part too): KDE uses a different l10n
layout from most other free software.

> One of the most incredible things that happened in the 9.04 cycle was that 
> the pkgbinarymangler (i.e. the component that is responsible for stripping 
> translations from packages, in favor of the ubuntu language packs) was changed 
> to remove translations from desktop files. Incredible is that no one told the 
> Kubuntu devs beforehand, that no one even tested if that change would cause 
> problems, that this change was done less than a month before release.
> Such things just can not happen anymore.

Personally, I don't like the .desktop hack that Ubuntu uses for GNOME
either: it's only a half-solution, and I believe more effort should be
spent on providing a full solution for regenerated content translation
via language packs.  However, that would require a lot more time to
achieve.

If this was done only a month before release, that sounds terribly bad,
though.  In general, Kubuntu l10n suffers from a lack of testing (or maybe
bug reporting?) early in the development cycle, but introducing a change
a month before release is definitely bad. :(

> It is this and a lot of smaller regressions that drain Kubuntu's development 
> performance (change all of upstream KDE's language packs to include the proper 
> desktop file .pots is certainly no fun and not done within a couple of 
> minutes).
> 
> So here comes the list of issues:
> http://docs.google.com/Doc?id=ajk6csn6c2vn_0c6d8rp6w

I'll try to address some of the issues here that I believe are
important:

> * Upstream 
>   * Make it easy to get from Rosetta to the upstream Po/Pot files 
>     (i.e. include links to websvn.kde.org )

At the moment, each POT file in Launchpad can have an arbitrary
description.  Links are not highlighted there yet, but we can easily fix
that and promote its use more.  Still, this would have to be maintained
manually, since it's not something that can be easily automated.

Also, we'd like to have a read-only copy of all upstream translations in
Launchpad, updated daily, so that we can offer them as suggestions
across the system.  Our recent work on Bazaar integration is a step in
that direction as well.

> * Translations can only be changed if their original strings were
>   altered by a patch or when there is sufficient evidence that the
>   change is necessary.

Apart from technical difficulties in achieving this, I wouldn't like to
do this because it's also the wrong thing to do (IMHO, at least).
Updates to translations in Ubuntu can happen even after upstream (i.e.
KDE) stops releasing updates for that package.  

Also, I don't see why we should stop people from using Launchpad merely
as a web-based collaborative PO editor to translate anything in Ubuntu,
including KDE, and then submit those PO files upstream.

> * Make it easy to push changes upstream.

We try to do that as much as possible; as you rightly note, nobody would
accept direct commits from Launchpad, and I am pretty sure upstream
translation teams would accuse us of spamming if we started emailing
them with every few changes on a translation.  So, at this moment, we do
the best we can: we offer easy download of "partial PO" files which give
you only what has been changed in Ubuntu translations compared to the
last import.  This is easy to review (it's just a snippet cut out of a full
PO file: diffs don't work well with PO files since they have a lot of
metadata that changes without human intervention), and can be integrated
into upstream PO files in different ways.  In the future, we want to add
different notification and subscription mechanisms as well.

Ideas about automatic bug reporting are a good avenue to explore, though.

Pushing stuff to KDE's review board would simply be too complex: we'd
have to cope with every upstream's different system, and we lack the
resources to do so.  I was hoping development of Transifex would lead to
a unified platform for translation submission that could be easily used
by existing communities like GNOME or KDE, but unfortunately, that
hasn't happened (yet).

> If said chain is present (old upstream translation + launchpad bug +
> changed Rosetta version due to lp bug) Rosetta should check at new
> upstream translation import if new upstream translation != old
> upstream translation && upstream bug fixed and, depending on the
> outcome, replace the locally changed version with what upstream did.

We have complicated rules for when an upstream translation takes
precedence over the Ubuntu one.  In general, upstream translations carry
more weight, but we can probably improve the heuristics even further.
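
For reference, the check proposed in the quoted text can be written down
as a small Python sketch.  This shows only the suggested condition, not
the rules Launchpad actually applies; the function name and its
parameters are hypothetical.

```python
def resolve(old_upstream, new_upstream, local, upstream_bug_fixed):
    """Pick the translation to keep after a new upstream import.

    Implements only the proposed rule: if upstream changed its
    translation and the corresponding upstream bug is fixed, take what
    upstream did; otherwise keep the locally changed version.
    """
    upstream_changed = new_upstream != old_upstream
    if upstream_changed and upstream_bug_fixed:
        return new_upstream
    return local


# Upstream changed the string and closed the bug: upstream wins.
print(resolve("alt", "neu", "lokal", upstream_bug_fixed=True))
# Upstream did not touch the string: the local fix stays.
print(resolve("alt", "alt", "lokal", upstream_bug_fixed=True))
```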

> * Consistency

As far as consistency goes, that's a hard topic.  I can easily imagine
an Ubuntu translation team wanting to solve all consistency issues
between GNOME and KDE translations for a certain language, and I don't
think there's anything wrong with that.

At the moment, we provide only indirect tools to help with this.  We
provide very prominent per-team documentation links on every translate
page, and whenever someone uses a suggestion from a different package,
they can see what package it's coming from.  This means that they have
to read the documentation and know the relation between packages.
Providing better tools for this is definitely planned, but as a
longer-term goal.

> * PO templates love POs
> If Rosetta publishes the new/changed strings without immediately
> adding the translations available from upstream, it exposes the
> translator to the belief that all the missing translations are, well,
> untranslated.

This should no longer be a problem as far as KDE translations are
concerned (it will cause some wasted work in Launchpad, but the upstream
translation is activated as soon as it's imported for the first time;
this is relatively new: Launchpad has done it since December 2008).

The concern you raise is again KDE-specific, though: KDE is the only
project that is distributed in this way, but we handle that properly.

This is more a problem of the disconnect between upstream translation
work and tarballs: Ubuntu imports translations only from released
tarballs, and there might be much more work done in the "trunk" of the
upstream project.  I believe the proper solution for this is again to
have all upstream translations in Launchpad as well, and to be able to
sync them easily with Ubuntu translations.

> * Don't break translations
> Before langpacks get created the Rosetta import queue needs to be
> empty

This should not be a problem anymore (for the last year or so).  Today,
we import all KDE translations (close to 30000 PO files) in less than
one day.  We've spent a lot of time improving performance, and we are
relatively satisfied with where we stand today, though we are constantly
working on that further.

> * Let me search

Considering the amount of data that Launchpad stores, this is a hard
problem.  At the moment, the workaround is to use the global Launchpad
search, which will give you a pointer to one of the translation pages
that contains the string.  We will be improving this further, though
note that this is difficult work that is hard to quantify and estimate.

> * Let me browse

Grouping by package sets is being worked on, but it's a big change
across the whole of Launchpad and Ubuntu.  As soon as that's available,
we'll be making use of it to group translations as well.


> * Crush the Cruft 

I think this is where we can make the biggest improvements with little
code.  I like your suggestion of notifying by email about removed
templates, though we'd have to be careful about whom we notify.  Anyway,
this is definitely something we should improve for Karmic, and we've
already started on some infrastructure that lets us continue down this
path.

This is the case where we desperately need more input from those
familiar with how KDE templates work (we already special case KDE in our
code, so if there's anything that we can improve greatly, we want to
know about it).

I'd like to discuss this during UDS Karmic as well.

> I'd like to mention that this page's content is based upon known behaviour, if 
> that knowledge is wrong and the status quo is in fact completely different it 
> is all the better. The document also comes with suggestions on how to solve 
> some of these issues, those are really just suggestions of what we think might 
> be a good solution, then again we are neither working on Rosetta's code nor do 
> we use it for stuff other than importing, so discussion between people who 
> actually do is probably necessary. I also want to note that this is not meant 
> to insult anyone or anyone's work, it is merely a list of issues that need to 
> be taken care of ASAP in order to ensure that Kubuntu 9.10+ come with as 
> complete translations as possible.

I believe a few of the issues you raise are already solved, some are in
progress, and others are already planned to be fixed.  Note that some
are very hard to fix, and with others we need as much input from the
community as possible.

> However in the long run I recommend Canonical to seek direct advice from 
> upstream (at least for the KDE side of things), they have years of experience 
> in large scale translation and deployment, not using this knowledge to improve 
> the Ubuntu translation process would be a waste. Maybe invite some upstream 
> translators to UDS (sometime)?

I've seen a lot of translators, including upstream ones, at UDSes.
However, I haven't seen a lot of people interested in both KDE and
translations (those are harder to come by).  KDE is still a special
case, and will remain one for as long as they do things completely
differently from everyone else (luckily they came back to the standard
GNU gettext PO file format with KDE4, though I've seen some mention of a
scripting language being introduced into PO files :().

For that reason, we do want Kubuntu users to be vocal about problems
they experience, and whenever we can help we will.  Note that Launchpad
is only a small part of the entire Ubuntu translations workforce, so
it's not all up to us. :)

Cheers,
Danilo



