Discussion distillation: Options for language packs

Dafydd Harries daf at muse.19inch.net
Mon Nov 22 10:07:57 CST 2004


On 19/11/2004 at 14:00, Martin Pitt wrote:
> Hi Fellows!
> 
> I went through Tuesday's IRC discussion again and wrote up a
> structured overview of the possible alternatives, their pros (+)
> and cons (-). Please look over it and add any argument that I forgot
> and correct anything that seems wrong to you.
> 
> Personally I believe that option (F4) is the way to go. It avoids all
> package insanities and seems most flexible to me. 
> 
> Let's open the discussion!

Apologies for jumping into this discussion late.

> Possible options for providing translation updates:
> 
> (F1) single source and binary deb contains program and all available
>      translations, no extra language packs (status quo)
> 
>  + no effort
>  + no version inconsistencies
>  + compatible with Debian and third party packages
>  + users can compile fully functional packages on their own
>  - wastes installed space for unwanted translations
>  - updating translations for stable releases requires a lot of
>    redundant downloads (since the non-translation part of packages
>    does not change)
> 
> (F2) extract translations during package build to separate language debs
> 
>  + users can install just the translation(s) they want, space
>    efficient on installed system
>  + can save space on CDs if we have per-language CDs
>  - requires Ubuntu-specific build system, modification of debhelper,
>    manual modification of packages that do not use debhelper
>  - incompatible with Debian and third party packages; Ubuntu packages
>    would conflict with them (because they ship the same files)
>  - security updates of packages would force updates of the
>    language pack(s) as well
> 
>    (F2-1) one deb per language that contains translations of all packages
>    
>     + no significant increase of number of packages
>     - package must be rebuilt after any other package change to update
>       the translations; unbearable impact on buildds and mirrors
>     - users without huge bandwidth will not be able/willing to
>       download big language packs very often (for maybe only one or
>       two string updates)
> 
>    (F2-2) one deb per package that contains translations for all languages
> 
>     + no significantly higher impact on buildds and mirrors 
>     + space-efficient updates of language packs for stable releases
>     o doubles the number of packages, but should be still bearable
>     o translation-only updates do not download code any more, but
>       still download unwanted translations

It seems to me that this situation would only benefit English users --
they would get the disk space/bandwidth savings, while everybody else
would be faced with the complexity of managing these extra packages and
get none of the network/disk benefits. It might even cause some people
to stop using translations in order to gain the benefits of not having
them.

>    (F2-3) one deb per package and language
> 
>     + fine-grained updates with very little mirror and buildd overhead
>     + space-efficient updates of language packs for stable releases
>     - increases number of packages by factor N (number of supported
>       languages, in the order of 10 to 20) -> it takes 20 times the
>       bandwidth, time, space, and memory to download and process the
>       Packages file, which would probably make it bigger than a
>       monolithic per-language deb. However, this could be alleviated
>       by providing new package sections for each language.

This is the approach that Mozilla and OpenOffice take, and although it
seems to work quite well on the whole for those packages, it would not
scale to the whole distribution, both because of the increase in
bandwidth/time/space requirements and because not all packages have a
system that puts all translations for a given language in one file.
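
To put rough numbers on the scaling concern, here is a back-of-the-envelope
sketch. It assumes every package gets one extra translation deb per
supported language; the function names and the input figures in the usage
note are made up for illustration, not measurements of the archive.

```python
def package_count(base_packages, languages):
    """Total packages if each base package gains one translation deb per language."""
    # The original package stays, plus one deb per language.
    return base_packages * (1 + languages)


def index_size_bytes(packages, avg_stanza_bytes):
    """Approximate size of a Packages index with one stanza per package."""
    return packages * avg_stanza_bytes
```

For instance, with a hypothetical 1000 packages and 20 languages,
package_count(1000, 20) gives 21000 entries, and the index grows by the
same factor whatever the average stanza size is.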

> (F3) Leave original packages as they are and provide incremental
>      translation update packages
> 
>  + stays compatible with Debian and third party debs
>  + only 
>  - wastes user's disk for unwanted translations
>  - brings along translations we do not support
>  - same problems as above wrt. updating frequency and mirror impact
>    (single deb for all packages) or package number (one translation
>    deb per package)
> 
>    (F3-1) use dpkg-divert in the language pack to replace changed
>           gettext files with newer versions
>      - wastes user's disk for the original copy of the translations (that
>        is shadowed by the update)

As Carlos pointed out, it's not just gettext files that contain
translations. Glade files, .desktop files, etc can also contain
translations.

>    (F3-2) introduce alternative gettext hierarchy /usr/share/langpack
>      + possible to ship po files which only contain the bits that
>        really changed, this alleviates the redundant copies
>      - necessary to change gettext for that, and all packages that
>        include a static copy of gettext

Again, doesn't address non-gettext translations.
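
For the gettext part only, the "ship just the changed strings" idea in
(F3-2) amounts to a layered lookup: consult the update catalog first and
fall back to the packaged one. A minimal sketch, using plain dicts in place
of real catalogs; layered_catalog and translate are hypothetical names, not
part of gettext:

```python
from collections import ChainMap


def layered_catalog(packaged, overrides):
    """Return a lookup that checks `overrides` first, then `packaged`."""
    return ChainMap(overrides, packaged)


def translate(catalog, msgid):
    """Look up msgid; like gettext, return it unchanged if untranslated."""
    return catalog.get(msgid, msgid)
```

With packaged = {"Open": "Ouvrir", "Close": "Fermer"} and an update
containing only {"Close": "Clore"}, "Close" resolves to the updated string
and "Open" still comes from the packaged catalog. None of this helps with
Glade or .desktop files.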

> (F4) Leave original packages as they are and provide translation
>      updates without using debs; translations could be directly
>      downloaded from Rosetta to /var/cache/locales/, or a
>      similar place
>  + since this does not touch the archive at all, there is no impact on
>    buildds, mirrors, build systems, Package files, etc.
>  + can be made fine-grained to download only updates for languages and
>    software the user actually wants
>  - we need to develop a version control system which decides when to
>    use /var/cache/locales/ and when /usr/share/locales (updated
>    packages could have newer translations than the ones downloaded
>    from Rosetta); this could be done using the timestamp in the po
>    files
>  o version controlling and downloading should be done in the
>    language-support-XX packages (that we need anyway as a metapackage
>    for Mozilla/Firefox/etc.); this package should provide a simple
>    frontend for triggering updates

Correct me if I'm wrong, but this option would provide no benefit in
terms of package size, CD space, bandwidth usage or update frequency for
people who don't use translations, right?

This also has the same problem of how to create the translation updates.

There is also the problem of breaking MD5sums for installed packages,
but perhaps that could be worked around.
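
As for using the timestamp in the po files to decide which copy wins, a
rough sketch of what that comparison might look like is below. It assumes
both catalogs carry a standard PO-Revision-Date header and that the header
text is available as a string; the function names and the tie-breaking
rule (prefer the packaged copy) are my own assumptions, not something
agreed in the discussion.

```python
import re
from datetime import datetime

# Matches e.g. "PO-Revision-Date: 2004-11-19 14:00+0100"
_REVISION_RE = re.compile(
    r"PO-Revision-Date:\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}[+-]\d{4})"
)


def po_revision_date(po_text):
    """Return the PO-Revision-Date as a timezone-aware datetime, or None."""
    match = _REVISION_RE.search(po_text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M%z")


def pick_newer(packaged_po, downloaded_po):
    """Return "packaged" or "downloaded", whichever catalog is newer.

    The packaged copy wins on ties and when the download has no date.
    """
    packaged = po_revision_date(packaged_po)
    downloaded = po_revision_date(downloaded_po)
    if downloaded is None:
        return "packaged"
    if packaged is None:
        return "downloaded"
    return "downloaded" if downloaded > packaged else "packaged"
```

Because the parsed dates carry their UTC offset, catalogs revised in
different time zones still compare correctly.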

> (F5) keep the status quo on the archive servers, but strip off all but
>      one/some translations in the debs that are shipped on the CDs
>      + easy to achieve without any buildd/mirror hit
>      + saves space on CDs (with per-language ones, at least)
>      - does not solve the "new translation upgrades" problem any
>        better
>      - apt will get confused if it sees two available packages with
>        same version, but different size
>      - insane amount of updated packages at first network update
> 
> (F6) Convert the world to use one common language
>  + No technical solution necessary
>  + can throw away all translations, saves huge amounts of space on the
>    CD that can be filled with indispensable gam^Wproductivity software
>    like TuxRacer and Frozen Bubble
>  - Sebastien insists on using French, but I do not understand a word of it
>  o (SCNR)

This solution clearly has the fewest drawbacks of any of the proposals.

Don't worry about French. Once you (and everybody else) have been taught
Welsh, everything will be OK.

-- 
Dafydd


