Semi-mechanizing the DTTP translations

Pierre Slamich pierre.slamich at gmail.com
Sun Jan 6 16:45:58 UTC 2013


Hi Tom,
The approach pays off most on large files, where the economies of scale
over manual translation are greatest. So far we have tested it on
documentation and related material. It works on virtually any po file, but
you need to check whether the output is good enough to actually reduce the
translation load.
Feel free to forward the original mail.

Pierre
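
A rough way to put numbers on that check is to count how many entries of a
po file are translated, fuzzy or untranslated, before and after a review
pass. The sketch below is not part of the tooling discussed in this thread;
po_stats and _field are hypothetical names, it uses only the Python standard
library, and it ignores plural forms (msgstr[0] and friends).

```python
import re


def _field(lines, keyword):
    """Concatenate the quoted text of one field (e.g. msgid or msgstr)."""
    text, active = "", False
    for line in lines:
        if line.startswith(keyword + " "):
            active = True
            line = line[len(keyword):]
        elif not line.startswith('"'):
            active = False
        if active:
            text += line.strip()[1:-1]  # drop the surrounding quotes
    return text


def po_stats(po_text):
    """Count translated, fuzzy and untranslated entries in PO text."""
    stats = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    # Entries in a PO file are separated by blank lines.
    for entry in re.split(r"\n[ \t]*\n", po_text):
        lines = entry.strip().splitlines()
        if not _field(lines, "msgid"):
            continue  # header, obsolete or empty block
        if any(l.startswith("#,") and "fuzzy" in l for l in lines):
            stats["fuzzy"] += 1
        elif _field(lines, "msgstr"):
            stats["translated"] += 1
        else:
            stats["untranslated"] += 1
    return stats
```

Reading a file with open(path, encoding="utf-8").read() and passing the text
to po_stats gives the three counts; comparing them over time shows whether
the suggestions are really reducing the load.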

On Sat, Jan 5, 2013 at 2:12 PM, Tom Davies <tomdavies04 at yahoo.co.uk> wrote:

> Hi :)
> Would this semi-mechanising tool be good for other projects to use?  Is it
> good for translating websites, wikis or printed documentation, or all 3?
>
> If it's good for other projects, is anyone here on the main LibreOffice
> LoCos mailing list?  Could one of you approach them to suggest it?  If not,
> please let me know.
> Regards from
> Tom :)
>
>
>   ------------------------------
> *From:* Pierre Slamich <pierre.slamich at gmail.com>
> *To:* Hannie Dumoleyn <lafeber-dumoleyn2 at zonnet.nl>
> *Cc:* Ubuntu Translators <ubuntu-translators at lists.ubuntu.com>
> *Sent:* Friday, 4 January 2013, 20:03
> *Subject:* Re: Semi-mechanizing the DTTP translations
>
> We keep making incredible progress thanks to the process: we validated on
> average 400 strings a day going from 49289 untranslated strings on Dec 16th
> to 42746 today.
>
> I've updated the structure and the instructions on the Pad to be more
> detailed and more linear. I've added a link to Redmar's script, plus
> instructions on validating the files and mass-correcting translation
> errors before upload.
> Feel free to ask if you're stuck at any point.
>
> http://lite.framapad.org/p/ddtpUbuntu
>
>  Pierre
>
> On Thu, Dec 27, 2012 at 4:00 PM, Pierre Slamich <pierre.slamich at gmail.com> wrote:
>
> Viva Low-Tech ;-)
> When you get to the point of importing them back, let me know so that I
> can grant you upload rights to the mock project.
>
> Sincerely,
>
> Pierre
> pierre.slamich at gmail.com
>
>
> On Thu, Dec 27, 2012 at 9:28 AM, Hannie Dumoleyn <
> lafeber-dumoleyn2 at zonnet.nl> wrote:
>
>  Hello Hendrik, Redmar, Pierre,
> Redmar, thanks for writing the script.
> The way I did the splitting so far is: open the sorted ddtp file in gedit,
> select lines 1 - 30,000 (which is about 940 kB), copy these into a new
> document and save it. It only takes a few minutes. Then you can select the
> next 30,000 lines, and the next. Done!
> Of course, using a script to split the whole file in one go is also very
> useful.
> Hannie
> Ubuntu Dutch Translators
>
> On 23-12-12 11:39, Hendrik Knackstedt wrote:
>
> On 23.12.2012 10:33, Redmar wrote:
>
> Hendrik Knackstedt wrote on Thu 20-12-2012 at 17:39 [+0100]:
>
>  On 20.12.2012 13:43, Pierre Slamich wrote:
>
>
>  I don't have a clean way to split them right now. I split them by
> size to keep below 900 kB (I took 800 for safety), but I then had to
> adjust manually because some strings were split right in the middle.
>
>  Ok, I'll take a look at it and see if I can come up with something
> useful.
>
>  I've been working with python-polib for a bit, so I think I'd be able to
> create a script to split up a po file into multiple parts pretty
> quickly. I haven't started yet, since I don't want to duplicate work,
> but please let me know if you want me to make a script or if you need
> help with python-polib.
>
>
> If you can do this, that's great. Thanks!
>
> Hendrik
>
> Regards,
>
> Redmar
> --
> Ubuntu Dutch Translators
>
>  If you don't mind, it would be great to take advantage of the German
> process to automate the process as much as possible.
> Would you be willing to expand the pad
> (http://lite.framapad.org/p/ddtpUbuntu) with us (yet another proof
> of French-German partnership ;-P)?
>
>  Sure. What do you mean by "the German process"? I'm a bit short on
> time right now but just let me know what has to be done and I'll try
> to get it done asap.
>
> Regards,
> Hendrik
>
>  Pierre
>
> On Thu, Dec 20, 2012 at 1:35 PM, Hendrik Knackstedt <hendrik.knackstedt at t-online.de> wrote:
>         Hey Pierre!
>
>
>         I'd like to test your approach for the German language also.
>         How exactly did you split the files? Did you use an existing
>         program/script or can you provide a script for doing this?
>         Thanks!
>
>         Hendrik
>
>         On 19.12.2012 15:58, Pierre Slamich wrote:
>
>         > Yes, although we might be finished by then ;-)
>         > Thanks to the method we're reviewing and correcting around
>         > 1000 strings per day at the moment.
>         >
>         >
>         > sincerely,
>         > Pierre
>         >
>         >
>         > On Tue, Dec 18, 2012 at 4:06 PM, Hannie Dumoleyn
>         > <lafeber-dumoleyn2 at zonnet.nl> wrote:
>         >         Hi Pierre, Redmar, and all who are interested,
>         >         Would it be an idea to brainstorm on this in
>         >         #ubuntu-translators? Perhaps in January 2013?
>         >         I agree with Redmar that msgmerge is a good
>         >         method, especially for huge documents. The only
>         >         snag is that you still have to approve the fuzzies
>         >         offline before uploading the file back to
>         >         Launchpad. We have used this method successfully
>         >         for the Ubuntu Manual "Getting started with Ubuntu"
>         >         (Lucid > Maverick > ... > Raring).
>         >         Redmar, sorry for not yet having tested your
>         >         popsort :(
>         >         Regards,
>         >         Hannie
>         >
>         >         On 18-12-12 00:51, Pierre Slamich wrote:
>         >
>         >         > Hi Hannie, Hi Redmar,
>         >         > Thanks a lot for the tips: we're interested in
>         >         > using your approach, and more generally it might
>         >         > be interesting to extend the msgmerge approach to
>         >         > all teams that are already underway with the
>         >         > DDTP, and the Google one to the teams that need
>         >         > to get started.
>         >         >
>         >         >
>         >         > - For the Google Translator Toolkit approach, I
>         >         > guess we could extend the mock project we did
>         >         > for fr_FR to other languages (and streamline
>         >         > our process by using Bazaar) by creating a
>         >         > global team responsible for the DDTP Mock
>         >         > project, and including in this team one member
>         >         > from each language team who is responsible for
>         >         > uploading the machine-translated po for his or
>         >         > her language.
>         >         >
>         >         >
>         >         > - For the msgmerge approach, do you already have
>         >         > a project to handle this? Is there any
>         >         > advantage in msgmerging raring against releases
>         >         > older than quantal to get more modified
>         >         > strings? How many strings have you been able to
>         >         > recover using that approach? It might be neat
>         >         > to generate the msgmerged po for all languages.
>         >         > Importing them as actual translations (not
>         >         > fuzzy) into a mock project like the Google
>         >         > Translator Toolkit one would show them as
>         >         > suggestions for the actual DDTP as well.
>         >         > The translator would thus be able to pick the
>         >         > human-translated one when available, or to build
>         >         > on the machine-translated one otherwise.
>         >         >
>         >         >
>         >         > Can we try to schedule some time to coordinate
>         >         > on this so that we can use both approaches and
>         >         > try to onboard all the other language teams
>         >         > once we have a rock-solid process?
>         >         >
>         >         >
>         >         > Pierre
>         >         >
>         >         > Pierre Slamich
>         >         > pierre.slamich at gmail.com
>         >         >
>         >         >
>         >         >
>         >         > On Mon, Dec 17, 2012 at 10:30 PM, Redmar
>         >         > <redmar at ubuntu-nl.org> wrote:
>         >         >         Hi Pierre,
>         >         >
>         >         >         I've actually tried a similar approach for
>         >         >         Dutch using msgmerge, which might also be
>         >         >         worth checking out. When you merge the
>         >         >         translations of an older version of Ubuntu
>         >         >         into the current version (for example,
>         >         >         msgmerge quantal_ddtp.po raring_ddtp.po -o merged_ddtp.po),
>         >         >         there will be a lot of 'fuzzy' translations
>         >         >         for strings that are similar (for example,
>         >         >         meta packages for different programs,
>         >         >         debugging symbols etc). These fuzzy strings
>         >         >         often need only a few small changes (e.g.
>         >         >         the program name) to be accepted, which can
>         >         >         really speed up translation. And you don't
>         >         >         have to worry about Google putting in a
>         >         >         weird translation, since it is all based on
>         >         >         earlier translations done by a human.
>         >         >
>         >         >         On a related note, if any of you work on
>         >         >         ddtp-translations offline, I have written a
>         >         >         Python program that can sort entries in ddtp
>         >         >         po-files based on the popularity of the
>         >         >         package. This way, the most popular packages
>         >         >         will be at the top of the po file, and you
>         >         >         are always sure you are working on the most
>         >         >         important packages first.
>         >         >
>         >         >         You can get the code here:
>         >         >         bzr branch lp:~redmar/+junk/ddtp_popsort
>         >         >
>         >         >         It has a small readme file, please let
>         >         >         me know if something is unclear
>         >         >         or not working for you.
>         >         >
>         >         >         Regards,
>         >         >         Redmar
>         >         >         --
>         >         >         Ubuntu Dutch Translators
>         >         >
>         >         >
>         >         >         Hannie Dumoleyn wrote on Mon 17-12-2012 at 17:58 [+0100]:
>         >         >         > Hello Pierre,
>         >         >         > This is a very good idea! I have just
>         >         >         > uploaded the first part of the
>         >         >         > incomplete Dutch translation (900 kB)
>         >         >         > to GTT.
>         >         >         > Thanks,
>         >         >         > Hannie
>         >         >         >
>         >         >         > On 17-12-12 12:55, Pierre Slamich wrote:
>         >         >         >
>         >         >         > > The DDTP represents around 50,000 strings
>         >         >         > > to translate, times 140 languages. On very
>         >         >         > > good weeks, a typical translation team
>         >         >         > > translates 500 strings (see UWN for example
>         >         >         > > weekly figures).
>         >         >         > >
>         >         >         > > It would take a lot of weeks (about 100 at
>         >         >         > > that rate, so roughly two years) for the
>         >         >         > > highly motivated volunteers of a large
>         >         >         > > translation team, working non-stop at their
>         >         >         > > best, to get done with it.
>         >         >         > > Thus we had the idea to delegate initial
>         >         >         > > translation suggestions to Google Translator
>         >         >         > > Toolkit and have humans review the
>         >         >         > > translations to speed up the process.
>         >         >         > >
>         >         >         > > We successfully did an import for circa
>         >         >         > > 40,000 French strings (yup, you read that
>         >         >         > > right) this weekend in a mock project
>         >         >         > > called DDTP Automation
>         >         >         > > (https://translations.launchpad.net/ddtpautomation).
>         >         >         > > To keep it short, the translations from
>         >         >         > > this project appear as suggestions in the
>         >         >         > > French DDTP, and can be reviewed by actual
>         >         >         > > translators.
>         >         >         > > We've started using them, and it turns out
>         >         >         > > that a lot of them are actually useful and
>         >         >         > > are speeding up the translation process a
>         >         >         > > lot.
>         >         >         > >
>         >         >         > > We detailed the (somewhat) tedious process
>         >         >         > > in English at
>         >         >         > > http://lite.framapad.org/p/ddtpUbuntu
>         >         >         > > Questions and inquiries welcome.
>         >         >         > >
>         >         >         > > Pierre
>         >         >         > >
>         >         >         > >
>         >         >         > > ---
>         >         >         > > pierre.slamich at gmail.com
>         >         >         > >
>         >         >         > >
>         >         >         >
>         >         >
>         >         >
>         >         >
>         >         >         --
>         >         >         ubuntu-translators mailing list
>         >         >         ubuntu-translators at lists.ubuntu.com
>         >         >         https://lists.ubuntu.com/mailman/listinfo/ubuntu-translators
>         >         >
>         >         >
>         >         >
>         >
>         >
>         >
>         >
>         >
>         >
>
>
>
>
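
The splitting problem raised in this thread (keeping each part below roughly
900 kB without cutting a string in half) can be sketched as follows. This is
not Redmar's python-polib script, which is not reproduced here; it is a
minimal standard-library sketch, split_po_text is a hypothetical name, and it
assumes that the first entry of the file is the PO header and that entries
are separated by blank lines.

```python
import re


def split_po_text(po_text, max_bytes=800_000):
    """Split PO text into parts of at most max_bytes, cutting only between entries."""
    # Entries in a PO file are separated by blank lines.
    entries = [e.strip("\n") for e in re.split(r"\n[ \t]*\n", po_text) if e.strip()]
    if not entries:
        return []
    # Assumption: the first entry is the PO header; it is repeated in every part.
    header, body = entries[0], entries[1:]
    base_size = len(header.encode("utf-8")) + 1  # +1 for the final newline
    parts, current, size = [], [header], base_size
    for entry in body:
        entry_size = len(entry.encode("utf-8")) + 2  # +2 for the blank-line separator
        # Start a new part when adding this entry would exceed the limit.
        if len(current) > 1 and size + entry_size > max_bytes:
            parts.append("\n\n".join(current) + "\n")
            current, size = [header], base_size
        current.append(entry)
        size += entry_size
    parts.append("\n\n".join(current) + "\n")
    return parts
```

Each returned string can be written to its own file (for example
part_01.po, part_02.po, and so on). A single entry larger than max_bytes
still ends up in a part of its own that exceeds the limit; the default of
800,000 bytes mirrors the safety margin mentioned above.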