Semi-mechanizing the DTTP translations
Hannie Dumoleyn
lafeber-dumoleyn2 at zonnet.nl
Thu Dec 27 08:28:34 UTC 2012
Hello Hendrik, Redmar, Pierre,
Redmar, thanks for writing the script.
The way I have done the splitting so far is: open the sorted ddtp file in
gedit, select lines 1 - 30,000 (which is about 940 kB), copy them into a
new document and save it. It only takes a few minutes. Then you can
select the next 30,000 lines, and the next. Done!
Of course, using a script to split the whole file in one go is also very
useful.
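
For anyone who wants to try the scripted route, here is a minimal sketch of such a split. It is not Redmar's script; the function name split_po_text and the 800,000-byte default are assumptions for illustration. It uses only the Python standard library and relies on the PO convention that entries are separated by blank lines, so a part never ends in the middle of a string. The header entry, if present, is repeated at the top of every part.

```python
import re


def split_po_text(text, max_bytes=800_000):
    """Split PO file text into parts of roughly max_bytes each.

    Cuts are made only at the blank lines between entries, so no
    msgid/msgstr pair is ever split across two parts.
    """
    # Entries in a PO file are separated by one or more blank lines.
    blocks = [b.strip("\n") for b in re.split(r"\n[ \t]*\n", text) if b.strip()]

    # The header is the entry whose empty msgid is directly followed by msgstr.
    header = ""
    if blocks and re.search(r'^msgid ""\nmsgstr ', blocks[0], re.M):
        header = blocks.pop(0)
    header_size = len(header.encode("utf-8"))

    parts, current, size = [], [], 0
    for block in blocks:
        block_size = len(block.encode("utf-8")) + 2  # +2 for the separator
        # Start a new part when adding this entry would exceed the limit.
        if current and header_size + size + block_size > max_bytes:
            parts.append(current)
            current, size = [], 0
        current.append(block)
        size += block_size
    if current:
        parts.append(current)

    # Repeat the header at the top of every part.
    prefix = [header] if header else []
    return ["\n\n".join(prefix + part) + "\n" for part in parts]
```

To use it, read the sorted ddtp file as UTF-8 text, pass the text to split_po_text, and write each returned string to its own numbered file. A single entry larger than max_bytes still ends up in a part of its own, so the limit is approximate.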
Hannie
Ubuntu Dutch Translators
On 23-12-12 11:39, Hendrik Knackstedt wrote:
> On 23.12.2012 10:33, Redmar wrote:
>> Hendrik Knackstedt wrote on Thu 20-12-2012 at 17:39 [+0100]:
>>> On 20.12.2012 13:43, Pierre Slamich wrote:
>>>
>>>> I don't have a clean way to split them right now. I split them by
>>>> size to keep below 900 kB (I took 800 for safety), but I then had to
>>>> adjust manually because the strings were split right in the middle.
>>> Ok, I'll take a look at it and see if I can come up with something
>>> useful.
>> I've been working with python-polib for a bit, so I think I'd be able to
>> create a script to split up a po file into multiple parts pretty
>> quickly. I haven't started yet, since I don't want to do duplicate work,
>> but please let me know if you want me to make a script or if you need
>> help with python-polib.
>
> If you can do this, that's great. Thanks!
>
> Hendrik
>> Regards,
>>
>> Redmar
>> --
>> Ubuntu Dutch Translators
>>>> If you don't mind, it would be great to take advantage of the German
>>>> process to automate the process as much as possible.
>>>> Would you be willing to expand the pad
>>>> (http://lite.framapad.org/p/ddtpUbuntu) with us (yet another proof
>>>> of French-German partnership ;-P)?
>>> Sure. What do you mean by "the German process"? I'm a bit short on
>>> time right now but just let me know what has to be done and I'll try
>>> to get it done asap.
>>>
>>> Regards,
>>> Hendrik
>>>> Pierre
>>>>
>>>> On Thu, Dec 20, 2012 at 1:35 PM, Hendrik Knackstedt
>>>> <hendrik.knackstedt at t-online.de> wrote:
>>>> Hey Pierre!
>>>>
>>>>
>>>> I'd like to test your approach for the German language also.
>>>> How exactly did you split the files? Did you use an existing
>>>> program/script or can you provide a script for doing this?
>>>> Thanks!
>>>>
>>>> Hendrik
>>>>
>>>> On 19.12.2012 15:58, Pierre Slamich wrote:
>>>>
>>>> > Yes, although we might be finished by then ;-)
>>>> > Thanks to the method we're reviewing and correcting around
>>>> > 1000 strings per day at the moment.
>>>> >
>>>> >
>>>> > sincerely,
>>>> > Pierre
>>>> >
>>>> >
>>>> > On Tue, Dec 18, 2012 at 4:06 PM, Hannie Dumoleyn
>>>> ><lafeber-dumoleyn2 at zonnet.nl> wrote:
>>>> > Hi Pierre, Redmar, and all who are interested,
>>>> > Would it be an idea to brainstorm on this in
>>>> > #ubuntu-translators? Perhaps in January 2013?
>>>> > I agree with Redmar that the msgmerge is a good
>>>> > method, especially for huge documents. The only
>>>> > snag is that you still have to approve the fuzzies
>>>> > offline before uploading the file back to
>>>> > Launchpad. We use this method for the Ubuntu
>>>> > Manual "Getting started with Ubuntu" (Lucid >
>>>> > Maverick > ....> Raring) and with success.
>>>> > Redmar, sorry for not yet having tested your
>>>> > popsort :(
>>>> > Regards,
>>>> > Hannie
>>>> >
>>>> > On 18-12-12 00:51, Pierre Slamich wrote:
>>>> >
>>>> > > Hi Hannie, Hi Redmar,
>>>> > > Thanks a lot for the tips: we're interested in
>>>> > > using your approach, and more generally it might
>>>> > > be interesting to extend the msgmerge approach to
>>>> > > all teams that are already underway for the
>>>> > > DDTP, and the Google one to the teams that need
>>>> > > to get started.
>>>> > >
>>>> > >
>>>> > > - For the Google Translator Toolkit approach, I
>>>> > > guess we could extend the mock project we did
>>>> > > for fr_FR to other languages (and streamline
>>>> > > our process by using Bazaar) by creating a
>>>> > > global team responsible for the DDTP Mock
>>>> > > project and including in this team one member
>>>> > > from each language team responsible for
>>>> > > uploading the machine translated po for his or
>>>> > > her language.
>>>> > >
>>>> > >
>>>> > > - For the msgmerge approach, do you already have
>>>> > > a project to handle this? Is there any
>>>> > > advantage in msgmerging raring against releases
>>>> > > older than quantal to get more modified
>>>> > > strings? How many strings have you been able to
>>>> > > recover using that approach? It might be neat
>>>> > > to generate the msgmerged po for all languages?
>>>> > > Importing them as actual translations (not
>>>> > > fuzzy) into a mock project like the Google
>>>> > > Translate one would show them as suggestions for
>>>> > > the actual DDTP as well.
>>>> > > The translator would thus be able to pick the
>>>> > > human translated one when available or to build
>>>> > > on the machine translated one otherwise.
>>>> > >
>>>> > >
>>>> > > Can we try to schedule some time to coordinate
>>>> > > on this so that we can use both approaches and
>>>> > > try to onboard all the other language teams
>>>> > > once we have a rock-solid process?
>>>> > >
>>>> > >
>>>> > > Pierre
>>>> > >
>>>> > > Pierre Slamich
>>>> > >pierre.slamich at gmail.com
>>>> > >
>>>> > >
>>>> > >
>>>> > > On Mon, Dec 17, 2012 at 10:30 PM, Redmar
>>>> > ><redmar at ubuntu-nl.org> wrote:
>>>> > > Hi Pierre,
>>>> > >
>>>> > > I've actually tried a similar approach
>>>> > > for Dutch using msgmerge, which
>>>> > > might also be worth checking out. When
>>>> > > you merge the translations of an
>>>> > > older version of ubuntu into the current
>>>> > > version (msgmerge
>>>> > > quantal_ddtp.po raring_ddtp.po -o
>>>> > > merged_ddtp.po, for example), there
>>>> > > will be a lot of 'fuzzy' translations
>>>> > > for strings that are similar (for
>>>> > > example, meta packages for different
>>>> > > programs, debugging symbols etc).
>>>> > > These fuzzy entries often only need a few small
>>>> > > changes (e.g. program name) to be
>>>> > > accepted, which can really speed up
>>>> > > translations. And you don't have to
>>>> > > worry about google putting in a weird
>>>> > > translation, since it is all based
>>>> > > on earlier translations done by a human.
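
After a merge like the one quoted above, it can be handy to see how many entries came out fuzzy before starting the review. The following is a rough sketch, not part of anyone's existing tooling: the function name count_fuzzy is made up for illustration, and it assumes the usual PO layout (blank lines between entries, a "#, fuzzy" flag line on fuzzy entries, "#~" prefixes on obsolete ones). It needs only the Python standard library.

```python
import re


def count_fuzzy(po_text):
    """Return (fuzzy, total) entry counts for PO text.

    The header entry and obsolete (#~) entries are not counted.
    """
    fuzzy = total = 0
    for block in re.split(r"\n[ \t]*\n", po_text):
        block = block.strip("\n")
        if not block.strip():
            continue
        # Skip the header: an empty msgid directly followed by msgstr.
        if re.search(r'^msgid ""\nmsgstr ', block, re.M):
            continue
        lines = block.split("\n")
        # Obsolete entries start with "#~ msgid", so they have no
        # line beginning with "msgid " and are skipped here.
        if not any(line.startswith("msgid ") for line in lines):
            continue
        total += 1
        # Flags live on lines starting with "#,", e.g. "#, fuzzy, c-format".
        if any(line.startswith("#,") and "fuzzy" in line for line in lines):
            fuzzy += 1
    return fuzzy, total
```

Calling count_fuzzy on the text of merged_ddtp.po gives a quick idea of how much of the older translation could be carried over as suggestions.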
>>>> > >
>>>> > > On a related note, if any of you work on
>>>> > > ddtp-translations offline, I
>>>> > > have written a python program that can
>>>> > > sort entries in ddtp po-files
>>>> > > based on the popularity of the package.
>>>> > > This way, the most popular
>>>> > > packages will be at the top of the po
>>>> > > file, and you are always sure you
>>>> > > are working on the most important
>>>> > > packages first.
>>>> > >
>>>> > > You can get the code here:
>>>> > > bzr branch lp:~redmar/+junk/ddtp_popsort
>>>> > >
>>>> > > It has a small readme file, please let
>>>> > > me know if something is unclear
>>>> > > or not working for you.
>>>> > >
>>>> > > Regards,
>>>> > > Redmar
>>>> > > --
>>>> > > Ubuntu Dutch Translators
>>>> > >
>>>> > >
>>>> > > Hannie Dumoleyn wrote on Mon 17-12-2012
>>>> > > at 17:58 [+0100]:
>>>> > > > Hello Pierre,
>>>> > > > This is a very good idea! I have just
>>>> > > uploaded the first part of the
>>>> > > > incomplete Dutch translation (900kb)
>>>> > > to GTT.
>>>> > > > Thanks,
>>>> > > > Hannie
>>>> > > >
>>>> > > > On 17-12-12 12:55, Pierre Slamich
>>>> > > wrote:
>>>> > > >
>>>> > > > > The DDTP represents around 50 000
>>>> > > strings to translate * 140
>>>> > > > > languages. On very good weeks, a
>>>> > > typical translation team translates
>>>> > > > > 500 strings (see UWN for example
>>>> > > weekly figures).
>>>> > > > >
>>>> > > > >
>>>> > > > > It would take a lot of weeks (years?)
>>>> > > with highly motivated volunteers
>>>> > > > > of a large translation team, working
>>>> > > non-stop, at their best to get
>>>> > > > > done with it.
>>>> > > > > Thus we had the idea to delegate
>>>> > > initial translation suggestions to
>>>> > > > > Google Translator Toolkit and review
>>>> > > translations with humans to speed up
>>>> > > > > the process.
>>>> > > > >
>>>> > > > > We successfully did an import for
>>>> > > circa 40 000 French strings (yup
>>>> > > > > you read that right) this week-end
>>>> > > in a mock project called DDTP
>>>> > > > > Automation
>>>> > > (https://translations.launchpad.net/ddtpautomation).
>>>> > > > > To keep it short, the translations
>>>> > > from this project appear as
>>>> > > > > suggestions in the French DDTP, and
>>>> > > can be reviewed by actual
>>>> > > > > translators.
>>>> > > > > We've started using them, and it
>>>> > > turns out that a lot of them are
>>>> > > > > actually useful and are speeding up
>>>> > > the translation process a lot.
>>>> > > > >
>>>> > > > > We detailed the (somewhat) tedious
>>>> > > process in English at
>>>> > > > >
>>>> > >http://lite.framapad.org/p/ddtpUbuntu
>>>> > > > > Questions and inquiries welcome.
>>>> > > > >
>>>> > > > > Pierre
>>>> > > > >
>>>> > > > >
>>>> > > > > ---
>>>> > > > >pierre.slamich at gmail.com
>>>> > > > >
>>>> > > > >
>>>> > > >
>>>> > >
>>>> > >
>>>> > >
>>>> > > --
>>>> > > ubuntu-translators mailing list
>>>> > >ubuntu-translators at lists.ubuntu.com
>>>> > >https://lists.ubuntu.com/mailman/listinfo/ubuntu-translators