Fwd: [Bug 340394] Re: redirecting log output to a file produces the wrong encoding on win32

Alexander Belchenko bialix at ukr.net
Tue Apr 21 12:28:56 BST 2009


Philippe Lhoste wrote:
> On 21/04/2009 12:13, Martin Pool wrote:
>> https://launchpad.net/bugs/340394
>>
>> Does anyone have a clear idea of
>> 1- what output encoding bzr should use on Windows when output is
>> redirected to a file and running in a terminal using cp850?
> 
> I can only guess, but indeed the command line window uses CP850, which I 
> knew as the "OEM" codepage (as opposed to the "Ansi" Windows codepage, 
> CP1252, aka Windows-1252, WinLatin1...). At least in Western Europe and the USA.
> It is an annoying quirk, even in standard Windows commands: if you type 
> dir > x you will find (on my French system) lines like:
> 
> 9 R‚p(s)  54ÿ092ÿ599ÿ296 octets libres
> 
> where it should be Rép(s) and 54 092 599 296 (I suppose the ÿ are 
> non-breaking spaces in OEM).
> 
> I think you cannot find out if output must be in terminal CP or Windows 
> CP depending on whether output goes to the console or is redirected/piped...
> 
>> 2- what mechanism if any should be available to control it?
>>
>> I suppose we could (like svn?) have a parameter that specifies the
>> output encoding...
> 
> Makes sense, it is the most flexible and usable solution! That, or 
> adding a filter program on the redirection...
> 
> Note: I typed a commit message (-m) with accents (I usually make these 
> messages in English, but one can still introduce accents, like in 
> "Bézier curves", etc.).
> Output with bzr log is correct, wrong when redirected to a file, but 
> correct with qlog. 

To get correct output when you are redirecting to a file, you have to 
switch the terminal encoding with the chcp command first. Actually, you 
can switch and use it all the time without problems if you change the 
console font to a unicode font (on Windows 2000 it's Lucida Console).

> I suppose these messages are converted from the local 
> codepage to UTF-8.

The message is converted from cp1252 to unicode when the bzr command is 
invoked, then saved as utf-8 internally in the bzr repository. All three 
variants of log (log, log> and qlog) deal with unicode internally.
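
As a rough sketch of that round trip (Python 3 for brevity; the sample 
message and variable names are only illustrative, this is not bzr code), 
here is why a redirected file looks wrong when cp850 bytes are read as 
cp1252:

```python
# Illustrative only: the encodings are the ones discussed in this thread,
# the message and names are made up for the example.
typed_bytes = "Bézier curves".encode("cp1252")   # bytes as typed, assumed cp1252

unicode_msg = typed_bytes.decode("cp1252")       # decoded to unicode on input
stored = unicode_msg.encode("utf-8")             # form kept in the repository

# Output encoded for a cp850 terminal, then redirected to a file.
redirected = unicode_msg.encode("cp850")

# A Windows editor that assumes cp1252 misreads those cp850 bytes.
seen_in_editor = redirected.decode("cp1252")
print(seen_in_editor)
```

The accented letter is the single byte 0x82 in cp850, which cp1252 
renders as a low quotation mark, the same symptom as in the dir example.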

> If I type the message in an editor (supposedly in CP-1252, display would 
> be wrong otherwise anyway with my font), accents are correctly displayed 
> by log and qlog (and still not with redirect).

That's expected. bzr uses the ANSI encoding to decode your commit message 
from the editor.

> If I force my editor to save in UTF-8, with or without BOM, log is wrong...

That's also expected; see my comment just above.
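
Both editor cases can be reproduced with a small sketch (Python 3; 
cp1252 is hard-coded as the assumed ANSI codepage and the helper name is 
hypothetical, not a bzr function):

```python
ANSI = "cp1252"  # assumption: the Western European Windows ANSI codepage

def decode_like_ansi(raw):
    """Decode editor output the way an ANSI-only reader would."""
    return raw.decode(ANSI)

text = "Bézier curves"

saved_as_ansi = text.encode("cp1252")         # editor saving in CP-1252
saved_as_utf8 = text.encode("utf-8")          # editor forced to UTF-8, no BOM
saved_as_utf8_bom = text.encode("utf-8-sig")  # editor forced to UTF-8 with BOM

ok = decode_like_ansi(saved_as_ansi)
broken = decode_like_ansi(saved_as_utf8)
broken_bom = decode_like_ansi(saved_as_utf8_bom)

for label, value in [("ansi", ok), ("utf-8", broken), ("utf-8+bom", broken_bom)]:
    print(label, repr(value))
```

Only the CP-1252 file survives intact; the UTF-8 files are decoded byte 
by byte, so each accented letter turns into two wrong characters.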

I don't see a reliable way to solve this problem without a special global 
option.
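
To make the idea concrete, here is a minimal sketch of how such an 
option could behave (Python 3; the function names and the option itself 
are hypothetical, nothing like this exists in bzr as far as this thread 
establishes): an explicit value wins, otherwise the terminal encoding is 
used as today.

```python
def choose_output_encoding(option_value, terminal_encoding):
    """Return the explicit option if given, else the terminal encoding."""
    return option_value or terminal_encoding

def render(text, option_value=None, terminal_encoding="cp850"):
    """Encode log text for output; unencodable characters become '?'."""
    encoding = choose_output_encoding(option_value, terminal_encoding)
    return text.encode(encoding, errors="replace")

default_bytes = render("Bézier")             # no option: terminal codepage
ansi_bytes = render("Bézier", "cp1252")      # user asks for the ANSI codepage
utf8_bytes = render("Bézier", "utf-8")       # user asks for UTF-8

print(default_bytes, ansi_bytes, utf8_bytes)
```

With something like this, a user redirecting to a file could ask for 
cp1252 or utf-8 explicitly instead of switching the console with chcp.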



