Bazaar/Windows encoding trouble.
jal at etc.to
Tue Mar 3 16:31:06 GMT 2009
I'm trying to get a workable environment that allows both Windows workstations
and Linux workstations to share the same code base, stored in a Bazaar
repository. I've just mostly fixed the CRLF / LF line endings debacle and am
now trying to get both Windumbs and Linux to agree on a suitable encoding for
the files. As usual the Linux side is no problem. The Windows side, also as
usual, proves to be a horrible experience and timesink.
Currently most files are physically stored as UTF-8 files since we use
non-ASCII characters throughout the code. Editing these files with Windows
crudware usually causes problems since non-ASCII characters are corrupted
(because many tools use the abomination that is the "default codepage" to
load 8-bit files). I would like to make sure that these files are not damaged
when people edit them on Windows, whatever software they use.
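
To make the corruption concrete, here is a minimal Python sketch (Python 3
syntax, not the Python 2.5 shown further down; the sample string and the
helper name are made up for illustration) of what happens when a tool reads
UTF-8 bytes through an 8-bit default code page such as 1252 and then saves
the result again:

```python
def load_with_codepage(utf8_bytes, codepage="cp1252"):
    """Decode bytes the way a code-page-only tool would (hypothetical helper)."""
    return utf8_bytes.decode(codepage)


original = "caf\u00e9"                      # "café"
on_disk = original.encode("utf-8")          # b'caf\xc3\xa9'
seen_by_tool = load_with_codepage(on_disk)  # the two UTF-8 bytes become two characters
saved_again = seen_by_tool.encode("utf-8")  # writing it back makes the damage permanent
```

The single accented character ends up as two unrelated characters, and once
the file is saved again the original bytes are gone.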
After lots of experimentation I found out that Windows of course cannot
natively specify UTF-8 as the default code page (if you try, the %&*%(&%&
piece of crap does not boot anymore!?!? How's that for an error message!?).
I then tried to use iso-8859-15 encoding for at least the files that are
edited mostly by Windows sh*tware. This very standard encoding corresponds to
code page 28605 in Windows (NOT 1252 - that is iso-8859-1 plus Microsoft
extras; it puts the euro sign at 0x80 instead of 0xA4, where iso-8859-15 has
it, and has extra characters in places where iso-8859-1 has nothing. This
makes it unusable).
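
The euro sign mismatch between the two encodings can be checked with a few
lines of Python (a small sketch in Python 3 syntax, using the codec names
Python ships with; the variable names are only illustrative):

```python
euro = "\u20ac"  # EURO SIGN

latin9_bytes = euro.encode("iso-8859-15")  # b'\xa4'
cp1252_bytes = euro.encode("cp1252")       # b'\x80'

# The iso-8859-15 euro byte, read back by a tool that assumes code page 1252,
# comes out as the generic currency sign U+00A4 instead of the euro.
misread = latin9_bytes.decode("cp1252")
```

So a file written as iso-8859-15 and opened as 1252 (or the other way round)
silently swaps the euro for a different character.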
When I use this encoding (by simply and easily editing the bloody registry
to get Windows to use it - nice toolset there), most Windows programs properly
display iso-8859-15 encoded files, but Bazaar reports a warning when started:
PS C:\> bzr version
bzr: warning: unknown encoding cp28605. Continuing with ascii encoding.
Bazaar (bzr) 1.12
Python interpreter: c:\tools\python25.dll 2.5.2
Python standard library: c:\tools\lib\library.zip
Bazaar configuration: C:\Documents and Settings\jal\Application
This is already a bit strange since Windows has a separate code page (which
does not allow 28605 because otherwise things would be simple) for command
line crud, changed with the chcp command. I think "bzr" uses the Windows
encoding because it's a win32 console app.
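
The "unknown encoding cp28605" message suggests that Python's codec registry
simply has no entry under that name, even though it does ship an iso8859_15
codec. As a rough sketch only (Python 3 syntax, and assuming some place exists
where this could run before bzr looks up the encoding, which I have not
verified), the name could be aliased through the standard codecs.register()
hook. The function names below are made up for illustration:

```python
import codecs


def _cp28605_search(name):
    """Map the Windows code page name cp28605 onto Python's iso8859_15 codec."""
    if name == "cp28605":
        return codecs.lookup("iso8859_15")
    return None  # let other search functions handle every other name


def has_codec(name):
    """Return True if Python can resolve the given encoding name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


known_before = has_codec("cp28605")  # may be False, as in the bzr warning
codecs.register(_cp28605_search)
known_after = has_codec("cp28605")
```

Whether this would actually silence the bzr warning depends on how and when
bzr determines the terminal encoding, so treat it as an experiment rather
than a fix.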
Is this warning a problem? Is there a way to prevent it? Has anyone any
experience in shared code/encoding between Unix and Windows, preferably in
an encoding that has more than 255 characters? Would Windows "Native" encoding
be usable (which seems to be UTF-16)? Any experiences there?
Any help or thoughts would be welcome...
(a very frustrated) Frits Jalvingh