[MERGE][Bug #84043] Commit now invokes an external editor in non-ASCII directories

John Arbash Meinel john at arbash-meinel.com
Sat Nov 17 16:43:07 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Actually, I think there are 2 pieces that we need to consider here.

There are times when you get an exception because a path is not valid in your
sys.getfilesystemencoding() (for example, a path of \xff\xff when the fs
encoding is UTF-8).

There are *other* times when you get an exception because a path is valid
(جوجو) in sys.getfilesystemencoding() but not in bzrlib.user_encoding, which is
set from locale.getpreferredencoding().
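
The two failure modes can be reproduced in isolation. This is only a sketch in
Python 3 syntax: the helper names are made up for illustration, and 'utf-8' and
'latin-1' are stand-ins for whatever the fs encoding and user_encoding happen
to be on a given machine.

```python
def fails_to_decode(raw, encoding):
    """Return True if the raw 8-bit path is not valid in `encoding`."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return True
    return False


def fails_to_encode(text, encoding):
    """Return True if the Unicode path cannot be represented in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False


# Case 1: bytes on disk that are not valid in the (assumed) fs encoding.
case_one = fails_to_decode(b'\xff\xff', 'utf-8')

# Case 2: a path that is fine in the (assumed) fs encoding
# but not in the (assumed) user encoding.
case_two = (not fails_to_encode('جوجو', 'utf-8')
            and fails_to_encode('جوجو', 'latin-1'))
```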

On Linux, both are usually UTF-8 (sometimes latin-1, sometimes something else
if someone tried really hard to change it). On Windows, fs-enc is pretty much
always 'mbcs' (the ANSI code page, which covers only a subset of what the
UTF-16 filesystem can store), but user_encoding can be all over the map
(Russian Windows versus US, and so on). To add further problems, user_encoding
is usually *not* the terminal encoding.
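
To see how far apart these encodings are on a particular machine, something
like the following can be run. It is a rough sketch: the function name and the
dictionary keys are hypothetical, and the values differ by platform and locale.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings discussed above for the current process."""
    return {
        # Used to convert Unicode paths to and from the filesystem.
        'filesystem': sys.getfilesystemencoding(),
        # What bzrlib.user_encoding is derived from.
        'preferred': locale.getpreferredencoding(),
        # May be None when stdout is not a real terminal.
        'terminal': getattr(sys.stdout, 'encoding', None),
    }


if __name__ == '__main__':
    for name, value in sorted(report_encodings().items()):
        print('%s: %s' % (name, value))
```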

Anyway, on Windows, you can create a path like جوجو but you probably can't pass
that path to your editor.

If you look at the mkstemp call, it is using "dir=u'.'" because that causes it
to return the Unicode path.

This means that on Linux, if the cwd is not valid in fs-enc, mkstemp will raise
a UnicodeError before it returns the open file. We can work around this by
using "dir='.'", which always uses the raw 8-bit path [ignoring fs-enc]. That
also has the advantage that the path can probably be passed to an editor
without any problems.

That workaround is not perfect on Windows, because using "dir='.'" will give a
path with "????" in it [since it returns OEM encoding, which can't represent
all Unicode characters which are valid in the filesystem]. But since the path
is passed to editors in OEM encoding anyway, it is close to okay.

So I would recommend combining the two fixes with:

msgfileno, msgfilename = mkstemp(..., dir='.')
msgfilename = osutils.basename(msgfilename)
msgfile = os.fdopen(msgfileno, 'w')

The mkstemp call should never fail (both Windows and Linux have something to
return for any path you can have on your fs), and we know the filename will be
ASCII (we are creating it), so we don't have to worry about it being Unicode.
And by changing to basename(), we know we can pass that across to an editor.
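
For reference, here is a self-contained sketch of that approach using only the
standard library. It is not the bzrlib code: the function name and the
'bzr_log.' prefix are assumptions made for illustration, and it is written in
Python 3 syntax, where os.path.basename stands in for osutils.basename.

```python
import os
import tempfile


def write_message_template(text, prefix='bzr_log.'):
    """Create a temp file in the cwd and return its bare (ASCII) file name."""
    # mkstemp returns (file descriptor, path), in that order.
    fd, path = tempfile.mkstemp(prefix=prefix, dir='.')
    # Drop the directory part, so only the name we generated is left.
    # The random part mkstemp adds is ASCII, so the name is ASCII as long
    # as the prefix is.
    name = os.path.basename(path)
    # Wrap the already-open descriptor instead of reopening by path.
    with os.fdopen(fd, 'w') as msgfile:
        msgfile.write(text)
    return name
```

The returned name is relative, so an editor has to be started with the same
working directory for it to find the file.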

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHPxobJdeBCYSNAAMRAh6VAKDOjJpsPTOq4J4Yy8TyRuM1ZF7VuQCdGRPs
GTYmLgmP1oHfesx3DwqNLaI=
=tT0O
-----END PGP SIGNATURE-----


