[win32] non-ascii/non-english file names: internal usage of file names
Alexander Belchenko
bialix at ukr.net
Tue Nov 29 14:12:05 GMT 2005
As far as I can see, the cardinal difference between the Windows version
of Python and Linux/Cygwin Python is this: when you use a flat (byte)
string on Windows for the base part of a file name, all derived file
names are always represented as flat strings. On Linux/Cygwin, as far as
I can see, when a path cannot be represented as a flat string (or in
ascii encoding?) it is silently converted to unicode. As a result we have
different behaviour with non-ascii (non-English) file names.
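The documented rule underneath this is that os.listdir() returns names of the same type as the directory argument it receives. Here is a minimal sketch of that rule only; it does not reproduce the Linux/Cygwin conversion described above. The scratch directory, the file name and the helper listing_types are hypothetical, made up for illustration.

```python
import os
import shutil
import sys
import tempfile


def listing_types(directory):
    """Return the set of type names of the entries os.listdir() yields."""
    return set(type(name).__name__ for name in os.listdir(directory))


# Hypothetical scratch directory, created only for this demonstration.
scratch = tempfile.mkdtemp()

# Build a byte-string and a unicode spelling of the same directory.
# Assumption: the path round-trips through the filesystem encoding.
fs_encoding = sys.getfilesystemencoding() or 'utf-8'
if isinstance(scratch, bytes):
    byte_dir, text_dir = scratch, scratch.decode(fs_encoding)
else:
    byte_dir, text_dir = scratch.encode(fs_encoding), scratch

open(os.path.join(text_dir, 'example.txt'), 'w').close()

byte_types = listing_types(byte_dir)   # flat string in, flat strings out
text_types = listing_types(text_dir)   # unicode in, unicode out

shutil.rmtree(text_dir)
```

So whichever type the base path has is the type that every name derived from the listing will carry.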
To work around this incompatibility, bzrlib code should always use
unicode file paths for all operations. The key points here are default
directory values such as '.' used when constructing a Branch object, etc.
So far I have found 2 weak points in bzrlib.
1) file bzrlib/builtins.py, function branch_files -- should have the definition:
    def branch_files(file_list, default_branch=u'.'):
                                               ^^^^
2) file bzrlib/add.py, function _prepare_file_list(file_list) --
the default file list for the add command should also be a unicode string:
    def _prepare_file_list(file_list):
        """Prepare a file list for use by smart_add_*."""
        import sys
        if sys.platform == 'win32':
            file_list = glob_expand_for_win32(file_list)
        if not file_list:
            file_list = [u'.']  # <<<<<<<<<<<<<<<<<<<< !!!!!
        file_list = list(file_list)
        return file_list
There are also other places in the code where a flat string such as '.'
or '' is used. I'll try to find them all and fix them, then I'll create
a patch.
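One possible way to guard such entry points is a small helper that turns any flat string into unicode before it is used as a path. This is only a sketch: to_unicode_path is a hypothetical name, not something in bzrlib, and decoding with the filesystem encoding is an assumption.

```python
import sys


def to_unicode_path(path, encoding=None):
    """Return path as a unicode string, decoding flat strings if needed.

    Hypothetical helper for illustration; it is not part of bzrlib.
    """
    if isinstance(path, bytes):
        # Assumption: flat-string paths are in the filesystem encoding.
        encoding = encoding or sys.getfilesystemencoding() or 'utf-8'
        return path.decode(encoding)
    # Already unicode: return unchanged.
    return path


# A flat '.' default becomes a unicode '.' before any path is derived.
default_dir = to_unicode_path(b'.')
```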
But there is another critical problem with non-ascii file names. In
some situations a StringIO file-like object is used to catch the output
of another command. But StringIO can catch only ascii flat strings or
unicode strings. So when I try to commit a tree with non-ascii file names
and want to use an external editor for entering the commit message, I get:
UnicodeEncodeError: 'ascii' codec can't encode characters in position
0-5: ordinal not in range(128).
What is the plan for this situation?
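One option, sketched here under assumptions and not as what bzrlib does or plans to do, is to collect the output in a byte buffer and encode every unicode string explicitly before writing. The helper name capture_as_bytes and the choice of utf-8 are hypothetical.

```python
import io


def capture_as_bytes(lines, encoding='utf-8'):
    """Collect lines in a byte buffer, encoding unicode strings explicitly."""
    buf = io.BytesIO()
    for line in lines:
        if not isinstance(line, bytes):
            # Encode unicode here, so no implicit ascii conversion happens.
            line = line.encode(encoding)
        buf.write(line)
    return buf.getvalue()


# A mix of a unicode line with a non-ascii file name and a flat-string line.
captured = capture_as_bytes([u'added \u0444\u0430\u0439\u043b.txt\n', b'done\n'])
text = captured.decode('utf-8')
```

The buffer then holds only bytes, and the single decode at the end uses a known encoding instead of the implicit ascii codec.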
Alexander