[RFC] shlex.split is bad for win32

Alexander Belchenko bialix at ukr.net
Thu Jun 25 10:33:11 BST 2009


In [1]: import shlex

In [2]: shlex.split(r'foo C:/bar/spam')
Out[2]: ['foo', 'C:/bar/spam']

In [3]: shlex.split(r'foo C:\bar\spam')
Out[3]: ['foo', 'C:barspam']
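
The backslashes disappear because shlex.split parses in POSIX mode by
default, where a backslash outside quotes escapes the next character.
A minimal sketch of the difference, assuming Python 3 syntax (the
session above is from an older interpreter); note that posix=False
keeps the backslashes but also leaves the surrounding quotes in place,
so it is not a complete fix on its own:

```python
import shlex

command = r'foo C:\bar\spam'

# Default POSIX mode: an unquoted backslash is consumed as an escape.
posix_args = shlex.split(command)

# Non-POSIX mode: backslashes are kept as ordinary characters.
non_posix_args = shlex.split(command, posix=False)

# Non-POSIX mode does not strip quotes from a quoted argument.
quoted_args = shlex.split(
    r'"C:\Program Files\WinMerge\WinMergeU.exe" a.txt', posix=False)

print(posix_args)
print(non_posix_args)
print(quoted_args)
```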

This affects some plugins, but more importantly it affects
`bzr diff --using xxx`.

E.g. bzr diff --using "C:\Program Files\WinMerge\WinMergeU.exe" does not 
work, while bzr diff --using "C:/Program Files/WinMerge/WinMergeU.exe" 
does actually work.

Instead of shlex.split, the Win32 API function CommandLineToArgvW()
should be used
(http://msdn.microsoft.com/en-us/library/bb776391(VS.85).aspx).
This function is already used in the win32utils:get_unicode_argv()
function.
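
For illustration, here is a minimal sketch of calling
CommandLineToArgvW() through ctypes. It is not the code from
get_unicode_argv(); the helper name command_line_to_argv is
hypothetical, Python 3 is assumed, and on non-Windows platforms the
sketch simply raises OSError:

```python
import ctypes
import sys


def command_line_to_argv(command_line):
    """Split command_line using the Win32 CommandLineToArgvW rules.

    Hypothetical helper for illustration; only works on Windows.
    """
    if sys.platform != 'win32':
        raise OSError('CommandLineToArgvW is only available on Windows')

    shell32 = ctypes.WinDLL('shell32')
    kernel32 = ctypes.WinDLL('kernel32')

    # LPWSTR* CommandLineToArgvW(LPCWSTR lpCmdLine, int *pNumArgs)
    shell32.CommandLineToArgvW.argtypes = [
        ctypes.c_wchar_p, ctypes.POINTER(ctypes.c_int)]
    shell32.CommandLineToArgvW.restype = ctypes.POINTER(ctypes.c_wchar_p)
    kernel32.LocalFree.argtypes = [ctypes.c_void_p]
    kernel32.LocalFree.restype = ctypes.c_void_p

    argc = ctypes.c_int(0)
    argv = shell32.CommandLineToArgvW(command_line, ctypes.byref(argc))
    if not argv:
        raise ctypes.WinError()
    try:
        # Copy the strings out before the buffer is released.
        return [argv[i] for i in range(argc.value)]
    finally:
        # The returned array must be released with LocalFree.
        kernel32.LocalFree(argv)
```

One caveat: when given an empty string, CommandLineToArgvW() returns
the path of the current executable, so a caller would probably want to
handle empty input separately.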

There is a special function for this in commands.py:

def shlex_split_unicode(unsplit):
    import shlex
    # Round-trip through UTF-8 so that shlex only sees byte strings.
    return [u.decode('utf-8') for u in shlex.split(unsplit.encode('utf-8'))]


So I think the splitting should be platform dependent and use the right
method on win32. Right?
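
A rough sketch of what such a platform-dependent wrapper could look
like, assuming Python 3. The name split_command_line and its platform
and win32_splitter parameters are hypothetical, not existing bzr API.
The Windows-specific splitter (for example one built on
CommandLineToArgvW()) is passed in, and shlex.split(posix=False) is
used only as a stand-in when none is given:

```python
import shlex
import sys


def split_command_line(unsplit, platform=sys.platform, win32_splitter=None):
    """Split a command line using rules suited to the given platform.

    Hypothetical wrapper for illustration only.
    """
    if platform == 'win32':
        if win32_splitter is None:
            # Stand-in: keeps backslashes, but also keeps quotes.
            return shlex.split(unsplit, posix=False)
        return win32_splitter(unsplit)
    # Everywhere else the POSIX rules of shlex are appropriate.
    return shlex.split(unsplit)
```

Callers keep using one function name, so the Windows behaviour can be
changed later without touching them.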

Dear plugin writers! Please use the function shlex_split_unicode in your
plugins and never call shlex.split directly. Then, once someone fixes
its handling of backslashes, your plugin will work much better on
Windows.
Thanks.



