[MERGE][#139318] bzr send @ win32: ensure that command line to invoking Thunderbird is 8-bit string, not unicode (because subprocess does not understand non-ascii unicode)

Alexander Belchenko bialix at ukr.net
Thu Feb 28 07:20:23 GMT 2008


Aaron Bentley пишет:
> Alexander Belchenko wrote:
>> Test case provided.
> 
> Thanks.
> 
>> I restrict the test for Thunderbird client for now,
>> because it uses url instead of abs path to attachment. I think I should not
>> touch paths (neither path to executable nor attachment).
> 
> I think this isn't right.  If the path to an executable or attachment is
> unicode, then on Windows, won't Python raise a UnicodeEncodeError?

It's work for me, because executable 'C:\Program Files\Mozilla Thunderbird\thunderbird.exe'
is actually ascii string. And ThunderbirdClient class turns attachment path to URL.

This bug is very hard to fix *right* in common sense. There is too much specific details.
I'm start to think that Python is not very good to help us deal with unicode hell.
And Python 3.0 makes situation only worse.

>>> Actually, it looks like we need to encode all arguments that may be
>>> provided, so I think the right place would be in
>>> ExternalMailClient._compose, doing
>>>
>>> cmdline = [a.encode(user_encoding, 'replace') for a in cmdline]
>>>
>>> If you do it there, it's pretty hard to test, so I wouldn't require it.
>> This approach actually is unsafe because
>> unicode_path.encode(user_encoding, 'replace')
>> could break absolute unicode paths to executable or attachment.
> 
> I believe an absolute unicode path to an executable or attachment would
> be unusable.  These are 8-bit APIs, and it looks like on Windows, Python
> isn't encoding properly.  So guess is that on Windows, all unicode
> commandline parameters must be encoded.

For executable path I agree, but not for attachment path. Otherwise Thunderbird
will produce wrong URL.

>> +    * Encode recipient and subject line from unicode to user_encoding before
>> +      invoking external mail client (e.g. Thunderbird) in `bzr send` command
>> +      on Windows. (#139318, Alexander Belchenko)
> 
> No longer only on Windows.

Yes, will fix.

>> --- bzrlib/mail_client.py	2007-11-28 17:27:10 +0000
>> +++ bzrlib/mail_client.py	2008-02-27 18:52:57 +0000
>> @@ -144,9 +144,8 @@
>>              the attachment type.
>>          """
>>          for name in self._get_client_commands():
>> -            cmdline = [name]
>> -            cmdline.extend(self._get_compose_commandline(to, subject,
>> -                                                         attach_path))
>> +            cmdline = _get_compose_8bit_commandline(name, to, subject,
>> +                attach_path)
> 
> Does this actually work?  It looks like it should be
> self._get_compose_8bit...

I don't understand your note. Yes, it works, I tested it manually.

>> === modified file 'bzrlib/urlutils.py'
>> --- bzrlib/urlutils.py	2007-12-21 17:26:37 +0000
>> +++ bzrlib/urlutils.py	2008-02-27 18:24:39 +0000
>> @@ -274,7 +274,8 @@
>>      # check for UNC path \\HOST\path
>>      if win32_path.startswith('//'):
>>          return 'file:' + escape(win32_path)
>> -    return 'file:///' + win32_path[0].upper() + ':' + escape(win32_path[2:])
>> +    return ('file:///' + str(win32_path[0].upper()) + ':' +
>> +        escape(win32_path[2:]))
> 
> I suppose drive letters can't be non-ascii, so this probably doesn't
> matter, but it would be nice to do escape(win32_path[0].upper()) for
> consistency.

I think it's redundant here. Please, don't force me do this.
I think it's not good idea.



More information about the bazaar mailing list