[MERGE] [#196881] Fix merge redirection when remembered location used

John Arbash Meinel john at arbash-meinel.com
Fri Mar 7 09:10:18 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belchenko wrote:
> This patch is already merged but it has hidden the bug inside and will wait
> to blow up when someone will merge with remembered non-ascii local path.
> 
> Especially these lines:
> 
> === modified file 'bzrlib/builtins.py'
> --- bzrlib/builtins.py 29.02.2008 17:53:56
> +++ bzrlib/builtins.py 04.03.2008 18:53:03
> @@ -3008,10 +3008,8 @@
>          mutter("%s", stored_location)
>          if stored_location is None:
>              raise errors.BzrCommandError("No location specified or
> remembered")
> -        display_url = urlutils.unescape_for_display(stored_location,
> -            self.outf.encoding)
> -        self.outf.write("%s remembered location %s\n" % (verb_string,
> -            display_url))
> +        display_url = urlutils.unescape_for_display(stored_location,
> 'utf-8')
> +        note(u"%s remembered location %s", verb_string, display_url)
>          return stored_location
> 
> urlutils.unescape_for_display returns plain string in given encoding
> (utf-8 here)
> and then converted to unicode by note(u"%s remembered ...
> 
> So it will blow up with UnicodeDecodeError one day.

Actually, no. I did this explicitly:

>>> bzrlib.urlutils.unescape_for_display('http://foo/bar', 'utf8')
u'http://foo/bar'

unescape_for_display returns a *unicode* string. It just makes sure that
each section in the string is encodable in the target encoding. If it
*is* then it returns the unicode. If it *is not* then it returns the
original URL escaped form.

> 
> Also note that comment for urlutils.unescape_for_display claims that this
> function returns "A unicode string which can be safely encoded into the
> specified encoding"
> but it's not true. Because here is actual code for local urls:
> 
>             path = local_path_from_url(url)
>             path.encode(encoding)
>             return path
> 
> IMO my version of bugfix was safer for unicode paths.

You are missing something here.

path.encode(encoding) returns a string which we throw away. What you are
thinking is:

path = local_path_from_url(url)
path = path.encode(encoding)
return path

There is a fairly big difference.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFH0QYxJdeBCYSNAAMRAs2xAKDX34luxHk+P87iSHZxc4xrKtt+lQCgst1g
wJp7gWGnuFoXFy6/wUpLW4M=
=ckp5
-----END PGP SIGNATURE-----



More information about the bazaar mailing list