[MERGE] UTF-8 encoding in binary diffs

Jonathan Lange jml at mumak.net
Mon Jul 9 08:48:29 BST 2007


On 7/7/07, John Arbash Meinel <john at arbash-meinel.com> wrote:
> John Arbash Meinel has voted +0.
> Status is now: Waiting
> Comment:
> The contents of the files should *not* be UTF-8 encoded. The *filenames*
> should be UTF-8 encoded. I think it is just a doc-string bug.
>

Changed.

> As near as I can tell, your test filenames are 'binary' and 'elephant',
> neither of which is non-ASCII. So I don't see why the test would ever
> fail.
>

I also don't know the reason. However, the test definitely fails
without the change to inventory.py. It's a port of a failing test in
Launchpad.

I'm pretty naïve when it comes to encoding issues -- please bear with me.

> So you should be using Unicode names like:
> 'omega':u'\u03a9', 'alpha':u'\u03b1'
>

Changed.

After making this change, I noticed that '=== added' and friends print
the repr() of the filename, not the str(). This means that users would
see "=== added file '\xce\xb1'" instead of "=== added file 'α'".

I've changed it to use %s instead of %r.
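
To illustrate the difference, here is a minimal sketch. It is written for
Python 3, not the Python 2 of the actual code, so the repr() of a byte
string carries an extra b prefix; the helper names header_with_repr and
header_with_str are made up for this example and are not bzrlib functions.

```python
# UTF-8 encoded filename for GREEK SMALL LETTER ALPHA (u'\u03b1').
name = u"\u03b1".encode("utf-8")


def header_with_repr(path_bytes):
    # %r formats repr(), which escapes every non-ASCII byte as \xNN.
    return "=== added file %r" % (path_bytes,)


def header_with_str(path_bytes):
    # Decoding first and formatting with %s keeps the character readable.
    return "=== added file '%s'" % path_bytes.decode("utf-8")


print(header_with_repr(name))
print(header_with_str(name))
```

The first line shows the escaped bytes, the second shows the filename as a
user would expect to read it.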

> (I would avoid things like u'\xe5' (a with ring) because of problems
> with combining characters on Mac).
>

File data no longer has u'\xe5'.
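
For reference, a small Python 3 sketch of why u'\xe5' is awkward, assuming
the Mac problem meant here is that HFS+ stores filenames in a decomposed
(NFD-like) form. The helper name decomposed is made up for this example.

```python
import unicodedata

a_ring = "\u00e5"  # the same character as u'\xe5', a with ring
alpha = "\u03b1"   # Greek small letter alpha
omega = "\u03a9"   # Greek capital letter omega


def decomposed(name):
    # NFD splits a precomposed character into base letter + combining mark.
    return unicodedata.normalize("NFD", name)


for name in (a_ring, alpha, omega):
    # Print the code points before and after normalization.
    before = [hex(ord(c)) for c in name]
    after = [hex(ord(c)) for c in decomposed(name)]
    print(before, "->", after)
```

The a with ring becomes two code points (a plus a combining ring), while
alpha and omega are unchanged, so the Greek names compare equal whichever
form the filesystem hands back.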

> +        self.assertContainsRe(diff, "\\+\\+\\+ b/elephant")
> ^- This seems a lot clearer to me as:
> +        self.assertContainsRe(diff, r"\+\+\+ b/elephant")
>
> (In fact, I would always use r'' for assertContainsRe unless you really
> need to escape something in the string, even if '\' never shows up
> there).
>

I normally use r"", but avoided it to be consistent with the other
tests. Changed to use raw strings.
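
A quick check that the two spellings are the same pattern, using plain
re.search in place of assertContainsRe (Python 3 sketch; the diff_line
value is made up for this example):

```python
import re

escaped = "\\+\\+\\+ b/elephant"  # every backslash doubled
raw = r"\+\+\+ b/elephant"        # raw string, backslashes written once

# Both literals produce the identical pattern string.
same_pattern = escaped == raw

# The pattern matches a literal "+++ b/elephant" line from a diff.
diff_line = "+++ b/elephant"
match = re.search(raw, diff_line)

print(same_pattern, match is not None)
```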

> So, I'm happy with the change, but I honestly don't see how the test is
> testing what you think it is.
>
>
> For details, see:
> http://bundlebuggy.aaronbentley.com/request/%3Cd06a5cd30707060557u7b55fbe7s1b3bb839d45185f9%40mail.gmail.com%3E
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: show-diff-trees-110092.diff
Type: text/x-diff
Size: 8196 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070709/e27fb590/attachment.bin 

