internal_diff should return bytestreams

John Arbash Meinel john at
Sat Jun 3 18:43:38 BST 2006

My 'compare_trees' changes exposed some other bugs, which I am fixing
with this patch.
Before my earlier changes, 'compare_trees' was actually returning
non-unicode path names. Switching it to unicode exposed problems in
callers that did things like:

delta = compare_trees(...)
for path, file_id, kind in delta.removed:
    has_changes = 1
    print >>to_file, '=== removed %s %r' % (kind, path)

Since compare_trees was now returning unicode paths, this would print
u'bar' rather than 'bar'.
(It is probably a bug that we are using %r anyway, but that is a side
issue.)

I fixed most of those already to use path.encode('utf8'), which is not
an ideal solution, but is good enough for now. (My encoding branch
should fix the rest.)
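
To make the path.encode('utf8') approach concrete, here is a minimal
sketch in modern Python 3 (not the Python 2 code quoted above, and not
the actual bzrlib code). The helper name write_removed is hypothetical;
it only shows the idea of formatting the path as text and encoding it
explicitly, so the output never depends on the repr() of the string type.

```python
import io


def write_removed(to_file, kind, path):
    """Write a '=== removed' line to a binary stream.

    Hypothetical helper for illustration only. `path` is a text
    (unicode) string; it is quoted by hand rather than with %r and
    then encoded as UTF-8 before being written.
    """
    line = "=== removed %s '%s'\n" % (kind, path)
    to_file.write(line.encode('utf8'))


# Usage: both an ASCII path and a non-ASCII path end up as plain bytes.
out = io.BytesIO()
write_removed(out, 'file', 'bar')
write_removed(out, 'file', 'b\u00e5r')
```

The point of the design is that the encoding step happens in one known
place, right before the write, instead of being left to whatever the
output stream does with a unicode object.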

The second place, which the test suite did not expose, was that
'internal_diff' would return a unicode string as the path header for
the '---' and '+++' lines.

Aaron Bentley and I worked on the attached patch so that internal_diff
will properly encode the filenames.
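
As a rough illustration of what "properly encode the filenames" means
for the header lines, here is a small Python 3 sketch. It is not the
attached patch and does not use the real internal_diff signature; the
function diff_header_lines and its encoding parameter are assumptions
made up for this example.

```python
def diff_header_lines(old_path, new_path, encoding='utf8'):
    """Return the '---' and '+++' header lines as byte strings.

    Hypothetical helper for illustration only. The paths arrive as
    text (unicode) strings and are encoded once, here, so everything
    returned is bytes and can be concatenated with the rest of a
    byte-oriented diff.
    """
    old = old_path.encode(encoding)
    new = new_path.encode(encoding)
    return [b'--- ' + old + b'\n', b'+++ ' + new + b'\n']


# Usage: non-ASCII file names come out as UTF-8 encoded bytes.
header = diff_header_lines('old/caf\u00e9.txt', 'new/caf\u00e9.txt')
```

Since file contents in a diff are bytes anyway, keeping the header
lines as bytes too avoids mixing unicode and byte strings in one
output stream.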

We both looked over it, so it has been submitted, but if there are truly
glaring problems, we should fix them.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: diff-unicode-paths.diff