[MERGE] Test for bug #272444 (symlinks to Unicode file names)
Daniel Clemente
dcl441-bugs at yahoo.com
Mon Oct 27 01:26:10 GMT 2008
Andrew Bennetts <andrew.bennetts at canonical.com> writes:
> I don't know what needs fixing, but I do think that it would be better for
> commit to fail than to make an unbranchable/uncheckoutable branch.
I don't think there's much use in moving the source of the error elsewhere when we can try to fix bug #272444 and make both commit and branch work.
Maybe later, commit() can raise a warning if it can't detect the system encoding for the symlink target, but that's another feature to implement, and for a very special case.
> Interestingly, when I try from the command line to commit a symlink to adiós I
> get a traceback,
I can do that commit. Try this test; it fails for me (with Python 2.5) on branch, not on commit: https://bugs.launchpad.net/bzr/+bug/272444/comments/1
If you see new strange behaviours, you could write a new test or expand the existing ones.
>> ...should I change something else? I'll send the patch when we decide if we need both tests or not.
>
> Unfortunately, yes: make the tests pass with Python 2.4, and ideally Python 2.6.
I attach a new patch which:
- breaks long lines
- provides in a comment the path of the other test's file
- gives some reasons for having two tests
- passes on Python 2.4, 2.5 and 2.7a0 (so probably 2.6 too), for both tests
Still remaining:
- Aaron, could you tell if a branch_implementation test is needed? (see parent message)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_272444_v8.patch
Type: text/x-diff
Size: 11605 bytes
Desc: eighth version of new test
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20081027/7da42941/attachment.bin
-------------- next part --------------
> And ideally fix the UnicodeWarnings that the tests emit in 2.5.
I think this is part of actually fixing Bazaar, not the tests.
I still don't have the knowledge to write a proper, clean patch, but I'm taking my first steps towards fixing the bug. Let me explain what I noticed:
The problem shows itself at repository.py, line ~400:
    elif kind == 'symlink':
        current_link_target = content_summary[3]
        if not store:
            # symlink target is not generic metadata, check if it has
            # changed.
            if current_link_target != parent_entry.symlink_target:
                store = True
At that line, current_link_target is '\xce\xa9' but parent_entry.symlink_target is u'\u03a9'.
My tentative interpretation is that the second value is wrong, because it should have been encoded to utf-8 before being written into the repository. Both values should be in the same form, either unicode or utf-8, and I think it must be utf-8 because that's what fingerprints are.
My guess is that whoever wrote that u'\u03a9' to the inventory should have encoded it to utf-8.
And that would be workingtree_4.py, around line 1590:
    elif kind == 'symlink':
        inv_entry.executable = False
        inv_entry.text_size = None
        inv_entry.symlink_target = utf8_decode(fingerprint)[0]
That could be changed to just "inv_entry.symlink_target = fingerprint" to always use utf-8. Other code would need to be adapted accordingly.
Is that the right way? In any case, I'd appreciate some pointers on how to implement this.
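
To make the mismatch concrete, here is a minimal sketch of the comparison problem and of the "keep everything in utf-8" idea. It is written in Python 3 syntax so it can be run standalone, whereas the bzrlib code above is Python 2 (there, as far as I understand, the same mixed comparison is what triggers the UnicodeWarning and is treated as unequal). The helpers to_utf8 and link_target_changed are hypothetical names made up for this illustration; they are not bzrlib functions, and this is not a proposed patch.

```python
# Illustrative sketch only: to_utf8() and link_target_changed() are
# hypothetical helpers, not part of bzrlib.

def to_utf8(target):
    """Return a symlink target as UTF-8 bytes, whichever form it arrived in."""
    if isinstance(target, str):
        return target.encode("utf-8")
    return target


def link_target_changed(current, parent):
    """Compare two symlink targets after normalising both to UTF-8 bytes."""
    return to_utf8(current) != to_utf8(parent)


# The two values observed in repository.py for the same link target.
current_link_target = b"\xce\xa9"   # UTF-8 bytes, as in content_summary[3]
parent_symlink_target = "\u03a9"    # decoded text, as in the parent entry

# Comparing the two representations directly never matches, so the link
# is wrongly considered changed.
naive_changed = current_link_target != parent_symlink_target

# Once both sides are in the same encoding, they compare as the same target.
normalised_changed = link_target_changed(current_link_target,
                                         parent_symlink_target)
```

With these values, the direct comparison reports a change while the normalised one does not, which matches what I see at the "store = True" line. Whether the normalisation belongs at the comparison or at the point where the inventory entry is written is exactly the open question.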
--
Daniel