[MERGE] Test for bug #272444 (symlinks to Unicode file names)

Daniel Clemente dcl441-bugs at yahoo.com
Sat Oct 18 15:06:28 BST 2008


John Arbash Meinel <john at arbash-meinel.com> writes:

> --- bzrlib/tests/branch_implementations/test_sprout.py	2007-11-01
> 09:52:45 +0000
> +++ bzrlib/tests/branch_implementations/test_sprout.py	2008-10-17
> 09:15:25 +0000
> @@ -1,4 +1,5 @@
>  # Copyright (C) 2007 Canonical Ltd
> +# -*- coding: utf-8 -*-
>
> ^- I may be coming late to this, but we shouldn't need a line like this.
> IIRC, we don't allow non-ascii characters in our source files (at least
> as much as we can manage that.) Pretty much always we can get away with
> ...

  I changed that, even when I personally don't like that restriction. I think it's unfortunate that Python 2.4 defaults to ASCII, but that's not a Bazaar issue.

>
> Also, 'ó' is a combining character, so it is going to get us in trouble
> on Mac, which will expand u'\xf3' to u'o\u0301'.
>

  Isn't this also worth testing? A link to 'ó' should also work on Mac, and if it doesn't, a test should catch this error. Maybe you want a separate test for that which focuses only on combining characters?
  I changed the target in my test to Ω anyway.


> And just to mention *why* it isn't supported. Symlink targets are just
> 'non-null' 8-bit strings, with no defined encoding. I believe the true
> encoding is just whatever the filesystem encoding is. Which means that
> we probably need to decode the string when we read it from disk, track
> it in memory as a Unicode string, store it into our inventories as
> UTF-8, and then encode it when we go to create the symlink.
>

  I also think this. The other alternative (store targets' bytes exactly as they are, even if you get broken links at checkout time) wouldn't be useful to most users, because they want their stored data to work in any other computer, independently of the encoding. It is thus better to adapt to the user's encoding.

  In addition, symlinks have a fingerprint in dirstate.py which is equal to their target. Since the fingerprint is per definition „a nonempty utf8 sequence“, the target must also be stored in utf8.

  This discussion could follow in bug #272444.


>
> +            raise KnownFailure('there is no support for symlinks to
> non-ASCII targets (bug 272444)')
> +
>
> ^- This line is longer than 80 characters, just split it into something
> like:

  My monitor has place for more than 80 characters, and my editor knows how to break lines automatically! I will never understand that convention.
  However, if that is what Bazaar mandates, I change it.

> +            raise KnownFailure('there is no support for'
> +		' symlinks to non-ASCII targets (bug #272444)')
> +
>
> (I *personally* prefer to see bug numbers with # in front of them, but
> there is no specific style guidline for that. Though we are consistent
> about it in the NEWS file.)
>

  Changed.

>
> === modified file 'bzrlib/tests/workingtree_implementations/test_parents.py'
>
> ^- I *do* think that we need a 'workingtree_implementations' test for
> this, but I don't see why it is going in 'test_parents'. Could you
> explain why you thought it should be there?
>

  The reason to put it there is that set_parent_trees seems the function which makes the bug visible. I did that after Robert Collins' message:
http://article.gmane.org/gmane.comp.version-control.bazaar-ng.general/47303


> Especially considering that the tree is already *at* [revision] I don't
> see why "set_parent_ids([revision])" would be anything other than a
> no-op. (Which at a minimum would mean your test fails for the wrong reason.)
>

  This was my fault; thanks for mentioning it. I'm still learning the API.
  I have changed it to simulate an uncommit; I hope it makes sense now.



  I attach a new patch with all your suggestions. Still to decide are:
- which of 'ó' or 'Ω' is more useful to test Unicode
- location of the test (now at test_parents.py)


  Thanks,

Daniel

-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_272444_v7.patch
Type: text/x-diff
Size: 9938 bytes
Desc: seventh version of new test
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20081018/d7c3a307/attachment.bin 


More information about the bazaar mailing list