[MERGE] UTF-8 encoding in binary diffs
John Arbash Meinel
john at arbash-meinel.com
Thu Jul 12 16:29:00 BST 2007
Vincent Ladeuil wrote:
>>>>>> "robert" == Robert Collins <robertc at robertcollins.net> writes:
> robert> On Thu, 2007-07-12 at 16:50 +1000, Martin Pool wrote:
> >> In other cases this might have value though - for example when we want
> >> to allow for platforms with normalization. But even then it's
> >> probably better handled by having tests that don't need to exercise
> >> normalization use names that won't be affected.
> robert> I think what I'm getting at is that we can cheaply increase our test
> robert> coverage of non-ascii names by making all tests use
> robert> normalisation-requiring names whenever possible.
> If by that you imply that the tests will fail on HFS+ filesystems
> mounted via nfs, I think I will be strongly -1 on such an idea.
My understanding was that Robert would have us try a few names until we got one
we knew would be represented correctly.
Even Linux can have its encoding set to iso-8859-1 so some names will not be
Aaron had an interesting point about using something like os.stat(u'\u1234') to
see if it could be used. However, that still throws an OSError since the file
I'm fairly confident that Python is just going through
'sys.getfilesystemencoding()', so we can just grab that, and try a few
path.encode(fs_enc). Note we should actually use osutils._fs_enc (only in a
saner manner than accessing a private var), since it handles when sys.get...()
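
As a rough sketch of the "grab the encoding and try a few names" idea (this is
not actual bzrlib code; pick_encodable_name and the 'ascii' fallback are
hypothetical, made up for illustration), something like the following would
pick the first candidate that the filesystem encoding can represent:

```python
import sys


def pick_encodable_name(candidates, fs_enc=None):
    """Return the first candidate that encodes cleanly in fs_enc, else None.

    Hypothetical helper for illustration only.
    """
    if fs_enc is None:
        # Assumption: fall back to ASCII if Python reports no encoding.
        fs_enc = sys.getfilesystemencoding() or 'ascii'
    for name in candidates:
        try:
            name.encode(fs_enc)
        except UnicodeEncodeError:
            # This name cannot be represented on this platform; try the next.
            continue
        return name
    return None
```

For example, with fs_enc='iso-8859-1' the candidate u'\u1234' is skipped and
u'\xe5' is chosen. A test could skip itself when the result is None.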
I have mixed feelings overall, though.
I like having more unicode testing. And changing most tests to use Unicode
names does stress more code overall.
I'm not sure how it fits with "each test should test one and only one thing, so
that failures are clear."