Encoding branch alive again
John Arbash Meinel
john at arbash-meinel.com
Tue Apr 25 20:00:33 BST 2006
Hello everyone-
I thought I would let you know that I was able to resurrect my
bzr-encoding branch available from:
http://bzr.arbash-meinel.com/branches/bzr/encoding
(Warning: the branches have been upgraded to knits, so you need a
recent copy of bzr.dev to check them out. On the plus side, get
should be *much* faster.)
A lot of the tests fail on Mac OS X because of its path normalization
problems. I don't know whether I should just skip these tests or let
them fail for now. In general, though, unicode filenames aren't fully
supported under Mac.
I suppose it might also be possible to write a Normalization decorator
for Transport, which simulates the Mac behavior, but I don't feel
inclined to do so myself.
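
To make the normalization problem concrete, here is a minimal sketch in
plain Python 3 (standard library only, not bzrlib code). It assumes the
Mac filesystem hands back names in a decomposed form roughly equivalent
to NFD, while the user types the precomposed (NFC) form. The helper name
same_name is hypothetical and used only for illustration.

```python
import unicodedata


def same_name(a, b):
    """Compare two filenames after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Precomposed form, as a user would typically type it.
name_typed = "r\xe4ksm\xf6rg\xe5s"
# Decomposed form, standing in for what a normalizing filesystem returns.
name_on_disk = unicodedata.normalize("NFD", name_typed)

print(name_typed == name_on_disk)           # False
print(len(name_typed), len(name_on_disk))   # 10 13
print(same_name(name_typed, name_on_disk))  # True
```

A test that creates a file under one form and then looks for exactly
that string in a directory listing would fail on such a filesystem,
even though both strings display identically.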
Attached is the 2k-line patch, though almost all of the changes are
actually changes to the test suite (mostly adding more tests).
Some of the other tests fail because of the unicode/url issues.
Specifically, when you type:

    bzr branch a räksmörgås

then in builtins.cmd_branch.run:

    to_location = u'r\xe4ksm\xf6rg\xe5s'

which is then fed to:

    dir = br_from.bzrdir.sprout(to_location, revision_id, basis_dir)

and sprout then splits on '/' and does 'transport.mkdir(segments[-1])'.
The big problem is that arguments given by the user are decoded into
unicode strings, which is valid based on what a user would type and
how bash would complete strings.
However, we are expecting them to pass in URLs. That would make sense
if we were expecting "http://", since that is probably what would be
cut & pasted from their web browser.
I certainly can just change 'cmd_branch' such that it runs 'urlescape'
on its inputs.
I did do some tests on unicode urls as exposed by apache, and they are
urlquoted utf-8 strings.
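
As an illustration of what "urlquoted utf-8" means for the example
above, here is a small sketch using the Python 3 standard library. It
is not bzrlib's 'urlescape'; the function name escape_location is
hypothetical, and it assumes the input is a plain path typed by the
user rather than something that is already a URL.

```python
from urllib.parse import quote, unquote


def escape_location(location):
    """Encode a unicode path as UTF-8 and percent-quote it.

    '/' is left alone so path segments can still be split afterwards.
    """
    return quote(location, safe="/", encoding="utf-8")


to_location = "r\xe4ksm\xf6rg\xe5s"
escaped = escape_location(to_location)

print(escaped)                           # r%C3%A4ksm%C3%B6rg%C3%A5s
print(unquote(escaped) == to_location)   # True
```

Note that applying this to a string that is already quoted would quote
the '%' characters a second time, so the caller has to know whether it
was given a path or a URL.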
I think it would do us a lot of good to try to get this stuff into bzr
0.8. But it could be too big of a change, in which case we would want it
to be a 0.9 thing. (Though I think making it 0.8, with a 0.8.1 bugfix,
would mean that people using dapper could get decent unicode support.)
John
=:->
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bzr-encoding.patch
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20060425/dabca018/attachment.diff