Filesystem paths
John Arbash Meinel
john at arbash-meinel.com
Tue Apr 25 15:17:26 BST 2006
Once again I need to ask the question of what we want the bzrlib API to
look like, when it comes to URLs versus Unicode.
I'm trying to get my 'bzr-encoding' branch back up and running so that
we can merge some of the changes (even if it doesn't land until 0.9).
The issue is that the Transport API was declared to be all "URL"
strings, while most of the file I/O code uses Unicode strings.
Our URL definition is urllib.quote(path.encode('utf-8')), that is,
UTF-8 encoded and then URL-quoted.
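A minimal sketch of that round trip, written against Python 3's urllib.parse rather than the Python 2 urllib the post refers to. The helper names path_to_url_segment and url_segment_to_path are hypothetical, not bzrlib functions:

```python
from urllib.parse import quote, unquote_to_bytes


def path_to_url_segment(path):
    """Unicode path -> UTF-8 bytes -> percent-quoted ASCII string."""
    # quote() leaves '/' unescaped by default, so separators survive.
    return quote(path.encode('utf-8'))


def url_segment_to_path(segment):
    """Inverse: unquote to raw bytes, then decode those bytes as UTF-8."""
    return unquote_to_bytes(segment).decode('utf-8')


quoted = path_to_url_segment('r\u00e9sum\u00e9/file.txt')
print(quoted)                       # r%C3%A9sum%C3%A9/file.txt
print(url_segment_to_path(quoted))  # the original unicode path
```

Every caller that holds a URL string has to run the inverse step before touching the filesystem, which is the overhead the rest of this message argues against.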
The biggest problem is this dichotomy between accessing files which go
through Transport, and accessing files directly (WorkingTree).
I know we've talked about pushing everything through Transport, but the
more I work with the encoding and unicode stuff, the more I want to say
that all strings internally are Unicode. And that Transport yields a
Unicode api, not a URL one.
When a user supplies a path to the command-line tool, that would have to
be a URL. But once it gets inside of bzrlib, it should be turned into a
unicode string.
The specific interface I'm talking about right now is
'TestCase.build_tree'. If I pass in a unicode string, I get the old
"ascii codec cannot decode byte...", and that is because the path goes
through "LocalTransport.abspath()" which uses:
    result = normpath(pathjoin(self.base, urllib.unquote(relpath)))
The result of the unquote function seems to be a regular (byte) string,
which causes problems for os.path.join() when one argument is a unicode
string and the other is a regular string.
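One possible way around the mismatch is to decode the unquoted bytes explicitly before joining. The sketch below is an assumption about how that could look, not the actual bzrlib fix. It uses Python 3's unquote_to_bytes to stand in for the byte string that unquote produces here, and posixpath so the result is the same on every platform. abspath_sketch is a hypothetical name.

```python
import posixpath
from urllib.parse import unquote_to_bytes


def abspath_sketch(base, relpath):
    """Hypothetical stand-in for LocalTransport.abspath().

    base is a unicode path; relpath is a URL-quoted relative path.
    """
    # Unquoting yields raw bytes, as in the failing case described above.
    raw = unquote_to_bytes(relpath)
    # Decode explicitly so both arguments to join() are unicode.
    decoded = raw.decode('utf-8')
    return posixpath.normpath(posixpath.join(base, decoded))


result = abspath_sketch('/tmp/tr\u00e9e', 'sub/../%C3%A5.txt')
print(result)
```

With both sides unicode, the join never has to guess an encoding, so the implicit ASCII decode that triggers the error does not happen.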
I really think that having an internal api be URLs is sub-optimal for
us. I think having the API be unicode makes a lot more sense.
If we want, we could make the __init__() function require a URL, and
then all get/put/listdir/... functions would get/return unicode.
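As a rough illustration of that split, here is a sketch in which only the constructor sees a URL and every other method takes or returns unicode. The class name UnicodeLocalTransport, the file:// handling, and the method bodies are all assumptions made for the example; none of it is the real bzrlib Transport.

```python
import os
import tempfile
from urllib.parse import quote, unquote_to_bytes


class UnicodeLocalTransport:
    """Hypothetical sketch: URL at construction, unicode everywhere else."""

    def __init__(self, base_url):
        prefix = 'file://'
        if not base_url.startswith(prefix):
            raise ValueError('this sketch only handles file:// URLs')
        # The URL is unquoted and decoded once, here, and nowhere else.
        self.base = unquote_to_bytes(base_url[len(prefix):]).decode('utf-8')

    def abspath(self, relpath):
        # relpath is plain unicode, so no unquoting is needed.
        return os.path.normpath(os.path.join(self.base, relpath))

    def put(self, relpath, data):
        with open(self.abspath(relpath), 'wb') as f:
            f.write(data)

    def get(self, relpath):
        with open(self.abspath(relpath), 'rb') as f:
            return f.read()

    def listdir(self, relpath='.'):
        # os.listdir() returns unicode names when given a unicode path.
        return sorted(os.listdir(self.abspath(relpath)))


# Usage: build a URL for a scratch directory, then work purely in unicode.
root = tempfile.mkdtemp()
t = UnicodeLocalTransport('file://' + quote(root.encode('utf-8')))
t.put('caf\u00e9.txt', b'hello')
print(t.get('caf\u00e9.txt'))
print(t.listdir())
```

The design choice is that quoting concerns stay at the boundary, so a test helper such as build_tree could pass unicode names straight through.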
I think the only time you really need a URL is when you are dealing with
an absolute path. Since most of the time we are actually dealing with
relative paths, I think Unicode makes a lot of sense.
John
=:->