Filesystem paths

Aaron Bentley at
Wed Apr 26 04:53:13 BST 2006

Martin Pool wrote:
> On 26/04/2006, at 12:17 AM, John Arbash Meinel wrote:
>> The specific interface I'm talking about right now is
>> 'TestCase.build_tree'. If I pass in a unicode string, I get the old
>> "ascii codec cannot decode byte...", and that is because the path goes
>> through "LocalTransport.abspath()" which uses:
>>     result = normpath(pathjoin(self.base, urllib.unquote(relpath)))

That's bogus, but I think it's a bug that build_tree doesn't use POSIX,
or at least doesn't translate the path to a URL.
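
To make "translate the path to a URL" concrete, here is a minimal sketch of
escaping a Unicode relative path before it reaches a transport. It assumes
Python 3's urllib.parse (the code quoted above used Python 2's urllib), and
escape_relpath/unescape_relpath are hypothetical names, not bzrlib API.

```python
from urllib.parse import quote, unquote


def escape_relpath(relpath):
    """Percent-encode a Unicode relative path as UTF-8, keeping '/' intact.

    Hypothetical helper for illustration only.
    """
    return quote(relpath, safe="/")


def unescape_relpath(url_relpath):
    """Inverse of escape_relpath: decode percent-escapes as UTF-8."""
    return unquote(url_relpath)


original = "dir/\u00e5ngstr\u00f6m.txt"
escaped = escape_relpath(original)
print(escaped)  # dir/%C3%A5ngstr%C3%B6m.txt
print(unescape_relpath(escaped) == original)  # True
```

The escaped form is pure ASCII, so a later unquote never has to guess at an
encoding.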

>> I really think that having an internal api be URLs is sub-optimal for
>> us. I think having the API be unicode makes a lot more sense.
> I agree - given Unicode is so much easier for most code to deal with
> I'd think we would want to keep that format most of the time, and only
> url-escape when necessary.  I think Robert was pretty keen to keep them
> mostly as URLs, so perhaps he could say more about why this is good?

Personally, I don't insist on our APIs being all-url.  In fact, I've
written some recently that weren't.

What I do insist on, is they be internally consistent, and I will throw
a major wobbler if I ever find any heuristics in the transport layer.

I don't know Robert's reasons, but the reason I like the transport layer
being all-url is because some transports *must* be url-based, and all
transports *can* be url-based.  It keeps the layer simple, promotes code
reuse, and all that good stuff.

What's frustrated me thus far about the transport layer is that it doesn't
use urls everywhere, and it's not easy to use.  For example, I was
recently fixing a bug to do with finding root directories in Windows
paths.  Unfortunately, that leaves us with an OS-specific test case.
Under Unix, we'll never test whether that functionality is right, and we
can't do so because the path-manipulation functions make different
assumptions on each platform.  If we were using URLs, we could have uniform
testing, because URL manipulations are the same on every platform.
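
A rough illustration of the difference, using only the Python 3 standard
library. url_path_parent is a hypothetical helper, not anything in bzrlib; it
is only meant to show that a URL-based operation gives the same answer
wherever the test runs.

```python
import ntpath
import posixpath
from urllib.parse import urlsplit


def url_path_parent(url):
    """Return the parent of a URL's path component.

    Hypothetical helper: URL paths always use '/', so posixpath is
    the right tool regardless of the host OS.
    """
    path = urlsplit(url).path
    return posixpath.dirname(path.rstrip("/")) or "/"


# The same native path string is read differently by each platform's rules.
print(ntpath.isabs("C:/work"))     # True
print(posixpath.isabs("C:/work"))  # False

# The URL form is handled identically on every platform.
print(url_path_parent("file:///C:/work/tree"))    # /C:/work
print(url_path_parent("file:///home/user/tree"))  # /home/user
```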

Since users will rarely pass in URLs for filesystem paths, we should have
a function that converts user paths into URLs (if they're not already).
Quite possibly get_transport should do that.
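
Such a function might look roughly like this.  It is only a sketch under
several assumptions: to_url and its cwd parameter are hypothetical, it handles
POSIX paths only (Windows drive letters would need their own treatment), and
the "already a URL" check is a simple pattern match that belongs at the
user-input boundary, not inside the transport layer.

```python
import posixpath
import re
from urllib.parse import quote

# Assumption: a URL starts with a scheme of two or more characters plus
# '://'. Requiring two characters keeps 'C:/foo' from looking like a URL.
_URL_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]+://")


def to_url(user_path, cwd="/"):
    """Return user_path unchanged if it is a URL, else a file:// URL.

    cwd is passed explicitly so the example is deterministic; a real
    version would presumably use the process's working directory.
    """
    if _URL_RE.match(user_path):
        return user_path
    abs_path = posixpath.normpath(posixpath.join(cwd, user_path))
    return "file://" + quote(abs_path, safe="/")


print(to_url("http://example.com/repo"))
print(to_url("src/na\u00efve.txt", cwd="/home/user"))
print(to_url("../x", cwd="/a/b"))
```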

OTOH, I don't think it's appropriate to be using transports to access
working trees, and since that's the bug you encountered, I suggest
that's what we should fix: build_tree should either be implemented in
terms of POSIX, or it should translate paths to urls before using them
with Transport.

For Transport, urls are functional, unambiguous, and are painless if
done right.

>> I think the only time you really need a URL is when you are dealing with
>> an absolute path. Since most of the time we are actually dealing with
>> relative paths, I think Unicode makes a lot of sense.

Transports are always dealing with absolute paths.  I think it makes lots
of sense to make transports easy to use with relative paths, though.
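
As a sketch of what I mean: the transport holds an absolute base URL, and
callers only ever hand it relative paths.  SketchTransport below is a
hypothetical stand-in, not the real Transport class, and it assumes the
relative paths are already URL-escaped.

```python
from urllib.parse import urljoin


class SketchTransport:
    """Hypothetical transport: absolute base URL, relative-path interface."""

    def __init__(self, base):
        # A trailing slash makes urljoin treat the base as a directory.
        self.base = base if base.endswith("/") else base + "/"

    def abspath(self, relpath):
        """Resolve an already-escaped relative URL path against the base."""
        return urljoin(self.base, relpath)

    def clone(self, relpath):
        """Return a new transport rooted at relpath under this one."""
        return SketchTransport(self.abspath(relpath))


t = SketchTransport("file:///home/user/tree")
print(t.abspath("sub/file.txt"))            # file:///home/user/tree/sub/file.txt
print(t.clone("sub").abspath("../other"))   # file:///home/user/tree/other
```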

