Filesystem paths

Wed Apr 26 04:15:22 BST 2006

On 26/04/2006, at 12:17 AM, John Arbash Meinel wrote:
>
> The specific interface I'm talking about right now is
> 'TestCase.build_tree'. If I pass in a unicode string, I get the old
> "ascii codec cannot decode byte...", and that is because the path goes
> through "LocalTransport.abspath()" which uses:
> 	result = normpath(pathjoin(self.base, urllib.unquote(relpath)))

Building the local tree ought to depend instead upon
sys.getfilesystemencoding().
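
As a rough sketch of what I mean (written in Python 3 syntax, where str
is already Unicode; build_local_path is a made-up helper, not anything
in bzrlib): join the pieces while they are still Unicode, and encode
once at the filesystem boundary.

```python
import os
import sys


def build_local_path(base, relpath):
    """Join two Unicode paths, then encode once for the filesystem.

    Hypothetical helper for illustration only.
    """
    # Both arguments are Unicode, so the join never mixes string types.
    joined = os.path.normpath(os.path.join(base, relpath))
    # Encode only at the very end, using the platform's filesystem encoding.
    return joined.encode(sys.getfilesystemencoding())
```

The point is just that the encoding comes from the platform rather than
from the default ASCII codec.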

> The result of the unquote function seems to be a regular string, which
> causes problems for os.path.join() when one is a unicode string, and
> the other is a regular string.
>
> I really think that having an internal api be URLs is sub-optimal for
> us. I think having the API be unicode makes a lot more sense.

I agree - given Unicode is so much easier for most code to deal with  
I'd think we would want to keep that format most of the time, and  
only url-escape when necessary.  I think Robert was pretty keen to  
keep them mostly as URLs, so perhaps he could say more about why this  
is good?
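
To make "only url-escape when necessary" concrete, here is a minimal
sketch, assuming UTF-8 as the byte encoding behind the %XX escapes and
using Python 3's urllib.parse. The names escape and unescape are
hypothetical, chosen just for this example.

```python
from urllib.parse import quote, unquote


def escape(relpath):
    """Unicode relative path -> URL-escaped ASCII path (UTF-8, then %XX)."""
    # Keep "/" unescaped so the path structure survives.
    return quote(relpath, safe="/")


def unescape(url_path):
    """URL-escaped path -> Unicode relative path (assumes UTF-8 escapes)."""
    return unquote(url_path, encoding="utf-8", errors="strict")
```

Everything between those two calls can stay plain Unicode.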

One point to consider - we want Transports to be able to pass back
complete URLs as a way to record the scheme, host, etc.  We can
either make them proper URLs (with escaping, etc), or can treat them
as unescaped Unicode URLs.  (Perhaps that's not strictly a valid
idea, but I think you know what I mean.)  We'd need to make sure that
those things

> I think the only time you really need a URL is when you are dealing
> with an absolute path. Since most of the time we are actually dealing
> with relative paths, I think Unicode makes a lot of sense.

The general pattern seems to be to get an initial URL either from the
command line or some file, and then do everything else relative to
it.  Representation as a URL only happens inside the Transport
(e.g. when forming the http request) or when the client specifically
asks for it in that form.
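
Roughly, and only as an illustration: the class below is hypothetical
(SketchTransport and request_url are not real bzrlib names, and
example.com is a placeholder host). It takes an already-escaped base
URL, accepts Unicode relative paths, and escapes only when it has to
produce a full URL.

```python
from urllib.parse import quote


class SketchTransport:
    """Hypothetical transport: Unicode relpaths in, URLs out on demand."""

    def __init__(self, base_url):
        # The base is assumed to be a proper, already-escaped URL.
        self.base = base_url if base_url.endswith("/") else base_url + "/"

    def request_url(self, relpath):
        # Escaping (UTF-8, then %XX) happens only here, at the boundary.
        return self.base + quote(relpath, safe="/")


t = SketchTransport("http://example.com/repo")
url = t.request_url("docs/caf\u00e9.txt")
```

Callers never see the escaped form unless they ask for it.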

-- 
Martin