Filesystem paths
Martin Pool
mbp at sourcefrog.net
Wed Apr 26 04:15:22 BST 2006
On 26/04/2006, at 12:17 AM, John Arbash Meinel wrote:
>
> The specific interface I'm talking about right now is
> 'TestCase.build_tree'. If I pass in a unicode string, I get the old
> "ascii codec cannot decode byte...", and that is because the path goes
> through "LocalTransport.abspath()" which uses:
> result = normpath(pathjoin(self.base, urllib.unquote(relpath)))
Building the local tree ought to depend instead upon
getfilesystemencoding().
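
As a rough sketch of what I mean, written against modern Python 3 rather than the code we have today: unescape the relpath to Unicode first, join text with text, and only convert to bytes at the OS boundary. The names local_abspath and to_fs_bytes are hypothetical, and treating the escaped bytes as UTF-8 is an assumption.

```python
import os
import sys
from urllib.parse import unquote_to_bytes


def local_abspath(base, relpath):
    """Hypothetical helper: join a Unicode base with a URL-escaped relpath."""
    # Assumption: percent-escapes in relpath encode UTF-8 bytes.
    rel = unquote_to_bytes(relpath).decode("utf-8")
    # Both operands are text now, so the join never mixes string types.
    return os.path.normpath(os.path.join(base, rel))


def to_fs_bytes(path):
    """Hypothetical helper: encode a Unicode path for the local filesystem."""
    # The encoding comes from the platform, not from the URL layer.
    return path.encode(sys.getfilesystemencoding())


path = local_abspath("/tmp/tree", "caf%C3%A9/file.txt")
print(ascii(path))
```

The point is only that the filesystem encoding is applied once, at the very end, and nothing before that step needs to know about it.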
> The result of the unquote function seems to be a regular string, which
> causes problems for os.path.join() when one is a unicode string, and the
> other is a regular string.
>
> I really think that having an internal api be URLs is sub-optimal for
> us. I think having the API be unicode makes a lot more sense.
I agree - given Unicode is so much easier for most code to deal with,
I'd think we would want to keep that format most of the time, and
only url-escape when necessary. I think Robert was pretty keen to
keep them mostly as URLs, so perhaps he could say more about why this
is good?
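
To illustrate "only url-escape when necessary", here is a small sketch in modern Python 3. The helper names escape_relpath and unescape_relpath are hypothetical, and UTF-8 as the escaping charset is an assumption. Callers hold Unicode; the escaped form exists only at the edge and round-trips losslessly.

```python
from urllib.parse import quote, unquote


def escape_relpath(relpath):
    """Unicode relative path -> URL-escaped form, keeping '/' as separator."""
    # quote() encodes non-ASCII characters as UTF-8 before percent-escaping.
    return quote(relpath, safe="/")


def unescape_relpath(escaped):
    """URL-escaped relative path -> Unicode, failing loudly on bad bytes."""
    return unquote(escaped, encoding="utf-8", errors="strict")


name = "r\u00e9sum\u00e9/notes 1.txt"
escaped = escape_relpath(name)
print(escaped)
```

Since the conversion is cheap and reversible, keeping Unicode as the internal form does not lose anything that a URL would have carried for a relative path.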
One point to consider - we want Transports to be able to pass back
complete urls as a way to record the scheme, host, etc. We can
either make them proper urls (with escaping, etc), or can treat them
as unescaped Unicode urls. (Perhaps that's not strictly a valid
idea, but I think you know what I mean.) We'd need to make sure that
those things
> I think the only time you really need a URL is when you are dealing with
> an absolute path. Since most of the time we are actually dealing with
> relative paths, I think Unicode makes a lot of sense.
The general pattern seems to be to get an initial url either from the
command line or some file, and then do everything else relative to
there. Representation as a URL only happens inside the Transport
(e.g. when forming the http request) or when the client specifically
asks for it in that form.
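
Something like the following sketch shows the shape I have in mind, in modern Python 3. SketchTransport is a hypothetical stand-in and not our actual Transport class; the example.com URL is made up for illustration.

```python
from urllib.parse import quote, urljoin


class SketchTransport:
    """Hypothetical stand-in for a Transport; callers pass Unicode relpaths."""

    def __init__(self, base_url):
        # The base stays a proper URL so scheme and host are recorded.
        # A trailing '/' keeps joined paths inside the base directory.
        self.base = base_url if base_url.endswith("/") else base_url + "/"

    def abspath(self, relpath):
        """Return the escaped absolute URL for a Unicode relative path."""
        # Escaping happens here, inside the transport, and nowhere else.
        return urljoin(self.base, quote(relpath, safe="/"))


t = SketchTransport("http://example.com/repo")
print(t.abspath("docs/caf\u00e9.txt"))
```

Everything above the transport only ever sees "docs/caf\u00e9.txt" as a Unicode string; the escaped URL appears only when abspath() is asked for it.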
--
Martin