Filesystem paths

John Arbash Meinel john at arbash-meinel.com
Wed Apr 26 17:56:38 BST 2006


Aaron Bentley wrote:
> Martin Pool wrote:
>> On 26/04/2006, at 1:53 PM, Aaron Bentley wrote:
>> +Transports work in URLs.  Take note that URLs are by definition only
>> +ASCII - the decision of how to encode a Unicode string into a URL must be
>> +taken at a higher level, typically in the Store.
> 
> This doesn't look accurate.  AFAIK, we do unicode-to-utf-8 conversions
> in the transport layer.  The store escaping thing is a separate issue.

I think we can validly encode/decode stuff that is underneath the .bzr/
directory. That is our domain, and we can specify how urls need to be
encoded there.

Outside of that???

> 
>> +A similar edge case is that the url ``http://foo/sweet%2Fsour`` contains
>> +one directory component whose name is "sweet/sour".  The escaped slash is
>> +not a directory separator.
> 
> (although many, many pieces of software will treat it as one, including
> conformant SFTP implementations)
> 
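
To make the escaped-slash case concrete, here is a minimal sketch in
modern Python 3 (urllib.parse, not anything in bzrlib). The helper name
path_components is hypothetical. The point is only the ordering: split
on literal slashes first, unescape each piece afterwards.

```python
from urllib.parse import urlsplit, unquote


def path_components(url):
    """Split the URL path on literal '/' and only then unescape each piece.

    Hypothetical helper for illustration; not a bzrlib function.
    """
    raw_path = urlsplit(url).path  # still percent-encoded at this point
    return [unquote(part) for part in raw_path.split("/") if part]


print(path_components("http://foo/sweet%2Fsour"))  # ['sweet/sour']
print(path_components("http://foo/sweet/sour"))    # ['sweet', 'sour']
```

Unescaping before splitting would turn the first url into two
components, which is the behaviour Aaron describes in other software.
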
>> So LocalTransport.abspath shouldn't be calling osutils.abspath, but 
>> rather should be manipulating URL objects?  Then we can see that
>>
>>   file:///c|/
>>
>> has no "up"?
> 
> Right.  We can test going up from '/' under *nix, while we can't test
> going up from 'C:\', because osutils (correctly, I believe) treats '/'
> as top-level root under *nix.  But URLs are URLs are URLs.

Sort of. An absolute win32 url is:
file:///c|/path/to/foo

But an absolute Unix url is:
file:///path/to/foo
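
As a rough sketch of what "going up" could look like when done purely
on the URL, assuming we treat a leading 'c|'-style segment of a file://
url as part of the root (that rule, and the name url_parent, are my
assumptions, not existing bzrlib code; query strings and fragments are
ignored):

```python
from urllib.parse import urlsplit


def url_parent(url):
    """Return the URL one level up, or the URL itself if it is a root.

    Hypothetical helper: a two-character first segment ending in '|'
    (e.g. 'c|') in a file:// URL is assumed to be a win32 drive root.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    has_drive = (parts.scheme == "file" and bool(segments)
                 and len(segments[0]) == 2 and segments[0][1] == "|")
    root_len = 1 if has_drive else 0
    if len(segments) <= root_len:
        return url  # already at the top: there is no "up"
    # Rebuild the path from everything except the last segment.
    path = "/" + "".join(seg + "/" for seg in segments[:-1])
    return "%s://%s%s" % (parts.scheme, parts.netloc, path)


print(url_parent("file:///c|/path/to/foo"))  # file:///c|/path/to/
print(url_parent("file:///c|/"))             # file:///c|/
print(url_parent("file:///path/to/foo"))     # file:///path/to/
```
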

Now, I believe that the file:// url spec says that all paths are
absolute (so you can't say file:///foo to mean 'foo' in the local
directory).
So if you have a file:// url, then you know it is an absolute path.
(I admit that I'm guessing, but since the form is file://host/path, if
you put anything after the second slash it would be interpreted as a
host, not a local path).
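
For what it's worth, a generic URL parser agrees with that guess. The
sketch below uses modern Python 3's urllib.parse. split_file_url is a
hypothetical name, and this only shows how one parser splits the
string, not what the file:// spec itself says.

```python
from urllib.parse import urlsplit


def split_file_url(url):
    """Return (host, path) as a generic URL parser sees them."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


print(split_file_url("file:///path/to/foo"))  # ('', '/path/to/foo')
print(split_file_url("file://foo/bar"))       # ('foo', '/bar')
print(split_file_url("file://foo"))           # ('foo', '')
```
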

Because the path is slightly munged in the win32 case, you can't just
write a single platform-independent function that takes a real path and
converts it into a url (and vice versa). The conversion has to be
platform aware.
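
A minimal sketch of the two conversions, in modern Python 3. Both
function names are hypothetical, and lowercasing the drive letter is my
assumption (it just matches the c| example above). It only handles
absolute paths with a drive letter or a leading slash.

```python
from urllib.parse import quote


def posix_path_to_url(path):
    """'/path/to/foo' -> 'file:///path/to/foo' (hypothetical helper)."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path: %r" % (path,))
    return "file://" + quote(path)  # quote() leaves '/' unescaped


def win32_path_to_url(path):
    """'C:\\path\\to\\foo' -> 'file:///c|/path/to/foo' (hypothetical helper)."""
    if len(path) < 3 or path[1] != ":" or path[2] not in "\\/":
        raise ValueError("expected an absolute drive path: %r" % (path,))
    drive = path[0].lower()             # assumption: normalise drive case
    rest = path[2:].replace("\\", "/")  # backslashes become URL slashes
    return "file:///" + drive + "|" + quote(rest)


print(posix_path_to_url("/path/to/foo"))        # file:///path/to/foo
print(win32_path_to_url("C:\\path\\to\\foo"))   # file:///c|/path/to/foo
```

The caller still has to pick one of the two based on the platform,
which is the part that can't be hidden behind a single naive function.
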

...

> In some ways, it might be desirable to permit non-unicode paths when
> working with working trees.  We require versioned files to have unicode
> names, but I don't think we necessarily should require that unversioned
> files have unicode pathnames.  At the moment, I expect that things will
> explode in that situation, anyhow.
> 
> Aaron


What about non-unicode parent directories? Something like:

/path/to/\xee\xee\xee\xee/wd/.bzr

That requires us to use a non-unicode path, since we do a lot of
operations using absolute paths.

Right now, you wouldn't be able to type that path into bzr, since it
automatically decodes command line parameters using bzrlib.user_encoding.

However, I don't know how you would type such a path anyway. (I can
create it using a program like python, but I don't know how you would
pass the command line argument).
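
To show why that path is a problem, here is a small sketch in modern
Python 3, assuming the user encoding is UTF-8 (an assumption; the real
value comes from bzrlib.user_encoding). try_decode is a hypothetical
helper, and nothing here touches the filesystem.

```python
RAW_PATH = b"/path/to/\xee\xee\xee\xee/wd/.bzr"


def try_decode(raw, encoding="utf-8"):
    """Return the decoded path, or None if the bytes are invalid in
    that encoding. Hypothetical helper for illustration."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


print(try_decode(RAW_PATH))             # None: not valid UTF-8
print(try_decode(RAW_PATH, "latin-1"))  # decodes, but only by guessing
```

latin-1 accepts any byte sequence, so the second call "works" without
telling us what the name was meant to be.
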

I would like to be able to say we could handle non-unicode cruft in the
working directory. And it would be nice to say you don't have to be in a
unicode path to start with. But I also honestly think we can make that a
fairly low priority.

John
=:->
