Filesystem paths
John Arbash Meinel
john at arbash-meinel.com
Wed Apr 26 17:56:38 BST 2006
Aaron Bentley wrote:
> Martin Pool wrote:
>> On 26/04/2006, at 1:53 PM, Aaron Bentley wrote:
>> +Transports work in URLs. Take note that URLs are by definition only
>> +ASCII - the decision of how to encode a Unicode string into a URL must be
>> +taken at a higher level, typically in the Store.
>
> This doesn't look accurate. AFAIK, we do unicode-to-utf-8 conversions
> in the transport layer. The store escaping thing is a separate issue.
I think we can validly encode/decode stuff that is underneath the .bzr/
directory. That is our domain, and we can specify how urls need to be
encoded there.
Outside of that, I'm not so sure.
>
>> +A similar edge case is that the url ``http://foo/sweet%2Fsour`` contains
>> +one directory component whose name is "sweet/sour". The escaped slash is
>> +not a directory separator.
>
> (although many, many pieces of software will treat it as one, including
> conformant SFTP implementations)
>
>> So LocalTransport.abspath shouldn't be calling osutils.abspath, but
>> rather should be manipulating URL objects? Then we can see that
>>
>> file:///c|/
>>
>> has no "up"?
>
> Right. We can test going up from '/' under *nix, while we can't test
> going up from 'C:\', because osutils (correctly, I believe) treats '/'
> as top-level root under *nix. But URLs are URLs are URLs.
Sort of. An absolute win32 url is:
file:///c|/path/to/foo
But an absolute Unix url is:
file:///path/to/foo
Now, I believe that the file:// url spec says that all paths are
absolute (so you can't say file:///foo to mean 'foo' in the local
directory).
So if you have a file:// url, then you know it is an absolute path.
(I admit that I'm guessing, but since the form is file://host/path, if
you put anything after the second slash it would be interpreted as a
host, not a local path).
Because the path is slightly munged in the win32 case, you can't write a
single platform-independent function that converts a real path into a
url (or a url back into a real path). The conversion has to be platform aware.
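To make the platform difference concrete, here is a rough sketch in modern Python (not bzrlib code). The function names are hypothetical, and lowercasing the drive letter is my assumption; the only things taken from above are the two URL shapes, file:///c|/path/to/foo and file:///path/to/foo.

```python
from urllib.parse import quote


def posix_path_to_url(path):
    """Convert an absolute POSIX path to a file:// URL (hypothetical helper)."""
    if not path.startswith('/'):
        raise ValueError('not an absolute path: %r' % (path,))
    # The path already starts with '/', so this yields file:///...
    return 'file://' + quote(path)


def win32_path_to_url(path):
    """Convert an absolute win32 path (e.g. C:\\foo) to a file:// URL.

    The drive is written as 'c|', matching the form used in this thread.
    Lowercasing the drive letter is an assumption of this sketch.
    """
    if len(path) < 3 or path[1] != ':' or path[2] not in '\\/':
        raise ValueError('not an absolute win32 path: %r' % (path,))
    drive = path[0].lower()
    # Everything after 'C:' becomes the URL path, with forward slashes.
    rest = path[2:].replace('\\', '/')
    return 'file:///' + drive + '|' + quote(rest)


print(posix_path_to_url('/path/to/foo'))
print(win32_path_to_url('C:\\path\\to\\foo'))
```

The two functions cannot be merged without knowing which platform the path came from, since 'C:\path' and '/path' need different handling of the leading component.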
...
> In some ways, it might be desirable to permit non-unicode paths when
> working with working trees. We require versioned files to have unicode
> names, but I don't think we necessarily should require that unversioned
> files have unicode pathnames. At the moment, I expect that things will
> explode in that situation, anyhow.
>
> Aaron
What about non-unicode parent directories? Something like:
/path/to/\xee\xee\xee\xee/wd/.bzr
That requires us to use a non-unicode path, since we do a lot of
operations using absolute paths.
Right now, you wouldn't be able to type that path into bzr, since it
automatically decodes command line parameters using bzrlib.user_encoding.
However, I don't know how you would type such a path anyway. (I can
create it using a program like python, but I don't know how you would
pass the command line argument).
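As a rough illustration of why that path is a problem, here is a sketch in modern Python (bytes vs. str), not bzrlib code. The helper name is hypothetical, and 'utf-8' and 'latin-1' are stand-ins I picked for whatever bzrlib.user_encoding happens to be.

```python
# The raw bytes of the example path above.
raw = b'/path/to/\xee\xee\xee\xee/wd/.bzr'


def try_decode(raw_path, encoding):
    """Return the decoded path, or None if the bytes are invalid in `encoding`.

    Hypothetical helper; `encoding` stands in for the user's encoding.
    """
    try:
        return raw_path.decode(encoding)
    except UnicodeDecodeError:
        return None


# Under UTF-8 the bytes are not valid, so there is no unicode path to use.
print(try_decode(raw, 'utf-8'))
# Under a single-byte encoding such as latin-1 every byte decodes.
print(try_decode(raw, 'latin-1'))
```

So whether a directory counts as "non-unicode" depends on the encoding we assume, which is part of why this is hard to handle in general.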
I would like to be able to say we could handle non-unicode cruft in the
working directory. And it would be nice to say you don't have to be in a
unicode path to start with. But I also honestly think we can make that a
fairly low priority.
John
=:->