Filesystem paths
Martin Pool
mbp at sourcefrog.net
Thu Apr 27 02:18:01 BST 2006
On 27/04/2006, at 2:14 AM, Aaron Bentley wrote:
> Martin Pool wrote:
>> On 26/04/2006, at 1:53 PM, Aaron Bentley wrote:
>> +Transports work in URLs. Take note that URLs are by definition only
>> +ASCII - the decision of how to encode a Unicode string into a URL must be
>> +taken at a higher level, typically in the Store.
>
> This doesn't look accurate. AFAIK, we do unicode-to-utf-8 conversions
> in the transport layer. The store escaping thing is a separate issue.
OK, my text was a bit unclear. There are two levels of escaping:
id->filename, and then filename->url. But both are done by Store._relpath:
    fileid = self._escape_file_id(fileid)
    path = prefix + fileid
    full_path = u'.'.join([path] + suffixes)
    return urlescape(full_path)
I think we should be clear that Transports accept *URLs*, not URL-like
things with Unicode in them.
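
To make the two levels concrete, here is a rough standalone sketch in
modern Python. It is not the bzrlib code: escape_file_id, relpath, the
"ab/" prefix and the "weave" suffix are stand-ins chosen for
illustration, and the id->filename rule (percent-escape the UTF-8 bytes
of everything except unreserved characters) is an assumption.

```python
from urllib.parse import quote, unquote


def escape_file_id(file_id):
    # Level 1 (id->filename). Assumed rule for this sketch: UTF-8 encode
    # the id and percent-escape every byte that is not an unreserved
    # character, including "/".
    return quote(file_id.encode("utf-8"), safe="")


def relpath(file_id, prefix="", suffixes=()):
    # Level 2 (filename->url). The filename is escaped again, so a "%"
    # produced by level 1 becomes "%25" in the URL.
    filename = escape_file_id(file_id)
    full_path = ".".join([prefix + filename, *suffixes])
    return quote(full_path)


url = relpath("sweet/sour", prefix="ab/", suffixes=["weave"])
print(url)           # ab/sweet%252Fsour.weave
print(unquote(url))  # ab/sweet%2Fsour.weave (the on-disk filename)
```

Whatever the exact rules are, the result handed to the Transport is
plain ASCII, and unescaping it once gives back the filename rather than
the original id.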
>> +A similar edge case is that the url ``http://foo/sweet%2Fsour`` contains
>> +one directory component whose name is "sweet/sour". The escaped slash is
>> +not a directory separator.
>
> (although many, many pieces of software will treat it as one, including
> conformant SFTP implementations)
Right - I think the SFTP committee were confused, or at least
conflicted, about this.
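
The difference comes down to the order of splitting and unescaping. A
small sketch using only the Python standard library (path_components
and naive_components are names made up for this example):

```python
from urllib.parse import urlsplit, unquote


def path_components(url):
    # Split on literal "/" first, then unescape each segment, so an
    # escaped slash stays inside its component.
    raw_path = urlsplit(url).path
    return [unquote(seg) for seg in raw_path.split("/") if seg]


def naive_components(url):
    # Unescaping before splitting turns %2F into a separator, which is
    # the behaviour described above as treating it as a directory.
    return [seg for seg in unquote(urlsplit(url).path).split("/") if seg]


print(path_components("http://foo/sweet%2Fsour"))   # ['sweet/sour']
print(naive_components("http://foo/sweet%2Fsour"))  # ['sweet', 'sour']
```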
> To some extent, this is Arch residue-- that system has a very clean
> distinction between archive-access methods (the PFS) and
> filesystem-access methods (the virtual-unix subsystem).
>
> The idea was that PFS supported only what was available on all supported
> access methods-- read, write, list*, mkdir, rmdir, and not much else.
>
> Whereas the virtual-unix subsystem supported a richer command set that
> included, for example, stat and chmod.
>
> However, our Transports do support stat and chmod-- I guess the question
> is whether we want to require this. Would that preclude us from
> supporting additional access methods that we want to support?
>
> If the subset of functionality supported by all our transports is such
> that we can implement TreeTransform, the hashcache, etc. on top of them,
> then perhaps I'm being too rigid. There is desire to support remote
> access to working trees in some form, and this could be it.
Some Transports won't have that -- clearly they have different
capabilities. But some will - I think everything we need to do can
be done over SFTP, with a good server. We already see that in
Transports which can or can't list directories. We could declare
which ones have the minimum set to support a working tree, etc.
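
One possible shape for such a declaration, purely as a sketch: none of
these class names, attributes or operation names are existing bzrlib
API, and the required set is only a guess at what a working tree would
need.

```python
# Hypothetical minimum set of operations for hosting a working tree.
WORKING_TREE_OPS = frozenset(
    {"read", "write", "list", "mkdir", "rmdir", "stat", "chmod"}
)


class SketchTransport:
    # Each transport declares what it can do; the base class declares
    # nothing.
    capabilities = frozenset()

    @classmethod
    def supports_working_tree(cls):
        # True only if every required operation is declared.
        return WORKING_TREE_OPS <= cls.capabilities


class SketchHttpTransport(SketchTransport):
    # Assumed read-only and unable to list directories.
    capabilities = frozenset({"read"})


class SketchSftpTransport(SketchTransport):
    # Assumes a good server that offers the full set.
    capabilities = WORKING_TREE_OPS


for transport in (SketchHttpTransport, SketchSftpTransport):
    print(transport.__name__, transport.supports_working_tree())
```

Callers could then check the declaration up front instead of finding
out from a failed operation halfway through.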
> Another argument is that Transports are unfamiliar to most developers,
> and so they introduce an unwarranted barrier to contribution. On the
> other hand, it might make sense to use TreeTransform for our operations,
> which would be a new API anyhow.
All WorkingTree operations? That sounds good; I have a feeling it
will make the behaviour more consistent as far as handling backups,
conflicts with the working tree, etc.
> That
> In some ways, it might be desirable to permit non-unicode paths when
> working with working trees. We require versioned files to have unicode
> names, but I don't think we necessarily should require that unversioned
> files have unicode pathnames. At the moment, I expect that things will
> explode in that situation, anyhow.
This is possibly a reason to work partially in URLs for the working
tree - because they're just byte streams, we don't have to be able to
decode them to Unicode to manipulate them.
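
As a small illustration of that point, using the Python standard
library only (the Latin-1 filename is an invented example of an
unversioned file whose name is not valid UTF-8):

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# A filename as raw bytes: "caf" followed by a Latin-1 e-acute.
raw_name = b"caf\xe9"

try:
    raw_name.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

# Percent-encoding works on the bytes directly, so the name can still be
# carried in a URL and recovered exactly.
url_segment = quote_from_bytes(raw_name)
round_tripped = unquote_to_bytes(url_segment)

print(decodable)      # False
print(url_segment)    # caf%E9
print(round_tripped == raw_name)  # True
```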
--
Martin