Colocated branch support progress - preserving characters in URLs

Thu Aug 4 11:22:51 UTC 2011

On 04/08/11 09:04, Martin Pool wrote:
> On 3 Aug 2011, Jelmer Vernooij <jelmer at canonical.com> wrote:
>> The ability to address colocated branches using URLs has long been in
>> progress (https://bugs.launchpad.net/bugs/380871). After brainstorming a
>> bit with John and Martin at the recent Launchpad sprint, I'm finally
>> back on track hacking on it.
>>
>> There are basically four things left to do now:
>>
>>   1) Avoid URL-escaping commas when converting local paths to URLs.
>> (lp:~jelmer/bzr/escape-comments)
>>   2) Preserve the distinction between literal and delimiting commas in
>> URLs when they are parsed in e.g. ConnectedTransport (2)
>>   3) Parse the subsegment parameters for Transports and make them
>> accessible (done, lp:~jelmer/bzr/transport-segments)
>>   4) Look at the transport segment parameters in ControlDir and use them
>> to determine the default branch (this should be fairly easy)
>>
>> I'm wondering how exactly to do (2). At the moment when we receive a URL
>> we parse it and then urlunescape all characters. Among other things,
>> this means that urlencoded characters won't be distinguishable from
>> literal characters.
>>
>> So far, this distinction hasn't really mattered as we didn't have a
>> different interpretation for them.
>>
>> To make sure we don't lose this information, I would like to stop
>> urlunescaping things like ConnectedTransport._path,
>> ConnectedTransport._username, etc and instead urlunescape it later. Does
>> that seem reasonable, or is there perhaps a better way to work around
>> the current limitations I haven't considered?
> Maybe another way is to say that _unsplit_url (or something similar)
> should return a more-structured path object that contains information
> about path parameters - it can handle splitting them out at the same
> time it does the other unescaping?
>
> Also, since it now has about 6 return values, perhaps it should be
> superseded by a function that returns a ParsedURL with named fields (or
> something.)
I like that idea; it seems a lot easier than trying to manually keep
track of what is and what is not urlescaped everywhere. I'll give it a
go - thanks.
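
For illustration, here is a minimal sketch of what such a parser could look
like. It is not the bzrlib implementation: the names ParsedURL and parse_url,
the field layout, and the "name,key=value" syntax on the last path segment
are all assumptions made for this example. The point it shows is the
ordering: split on literal commas while the path is still escaped, and only
unescape afterwards, so that "%2C" is never mistaken for a delimiter.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import unquote, urlsplit


class ParsedURL(NamedTuple):
    """Hypothetical named-field result (not an actual bzrlib class)."""
    scheme: str
    user: Optional[str]
    password: Optional[str]
    host: Optional[str]
    port: Optional[int]
    path: str                           # unescaped, parameters removed
    segment_parameters: Dict[str, str]  # unescaped keys and values


def _unquote_opt(value: Optional[str]) -> Optional[str]:
    # Unescape a component that may be absent from the URL.
    return None if value is None else unquote(value)


def parse_url(url: str) -> ParsedURL:
    """Split a URL into named fields, unescaping each field exactly once."""
    parts = urlsplit(url)

    # Assumption: parameters are only looked for on the last path segment.
    head, sep, last = parts.path.rpartition("/")

    # Split on literal commas BEFORE unescaping; an escaped comma ("%2C")
    # is still escaped at this point and so stays part of the data.
    name, *raw_params = last.split(",")

    params = {}
    for raw in raw_params:
        key, _, value = raw.partition("=")
        params[unquote(key)] = unquote(value)

    return ParsedURL(
        scheme=parts.scheme,
        user=_unquote_opt(parts.username),
        password=_unquote_opt(parts.password),
        host=parts.hostname,
        port=parts.port,
        path=unquote(head + sep + name),
        segment_parameters=params,
    )
```

With this sketch, parse_url("http://example.com/repo%2Cname,branch=feature%2Cx")
yields the path "/repo,name" and the segment parameters
{"branch": "feature,x"}. Unescaping the whole path first would instead give
"/repo,name,branch=feature,x", where the literal and delimiting commas can
no longer be told apart.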

Cheers,

Jelmer