[MERGE] Register netloc protocols at 'register_transport_proto' time

Vincent Ladeuil v.ladeuil+lp at free.fr
Sun Oct 21 10:45:18 BST 2007


>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:

    john> Vincent Ladeuil wrote:
    >>>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:
    >> 
    john> The attached patch fixes the bug I just mentioned about
    john> missing netloc protocol registrations when sharing
    john> transports.
    >> 
    >> Nice catch.
    >> 
    >> So, you found a code path where we try to split an url before
    >> calling get_transport...
    >> 
    >> But we are still in a grey area, see
    >> https://bugs.edge.launchpad.net/bzr/+bug/122258.
    >> 
    >> Basically, decorators trick urlparse in a way that rarely causes
    >> problems. I think you run into it because you have switched from
    >> sftp to bzr+ssh.
    >> 
    >> So I think the right solution is either to make urlparse deal
    >> with decorators or to really filter out the decorators *before*
    >> calling urlparse (easier said than done ;-)

    john> Well, we have 2 different "types" of decorators. We have:
    john> readonly+...

    john> And we have
    john> http+urllib

I'm painfully aware of that ;) And bzr+ssh and bzr+http[s] are
from yet another type.

    >> 
    john> I went ahead and defaulted the parameter to False,
    john> because there were more protocols that should not be
    john> registered. (All of the decorators, and memory, etc.)
    >> 
    >> Can you elaborate on why you did that ?
    >> register_urlparse_netloc_protocol ensures that the urls will be
    >> parsed correctly (most of the time :-/). I'm afraid we are still
    >> exposed to the underlying problem then.

    john> Well, it doesn't make much sense to register
    john> "readonly+" or "log+" or "trace+".  And 'memory:///' or
    john> 'file:///' doesn't take a host, but probably would
    john> still be netloc safe.

    john> 'lp:' is the only really weird one, since we don't
    john> require the '///' (though we will accept it if they do,
    john> and we don't accept 'lp:/' or 'lp://' at the moment.)

I understand, but this applies to the "bug" we just identified
(optimistically), what if another one occurs ?

I think you pin-point rightly that it's unsafe to delay the
registration (until the module is loaded and moreover as a
side-effect).

Andrew was surprised to *have* to do that for bzr protocols.

Overall, it seems weird to even *have* to think about that.

So, making the registration mandatory, automatic even, seems safer.


<snip/>

    john> I'm actually thinking we would be best served by just
    john> copying out the urlparse code, and just rewriting it so
    john> that it always had "netloc=True". We could probably
    john> simplify it, and make our lives easier.

    john> Thoughts?

I just checked python 2.4, 2.5 and 2.6 implementations, 2.5 seems
better (and includes SFTP support :), 2.6 introduces no changes
(so far).

So, I'm all for having our own private implementation.

Cons:

- we lose support from the python community

Pro:
- we can fix bugs :) #122258 comes to mind (and is really two
  separate bugs as mentioned in the comments),
- we can take decorators into account,
- we can simplify the implementation by getting rid of useless
  constraints (fragments, param) where appropriate.

    john> John
    john> =:->

    john> PS: I *do* want to mark this bug as critical, because
    john> it breaks all of my heavyweight bzr+ssh checkouts. (I
    john> cannot 'bzr pull' into them.) Which seems like a
    john> serious regression worth letting a release slip.

Strange, I recently switched to using heavyweight bzr+shh
checkouts but didn't encounter the problem. I thought only pull
--overwrite triggered the bug ?

Anyway, I'm ok with marking it critical.

I can't work on it in the coming days, but if your patch is
merged as is, that leaves us a bit more time to make a private
implementation.

So I say, create a bug for *this* registration problem and let's
fix the rest in a second step.

        Vincent



More information about the bazaar mailing list