[MERGE] Register netloc protocols at 'register_transport_proto' time

John Arbash Meinel john at arbash-meinel.com
Sat Oct 20 17:31:59 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vincent Ladeuil wrote:
>>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:
> 
>     john> The attached patch fixes the bug I just mentioned about
>     john> missing netloc protocol registrations when sharing
>     john> transports.
> 
> Nice catch.
> 
> So, you found a code path where we try to split an url before
> calling get_transport...
> 
> But we are still in a grey area, see
> https://bugs.edge.launchpad.net/bzr/+bug/122258.
> 
> Basically, decorators trick urlparse in a way that rarely causes
> problems. I think you run into it because you have switched from
> sftp to bzr+ssh.
> 
> So I think the right solution is either to make urlparse deal
> with decorators or to really filter out the decorators *before*
> calling urlparse (easier said than done ;-)

Well, we have 2 different "types" of decorators. We have:
readonly+...

And we have
http+urllib


> 
>     john> I went ahead and defaulted the parameter to False,
>     john> because there were more protocols that should not be
>     john> registered. (All of the decorators, and memory, etc.)
> 
> Can you elaborate on why you did that ?
> register_urlparse_netloc_protocol ensures that the urls will be
> parsed correctly (most of the time :-/). I'm afraid we are still
> exposed to the underlying problem then.

Well, it doesn't make much sense to register "readonly+" or "log+" or "trace+".
And 'memory:///' or 'file:///' doesn't take a host, but probably would still be
netloc safe.

'lp:' is the only really weird one, since we don't require the '///' (though we
will accept it if they do, and we don't accept 'lp:/' or 'lp://' at the moment.)

> 
>     john> Also, this fixes some unknown bugs, in that
>     john> https+urllib and https+pycurl were never registered.
> 
> Doh !
> 
>     john> Do we need a test for this? Perhaps just a test that if
>     john> you set register_netloc=True it adds it to urlparse's
>     john> netloc list? And one that when it isn't set, it
>     john> doesn't?
> 
> We need more tests to ensure that we correctly parse the urls
> when we use decorators.
> 
> Anyway, this patch improves the current situation, but I think you
> should register unconditionally.
> 
> In itself, doing that may have its own problems since we can't
> add the same prefix two times, but we have no way to verify that
> we don't try to *remove* it accordingly.
> 
> On the other hand, _unregister_urlparse_netloc_protocol is
> private and targeted at the tests, so I'm not too worried.
> 
>      Vincent
> 

I'm actually thinking we would be best served by just copying out the urlparse
code, and just rewriting it so that it always had "netloc=True". We could
probably simplify it, and make our lives easier.

Thoughts?
John
=:->

PS> I *do* want to mark this bug as critical, because it breaks all of my
heavyweight bzr+ssh checkouts. (I cannot 'bzr pull' into them.) Which seems
like a serious regression worth letting a release slip.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHGi1/JdeBCYSNAAMRAh66AKDZLjWk0vaR0iJRJIcfA/I79qwOJQCeNL7y
KaurGjKDeNzgC9WFdYyWUL0=
=FPYl
-----END PGP SIGNATURE-----



More information about the bazaar mailing list