[MERGE][bug #111702] bzr init <remote_branch> connects multiple times
Vincent Ladeuil
v.ladeuil+lp at free.fr
Sat Jun 2 14:20:22 BST 2007
>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:
john> John Arbash Meinel has voted +1 (conditional).
john> Status is now: Conditionally approved
Thanks.
john> Comment:
john> I wish there were a better way than having to pass possible
john> transports around through all the different api layers,
I don't know yet what layers will be traversed but I guess there
will not be that much, BzrDir, Branch and Repository (via BzrDir)
already have their associated transport[s] so the impact should
not be that high.
john> but the only obvious one is to use a cache, and that
john> seems to be very frowned upon.
... and their implementation is incomplete: only ftp and sftp
have a global cache, http and remote (in its TCP, ssh and http
based implementations) don't. One can argue that pycurl have one
though. Also webdav, relying on http, doesn't have one either.
By "doesn't have one", I mean even the connection sharing which
is implemented at the transport level is not enough to guarantee
a single connection for bzr commands.
I think my proposal is between the two already used mechanisms:
- a bit more sharing than at the transport level only,
- a bit less sharing than at the global level.
- a caching policy adjustable by callers.
john> This looks incorrectly indented:
john> + * ``bzr init`` should only connect to the remote location one time.
john> + We have been connecting several times because we forget to pass
john> + around the Transport object. This modifies
john> + ``BzrDir.create_branch_convenience``, so that we can pass in the
john> + Transport that we already have.
john> + (John Arbash Meinel, Vincent Ladeuil, #111702)
john> +
What ? You mean that we will never put that on punched cards so
limiting lines to 72 chars (even 65 in that case) is not really a
requirement ?
Well, your call :)
Humorous note: If you look at the cited lines above, you will
notice that the largest one ends up in column 82, which means
that my default setting of 65 (whose intent was to take this kind
of case into account) is already too high !
john> I realize this isn't your code but this is incorrect:
Fire, fire ! I'm refactoring :-)
john>
john> - t = get_transport(safe_unicode(base))
john> + t = get_transport(safe_unicode(base), possible_transports)
john> get_transport() isn't defined anymore as strictly requiring a
john> Unicode path. It is actually specifically defined as taking a URL
john> or trying to be smart if it gets a unicode path. (If it thinks it
john> is a URL it tries to encode it, if it is a local path it treats
john> it appropriately).
Yes, I saw that.
It perfectly fit my purpose in fact ; as "local" and "connected"
transports already have a distinct treatment, handling connection
sharing will be a natural consequence.
john> So I would just remove the "safe_unicode()" calls. There are
john> other places as well.
Will do.
john> This loop could be clearer:
john> + transport = None
john> + if possible_transports:
john> + for t in possible_transports:
john> + if t.base == base:
john> + transport = t
john> + break
john> + if transport is None:
john> ...
john> + if transport is None:
john> ...
john> + return transport
john> Instead, you could just do:
john> + if possible_transports:
john> + for t in possible_transports:
john> + if t.base == base:
john> + return t
john> + break
john> Then you don't need the 'if transport is None' checks.
Ok. This will be refactored anyway.
john> Also, the way you have written this, the transport will
john> only be re-used if it is identically the same. I
john> thought we were trying to also handle cases when it is
john> a parent/child. Or is that something you are still
john> working on?
Still under work. Specifically I needed the scheme, host, user
and password to be accessible attributes of the transport, so i
began refactoring all "connected" transports to make them
accessible. Every implementation did that in different ways and
many comments indicated that some sharing was more than expected
:-)
This work is done now, I will now begin to address the multiple
connections bugs. So far, using ftp for the tests and disabling
the cache reveals ('b' is the actual number of connections):
FAIL: commands.test_branch.TestBranch.test_branch_remote_local
not equal:
a = 1
b = 2
FAIL: commands.test_cat.TestCat.test_cat
not equal:
a = 1
b = 2
FAIL: commands.test_checkout.TestCheckout.test_checkout
not equal:
a = 1
b = 2
FAIL: commands.test_merge.TestMerge.test_merge
not equal:
a = 1
b = 5
FAIL: commands.test_missing.TestMissing.test_missing
not equal:
a = 1
b = 3
FAIL: commands.test_pull.TestPull.test_pull
not equal:
a = 1
b = 4
FAIL: commands.test_push.TestPush.test_push
not equal:
a = 1
b = 2
What that means is that writable transports don't share
connections without a global cache.
john> And I don't think this works for local relative paths. Because
john> T.base is expanded to the absolute path, rather than the relative
john> path that was passed in. (unless I misread it and the path has
john> already been converted to a full url).
We don't care about reusing connections for local transports nor
memory transports ;-)
Vincent
More information about the bazaar
mailing list