[MERGE/RFC] Odd processing during BzrDir.sprout()
John Arbash Meinel
john at arbash-meinel.com
Thu Sep 4 04:22:50 BST 2008
I was just poking around with some "bzr branch" timings, and I was quite
surprised to see:
114.969 SFTP.readv(...705d.pack) 8107 offsets => 76 coalesced => 158 requests
120.421 creating branch <bzrlib.branch.BzrBranchFormat6 object at 0x8578a2c>
126.118 created new branch
137.845 SFTP.readv(...705d.iix) 1 offsets => 1 coalesced => 1 requests
138.665 SFTP.readv(...705d.iix) 1 offsets => 1 coalesced => 1 requests
139.487 SFTP.readv(...705d.iix) 1 offsets => 1 coalesced => 1 requests
So, for some reason we are creating the target branch, and then going back and
reading the source inventory index (and pack file).
Even worse, it seems like we don't have the index information cached, which
means somewhere we let go of the repository lock. That is surprising, considering
we just did a fetch and should certainly have all of the inventory index cached.
For the most part, I tracked it down to the "subtree" code, which is really a
shame considering 99% of all branches out there don't support subtrees anyway.
With 400ms ping time over the loopback, going into a treeless repository, this
patch drops "bzr branch" times by about 20s (out of 140s), because it no longer
goes back to read the inventory from the source repository, which required
probing for the inventory info all over again.
The main reason for the RFC is that I'm wondering if a better fix would just be:
    if recurse == 'down' and repository.supports_tree_reference():
so we just disable this extra lookup if we know in advance that we won't *do*
anything with it.
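To make the proposed guard concrete, here is a minimal sketch. FakeRepository,
read_inventory() and find_subtrees() are hypothetical names invented for this
illustration; they are not bzrlib classes or functions. Only the condition
itself is taken from the line above.

```python
class FakeRepository:
    """Hypothetical stand-in for a repository; not the bzrlib class."""

    def __init__(self, tree_references):
        self._tree_references = tree_references
        self.inventory_reads = 0

    def supports_tree_reference(self):
        return self._tree_references

    def read_inventory(self):
        # Stands in for the expensive remote inventory probe.
        self.inventory_reads += 1
        return []  # no nested trees in this toy example


def find_subtrees(repository, recurse):
    """Return nested tree entries, skipping the lookup when it cannot matter."""
    if recurse == 'down' and repository.supports_tree_reference():
        return repository.read_inventory()
    return []


plain = FakeRepository(tree_references=False)
nested = FakeRepository(tree_references=True)
find_subtrees(plain, 'down')
find_subtrees(nested, 'down')
print(plain.inventory_reads, nested.inventory_reads)
```

In this toy model the repository without tree-reference support never pays for
the inventory read, which is the saving being discussed.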
I came across this while working on my sftp tests, because it would seem to
finish the transfer and then just sit around for a while, thinking, before it
actually finished the branch.
Looking more closely, I think we need to address some stuff in BzrDir.sprout().
I see it doing:
    source_branch = self.open_branch()
    source_repository = source_branch.repository
But I never see it *locking* those objects, which means it isn't caching any
information between calls.
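As a rough illustration of why the missing lock matters, here is a toy model of
lock-scoped caching. CachingIndex and get_entries() are hypothetical names made
up for this sketch, and the class only borrows the lock_read()/unlock() naming;
it is not how bzrlib implements its indices.

```python
class CachingIndex:
    """Toy index that only keeps cached data while a read lock is held."""

    def __init__(self):
        self._lock_count = 0
        self._cache = None
        self.remote_reads = 0

    def lock_read(self):
        self._lock_count += 1

    def unlock(self):
        if self._lock_count == 0:
            raise RuntimeError('not locked')
        self._lock_count -= 1
        if self._lock_count == 0:
            # Dropping the last lock throws the cached data away.
            self._cache = None

    def get_entries(self):
        if self._cache is not None:
            return self._cache
        # Stands in for a round trip such as an SFTP readv of the index.
        self.remote_reads += 1
        entries = {'rev-1': 'inventory-1'}
        if self._lock_count > 0:
            self._cache = entries
        return entries


unlocked = CachingIndex()
for _ in range(3):
    unlocked.get_entries()

locked = CachingIndex()
locked.lock_read()
try:
    for _ in range(3):
        locked.get_entries()
finally:
    locked.unlock()

print(unlocked.remote_reads, locked.remote_reads)
```

Without the lock every call goes back to the remote side; with it, only the
first call does. That matches the repeated single-offset .iix reads in the log
above.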
There is also something inherently wrong (IMO) about having "cmd_branch" do:
and then having sprout do:
    br_from = self.open_branch()
and creating an entirely new Branch instance (one that isn't locked and doesn't
share *any* state with the branch we just used to probe the ancestry,
possibly resolve -r XXX information, etc.).
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 599 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080903/e35ae818/attachment.bin