[MERGE/RFC] "bzr branch" opens the source branch twice
John Arbash Meinel
john at arbash-meinel.com
Fri Nov 7 23:17:45 GMT 2008
John Arbash Meinel wrote:
> John Arbash Meinel wrote:
>> While looking into the time spent on index work, I ran:
>
>> bzr -Dhttp -Dindex branch http://....
>
>> What I found rather enlightening, is that it seems we read all of the
>> -format files 2 times.
>
>> This, along with accidentally reading the pack-names file 2x each time
>> we lock the repository, means it actually takes approx 10s just to get
>> to the point where we have opened the remote HTTP repo, and have the
>> repository locked, in preparation for the next step.
>
>
> I just checked, and we read the http://...pack-names file 12 times
> during "bzr branch". That seems a bit excessive to me.
>
> John
> =:->
>
So, looking into this, the first cause is that we do:

    accelerator_tree, br_from = bzrdir.BzrDir.open_tree_or_branch(
        from_location)
    br_from.lock_read()  # so far so good
    ...
    dir = br_from.bzrdir.sprout(to_transport.base, ...)

And the very specific problem is that "br_from.bzrdir.sprout" is
sprouting from the BzrDir object, and not the Branch object. Because of
that, we don't have access to the *branch* or the *repository* that we
just opened. And inside sprout() it then calls "self.open_branch()"
which re-opens everything that we just opened.
Now, I think we do it that way because BzrDir.sprout() is where all
of the logic for "repository_policy" etc. is located. It is also where
the logic for copying nested trees resides.
Attached is a hack-around, which allows the caller to pass in the branch
we have already loaded. Not only that, because we are smart about
locking the br_from for the lifetime of the action, it also keeps the
repository locked during that whole time.
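To make the shape of the hack-around concrete, here is a toy sketch in
Python. None of this is bzrlib code: CountingTransport, ToyBranch,
ToyBzrDir, reads_for_branch and the file paths are made-up stand-ins, and
only the source_branch keyword is taken from the patch. It just shows that
handing the already-opened branch to sprout() avoids the second open.

```python
class CountingTransport:
    """Hypothetical transport that records every path it is asked to read."""

    def __init__(self):
        self.reads = []

    def get(self, relpath):
        self.reads.append(relpath)
        return b""


class ToyBranch:
    """Stand-in for a branch: opening it costs some remote reads."""

    def __init__(self, bzrdir):
        self.bzrdir = bzrdir
        # Illustrative paths only; the real layout may differ.
        bzrdir.transport.get("branch/format")
        bzrdir.transport.get("repository/format")
        bzrdir.transport.get("repository/pack-names")


class ToyBzrDir:
    """Stand-in for a control directory."""

    def __init__(self, transport):
        self.transport = transport

    def open_branch(self):
        return ToyBranch(self)

    def sprout(self, url, source_branch=None):
        # Reuse the caller's branch when one is passed in; otherwise fall
        # back to the old behaviour and open it again.
        if source_branch is None:
            source_branch = self.open_branch()
        return source_branch


def reads_for_branch(reuse):
    """Return how often pack-names is read during a toy 'branch' run."""
    transport = CountingTransport()
    bzrdir = ToyBzrDir(transport)
    br_from = bzrdir.open_branch()
    if reuse:
        br_from.bzrdir.sprout("target", source_branch=br_from)
    else:
        br_from.bzrdir.sprout("target")
    return transport.reads.count("repository/pack-names")
```

In this toy model reads_for_branch(False) counts two pack-names reads and
reads_for_branch(True) counts one. The real numbers depend on everything
else that touches the repository.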
In my testing, if I do "bzr branch http://" where the local repo already
has all the revisions, this changes the time from 20s down to 10s. If I
do "bzr branch" with data to copy, it changes from 44s down to 29s
(provided the source format is in btree :).
I don't really like having to do it this way, as it seems better to use
Branch.sprout() directly, but I don't have a great feeling about what
logic needs to be where. Obviously this isn't ready to be merged as is,
considering there are no tests for BzrDir.sprout(source_branch=XXX).
I also think we need some sort of effort test, to make sure we don't
re-open the source branch multiple times.
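One possible shape for such an effort test, sketched with the standard
unittest and unittest.mock modules rather than the bzrlib test
infrastructure. ToyBzrDir, ToyBranch, branch_old, branch_new and
opens_during are hypothetical names invented for this sketch; the idea is
only to spy on open_branch() and assert on the call count.

```python
import unittest
from unittest import mock


class ToyBranch:
    """Stand-in for a branch object (not bzrlib)."""

    def __init__(self, bzrdir):
        self.bzrdir = bzrdir


class ToyBzrDir:
    """Stand-in for a control directory (not bzrlib)."""

    def open_branch(self):
        return ToyBranch(self)

    def sprout(self, url, source_branch=None):
        if source_branch is None:
            source_branch = self.open_branch()
        return source_branch


def branch_old(bzrdir, url):
    # Old flow: sprout() is not told about the branch we already have.
    br_from = bzrdir.open_branch()
    return br_from.bzrdir.sprout(url)


def branch_new(bzrdir, url):
    # New flow: pass the already-opened branch along.
    br_from = bzrdir.open_branch()
    return br_from.bzrdir.sprout(url, source_branch=br_from)


def opens_during(action):
    """Run action(bzrdir, url) and return how often open_branch() ran."""
    bzrdir = ToyBzrDir()
    # wraps= keeps the real behaviour while recording each call.
    with mock.patch.object(
            bzrdir, "open_branch", wraps=bzrdir.open_branch) as spy:
        action(bzrdir, "target")
        return spy.call_count


class TestBranchEffort(unittest.TestCase):

    def test_old_flow_opens_twice(self):
        self.assertEqual(2, opens_during(branch_old))

    def test_new_flow_opens_once(self):
        self.assertEqual(1, opens_during(branch_new))
```

A real test would presumably count transport-level reads instead, since
that is what actually costs time over HTTP.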
John
=:->
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bzrdir_sprout_branch.patch
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20081107/9deeff1d/attachment-0001.diff