[MERGE/RFC] "bzr branch" opens the source branch twice, and the pack-names 12 times

John Arbash Meinel john at arbash-meinel.com
Fri Nov 7 23:38:03 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

While I was at it, I also found that 'BTree.iter_all_entries()' on the
pack-names file causes it to be read 3 times.

1) iter_all_entries() calls '.key_count()', which reads the root node.
2) It then calls _read_nodes() for all "needed" nodes, but this
includes the root node, which was already read.
3) When _read_nodes() doesn't know the size of the index, it triggers a
read of the index just to get its size, throws away the bytes, and
then reads it again in the next call.
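
The three reads listed above can be illustrated with a toy model. This is
only a sketch, not the bzrlib code: CountingTransport, NaiveIndex and
CachingIndex are hypothetical names, and the index is assumed to fit in a
single (root) node, as a small pack-names file would.

```python
class CountingTransport:
    """Hypothetical stand-in for a transport; counts reads of one file."""

    def __init__(self, data):
        self._data = data
        self.reads = 0

    def get_bytes(self):
        self.reads += 1
        return self._data


class NaiveIndex:
    """Mimics the three redundant reads described in the list above."""

    def __init__(self, transport):
        self._transport = transport
        self._size = None

    def key_count(self):
        # Read 1: fetch the root node just to learn the key count.
        return len(self._transport.get_bytes().splitlines())

    def _read_nodes(self):
        if self._size is None:
            # Read 2: fetch the bytes only to learn the size, then drop them.
            self._size = len(self._transport.get_bytes())
        # Read 3: fetch the same bytes again, this time to parse them.
        return self._transport.get_bytes().splitlines()

    def iter_all_entries(self):
        if self.key_count() == 0:
            return []
        return self._read_nodes()


class CachingIndex:
    """Reads once, keeps the size, and reuses the cached root node."""

    def __init__(self, transport):
        self._transport = transport
        self._size = None
        self._root = None

    def _get_root(self):
        if self._root is None:
            data = self._transport.get_bytes()
            self._size = len(data)  # record the size instead of re-reading
            self._root = data.splitlines()
        return self._root

    def key_count(self):
        return len(self._get_root())

    def iter_all_entries(self):
        if self.key_count() == 0:
            return []
        return self._get_root()  # root is already cached; no second read
```

Running iter_all_entries() on each class against its own CountingTransport
and comparing the .reads counters shows the difference between the two
approaches.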

This patch adds both fixes, and finally drops the number of reads of
"pack-names" to 1 as it should be.

...

> So, looking into this, the first cause is because we do:
> 
>   accelerator_tree, br_from = bzrdir.BzrDir.open_tree_or_branch(
>       from_location)
>   br_from.lock_read() # so far so good
>   ...
> 
>     dir = br_from.bzrdir.sprout(to_transport.base, ...)
> 
> 
> And the very specific problem is that "br_from.bzrdir.sprout" is
> sprouting from the BzrDir object, and not the Branch object. Because of
> that, we don't have access to the *branch* or the *repository* that we
> just opened. And inside sprout() it then calls "self.open_branch()"
> which re-opens everything that we just opened.
> 
> Now, I think we have some of that, because BzrDir.sprout() is where all
> of the logic for "repository_policy" etc. is located. It is also where
> the logic for copying nested-trees resides.
> 
> Attached is a hack-around, which allows the caller to pass in the branch
> we have already loaded. Not only that, because we are smart about
> locking the br_from for the lifetime of the action, it also keeps the
> repository locked during that whole time.
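
The idea of handing sprout() the branch that is already open can be
sketched with toy classes. ToyBranch and ToyBzrDir are hypothetical
stand-ins, not the bzrlib classes; only the source_branch keyword comes
from the message itself, and the real sprout() does far more than this.

```python
class ToyBranch:
    """Hypothetical branch; counts how many times one gets opened."""

    opens = 0

    def __init__(self, bzrdir):
        ToyBranch.opens += 1
        self.bzrdir = bzrdir


class ToyBzrDir:
    """Hypothetical control directory with a sprout() that can reuse a branch."""

    def __init__(self, location):
        self.location = location

    def open_branch(self):
        return ToyBranch(self)

    def sprout(self, target, source_branch=None):
        # Reuse the caller's already-opened branch when one is supplied;
        # only fall back to re-opening when nothing was passed in.
        if source_branch is None:
            source_branch = self.open_branch()
        return target, source_branch
```

With this shape, a caller that already holds br_from would call
br_from.bzrdir.sprout(target, source_branch=br_from), and the open count
stays at one.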
> 
> In my testing, if I do "bzr branch http://" where the local repo already
> has all the revisions, this changes the time from 20s down to 10s. If I
> do "bzr branch" with data to copy, it changes from 44s down to 29s
> (provided the source format is in btree :).
> 
> I don't really like having to do it this way, as it seems better to use
> Branch.sprout() directly, but I don't have a great feeling about what
> logic needs to be where. Obviously this isn't ready to be merged as is,
> considering there are no tests for BzrDir.sprout(source_branch=XXX).
> 
> I also think we need some sort of effort test, to make sure we don't
> re-open the source branch multiple times.
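
An effort test of this kind might look roughly like the following. It is
a sketch under assumptions: OpenCounter, branch_command and toy_sprout
are hypothetical helpers invented for illustration, and the URL is a
placeholder; a real test would hook into bzrlib's own open machinery.

```python
import unittest


class OpenCounter:
    """Hypothetical opener that records every location it is asked to open."""

    def __init__(self):
        self.opened = []

    def open_branch(self, location):
        self.opened.append(location)
        return {"location": location}


def toy_sprout(source_branch, to_location):
    # Uses the branch it was handed rather than re-opening the source.
    return {"location": to_location, "parent": source_branch["location"]}


def branch_command(opener, from_location, to_location):
    """Toy 'bzr branch': open the source once, then pass it along."""
    br_from = opener.open_branch(from_location)
    return toy_sprout(br_from, to_location)


class TestBranchEffort(unittest.TestCase):

    def test_source_opened_once(self):
        opener = OpenCounter()
        result = branch_command(opener, "http://example.com/src", "dst")
        # The source location must appear exactly once in the open log.
        self.assertEqual(["http://example.com/src"], opener.opened)
        self.assertEqual("http://example.com/src", result["parent"])
```

The assertion on the full list of opened locations, rather than on a
boolean, means the test fails as soon as any code path re-opens the
source.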
> 
> John
> =:->

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkkU0VoACgkQJdeBCYSNAANJ3gCeLgkjXlq/srZ7s5X0o+m1/iia
q0sAn1I/4IMM3MFceV/N8wbEkWNOpfMQ
=3Jl6
-----END PGP SIGNATURE-----
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: branch_startup.patch
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20081107/110d2e7b/attachment.diff 

