[MERGE][0.11] commit performance regression in 0.11

John Arbash Meinel john at arbash-meinel.com
Tue Sep 26 01:40:04 BST 2006

Robert Collins wrote:
> On Mon, 2006-09-25 at 13:51 -0500, John Arbash Meinel wrote:
>> 1) We almost have the infrastructure in place to remove a
>> read_inventory
>> call during commit. We just need to have the commit builder return the
>> RevisionTree. 
> As this is a performance regression, and the fix is small in terms of
> api alterations; I'm +1 for fixing this as an rc2. Please merge a fix to
> bzr.dev and I will cherrypick it into 0.11.
> (The fix I am expecting, and +1'ing in advance is to:
>  - add a test that you can retrieve the inventory object from the commit
> builder after calling commit
>  - use that to obtain the tree in commit.py and call set_parent_trees
> with it
>  - in set_parent_trees, after calling set_parent_ids (which performs the
> ghost check), set the basis - delete if the left most tree is None, or
> the parent list is empty, otherwise serialise from the inventory to
> disk.
> This should give us a two serialisations per commit: one to the
> repository and one to the local directory.
> Rob

I know you gave me a preemptive +1 for this but it turned out to be a
little bit less perfect than I wanted. So I'm posting it for review.

1) It turns out that a lot of code actually expected to go through
repository.deserialize_inventory() because that would add a
'inventory.root.revision_id' value for format 5 inventories (Knit1
repositories). But since I was now directly writing the committed
inventory to disk, I needed to add it myself. So I ended up with:

inv = leftmost_parent_tree.inventory
if not self.branch.repository._format.rich_root_data:
    inv.root.revision = leftmost_parent_id

Now, I don't know why the commit builder isn't creating the root
revision (probably also has to do with Knit1 versus Knit2 repositories).

2) I'm not 100% happy with CommitBuilder.get_tree() as the function
name. I had 'get_committed_tree()' for a while, or I could do
'revision_tree()' to mirror the Repository name...

3) I did a lot of refactoring of private functions in WorkingTree,
because of common functionality that wasn't 100% common.

I did check, and this removes the extra read_inventory() calls.
Unfortunately read_inventory() is still faster than write_inventory().
But with this patch, we are down to 1 inventory read, and 2 inventory
writes when committing to a new tree. (And because the formats are
different, we are stuck with 2 writes).

I'd like to do a little bit more tweaking of the serializer, because I
think there are ways to improve it even more. But that can certainly be
an 0.12 thing.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: commit-1-read.patch
Type: text/x-patch
Size: 23272 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060925/968f4fa3/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060925/968f4fa3/attachment.pgp 

More information about the bazaar mailing list