[MERGE][0.11] commit performance regression in 0.11

John Arbash Meinel john at arbash-meinel.com
Tue Sep 26 21:36:42 BST 2006


Robert Collins wrote:

> On Mon, 2006-09-25 at 19:40 -0500, John Arbash Meinel wrote:
>> I know you gave me a preemptive +1 for this but it turned out to be a
>> little bit less perfect than I wanted. So I'm posting it for review.
>>
>> 1) It turns out that a lot of code actually expected to go through
>> repository.deserialize_inventory() because that would add a
>> 'inventory.root.revision_id' value for format 5 inventories (Knit1
>> repositories). But since I was now directly writing the committed
>> inventory to disk, I needed to add it myself. So I ended up with:
>>
>> inv = leftmost_parent_tree.inventory
>> if not self.branch.repository._format.rich_root_data:
>>     inv.root.revision = leftmost_parent_id
>>
>> Now, I don't know why the commit builder isn't creating the root
>> revision (probably also has to do with Knit1 versus Knit2
>> repositories).
> 
> This does seem ugly. I think its because we are exposing dual-model
> inventories outside of the repository layer. I drafted some notes about
> this sort of transition - basically I believe we should have only one
> model in active use throughout the code base, and that every
> producer/consumer of objects should upgrade/downgrade as needed. Our
> fast-path InterObject instances are the sole exception, because they can
> provide specific 'downlevel to downlevel' functionality.
> 
> Lets try to clean this specific case up during 0.12.

> 
>> 2) I'm not 100% happy with CommitBuilder.get_tree() as the function
>> name. I had 'get_committed_tree()' for a while, or I could do
>> 'revision_tree()' to mirror the Repository name...
> 
> I'd prefer revision_tree() to mirror the Repository name.
> Also, please put the fix for 1) within the CommitBuilder. We can have an
> assert in set_parent_trees if you like.


Well, there are issues with CommitBuilder because of how it interacts
with PointlessCommit. I have a patch, but I'm reviewing it with Aaron
right now.

I'll do the 'revision_tree()' change, though.

> 
>> 3) I did a lot of refactoring of private functions in WorkingTree,
>> because of common functionality that wasn't 100% common.
> 
> I was thinking of a different tack than you took:  I was thinking that
> it would look like:
> 
> set_parent_trees(self, trees, allow_ghost):
>     self.set_parent_tree_ids([revid for rev_id, tree in trees])
>     try:
>         left_most_tree = trees[0][1]
>     except IndexError:
>         left_most_tree = None
>     if left_most_tree is None:
>         self._delete_basis_inventory()
>     else:
>         self._set_basis_inventory(left_most_tree.inventory)
> 
> and that set_parent_tree_ids would not need to change at all.

Actually, we don't have a function called 'set_parent_tree_ids()' we
only have 'set_parent_ids()'. And *that* function calls
'set_last_revision()' which has the side-effect of caching the basis
inventory. And we want to avoid caching the basis inventory in that way,
because it reads it from the repository. But I can't really change
set_parent_ids() because of callers that call it directly, rather than
going through set_parent_trees().

> 
> If you think this may make the patch cleaner, please twiddle it,
> otherwise I'm ok with going with what you've got.
> 
>> I did check, and this removes the extra read_inventory() calls.
>> Unfortunately read_inventory() is still faster than write_inventory().
>> But with this patch, we are down to 1 inventory read, and 2 inventory
>> writes when committing to a new tree. (And because the formats are
>> different, we are stuck with 2 writes).
> 
> Its also appropriate to do 2 writes IMO.

Well, there will be a common case where we will write the same data, but
it seems to be less common right now.

> 
>> I'd like to do a little bit more tweaking of the serializer, because I
>> think there are ways to improve it even more. But that can certainly
>> be
>> an 0.12 thing. 
> 
> Yup.
> 
> +1 from me for this - merge to bzr.dev, I'll do the cherrypick to 0.11.
> 
> -Rob
> 

John
=:->

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060926/850fdaa6/attachment.pgp 


More information about the bazaar mailing list