[MERGE][0.11] commit performance regression in 0.11
John Arbash Meinel
john at arbash-meinel.com
Tue Sep 26 21:36:42 BST 2006
Robert Collins wrote:
> On Mon, 2006-09-25 at 19:40 -0500, John Arbash Meinel wrote:
>> I know you gave me a preemptive +1 for this but it turned out to be a
>> little bit less perfect than I wanted. So I'm posting it for review.
>> 1) It turns out that a lot of code actually expected to go through
>> repository.deserialize_inventory() because that would add a
>> 'inventory.root.revision_id' value for format 5 inventories (Knit1
>> repositories). But since I was now directly writing the committed
>> inventory to disk, I needed to add it myself. So I ended up with:
>> inv = leftmost_parent_tree.inventory
>> if not self.branch.repository._format.rich_root_data:
>> inv.root.revision = leftmost_parent_id
>> Now, I don't know why the commit builder isn't creating the root
>> revision (probably also has to do with Knit1 versus Knit2
> This does seem ugly. I think its because we are exposing dual-model
> inventories outside of the repository layer. I drafted some notes about
> this sort of transition - basically I believe we should have only one
> model in active use throughout the code base, and that every
> producer/consumer of objects should upgrade/downgrade as needed. Our
> fast-path InterObject instances are the sole exception, because they can
> provide specific 'downlevel to downlevel' functionality.
> Lets try to clean this specific case up during 0.12.
>> 2) I'm not 100% happy with CommitBuilder.get_tree() as the function
>> name. I had 'get_committed_tree()' for a while, or I could do
>> 'revision_tree()' to mirror the Repository name...
> I'd prefer revision_tree() to mirror the Repository name.
> Also, please put the fix for 1) within the CommitBuilder. We can have an
> assert in set_parent_trees if you like.
Well, there are issues with CommitBuilder because of how it interacts
with PointlessCommit. I have a patch, but I'm reviewing it with Aaron
I'll do the 'revision_tree()' change, though.
>> 3) I did a lot of refactoring of private functions in WorkingTree,
>> because of common functionality that wasn't 100% common.
> I was thinking of a different tack than you took: I was thinking that
> it would look like:
> set_parent_trees(self, trees, allow_ghost):
> self.set_parent_tree_ids([revid for rev_id, tree in trees])
> left_most_tree = trees
> except IndexError:
> left_most_tree = None
> if left_most_tree is None:
> and that set_parent_tree_ids would not need to change at all.
Actually, we don't have a function called 'set_parent_tree_ids()' we
only have 'set_parent_ids()'. And *that* function calls
'set_last_revision()' which has the side-effect of caching the basis
inventory. And we want to avoid caching the basis inventory in that way,
because it reads it from the repository. But I can't really change
set_parent_ids() because of callers that call it directly, rather than
going through set_parent_trees().
> If you think this may make the patch cleaner, please twiddle it,
> otherwise I'm ok with going with what you've got.
>> I did check, and this removes the extra read_inventory() calls.
>> Unfortunately read_inventory() is still faster than write_inventory().
>> But with this patch, we are down to 1 inventory read, and 2 inventory
>> writes when committing to a new tree. (And because the formats are
>> different, we are stuck with 2 writes).
> Its also appropriate to do 2 writes IMO.
Well, there will be a common case where we will write the same data, but
it seems to be less common right now.
>> I'd like to do a little bit more tweaking of the serializer, because I
>> think there are ways to improve it even more. But that can certainly
>> an 0.12 thing.
> +1 from me for this - merge to bzr.dev, I'll do the cherrypick to 0.11.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060926/850fdaa6/attachment.pgp
More information about the bazaar