[RFC/performance] workingtree.commit

John Arbash Meinel john at arbash-meinel.com
Wed Sep 6 13:46:37 BST 2006


Robert Collins wrote:
> Currently workingtree.commit does this:
> 
> self._set_inventory(self.read_working_inventory())
> 
> This is a non-trivial performance issue.
> 
> AFAICT, after looking at it and discussing with Aaron on IRC we suspect
> it is to ensure that the inventory in memory is synched with an
> inventory that may have been diddled with by commit..
> 
> And indeed, in commit we do:
>         if deleted_ids:
>             deleted_ids.sort(reverse=True)
>             for path, file_id in deleted_ids:
>                 del self.work_inv[file_id]
>             self.work_tree._write_inventory(self.work_inv)
> 
> I'd like to change this to:
> 
>     if deleted_ids:
>         for path, file_id in deleted_ids:
>             self.work_tree.unversion(
>                 [file_id, for path, file_id in deleted_ids])

I think you left in an extra for loop. What you wanted was:
if deleted_ids:
    self.work_tree.unversion([file_id for path, file_id in deleted_ids])

Now, I'm okay with having an unversion() function, and you and I already
agree that the inventory should only be held in memory as long as we
have a lock. (So it shouldn't be read at startup, and only on request,
and held inside the Transaction).

However, we've also discussed that 'commit()' shouldn't be deleting
entries. They should be marked as missing, and perhaps it should be
considered a conflict, and should fail to commit, but a missing file
should not be automatically deleted.

So I'm thinking that unversion() is a misfeature. Because it is enabling
something we don't want to be doing. I would be okay with adding it in
the short-term, because it could certainly help our general performance,
but if it would be a similar amount of effort, I would rather get
'missing' files handled properly, and require users to 'bzr rm' files
that have been deleted.

So +0, if we decide not to do missing() yet, the go ahead with unversion.
John
=:->

> 
> which function would be defined as:
> """Remove the file_ids from the set of currently versioned files."""
> 
> This would not write the working inventory itself - at the same time I'd
> put in place a small test in unlock() to write the _inventory if we know
> its dirty.
> 
> This would allow removing that inventory-synchronising call from commit,
> and remove one parse and one write from common-case commits.
> 
> Cheers,
> Rob
> 
> 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060906/c894c292/attachment.pgp 


More information about the bazaar mailing list