svn-import performance analysis
John Arbash Meinel
john at arbash-meinel.com
Wed Dec 3 00:03:18 GMT 2008
Jelmer Vernooij wrote:
> I've done some work analysing the critical points in importing from
> Subversion using bzr-svn 0.5.
>
> The main culprits appear to be:
>
> * Inventory.copy() (33%)
Unfortunately, Inventory.copy() is a deep copy rather than a copy-on-write (CoW) operation.
One thing that the split-inventory code changes is making Inventories a
bit more immutable (because CHKInventory is going to be more explicitly
CoW).
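
To illustrate the difference, here is a minimal sketch of deep-copy versus copy-on-write behaviour. The class names (DeepCopyInventory, CowInventory) and their methods are hypothetical and are not bzrlib APIs; this is also not how CHKInventory is implemented. It only shows why a CoW copy can be cheap when the copy is rarely or only slightly modified, assuming entries are treated as immutable values.

```python
import copy


class DeepCopyInventory:
    """Toy inventory whose copy() duplicates every entry up front."""

    def __init__(self, entries=None):
        self._entries = dict(entries or {})

    def copy(self):
        # Cost is proportional to the total size of all entries.
        return DeepCopyInventory(copy.deepcopy(self._entries))


class CowInventory:
    """Toy inventory whose copy() shares storage until someone writes."""

    def __init__(self, entries=None, shared=False):
        self._entries = entries if entries is not None else {}
        self._shared = shared

    def copy(self):
        # O(1): both objects point at the same dict and are marked shared.
        self._shared = True
        return CowInventory(self._entries, shared=True)

    def _ensure_private(self):
        # First write after a copy: take a shallow private copy of the dict.
        # Entries themselves are assumed immutable, so they stay shared.
        if self._shared:
            self._entries = dict(self._entries)
            self._shared = False

    def set(self, file_id, entry):
        self._ensure_private()
        self._entries[file_id] = entry

    def get(self, file_id):
        return self._entries[file_id]


parent = CowInventory()
parent.set("file-a", ("a.txt", "rev-1"))
child = parent.copy()            # no entry data is duplicated here
child.set("file-a", ("a.txt", "rev-2"))  # child detaches on first write
```

In this sketch the first write after a copy still costs a shallow dict copy; a tree-structured design could avoid even that by sharing unchanged subtrees.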
> * Repository.add_revision() (49%) (most of which is spent serialising
> the inventory)
>
> While fetching a revision delta, bzr-svn makes a copy of the inventory
> of the parent revision and applies the changes it sees to that. It needs
> the previous inventory as well to figure out renames and the like in
> roundtripped revisions.
So it needs the parent revision, and the grandparent? Or it needs the
parent revision both for a delta and for comparison?
I'm curious if you hold on to the newly generated inventory when you go
on to the next child, or whether you pull it out again.
>
> I'm at a bit of a loss as to how I can optimize this further. Does
> anybody have any ideas? Also, I would expect CHK inventories to be of
> help here - is that correct?
>
> Cheers,
>
> Jelmer
CHK inventories will be much more capable of "partial update without
full serialization/deserialization", because that is designed in from
the beginning.
One hack I did a long time ago was to have each InventoryEntry remember
its serialized form (along with some other info like what serializer
created it, etc.) and then have it throw away that value when it was
modified. Though it really needs the calling code to be careful about
modifications, because InventoryEntries are just plain ol' data
structures, without a way to know if they have been made "dirty".
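
A rough sketch of that idea follows. CachingEntry and its fields are hypothetical, not the real InventoryEntry, and the serialized format is made up. Note one deliberate difference from the hack described above: this version intercepts attribute assignment so the entry can invalidate its own cache, instead of relying on callers to be careful.

```python
class CachingEntry:
    """Toy entry that remembers its serialized form until it is modified."""

    def __init__(self, file_id, name, revision):
        self._cached = None  # (serializer_name, text) or None
        self.file_id = file_id
        self.name = name
        self.revision = revision

    def __setattr__(self, attr, value):
        object.__setattr__(self, attr, value)
        # Any change to a real field makes the cached text stale.
        if attr != "_cached":
            object.__setattr__(self, "_cached", None)

    def serialize(self, serializer_name="toy-v1"):
        cached = self._cached
        # Reuse the cached text only if the same serializer produced it.
        if cached is not None and cached[0] == serializer_name:
            return cached[1]
        text = "%s\x00%s\x00%s" % (self.file_id, self.name, self.revision)
        self._cached = (serializer_name, text)
        return text


entry = CachingEntry("file-a", "a.txt", "rev-1")
first = entry.serialize()        # computed and cached
again = entry.serialize()        # served from the cache
entry.name = "b.txt"             # assignment drops the cached form
updated = entry.serialize()      # recomputed with the new name
```

The trade-off is that every attribute assignment now pays for the __setattr__ hook, which may matter for objects created in large numbers.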
So... CHK should be a big help here, and without doing some dirty hacks
in bzrlib, I'm not sure if there is much else to be done.
John
=:->