knits and get_delta(s) and performance
Robert Collins
robertc at robertcollins.net
Wed Sep 5 22:47:34 BST 2007
Currently there is a bunch of API friction in knits: they are
pessimised to do several memory copies of texts due to the API.
Specifically, the Content API works entirely in terms of annotated texts.
So if we have a knit with no annotations and insert a change to an ISO
image, the ISO gets line-split and each line is prefixed with the current
version; the basis is read from disk and each of its lines is prefixed
with the basis version; then the two sets of (version, text) line tuples
are processed to give two plain lists of lines, which are finally delta'd.
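
To make the extra copies concrete, here is a rough sketch of the two paths. It is not bzrlib code: annotate, strip_annotations, line_delta and the two delta_* functions are hypothetical names invented for illustration, and difflib stands in for whatever matcher knits actually use.

```python
from difflib import SequenceMatcher


def annotate(text, version):
    """Split text into lines and tag each line with a version id."""
    return [(version, line) for line in text.splitlines(True)]


def strip_annotations(annotated):
    """Drop the version tags again, leaving a plain list of lines."""
    return [line for _origin, line in annotated]


def line_delta(old_lines, new_lines):
    """Return (start, end, count, new_lines) hunks turning old into new."""
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            delta.append((i1, i2, j2 - j1, new_lines[j1:j2]))
    return delta


def delta_via_annotations(basis_text, new_text):
    """Annotated path: both texts are tagged, then untagged, then diffed."""
    basis = annotate(basis_text, 'basis-version')  # extra copy of the basis
    new = annotate(new_text, 'new-version')        # extra copy of the new text
    # Two more copies are made here just to get back to plain lines.
    return line_delta(strip_annotations(basis), strip_annotations(new))


def delta_direct(basis_text, new_text):
    """Plain path: split once and diff, with no annotation round trip."""
    return line_delta(basis_text.splitlines(True), new_text.splitlines(True))
```

Both functions yield the same delta; the annotated path just builds and throws away two lists of tuples on the way, which hurts most on large texts.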
Similarly, initial commit does a lot of pointless hoop-jumping. Now, due
to some peeking at internals, the fastest the API can perform is in its
annotated shape.
Fixing this in a rough and ready way showed a win from 2m35 to 2m23 for
my initial commit tests.
The largest thing standing in the way of fixing this is
'knit.get_delta(s)', which I wrote back when converting from weaves as an
experiment to try and do direct weave to knit insertions without full
text reconstruction. (It returns (origin, line) in the delta.) Now, I
don't think we use this API anywhere, though some of the
get_left_matching_blocks tests on versionedfile use it.
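
As a rough illustration of why a delta carrying (origin, line) pairs ties callers to annotated content, here is a sketch. The hunk layout and the names annotated_delta and plain_delta are assumptions made for this example, not the real get_delta signature.

```python
from difflib import SequenceMatcher


def annotated_delta(basis_lines, new_lines, new_version):
    """Hypothetical delta whose inserted lines are (origin, line) pairs."""
    matcher = SequenceMatcher(None, basis_lines, new_lines, autojunk=False)
    hunks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == 'equal':
            continue
        # Every inserted line has to be wrapped with its origin version.
        inserted = [(new_version, line) for line in new_lines[j1:j2]]
        hunks.append((i1, i2, len(inserted), inserted))
    return hunks


def plain_delta(hunks):
    """Strip origins from an annotated delta, copying every hunk again."""
    return [(start, end, count, [line for _origin, line in lines])
            for start, end, count, lines in hunks]
```

A caller that only wants the text has to pay for plain_delta on every hunk, and an unannotated knit has to invent the origins just to satisfy the return shape.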
I'd like to just delete it because:
- it makes performance work much trickier.
- we've never used it in anger AFAICT
- keeping it in a deprecated form will still be tricky
Thoughts?
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.