[MERGE] faster extraction of plain texts from knits
John Arbash Meinel
john at arbash-meinel.com
Fri Jul 27 17:07:23 BST 2007
This is an update to my previous email, which includes some of the cleanup that
I did while working on the pyrex implementation.
I took a slightly different route for this, since I saw it as beneficial
overall. I got rid of some of the annotation factory helpers I had
introduced, in favor of ones for which I can write better Pyrex implementations.
I went ahead and skipped _get_text_map entirely, in favor of optimizing only
the single-text extraction case. Single-text extraction should be optimized
anyway, and doing it this way causes less overall churn while still giving
benefits for the bulk of our work.
Basically, the way this works is that rather than parsing things into lines
from the beginning, and then parsing those lines into either fulltexts (without
annotations) or line deltas (without annotations), we go straight from a gzip
chunk into a line-delta, or un-annotated fulltext.
I have pyrex implementations for both of those, which helps quite a bit.
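To illustrate the idea of going straight from a gzip chunk to an un-annotated
result, here is a minimal sketch. It is not the code in the patch. It assumes
a simplified record layout (one header line, then annotated "origin text"
lines, then one trailer line) and, for deltas, hunks introduced by a
"start,end,count" line. The function names are hypothetical.

```python
import gzip
import io


def _record_body(gz_chunk):
    """Decompress one chunk and drop the assumed header and trailer lines."""
    data = gzip.decompress(gz_chunk)
    # readlines() splits only on b'\n' and keeps the line endings.
    lines = io.BytesIO(data).readlines()
    return lines[1:-1]


def _strip_origin(line):
    """Remove the assumed 'origin ' prefix from one annotated line."""
    return line.split(b' ', 1)[1]


def extract_fulltext(gz_chunk):
    """Return the un-annotated fulltext lines stored in one chunk."""
    return [_strip_origin(line) for line in _record_body(gz_chunk)]


def extract_line_delta(gz_chunk):
    """Return un-annotated hunks as (start, end, count, new_lines) tuples."""
    lines = _record_body(gz_chunk)
    hunks = []
    pos = 0
    while pos < len(lines):
        # Hunk header in the assumed 'start,end,count' form.
        start, end, count = (
            int(n) for n in lines[pos].decode('ascii').split(','))
        pos += 1
        new_lines = [_strip_origin(l) for l in lines[pos:pos + count]]
        hunks.append((start, end, count, new_lines))
        pos += count
    return hunks
```

The point of the sketch is that no intermediate annotated structure is built:
each function makes one pass over the decompressed lines and emits only what
the caller needs.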
bzr.dev/bzr checkout . ,,temp
10 loops, best of 3: 5.32 sec per loop
faster_knit_extract/bzr checkout . ,,temp
10 loops, best of 3: 4.68 sec per loop
pyrex_knit_extract/bzr checkout . ,,temp
10 loops, best of 3: 3.72 sec per loop
But in the short term, we can get a 14% improvement in extraction (5.32 sec
down to 4.68 sec per checkout) without having any Pyrex code involved.
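For reference, the arithmetic behind these figures can be checked with a short
script. This is only a sketch using the three timings quoted above. It reports
both the speedup (ratio of old time to new time) and the reduction in
wall-clock time, since the two give different percentages.

```python
# Best-of-3 times quoted above, in seconds per checkout loop.
timings = {
    "bzr.dev": 5.32,
    "faster_knit_extract": 4.68,
    "pyrex_knit_extract": 3.72,
}


def compare(baseline, candidate):
    """Return (percent faster, percent less time) relative to baseline."""
    speedup = (baseline / candidate - 1.0) * 100.0
    reduction = (baseline - candidate) / baseline * 100.0
    return speedup, reduction


base = timings["bzr.dev"]
results = {
    name: compare(base, seconds)
    for name, seconds in timings.items()
    if name != "bzr.dev"
}
for name, (speedup, reduction) in results.items():
    print("%s: %.1f%% faster, %.1f%% less time" % (name, speedup, reduction))
```

Measured as a speedup, the pure-Python branch comes out at roughly 14%, which
corresponds to about 12% less wall-clock time. The Pyrex branch is roughly 43%
faster, or about 30% less time.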
I think I've addressed all of Aaron's comments with this patch, though because
I'm doing it a bit differently, I may have prompted new ones.
I'll post my follow-up with the Pyrex functions after this has been reviewed
and accepted, since I think they are separate, but BB would consider the one to
supersede the other. (One depends on the first, but I would rather they were