[MERGE] Comparison cache to speed up diff
Aaron Bentley
aaron.bentley at utoronto.ca
Fri Jul 20 20:39:41 BST 2007
Hi all,
This implements the first part of my diff-speedup plans: A cache of
SequenceMatcher.get_matching_blocks output. This is useful because file
comparisons have poor scaling properties.
Each list of matching blocks is keyed by the working tree file's SHA1
sum and the basis tree's file revision.
Caches are created when you run diff, reused the next time you run diff,
and cleared when you commit.
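
To illustrate the idea, here is a minimal sketch of such a cache. It is not
the code from the attached patch: the ComparisonCache class, its method
names, and the in-memory dict are hypothetical, and difflib.SequenceMatcher
from the standard library stands in for whatever matcher bzr actually uses.
The key follows the description above (SHA1 of the working text plus an
identifier for the basis file revision).

```python
import difflib
import hashlib


class ComparisonCache:
    """Hypothetical in-memory cache of get_matching_blocks() output."""

    def __init__(self):
        # Maps (working-text SHA1, basis file revision) -> list of blocks.
        self._blocks = {}

    @staticmethod
    def _key(working_lines, basis_revision):
        # Assumption: the working text is a list of str lines.
        sha1 = hashlib.sha1("".join(working_lines).encode("utf-8")).hexdigest()
        return (sha1, basis_revision)

    def get_matching_blocks(self, basis_lines, working_lines, basis_revision):
        key = self._key(working_lines, basis_revision)
        blocks = self._blocks.get(key)
        if blocks is None:
            # Cache miss: do the expensive comparison once and remember it.
            matcher = difflib.SequenceMatcher(None, basis_lines, working_lines)
            blocks = [tuple(b) for b in matcher.get_matching_blocks()]
            self._blocks[key] = blocks
        return blocks

    def clear(self):
        # Would be called on commit, when the basis tree changes.
        self._blocks.clear()
```

A second diff of an unchanged file hits the same key and skips the
comparison; editing the file changes its SHA1, so the stale entry is simply
never looked up again.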
I tested against a bzr.dev tree after doing "revert -r -50".
For vanilla bzr, the best diff time was:
real 0m4.538s
user 0m4.020s
sys 0m0.296s
When writing the cache for the first time, the best result was 1.3x slower:
real 0m6.114s
user 0m5.276s
sys 0m0.492s
When reusing the cached data, the best result was about 1.3x faster:
real 0m3.556s
user 0m2.856s
sys 0m0.308s
As usual, file access time is a significant factor. According to
lsprof/kcachegrind, get_file takes 49.61 of 103.05 ticks.
Aaron
-------------- next part --------------
A non-text attachment was scrubbed...
Name: comparison-cache.patch
Type: text/x-patch
Size: 31169 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070720/587d9e64/attachment-0001.bin