[RFC] Cheap re-pack?

Aaron Bentley aaron.bentley at utoronto.ca
Thu Sep 6 12:51:48 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

With the upcoming pack format, a re-pack is scheduled for every 10
commits.  AIUI, that's an expensive operation, because it requires
generating deltas for 10 commits at re-pack time.

Instead, we could amortize the cost across the 10 commits, by
calculating the deltas at commit time.  Since packs use reverse deltas,
you would calculate the delta from the revision you're committing to its
leftmost ancestor.  So the pack generated by committing revision 10
would also contain the delta from 10 to 9, the pack for 9 would also
contain the delta from 9 to 8, and so on.
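
As a rough illustration of the commit-time step, here is a minimal sketch
in Python.  It assumes texts are lists of lines and uses the standard
library's difflib as a stand-in for whatever matcher the pack code would
really use.  The names reverse_delta and apply_delta and the copy/insert
tuple format are made up for this example, not the actual pack format.

```python
from difflib import SequenceMatcher


def reverse_delta(new_lines, old_lines):
    """Build a delta that rebuilds old_lines when applied to new_lines.

    Hypothetical format: ("copy", start, end) copies new_lines[start:end],
    ("insert", lines) emits lines that exist only in the old text.
    """
    sm = SequenceMatcher(None, new_lines, old_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif j2 > j1:
            # "replace" or "insert": the old text has lines the new one lacks.
            delta.append(("insert", old_lines[j1:j2]))
        # "delete": lines present only in the new text need no record.
    return delta


def apply_delta(base_lines, delta):
    """Rebuild the older text from the newer one plus a reverse delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(base_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


rev9 = ["a\n", "b\n", "c\n"]
rev10 = ["a\n", "B\n", "c\n", "d\n"]
# Computed once, at the time revision 10 is committed.
delta_10_to_9 = reverse_delta(rev10, rev9)
```

The pack written for revision 10 would then hold the fulltext of 10 plus
delta_10_to_9, so no diffing is left for repack time.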

This would mean that repacks would be filtering operations: you read the
deltas for 9 revisions and the fulltext for 1 revision, and you write
those to a new pack.  That would make them IO-bound, not CPU-bound.
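
To show what "filtering" could mean here, a sketch under stated
assumptions: each commit's pack is modelled as a plain dict with the keys
'rev', 'parent', 'fulltext' and 'delta_to_parent'.  That layout and the
repack function are hypothetical, chosen only for illustration.  The
point is that the loop copies records and never runs a diff.

```python
def repack(commit_packs):
    """Combine per-commit packs (ordered oldest to newest) into one pack.

    Keeps the fulltext of the newest revision and the precomputed reverse
    deltas for the rest.  Nothing is diffed here; records are only copied.
    """
    newest = commit_packs[-1]
    new_pack = {
        "fulltexts": {newest["rev"]: newest["fulltext"]},
        "deltas": {},
    }
    revs_in_group = {p["rev"] for p in commit_packs}
    for p in commit_packs:
        # Simplification: a delta whose target lives outside this group
        # is skipped, on the assumption that an older pack stores it.
        if p["parent"] in revs_in_group:
            # Keyed by the revision the delta rebuilds; value is
            # (basis revision, opaque delta payload).
            new_pack["deltas"][p["parent"]] = (p["rev"], p["delta_to_parent"])
    return new_pack


# Ten single-commit packs for revisions 1..10; payloads are placeholders.
packs = [
    {"rev": n, "parent": n - 1, "fulltext": "text of %d" % n,
     "delta_to_parent": "delta %d->%d" % (n, n - 1)}
    for n in range(1, 11)
]
combined = repack(packs)
```

With ten input packs this yields one fulltext (revision 10) and nine
deltas (rebuilding revisions 9 down to 1), matching the counts above.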

Alternatively, if deltas were too big, we could just cache the
get-matching-blocks output at commit time, and calculate the deltas at
repack time.
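
A sketch of this alternative, again using difflib as a stand-in matcher
and assuming texts are lists of lines.  The function names and the delta
tuple format are hypothetical.  Only the matching blocks, a short list of
integer triples, are stored at commit time; the delta is assembled from
them at repack time without running the matcher again.

```python
from difflib import SequenceMatcher


def cache_matching_blocks(new_lines, old_lines):
    """Commit time: run the matcher once and keep only its block list."""
    sm = SequenceMatcher(None, new_lines, old_lines, autojunk=False)
    # Each entry is (start in new, start in old, length); difflib ends
    # the list with a zero-length sentinel at (len(new), len(old)).
    return [tuple(m) for m in sm.get_matching_blocks()]


def delta_from_blocks(blocks, old_lines):
    """Repack time: turn cached blocks into a reverse delta, no diffing."""
    delta = []
    old_pos = 0
    for new_start, old_start, size in blocks:
        if old_start > old_pos:
            # Old-only lines between two matching regions.
            delta.append(("insert", old_lines[old_pos:old_start]))
        if size:
            delta.append(("copy", new_start, new_start + size))
        old_pos = old_start + size
    return delta


def rebuild(new_lines, delta):
    """Apply the reverse delta to the newer text to get the older one."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(new_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


rev9 = ["a\n", "b\n", "c\n"]
rev10 = ["a\n", "B\n", "c\n", "d\n"]
blocks = cache_matching_blocks(rev10, rev9)   # stored at commit time
delta = delta_from_blocks(blocks, rev9)       # built at repack time
```

The trade-off is that repack still has to read the older fulltext to
fill in the inserted lines, but the expensive matching step is already
paid for.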

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG3+nU0F+nu1YWqI0RAiN/AJ4i8cqaTFig24h8bZV0f2WxX/AERwCfRrse
eyZCbI/aTIAl1+g2GkWmbHI=
=Vdmt
-----END PGP SIGNATURE-----


