Why does Bazaar download so much data for a small change?
John Arbash Meinel
john at arbash-meinel.com
Wed Feb 4 13:12:43 GMT 2009
Nicholas Allen wrote:
> I pulled from a remote location where only one line was modified in one
> source file. Bazaar's progress bar showed that it downloaded over 1MB of
> data. I'm not sure how a one line change can equate to 1MB of data but
> it would certainly explain why Bazaar is one of the slowest VCSs.
> Any idea what this could be? It would seem that this is an obvious place
> for future optimisation.
Best guess would be reading the index files. And I would bet that using
a bzr >= 1.11 with a --1.9 format repository would be significantly
better about that (as 'small updates from bzr.dev' was something I was
specifically optimizing for).
I don't have detailed numbers, but the 1.9 format repository can
generally look up any key in an index with approx 12kB of data. For
bzr.dev's repository the largest revision index is around 810kB.
pack-0.92 generally takes log2(size/64k) requests to find a key, so
about 4 round trips at 64kB each = 256kB for that index.
So the 1.9 format reads about 1/20th the data, with 1 less round trip,
to look and see if a given key is in a (large) index.
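
To make the arithmetic above concrete, here is a rough sketch in Python. It only reproduces the estimate from the figures quoted in this mail (64kB per request, an 810kB index, roughly 12kB per lookup for the 1.9 format); the function name is made up for illustration and none of this is bzrlib code.

```python
import math

def pack092_lookup_cost(index_kb, page_kb=64):
    """Estimate (round trips, kB read) for one key lookup in a pack-0.92
    index, using the rule of thumb log2(size / 64k) requests per key."""
    # Round up to whole requests; always at least one request.
    round_trips = max(1, math.ceil(math.log2(index_kb / page_kb)))
    return round_trips, round_trips * page_kb

# Figures taken from the text: largest bzr.dev revision index ~810kB,
# and ~12kB read per lookup with the 1.9 format.
trips, old_kb = pack092_lookup_cost(810)
new_kb = 12
ratio = old_kb / new_kb

print(trips, old_kb, round(ratio, 1))
```

With these inputs the estimate comes out at 4 requests and 256kB, a little over 20 times the 12kB figure, which matches the "about 1/20th" above.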
The change in bzr 1.11 was to the order in which we look in files, so
that if we find something in <filename>.rix we look for its
corresponding inventories/texts in <filename>.iix before we look in the
For me, it changed the time for "bzr up" with a revision or 2 from 45s+
down to 10-15s.
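
As a toy illustration of the reordering idea (not bzrlib's actual code), the sketch below keeps one index per pack and counts how many indices get probed. All names here (`lookup_order`, `find_key`, the `pack-a` style pack names) are hypothetical, and it assumes the alternative to checking the matching .iix first is walking the other packs' indices in their default order.

```python
def lookup_order(pack_names, preferred=None):
    """Return pack names with `preferred` moved to the front."""
    if preferred is None or preferred not in pack_names:
        return list(pack_names)
    return [preferred] + [p for p in pack_names if p != preferred]

def find_key(indices, key, order):
    """Probe per-pack indices in `order`; return (pack name, indices probed)."""
    for probed, name in enumerate(order, start=1):
        if key in indices[name]:
            return name, probed
    return None, len(order)

# Toy data: the new revision and its inventory live in the same pack.
packs = ["pack-a", "pack-b", "pack-c"]
rix = {"pack-a": set(), "pack-b": set(), "pack-c": {"rev-1"}}
iix = {"pack-a": set(), "pack-b": set(), "pack-c": {"rev-1"}}

# Find which pack holds the revision, then look up its inventory.
rev_pack, _ = find_key(rix, "rev-1", packs)
_, naive_probes = find_key(iix, "rev-1", packs)
_, hinted_probes = find_key(iix, "rev-1", lookup_order(packs, rev_pack))

print(rev_pack, naive_probes, hinted_probes)
```

In this toy setup the default order probes three inventory indices while the hinted order probes one. Over a remote transport each avoided probe is an index read that never has to be downloaded.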