Why does Bazaar download so much data for a small change?

John Arbash Meinel john at arbash-meinel.com
Wed Feb 4 15:21:30 GMT 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nicholas Allen wrote:
> I was using a dumb file storage and not the smart server. Would this
> also make a big difference? If so I will switch to using the smart
> server. But this still seems to me to be an excessively large amount of
> data to download. Ideally, it should be 100-1000 times less data (but I
> guess such optimizations would only be possible with a smart server).

In theory, it could make a large difference. In practice, we haven't
finished enough work on the smart server for it to make that much of a
difference.

1.9 format repositories make a huge difference in the amount of data you
need to download. For small requests, they make a bigger difference in MB
than they do in time, because of latency and round-trips. For
moderate-sized requests they help with both.

> 
> I was using the very latest versions of Bazaar installed from bzr.dev.
> I'm not sure what format the repository was though...
> 
> Nick

So bzr.dev has the ordering fix, but without upgrading the repository,
you would be missing the other half of the fix.

John
=:->

> 
> John Arbash Meinel wrote:
> Nicholas Allen wrote:
>  
>>>> Hi,
>>>>
>>>> I pulled from a remote location where only one line was modified in one
>>>> source file. Bazaar's progress bar showed that it downloaded over 1MB of
>>>> data. I'm not sure how a one line change can equate to 1MB of data but
>>>> it would certainly explain why Bazaar is one of the slowest VCSs.
>>>>
>>>> Any idea what this could be? It would seem that this is an obvious place
>>>> for future optimisation.
>>>>
>>>> Cheers,
>>>>
>>>> Nick
>>>>     
> 
> Best guess would be reading the index files. And I would bet that using
> a bzr >= 1.11 with a --1.9 format repository would be significantly
> better about that (as 'small updates from bzr.dev' was something I was
> specifically optimizing for).
> 
> I don't have detailed numbers, but the 1.9 format repository can
> generally look up any key in an index with approx 12kB of data. For
> bzr.dev's repository the largest revision index is around 810kB.
> pack-0.92 generally takes log2(size/64k) requests to find a key, so
> about 4 round trips at 64kB each = 256kB.
> 
> So the 1.9 format needs about 1/20th the data, and 1 fewer round trip,
> to look and see if a given key is in a (large) index.
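
The arithmetic above can be checked with a short sketch. It only uses
the figures quoted in the message (810kB index, 64kB reads, roughly 12kB
per 1.9-format lookup); the function and constant names are made up for
illustration and are not part of bzrlib.

```python
import math

# Figures quoted in the message; the names are illustrative only.
LARGEST_INDEX_KB = 810       # largest revision index in bzr.dev's repository
READ_KB = 64                 # data fetched per request for pack-0.92
FORMAT_19_LOOKUP_KB = 12     # approximate data per key lookup in 1.9 format


def pack092_requests(index_kb, read_kb=READ_KB):
    """Requests under the log2(size/64k) estimate, rounded up to a whole request."""
    return math.ceil(math.log2(index_kb / read_kb))


requests = pack092_requests(LARGEST_INDEX_KB)
pack092_kb = requests * READ_KB
ratio = FORMAT_19_LOOKUP_KB / pack092_kb

print("pack-0.92 requests:", requests)
print("pack-0.92 data (kB):", pack092_kb)
print("1.9 data as a fraction of pack-0.92: 1/%d" % round(1 / ratio))
```

Rounding log2(810/64) up gives the 4 round trips mentioned, and 12kB
against 256kB comes out close to the "about 1/20th" estimate.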
> 
> The change in bzr 1.11 was to the order in which we look in files: if
> we find something in <filename>.rix, we look for its corresponding
> inventories/texts in <filename>.iix before we look in the other
> files.
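
As a rough illustration of that ordering idea, here is a toy model. It
is not bzrlib code: the dict-of-sets "indices", the pack names, and both
function names are assumptions made up for this sketch. It only shows
why probing the pack that already produced a hit first can save lookups.

```python
def lookup_order(pack_names, hit_pack):
    """Search order that tries hit_pack first, then the remaining packs."""
    return [hit_pack] + [p for p in pack_names if p != hit_pack]


def find_inventory(key, inventory_indices, hit_pack):
    """Return (pack name, number of indices probed) for key.

    inventory_indices is a hypothetical stand-in for the .iix files:
    a dict mapping pack name -> set of keys stored in that pack.
    hit_pack is the pack whose .rix already contained the revision.
    """
    probes = 0
    for pack in lookup_order(list(inventory_indices), hit_pack):
        probes += 1
        if key in inventory_indices[pack]:
            return pack, probes
    return None, probes


# Three made-up packs; revision "rev-3" was found in pack "c".
indices = {"a": {"rev-1"}, "b": {"rev-2"}, "c": {"rev-3"}}
print(find_inventory("rev-3", indices, hit_pack="c"))
```

In this model each probe stands for reading part of a remote index, so
fewer probes would mean fewer round trips for a small update.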
> 
> For me, it changed the time for "bzr up" with a revision or 2 from 45s+
> down to 10-15s.
> 
> John
> =:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkmJsnoACgkQJdeBCYSNAAOwzACgg23upOcpYPyy0d83mrG03DC3
tS0An3Pf7eFQ3bPR+fKwFNjSLcQZhd1J
=IPk+
-----END PGP SIGNATURE-----



