Optimising branching and merging big repositories between far away locations...

John Arbash Meinel john at arbash-meinel.com
Wed Oct 29 20:59:57 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Asmodehn Shade wrote:
> Thanks a lot for the help John.
> 
> For those who are experiencing the same problem, and as John advised me,
> try the sftp:// protocol.
> The protocol enforces the size of the chunks being transferred, which
> lets us work around the problem as users. Even if it is slower than
> bzr+ssh://, at least it works ;-)
> 
> Thanks again !
> Waiting for the next version ;-)
> 
> Cheers,
> 
> --
> Alex
> 


The other thing we tried was upgrading to --development2 format, to see
how much of a difference that would make.

To start with, it turns out that the prefetch code doesn't have any
effect on this test. When fetching the first revision (which adds
thousands of files), we are always making full-width requests.

Next, the btree indexes are 23MB instead of 77MB.

While he thought he was latency bound, he is actually bandwidth bound
while inspecting the indexes. He is 'latency bound' later on, but that is
because of the buffering time, not the actual ping time to the remote host.

So it was actually taking 296s before he could start downloading the
actual file content. With btree indexes, it starts at around 94s
(77MB/23MB = 3.3, and 296s/94s = 3.1), and that seems to be a purely
bandwidth-based issue. His claim was that he has a ~1.5Mbit connection,
which is approx 180kB/s. 77MB/296s ~ 260kB/s, and 23MB/94s ~ 240kB/s.
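
As a rough check of the arithmetic above, here is a small Python sketch.
It assumes decimal units (1 MB = 10**6 bytes, 1 kB = 10**3 bytes), which
is my assumption and not something measured; the helper name is just for
illustration.

```python
# Sanity check of the index sizes and timings quoted above.
# Assumption: decimal units, 1 MB = 10**6 bytes and 1 kB = 10**3 bytes.
MB = 10**6
KB = 10**3


def rate_kb_per_s(size_mb, seconds):
    """Average transfer rate in kB/s for size_mb megabytes in `seconds`."""
    return size_mb * MB / seconds / KB


knit_rate = rate_kb_per_s(77, 296)   # 77MB of indexes read in 296s
btree_rate = rate_kb_per_s(23, 94)   # 23MB of btree indexes read in 94s
link_rate = 1.5e6 / 8 / KB           # nominal 1.5 Mbit/s link, in kB/s

size_ratio = 77 / 23                 # how much smaller the btree indexes are
time_ratio = 296 / 94                # how much sooner content download starts

print(f"77MB / 296s     : {knit_rate:.0f} kB/s")
print(f"23MB / 94s      : {btree_rate:.0f} kB/s")
print(f"nominal link    : {link_rate:.1f} kB/s")
print(f"size ratio      : {size_ratio:.2f}")
print(f"time ratio      : {time_ratio:.2f}")
```

The two measured rates land close to each other, and the size ratio
tracks the time ratio, which is what you would expect if this phase is
limited by bandwidth and not by round trips.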

Which means that btree indexes shave about 3.4 minutes (296s - 94s = 202s)
off of his startup time. Either way he'll still have to wait a long time
to download the 16GB of content (about 19 hours if my numbers are correct).
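
The download estimate can be checked the same way. This sketch assumes
the content transfers at roughly the ~240kB/s observed for the indexes
and that 1 GB = 10**9 bytes; both are assumptions, and the real rate
could easily differ.

```python
# Rough estimate of the full-content download time.
# Assumptions: 1 GB = 10**9 bytes, sustained rate of ~240 kB/s.
KB = 10**3
GB = 10**9

content_bytes = 16 * GB
rate_bytes_per_s = 240 * KB

download_s = content_bytes / rate_bytes_per_s
download_h = download_s / 3600

# Share of the total that is spent reading indexes before content starts.
knit_share = 296 / download_s
btree_share = 94 / download_s

print(f"download time   : {download_h:.1f} hours")
print(f"index read, 296s: {knit_share:.2%} of the download")
print(f"index read, 94s : {btree_share:.2%} of the download")
```

Under those assumptions the index read is well under 1% of the initial
download in both cases.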

So in the grand scheme of things, it is a small improvement, but it is a
consistent improvement. And once he has that enormous initial download,
the startup time will become more of an issue.

I've also attached an interesting log of my network connection. Both
circled areas are doing "bzr log --short -r -10..-1 http://....". Can
you guess which one is using btree indexes and which is using knit indexes?

The time is 7s versus 15s. I have the feeling we are latency bound on
all of the .bzr/branch-format, .bzr/branch/format, etc. files;
otherwise we would see a much more significant change. (Packing the
repository drops off another second or 2, so some of it is
round trips to multiple pack files, but that still leaves 5s.)

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkkIzs0ACgkQJdeBCYSNAAO+8QCeIFCckaJ9MaRqutPBWOob52Kc
6roAn31KdBlgzzHHO8Mpx6oBNxfifWvC
=+7+n
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BzrLog.png
Type: image/png
Size: 3606 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20081029/cb297d2a/attachment.png 

