pack vs knit based push syd-london packs 1/4 the time of knits... and SFTP write latency glitch?

Robert Collins robertc at robertcollins.net
Thu Aug 16 11:05:35 BST 2007


Executive summary:
   Pushing a new branch of bzr.dev at revno 2000 from Sydney to London
takes 60 minutes with knits, 15 minutes with packs, and 5 minutes with
rsync. 'Woot'.

But something is suspect at the network layer: we can do the same push
operation in 7 minutes within Sydney, so something is causing 8 minutes
of extra delay when the only change is the target machine (and the
latency, which skyrockets). As we're only creating 4 files during the
operation, we shouldn't be paying the create-stat-start-writing
multiplier.

So Robey, I'm wondering if something is causing glitches with SFTP.

I was expecting pack creation, which does many small writes to an open
file object, to perform reasonably, as we have opened the SFTP file
object it's writing to with 'pipeline=True'. However, when I tested from
Martin's place to mine, which are both in Sydney, performance was
brutally slow.

I then added a buffer and wrote ~64K at a time to the SFTP file, and
performance leapt upwards, and passed knits. This was sufficiently good
that push is twice as fast as pushing knits within the Sydney area, and
only 43 seconds (out of 22 minutes) slower than rsync at pulling back
from my place. The test branch is created by 'bzr branch -r 2000 bzr.dev
test-branch'.
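
For anyone wanting to see the idea rather than the patch, here is a
minimal sketch of that kind of write buffer. It is not the code in the
branch: the class and method names are made up for illustration, and the
only assumption about the target is that it has write() and close()
methods, as an SFTP file object does.

```python
class BufferedWriter:
    """Collect many small writes and pass them on in large chunks.

    Hypothetical helper for illustration only; `target` is any object
    with write() and close() methods.
    """

    def __init__(self, target, chunk_size=64 * 1024):
        self._target = target
        self._chunk_size = chunk_size
        self._pending = []      # byte strings not yet written out
        self._pending_len = 0   # total length of self._pending

    def write(self, data):
        self._pending.append(data)
        self._pending_len += len(data)
        # Only touch the (slow) target once enough data has piled up.
        if self._pending_len >= self._chunk_size:
            self.flush()

    def flush(self):
        if self._pending:
            self._target.write(b"".join(self._pending))
            self._pending = []
            self._pending_len = 0

    def close(self):
        self.flush()
        self._target.close()


class CountingTarget:
    """Stand-in for a remote file that counts how often it is written to."""

    def __init__(self):
        self.calls = 0
        self.data = bytearray()

    def write(self, data):
        self.calls += 1
        self.data += data

    def close(self):
        pass


def demo():
    target = CountingTarget()
    writer = BufferedWriter(target)
    for _ in range(1000):
        writer.write(b"x" * 200)   # 1000 small writes, 200000 bytes in all
    writer.close()
    return target.calls, len(target.data)
```

In the demo, a thousand 200-byte writes reach the target as a handful of
writes of roughly 64K each, which is the effect described above.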

Encouraged by this result, I tried pushing to London, where rsync got a
5 minute result, and bzr 15 minutes. tcpdump showed what *looked* to be
regular pauses in the upload. I haven't had time to test with a larger
buffer, but I'm wondering: is there some chance that the pipeline
parameter is not doing what it should? Do you have any suggestions
about how to tell what's happening within the paramiko core?

If you'd like to play with the pack repository, bzr pull
http://people.ubuntu.com/~robertc/baz2.0/repository, and make a branch
with bzr init --experimental, then pull any content you want into that
branch. After that, pushing and pulling with it will preserve the
repository format (except when you push into/from a shared repository).

-Rob

Performance results:

Martin's to my place:
---------------------
push knits sydney
real    13m51.436s
user    0m22.309s
sys     0m1.640s

push packs sydney
real    7m7.261s
user    0m18.453s
sys     0m0.896s

pull knits sydney
real    27m17.761s
user    0m28.586s
sys     0m1.672s

pull packs sydney
real    22m43.602s
user    0m27.474s
sys     0m1.240s

pull-rsync-packs sydney
real    22m0.688s
user    0m0.304s
sys     0m0.216s

Martin's to London
------------------

push knits London
real    60m52.241s
user    0m20.925s
sys     0m1.300s

push packs London
real    15m37.995s
user    0m18.449s
sys     0m0.728s

rsync push packs London
real    5m41.282s
user    0m0.244s
sys     0m0.120s
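
As a sanity check on the summary figures, the 'real' times above can be
compared with a few lines of Python. This is only a sketch: the helper
names are made up, and the dictionary just copies four of the timings
listed above.

```python
def seconds(stamp):
    """Convert a `time`-style stamp such as '15m37.995s' to seconds."""
    minutes, rest = stamp.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


# 'real' times copied from the results above.
REAL = {
    "push knits London": "60m52.241s",
    "push packs London": "15m37.995s",
    "rsync push packs London": "5m41.282s",
    "push packs sydney": "7m7.261s",
}


def ratio(slower, faster):
    """How many times longer `slower` took than `faster`."""
    return seconds(REAL[slower]) / seconds(REAL[faster])


def extra_minutes(slower, faster):
    """Extra wall-clock minutes `slower` took compared with `faster`."""
    return (seconds(REAL[slower]) - seconds(REAL[faster])) / 60


if __name__ == "__main__":
    print("knits vs packs, London: %.1fx"
          % ratio("push knits London", "push packs London"))
    print("packs vs rsync, London: %.1fx"
          % ratio("push packs London", "rsync push packs London"))
    print("packs, London vs Sydney: %.1f extra minutes"
          % extra_minutes("push packs London", "push packs sydney"))
```

This gives roughly 3.9x for knits against packs to London, about 2.7x
for packs against rsync, and about 8.5 extra minutes for the London pack
push over the Sydney one.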

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.