Slow "bzr branch" on Savannah

John Arbash Meinel john at arbash-meinel.com
Tue Feb 8 21:22:12 UTC 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2/8/2011 1:19 PM, Eli Zaretskii wrote:
...

> What does Launchpad know or do that bzr.savannah doesn't?

Use bzr 2.2+ rather than 2.0?

> 
> The detailed results are as follows:
> 
>   1) lp:emacs
> 
>      a) GNU/Linux:
> 
>         time:
> 
>           real    47m31.014s
>           user    16m31.510s
>           sys     0m14.000s
> 
>         network:
> 
>           Transferred: 675470KiB (237.1K/s r:675004K w:467K)
> 
>      b) Windows:
> 
>         time:
> 
>           real    01h04m01.629s
>           user    00h20m54.484s
>           sys     00h00m57.046s
> 
>         network:
> 
>           Transferred: 676975kB (176.3kB/s r:676479kB w:496kB)

I'm assuming you aren't logged in here, since you mention the 'HTTP'
debug messages later. So this is effectively:

  bzr branch http://bazaar.launchpad.net/...

I would certainly be curious about the times for:

  bzr branch bzr+ssh://bazaar.launchpad.net/

But you would need an LP user id for that.

Note that http and sftp both read raw files from the remote, and then
determine where data needs to be fetched. bzr+ssh:// or bzr:// have the
server compute some of the information about what needs to be sent.

> 
>   2) bzr://bzr.savannah.gnu.org/emacs/trunk
> 
>      a) GNU/Linux:
> 
>         time:
> 
>           real    45m4.820s
>           user    15m58.380s
>           sys     0m12.910s
> 
>         network:
> 
>           Transferred: 540480KiB (199.9K/s r:540403K w:77K)
> 
>      b) Windows:
> 
>         time:
> 
>           real    02h13m52.949s
>           user    00h15m37.828s
>           sys     00h00m39.578s
> 
>         network:
> 
>           Transferred: 552961kB (68.9kB/s r:552882kB w:79kB)

This seems to be a latency thing. And from what you described it could
be something non-optimal about the "discovering revisions to fetch"
phase. There are a lot of pieces that could be involved:

1) Emacs ancestry is very linear over most of its 100k revisions. If you
imagine asking for the parent revision for all revisions you know about,
but you only know about 1 rev, and it only has one parent, then you have
100k round trips to find out the parents.

2) "dumb" transports like http and sftp know that they can't be smart
about getting more ancestry from the server. So they cache most/all of
the index files that they read. Which means that while it is looking for
only 'give me the parent of rev-foo-bar' it has to read at least 4096
bytes of context. If it then caches that context, a future request can
just answer it locally, without hitting the remote server.

3) The smart server 'get_parent_map' request is *supposed* to be
intelligent about this. So if someone says "give me the parent of
rev-foo-bar" it is supposed to give you the direct parent, and all
parent-of-parents that it thinks it can fit in a 64kB message. This may
not be working correctly.

4) Something to check:

 bzr branch nosmart+bzr://...

Note that (2) is often very decent for 'give me the whole ancestry' as
long as everything fits into cache. You end up reading everything, but
you have to read it all anyway if you are getting the whole history. (2)
does very poorly for the 'bzr up' and 'bzr commit' cases, which already
have 99.9% of the history.
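
To make points (1) and (3) concrete, here is a rough Python simulation. It
is not bzrlib code: it assumes a strictly linear history and a hypothetical
server that answers each request with a fixed number of ancestors. The
names build_linear_history and count_round_trips, and the figure of 1000
revisions per response, are invented for the example; how many revisions
really fit in a 64kB message is not stated here.

```python
def build_linear_history(n):
    """Return a child -> parent map for a strictly linear history of n revisions.

    Revision ids are hypothetical ("rev-0" is the root and has no parent).
    """
    return {
        "rev-%d" % i: ("rev-%d" % (i - 1) if i > 0 else None)
        for i in range(n)
    }


def count_round_trips(parents, tip, revs_per_response):
    """Count the requests needed to walk from tip back to the root.

    Each request names the one revision whose parent is still unknown.
    The simulated server replies with that parent plus further ancestors,
    up to revs_per_response revisions in total.
    """
    trips = 0
    current = tip
    while current is not None:
        trips += 1
        for _ in range(revs_per_response):
            current = parents[current]
            if current is None:
                # Reached the root; nothing more to ask for.
                break
    return trips


history = build_linear_history(100000)
one_at_a_time = count_round_trips(history, "rev-99999", 1)
batched = count_round_trips(history, "rev-99999", 1000)
```

With 100k revisions, one parent per response costs 100,000 round trips,
while 1000 revisions per response costs 100. At an assumed 50 ms round
trip, that is roughly 83 minutes versus 5 seconds spent purely on latency.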


> 
>   3) bzr+ssh://MYID@bzr.savannah.gnu.org/emacs/trunk
> 
>      a) GNU/Linux:
> 
>         time:
> 
>           real    39m6.798s
>           user    14m17.740s
>           sys     0m11.640s
> 
>         network:
> 
>           Transferred: 540001KiB (230.2K/s r:539924K w:77K)
> 
>      b) Windows:
> 
>         time:
> 
>           real    02h27m03.017s
>           user    00h28m17.875s
>           sys     00h03m11.984s
> 
>         network:
> 
>           Transferred: 553565kB (62.8kB/s r:553487kB w:78kB)
> 
>   4) sftp://MYID@bzr.savannah.gnu.org/srv/bzr/emacs/trunk/
> 
>      a) GNU/Linux:
> 
>         time: 15 min
> 
>      b) Windows:
> 
>         time: 56 min
> 
> These numbers are consistent, I repeated each attempt several times
> and got similar results.
> 
> Oh, and when I use Launchpad, I see many lines like this in .bzr.log:
> 
>    3705.648  25 bytes left on the HTTP socket
> 
> Looks like some debug message left behind.
> 
> 

One other thing to try. Skip the 'init-repo' step. So rather than doing:

bzr init-repo test
cd test
bzr branch $SOURCE

Just do:

bzr branch $SOURCE test

There is currently an issue where, if you do a smart 'bzr branch'
without a repository, it knows that it needs to fetch everything, and
issues a minimal "give me the whole ancestry from this tip" request. But
if you have a shared (but empty) repository, and do the same thing, it
starts a "find what revisions I don't have, and then give me all of
[these 100k revisions I've found]".
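
If it helps to compare the two variants side by side, a small Python
timing harness along these lines could be used. It is only a sketch: the
helper names timed_run and branch_commands are made up for illustration,
and it assumes 'bzr' is on the PATH and that each target directory does
not exist yet.

```python
import subprocess
import sys
import time


def timed_run(argv, cwd=None):
    """Run a command and return (elapsed wall-clock seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(argv, cwd=cwd)
    return time.perf_counter() - start, result.returncode


def branch_commands(source, target="test"):
    """Return both command sequences as lists of (argv, cwd) pairs.

    with_repo creates a shared repository first and branches inside it;
    without_repo branches directly into the target directory.
    """
    with_repo = [
        (["bzr", "init-repo", target], None),
        (["bzr", "branch", source], target),
    ]
    without_repo = [
        (["bzr", "branch", source, target], None),
    ]
    return with_repo, without_repo


# Example use (needs bzr and network access; use a different target
# directory for each variant so the runs do not collide):
#
#   with_repo, _ = branch_commands(source_url, "test-repo")
#   _, without_repo = branch_commands(source_url, "test-plain")
#   for steps in (with_repo, without_repo):
#       total = sum(timed_run(argv, cwd)[0] for argv, cwd in steps)
#       print("%.1f seconds" % total)
```

Only the wall-clock total is measured here; the transfer figures quoted
above would still have to be read from bzr's own output.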

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1RtAQACgkQJdeBCYSNAAPihgCgvy2hVgmC0Tu5yQYFSokjoXMr
nDIAoL3uYV6/52LacCi+pXcuXnMm/9ME
=8msV
-----END PGP SIGNATURE-----


