notes/plan for hpss performance work
John Arbash Meinel
john at arbash-meinel.com
Mon May 5 16:03:07 BST 2008
Andrew Bennetts wrote:
| Martin Pool wrote:
|> Andrew and I have been studying hpss performance using -Dhpss and the
|> netem hack to simulate a 1000ms rtt network. Andrew has the full data
|> and more details but here are some things we've noticed.
| I've dumped the notes and log files I had on my laptop to the wiki:
| The notes are rather terse in places, but should be intelligible. For most of
| the cases I've translated the ~/.bzr.log into a more high-level timeline of the
| network conversation. This tends to make repeated/unnecessary operations more
| obvious, as well as giving a good overview of where the time is going.
| There are some obviously strange things going on, e.g. a push that fails due to
| the branches having diverged should fail much faster than it does: it knows the
| remote tip revision at 17s, but takes 55s to run because it does odd things like
| pushing up a new pack before raising the error.
Does the pack end up included in the remote repository? Or is it aborted?
One of the nice things about "bzr push" with knit repositories is that even if
the push fails because the branches have diverged, your changes still get copied.
So if the changes are being copied (and preserved) on the remote side it isn't
as clear whether it is a strict failure.
| People should feel free to take a look at the data and tell us about any insights
| you have. Wikis are good for writeups of known facts, but less good for
| discussions, so I suggest replying to this thread if you want to dissect some
| results. Feel free to add more relevant experiments to the wiki (but also mail
| the list to let people know about them).
So here are a few things...
1) When 'bzr push' gets to "set_revision_history()", it does all of the graph
searching on the remote end, even though we obviously have all of the
information locally, since we are actively copying it across.
I don't know if it is worth special casing "set_revision_history()" as it is
only for format5 branches, and is something we are getting away from. However, I
think it is more of a general issue we need to watch out for with 'bzr push'. We
will often have the information locally, and ignore that fact to retrieve it
from the remote. I believe you saw that specific behaviour with a 'readv()' of
the remote .rix file before it would create the branch.
2) For push to an existing branch, this is what sticks out to me:
0-21s: opening existing branch, reading revision-history
21s to open a branch is a lot of round trips (roughly 21 at the simulated
1000ms rtt, if latency dominates). B6 is going to be a little bit better here,
but in general I feel our "Branch.open()" function spins a little bit too much.
Over bzr+ssh:// I believe it still does the probe for the remote repository
from the local side, which is silly.
Actually, IIRC, it probes for the remote repository 2-3 times, in some really
weird ways. Like it calls an RPC which sounds like it would find everything and
return it, but it actually does something else.
As for 'put last revision file' taking 3s, is it trying to do it atomically from
the client, thus having a put to a temp name, and then rename into place? Or is
the bzr:// 'put' api smart enough to have the server do all of that work?
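To make the question about the atomic put concrete, here is a minimal sketch of the "put to a temp name, then rename into place" pattern. It is plain Python against the local filesystem, not bzrlib code; atomic_put and the file names are hypothetical. If the client drives these steps over a dumb transport, the upload and the rename each cost at least one round trip, whereas a server running something like this helper would need only one request.

```python
import os
import tempfile


def atomic_put(path, data):
    """Replace the file at path with data without exposing a partial write.

    Hypothetical helper (not a bzrlib API): write to a temporary file in
    the same directory, then rename it over the target.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_name = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # os.replace is atomic on POSIX when both names are on the same
        # filesystem, which is why the temp file lives next to the target.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


# Usage: update a stand-in "last-revision" file twice.
with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, "last-revision")
    atomic_put(target, b"1 rev-1\n")
    atomic_put(target, b"2 rev-2\n")
    with open(target, "rb") as f:
        print(f.read())
```

Readers of the target see either the old contents or the new contents, never a half-written file.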
3) For case #5, I believe we do the work to detect diverged branches, and raise
the exception, and then catch it and force the push. I might be wrong, though.
4) Similar to #1, we probably do too many get_parent_map() calls on the remote
host, for revisions that we already have locally. The APIs are pretty layered,
so it might be hard to inject another source of graph info.
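As a rough illustration of what "injecting another source of graph info" could look like, here is a sketch of a provider that answers get_parent_map() from local data first and only asks the remote side for keys it could not resolve. Both classes are hypothetical stand-ins written for this example; only the get_parent_map() name comes from the discussion above, and the real layering in bzrlib may make this harder than it looks.

```python
class DictParentsProvider:
    """Stand-in for a repository: answers parent lookups from a dict."""

    def __init__(self, parent_map):
        self._parent_map = dict(parent_map)
        self.calls = 0  # counts lookups, to stand in for round trips

    def get_parent_map(self, keys):
        self.calls += 1
        return {k: self._parent_map[k] for k in keys if k in self._parent_map}


class LocalFirstParentsProvider:
    """Ask the local provider first, then the remote one for the rest."""

    def __init__(self, local, remote):
        self._local = local
        self._remote = remote

    def get_parent_map(self, keys):
        keys = set(keys)
        found = dict(self._local.get_parent_map(keys))
        missing = keys - set(found)
        if missing:
            # Only the unresolved keys cost a remote call.
            found.update(self._remote.get_parent_map(missing))
        return found


# Usage: the remote is not contacted when everything is known locally.
local = DictParentsProvider({"rev-2": ("rev-1",), "rev-1": ()})
remote = DictParentsProvider(
    {"rev-3": ("rev-2",), "rev-2": ("rev-1",), "rev-1": ()})
graph = LocalFirstParentsProvider(local, remote)

graph.get_parent_map(["rev-1", "rev-2"])
print(remote.calls)  # 0
graph.get_parent_map(["rev-2", "rev-3"])
print(remote.calls)  # 1
```

This assumes the local and remote answers agree for any revision both sides hold, which should be true for immutable revision parents.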