notes/plan for hpss performance work

John Arbash Meinel john at arbash-meinel.com
Thu May 1 18:29:01 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
| Andrew and I have been studying hpss performance using -Dhpss and the
| netem hack to simulate a 1000ms rtt network.  Andrew has the full data
| and more details but here are some things we've noticed.
|
|  * Generally, RPCs do take around 1.003s, precisely what we'd expect
| for this network.  So this shows we're not suffering from bad
| buffering at any level: we send one packet, and get one packet back.
| (We are not testing with ssh; it's possible problems could occur
| there.)   There is one notable exception: writing a stream to a file
| using the vfs put or append often takes 3s (3 round trips), even when
| it's very small, probably either we really do accidentally have round
| trips within the rpc, or something we're doing is causing tcp to
| stall.  The tcp window-opening behaviour is very noticeable as we send
| a large pack - it speeds up over time - but there's nothing much we
| can do about this.
|
|  * Format 5 branches (with a revision-history file) do a lot of work
| to update that file, reading the graph to update it.  I think we
| should just start giving suggestions to upgrade these  old formats.

If we really cared, we could use the existing file as a shortcut rather than
iterating the whole graph. As it stands, set_last_revision_info() is slow for
Branch5, but much faster than set_last_revision_history() for Branch6.

It wouldn't be hard to make B5 faster; it just didn't seem worth the effort. I'm
fine with prompting for upgrade. (Though it would have hidden the bug with
RemoteRepo.get_parent_map() that we just found.)
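To illustrate the shortcut idea, here is a minimal sketch (the helper name and
the plain-dict parent map are invented for illustration; this is not bzrlib's
actual API): instead of regenerating the whole revision-history list by walking
the graph, append the new mainline revisions to the existing list, falling back
to the slow path if the new tip doesn't descend from the old one.

```python
# Hypothetical sketch: extend a Branch5-style revision-history list by
# appending new mainline revisions, instead of rebuilding the whole list
# by iterating the revision graph from scratch.

def extend_revision_history(old_history, new_tip, parent_map):
    """Append the revisions between old_history[-1] and new_tip.

    old_history: list of revision ids, oldest first (the existing file).
    parent_map:  dict mapping revision id -> list of parent ids.
    Returns the new history list, or None if new_tip does not descend
    from the old tip (caller must fall back to the full graph walk).
    """
    old_tip = old_history[-1] if old_history else None
    # Walk left-hand (mainline) parents back from new_tip until we hit
    # the old tip, collecting the new mainline revisions on the way.
    suffix = []
    rev = new_tip
    while rev != old_tip:
        if rev is None or rev not in parent_map:
            return None  # diverged or unknown history: need the slow path
        suffix.append(rev)
        parents = parent_map[rev]
        rev = parents[0] if parents else None
    return old_history + list(reversed(suffix))
```

For a linear graph r1 <- r2 <- r3, extending ['r1'] to tip 'r3' yields
['r1', 'r2', 'r3'] without touching the rest of the graph.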

|
|  * Pushing up pack data is actually pretty fast, because we write
| generally only one file and the 1MB buffer keeps even a fat/long pipe
| quite full; however creating the indexes takes one round trip each,
| and writing the pack names requires taking a lock (see below).  When
| pushing a large new branch, performance is quite good to the network
| maximum (imposed by the tcp maximum window size).  When there's less
| real data to send, the unnecessary roundtrips really hurt.

It seems like with HPSS you could send multiple indexes in one trip, and
certainly taking a lock should cost only one round trip to take it and one to
unlock. And even then, you only need to do that once the pack has been pushed
and you want to update the pack-names file.

Certainly updating the pack-names file could be an RPC, though I think we would
rather have the pack creation, etc., all done on the remote side in response to
a "push_data_stream()" call.


|
|  * We always create new components empty and unlocked, and then later
| lock them and put content into them.  This is a straightforward api
| and does mean that if the initial push fails you will generally get a
| correct though empty branch.  We should possibly create the new object
| locked, and avoid reading things back if we know it's empty.  A single
| rpc to create bzrdirs would help.

Creating objects already locked would simplify things quite a bit. I've run into
this in the past; it even shows up in "bzr init" on the local filesystem,
especially on Windows, where our file-locking protocol is, for some reason, very
expensive.
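The shape of a "born locked" API might look like the following sketch (the
`Store` class and `create_locked` constructor are hypothetical, purely to show
the call pattern): a single constructor returns the object already
write-locked, so no separate create/lock round trips are needed before
populating it.

```python
# Hypothetical "create locked" API: create() + lock_write() + populate +
# unlock() collapses into create_locked() + populate + unlock().

class Store:
    def __init__(self):
        self.locked = False
        self.content = {}
        self.ops = []                       # record of remote operations

    @classmethod
    def create_locked(cls):
        store = cls()
        store.locked = True                 # born holding the write lock:
        store.ops.append('create_locked')   # no separate lock round trip
        return store

    def put(self, name, data):
        assert self.locked, "must hold the write lock"
        self.content[name] = data
        self.ops.append('put')

    def unlock(self):
        self.locked = False
        self.ops.append('unlock')
```

The operation log shows the whole create-and-populate sequence as three
remote operations rather than the create/lock/put/unlock four (or more, once
lockdir round trips are counted).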

|
|  * Perhaps surprisingly, graph operations are not showing up as
| dominant, at least in the cases we did here: pushing just one
| revision, and pushing all of history.  If you have two branches with
| substantial different history on both sides it may be more important,
| but there is plenty of low fruit before getting into that.  We spend a
| lot of time in graph operations for diverged branches but it seems
| totally unnecessary, as we should already know they're diverged.

I'm not sure how we know they are diverged rather than just merged at some point
we haven't seen yet. That is certainly what things like "find_unique_ancestors"
or "find_differences" are all about.

On push, you could try "find_unique_ancestors(remote, [local])" and if it
returns anything but the empty set, then you have divergence.

|
|  * Taking and releasing a lockdir at the vfs layer takes about 9
| roundtrips, which is pretty high, and we can get some big wins by
| avoiding it - either by using a lock/unlock rpc, or by making the lock
| implicit in rpcs like "add a pack" or "set last revision".  It looks
| like we need to clean up LockableFiles position in locking, and then
| allow both the Remote and vfs objects to share a single .lock, which
| will work over rpc.  As a first step, we could make sure to take the
| lock over rpc, then just allow the vfs object to at least observe it's
| already held.
|
|  * Repacking is currently done over vfs and pretty slow when it
| happens; we haven't measured this yet.  We probably want it to either
| be totally automatic on the server, or perhaps to have it happen on
| request from the client.

I definitely think we want the server to autopack. It would be a huge win for
bzr+ssh over sftp. And further, it already has *all* of the logic to do it, *on
the server*.

If we made "set_pack_names()" an RPC that included automatic locking, it could
also do automatic repacking, considering that is where repacking happens today
anyway.

I certainly think the end goal is a "push_data_stream", but that requires much
more development time than "set_pack_names".
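A server-side handler along those lines might look like this sketch (the class,
method, and the autopack threshold are all invented for illustration; the real
pack logic lives in bzrlib's repository code): one RPC that locks implicitly,
updates the pack names, autopacks if needed, and unlocks.

```python
# Hypothetical server-side set_pack_names handler: lock, update the
# pack-names list, autopack when enough small packs accumulate, unlock --
# all within a single request, using logic the server already has.

AUTOPACK_THRESHOLD = 10   # illustrative: repack at this many packs

class PackRepoServer:
    def __init__(self):
        self.pack_names = []
        self.locked = False
        self.autopacks = 0

    def set_pack_names(self, new_names):
        """Single RPC: lock, update pack-names, maybe autopack, unlock."""
        self.locked = True
        try:
            self.pack_names = list(new_names)
            if len(self.pack_names) >= AUTOPACK_THRESHOLD:
                self._autopack()
        finally:
            self.locked = False

    def _autopack(self):
        # Combine the packs into one; the server already holds the data,
        # so no pack bytes cross the network.
        self.pack_names = ['combined-pack']
        self.autopacks += 1
```

The client pays one round trip regardless of whether an autopack happened,
which is exactly why doing it server-side is such a win over sftp.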

|
|  * There are some in-detail inefficiencies, where the trace makes it
| clear that we're reading back data that we either should already know
| or don't need to use, or where we're re-opening objects that we should
| already have open.  (For example push to a diverged branch keeps
| running long after it should know that it won't succeed.)
|
|  * -Dhpss is really good; we should look out for more opportunities to
| add tools that make performance easier to improve.  I hope we can add
| some tests that prohibit silly behaviour, either during the test suite
| or when turned on by a -D flag.  At the moment there is an option that
| bans all vfs operations, but we could change that to a filter of
| disallowed operations, so we can trap access to locks over vfs for
| instance.

Interesting. Don't forget about "-Devil" which is also meant to catch some of
these things. You could make vfs locking an 'evil' operation.
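One way the "filter of disallowed operations" could look is sketched below (the
decorator, exception, and deny-list are hypothetical, not the existing -Devil
machinery): each vfs call is checked against a configurable set, so lock
operations over vfs can be trapped while plain reads stay allowed.

```python
# Sketch of a deny-list for vfs operations: rather than a flag banning
# all vfs calls, each call is checked against a configurable set of
# forbidden operation names.

DISALLOWED_VFS_OPS = {'lock_write', 'lock_read'}

class ForbiddenVfsOperation(Exception):
    """Raised when a banned vfs operation is attempted."""

def vfs_op(name):
    """Decorator marking a method as the named vfs operation."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            if name in DISALLOWED_VFS_OPS:
                raise ForbiddenVfsOperation(name)
            return fn(*args, **kwargs)
        return wrapper
    return decorator

class VfsTransport:
    @vfs_op('get')
    def get(self, path):
        return b'contents of ' + path.encode()

    @vfs_op('lock_write')
    def lock_write(self, path):
        return 'token'
```

Run with locks in the deny-list, `get()` succeeds but `lock_write()` raises,
which is just the behaviour a test suite (or a -D flag) would want to assert.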

|
| Andrew and I are going to first look at the particular case of pushing
| just one new revision, which on this really slow network takes 2m, and
| specifically starting with the way it locks and unlocks the Branch
| repeatedly.
|

John
=:->