Smart network performance update

Andrew Bennetts andrew.bennetts at canonical.com
Tue Nov 18 02:07:40 GMT 2008


Just a quick update to let everyone know where smart server network
performance stands.

In short: It's getting better, but there's still a lot to do.

Where we're at
--------------

Various small improvements, mainly affecting push, have landed in the last
few releases.  One of the biggest improvements is that as of 1.9 autopacking
now occurs server-side, which saves considerable time and bandwidth.
_walk_to_common_revisions has also been improved, which was another large
issue identified at <http://bazaar-vcs.org/SmartPushAnalysis1.8>.  The
benchmarks done there should be repeated for 1.9 and bzr.dev, and we're
close to automating that with the usertest tool running on
<http://benchmark.bazaar-vcs.org/>.

There's still a huge amount of room for improvement.  Creating bzrdirs,
branches and repositories all fall back to VFS operations at the moment, and
so cause many roundtrips.  This is exacerbated by the fact that new
bzrdirs/branches/repositories are initialised unlocked, but are essentially
always locked immediately after creation, which causes more roundtrips.  The
Repository.get_parent_map RPC isn't used universally, because some codepaths
still fall back to VFS objects.
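
To make the roundtrip cost concrete, here is a toy model (not bzrlib code).
CountingMedium, the step names and the create_locked_branch verb are all
made up for illustration; the only point is that file-level creation pays
one roundtrip per step, plus extra ones to lock the freshly created object,
while a single high-level RPC pays one.

```python
class CountingMedium:
    """Hypothetical stand-in for a client connection that counts roundtrips."""

    def __init__(self):
        self.roundtrips = 0
        self.calls = []

    def call(self, verb, *args):
        # Every call models one request/response pair on the wire.
        self.roundtrips += 1
        self.calls.append((verb,) + args)


def create_branch_via_vfs(medium, path):
    """File-level creation: one roundtrip per step (step names are illustrative)."""
    medium.call("mkdir", path + "/control")
    medium.call("put", path + "/control/format")
    medium.call("mkdir", path + "/control/branch")
    medium.call("put", path + "/control/branch/format")
    # The new object starts out unlocked, so locking it costs extra roundtrips.
    medium.call("mkdir", path + "/control/branch/lock")
    medium.call("put", path + "/control/branch/lock/info")


def create_branch_via_rpc(medium, path):
    """A single hypothetical verb that creates the branch already locked."""
    medium.call("create_locked_branch", path)


vfs = CountingMedium()
create_branch_via_vfs(vfs, "proj")
rpc = CountingMedium()
create_branch_via_rpc(rpc, "proj")
print(vfs.roundtrips, rpc.roundtrips)  # prints: 6 1
```

On a high-latency link the difference is roughly the roundtrip count times
the latency, so the exact number of steps matters less than getting it down
to a small constant.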

Also, there's the occasional bug.  Stacked branches in particular seem to
have several serious performance issues with the smart server, but we think
we've identified them all now and they should be fixed in the coming days.


What to do next
---------------

Basically, the two big things to do are:

  * reduce the number of operations that need VFS fallbacks.  If we can
    always open a branch/repo without VFS methods, and create them without
    VFS methods, then I think we'll be a fair way towards achieving this.
    We can then iteratively fix the remaining causes of VFS fallbacks on the
    most common operations.  I'd especially like to at least make
    RemoteBranch work with no fallbacks, even if the repository does fall
    back.

  * implement a smart, streaming fetch.  This is necessary to avoid VFS for
    many RemoteRepository operations.  There's been a fair bit of discussion
    about this over the last year.  I want to have it done within the next 6
    months.
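
As a rough illustration of what "streaming" means in the second item, the
sketch below sends many records in one response body and decodes them
incrementally on the receiving side, instead of issuing one request per
file.  The length-prefixed framing and the encode_stream/decode_stream names
are assumptions made up for this example; this is not the smart protocol's
actual wire format.

```python
import io
import struct

HEADER = ">II"  # key length, data length (big-endian unsigned 32-bit)
HEADER_SIZE = struct.calcsize(HEADER)


def encode_stream(records):
    """Yield one framed chunk per (key, data) record."""
    for key, data in records:
        key_bytes = key.encode("utf-8")
        yield struct.pack(HEADER, len(key_bytes), len(data)) + key_bytes + data


def decode_stream(fileobj):
    """Yield (key, data) records as they are read from a file-like object."""
    while True:
        header = fileobj.read(HEADER_SIZE)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER_SIZE:
            raise ValueError("truncated header")
        key_len, data_len = struct.unpack(HEADER, header)
        key_bytes = fileobj.read(key_len)
        data = fileobj.read(data_len)
        if len(key_bytes) != key_len or len(data) != data_len:
            raise ValueError("truncated record")
        yield key_bytes.decode("utf-8"), data


# The whole fetch travels as one body; the receiver never asks per record.
records = [("rev-1", b"first text"), ("rev-2", b"second text")]
wire = b"".join(encode_stream(records))
for key, data in decode_stream(io.BytesIO(wire)):
    print(key, len(data))
```

Because both sides are generators, neither end needs to hold the whole
fetch in memory, which is the other half of the appeal.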

There are plenty of smaller but still worthwhile issues to tackle, e.g.
implementing get_parent_map RPCs for the inventories/texts/signatures
versionedfiles wouldn't be hard (although if we unify the keyspace this
might not matter?).  And we can implement some pack-specific RPCs (e.g. add
a new pack to pack_names, maybe check_references, etc) as a stopgap until
full streaming fetch is done.
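
For anyone wondering why a get_parent_map-style RPC helps, here is a small
sketch of an ancestry walk that asks for the parents of the whole frontier
in one call, rather than one key per call.  The get_parent_map function
below is a local stand-in that takes a dict as its first argument, and
walk_ancestry is a made-up helper; neither is the bzrlib implementation.

```python
def get_parent_map(graph, keys):
    """Stand-in for one RPC: return {key: parents} for the keys that exist."""
    return {key: graph[key] for key in keys if key in graph}


def walk_ancestry(graph, tips):
    """Return (ancestors found, number of calls made), batching per frontier."""
    calls = 0
    found = set()
    queried = set()
    pending = set(tips)
    while pending:
        calls += 1  # one roundtrip covers the entire frontier
        parent_map = get_parent_map(graph, pending)
        queried.update(pending)
        found.update(parent_map)
        next_pending = set()
        for parents in parent_map.values():
            next_pending.update(parents)
        # Never ask twice about a key, including ones that turned out absent.
        pending = next_pending - queried
    return found, calls


# A small merge-shaped graph: d merges b and c, which both descend from a.
graph = {"a": (), "b": ("a",), "c": ("a",), "d": ("b", "c")}
ancestors, calls = walk_ancestry(graph, ["d"])
print(sorted(ancestors), calls)  # prints: ['a', 'b', 'c', 'd'] 3
```

Four revisions cost three calls here; asking one key at a time would cost
four, and the gap grows with wider histories.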


What shouldn't happen
---------------------

Regressions.  If you find a new release of bzr makes network performance
worse, we want to know about it.  I'm pretty sure we're past the point of
intentionally making things a bit slower because it'll help make things
faster in the long run, so don't hesitate to file bugs if you see a
performance regression.

-Andrew.



