hpss a little too friendly

John Arbash Meinel john at arbash-meinel.com
Fri Jul 27 22:19:26 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've been noticing that doing "bzr commit" with a smart server master branch
spends a decent amount of time 'uploading data to master branch', even though
I'm on the local network.

So I played around with the '-Dhpss' flag, and one thing I found is that we are
issuing a 'hello' command a little too often.

Specifically, for a single 'commit', we say 'hello' 5 times.

Also, we seem to be searching around a bit more than we need to. For example:

hpss call:   ('hello',)
ssh implementation is OpenSSH
hpss result: ('ok', '2')
hpss call:   ('BzrDir.open',
'/srv/bzr/public/branches/bzr/0.19-dev/pyrex_knit_extract/')
hpss result: ('yes',)
hpss call:   ('BzrDir.open_branch',
'/srv/bzr/public/branches/bzr/0.19-dev/pyrex_knit_extract/')
hpss result: ('ok', '')
hpss call:   ('BzrDir.find_repository',
'/srv/bzr/public/branches/bzr/0.19-dev/pyrex_knit_extract/')
hpss result: ('ok', '../..', 'no', 'no')
hpss call:   ('hello',)
hpss result: ('ok', '2')
hpss call:   ('BzrDir.open', '/srv/bzr/public/branches/bzr/0.19-dev/')
hpss result: ('no',)
hpss call:   ('hello',)
hpss result: ('ok', '2')
hpss call:   ('BzrDir.open', '/srv/bzr/public/branches/bzr/')
hpss result: ('yes',)
hpss call:   ('BzrDir.find_repository', '/srv/bzr/public/branches/bzr/')
hpss result: ('ok', '', 'no', 'no')
hpss call:   ('Repository.is_shared', '/srv/bzr/public/branches/bzr/')
hpss result: ('yes',)
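
As a rough way to tally a trace like the one above, here is a small Python sketch that counts the method name on each "hpss call:" line. The TRACE string is just the excerpt quoted above; the helper name count_hpss_calls is made up for illustration and is not part of bzr.

```python
import re
from collections import Counter

# The -Dhpss excerpt quoted above, including the wrapped path lines.
TRACE = """\
hpss call:   ('hello',)
ssh implementation is OpenSSH
hpss result: ('ok', '2')
hpss call:   ('BzrDir.open',
'/srv/bzr/public/branches/bzr/0.19-dev/pyrex_knit_extract/')
hpss result: ('yes',)
hpss call:   ('BzrDir.open_branch',
'/srv/bzr/public/branches/bzr/0.19-dev/pyrex_knit_extract/')
hpss result: ('ok', '')
hpss call:   ('BzrDir.find_repository',
'/srv/bzr/public/branches/bzr/0.19-dev/pyrex_knit_extract/')
hpss result: ('ok', '../..', 'no', 'no')
hpss call:   ('hello',)
hpss result: ('ok', '2')
hpss call:   ('BzrDir.open', '/srv/bzr/public/branches/bzr/0.19-dev/')
hpss result: ('no',)
hpss call:   ('hello',)
hpss result: ('ok', '2')
hpss call:   ('BzrDir.open', '/srv/bzr/public/branches/bzr/')
hpss result: ('yes',)
hpss call:   ('BzrDir.find_repository', '/srv/bzr/public/branches/bzr/')
hpss result: ('ok', '', 'no', 'no')
hpss call:   ('Repository.is_shared', '/srv/bzr/public/branches/bzr/')
hpss result: ('yes',)
"""

# The method name is the first quoted string after "hpss call:", so the
# wrapped continuation lines holding the path can be ignored.
CALL_RE = re.compile(r"^hpss call:\s+\('([^']+)'", re.MULTILINE)


def count_hpss_calls(trace):
    """Return a Counter mapping each smart-server method to its call count."""
    return Counter(CALL_RE.findall(trace))


counts = count_hpss_calls(TRACE)
for method, n in counts.most_common():
    print(f"{n:2d}  {method}")
```

On this excerpt it reports 10 calls in total, 3 of them 'hello' and 3 of them 'BzrDir.open'.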


We say 'hello', and try to find a BzrDir where we expect one, and then ask it
where its repository is.
But then we seem to search for it manually, and call 'hello' before each probe.
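
One way the repeated 'hello' could be avoided is to remember the handshake result per connection. The following is only a sketch under that assumption: RecordingTransport and HelloOnceClient are hypothetical stand-ins, not the real bzrlib smart client classes, and the canned responses are simplified.

```python
class RecordingTransport:
    """Hypothetical stand-in for one smart-server connection.

    It records every request so we can see how often 'hello' is sent.
    """

    def __init__(self):
        self.requests = []

    def send(self, method, *args):
        self.requests.append((method,) + args)
        if method == 'hello':
            return ('ok', '2')
        # Simplified canned reply for every other method.
        return ('ok',)


class HelloOnceClient:
    """Hypothetical client that performs the 'hello' handshake only once."""

    def __init__(self, transport):
        self._transport = transport
        self._protocol_version = None  # unknown until the first call

    def call(self, method, *args):
        if self._protocol_version is None:
            # First request on this connection: negotiate and remember.
            response = self._transport.send('hello')
            self._protocol_version = response[1]
        return self._transport.send(method, *args)


transport = RecordingTransport()
client = HelloOnceClient(transport)
# The three probes from the trace, issued through one shared client.
for path in ('/srv/bzr/public/branches/bzr/0.19-dev/pyrex_knit_extract/',
             '/srv/bzr/public/branches/bzr/0.19-dev/',
             '/srv/bzr/public/branches/bzr/'):
    client.call('BzrDir.open', path)

hello_count = transport.requests.count(('hello',))
print(hello_count, len(transport.requests))
```

This only helps if the probes actually share one client object; if each probe builds a fresh client, the handshake is repeated regardless.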

Actually, looking closer, it seems that we place a "BzrDir.open_branch"
remote call (so that we can determine whether there is a branch reference on
the remote side).
Then we ignore the result and return RemoteBranch(self, self.find_repository()),
even though the open_branch() call already returns where the repository is
located.
The really confusing part is that
RemoteBzrDir.open_branch() *doesn't* actually call "BzrDir.open_branch()", and
sort of does all the work locally.

Is there a reason the code couldn't be:

def open_branch(self, _unsupported=False):
    path = self._path_for_remote_call(self._client)
    response = self._client.call('BzrDir.open_branch', path)
    if response[0] == 'ok':
        if response[1] == '':
            return RemoteBranch(self, self.find_repository())
...

and then we would want

def find_repository(self):
    path = self._path_for_remote_call(self._client)
    response = self._client.call('BzrDir.find_repository', path)
    if response[0] == 'ok':
        remote_path = response[1]
        format = RemoteRepositoryFormat()
        format.rich_root_data = (response[2] == 'yes')
        format.supports_tree_reference = (response[3] == 'yes')
        return RemoteRepository(self, format)

Is it just that 'self' is pointing to the wrong BzrDir at that point, so we
actually need to create a new RemoteBzrDir based on the remote path?
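
If a new RemoteBzrDir does need to be built from the remote path, the first step would be resolving the relative path in the 'BzrDir.find_repository' response against the path that was queried. A minimal sketch of just that step, using only the standard library; repository_path is a made-up helper name, not a bzrlib function:

```python
import posixpath


def repository_path(branch_path, relative_repo_path):
    """Resolve the relative repository path against the branch path.

    Both arguments are server-side POSIX paths, as seen in the trace.
    The result keeps a trailing slash to match the paths in the trace.
    """
    joined = posixpath.join(branch_path, relative_repo_path)
    normalized = posixpath.normpath(joined)
    if not normalized.endswith('/'):
        normalized += '/'
    return normalized


branch = '/srv/bzr/public/branches/bzr/0.19-dev/pyrex_knit_extract/'
# ('ok', '../..', 'no', 'no') was the response for this branch.
resolved = repository_path(branch, '../..')
print(resolved)
```

For the trace above this gives '/srv/bzr/public/branches/bzr/', which is the same directory the manual search eventually reaches after its extra 'hello' and 'BzrDir.open' probes.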

It seems it doesn't cost as much as I thought it did (it turns out that just
removing a bunch of unused plugins on my server dropped the time by almost
500ms).

I can say, though, that "time bzr up" with nothing to do is actually faster for
sftp:// than for bzr+ssh:// (2.8s versus 3.7s).

I'm guessing the bulk of that is because on my machine spawning the remote bzr
takes longer than would be ideal.


% time ssh juju echo hello
hello
ssh juju echo hello  0.03s user 0.01s system 15% cpu 0.239 total

% time ssh juju bzr rocks
It sure does!
ssh juju bzr rocks  0.03s user 0.01s system 4% cpu 0.806 total
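
As a back-of-the-envelope check on those numbers, here is the arithmetic as a short Python sketch. It assumes the 2.8s figure is sftp:// and the 3.7s figure is bzr+ssh:// (sftp being the faster one), and that a single timing of each command is representative, which it may not be.

```python
# Wall-clock totals from the two "time ssh juju ..." runs above.
ssh_echo_total = 0.239   # ssh connection + trivial command
ssh_bzr_total = 0.806    # ssh connection + starting bzr

# Time attributable to starting bzr on the remote side.
bzr_startup = ssh_bzr_total - ssh_echo_total

# Gap between the two "bzr up" timings (assumed: sftp 2.8s, bzr+ssh 3.7s).
up_gap = 3.7 - 2.8

print(f"remote bzr startup: {bzr_startup:.3f}s")
print(f"bzr+ssh vs sftp gap: {up_gap:.1f}s")
print(f"startup share of gap: {bzr_startup / up_gap:.0%}")
```

So roughly 0.57s of the roughly 0.9s gap would be remote startup, if bzr+ssh:// spawns the remote bzr once per "bzr up".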

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGqmFdJdeBCYSNAAMRApvJAJ0auUqhPymM8hI6C3SDmYGVpTIecwCgx7Z9
kiEYVnDo9X/nw/OPqp7mFFY=
=dH9V
-----END PGP SIGNATURE-----


