'bzr branch -r -XXX foo bar' slow for long histories

John Arbash Meinel john at arbash-meinel.com
Sun Mar 16 14:13:51 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've been playing around with the emacs repo a bit, and I found that doing:

 bzr branch -r -100 trunk test

is actually pretty slow. Looking closely, there seem to be two big problems.

1) Looking up '-r -100'. This is actually handled by either my
"revision_spec" patch or by Lukas's get_rev_id() patch. Basically, we
want to be able to look up '-r -100' without having to actually traverse
the whole history.

2) Branch.copy_content_into() calls
   _synchronize_history(destination, revision_id).

The problem is that _synchronize_history says:

  source_revno, source_revision_id = self.get_last_revision_info()
  if revision_id == source_revision_id or revision_id is None:
    # This is very fast
    revno = source_revno
  else:
    # with 90k mainline revs, this is *very* slow
    revno = len(list(self.repository.iter_reverse_...(revision_id)))

So I could change the code to assume that 'revision_id' is in the
mainline of this branch (probably the most likely case), in which case
we would do something like:

  cur_revno = source_revno
  for rev_id in self.repository.iter_reverse...(source_revision_id):
    if rev_id == revision_id:
      revno = cur_revno
      break
    cur_revno -= 1
  else: # Could not find in our branch's history, try again
    revno = len(list(self.repository.iter_reverse...(revision_id)))

This would mean that doing "bzr branch -r revid:XXX" is going to be
slower for non-mainline revisions, as it will first iterate the branch's
entire mainline history without finding a match, and then iterate the
complete history of the target revision as well.
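
To make the trade-off concrete, here is a rough standalone sketch of the
mainline-first lookup. It does not use bzrlib at all: the 'parents' dict
(revision id -> left-hand parent) and the names iter_reverse_history()
and find_revno() are made up for illustration, and the step counter is
only there to show how much history each case walks.

```python
def iter_reverse_history(parents, revision_id):
    """Yield revision_id, then each left-hand ancestor (hypothetical helper)."""
    while revision_id is not None:
        yield revision_id
        revision_id = parents.get(revision_id)


def find_revno(parents, tip_revno, tip_revision_id, revision_id):
    """Return (revno, steps) for revision_id.

    First walk back from the branch tip, assuming revision_id is on the
    mainline. If it is not found, fall back to counting the full
    left-hand history of revision_id itself.
    """
    steps = 0
    cur_revno = tip_revno
    for rev_id in iter_reverse_history(parents, tip_revision_id):
        steps += 1
        if rev_id == revision_id:
            return cur_revno, steps
        cur_revno -= 1
    # Not on the mainline: pay for a second, complete walk.
    revno = 0
    for _ in iter_reverse_history(parents, revision_id):
        steps += 1
        revno += 1
    return revno, steps


# Toy history: mainline r1 <- r2 <- r3 <- r4, plus x1 branching off r2.
parents = {"r1": None, "r2": "r1", "r3": "r2", "r4": "r3", "x1": "r2"}
near_tip = find_revno(parents, 4, "r4", "r3")
off_mainline = find_revno(parents, 4, "r4", "x1")
```

A revision near the tip is found after a couple of steps, while the
non-mainline one pays for the whole mainline plus its own history.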

It also has the problem that our KnitPackRepositories don't cache any
parent information (I have a hack for maintaining the
CachingParentsProvider), which means it has to go back to the Index each
time.

Another possibility would be to pass in the revno whenever we already
have it from the revspec lookup.

Thoughts?

Note, we *could* cache the revno for every revision as we commit it,
because it can be trivially derived from its left-hand parent.
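
A minimal sketch of what that could look like, assuming we simply keep a
dict from revision id to revno. RevnoCache and record_commit() are
hypothetical names, not anything that exists in bzrlib, and where such a
cache would actually be stored is left open.

```python
class RevnoCache:
    """Hypothetical cache of revision id -> revno, filled in at commit time."""

    def __init__(self):
        self._revnos = {}

    def record_commit(self, revision_id, left_hand_parent=None):
        """Store and return the revno of a newly committed revision.

        The revno is the left-hand parent's revno plus one, or 1 for a
        revision with no parent, so no history traversal is needed.
        """
        if left_hand_parent is None:
            revno = 1
        else:
            revno = self._revnos[left_hand_parent] + 1
        self._revnos[revision_id] = revno
        return revno

    def revno(self, revision_id):
        """Look up a cached revno; raises KeyError if it was never recorded."""
        return self._revnos[revision_id]


cache = RevnoCache()
cache.record_commit("r1")
cache.record_commit("r2", left_hand_parent="r1")
cache.record_commit("r3", left_hand_parent="r2")
```

With something like this, the lookup in _synchronize_history becomes a
single dictionary access instead of a walk over the mainline.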

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFH3SsfJdeBCYSNAAMRAsWyAKDIxsJ1TH5h+CpljzQ6spFRvgH/vgCgtFdL
70nn3CRSz8W5HeITQ46mW+M=
=xAmz
-----END PGP SIGNATURE-----


