[RFC] Stop using hpss 'Repository.get_revision_graph()'

John Arbash Meinel john at arbash-meinel.com
Tue Jul 31 22:14:53 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In doing some of the debug logging changes for hpss, I came across an
interesting performance issue.

Specifically, during 'bzr commit' with a bound branch, we read all of
revisions.kndx (1,121,084 bytes, 23ms) and inventory.kndx (1,175,920 bytes, 25ms).

We then end up making a Repository.get_revision_graph RPC (1,567,374 bytes,
1000ms). We actually do this twice (the second call takes 904ms).

Now, my server is on the slower side (700MHz PIII), so these times may be a
bit higher than most people see.

Anyway, if we just eliminate the specialized RemoteRepo.get_revision_graph()
function, we would save about 3MB of data transfer (the two RPC responses) when
committing on a bound branch. (For me, it would also save roughly 2s of commit
time.)
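
To make the arithmetic behind those savings explicit, here is a small sketch
that just adds up the figures quoted above. It assumes the second RPC call
returns the same number of bytes as the first (only its time was reported),
and the variable names are illustrative only.

```python
# Figures copied from the measurements quoted in this message.
KNDX_READS = {
    "revisions.kndx": (1_121_084, 23),   # (bytes, milliseconds)
    "inventory.kndx": (1_175_920, 25),
}
RPC_BYTES = 1_567_374          # size of one get_revision_graph response
RPC_TIMES_MS = [1000, 904]     # the call is made twice

# Totals for the index reads that commit has to do anyway.
kndx_bytes = sum(size for size, _ in KNDX_READS.values())
kndx_ms = sum(ms for _, ms in KNDX_READS.values())

# Totals for the RPC calls that could be dropped.
# Assumption: the second response is the same size as the first.
rpc_bytes = RPC_BYTES * len(RPC_TIMES_MS)
rpc_ms = sum(RPC_TIMES_MS)

print(f".kndx reads: {kndx_bytes:,} bytes in {kndx_ms} ms")
print(f"RPC calls:   {rpc_bytes:,} bytes in {rpc_ms} ms")
```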

I realize there seem to be serious issues (why are we calling it twice with the
*same* revision id?). And eventually we want to not have to read the remote
.kndx files, but we *know* that we have to for now. (Also, notice that fewer
bytes are transferred for the .kndx read: 1,121,084 bytes for revisions.kndx
versus 1,567,374 bytes for one RPC response, about 28% less. I assume this is
because the index dictionary-compresses the ancestry, whereas the RPC call
streams everything back raw.)
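
To illustrate why dictionary-compressing parent references could account for
a size difference like this, here is a toy sketch. It is not the real .kndx
format or the hpss wire format: both encoders, the revision ids, and the
linear history are made up for the comparison.

```python
def encode_raw(graph):
    """One line per revision, with every parent id spelled out in full."""
    return "\n".join(f"{rev} {' '.join(parents)}" for rev, parents in graph)


def encode_indexed(graph):
    """Same data, but parents are written as the line number of an
    earlier entry. Requires parents to appear before their children."""
    index = {}
    lines = []
    for rev, parents in graph:
        refs = " ".join(str(index[p]) for p in parents)
        index[rev] = len(index)
        lines.append(f"{rev} {refs}")
    return "\n".join(lines)


# Hypothetical linear history of 1000 revisions with long ids.
revs = [f"someone@example.com-20070731221453-{i:016d}" for i in range(1000)]
graph = [(rev, [revs[i - 1]] if i else []) for i, rev in enumerate(revs)]

raw_size = len(encode_raw(graph))
indexed_size = len(encode_indexed(graph))
print(f"raw: {raw_size:,} bytes, indexed: {indexed_size:,} bytes")
```

With long revision ids, most of the raw encoding is repeated parent ids, so
replacing them with short integers removes a large share of the bytes.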


Ultimately, I think we want to be using the fancy Graph APIs and never call
Repo.get_revision_graph at all, or at least not require the entire revision
graph when we are only using a small portion.
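
As a rough illustration of the "small portion" idea, here is a minimal sketch
of walking only the part of the ancestry that is needed, asking for parents
one revision at a time. This is not bzrlib's Graph API: ancestors_until,
get_parents, and the toy history are hypothetical names used only for the
example.

```python
from collections import deque


def ancestors_until(get_parents, start, stop_at):
    """Walk parents breadth-first from `start`, without expanding any
    revision listed in `stop_at` (e.g. revisions already known locally)."""
    seen = set()
    pending = deque([start])
    while pending:
        rev = pending.popleft()
        if rev in seen or rev in stop_at:
            continue
        seen.add(rev)
        pending.extend(get_parents(rev))
    return seen


# Hypothetical history: rev-4 is a merge of rev-3 and rev-2b.
TOY_PARENTS = {
    "rev-5": ["rev-4"],
    "rev-4": ["rev-3", "rev-2b"],
    "rev-3": ["rev-2"],
    "rev-2b": ["rev-2"],
    "rev-2": ["rev-1"],
    "rev-1": [],
}

lookups = []  # records which revisions we actually had to ask about


def get_parents(rev):
    lookups.append(rev)
    return TOY_PARENTS[rev]


new_revs = ancestors_until(get_parents, "rev-5", stop_at={"rev-2"})
print(sorted(new_revs), "lookups:", len(lookups))
```

The walk stops at the known revision, so older history (rev-1 here) is never
requested at all.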

In the short term, though, just removing this code means commit is quite a bit
faster for me.

Thoughts?

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGr6ZMJdeBCYSNAAMRAgV6AKCj2dQNuUujPyF3/ZcK/VtWwMvTvQCglEta
VEyXkp8yaQq1PKKjT+BfBtU=
=Lr7u
-----END PGP SIGNATURE-----


