[RFC] Pack-specific smart server verbs: check_references and autopack
Andrew Bennetts
andrew at canonical.com
Thu Jun 19 05:58:47 BST 2008
This patch isn't quite ready for merging, but I would like to let people know
about it. It adds some infrastructure and smart server verbs to optimise some
pack repository operations, rather than treating all repositories more-or-less
independently of their on-disk format as we do now.
The reason for this is that when pushing, two of the major causes of slowdowns
are:
  * Packer._check_references. After uploading a pack of new revisions the
    packer checks that all compression parents not present in that pack exist
    elsewhere in the repository, in order to make sure all the file texts will
    be reconstructable. With VFS operations that tends to be many, many readvs
    of all the .tix files. This happens on every push, and is often a large
    fraction of the total push time, e.g. 25% depending on the exact details of
    the push.
  * RepositoryPackCollection.autopack. If certain thresholds are reached after
    adding a pack, an autopack will be triggered to combine several packs into a
    single pack. At the moment this involves pulling down all that data and
    then reuploading it. It only happens about one in every ten pushes (and
    with varying amounts of work to do), but it bites hard when it happens.
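To make the first bullet concrete, here is a minimal sketch of the kind of check involved. It is not the bzrlib code: the function name and the dict-based "indices" (text key mapped to its list of compression parents) are assumptions made purely for illustration.

```python
def missing_compression_parents(new_pack_index, other_indices):
    """Return compression parents the new pack needs but nothing provides.

    new_pack_index: dict of text key -> list of compression parent keys
                    (hypothetical stand-in for the new pack's text index).
    other_indices:  iterable of similar dicts for the packs already in
                    the repository.
    """
    # Every parent mentioned by a text in the new pack.
    referenced = set()
    for parents in new_pack_index.values():
        referenced.update(parents)
    # Parents that are not themselves stored in the new pack.
    external = referenced - set(new_pack_index)
    # Keys available elsewhere in the repository.
    available = set()
    for index in other_indices:
        available.update(index)
    # Anything left over could not be reconstructed.
    return external - available


new_pack = {"text-c": ["text-b"], "text-d": ["text-c"]}
existing = [{"text-a": [], "text-b": ["text-a"]}]
print(missing_compression_parents(new_pack, existing))
```

In this toy example the only external parent is "text-b", which an existing index provides, so nothing is reported missing. Over VFS, building the "available" set is the part that costs many readvs.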
A single “stream some revisions into the remote repo” verb will probably deal
with these intrinsically, but that hasn't been written yet. So as a cheap
interim measure I thought I'd try writing some pack-specific HPSS code to
perform those operations in a single round trip each.
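A toy model of why a dedicated verb helps, assuming one round trip per index read over VFS versus one round trip for the whole check as a verb. The class and method names below are hypothetical and are not the smart server API in the patch.

```python
class CountingServer:
    """Toy stand-in for a remote repository that counts round trips."""

    def __init__(self, indices):
        self.indices = indices  # index name -> set of keys it contains
        self.round_trips = 0

    def readv_index(self, name):
        # VFS-style access: each index read costs (at least) one round trip.
        self.round_trips += 1
        return self.indices[name]

    def check_references(self, keys):
        # Verb-style access: the server scans its own indices locally,
        # so the client pays for a single round trip.
        self.round_trips += 1
        present = set().union(*self.indices.values())
        return set(keys) - present


def check_via_vfs(server, index_names, keys):
    """Client-side check that reads every index over the wire."""
    missing = set(keys)
    for name in index_names:
        missing -= server.readv_index(name)
    return missing


indices = {"a.tix": {"k1", "k2"}, "b.tix": {"k3"}, "c.tix": set()}
vfs_server = CountingServer(indices)
verb_server = CountingServer(indices)
print(check_via_vfs(vfs_server, sorted(indices), ["k1", "k3", "k9"]),
      vfs_server.round_trips)
print(verb_server.check_references(["k1", "k3", "k9"]),
      verb_server.round_trips)
```

Both paths give the same answer; the difference is only in how many times the client has to wait on the network. Real readvs would typically need more than one round trip per index, so this understates the gap.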
The main part is adding a new InterRepository, InterPacktoRemotePack. It turned
out to be fairly easy to hook it all up.
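For readers unfamiliar with the pattern, the sketch below shows the general idea of choosing a format-specific inter-repository object when both ends match, and falling back to a generic one otherwise. All of the names here are made up for illustration; this is not the bzrlib InterRepository API or the code in the patch.

```python
class ToyRepo:
    """Hypothetical repository description: a format name and a remote flag."""

    def __init__(self, fmt, remote=False):
        self.fmt = fmt
        self.remote = remote


class GenericInter:
    """Fallback that works for any pair of repositories."""

    @staticmethod
    def is_compatible(source, target):
        return True

    def __init__(self, source, target):
        self.source = source
        self.target = target


class PackToRemotePackInter(GenericInter):
    """Optimised path: local pack repository pushing to a remote pack one."""

    @staticmethod
    def is_compatible(source, target):
        return (source.fmt == "pack" and not source.remote
                and target.fmt == "pack" and target.remote)


# Optimised implementations are tried in order before the generic fallback.
OPTIMISERS = [PackToRemotePackInter]


def get_inter(source, target):
    for cls in OPTIMISERS:
        if cls.is_compatible(source, target):
            return cls(source, target)
    return GenericInter(source, target)


inter = get_inter(ToyRepo("pack"), ToyRepo("pack", remote=True))
print(type(inter).__name__)
```

The optimised class is where pack-specific calls such as the two verbs would be issued; every other source/target combination keeps the existing behaviour.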
The Packer._check_references part seems to work well, but I haven't yet thought
carefully about whether it's always going to be better than the status quo, or
whether it might sometimes be much worse. I think it's probably a worthwhile
change, but
feedback on this idea is welcome (even if it's just to say “yes, that is totally
fine, please do that”).
I'm sure the autopack verb I've added is worthwhile, but it doesn't quite work
right yet and has no tests. It seems to do the right thing on the server, but
then leaves the client in a state where it has the wrong pack-names cached,
causing a traceback after most of the push is done. Probably that's not too
hard to fix, but I wonder if maybe I could hook it up in a better way.
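The stale-cache symptom can be reproduced in miniature. The sketch below is only a toy model under the assumption that the server rewrites its list of pack names during autopack while the client keeps its own cached copy; the classes are hypothetical, and refreshing the cache from the verb's response is just one possible approach, not necessarily how the patch should be fixed.

```python
class ToyServer:
    """Hypothetical server holding the authoritative list of pack names."""

    def __init__(self, names):
        self.pack_names = list(names)

    def autopack(self):
        # Combine every existing pack into one and return the new name list.
        combined = "combined-" + "-".join(sorted(self.pack_names))
        self.pack_names = [combined]
        return list(self.pack_names)


class ToyClient:
    """Hypothetical client that caches the server's pack names."""

    def __init__(self, server):
        self.server = server
        self.cached_names = list(server.pack_names)

    def autopack_without_refresh(self):
        # The server changes, but the cached names are left untouched.
        self.server.autopack()

    def autopack_with_refresh(self):
        # Use the verb's response to replace the cached names.
        self.cached_names = self.server.autopack()


stale = ToyClient(ToyServer(["p1", "p2"]))
stale.autopack_without_refresh()
print(stale.cached_names, stale.server.pack_names)

fresh = ToyClient(ToyServer(["p1", "p2"]))
fresh.autopack_with_refresh()
print(fresh.cached_names, fresh.server.pack_names)
```

In the first case the client still believes "p1" and "p2" exist, which is the sort of mismatch that would surface later in the push.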
Thoughts?
-Andrew.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inter-remote-pack.patch
Type: text/x-diff
Size: 58552 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080619/f64091cf/attachment-0001.bin