[RFC] Ways to make initial knit creation faster (both push and commit)
John Arbash Meinel
john at arbash-meinel.com
Sat Aug 19 00:10:14 BST 2006
John Arbash Meinel wrote:
> I spent a little time today doing an --lsprof of why 'bzr push' is so
> slow when creating a new remote branch. Here are a few notes I found:
>
...
> 1) Change the VersionedFile interface, so that the file knows whether or
> not it might need to create the parent directory if it cannot directly
> create the file. This saves us the extra get() request since we already
> know the file doesn't exist.
It turns out I only need to change the Knit interface, unless I want to
implement this for Weaves as well, which I don't think matters at this point.
>
> 2) Create a new API on Transport for something like put_non_atomic().
> Which assumes that the remote file doesn't exist, and opens the target
> file for writing() and blats the bytes. We can have warnings to only use
> this when you *know* that the target file doesn't exist. Because it
> isn't a strictly safe function to call.
See my patches about non_atomic_put()
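
To make the trade-off in (2) concrete, here is a minimal local-filesystem
sketch of the two styles of write. It is not the bzrlib Transport API and
not the code in the patch: the function names and the use of plain file
paths are assumptions made for illustration only.

```python
import os
import tempfile


def put_atomic(path, data):
    """Write to a temporary file, then rename it over the target.

    Readers never see a half-written file, but it costs an extra
    create plus a rename for every put.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def put_non_atomic(path, data):
    """Open the target directly and write the bytes.

    Only reasonable when the caller *knows* the target does not exist
    yet: a failure part-way through leaves a truncated file behind.
    """
    with open(path, "wb") as f:
        f.write(data)
```

Over a slow remote transport each extra step is a round trip, which is
why skipping the temporary file and rename is attractive for files that
are known to be new.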
>
> 3) Don't write out the knit header until you are ready to write data.
> Instead just keep a flag that the file needs to be created, and the
> header needs to be written when we start writing data.
>
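
A rough sketch of the idea in (3), assuming a simplified stand-in class.
LazyHeaderFile, its HEADER constant and the 'create' argument are all
hypothetical and are not the real KnitIndex or KnitData interfaces; the
point is only that the header and the first data go out in a single write.

```python
class LazyHeaderFile:
    """Hypothetical stand-in for a knit file; not bzrlib's KnitIndex."""

    HEADER = b"# hypothetical knit header\n"

    def __init__(self, path, create=False):
        self._path = path
        # Only remember that the file still has to be created.
        # Nothing is written (or even looked up) at this point.
        self._need_to_create = create

    def add_records(self, records):
        """Append records, prepending the header on a brand-new file."""
        chunks = []
        if self._need_to_create:
            chunks.append(self.HEADER)
        chunks.extend(records)
        # A new file is written in one go; an existing one is appended to.
        mode = "wb" if self._need_to_create else "ab"
        with open(self._path, mode) as f:
            f.write(b"".join(chunks))
        self._need_to_create = False
```

With this shape, creating an empty knit costs no I/O at all, and a knit
that does receive data pays for one write instead of a header write
followed by a separate append.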
Attached is a rollup bundle that does all of these things. It currently
has 2 failing tests, and that is because the hash prefix directories are
not getting created with the right permissions.
However, there are 2 huge wins:
test_commit_kernel_like_tree: 35.5s
and
SFTPSlowSocketBenchmark.test_initial_push: 118s
For comparison, the best time I saw on my machine for a bzr.dev commit
was 42s, and the time for 'test_initial_push' was 180s.
So this shaves another 15% or so off of a plain 'commit' time (42s down
to 35.5s), and close to 35% off of the time to push a new branch (180s
down to 118s).
So it needs some reviewing, and a little bit of love. We need to expand
KnitVersionedFile so that it takes directory modes as well as file
modes, and update non_atomic_put() so that it can take a directory mode.
It is possible that we don't want non_atomic_put() to create the
directories, in which case I need to rework things so that the KnitIndex
and KnitData classes use try/except routines in the case that we want to
create the parent directory.
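
A minimal sketch of that try/except alternative, again on the local
filesystem rather than through a Transport. The function name
put_creating_parent and its dir_mode parameter are hypothetical; the
real change would live in the KnitIndex and KnitData classes.

```python
import os


def put_creating_parent(path, data, dir_mode=0o755):
    """Write data to path, creating the parent directory only on failure.

    The common case (the directory already exists) costs a single
    write; the directory is created only after the first attempt fails.
    """
    try:
        with open(path, "wb") as f:
            f.write(data)
    except FileNotFoundError:
        # The parent directory is missing: create it and retry once.
        # Note that dir_mode is still subject to the process umask.
        os.makedirs(os.path.dirname(path), mode=dir_mode, exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
```

This keeps the directory handling out of the put call itself, at the
price of repeating the try/except wherever a knit file may be the first
one in its hash prefix directory.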
We haven't settled on whether to call it 'non_atomic_put()' or
'non_atomic_put_bytes()', etc.
It would be nice if people could test it out. Because it is a little
invasive and kind of a last-minute change, it may not make it into 0.10
(though we would have all of next week to really pound on it).
John
=:->
-------------- next part --------------
A non-text attachment was scrubbed...
Name: reduce-knit-churn.patch
Type: text/x-patch
Size: 107486 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060818/6d1beba4/attachment.bin