Bazaar NG performance on large repositories

John Arbash Meinel john at arbash-meinel.com
Mon Oct 30 17:58:53 GMT 2006


As a small thing, the official name for the project is now just Bazaar.
We used the "NG" (Next Generation) suffix for a while, but now we have
become the real thing :)

As an aside, what platform are you running on?

Nicholas Allen wrote:
> Hi,
> 
> I thought that you might be interested in some performance tests that I
> performed using bzr after I converted a small part of our svn repository
> to a bzr one. The bzr repository contains about 20,000 revisions and is
> about 500 MB in size. Some operations are fast while others seem to take
> rather a long time. I created 2 shared repositories - one with trees
> and one without.

^- How many files are in the working tree? 20K revisions doesn't seem
like too many, but you may be hitting a bug like:
https://launchpad.net/products/bzr/+bug/68512

If you have a huge tree, we store too many full copies of the inventory
text. The data format doesn't care, so we can simply change the creation
code to save some space here. (In other words, we can fix this in a
completely compatible way.)
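
As a rough illustration of why the number of full copies matters, here
is a toy sketch. It assumes a scheme where each inventory version is
stored either as a full text or as the lines that differ from its
predecessor. The function name, the "fulltext_every" parameter, and the
data are hypothetical and are not bzr's actual storage code.

```python
def stored_lines(versions, fulltext_every):
    """Count lines stored when every `fulltext_every`-th version is a
    full copy and the others keep only lines absent from the previous
    version (a simplified delta that ignores removed lines)."""
    total = 0
    for i, lines in enumerate(versions):
        if i % fulltext_every == 0:
            total += len(lines)  # full copy of the inventory text
        else:
            prev = set(versions[i - 1])
            total += sum(1 for line in lines if line not in prev)
    return total


# Toy inventory: 1000 entries, and each new revision changes one entry.
base = [f"file{i} v0" for i in range(1000)]
versions = [base]
for rev in range(1, 20):
    nxt = list(versions[-1])
    nxt[rev] = f"file{rev} v{rev}"
    versions.append(nxt)

print(stored_lines(versions, fulltext_every=2))   # frequent full copies
print(stored_lines(versions, fulltext_every=20))  # one full copy
```

Both layouts describe the same 20 versions, so a reader of the data
would not need to change. Only the code that decides when to write a
full copy would.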

> 
> bzr branch trunk my-branch
> without trees: ~3 seconds
> with trees: 3 mins 58 seconds
> 
> So creating the working tree seems to take quite a long while. Branching
> in a shared repository without trees is lightning fast though!

^- I'm a little surprised that it takes that long to create the working
tree, but yes, it is something that needs to be optimized.

> 
> bzr log some-file
> This command took a very long time. In fact, I gave up waiting for it to
> complete. No output was seen on the terminal at all - even after 5
> minutes. I think in its current state this would be completely unusable
> for us. I hope that bzr will see some performance improvements here.
> 

There is *huge* room for improvement here, related to stuff like:
https://features.launchpad.net/products/bzr/+spec/per-file-log-output

Basically, we already store a graph of which revisions have modified the
file. At present, we are using old code which walks the global graph and
extracts each version to see whether the file was modified. That is far
from optimal, but we have all the data available to make it very fast.
We just need to start using it.
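
To make the difference concrete, here is a minimal sketch with toy data.
The dictionaries and function names are hypothetical and are not
bzrlib's real API. The slow version visits every revision and compares
file texts. The fast version just reads the per-file graph.

```python
# Global revision graph: revision id -> parent revision ids.
global_graph = {
    "r1": [],
    "r2": ["r1"],
    "r3": ["r2"],
    "r4": ["r3"],
}

# What "extracting a version" would give us: file texts per revision.
file_texts = {
    "r1": {"a.txt": "one", "b.txt": "x"},
    "r2": {"a.txt": "one", "b.txt": "y"},
    "r3": {"a.txt": "two", "b.txt": "y"},
    "r4": {"a.txt": "two", "b.txt": "z"},
}

# Per-file graph: only the revisions that modified each file.
per_file_graph = {
    "a.txt": {"r1": [], "r3": ["r1"]},
    "b.txt": {"r1": [], "r2": ["r1"], "r4": ["r2"]},
}


def log_file_slow(path):
    """Walk the whole global graph, extracting and comparing texts."""
    result = []
    for rev, parents in global_graph.items():
        text = file_texts[rev].get(path)
        if text is None:
            continue
        parent_texts = [file_texts[p].get(path) for p in parents]
        # Introduced here, or different from every parent's text.
        if not parents or all(text != pt for pt in parent_texts):
            result.append(rev)
    return result


def log_file_fast(path):
    """Read the answer straight from the per-file graph."""
    return list(per_file_graph[path])


print(log_file_slow("a.txt"))
print(log_file_fast("a.txt"))
```

The slow path does work proportional to the total number of revisions
(20,000 here), while the fast path only touches the revisions that
changed the file.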


> bzr log
> This command was fast and output was almost instant on the terminal.
> 
> bzr blame some-file
> This command was very fast - no problems here!
> 
> bzr status
> This command took about 2 minutes to complete. It also claimed that all
> files were modified but I had only modified one file.
> 
> bzr diff
> Took about 30 seconds to complete. It claimed all files had
> modifications "(properties changed)" but there were no code changes
> except the file I modified. What could these property changes be? Is
> this a possible bug?

^- It sounds like something weird is happening with the execute bit,
which is the only permission that we check. It might be a win32 issue,
because win32 doesn't have the ability to record an execute bit, so we
fake it.
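
For illustration only, one plausible way to fake the bit looks like the
sketch below. The function name and the "recorded" and "supports_exec"
parameters are hypothetical, and this is not necessarily what bzr does
internally. The idea is to trust the filesystem where it can record an
execute bit and otherwise fall back to the last value stored in the
inventory, so the file never appears to change.

```python
import os
import stat
import sys


def executable_bit(path, recorded, supports_exec=(sys.platform != "win32")):
    """Return the execute bit to compare against the recorded value.

    `recorded` is the value stored for this file at the last commit
    (an assumption of this sketch).
    """
    if supports_exec:
        # The filesystem tracks the bit, so read it from disk.
        return bool(os.stat(path).st_mode & stat.S_IXUSR)
    # No usable execute bit on this platform: reuse the recorded value.
    return recorded
```

If the fallback returned a fixed value instead of the recorded one,
every file whose recorded bit differed would show up as "(properties
changed)", which would match the symptom described above.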

> 
> bzr ci -m "Some message"
> Due to the property changes every file needed to be checked in which
> took about 1 minute.
> After checking in the property changes and modifying 1 file again this
> only took about 10 seconds.
> 
> bzr push
> This command took about a minute even though I only added one line to
> one file and checked in. This was only a problem when pushing to a
> branch that had trees so this was probably caused by the need to update
> the working tree which takes a long time.

Were you pushing over sftp, or to another branch on your working
machine? (Or to bzr+ssh:// on another machine?)

> 
> So I think before we can consider moving to bzr (which I hope we will do
> at some point) building the working tree would need to be faster and the
> log of a single file would have to be a lot faster. I know there has
> been a lot of improvement in the last few releases so hopefully these
> problems will not exist for much longer.
> 
> Thanks,
> 
> Nicholas Allen

John
=:->

