Deploying with Bazaar (or how a big repo can make you crazy)

Thu Mar 8 23:44:00 UTC 2012

On 08/03/2012, Leonardo Santagada <santagada at gmail.com> wrote:
> People on the IRC channel said I should share my use story with the
> mailing list, and that is what I will try to do.

Thanks for doing this, Leonardo. It's much more useful for future
reference than our IRC log.

> Then someone said to bzr branch -rN where N is a small number to see
> if that would work. I tried to get the speed back so tried http again,
> apparently it tried to download the whole repo again so I stopped it to
> try ssh. SSH worked but then I would have to manually split the
> download so bzr doesn't eat the whole memory. That is way too manual
> to my taste so I gave up on that also.

So, there's some hope this would have worked, and it is pretty amenable
to shell scripting. Then, as with your final stacking trick, updating
would work as normal with no further fiddling. Just doing a dumb copy
of the repository, metadata included, as a setup step is also an option
for working around problems like this.
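
To make the scripting idea concrete, here is a rough sketch of splitting
the download into steps. It is written in Python rather than shell only
so the step arithmetic is easy to check. It is untested against your
repository, and whether fetching in chunks really keeps memory bounded
is an assumption on my part. The helper names, the step size, the target
directory and the example URL are all made up for illustration; the only
real commands are "bzr branch -r" and "bzr pull -d ... -r".

```python
import subprocess


def checkpoint_revnos(tip_revno, step):
    """Return increasing revnos spaced by `step`, always ending at the tip."""
    if tip_revno < 1 or step < 1:
        raise ValueError("tip_revno and step must both be >= 1")
    revnos = list(range(step, tip_revno, step))
    revnos.append(tip_revno)
    return revnos


def build_commands(source_url, target_dir, tip_revno, step):
    """Build one `bzr branch` followed by a `bzr pull` per later checkpoint."""
    revnos = checkpoint_revnos(tip_revno, step)
    # The first step creates the branch at a small revision number.
    commands = [["bzr", "branch", "-r", str(revnos[0]), source_url, target_dir]]
    # Each later step pulls up to the next checkpoint into the same branch.
    for revno in revnos[1:]:
        commands.append(
            ["bzr", "pull", "-d", target_dir, "-r", str(revno), source_url]
        )
    return commands


def run(commands, dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical URL and directory; 40000 matches the "around 40k rev" figure.
    run(build_commands("bzr+ssh://example.invalid/repo/trunk", "deploy", 40000, 5000))
```

The dry run only prints the commands, so the step size can be checked
before anything touches the network.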

When repacking is involved, which it didn't seem to be in your case,
it's more of an issue, as that triggers at irregular intervals in the
future as well as during the initial branch operation.

> Talking about a huge repo, this one has 493mb and around 40k rev. I
> used fastexport to see how big it would be in git and another bad news
> it gets a tad smaller there 401mb after a git repack -a -d -f -F
> (don't ask about all those flags, git cli is crazy). This goes against
> the benchmarks posted on bzr, should they be updated or something?

Doing the same export-import process with bzr would also result in a
smaller repo. In general use the size fluctuates, as recent changes are
inefficiently packed and then periodically repacked for better
compression. This is commonly the case across DVCSes, with varying
degrees of automation.

> Why does bzr use so much memory to do a simple branch, and is
> --stacked the best way to do source deployment?

So, the question that hasn't been answered is where the memory during
branching is actually being used. I made a few guesses at the time,
but getting a real answer is where a tool like meliae comes in, which
dumps the contents of memory when Python hits an OOM. Unfortunately
that's a little difficult to use from your end, as the reaper beats
Python to the scythe and there's a lot of resultant data to process to
get useful answers.

Martin