Bazaar: Out of memory
John Arbash Meinel
john at
Fri May 9 15:52:07 BST 2008
Ian Clatworthy wrote:
| James Henstridge wrote:
|> The .join() method is faster because it does fewer memory allocations
|> and copies than the for loop you list above. It should allocate a
|> single string for the final concatenated string. In contrast, the for
|> loop version does a string allocation and copy for every iteration (so
|> in the final iteration you'd expect it to have allocated roughly twice
|> the memory).
| Hmmm ...
| Launchpad shows the exact change I made in bzr-fastexport here:
| The impact was *dramatic*: the memory shown by Gnome System Manager
| importing a repository ( with
| a large binary file (5M) dropped from 275M to 42M. Maybe recent versions
| of Python are smarter now about string concatenation in loops? Or maybe
| the change itself is a red herring and we're seeing a symptom of a
| reference counting bug, say?
Seems a bit odd to me. I'm curious what would happen if you put in some
"gc.collect()" statements.
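A rough way to check this, as a sketch and not anything taken from bzr or bzr-fastimport: the script below assumes a Python 3 interpreter with the standard tracemalloc module. The names concat_in_loop, concat_with_join and peak_bytes, and the 1 MiB payload, are made up for illustration. It calls gc.collect() before each measurement and reports the peak traced allocation for the loop and for .join().

```python
import gc
import tracemalloc


def concat_in_loop(chunks):
    """Build the result by repeated concatenation (one new object per step)."""
    result = b""
    for chunk in chunks:
        result += chunk
    return result


def concat_with_join(chunks):
    """Build the result with a single join call."""
    return b"".join(chunks)


def peak_bytes(func, chunks):
    """Return (result length, peak traced bytes) for one call of func(chunks)."""
    gc.collect()  # drop collectable garbage so earlier runs do not skew the number
    tracemalloc.start()
    try:
        data = func(chunks)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return len(data), peak


if __name__ == "__main__":
    # Hypothetical payload: 256 chunks of 4 KiB, i.e. 1 MiB in total.
    chunks = [b"x" * 4096 for _ in range(256)]
    for func in (concat_in_loop, concat_with_join):
        size, peak = peak_bytes(func, chunks)
        print(f"{func.__name__}: result={size} bytes, peak={peak} bytes")
```

If the loop peaks at roughly twice the result size while .join() stays close to it, that matches the explanation quoted above. A much larger gap would suggest something else is going on, for example references being kept alive longer than expected.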
|> If this change prevents MemoryErrors then something weird is going on.
| In the case of bzr-fastimport, the code was calling read in a loop and
| joining that list. In the case of bzr itself, I'm pretty sure we
| arbitrarily partition huge binary files into "lines" based on where '\n'
| characters just happened to be. Given this is user data, that could be
| almost any size. (In Guido's case, I think his largest file is 40M FWIW.)
| Ian C.
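To make the two patterns Ian describes concrete, here is a minimal sketch. It is not the actual bzr or bzr-fastimport code; read_all, split_on_newlines and the chunk size are hypothetical names and values chosen for illustration. The first function reads in a loop and joins the list once; the second partitions data into "lines" purely by where newline bytes happen to fall.

```python
import io


def read_all(stream, chunk_size=65536):
    """Read a binary stream to EOF in fixed-size chunks, then join once."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)


def split_on_newlines(data):
    """Split bytes into pieces that each end at a newline byte, if any."""
    pieces = data.split(b"\n")
    # Re-attach the newline to every piece except the trailing remainder.
    lines = [piece + b"\n" for piece in pieces[:-1]]
    if pieces[-1]:
        lines.append(pieces[-1])
    return lines


if __name__ == "__main__":
    payload = b"\x00" * 100000 + b"\n" + b"\x01" * 10
    data = read_all(io.BytesIO(payload), chunk_size=4096)
    lines = split_on_newlines(data)
    print([len(line) for line in lines])
```

For binary content the resulting "lines" can be of almost any length: a file with no newline bytes at all comes back as a single piece the size of the whole file.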