Incorrect bzrlib usage or a memory leak?
Abhay Mujumdar
amujumdar at blackducksoftware.com
Mon Oct 3 11:50:13 UTC 2011
To get the content of every file at every revision, I was shelling out to 'bzr cat' with the right parameters. Forking a process for each file revision is pretty slow, probably because the Python runtime is started and the bzr code is loaded on each invocation.
So I rewrote it to use the bzrlib API, and it is an order of magnitude faster. However, it leaks 2-10MB per revision. I tweaked the code to print the heap using heapy. The code and stats are below. Notice that, among other things, the count of bzrlib._static_tuple_c.StaticTuple objects keeps increasing.
I suspect I am not using the API correctly (maybe the cmd_* classes are not supposed to be used in a loop?).
I'd appreciate any help or hints.
Thanks
Abhay
from bzrlib.builtins import cmd_cat
from bzrlib.builtins import cmd_log
from bzrlib.revisionspec import RevisionSpec
import StringIO
import os
from guppy import hpy

hp = hpy()

# Get contents of a file for specific revision.
def bzr_cat(repository_url, filename, revision):
    print "=cat file"
    print hp.heap()
    os.chdir(repository_url)
    spec = RevisionSpec.from_string(revision)
    output = StringIO.StringIO()
    cmd = cmd_cat()
    cmd.outf = output
    cmd.run(filename=filename, revision=[spec], name_from_revision=True)
    cmd.cleanup_now()
    val = output.getvalue()
    output.close()
    print hp.heap()
    return val

# Output of heapy when script started
Partition of a set of 111157 objects. Total size = 14541184 bytes.
 Index  Count   %     Size   % Cumulative   % Kind (class / dict of class)
     0  50169  45  5017480  35    5017480  35 str
     1  30786  28  2947480  20    7964960  55 tuple
     2   8229   7   987480   7    8952440  62 function
     3   8449   8   946288   7    9898728  68 types.CodeType
     4    839   1   838768   6   10737496  74 dict of type
     5    290   0   773152   5   11510648  79 dict of module
     6    999   1   769392   5   12280040  84 dict (no owner)
     7    843   1   731648   5   13011688  89 type
     8    658   1   336416   2   13348104  92 dict of class
     9    209   0   217360   1   13565464  93 dict of bzrlib.option.Option

# Output of heapy after a few thousand calls
Partition of a set of 437390 objects. Total size = 334836816 bytes.
 Index   Count   %       Size   % Cumulative   % Kind (class / dict of class)
     0  211711  48  302202312  90  302202312  90 bzrlib._static_tuple_c.StaticTuple
     1  160872  37   19002840   6  321205152  96 str
     2    1369   0    4491920   1  325697072  97 dict (no owner)
     3   31504   7    3026832   1  328723904  98 tuple
     4    8403   2    1008360   0  329732264  98 function
     5    8643   2     968016   0  330700280  99 types.CodeType
     6     866   0     863008   0  331563288  99 dict of type
     7     296   0     784000   0  332347288  99 dict of module
     8     871   0     755808   0  333103096  99 type
     9     662   0     337504   0  333440600 100 dict of class
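
The two dumps above only show the start and end states. A rough sketch of how the growth per call could be measured is below. It is not the heapy setup used above: it assumes Python 3's standard tracemalloc module, which the Python 2 bzrlib code here cannot use directly. The names measure_growth, leaky and clean are hypothetical stand-ins, and in practice the callable would wrap something like a bzr_cat call.

```python
import gc
import tracemalloc

# Hypothetical stand-ins for the function under test; not part of bzrlib.
_retained = []


def leaky():
    """Keeps a reference to a ~100 kB buffer on every call."""
    _retained.append(bytearray(100000))


def clean():
    """Allocates the same buffer but lets it go when the call returns."""
    bytearray(100000)


def measure_growth(func, calls=5):
    """Return the net traced bytes that remain allocated after each call."""
    tracemalloc.start()
    try:
        growth = []
        for _ in range(calls):
            before, _peak = tracemalloc.get_traced_memory()
            func()
            gc.collect()  # drop collectable garbage so only retained memory counts
            after, _peak = tracemalloc.get_traced_memory()
            growth.append(after - before)
        return growth
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    print("leaky:", measure_growth(leaky))
    print("clean:", measure_growth(clean))
```

If the per-call figure stays roughly constant and positive, as it would for leaky here, something is holding references across calls rather than memory simply being freed late.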