Rev 5543: A TODO entry. in http://bazaar.launchpad.net/~jameinel/bzr/2.3-gcb-peak-mem
John Arbash Meinel
john at arbash-meinel.com
Thu Dec 2 21:42:54 GMT 2010
At http://bazaar.launchpad.net/~jameinel/bzr/2.3-gcb-peak-mem
------------------------------------------------------------
revno: 5543
revision-id: john at arbash-meinel.com-20101202214246-cpvdr4s5bdt52r59
parent: john at arbash-meinel.com-20101202212630-vycb3zf5uy5iz2tc
fixes bug(s): https://launchpad.net/bugs/602614
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.3-gcb-peak-mem
timestamp: Thu 2010-12-02 15:42:46 -0600
message:
A TODO entry.
Overall this experiment hasn't been particularly beneficial. The final speed
is still slower than the existing code, and the primary knob that reduced
peak memory size is changing the stride for large content, which can be
trivially added to the existing match code.
I like the code quite a bit, but I wish I had more to show for the amount
of effort put into it.
-------------- next part --------------
=== modified file 'bzrlib/_delta_index_pyx.pyx'
--- a/bzrlib/_delta_index_pyx.pyx 2010-12-02 21:26:30 +0000
+++ b/bzrlib/_delta_index_pyx.pyx 2010-12-02 21:42:46 +0000
@@ -108,6 +108,14 @@
 atexit.register(report_total_time)
+# TODO: This is the primary table entry in the hash map. As such, this is the
+# part that scales O(N) and thus can provide the largest memory savings.
+# We should determine what attributes are really necessary.
+# Also, we currently store the hash entries inline, which means each
+# 'empty' entry is a full rabin_entry in size. We could, instead,
+# allocate a table of rabin_entry, and then pointers into that table for
+# the hash table. This would reduce the cost of an empty hash slot, at
+# the cost of adding O(N) pointers.
 cdef struct rabin_entry:
     # A pointer to the actual matching bytes
     const_data ptr
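
The trade-off described in the TODO can be estimated with a rough sketch
like the one below. It is not code from this branch. Only the `ptr` field
appears in the diff; the other two fields (`hash_value`, `offset`) and the
helper function names are assumptions made up to give the entry a plausible
size. The sketch compares a table that stores a full entry in every slot
with a table of pointers into a separately allocated array of used entries.

```python
import ctypes


class RabinEntry(ctypes.Structure):
    # Hypothetical layout: only `ptr` is taken from the diff. The other
    # two fields are placeholders, not the real rabin_entry members.
    _fields_ = [
        ("ptr", ctypes.c_void_p),
        ("hash_value", ctypes.c_uint32),
        ("offset", ctypes.c_uint32),
    ]


ENTRY_SIZE = ctypes.sizeof(RabinEntry)
POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)


def inline_table_bytes(num_slots):
    """Inline storage: every slot, used or empty, costs a full entry."""
    return num_slots * ENTRY_SIZE


def indirect_table_bytes(num_slots, num_used):
    """Indirect storage: every slot costs one pointer, and only the
    used slots cost an additional full entry."""
    return num_slots * POINTER_SIZE + num_used * ENTRY_SIZE


def break_even_load():
    """Fraction of used slots below which indirect storage is smaller."""
    return 1.0 - POINTER_SIZE / ENTRY_SIZE


if __name__ == "__main__":
    slots = 1 << 20
    for used in (slots // 4, slots // 2, slots):
        print(used, inline_table_bytes(slots),
              indirect_table_bytes(slots, used))
    print("break-even load factor:", break_even_load())
```

Under these assumptions the pointer table only wins while the hash map is
sparsely filled. At a full table it costs one extra pointer per slot, which
is the "O(N) pointers" overhead the TODO mentions.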