Rev 3896: Reverted back to the same hash width, and bumped EXTRA_NULLS to 3. in http://bzr.arbash-meinel.com/branches/bzr/brisbane/gc_delta_index_room
John Arbash Meinel
john at arbash-meinel.com
Wed Mar 18 22:45:30 GMT 2009
At http://bzr.arbash-meinel.com/branches/bzr/brisbane/gc_delta_index_room
------------------------------------------------------------
revno: 3896
revision-id: john at arbash-meinel.com-20090318224524-ve32it3ddqfzvi6q
parent: john at arbash-meinel.com-20090318224055-hs7nxyq5fjz0pr19
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: gc_delta_index_room
timestamp: Wed 2009-03-18 17:45:24 -0500
message:
Reverted back to the same hash width, and bumped EXTRA_NULLS to 3.
Most entries in a hash bucket are genuinely random, so they don't trigger
extra comparisons. So walking 4-7 nodes is fairly cheap at that level.
My guess is that bumping EXTRA_NULL has a bigger effect when you get the
occassional non-random data, that forces expansion because it gets a
collision.
Data with repetition a multiple of 16 (but not 16) will cause this, as
you can get a large insertion, with lots of dupes.
We filter out when the dupe is exactly a multiple of 16, we may want to
do something similar at larger ranges (or use limit_hash_table on the data
possibly with a much smaller value than 64.)
Most important (next) is to handle the large update case.
-------------- next part --------------
=== modified file 'bzrlib/diff-delta.c'
--- a/bzrlib/diff-delta.c 2009-03-18 22:40:55 +0000
+++ b/bzrlib/diff-delta.c 2009-03-18 22:45:24 +0000
@@ -28,7 +28,7 @@
/* The hash map is sized to put 4 entries per bucket, this gives us 2 blank
* spaces.
*/
-#define EXTRA_NULLS 1
+#define EXTRA_NULLS 3
static const unsigned int T[256] = {
0x00000000, 0xab59b4d1, 0x56b369a2, 0xfdeadd73, 0x063f6795, 0xad66d344,
@@ -447,7 +447,7 @@
total_num_entries = num_entries + old->num_entries;
else
total_num_entries = num_entries;
- hsize = total_num_entries / 2;
+ hsize = total_num_entries / 4;
for (i = 4; (1u << i) < hsize && i < 31; i++);
hsize = 1 << i;
hmask = hsize - 1;
@@ -531,7 +531,7 @@
first byte to allow for optimizing the Rabin's polynomial
initialization in create_delta(). */
total_num_entries = num_entries + old_index->num_entries;
- hsize = total_num_entries / 2;
+ hsize = total_num_entries / 4;
for (i = 4; (1u << i) < hsize && i < 31; i++);
hsize = 1 << i;
if (hsize < old_index->hash_mask) {
More information about the bazaar-commits
mailing list