B+tree discussions

John Arbash Meinel john at arbash-meinel.com
Thu Jul 3 17:15:35 BST 2008


Robert Collins wrote:
| On Thu, 2008-07-03 at 17:09 +1000, Robert Collins wrote:
|> iter_random_one was spectacularly slow. 60% of it was in split-lines - so
|> here with figures from using a C parser.
|> C parser:
|
| I've also upped the leaf cache node size to 4MB rather than 400K
| (currently per index). As already noted this really impacts the tests
| that were cache thrashing (but I prefer cache thrashing to VM thrashing
| - cache thrashing is faster than hitting every page in swap to do
| refcount checks/adjustments).

Unfortunately, it isn't 4MB in memory. It is 4MB of disk space whose
cached contents expand once parsed into memory.

I wish I could get guppy to compile on win32, but it seems that win32
doesn't like putting pointers into static structure definitions. I'm
using gcc-mingw, and it really doesn't like:

PyTypeObject NyBitSet_Type = {
    PyObject_HEAD_INIT(&PyType_Type)
};

I think it considers taking the address of PyType_Type unsafe. I'm
guessing it has to do with DLL issues. The definition is:

PyAPI_DATA(PyTypeObject) PyType_Type;

And I would guess that on win32 that expands to something like
__declspec(dllimport), which means the actual address isn't known until
dynamic link time, so it can't be filled in at compile time. Shame...


Anyway, to get back to the real discussion: at a minimum, we expect 2:1
compression for disk versus in memory, so a 4K page will expand to 8K of
strings in memory. But on top of that, we have the object overhead.
Andrew mentioned that a tuple costs 24 bytes + 4 bytes per entry, or 32
bytes for a 2-tuple. I would assume that a string object also has around
24 bytes of direct overhead.
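
As a rough sanity check, on a Python new enough to have sys.getsizeof()
(2.6+) you can measure these overheads directly. The exact figures vary
with pointer size and interpreter version, so treat this as a sketch:

import sys

print(sys.getsizeof(()))          # fixed tuple header (~24 bytes, 32-bit)
print(sys.getsizeof(('a', 'b')))  # header + 2 pointers (~32 bytes)
print(sys.getsizeof(''))          # fixed string header (~24 bytes)
print(sys.getsizeof('x' * 60))    # header + ~60 chars of data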

So if we don't intern(), a .tix line will have:

1) A 2-tuple for the key
2) A 2-tuple for (value, refs)
3) An N-tuple of N-tuples for refs (for .tix there are no refs, but
there would be for .rix)

I think value is a plain string, though.

So we end up with ~3 tuples per key, or 3*32 = 96 bytes of tuple
overhead. We then have ~60 chars of file_id plus revision_id between the
two key strings, plus 2*24 = 48 bytes of string overhead.

I'm counting about 204 bytes per row, and we know that we get ~90 rows
into a page, or ~18K per page unpacked in memory.

So the 1000-node cache is actually an ~18MB memory cache.
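
Spelling that arithmetic out (all figures approximate):

tuples = 3 * 32       # key, (value, refs), and refs tuples
strings = 2 * 24      # header overhead for the two key strings
chars = 60            # file_id plus revision_id, combined
per_row = tuples + strings + chars    # ~204 bytes
per_page = per_row * 90               # ~90 rows/page -> ~18K
cache = per_page * 1000               # 1000 nodes -> ~18MB
print(per_row, per_page, cache)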

I'm starting to wonder if we want to think about 'intern()'ing our
tuples, especially since it seems that keys will be repeated fairly
often. Almost half of our per-row overhead is just memory for the tuple
objects, and certainly there is a lot of plain Python object memory
overhead besides.
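
A minimal sketch of what interning tuples might look like - just a
module-level memo dict; the names here are hypothetical, nothing in
bzrlib:

_interned_keys = {}

def intern_key(key):
    # Return the canonical instance of key, so equal keys share storage.
    try:
        return _interned_keys[key]
    except KeyError:
        _interned_keys[key] = key
        return key

Every key parsed out of a page would go through intern_key(), so a
repeated key costs one dict lookup rather than a fresh 32-byte tuple
each time.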

John
=:->

|
| In short - faster or equal across the board:
|
| BTreeIndex: iter_all_entries in 1.026
| GraphIndex: iter_all_entries in 10.948
|
| BTreeIndex: iter_random_one in 12.712,
| GraphIndex: iter_random_one in 12.052,
|
| BTreeIndex: get_components_positions(2000) in 7.057,
| GraphIndex: get_components_positions(2000) in 22.545,
|
| BTreeIndex: search_from_revid in 3.499 found 18130,
| GraphIndex: search_from_revid in 7.201 found 18130,
|
| BTreeIndex: miss torture in 22.602,
| GraphIndex: miss torture in 365.309,
|
| -Rob


