Rev 3655: Replace time/space benchmarks with real-world testing. in http://bzr.arbash-meinel.com/branches/bzr/1.7-dev/btree
John Arbash Meinel
john at arbash-meinel.com
Thu Aug 21 20:23:48 BST 2008
At http://bzr.arbash-meinel.com/branches/bzr/1.7-dev/btree
------------------------------------------------------------
revno: 3655
revision-id: john at arbash-meinel.com-20080821192346-4mtm95v5g4kkxbyu
parent: john at arbash-meinel.com-20080820231159-lp0gxglwyxveiot7
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: btree
timestamp: Thu 2008-08-21 14:23:46 -0500
message:
Replace time/space benchmarks with real-world testing.
Basically, the value was overstated, because the artificial nodes
were significantly more compressible than real data.
With these results, using .copy() is essentially the same time/space
trade-off as allowing another repack.
1-repack + copy() is mostly equivalent to 2-repack with no copy
(in both time and space).
They generally seem to be an appropriate 'sweet spot'.
The extra pack (copy) avoids the pathological behavior of not filling
in the last bytes, while only adding a small overhead
(approx. 10% time cost for 20% space savings).
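As a rough sketch of the .copy() approach discussed above (a minimal
standalone illustration using Python's zlib, not bzrlib's actual
ChunkWriter; PAGE_SIZE, FLUSH_MARGIN, and pack_page are names and values
assumed for the example):

```python
import zlib

PAGE_SIZE = 100     # hypothetical page size; real btree pages are larger
FLUSH_MARGIN = 16   # assumed headroom for the compressor's final flush

def pack_page(items):
    """Pack as many items as fit into one PAGE_SIZE compressed page.

    Before committing each item, snapshot the compressor with .copy()
    and trial-compress into the snapshot; if the page would overflow,
    discard only the snapshot rather than repacking the whole page.
    """
    comp = zlib.compressobj()
    chunks = []
    used = 0
    packed = 0
    for item in items:
        trial = comp.copy()   # snapshot of the compressor state
        data = trial.compress(item) + trial.flush(zlib.Z_SYNC_FLUSH)
        if used + len(data) + FLUSH_MARGIN > PAGE_SIZE:
            break             # item does not fit; `comp` is untouched
        comp = trial          # commit the trial state
        chunks.append(data)
        used += len(data)
        packed += 1
    chunks.append(comp.flush())   # finish the stream
    return b''.join(chunks), packed
```

Decompressing the returned page yields the packed items back-to-back;
the Z_SYNC_FLUSH after each item keeps the stream valid at every commit
point.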
modified:
bzrlib/chunk_writer.py chunk_writer.py-20080630234519-6ggn4id17nipovny-1
-------------- next part --------------
=== modified file 'bzrlib/chunk_writer.py'
--- a/bzrlib/chunk_writer.py 2008-08-20 23:11:59 +0000
+++ b/bzrlib/chunk_writer.py 2008-08-21 19:23:46 +0000
@@ -36,16 +36,26 @@
will sometimes start over and compress the whole list to get tighter
packing. We get diminishing returns after a while, so this limits the
number of times we will try.
-    In testing, some values for 100k nodes::
-
-                     w/o copy           w/ copy            w/ copy & save
-        _max_repack  time   node count  time   node count  t     nc
-         1            8.0s  704          8.8s  494         14.2  390 #
-         2            9.2s  491          9.6s  432 #       12.9  390
-         3           10.6s  430 #       10.8s  408         12.0  390
-         4           12.5s  406                            12.8  390
-         5           13.9s  395
-        20           17.7s  390         17.8s  390
+    In testing, some values for bzr.dev::
+
+                w/o copy     w/ copy      w/ copy ins  w/ copy & save
+        repack  time  MB     time  MB     time  MB     time  MB
+         1       8.8  5.1     8.9  5.1     9.6  4.4    12.5  4.1
+         2       9.6  4.4    10.1  4.3    10.4  4.2    11.1  4.1
+         3      10.6  4.2    11.1  4.1    11.2  4.1    11.3  4.1
+         4      12.0  4.1
+         5      12.6  4.1
+        20      12.9  4.1    12.2  4.1    12.3  4.1
+
+    In testing, some values for mysql-unpacked::
+
+                w/o copy      w/ copy      w/ copy ins  w/ copy & save
+        repack  time  MB      time  MB     time  MB     time  MB
+         1      56.6  16.9    60.7  14.2
+         2      59.3  14.1    62.6  13.5   64.3  13.4
+         3      64.4  13.5
+        20      73.4  13.4
+
:cvar _default_min_compression_size: The expected minimum compression.
While packing nodes into the page, we won't Z_SYNC_FLUSH until we have
received this much input data. This saves time, because we don't bloat
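As a rough illustration of why an extra repack pass tightens packing (a
standalone Python/zlib sketch, not bzrlib code): an incremental writer
must emit a Z_SYNC_FLUSH after each item so the page is valid at every
point, which costs a few bytes of flush overhead per item; recompressing
the whole list in one pass recovers that overhead:

```python
import zlib

def size_streamed(items):
    # Items appended incrementally, with a Z_SYNC_FLUSH after each one
    # so the compressed page is valid at every commit point.
    comp = zlib.compressobj()
    total = 0
    for item in items:
        total += len(comp.compress(item))
        total += len(comp.flush(zlib.Z_SYNC_FLUSH))
    return total + len(comp.flush())

def size_repacked(items):
    # The same items recompressed from scratch as a single stream,
    # which is what one "repack" pass does.
    comp = zlib.compressobj()
    return len(b''.join(comp.compress(i) for i in items) + comp.flush())

items = [b'row-%04d: some moderately repetitive payload' % i
         for i in range(100)]
streamed = size_streamed(items)
repacked = size_repacked(items)
# repacked < streamed: the per-item flush overhead is recovered
```

The gap between the two sizes is what each additional repack can win
back, and why the returns diminish once the flush overhead is gone.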
More information about the bazaar-commits mailing list