Rev 3655: Replace time/space benchmarks with real-world testing. in http://bzr.arbash-meinel.com/branches/bzr/1.7-dev/btree

John Arbash Meinel john at arbash-meinel.com
Thu Aug 21 20:23:48 BST 2008


At http://bzr.arbash-meinel.com/branches/bzr/1.7-dev/btree

------------------------------------------------------------
revno: 3655
revision-id: john at arbash-meinel.com-20080821192346-4mtm95v5g4kkxbyu
parent: john at arbash-meinel.com-20080820231159-lp0gxglwyxveiot7
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: btree
timestamp: Thu 2008-08-21 14:23:46 -0500
message:
  Replace time/space benchmarks with real-world testing.
  Basically, the value was overstated, because the artificial nodes
  were significantly more compressible than real data.
  With these results, using .copy() is essentially the same time/space
  trade-off as allowing another repack: 1 repack + copy() is mostly
  equivalent to 2 repacks with no copy (in both time and space).
  Both generally seem to be an appropriate 'sweet spot'.
  The extra pack (copy) avoids the pathological case of leaving the
  last bytes of a page unfilled, while adding only a small overhead
  (approx. 10% time cost for 20% space savings).
modified:
  bzrlib/chunk_writer.py         chunk_writer.py-20080630234519-6ggn4id17nipovny-1
-------------- next part --------------
=== modified file 'bzrlib/chunk_writer.py'
--- a/bzrlib/chunk_writer.py	2008-08-20 23:11:59 +0000
+++ b/bzrlib/chunk_writer.py	2008-08-21 19:23:46 +0000
@@ -36,16 +36,26 @@
         will sometimes start over and compress the whole list to get tighter
         packing. We get diminishing returns after a while, so this limits the
         number of times we will try.
-        In testing, some values for 100k nodes::
-
-                            w/o copy            w/ copy             w/ copy & save
-            _max_repack     time    node count  time    node count  t       nc
-             1               8.0s   704          8.8s   494         14.2    390 #
-             2               9.2s   491          9.6s   432 #       12.9    390
-             3              10.6s   430 #       10.8s   408         12.0    390
-             4              12.5s   406                             12.8    390
-             5              13.9s   395
-            20              17.7s   390         17.8s   390
+        In testing, some values for bzr.dev::
+
+                    w/o copy    w/ copy     w/ copy ins w/ copy & save
+            repack  time  MB    time  MB    time  MB    time  MB
+             1       8.8  5.1    8.9  5.1    9.6  4.4   12.5  4.1
+             2       9.6  4.4   10.1  4.3   10.4  4.2   11.1  4.1
+             3      10.6  4.2   11.1  4.1   11.2  4.1   11.3  4.1
+             4      12.0  4.1
+             5      12.6  4.1
+            20      12.9  4.1   12.2  4.1   12.3  4.1
+
+        In testing, some values for mysql-unpacked::
+
+                    w/o copy    w/ copy     w/ copy ins w/ copy & save
+            repack  time  MB    time  MB    time  MB    time  MB
+             1      56.6  16.9              60.7  14.2
+             2      59.3  14.1              62.6  13.5  64.3  13.4
+             3      64.4  13.5
+            20      73.4  13.4
+
     :cvar _default_min_compression_size: The expected minimum compression.
         While packing nodes into the page, we won't Z_SYNC_FLUSH until we have
         received this much input data. This saves time, because we don't bloat

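The repack/copy() strategy the docstring benchmarks can be sketched as follows. This is a minimal illustration only, assuming a hypothetical 4096-byte page and simplified accounting; the real logic, tuning constants, and reserved-space handling live in bzrlib/chunk_writer.py and differ in detail:

```python
import zlib

PAGE_SIZE = 4096  # hypothetical page size for this sketch

def _compress_all(lines):
    """Compress lines in one stream, flushing only once at the end."""
    c = zlib.compressobj()
    chunks = [c.compress(line) for line in lines]
    chunks.append(c.flush(zlib.Z_SYNC_FLUSH))
    return b''.join(chunks), c

def pack_page(lines, page_size=PAGE_SIZE, max_repack=1):
    """Pack as many lines as fit into one page; return (bytes, count).

    Each line is trial-compressed into a .copy() of the compressor, so
    the page state only advances when the line actually fits.  On
    overflow, all accepted lines are recompressed in a single stream
    (a "repack"), which packs tighter because fewer Z_SYNC_FLUSH
    boundaries are emitted.
    """
    compressor = zlib.compressobj()
    chunks, total, repacks = [], 0, 0
    accepted = []
    for line in lines:
        trial = compressor.copy()       # snapshot state before the line
        extra = trial.compress(line) + trial.flush(zlib.Z_SYNC_FLUSH)
        if total + len(extra) <= page_size:
            compressor = trial          # commit the trial state
            chunks.append(extra)
            total += len(extra)
            accepted.append(line)
            continue
        if repacks >= max_repack:
            break                       # diminishing returns: stop here
        repacks += 1
        data, c = _compress_all(accepted + [line])
        if len(data) > page_size:
            break                       # even repacked it does not fit
        compressor, chunks, total = c, [data], len(data)
        accepted.append(line)
    return b''.join(chunks), len(accepted)
```

The .copy() lets each candidate line be tested without recompressing the whole page, which is why (per the message above) 1 repack plus copy() lands at roughly the same time/space point as 2 repacks without it.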

