Rev 3676: Using a different safety margin for the first repack, in http://bzr.arbash-meinel.com/branches/bzr/1.7-dev/btree

John Arbash Meinel john at arbash-meinel.com
Fri Aug 22 21:33:29 BST 2008


At http://bzr.arbash-meinel.com/branches/bzr/1.7-dev/btree

------------------------------------------------------------
revno: 3676
revision-id: john at arbash-meinel.com-20080822203320-y98xykrjms4r5goj
parent: john at arbash-meinel.com-20080822055444-5kcr0csbbvkqbbiw
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: btree
timestamp: Fri 2008-08-22 15:33:20 -0500
message:
  Using a different safety margin for the first repack,
  and using 2 repacks gives us effectively the same result, while
  still making it safe for arbitrary data. (With 1 repack, it does
  affect the results by 3-5%; with 2 repacks the second margin
  gives the same results.)
  Also, we now get roughly a 2-3:1 ratio of lines that are 'blindly'
  added versus ones which are added with a SYNC flush.
modified:
  bzrlib/chunk_writer.py         chunk_writer.py-20080630234519-6ggn4id17nipovny-1
-------------- next part --------------
=== modified file 'bzrlib/chunk_writer.py'
--- a/bzrlib/chunk_writer.py	2008-08-22 05:54:44 +0000
+++ b/bzrlib/chunk_writer.py	2008-08-22 20:33:20 +0000
@@ -21,8 +21,9 @@
 from zlib import Z_FINISH, Z_SYNC_FLUSH
 
 # [max_repack, buffer_full, repacks_with_space, min_compression,
-#  total_bytes_in, total_bytes_out, avg_comp]
-_stats = [0, 0, 0, 999, 0, 0, 0]
+#  total_bytes_in, total_bytes_out, avg_comp,
+#  bytes_autopack, bytes_sync_packed]
+_stats = [0, 0, 0, 999, 0, 0, 0, 0, 0]
 
 class ChunkWriter(object):
     """ChunkWriter allows writing of compressed data with a fixed size.
@@ -169,8 +170,10 @@
             self.bytes_in.append(bytes)
             self.seen_bytes += len(bytes)
             self.unflushed_in_bytes += len(bytes)
+            _stats[7] += 1 # len(bytes)
         else:
             # This may or may not fit, try to add it with Z_SYNC_FLUSH
+            _stats[8] += 1 # len(bytes)
             out = comp.compress(bytes)
             out += comp.flush(Z_SYNC_FLUSH)
             self.unflushed_in_bytes = 0
@@ -181,7 +184,11 @@
             # We are a bit extra conservative, because it seems that you *can*
             # get better compression with Z_SYNC_FLUSH than a full compress. It
             # is probably very rare, but we were able to trigger it.
-            if self.bytes_out_len + 100 <= capacity:
+            if self.num_repack == 0:
+                safety_margin = 100
+            else:
+                safety_margin = 10
+            if self.bytes_out_len + safety_margin <= capacity:
                 # It fit, so mark it added
                 self.bytes_in.append(bytes)
                 self.seen_bytes += len(bytes)



More information about the bazaar-commits mailing list