Rev 3675: If we repack earlier, it catches this case. in http://bzr.arbash-meinel.com/branches/bzr/1.7-dev/btree

John Arbash Meinel john at arbash-meinel.com
Fri Aug 22 06:54:46 BST 2008


At http://bzr.arbash-meinel.com/branches/bzr/1.7-dev/btree

------------------------------------------------------------
revno: 3675
revision-id: john at arbash-meinel.com-20080822055444-5kcr0csbbvkqbbiw
parent: john at arbash-meinel.com-20080822054012-ikrwmq9nm2q4h6q8
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: btree
timestamp: Fri 2008-08-22 00:54:44 -0500
message:
  If we repack earlier, it catches this case.
  Still need to fix the other tests, but at least
  the too_much test passes now.
  Impact on real-world results is measurable
  (2-3% final compression). Is it worth it?
modified:
  bzrlib/chunk_writer.py         chunk_writer.py-20080630234519-6ggn4id17nipovny-1
  bzrlib/tests/test_chunk_writer.py test_chunk_writer.py-20080630234519-6ggn4id17nipovny-2
=== modified file 'bzrlib/chunk_writer.py'
--- a/bzrlib/chunk_writer.py	2008-08-22 05:40:12 +0000
+++ b/bzrlib/chunk_writer.py	2008-08-22 05:54:44 +0000
@@ -177,7 +177,11 @@
             if out:
                 self.bytes_list.append(out)
                 self.bytes_out_len += len(out)
-            if self.bytes_out_len + 10 <= capacity:
+
+            # We are a bit extra conservative, because it seems that you *can*
+            # get better compression with Z_SYNC_FLUSH than a full compress. It
+            # is probably very rare, but we were able to trigger it.
+            if self.bytes_out_len + 100 <= capacity:
                 # It fit, so mark it added
                 self.bytes_in.append(bytes)
                 self.seen_bytes += len(bytes)

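The widened margin above guards against the running Z_SYNC_FLUSH length estimate disagreeing with what a final full compression produces. A small standalone sketch (hypothetical helper names, not bzrlib code) shows how the two sizes can be measured and compared:

```python
import zlib

def compress_with_sync_flushes(chunks, level=9):
    """Compress chunks incrementally, issuing Z_SYNC_FLUSH after each write.

    This mirrors how a chunk writer can track a running output length
    while deciding whether the next chunk still fits in its capacity.
    """
    c = zlib.compressobj(level)
    out = []
    for chunk in chunks:
        out.append(c.compress(chunk))
        # Z_SYNC_FLUSH emits everything buffered so far without
        # terminating the stream, so the length is observable now.
        out.append(c.flush(zlib.Z_SYNC_FLUSH))
    out.append(c.flush())  # Z_FINISH terminates the stream
    return b''.join(out)

def compress_one_shot(chunks, level=9):
    """A single full compression over the same data, for comparison."""
    return zlib.compress(b''.join(chunks), level)

if __name__ == '__main__':
    chunks = [(b'%d' % (i * 7919)) * 20 + b'\n' for i in range(50)]
    synced = compress_with_sync_flushes(chunks)
    one_shot = compress_one_shot(chunks)
    print(len(synced), len(one_shot))
```

The two lengths are usually close but need not be ordered the same way for every input, which is why the patch keeps a fixed slack (100 bytes here) rather than assuming the full compress always wins.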
=== modified file 'bzrlib/tests/test_chunk_writer.py'
--- a/bzrlib/tests/test_chunk_writer.py	2008-08-22 05:40:12 +0000
+++ b/bzrlib/tests/test_chunk_writer.py	2008-08-22 05:54:44 +0000
@@ -59,10 +59,8 @@
             lines.append(''.join(map(str, numbers)) + '\n')
         writer = chunk_writer.ChunkWriter(4096)
         for idx, line in enumerate(lines):
-            if idx >= 45:
-                import pdb; pdb.set_trace()
             if writer.write(line):
-                self.assertEqual(47, idx)
+                self.assertEqual(46, idx)
                 break
         bytes_list, unused, _ = writer.finish()
         node_bytes = self.check_chunk(bytes_list, 4096)



More information about the bazaar-commits mailing list