Rev 3713: Change the logic to solve it in a different way. in http://bzr.arbash-meinel.com/branches/bzr/1.8-dev/autopack_bug_242510

John Arbash Meinel john at arbash-meinel.com
Fri Sep 19 16:47:47 BST 2008


At http://bzr.arbash-meinel.com/branches/bzr/1.8-dev/autopack_bug_242510

------------------------------------------------------------
revno: 3713
revision-id: john at arbash-meinel.com-20080919154746-g07uetnih1s5flil
parent: john at arbash-meinel.com-20080919150355-14s3zy2jndtlmc2q
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: autopack_bug_242510
timestamp: Fri 2008-09-19 10:47:46 -0500
message:
  Change the logic to solve it in a different way.
  
  Now autopack will always write out a single pack file. It uses
  the same 'fitting' logic, to determine which packs are
  sub-optimally sized and should be combined. However, when it
  finally comes to packing them, it just puts them in the same file.
  It is the same amount of I/O, it just leads to fewer final
  pack files.
  
  Some analysis should be done on the long-term effects of this.
  
  As a side effect, it seems immune to bug #242510. The only way
  to trigger it was to have a large pack built, and then wanting
  a smaller pack. However, if you are genuinely building a pack,
  then there is something to be combined with.
-------------- next part --------------
=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py	2008-09-19 15:03:55 +0000
+++ b/bzrlib/repofmt/pack_repo.py	2008-09-19 15:47:46 +0000
@@ -1300,13 +1300,22 @@
                     # this pack is used up, shift left.
                     del pack_distribution[0]
                     pack_operations.append([0, []])
-
-        if len(pack_operations) > 1:
-            # Check the last pack, it is possible that it has only a single
-            # entry, which would cause us to repack the node into itself
-            if len(pack_operations[-1][1]) == 1:
-                pack_operations.pop()
-        return pack_operations
+        # At this point, we have a bunch of pack files that we are recombining.
+        # This triggers a certain amount of I/O. However, we can just shove all
+        # of these modifications into a single larger pack. That triggers the
+        # same amount of I/O, but leaves us with fewer packs in the general
+        # case.
+        final_rev_count = 0
+        final_pack_list = []
+        for num_revs, pack_files in pack_operations:
+            final_rev_count += num_revs
+            final_pack_list.extend(pack_files)
+        if len(final_pack_list) == 1:
+            raise AssertionError('We somehow generated an autopack with a'
+                ' single pack file being moved.'
+                ' I couldnt find a way to do this')
+            return []
+        return [[final_rev_count, final_pack_list]]
 
     def ensure_loaded(self):
         # NB: if you see an assertion error here, its probably access against

=== modified file 'bzrlib/tests/test_repository.py'
--- a/bzrlib/tests/test_repository.py	2008-09-19 15:03:55 +0000
+++ b/bzrlib/tests/test_repository.py	2008-09-19 15:47:46 +0000
@@ -843,19 +843,22 @@
         # rev count - 2010 -> 2x1000 + 1x10 (3)
         pack_operations = packs.plan_autopack_combinations(
             existing_packs, [1000, 1000, 10])
-        self.assertEqual([[2, ["single2", "single1"]], [0, []]], pack_operations)
+        self.assertEqual([[2, ["single2", "single1"]]], pack_operations)
 
-    def test_plan_pack_operations_110_doesnt_repack_single_entry(self):
+    def test_plan_pack_operations_creates_a_single_op(self):
         packs = self.get_packs()
-        existing_packs = [(55, "big"), (25, "medium"), (21, "medium2"),
-                          (9, "small")]
+        existing_packs = [(50, 'a'), (40, 'b'), (30, 'c'), (10, 'd'),
+                          (10, 'e'), (6, 'f'), (4, 'g')]
+        # rev count 150 -> 1x100 and 5x10
+        # The two size 10 packs do not need to be touched. The 50, 40, 30 would
+        # be combined into a single 120 size pack, and the 6 & 4 would
+        # becombined into a size 10 pack. However, if we have to rewrite them,
+        # we save a pack file with no increased I/O by putting them into the
+        # same file.
+        distribution = packs.pack_distribution(150)
         pack_operations = packs.plan_autopack_combinations(existing_packs,
-                                                           [100, 10])
-        # This initially wants to create a 100 size pack, and a 10 size pack.
-        # However, there is only a single pack with 1 entry to put into the
-        # final pack.
-        self.assertEqual([[101, ['big', 'medium', 'medium2']]],
-                         pack_operations)
+                                                           distribution)
+        self.assertEqual([[130, ['a', 'b', 'c', 'f', 'g']]], pack_operations)
 
     def test_all_packs_none(self):
         format = self.get_format()



More information about the bazaar-commits mailing list