[MERGE] Make pulling bundle 6x faster

Martin Pool mbp at canonical.com
Thu Jul 26 20:11:13 BST 2007


Martin Pool has voted +0.
Status is now: Semi-approved
Comment:
This deserves a NEWS entry.

Does this improvement show up in the bundle benchmarks in the benchmark 
suite?

--- bzrlib/merge_directive.py   2007-07-20 14:28:59 +0000
+++ bzrlib/merge_directive.py   2007-07-24 16:59:24 +0000
@@ -190,7 +190,7 @@
                      StringIO(self.get_raw_bundle()))
                  # We don't use the bundle's target revision, because
                  # MergeDirective.revision_id is authoritative.
-                info.install_revisions(target_repo)
+                info.install_revisions(target_repo, 
memory_friendly=False)

I think conceptually this is ok; as you say it might be nice to see if 
we
can have the iterable interface be just as fast so we can use it always.

'memory_friendly' is not an ideal name because reading this call I think
'well why not be friendly?' :-)  Maybe it should be called 
'stream_input'
or 'decode_entire_file' or 'stream_decompress' etc.  Something that 
gives
a better idea of what the tradeoff is, rather than something that seems
like it should always be set.

+            if (repo_kind, file_id) !=  ('file', current_file) and\
+                len(pending_file_records) > 0:

There's an extra space after the !=, and it might be better with
parentheses rather than a backslash.  John says he dislikes the 
backslash.

@@ -106,7 +106,7 @@
                  if block is None:
                      continue
                  i, j, n = block
-                while j + n < cur_line:
+                while j + n <= cur_line:
                      block = cur_block[p] = next_block(p)
                      if block is None:
                          break

I guess this is a bug fix?  I don't think you mentioned it in your merge
request?  I see you do have the commit message

> Fix benign off-by-one error generating mpdiffs

Is it really harmless?  Maybe it just makes the matched regions too 
small?

+            version_text = self.add_lines(version, parent_ids, lines,
+                vf_parents, left_matching_blocks = 
left_matching_blocks)

There shouldn't be spaces around the = in the kwargs.

+                           left_matching_blocks=None):

Can you please add a docstring for left_matching_blocks at least in the
base class?  Also it doesn't seem to be directly tested - it should be,
since it's in a public interface.  Why are they _left_ matching blocks -
left parent?

If this is just a hint then it can be a bit tricky to test: maybe you
should pass a value and then hook in to observe that it doesn't redo the
diff?  Maybe this shouldn't be a public parameter?  Or we could at least
make sure that the parameter exists and accepts None.




For details, see: 
http://bundlebuggy.aaronbentley.com/request/%3C46A63CEA.9060409%40utoronto.ca%3E



More information about the bazaar mailing list