Rev 3712: Change the per-file ancestry code in log.py in http://bzr.arbash-meinel.com/branches/bzr/1.8-dev/lighter_log_file

John Arbash Meinel john at arbash-meinel.com
Thu Sep 18 17:16:01 BST 2008


At http://bzr.arbash-meinel.com/branches/bzr/1.8-dev/lighter_log_file

------------------------------------------------------------
revno: 3712
revision-id: john at arbash-meinel.com-20080918161558-745a1vuxechgcq8h
parent: pqm at pqm.ubuntu.com-20080917230446-p0wippqwikt511sp
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: lighter_log_file
timestamp: Thu 2008-09-18 11:15:58 -0500
message:
  Change the per-file ancestry code in log.py
  
  The old code would create a new set() object whenever there
  was more than one parent. However, many of those cases did
  not introduce any new per-file revisions, and thus was
  wasted memory.
-------------- next part --------------
=== modified file 'NEWS'
--- a/NEWS	2008-09-17 22:22:25 +0000
+++ b/NEWS	2008-09-18 16:15:58 +0000
@@ -13,6 +13,12 @@
 
   IMPROVEMENTS:
 
+    * The per-file log algorithm now uses a lighter-weight way of tracking
+      when changes to a file are merged into another revision. The old
+      method created a new set at every merge point, which was wasteful
+      when the merge did not introduce new revisions.
+      (John Arbash Meinel)
+
   BUG FIXES:
 
     * Branching from a shared repository on a smart server into a new

=== modified file 'bzrlib/log.py'
--- a/bzrlib/log.py	2008-09-11 17:13:46 +0000
+++ b/bzrlib/log.py	2008-09-18 16:15:58 +0000
@@ -570,26 +570,42 @@
     sorted_rev_list = tsort.topo_sort(parent_map.items())
     text_keys = [(file_id, rev_id) for rev_id in sorted_rev_list]
     modified_text_versions = branch.repository.texts.get_parent_map(text_keys)
+    # Ancestry will contain the set of all changes to the given file_id, for
+    # every revision in the whole tree.
     ancestry = {}
     for rev in sorted_rev_list:
         text_key = (file_id, rev)
         parents = parent_map[rev]
-        if text_key not in modified_text_versions and len(parents) == 1:
-            # We will not be adding anything new, so just use a reference to
-            # the parent ancestry.
-            rev_ancestry = ancestry[parents[0]]
+        rev_ancestry = None
+        if text_key in modified_text_versions:
+            rev_ancestry = set([rev])
+        for parent in parents:
+            if parent not in ancestry:
+                # parent is a Ghost, which won't be present in
+                # sorted_rev_list, but we may access it later, so create an
+                # empty node for it
+                ancestry[parent] = frozenset()
+            if rev_ancestry is None:
+                # We don't have anything worked out yet, so just copy this
+                # parent's ancestry
+                rev_ancestry = ancestry[parent]
+            else:
+                # Check to see if this other parent introduces any new
+                # revisions. If it doesn't then we don't need to create a new
+                # set.
+                parent_ancestry = ancestry[parent]
+                new_revisions = parent_ancestry.difference(rev_ancestry)
+                if new_revisions:
+                    # We need to create a non-frozen set so that we can
+                    # update it. If we already have a non-frozen set, then we
+                    # are free to modify it.
+                    if isinstance(rev_ancestry, frozenset):
+                        rev_ancestry = set(rev_ancestry)
+                    rev_ancestry.update(new_revisions)
+        if rev_ancestry is None:
+            ancestry[rev] = frozenset()
         else:
-            rev_ancestry = set()
-            if text_key in modified_text_versions:
-                rev_ancestry.add(rev)
-            for parent in parents:
-                if parent not in ancestry:
-                    # parent is a Ghost, which won't be present in
-                    # sorted_rev_list, but we may access it later, so create an
-                    # empty node for it
-                    ancestry[parent] = set()
-                rev_ancestry = rev_ancestry.union(ancestry[parent])
-        ancestry[rev] = rev_ancestry
+            ancestry[rev] = frozenset(rev_ancestry)
 
     def is_merging_rev(r):
         parents = parent_map[r]



More information about the bazaar-commits mailing list