Rev 3712: Change the per-file ancestry code in log.py in http://bzr.arbash-meinel.com/branches/bzr/1.8-dev/lighter_log_file
John Arbash Meinel
john at arbash-meinel.com
Thu Sep 18 17:16:01 BST 2008
At http://bzr.arbash-meinel.com/branches/bzr/1.8-dev/lighter_log_file
------------------------------------------------------------
revno: 3712
revision-id: john at arbash-meinel.com-20080918161558-745a1vuxechgcq8h
parent: pqm at pqm.ubuntu.com-20080917230446-p0wippqwikt511sp
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: lighter_log_file
timestamp: Thu 2008-09-18 11:15:58 -0500
message:
Change the per-file ancestry code in log.py
The old code would create a new set() object whenever there
was more than one parent. However, many of those cases did
not introduce any new per-file revisions, and thus was
wasted memory.
-------------- next part --------------
=== modified file 'NEWS'
--- a/NEWS 2008-09-17 22:22:25 +0000
+++ b/NEWS 2008-09-18 16:15:58 +0000
@@ -13,6 +13,12 @@
IMPROVEMENTS:
+ * The per-file log algorithm now uses a lighter-weight way of tracking
+ when changes to a file are merged into another revision. The old
+ method created a new set at every merge point, which was wasteful
+ when the merge did not introduce new revisions.
+ (John Arbash Meinel)
+
BUG FIXES:
* Branching from a shared repository on a smart server into a new
=== modified file 'bzrlib/log.py'
--- a/bzrlib/log.py 2008-09-11 17:13:46 +0000
+++ b/bzrlib/log.py 2008-09-18 16:15:58 +0000
@@ -570,26 +570,42 @@
sorted_rev_list = tsort.topo_sort(parent_map.items())
text_keys = [(file_id, rev_id) for rev_id in sorted_rev_list]
modified_text_versions = branch.repository.texts.get_parent_map(text_keys)
+ # Ancestry will contain the set of all changes to the given file_id, for
+ # every revision in the whole tree.
ancestry = {}
for rev in sorted_rev_list:
text_key = (file_id, rev)
parents = parent_map[rev]
- if text_key not in modified_text_versions and len(parents) == 1:
- # We will not be adding anything new, so just use a reference to
- # the parent ancestry.
- rev_ancestry = ancestry[parents[0]]
+ rev_ancestry = None
+ if text_key in modified_text_versions:
+ rev_ancestry = set([rev])
+ for parent in parents:
+ if parent not in ancestry:
+ # parent is a Ghost, which won't be present in
+ # sorted_rev_list, but we may access it later, so create an
+ # empty node for it
+ ancestry[parent] = frozenset()
+ if rev_ancestry is None:
+ # We don't have anything worked out yet, so just copy this
+ # parent's ancestry
+ rev_ancestry = ancestry[parent]
+ else:
+ # Check to see if this other parent introduces any new
+ # revisions. If it doesn't then we don't need to create a new
+ # set.
+ parent_ancestry = ancestry[parent]
+ new_revisions = parent_ancestry.difference(rev_ancestry)
+ if new_revisions:
+ # We need to create a non-frozen set so that we can
+ # update it. If we already have a non-frozen set, then we
+ # are free to modify it.
+ if isinstance(rev_ancestry, frozenset):
+ rev_ancestry = set(rev_ancestry)
+ rev_ancestry.update(new_revisions)
+ if rev_ancestry is None:
+ ancestry[rev] = frozenset()
else:
- rev_ancestry = set()
- if text_key in modified_text_versions:
- rev_ancestry.add(rev)
- for parent in parents:
- if parent not in ancestry:
- # parent is a Ghost, which won't be present in
- # sorted_rev_list, but we may access it later, so create an
- # empty node for it
- ancestry[parent] = set()
- rev_ancestry = rev_ancestry.union(ancestry[parent])
- ancestry[rev] = rev_ancestry
+ ancestry[rev] = frozenset(rev_ancestry)
def is_merging_rev(r):
parents = parent_map[r]
More information about the bazaar-commits
mailing list