Fastest way to find repository heads?

John Arbash Meinel john at arbash-meinel.com
Thu Jul 9 00:37:58 BST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> John Arbash Meinel wrote:
...

>> I'll mention that KnownGraph is probably a lot better about this. So doing:
> 
>> all_revs = repo.all_revision_ids()
>> parent_map = repo.get_parent_map(all_revs)
>> graph = graph.KnownGraph(parent_map)
>> heads = graph.heads(all_revs)
> 
>> John
>> =:->
> 
> KnownGraph should be O(N) while Graph is probably N^2 in places.
> However, for what you are doing I just realized:
> 
> all_revs = repo.all_revision_ids()
> parent_map = repo.get_parent_map(all_revs)
> all_parents = set()
> map(all_parents.update, parent_map.itervalues())
> heads = set(all_revs) - all_parents
> 
> That should be the fastest, and still give correct answers.

To give some numbers:


 330ms	all_revs = repo.all_revision_ids()
1700ms  parent_map = repo.get_parent_map(all_revs)
 	g = graph.KnownGraph(parent_map)
 140ms  heads = g.heads(all_revs)

versus
        g = repo.get_graph()
18400ms g.heads(all_revs)

versus
        parent_map = repo.get_parent_map(all_revs)
        all_parents = set()
        map(all_parents.update, parent_map.itervalues())
  30ms  heads = set(all_revs) - all_parents


I'll note that in the 'fast' case, the slowest thing is
'get_parent_map()'. We could actually make a "get_all_parent_map()" that
would be as fast as all_revision_ids(), we just haven't had a need.
(all_revision_ids reads the whole btree and throws away the info,
because we don't do any caching in BTreeIndex.iter_all_entries().)
get_parent_map() is slower than all_revision_ids() because it does the
same work *plus* filtering the lists into subsets based on the btree
index, etc.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpVLdYACgkQJdeBCYSNAANmywCgvmzva2dhoEx5ZEfx7tF3nHbV
FGEAnRIoKQ7KUXiVGbNNNdbNV9sjE89H
=oPgN
-----END PGP SIGNATURE-----



More information about the bazaar mailing list