Fastest way to find repository heads?

John Arbash Meinel john at arbash-meinel.com
Thu Jul 9 00:06:06 BST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> Jelmer Vernooij wrote:
>> When talking to a git smart server and trying to figure out which
>> revisions to fetch, bzr-git has to provide the server with all of the
>> heads in the repository so that the client and the server can negotiate
>> which revisions have to be transferred.
> 
>> What is the best way of doing this? At the moment I'm using this:
> 
>> heads = repo.get_graph().heads(repo.all_revision_ids())
> 
>> But this is taking quite some time for large repositories, is there
>> perhaps a more efficient way of doing this?
> 
>> Alternatively, I guess the repository could store this information, but
>> that would obviously require a new repository format.
> 
>> Cheers,
> 
>> Jelmer
> 
> I'll mention that KnownGraph is probably a lot better about this. So doing:
> 
> all_revs = repo.all_revision_ids()
> parent_map = repo.get_parent_map(all_revs)
> graph = graph.KnownGraph(parent_map)
> heads = graph.heads(all_revs)
> 
> John
> =:->

KnownGraph should be O(N) while Graph is probably N^2 in places.
However, for what you are doing I just realized:

all_revs = repo.all_revision_ids()
parent_map = repo.get_parent_map(all_revs)
all_parents = set()
map(all_parents.update, parent_map.itervalues())
heads = set(all_revs) - all_parents

That should be the fastest, and still give correct answers.

(A head being any revision that is not referenced as a parent of some
other revision.)

The Graph objects do more work because they are trying to count without
walking every node. But since you already *have* every node, there is no
benefit.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkpVJl4ACgkQJdeBCYSNAAN/9wCgvoDSbBG3ciMJZ7hUHDap9N0k
1uMAn1XgZPlO31/Lf7msGIctmlHOodCi
=LSYS
-----END PGP SIGNATURE-----



More information about the bazaar mailing list