[MERGE] Improvements to find_unique_ancestors()

John Arbash Meinel john at arbash-meinel.com
Wed May 21 03:18:11 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ian Clatworthy wrote:
| John Arbash Meinel wrote:
|> Ian Clatworthy wrote:
|
|> | 1. find_unique_ancestors() is simply too long - it covers 4+ pages
|> |    when printed - making it unmaintainable by people like me with
|> |    smaller brains. I'm aware of the performance emphasis here but
|> |    still feel it could be broken up into a master method that calls
|> |    two or three helper methods. Those helper methods can contain the
|> |    loops, rather than being called inside the loops, so performance
|> |    shouldn't be impacted by a noticeable amount.
|>
|> I did this. It *might* be slightly faster with the current refactoring.
|> I don't
|> see any specific reason why it would be, but in my testing I see 128ms
|> => 118ms.
|> The only thing I can think of might be memory pressure. Or maybe an
|> "if()" check
|> is cheaper than a no-op _mutter() call.
|
| Wow - what a difference in clarity after refactoring. Thanks! I found
| a bug in the refactoring such that the end algorithm differs - it now
| (accidentally) steps the all_unique_searcher 9 out of 10 times instead of
| 1 out of 10 times. Wouldn't expect that to speed things up though?

It seems that 1 in 10 might be too infrequent. It is mostly a tuning
parameter: if you get a graph that needs all_unique to converge, you want it
to step, but I believe most graphs don't.

That impression comes from my testing, where I found that with STEP=1 the
search had covered the whole graph by the time one of the common threads
converged. I probably tuned it too heavily for that specific graph.

Using 10, my avg time is 0.053s, and my max time is 1.898s
Using 5, my avg time is 0.051s, and my max time is 2.25s

The max time is on the same revision, which is probably the one I was tuning
against originally. So over all 4000 tests, STEP=5 is slightly faster on
average, while STEP=10 is better for that specific worst case. These numbers
are fairly reproducible, though I won't claim they are outside of the noise
margin.

Also, these numbers don't include latency to a remote host or pack index
loading time. The sets have already been loaded into a dict, so this is
mostly the computation time. If you include Pack.get_parent_map() overhead, I
would expect the 'sweet spot' to shift a bit. For now, I'll go with 5.
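For anyone following along, the STEP idea above can be sketched as follows. This is an illustrative reduction, not the actual bzrlib code: the main loop steps the per-tip searchers every iteration, but only advances the combined all_unique search once every STEP iterations.

```python
# Illustrative sketch of the STEP tuning parameter discussed above.
# The function returns which iterations of the main loop would also
# step the all_unique searcher; names here are hypothetical.
STEP = 5

def all_unique_step_iterations(total_iterations, step_every=STEP):
    """Return the main-loop iterations on which all_unique would step."""
    stepped = []
    for iteration in range(1, total_iterations + 1):
        # The per-tip searchers step every iteration; all_unique only
        # steps when the iteration count hits a multiple of step_every.
        if iteration % step_every == 0:
            stepped.append(iteration)
    return stepped
```

With STEP=5 the all_unique searcher does 1/5th of the stepping work of the tip searchers; raising STEP trades worst-case convergence on pathological graphs for less work in the common case.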



|
| bb: tweak
|
|> +        return newly_seen_common, newly_seen_unique,
|
| Trailing comma can go here.
|
|> +    def _find_nodes_common_to_all_unique(self, unique_tip_searchers,
|> +                                         all_unique_searcher,
|> +                                         newly_seen_unique, step_all_unique):
|> +        """Find nodes that are common to all unique_tip_searchers.
|> +
|> +        If it is time, step the all_unique_searcher, and add its nodes to the
|> +        result.
|> +        """
|> +        unique_are_common_nodes = newly_seen_unique.copy()
|> +        for searcher in unique_tip_searchers:
|> +            unique_are_common_nodes.intersection_update(searcher.seen)
|> +        unique_are_common_nodes.intersection_update(
|> +                                    all_unique_searcher.seen)
|
| I'd really prefer it if unique_are_common_nodes was renamed to
| common_to_all_unique_nodes, at least within this particular method.

Done. I prefer that name as well, though I find it is hard to fit these
variables in a single 80 char line anymore.

|
|> +        # Step all-unique less frequently than the other searchers.
|> +        # In the common case, we don't need to spider out far here, so
|> +        # avoid doing extra work.
|> +        if not step_all_unique:
|
| You've changed step_all_unique from a counter (outside this method) to
| a boolean (inside this method). That's what I was expecting which is
| good. You need to invert this test now though.

Yeah.

|
|> +            tstart = time.clock()
|> +            nodes = all_unique_searcher.step()
|> +            unique_are_common_nodes.update(nodes)
|> +            tdelta = time.clock() - tstart
|> +            if 'graph' in debug.debug_flags:
|> +                trace.mutter('all_unique_searcher step() took %.3fs'
|> +                             'for %d nodes (%d total), iteration: %s',
|> +                             tdelta, len(nodes), len(all_unique_searcher.seen),
|> +                             all_unique_searcher._iterations)
|> +        return unique_are_common_nodes
|
| The calculation of tdelta can move inside the if block.
|
|
|> | As a minor note, several of the _mutter calls probably ought to be
|> | using %d instead of %s. It's pretty minor compared to the issues above
|> | though.
|>
|> I didn't do this, since "%d" % (int) == "%s" % (int). The reasons for %d
|> are to
|> handle "%2.2d" or to round from a float. %s "always works" which means I
|> won't
|> accidentally get an error if I change an int into an object. I'll do it
|> if you
|> think it adds clarity, though.
|
| I think it does read better when integers are given as %d in format
| strings.

I went ahead and switched all of them except the "_iterations" value. It
should be an integer, but the others come from len(), which is guaranteed to
return an integer.
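To make the %s vs %d distinction concrete: for a plain int the two conversions produce identical output, and %d only matters for zero-padding/width or when truncating a float, while %s never raises if the value's type changes.

```python
# %s and %d are interchangeable for plain ints.
print("%d" % 7 == "%s" % 7)  # identical output for an int
print("%02d" % 7)            # zero-padding requires %d
print("%d" % 3.9)            # %d truncates a float toward zero
print("%s" % 3.9)            # %s keeps the value as-is
```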

|
| Well done on this overall. The improvements in speed and code clarity in
| this area are really welcome.
|
| Ian C.
|

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkgzhmMACgkQJdeBCYSNAAPdJACdH9yfawbsfYTmOKLWpHgKz43h
vdgAoJ0bLnjuHQn2aW0U4TQepU6fLVv7
=rz0e
-----END PGP SIGNATURE-----



More information about the bazaar mailing list