[MERGE] Use a LRUCache in LocalTransport.clone

Aaron Bentley aaron at aaronbentley.com
Fri Jan 4 18:09:28 GMT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
> This is a simple change to use a LRUCache in LocalTransport.clone.  It seems
> that calculating an abspath can be quite expensive.

Yes, I've also observed that abspath takes a long time for the
find_branches API.  It doesn't seem really reasonable.

> By keeping the LRUCache I
> reduce the time to pull a 300 revisions from a knit repo into a pack repo from
> ~32 seconds to ~26 seconds!

Nice.

> Specifically, I'm testing with a Twisted trunk import made with bzr-svn which is
> in “Bazaar Knit Repository Format 3 (bzr 0.15)” format, and timing using this
> simple-minded command:
> 
>     time (bzr init --rich-root-pack foo && cd foo && bzr pull -r 300 /tmp/twisted-trunk)
> 
> (So it's also building the working tree, so the 32s vs 26s difference slightly
> understates the relative improvement).

For benchmarking purposes, it might make sense to create a treeless
branch, e.g. by sticking a no-trees shared repo above foo.

> Or perhaps clone should just be faster; by
> calling self.abspath it invokes local_path_to_url, and in constructing a new
> LocalTransport it also invokes local_path_from_url... it seems like one of those
> ought to be unnecessary in an ideal world.

I think clone ought to be much, much faster.  We shouldn't pay a huge
performance penalty for doing things the "right" way.

> +            try:
> +                t = _local_transport_lru[key]
> +            except KeyError:
> +                abspath = self.abspath(offset)
> +                if abspath == 'file://':
> +                    # fix upwalk for UNC path
> +                    # when clone from //HOST/path updir recursively
> +                    # we should stop at least at //HOST part
> +                    abspath = self.base
> +                t = LocalTransport(abspath)
> +                _local_transport_lru[key] = t
> +            return t

It seems to me like our use cases for LRU caching are going to be pretty
redundant and boring.  So what about a decorator?  e.g.

@with_lru_cache(max_size=6000)
def _abspath():

Also, depending on the hit/miss ratio, dict.get may be a better choice
than dict[] and catching exceptions.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHfnZY0F+nu1YWqI0RAnvRAJ9GHGCqpfIU1Gi2Kd/kK+TM4MC/KACcC+hh
PuyKBR6O6CdWlPCQYROnycw=
=wRTL
-----END PGP SIGNATURE-----



More information about the bazaar mailing list