Making diff fast (was Re: Some notes on distributed SCM)

Daniel Phillips phillips at
Mon Apr 11 00:02:07 BST 2005

On Sunday 10 April 2005 18:47, Aaron Bentley wrote:
> Daniel Phillips wrote:
> > On Sunday 10 April 2005 18:17, Aaron Bentley wrote:
> >>This is what Arch does, and it's quite slow on large trees.  Robert
> >>Collins has recently improved this in Baz, but it doesn't change the
> >>fact that it's an O(versioned files) operation, rather than O(changed
> >>files).
> >
> > But statting the full working copy kernel tree takes less than .1 second
> > if the dentries are in cache, and it takes less than 5 seconds to get
> > them in cache.  What is wrong with that?
> 5 seconds isn't fast enough.  There's a huge psychological difference
> between 5-second and subsecond response times.  At 5 seconds, it's still
> an awkward delay.  If you just want a reminder of what you changed, it's
> annoying.

The 5 seconds is a one-time delay, paid roughly once each time you turn your machine on.
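
For anyone who wants to measure this on their own tree, here is a rough sketch of the kind of full-tree stat scan under discussion. It is only an illustration, not what Arch or Baz actually do: the function names (snapshot, changed_paths, timed_snapshot) are made up for this example, and comparing (mtime, size) pairs is an assumed, simplistic way to decide that a file changed. The scan touches every file, so it is O(versioned files) no matter how few of them changed.

```python
import os
import time


def snapshot(root):
    """Map every file os.walk lists under root to its (mtime_ns, size)."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # one stat per file: the O(versioned files) cost
            except OSError:
                continue  # file vanished between listing and stat
            state[os.path.relpath(path, root)] = (st.st_mtime_ns, st.st_size)
    return state


def changed_paths(old, new):
    """Paths added, removed, or whose (mtime, size) differs between snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


def timed_snapshot(root):
    """Return (snapshot, elapsed seconds) for one full scan of root."""
    start = time.perf_counter()
    state = snapshot(root)
    return state, time.perf_counter() - start
```

Calling timed_snapshot twice in a row on the same tree, the first time right after boot, should show the cold-cache versus warm-cache difference directly; the actual numbers will depend on the machine and the tree.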

> If we can find ways to make operations O(changed files), then we avoid
> penalizing people with large projects.
> If subsecond is the common case, that's great.  Otherwise, I really do
> believe there's room for improvement.

The point that has been made several times already is: we should not get hung 
up on premature micro-optimization.  There are several crucial algorithms 
that need to be put in place and functioning reliably within a month.  Though 
all algorithms have to work within tolerable speeds, warming the dcache at 5 
seconds once a day just does not move the needle.  Shaving this down isn't
worth a diversion at this point; there are bigger fish to fry.

Of course, if you've already got a patch...


