Making diff fast (was Re: Some notes on distributed SCM)

Chris Mason mason at suse.com
Mon Apr 11 00:11:52 BST 2005


On Sunday 10 April 2005 19:02, Daniel Phillips wrote:
> On Sunday 10 April 2005 18:47, Aaron Bentley wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > Daniel Phillips wrote:
> > > On Sunday 10 April 2005 18:17, Aaron Bentley wrote:
> > >>This is what Arch does, and it's quite slow on large trees.  Robert
> > >>Collins has recently improved this in Baz, but it doesn't change the
> > >>fact that it's an O(versioned files) operation, rather than O(changed
> > >>files).
> > >
> > > But statting the full working copy kernel tree takes less than .1
> > > second if the dentries are in cache, and it takes less than 5 seconds
> > > to get them in cache.  What is wrong with that?
> >
> > 5 seconds isn't fast enough.  There's a huge psychological difference
> > between 5-second and subsecond response times.  At 5 seconds, it's still
> > an awkward delay.  If you just want a reminder of what you changed, it's
> > annoying.
>
> The 5 seconds is a one-time delay, basically once per turning your machine
> on.

This is only true if your machine does nothing other than source control.
make -j 10 can toss the inodes from cache, as can starting your gui email
client, or ...

When the user doesn't somehow keep track of the files that have changed, he 
should expect the SCM to do a stat on every file in the tree.  I think it is 
enough to make bzr diff, commit etc take a list of files to consider, and 
have it be O(that list) for all operations.

This gives the user control.  If he hates waiting for the stats, he can manage 
a list of changed files, perhaps with an optional bzr edit command.  If he 
hates managing the list of changed files, he can let bzr do all the stats.
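
To make the O(that list) idea concrete, here is a minimal Python sketch.  It
is not bzr code: the function names (snapshot, changed_files), the candidates
parameter, and the choice to compare size and mtime are all assumptions made
for illustration.

```python
import os


def snapshot(root, paths):
    """Record (size, mtime_ns) for each path, relative to root.

    Hypothetical helper: stands in for whatever state the SCM
    stored at the last commit.
    """
    state = {}
    for rel in paths:
        st = os.stat(os.path.join(root, rel))
        state[rel] = (st.st_size, st.st_mtime_ns)
    return state


def changed_files(root, recorded, candidates=None):
    """Return the paths whose stat data differs from the recorded state.

    candidates=None  -> stat every recorded file: O(versioned files).
    candidates=[...] -> stat only the listed files: O(that list).
    """
    if candidates is None:
        candidates = recorded.keys()
    changed = []
    for rel in candidates:
        try:
            st = os.stat(os.path.join(root, rel))
            current = (st.st_size, st.st_mtime_ns)
        except FileNotFoundError:
            current = None  # a missing file counts as changed if it was recorded
        if recorded.get(rel) != current:
            changed.append(rel)
    return sorted(changed)
```

Called with a list of one file, changed_files() does one stat no matter how
big the tree is; called with no list, it does one stat per recorded file.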

-chris





