Making diff fast (was Re: Some notes on distributed SCM)

Martin Pool mbp at sourcefrog.net
Mon Apr 11 01:57:26 BST 2005


On Sun, 2005-04-10 at 20:36 -0400, Chris Mason wrote:

> I seem to be in the minority of people who hate stats.  I think <other SCMs> 
> are just so slow that people think stating the whole tree in bzr/arch/git 
> feels fast ;)  This is why quilt feels so fast, it really is O(change) for 
> everything.

Right, so you have to tell quilt when you modify something, or it gets
confused.  For people who want that we can certainly have such a mode,
and if you hook it into your editor it shouldn't be a pain.  I guess if
you're going to apply a patch you can just "bzr edit .".   Unlike bk, I
think the default should certainly be to give people editable
files.

Even better, I was thinking on Friday that perhaps there should be a
command to directly load a patch.  It's inefficient to walk the whole
tree after patch to find out what happened when patch kindly gives us a
list of modified files already.
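
As a rough illustration of the idea, here is a minimal Python sketch that
pulls the list of touched files out of a unified diff, so only those paths
need to be examined afterwards.  The function names and the "strip"
parameter (modelled on patch's -p option) are hypothetical and not part of
bzr.  It assumes every "--- " line immediately followed by a "+++ " line is
a file header, which a removed/added line pair inside a hunk could in
principle fool.

```python
def _header_path(line):
    # Header lines look like '+++ b/path/to/file<TAB>timestamp'.
    return line[4:].split("\t", 1)[0].strip()


def files_touched_by_patch(patch_text, strip=1):
    """Return the paths named in the file headers of a unified diff.

    Hypothetical helper: 'strip' drops that many leading path
    components, like 'patch -pN'.
    """
    touched = []
    old_path = None
    for line in patch_text.splitlines():
        if line.startswith("--- "):
            old_path = _header_path(line)
            continue
        if line.startswith("+++ ") and old_path is not None:
            new_path = _header_path(line)
            # For a deleted file the new side is /dev/null, so the
            # old side names the file that changed.
            path = old_path if new_path == "/dev/null" else new_path
            path = "/".join(path.split("/")[strip:])
            if path and path not in touched:
                touched.append(path)
        old_path = None
    return touched
```

A caller would then stat only the returned paths instead of walking the
whole tree.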

To some extent I think it's a shortcoming of the kernel interface that
stat should be so slow.  Even with a cold cache, 12000 stats shouldn't
be reading *that* many blocks from disk, and one would hope at least
some of the inode data is contiguous.  Maybe eventually we could get
some kind of readdir-like call that also returns the stat information in
one go...
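
For comparison, a minimal sketch of the stat-the-whole-tree approach in
Python, assuming a version that provides os.scandir.  On many platforms
scandir gets the file type from the directory listing itself, so is_dir()
and is_file() usually need no extra system call; on Unix, entry.stat()
still costs one call per file, so this is only part of the way to a
readdir that returns stat data.  The function names are hypothetical.

```python
import os


def tree_fingerprints(root):
    """Map relative path -> (size, mtime_ns) for regular files under root."""
    result = {}
    stack = [root]
    while stack:
        directory = stack.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # One stat per file; the result is cached on the entry.
                    st = entry.stat(follow_symlinks=False)
                    rel = os.path.relpath(entry.path, root)
                    result[rel] = (st.st_size, st.st_mtime_ns)
    return result


def changed_paths(old, new):
    """Paths added, removed, or whose (size, mtime) differs between scans."""
    return sorted(p for p in old.keys() | new.keys()
                  if old.get(p) != new.get(p))
```

Comparing a saved scan against a fresh one gives the candidate set of
modified files, at a cost proportional to the size of the tree rather than
the size of the change.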

> > I've already been running bzr under "strace -c" to make sure that as much
> > as possible the system call counts are a low multiple of the number of
> > source files.
> 
> The attached patch replaces my minor diff improvement from last night.  It 
> eliminates all the extra stats from bzr diff file1 file2 ....  It won't play 
> nice with renames, but I'm happy to rework it a little later if you want 
> someone to go on a stat hunt.

Thanks.  diff_trees() is already rather suboptimal because of a
probably misguided attempt to generate the two trees in order.

-- 
Martin
