[MERGE] Faster diff on historical data

Thu Aug 9 22:01:50 BST 2007

On Thu, 2007-08-09 at 12:50 -0500, John Arbash Meinel wrote:
> If I'm reading your numbers correctly, it seems that:
> 
> a) cpatience has a bigger effect than fastdiff
> b) the effects do stack
> c) fastdiff has a bigger effect when you have more history (as I would
> expect).

Actually, it is the other way around (or maybe I misunderstood something).
cpatience is slower than "fastdiff" on X-1..X diffs, and faster on
diffs where the revisions have more deltas that are not shared and there
are more or bigger files to diff.

> I think fastdiff would have less of an effect after my
> "faster_knit_extract" and "pyrex_knit_extract" code gets merged. Because
> it speeds up the time to extract data from a knit. But I only optimized
> the single-text case at the moment. It modifies get_lines() but not
> get_line_list(). At one point I had written it to affect both, but I
> could make the single-file case even better by special casing it, and at
> the moment, it is very rare that we grab more than 1 text at a time. So
> it made an overall smaller patch to just do the special case.
> 
> Actually, with my patch, but without changing get_line_list() I could
> argue that you might see a net loss from 'fastdiff'. It would be pretty
> situation dependent, though.

I've tested it only against "faster_knit_extract", but for revisions
with common fulltext knit entries "fastdiff" was still faster. It was
a little slower on wide diffs between unrelated revisions. I guess the
Pyrex knit extraction code would probably change this significantly,
though.
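
To illustrate why revisions with a common fulltext entry can benefit from
being extracted together, here is a minimal toy sketch. It is not bzrlib's
knit code and not the "fastdiff" patch itself: ToyKnit, extract_one,
extract_many and the delta format are all hypothetical names invented for
this example. It only assumes that a text is rebuilt by applying a chain of
line deltas on top of a fulltext, and counts how many deltas get applied.

```python
import difflib


def make_delta(old, new):
    """Return (start, end, new_lines) ops that turn `old` into `new`."""
    matcher = difflib.SequenceMatcher(None, old, new)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_delta(lines, ops):
    """Apply ops back to front so earlier indices stay valid."""
    result = list(lines)
    for start, end, new_lines in reversed(ops):
        result[start:end] = new_lines
    return result


class ToyKnit:
    """Hypothetical store: each version is a fulltext or a delta on its parent."""

    def __init__(self):
        self.records = {}
        self.delta_applications = 0  # work counter for the comparison below

    def add_fulltext(self, version, lines):
        self.records[version] = ("fulltext", None, list(lines))

    def add_delta(self, version, parent, lines):
        ops = make_delta(self.extract_one(parent), lines)
        self.records[version] = ("delta", parent, ops)

    def extract_one(self, version, cache=None):
        """Rebuild one text; reuse already-built ancestors if a cache is given."""
        if cache is not None and version in cache:
            return cache[version]
        kind, parent, payload = self.records[version]
        if kind == "fulltext":
            lines = list(payload)
        else:
            lines = apply_delta(self.extract_one(parent, cache), payload)
            self.delta_applications += 1
        if cache is not None:
            cache[version] = lines
        return lines

    def extract_many(self, versions):
        """Rebuild several texts, sharing work on their common delta chain."""
        cache = {}
        return [self.extract_one(v, cache) for v in versions]


# Build r0 (fulltext) and r1..r5 (deltas), each adding one line.
knit = ToyKnit()
text = ["line %d\n" % i for i in range(5)]
knit.add_fulltext("r0", text)
for n in range(1, 6):
    text = text + ["added in r%d\n" % n]
    knit.add_delta("r%d" % n, "r%d" % (n - 1), text)

# X-1..X diff with two independent extractions.
knit.delta_applications = 0
old, new = knit.extract_one("r4"), knit.extract_one("r5")
separate = knit.delta_applications  # 9: the r0..r4 chain is walked twice

# Same diff with one shared extraction.
knit.delta_applications = 0
old_shared, new_shared = knit.extract_many(["r4", "r5"])
shared = knit.delta_applications  # 5: r5 reuses the already-built r4
```

In this toy model the saving comes entirely from the shared chain, so two
unrelated revisions with separate fulltexts would gain nothing from the
cache and would only pay its bookkeeping cost.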
