One case where patience diff does much better

John A Meinel john at arbash-meinel.com
Sun Feb 26 22:42:31 GMT 2006


Martin Pool wrote:
> On 26 Feb 2006, John Arbash Meinel <john at arbash-meinel.com> wrote:
> 
>> And he has a point that the matching is a little unsure, but Aaron & I
>> both felt that the 'aaa' should be matched.
>>
>> (The unsure case is something like XaaaaY => QaaaP, which 'a' was
>> deleted? Or even worse XaaaaY => QaaPa, which one was deleted, was P
>> inserted, or were 2 lines deleted and one more 'a' was added.)
>>
>> Because of the disagreement, I fall back to difflib (which will find
>> those matches), after using Patience in the first pass.
> 
> Do you know if that fallback has much effect on speed?
> 

I would guess it has quite a bit of effect for files which are very
different. (I only fall back for unmatched ranges with length > 2.)
So if you were diffing an unmodified text, there would be no penalty.
If you diffed 2 files whose contents didn't match at all, you would have
one pass with Patience and one pass with difflib. (Maybe. Patience
special-cases the beginning and end of the file to consider them
matches, so it might act a little bit differently, but you could
certainly create input that makes it pass over the text twice.)
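
To make the two passes concrete, here is a minimal sketch of the idea. It is
not the actual bzr code. The names unique_lcs and two_pass_matches are
made up for illustration, the "length > 2" test is assumed to apply to the
longer side of an unmatched range, and the special-casing of the beginning
and end of the file is left out. Only difflib.SequenceMatcher is a real
library API here.

```python
import difflib
from bisect import bisect_left
from collections import Counter


def unique_lcs(a, b):
    """Patience pass (sketch): pair up lines that occur exactly once in
    both a and b, then keep the longest run of pairs that is increasing
    in both files."""
    count_a, count_b = Counter(a), Counter(b)
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    pairs = [(i, pos_b[line]) for i, line in enumerate(a)
             if count_a[line] == 1 and line in pos_b]

    # Longest increasing subsequence on the b positions (patience sorting).
    tails = []      # tails[k]: index in pairs ending the best run of length k+1
    tail_vals = []  # b position of each entry in tails, kept sorted
    prev = [None] * len(pairs)
    for n, (_, j) in enumerate(pairs):
        k = bisect_left(tail_vals, j)
        prev[n] = tails[k - 1] if k else None
        if k == len(tails):
            tails.append(n)
            tail_vals.append(j)
        else:
            tails[k] = n
            tail_vals[k] = j

    result = []
    n = tails[-1] if tails else None
    while n is not None:
        result.append(pairs[n])
        n = prev[n]
    result.reverse()
    return result


def two_pass_matches(a, b):
    """Return (matches, fallbacks): matched (index_a, index_b) pairs and
    the number of unmatched ranges that were handed to difflib."""
    matches = []
    fallbacks = 0
    lo_a = lo_b = 0
    # A sentinel anchor at the end makes the trailing gap get processed too.
    for i, j in unique_lcs(a, b) + [(len(a), len(b))]:
        gap_a, gap_b = a[lo_a:i], b[lo_b:j]
        # Assumption: fall back only when the unmatched range is longer than 2.
        if max(len(gap_a), len(gap_b)) > 2:
            fallbacks += 1
            matcher = difflib.SequenceMatcher(None, gap_a, gap_b)
            for m in matcher.get_matching_blocks():
                matches.extend((lo_a + m.a + k, lo_b + m.b + k)
                               for k in range(m.size))
        if i < len(a):  # skip the sentinel
            matches.append((i, j))
        lo_a, lo_b = i + 1, j + 1
    return matches, fallbacks


# XaaaaY => QaaaP: no line is unique in both files, so the Patience pass
# finds nothing and the whole text goes through difflib a second time.
print(two_pass_matches(list("XaaaaY"), list("QaaaP")))
```

In this sketch an unmodified text whose lines are all unique never reaches
difflib, while the XaaaaY => QaaaP input is scanned by both passes, and
difflib is what recovers the 'aaa' match.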

I didn't do any performance comparison with and without difflib, though.
So I don't know how it compares in the average case.

John
=:->

