Longest Common Subsequences code
John Arbash Meinel
john at arbash-meinel.com
Thu Nov 2 15:03:26 GMT 2006
Cheuksan Edward Wang wrote:
> On 11/2/06, John Arbash Meinel <john at arbash-meinel.com> wrote:
>
> > 1) Delta compression of full-texts. Minimal diffs would make this
> > better, but I would guess only marginally so. ATM we are gzip
> > compressing the final hunks anyway. The most important thing is that a
> > change of 2 lines to a 100Kline source file should be ~2-lines, not
> > 100Klines. And I think all algorithms will give us that.
>
> Unfortunately, this problem can theoretically happen with patience diff.
> If people do run into it, we might need to use something else.
>
> Cheuksan Edward Wang
The scenario I can come up with is a 100K-line file where every line is
duplicated somewhere, and you change the first and last line. Patience
diff would then have no unique lines to anchor on, so what should have
been a 2-line diff would indeed come out much too long.
I have the feeling patience-diff could be updated to avoid that sort of
pathological behavior without needing a major overhaul. But for now, I'm
content with what we have.
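To make the scenario concrete, here is a minimal sketch of just the anchoring step, assuming the usual patience-diff rule that only lines occurring exactly once in both texts can serve as anchors. It is not the bzr implementation; the helper name unique_common_lines and the sample data are made up for illustration, and the file is scaled down to 100 lines.

```python
from collections import Counter


def unique_common_lines(a, b):
    """Return the lines of a that occur exactly once in a and exactly once in b.

    Hypothetical helper: these are the only lines a patience-style diff
    can use as anchors before it recurses between them.
    """
    count_a = Counter(a)
    count_b = Counter(b)
    return [line for line in a if count_a[line] == 1 and count_b[line] == 1]


# Pathological case: every line of the old file appears twice.
old = ["line %d" % (i % 50) for i in range(100)]
new = ["changed first"] + old[1:-1] + ["changed last"]
print(len(unique_common_lines(old, new)))  # prints 0

# Ordinary case: every line is unique, same two-line edit.
old_unique = ["line %d" % i for i in range(100)]
new_unique = ["changed first"] + old_unique[1:-1] + ["changed last"]
print(len(unique_common_lines(old_unique, new_unique)))  # prints 98
```

With no anchors, and with the first and last lines changed so that trimming a common head or tail does not help either, an implementation that relies only on unique lines has nothing to match on, which is where the oversized diff would come from.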
If you have a good feeling about doing things differently, I'll certainly
listen.
John
=:->