[MERGE] Patience diff

John Arbash Meinel john at arbash-meinel.com
Tue May 23 01:40:31 BST 2006


Aaron Bentley wrote:
> John Arbash Meinel wrote:
> | First off, in your 'patience-test.py' you do:
> |
> |   new_file = file(file_path, 'rb')
> |   for line_a, line_b in zip(new_file, new_lines):
> |       assert line_a == line_b
> |
> | Which is great, except 'zip()' stops as soon as one of the generators
> | runs out. So if one has extra stuff at the end, you won't detect it.
> | Since 'assert line_a == line_b' doesn't actually say anything about what
> | is different, I would tend to change it to:
> |
> | new_file = file(file_path, 'rb')
> | assert list(new_file) == new_lines
> 
> Makes sense.
> 
> |
> | We should go through and make sure that our copyright statements have
> | 2006, and use # all the way through. (bzrlib/cdv/__init__.py doesn't
> | have the right lines in the beginning)
> 
> It's also attributed to Canonical, and I think it should be attributed
> to Bram Cohen, at least in part.

Well __init__.py is my addition, and can be attributed to Canonical. But
nofrillsprecisemerge.py should be attributed to Bram.

> 
> | We should also check that we have the latest 'nofrillsprecisemerge.py'
> | files. I tried hard to make sure I was using an unmodified version so
> | that any updates could be easily obtained.
> 
> Unfortunately, there's a lot of stuff in the original
> nofrillsprecisemerge.py that's not related to what this patcheset is
> meant to do.  I removed a bunch of stuff we're not (yet) using, because
> it seems easier to make single-purpose changes to bzr, rather doing
> several things at once.

As you wish.

> 
> | We probably should change the names from 'cdv' to 'patience'. I'm not
> | stuck on either, but I'm guessing Codeville isn't the right name to use.
> 
> Perhaps we should pull the nofrills stuff into cdvdifflib, and rename it
> to bzrlib/patiencediff.py

I would be okay with it going either way.

> 
> | In merge3.py we have a commented out import. I think we prefer to just
> | remove them now.
> 
> Right.
> 
> | We also used to use a junk expression so that lines that only contained
> | space,tab or # would be ignored. It probably isn't needed anymore, but
> | we should be aware that the junk_re should be deleted.
> |
> | test_diff.py has a bunch of pep8 whitespace issues.
> |
> | We are using the new cdv diff for all of the Weave code, but it doesn't
> | look like anything has been done to use it for the 'knit' code. (It
> | should still be used for 'bzr diff')
> 
> The knit code has its own SequenceMatcher implementation, so I was
> unsure whether we should replace it.

Well, ultimately I think this is just a tuned version of the original
SequenceMatcher. I see that Robert was the one making the tuning, so we
should ask him. But it doesn't benefit us much to include Patience diff
if our primary storage mechanism isn't going to use it.

I wasn't part of the tuning, so I don't know what it involved, or how
much work it would be for him to tune Patience diff.

> 
> | At one point, I was thinking we could provide a flag to switch between
> | the diff engines. Is there any interest in that?
> 
> I don't think it would be used often enough to put it in.
> 
> | So, +0.5 from me. It just needs a little bit of cleanup still.
> 
> Okay.
> 
> Aaron

John
=:->

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 254 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060522/ba104377/attachment.pgp 


More information about the bazaar mailing list