[RFC] Refactor diffing
Robert Collins
robertc at robertcollins.net
Thu Nov 22 08:54:40 GMT 2007
On Wed, 2007-11-21 at 23:47 -0500, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Here's an updated version. A couple of things:
> 1. several parts of Bazaar treat binary file handling as part of the
> text diff protocol, so having a separate BinaryFileDiffer breaks
> compatibility, and doesn't seem like a great improvement.
Uhm, I was mainly thinking here about clarity of code structure; if its
better rolled into one, so be it. You're cutting the code :).
> 2. I'm not sure the tri-value return
> (unchanged/changed/could-not-handle) makes sense. We already know
> whether the file is changed, and we wouldn't be diffing it if it was
> unchanged. So I think this could just be True for "could handle" and
> False for "could not handle".
I was specifically thinking of file formats where the representation can
change but the value might not. For instance two open office documents,
which are gzipped, and thus have a datestamp in the gzip header but may
have the same decompressed contents.
> Thoughts?
It looks pretty clean to me. I think you should add docstrings to the
classes and methods to capture intent about how its structured. Other
than that the only thing missing is a mechanism for plugins to insert
new differ's and it will be really cool and extensible.
One thing that might make that easier to do is to require a standard
factory signature for all Differ's - specifically I'm thinking of the
label and path_encoding and internal_diff parameters, which currently
are only on TextDiffer.
So I'm thinking TreeDiffer.__init__ has something like:
differs = []
for differ_factory in TreeDiffer._differ_factories:
differs.append(differ_factory(old_tree, new_tree, old_label,
new_label, path_encoding, to_file, internal_diff)
differs.append(KindChangeDiff(differs))
and _differ_factories is statically initialised as
[SymlinkDiff, TextDiff]
Cheers,
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20071122/74d2011f/attachment.pgp
More information about the bazaar
mailing list