Thinking about binary diff and merge

jbowtie at amathaine.com
Tue Sep 28 12:00:16 BST 2010


I'm currently drafting a feature specification for binary diff and
merge. I have a few areas I need to understand before publishing the
spec, though. So this is partly thinking aloud and partly asking for
comment/clarification.

The model I expect is that plugins will be written for each binary
format. This implies we want per-mimetype hooks (as opposed to the
per-file hooks we have currently).

There might be multiple plugins for the same format; we need some way
of choosing one. As an example, ODF could be handled by both an
OpenOffice plugin and a zip file plugin (since the ODF format is a zip
file with specific contents and its own mimetype).
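
One possible way to choose between competing plugins is a registry keyed
by mimetype that keeps every registered handler, honours an explicit
user preference first, and otherwise falls back to the highest priority.
This is only a sketch: MimetypeRegistry, prefer() and the priority
argument are all hypothetical names, not existing bzrlib API.

```python
from collections import defaultdict


class MimetypeRegistry:
    """Hypothetical per-mimetype plugin registry (not part of bzrlib)."""

    def __init__(self):
        # mimetype -> list of (priority, plugin name, handler)
        self._handlers = defaultdict(list)
        # mimetype -> plugin name explicitly chosen by the user
        self._preferred = {}

    def register(self, mimetype, name, handler, priority=0):
        self._handlers[mimetype].append((priority, name, handler))

    def prefer(self, mimetype, name):
        """Record a user preference, e.g. read from a config file."""
        self._preferred[mimetype] = name

    def get(self, mimetype):
        candidates = self._handlers.get(mimetype, [])
        if not candidates:
            return None
        wanted = self._preferred.get(mimetype)
        for _priority, name, handler in candidates:
            if name == wanted:
                return handler
        # No usable preference: take the highest-priority handler.
        return max(candidates, key=lambda c: c[0])[2]


ODT = "application/vnd.oasis.opendocument.text"
registry = MimetypeRegistry()
registry.register(ODT, "zip", "zip-handler", priority=0)
registry.register(ODT, "openoffice", "oo-handler", priority=10)
default_choice = registry.get(ODT)
registry.prefer(ODT, "zip")
user_choice = registry.get(ODT)
```

Here the format-specific plugin wins by default, and a user who wants
the generic zip view can still ask for it.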

For binary diffing we probably need three output modes: a GUI
representation, a text-only human-readable representation, and
something that GNU patch can consume (if that is even possible; how do
diff and patch handle binary files?). How do we distinguish between
these modes?
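
One way to distinguish the modes would be for the caller to request a
mode explicitly and for each plugin to declare which modes it supports.
The sketch below assumes that design; DiffMode, ExampleZipDiff and
render_diff are hypothetical names. When no plugin can serve the
request, it falls back to the one-line notice that GNU diff prints for
binary files.

```python
from enum import Enum


class DiffMode(Enum):
    GUI = "gui"      # launch a graphical viewer
    TEXT = "text"    # human-readable summary
    PATCH = "patch"  # machine-applicable output


class ExampleZipDiff:
    """Hypothetical plugin that only offers a text summary."""

    modes = {DiffMode.TEXT}

    def diff(self, mode, old_bytes, new_bytes):
        return "archive changed: %d -> %d bytes" % (
            len(old_bytes), len(new_bytes))


def render_diff(plugin, mode, old_path, new_path, old_bytes, new_bytes):
    """Use the plugin if it supports the requested mode, else fall back."""
    if plugin is not None and mode in plugin.modes:
        return plugin.diff(mode, old_bytes, new_bytes)
    if old_bytes == new_bytes:
        return ""
    # Same wording GNU diff uses when it detects binary input.
    return "Binary files %s and %s differ" % (old_path, new_path)


text_out = render_diff(ExampleZipDiff(), DiffMode.TEXT,
                       "a.zip", "b.zip", b"abc", b"abcde")
patch_out = render_diff(ExampleZipDiff(), DiffMode.PATCH,
                        "a.zip", "b.zip", b"abc", b"abcde")
```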

Merging is probably not that hard, but conflict resolution will likely
entail a GUI. Conflict markers do not make sense for most binary
formats.
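
To illustrate merging without conflict markers, here is a minimal
three-way merge sketch. It assumes a plugin can break a file into named
parts (archive members, layers, tracks) and represents each version as
a dict of part name to bytes. Conflicts come back as a list of names
that a GUI could present, rather than being written into the file.
merge3 is a hypothetical helper, not bzrlib code.

```python
def merge3(base, this, other):
    """Three-way merge of {part name: bytes} maps.

    Returns (merged, conflicts). A missing key means the part does not
    exist in that version.
    """
    merged = {}
    conflicts = []
    for name in sorted(set(base) | set(this) | set(other)):
        b, t, o = base.get(name), this.get(name), other.get(name)
        if t == o:
            result = t   # both sides agree (including both deleting)
        elif t == b:
            result = o   # only OTHER changed this part
        elif o == b:
            result = t   # only THIS changed this part
        else:
            conflicts.append(name)
            result = t   # keep THIS side's state until resolved
        if result is not None:
            merged[name] = result
    return merged, conflicts


base = {"a": b"1", "b": b"2"}
this = {"a": b"1x", "b": b"2"}
other = {"a": b"1", "b": b"2y", "c": b"3"}
clean_merge, clean_conflicts = merge3(base, this, other)
_, clashing = merge3(base, {"a": b"X", "b": b"2"}, {"a": b"Y", "b": b"2"})
```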

I note that bzrlib.diff has a registry, while bzrlib.merge has a hook.
Shouldn't these both be using the same mechanism?

Something along these lines for both, perhaps:

    bzrlib.diff.register("application/zip",
                         bzrlib.plugins.example.zip.DiffContents)
    bzrlib.merge.register("application/zip",
                          bzrlib.plugins.example.zip.Merge2Way,
                          bzrlib.plugins.example.zip.Merge3Way)

Use cases
---------

zip, gzip, bzip2, tar - treat the archive as a folder. Merge will
  probably involve repacking the contents.

odf - use oodiff for human-readable diffs. Merges with change
  tracking turned on should be interesting (need to preserve or create
  annotations).

xcf - diff shows changed layers (by name for human-readable text).
  Conflicts when touching the same layer contents or layer properties.
  Layer ordering during merges might be tricky.

pitivi/audacity/openshot - diff shows changed tracks, effects, list of
  referenced media. Conflicts when changing the same tracks or clip
  effects.

blend - diff and merge depending on what entities are affected (i.e.,
  models vs animations vs lighting, etc.)
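
As a concrete illustration of the "treat the archive as a folder" case,
the sketch below produces a human-readable diff of two zip archives by
comparing their member lists with Python's standard zipfile module.
diff_zip, zip_members and make_zip are hypothetical helper names, and
comparing CRC-32 values is an assumption chosen for brevity; a real
plugin would probably compare or recursively diff the member contents.

```python
import io
import zipfile


def zip_members(data):
    """Map member name -> CRC-32 for a zip archive held in bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {info.filename: info.CRC for info in zf.infolist()}


def diff_zip(old_data, new_data):
    """List added, removed and modified members, sorted by name."""
    old, new = zip_members(old_data), zip_members(new_data)
    lines = []
    for name in sorted(set(old) | set(new)):
        if name not in old:
            lines.append("added: " + name)
        elif name not in new:
            lines.append("removed: " + name)
        elif old[name] != new[name]:
            # Differing CRCs mean the member contents differ.
            lines.append("modified: " + name)
    return lines


def make_zip(files):
    """Build an in-memory zip from {name: text}, for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


old_zip = make_zip({"content.xml": "a", "styles.xml": "s"})
new_zip = make_zip({"content.xml": "b", "meta.xml": "m"})
report = diff_zip(old_zip, new_zip)
```

The same member-level view is what a merge would operate on before
repacking the result into a new archive.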

Basically we want Bazaar to be something that artists can use as part
of their workflow to collaborate with a team.


