Any plans/needs to extend the fast-import format?

Ian Clatworthy ian.clatworthy at canonical.com
Tue Aug 25 07:17:21 BST 2009


Sverre Rabbelier wrote:

> On Mon, Aug 24, 2009 at 21:47, Ian
> Clatworthy<ian.clatworthy at canonical.com> wrote:
>> The current alternative is to do the sort of thing that
>> Monotone's fast-export tool does: append metadata as name:value lines
>> onto the end of the commit message. That's pretty gross IMO.
> 
> Why? The commit message is free form, and under control of the export
> tool. In git there isn't much difference between storing the meta-data
> in the commit message and some special field, it just moves up/down a
> few lines in the commit object.

> Honestly, what is the difference? Except that with the former you have
> to change the git object format, and the tools to deal with your
> special fields, whereas with the latter everything Just Works (see
> hg-git and git-svn as 'proof'). Another advantage of the
> commit-message based approach is that the user can edit changes
> easily; rewriting of bzr history using git filter-branch's
> commit-message-filter is suddenly possible.

There's a big difference. Going back to my original problem, I want to
use fast-import format to round-trip Bazaar repositories. Many users
doing that won't have git installed and shouldn't have to install git to
achieve that. The point is that fast-import has become a de facto data
interchange format. We ought to be extending it by capturing semantics
whenever we can.

Design-wise, taking the existing set of per-revision properties and
hacking them on to the end of the commit message is overloading the
semantics of the commit message unnecessarily. Storing them in a
separate dictionary is (marginally) better because it means importers
don't need to do magic parsing of commit messages looking for properties
that aren't part of the true commit message. If anything, we want to be
going the *other* way and tightening up the semantics of revision
properties and turning them into first-class metadata fields. Of course,
*if* git-fast-import wants to take those revision properties from the
stream and append them onto the end of the commit message as you're
doing for hg-git, then that's your choice. Other tools like Bazaar will
store the properties separately and selectively display some of them in
logs based on whatever log formatter the end user selects.
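
As a rough illustration of the "magic parsing" problem, here is a
minimal Python sketch. The trailing name:value convention, the regular
expression, the function name and the sample property names are all
assumptions made up for this example; they are not taken from
Monotone's exporter, hg-git, or bzr-fastimport.

```python
import re

# Hypothetical convention: properties are "name: value" lines at the
# very end of the commit message.
_PROP_LINE = re.compile(r"^([A-Za-z][\w-]*):\s*(.*)$")


def split_trailing_properties(message):
    """Heuristically peel name:value lines off the end of a message.

    Returns (remaining_message, properties). The importer can only
    guess where the real message stops, so this can misfire.
    """
    lines = message.rstrip("\n").split("\n")
    props = {}
    while lines:
        match = _PROP_LINE.match(lines[-1])
        if match is None:
            break
        props[match.group(1)] = match.group(2)
        lines.pop()
    return "\n".join(lines).rstrip("\n"), props


# The author really did end the message with a "Note:" line; the
# exporter then appended one property after it.
message = (
    "Fix crash on empty input\n"
    "\n"
    "Note: this breaks the API\n"
    "branch-nick: trunk"
)
body, guessed = split_trailing_properties(message)

# With a separate dictionary in the stream there is nothing to guess:
# the message and the properties never share the same text.
explicit_message = "Fix crash on empty input\n\nNote: this breaks the API"
explicit_props = {"branch-nick": "trunk"}
```

In this sketch the heuristic swallows the author's "Note:" line as if
it were a property, so `guessed` ends up with two entries and `body`
loses a line, whereas the explicit form keeps both parts intact.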

BTW, most Bazaar users would grumble loudly if most commit messages had
that additional cross-tool metadata tacked on the end. That sort of
thing may not matter to the git user base but I can promise you it does
for ours. Different tools appeal to different groups of users - no
surprises there.

git filter-branch is certainly an interesting tool. We have something
similar in the bzr-fastimport plugin: fast-import-filter. In our case,
we take a fast-import stream and generate another so it's tool agnostic
and doesn't require a repository to exist or be created.  For *any* tool
supporting fast-export/import, one can do:

  xxx-fast-export > data.fi
  bzr fast-import-filter [options] data.fi > new-data.fi
  xxx-fast-import new-data.fi

fast-import-filter isn't yet as powerful as git filter-branch but
there's no reason it can't be. It's pretty easy to add features to it as
we require them.
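
To show why a stream-to-stream filter needs no repository, here is a
minimal Python sketch of the idea. It is not the fast-import-filter
implementation: the function names and the rewrite rule are
hypothetical, and it only understands the counted "data <n>" form of
the stream (the delimited "data <<EOF" form is deliberately left out).

```python
import io


def filter_stream(src, dst, rewrite_line):
    """Copy a fast-import stream from src to dst (binary file objects).

    Command lines are passed through rewrite_line. Counted
    'data <n>' payloads are copied byte-for-byte so that file contents
    and commit messages are never mistaken for commands.
    """
    while True:
        line = src.readline()
        if not line:
            break
        if line.startswith(b"data ") and not line.startswith(b"data <<"):
            count = int(line[5:].strip())
            dst.write(line)
            dst.write(src.read(count))
        else:
            dst.write(rewrite_line(line))


def anonymize_committer(line):
    """Hypothetical rule: replace the committer identity, keep the date."""
    if line.startswith(b"committer "):
        tail = line[line.index(b">") + 1:]
        return b"committer Anonymous <anon@example.com>" + tail
    return line


stream = (
    b"commit refs/heads/master\n"
    b"committer Ian <ian@example.com> 1251180000 +0100\n"
    b"data 21\n"
    b"committer in message\n"
    b"\n"
)
out = io.BytesIO()
filter_stream(io.BytesIO(stream), out, anonymize_committer)
result = out.getvalue()
```

Only the real committer line is rewritten here; the commit message,
which happens to start with the word "committer", passes through
untouched because it sits inside a counted data block.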

Ian C.


