Any plans/needs to extend the fast-import format?

Shawn O. Pearce spearce at spearce.org
Tue Aug 25 15:57:50 BST 2009


Ian Clatworthy <ian.clatworthy at canonical.com> wrote:
> Shawn O. Pearce wrote:
> >> * An optional properties section in a commit. Each property would
> >>   have a name and value, both utf-8 encoded.
> > 
> > This I find could be dangerous, what are additional properties,
> > and what should they look like when loaded in another VCS?  Git has
> > tried to resist adding hidden fields to the commit headers, because
> > then they aren't as easy to access for a human.
> 
> By definition, revision properties are tool-specific. We could capture
> that semantically by prefixing the names with the tool, a bit like xml
> namespaces. The current alternative is to do the sort of thing that
> Monotone's fast-export tool does: append metadata as name:value lines
> onto the end of the commit message. That's pretty gross IMO.

Tomato, Tomato.  Oh, that joke doesn't work by email.

Shoving stuff into the commit message is gross, but so is having
an arbitrary dictionary of properties.

I'm really worried about having a "bzr bug ..." and "hg bug ..." and
then hg trying to read the "bzr bug" line in addition to the "hg bug"
line, but bzr only reading the "bzr bug" and an hg->bzr conversion
silently failing because the bzr importer doesn't know how to read
the "hg bug" data.

However.  I'm with you on this (and not Sverre), if we are going
to have these additional properties be encoded in the stream,
let's do it in a way that is more standardized and processable by
the various tools than by shoving it into the commit message blob.
 
> > Sverre recently added patches to git fast-import to declare options
> 
> So what do you expect other importers to do with these? Perhaps you
> should include a tool field in the option command so this becomes more
> widely useful and importers know which ones are for them vs other
> importers? There's no way all importers will have the same option names,
> let alone semantics.

That's a good point.

Actually, one option, "option date-format={raw|rfc2822|now}" is
probably something that should be a standard in the format, since
it says how the dates are encoded.  A tool which doesn't understand
rfc2822 dates should abort and not try to process the stream.

But clearly "option export-pack-edges=<file>" is git-specific,
and has no meaning for bzr or hg, or any other system.  It likely
should be "option git export-pack-edges=<file>", so that a git tool
would honor it, but a non-git tool would silently skip it.

Somewhere in the middle is "option import-marks=<file>".  The
git-fast-import tool knows how to save and reload marks, which
are tied to Git SHA-1s, so it's unlikely a git marks file would
be useful to any other tool, but other tools might also support
loading and saving marks across runs.  I don't know.  The option
is here so the stream generator can clearly signal to the parser
that the mark database should be picked up from a prior state in
order to parse the stream correctly.

We haven't yet made this change to git.  We could modify this to
take a VCS name.

Maybe "feature" (see below) is required to be understood, and
date-format should be a "feature", while "option" can be skipped only
if the VCS name which appears in the option line is not yourself?

  # Must be understood by parser, or parser must abort.
  #
  feature      ::= 'feature' sp feature_name ('=' path_str)? lf
  feature_name ::= path_str

  # Must be skipped by parser if vcs_name does not match self.
  # Must be understood by parser if vcs_name matches self.
  #
  option      ::= 'option' sp vcs_name sp option_name ('=' path_str)? lf
  vcs_name    ::= ('git' | 'bzr' | 'hg' | path_str)
  option_name ::= path_str

is what we are talking about?
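
To make the abort-versus-skip rule concrete, here is a minimal sketch of
how an importer might dispatch these two commands.  It is only an
illustration of the grammar above, not code from any existing importer:
the names handle_directive, UnsupportedDirective, known_features and
known_options are all made up for the example, and it assumes the
optional value is whatever follows the first '='.

```python
class UnsupportedDirective(Exception):
    """Raised when the parser must abort instead of guessing."""


def handle_directive(line, self_vcs, known_features, known_options):
    """Classify one 'feature' or 'option' line of a stream.

    Returns ('feature', name, value) or ('option', name, value), or
    None when the option is addressed to a different VCS and must be
    skipped.  value is None when the line carries no '=' part.
    """
    body = line.rstrip("\n")
    keyword, sep, rest = body.partition(" ")
    if not sep:
        raise ValueError("malformed directive: %r" % body)

    if keyword == "feature":
        name, eq, value = rest.partition("=")
        # Features must be understood by every parser, or it aborts.
        if name not in known_features:
            raise UnsupportedDirective("unknown feature: " + name)
        return ("feature", name, value if eq else None)

    if keyword == "option":
        vcs, sep, opt = rest.partition(" ")
        if not sep:
            raise ValueError("option needs a VCS name: %r" % body)
        # Options for some other VCS are skipped without complaint.
        if vcs != self_vcs:
            return None
        name, eq, value = opt.partition("=")
        # Options addressed to us must be understood.
        if name not in known_options:
            raise UnsupportedDirective("unknown option: " + name)
        return ("option", name, value if eq else None)

    raise ValueError("not a feature/option line: %r" % body)
```

With self_vcs set to "bzr", the line "option git export-pack-edges=<file>"
comes back as None (skipped), while "feature date-format=rfc2822" either
returns the parsed feature or aborts the import, depending on whether
date-format is in known_features.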
 
> That sounds a useful way forward. Let's start with ...
> 
> feature file-commands-apply-to-committed-state

:-)

-- 
Shawn.


