[fastimport/MERGE] Train fixes

Shawn O. Pearce spearce at spearce.org
Tue Mar 11 00:23:18 GMT 2008


James Westby <jw+debian at jameswestby.net> wrote:
> On Mon, 2008-03-10 at 19:35 -0400, Shawn O. Pearce wrote:
> > > My first thoughts are that we should strive to have the stream
> > > format contain information that allows you to describe the history
> > > that you are converting from, and then allow the importers to handle
> > > this as they see fit.
> > 
> > I don't disagree at all.
> 
> I didn't think that you did. I applaud your wish to make the format
> as VCS-agnostic as possible. That will allow us to build many great
> tools and share the work.
> 
> When talking about bzr fast-import with Ian at the sprint last week
> we both commented how un-git-specific the stream was already, for 
> instance reporting renames directly, which is not necessary for git,
> but I guess is a space saver if you have the information already.
> 

Yea, the rename feature came about because someone's source had a
rename, but they couldn't generate the old file content hash or mark
to send us a D/A pair.  A rename was easier for them to generate
as then the frontend didn't have to track everything on its own.
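
To illustrate the difference, here is a minimal Python sketch of the two ways a
frontend could express a rename in the stream. The helper names are
hypothetical; only the D, M and R command shapes come from the fast-import
format, and paths that need quoting (for example, ones containing spaces) are
not handled.

```python
# Sketch only: helper names are hypothetical, not taken from any real frontend.

def rename_as_delete_add(old_path, new_path, mark, mode="100644"):
    """Express a rename as a D/A pair.

    The frontend must already know the mark (or content hash) of the
    blob so it can re-add the file under its new name.
    """
    return [f"D {old_path}", f"M {mode} :{mark} {new_path}"]


def rename_direct(old_path, new_path):
    """Express a rename with the R command; no mark or hash is needed."""
    return [f"R {old_path} {new_path}"]


if __name__ == "__main__":
    print("\n".join(rename_as_delete_add("src/old.c", "src/new.c", 7)))
    print("\n".join(rename_direct("src/old.c", "src/new.c")))
```

The D/A form forces the frontend to remember which blob sits behind every
path, which is exactly the bookkeeping the R command avoids.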

I'm shocked the format has had legs for as long as it has.
It never was my intent to create a stream format that was VCS
agnostic.  I was just trying to shovel data from cvs2svn into git
as fast as we possibly could.

git-fast-export doesn't generate rename commands, as it does not
enable the git rename detection codepath during output generation.
We probably could add that in the future.  Minor implementation
detail for us.

> Well, we're not even dealing with SHA-1 names here, as bzr doesn't have
> them readily available, so bzr-fast-export just uses mark references
> everywhere.
> 
> I don't really understand all the implications of a ghost, but I know
> it is a reference to an object (commit) that you know exists (usually
> because it is one of your parents), but that you don't actually have
> stored.
> 
> This means that bzr-fast-export currently falls over as it tries to 
> export the ghost as a commit, as it thinks it needs to as it is a parent
> of something that it is exporting, but it can't access the object to
> export it.
> 
> I think the best solution may be to introduce a ghost command in the 
> stream that is similar to a commit command, and specifically allows
> you to assign a mark. Then the commit command can simply reference that,
> and the importer handle the ghost command however they like. Do you
> have another suggested solution?

Nothing better comes to mind.

On Git we would have to have the SHA-1 of the "ghost parent" in order
to generate the correct SHA-1 of the child commit that references
it, otherwise we cannot store the child commit.  Mark or no mark,
in the end it has to boil down to a SHA-1 before we can finish the
child commit.
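
The dependency can be seen in a small Python sketch of how a Git commit name is
derived: the parent's SHA-1 is part of the bytes that get hashed. The function
name and the sample identity strings are made up for illustration, and this
ignores optional headers such as encoding.

```python
import hashlib


def git_commit_id(tree, parents, author, committer, message):
    """Return the SHA-1 name Git would give a commit with these fields.

    A commit object is the header "commit <size>\\0" followed by the
    tree line, one parent line per parent, author, committer, a blank
    line, and the message.
    """
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode("utf-8")
    header = b"commit %d\0" % len(body)
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    empty_tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
    who = "A U Thor <author@example.com> 1205194998 +0000"  # made-up identity
    # Same commit, two different candidate ids for the ghost parent.
    child_a = git_commit_id(empty_tree, ["a" * 40], who, who, "child\n")
    child_b = git_commit_id(empty_tree, ["b" * 40], who, who, "child\n")
    print(child_a != child_b)
```

Since the parent line is hashed along with everything else, a child of a ghost
cannot be named until the ghost's own SHA-1 is known.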

Rough idea of a BNF:

	new_ghost ::= 'ghost' lf
	  mark
	  ('id' sp hexsha1 lf)?
	  lf?;

and require that, at least on Git, the frontend must give us the
"id" subcommand in order to import a ghost.  What do the bzr folks think?
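
As a rough illustration of the proposal, here is a Python sketch of a reader
for the BNF above. Nothing here exists in any importer: the function name and
the require_id flag are hypothetical, and the mark syntax ("mark :<number>") is
assumed to match the one the commit command already uses.

```python
import re

MARK_RE = re.compile(r"^mark :(\d+)$")
ID_RE = re.compile(r"^id ([0-9a-f]{40})$")


def parse_ghost(lines, require_id=False):
    """Parse one proposed 'ghost' command from a list of lines (LFs stripped).

    Returns (mark, sha1_or_None, lines_consumed). With require_id=True,
    as a Git importer would want, a ghost without an id is rejected.
    """
    if not lines or lines[0] != "ghost":
        raise ValueError("expected 'ghost'")
    m = MARK_RE.match(lines[1]) if len(lines) > 1 else None
    if m is None:
        raise ValueError("ghost requires a mark")
    mark = int(m.group(1))
    pos = 2
    sha1 = None
    # Optional: 'id' sp hexsha1 lf
    if pos < len(lines):
        found = ID_RE.match(lines[pos])
        if found:
            sha1 = found.group(1)
            pos += 1
    if require_id and sha1 is None:
        raise ValueError("this importer needs an id for every ghost")
    # Optional trailing blank line
    if pos < len(lines) and lines[pos] == "":
        pos += 1
    return mark, sha1, pos


if __name__ == "__main__":
    stream = ["ghost", "mark :5", "id " + "a" * 40, ""]
    print(parse_ghost(stream, require_id=True))
```

An importer that can cope without the hash would call this with
require_id=False and decide for itself what to do with the mark.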
 
-- 
Shawn.


