[fastimport/MERGE] Train fixes

Shawn O. Pearce spearce at spearce.org
Mon Mar 10 22:50:52 GMT 2008


Ian Clatworthy <ian.clatworthy at internode.on.net> wrote:
> John Arbash Meinel wrote:
> 
> > I'm pretty sure that fast-import doesn't handle ghosts. 'git' has an
> > explicit property to chain their hashes back through all of history.
> > (So if you changed the very first commit somehow all of your
> > sha1-revision-ids would change.)
> > 
> > So I'm guessing that fast-export | fast-import would lose information
> > about ghosts.
> 
> That's my understanding as well. You could specify parents that don't
> exist but bzr fast-import and git-fast-import would currently consider
> that an error.
> 
> I'm of the opinion that they ought to continue to do so, at least by
> default. When generating the references to ghosts, do we know they're
> ghosts at that point? If so, we could extend the stream format to
> explicitly mark them as such. Alternatively, we could add an option (to
> fast-import) to permit ghost references. Even if not stored initially,
> at least the importer wouldn't abort at that point.

Yea.  Technically git-fast-import's code would permit me to let
the stream request a ghost parent, and actually produce data files
with a successful exit code, but the repository would fail git-fsck.
Tools like git-checkout (to fetch files) and git-log would seem to
work just fine, until we tried to hit the ghost, and then all hell
would break loose.

If you really do have a parent commit whose SHA-1 you know but whose
content you will never be able to obtain again, and you need that
parent in your ancestry in order to preserve the SHA-1s of its
children, then yea, you don't really have any choice other than to
create the ghost reference.

In git a "graft" can be used to tell the runtime tools like git-log
and git-fsck that the parent is gone and shall never be, but grafts
do not automatically travel when you use git-clone to copy someone
else's tree to your own system.  Publishing a history with a graft
is a pretty horrible thing to do.
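
For reference, a graft entry is one line in .git/info/grafts: the
commit's SHA-1 followed by the SHA-1s of the parents it should be
treated as having. A rough sketch of building such a line so that a
ghost parent is cut out might look like the following. The function
name and arguments are made up for illustration; nothing like this
exists in git-fast-import today.

```python
def graft_line(commit, parents, ghosts):
    """Build one grafts-file line for `commit`.

    commit  -- hex SHA-1 of the commit whose parent list is overridden
    parents -- hex SHA-1s of the parents recorded in the commit object
    ghosts  -- set of hex SHA-1s known to be missing from the repository

    Hypothetical helper: it only formats the line, it does not touch
    any repository.
    """
    # Keep the recorded order, dropping only the parents we cannot have.
    kept = [p for p in parents if p not in ghosts]
    # Grafts format: "<commit> <parent> <parent> ...", space separated.
    return " ".join([commit] + kept)
```

A commit whose only parent is a ghost ends up with no parents on its
line, which the runtime tools then treat as a root commit.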

Fortunately nobody has thus far come to us and said "soooo, my
history is bust but I need to record that parent marker anyway".
Sounds like a bzr<->git roundtrip may actually need to support that?

I'm open to some sort of notation in the stream format that a
ghost is being used and thus a missing object is to be expected.
With such a notation git-fast-import can automatically dump the
ghost entry into the graft file, and spit out a big fat warning
after the import is done.

I'd rather not blanket allow missing references during an import.
That just sounds like trouble for a buggy frontend.
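
To make the distinction concrete, here is a minimal sketch of the
check I have in mind: a parent that is neither already imported nor
explicitly declared as a ghost is still a hard error. The way the
frontend would declare ghosts in the stream is not designed yet, so
the `declared_ghosts` argument, the function name and the exception
are all assumptions, not existing fast-import behaviour.

```python
class MissingParentError(Exception):
    """Raised when a commit names a parent that is neither known nor a ghost."""


def check_parents(commits, declared_ghosts=frozenset()):
    """Validate parent references for commits given in stream order.

    commits         -- iterable of (commit_id, [parent_ids]) pairs
    declared_ghosts -- ids the frontend explicitly marked as ghosts
                       (hypothetical; no such stream notation exists yet)

    Returns a list of (commit_id, ghost_parent_id) pairs so the caller
    can write graft entries and print a warning after the import.
    """
    seen = set()
    ghost_refs = []
    for commit_id, parents in commits:
        for parent in parents:
            if parent in seen:
                continue
            if parent in declared_ghosts:
                # Expected to be missing: record it instead of aborting.
                ghost_refs.append((commit_id, parent))
            else:
                # Undeclared missing parent: most likely a frontend bug.
                raise MissingParentError(
                    "commit %s references unknown parent %s" % (commit_id, parent)
                )
        seen.add(commit_id)
    return ghost_refs
```

With no ghosts declared this behaves like a strict importer, so a
buggy frontend still gets caught.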

-- 
Shawn.


