[BUG] baz-import creating different inventory texts (ghosts? and corruption)

John Arbash Meinel john at arbash-meinel.com
Mon Mar 12 22:52:08 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> John Arbash Meinel wrote:
>>> I just came across a source of corruption for baz => bzr conversions.
>>>
>>> It seems that a baz-import has been run 2 times. And in one case, it
>>> decided that the last-changed revision for the files should be marked as
>>> the ghost revision (--patch-200), and in the other case the last-changed
>>> was '--base-0'.
> 
> These were exactly the same version?  I'm not sure how that could happen.

It was private code, so I can't be completely positive about it. But
IIRC the only difference in the inventory lines was the "revision="
marker. (So the file id and sha1 fields matched).

> 
>>> Was this just a change in baz-import? Or was it a change to the internal
>>> bzrlib logic (such that ghosts could be considered a last-changed)?
> 
> We haven't made many changes to baz-import along those lines.  It's just
> been maintenance, maintenance, maintenance.

Yeah, what has me concerned is that I don't have a good handle on why
the 'baz-import' converter changed the texts.

> 
>>> It could also be that one conversion had more history available than
>>> another conversion.
> 
> I'm pretty sure that baz-import insists on importing a whole version at
> once, which is why I asked the earlier question.
> 
>>> Certainly that is going to be a general problem.
>>> Because the inventory files can differ drastically if one has fewer
>>> ghosts than the other.
> 
> I don't think it is a general problem.  Aside from converters, we should
> never recreate an inventory.
> 
>>> Essentially though, we can't really support ghosts as well as we think
>>> we can. Unless we could somehow make 'InventoryEntry.revision' a loose
>>> value. (Not include it in the checksum, etc)
> 
> InventoryEntry.revision was already considered somewhat loose.
> Testament format 1 doesn't include it.  Perhaps we need to make it looser?
> 
> Aaron

Well, the corruption was detected at the Knit.sha1sum layer, because
it was pulling across a line-delta for a text that doesn't actually
layer on top of the existing base.
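
To illustrate the failure mode, here is a minimal sketch. It is not
bzrlib's Knit code: the XML-ish inventory lines, the hunk format, and
the helper names (make_line_delta, apply_line_delta, inventory) are all
assumptions made up for the example. It only shows that a line-delta
built against one base, applied to a base that differs in the
"revision=" marker, yields a text whose sha1 no longer matches.

```python
import hashlib
from difflib import SequenceMatcher


def sha1_of(lines):
    """SHA-1 of a text given as a list of lines."""
    return hashlib.sha1("".join(lines).encode("utf-8")).hexdigest()


def make_line_delta(old, new):
    """Hypothetical line-delta: (start, end, new_lines) hunks turning old into new."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_line_delta(base, delta):
    """Apply hunks back to front so earlier offsets stay valid."""
    out = list(base)
    for start, end, lines in reversed(delta):
        out[start:end] = lines
    return out


def inventory(last_changed):
    """Illustrative two-file inventory; only the revision= marker varies."""
    return [
        '<inventory>\n',
        '<file file_id="foo-id" name="foo" revision="%s" text_sha1="aaa" />\n'
        % last_changed,
        '<file file_id="bar-id" name="bar" revision="%s" text_sha1="bbb" />\n'
        % last_changed,
        '</inventory>\n',
    ]


# Conversion 1 had full ancestry; conversion 2 recorded the ghost instead.
base_full = inventory("proj--base-0")
base_ghost = inventory("proj--patch-200")

# A later revision changes only "bar"; its delta is built against base_full.
child = list(base_full)
child[2] = ('<file file_id="bar-id" name="bar" '
            'revision="proj--patch-201" text_sha1="ccc" />\n')
delta = make_line_delta(base_full, child)
expected_sha1 = sha1_of(child)

rebuilt_ok = apply_line_delta(base_full, delta)
rebuilt_bad = apply_line_delta(base_ghost, delta)

print("same base matches sha1: ", sha1_of(rebuilt_ok) == expected_sha1)
print("other base matches sha1:", sha1_of(rebuilt_bad) == expected_sha1)
```

The delta itself applies cleanly to either base, which is why the
problem only shows up when the reconstructed text is checksummed.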

So in order to not trigger this sort of thing, we would have to change
our serialization and storage method, for example by storing the
'last modified' value in a separate location.
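
A rough sketch of that idea, under stated assumptions: the Entry class,
the tab-separated serialization, and the side table keyed by file id
are hypothetical and do not reflect any existing bzrlib format. The
point is only that if the checksummed text leaves out the last-modified
revision, two conversions that disagree about it still produce the same
inventory sha1.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical inventory entry without a last-modified revision."""
    file_id: str
    name: str
    text_sha1: str


def serialize(entries):
    """Checksummed text: one line per entry, sorted by file id."""
    return ["%s\t%s\t%s\n" % (e.file_id, e.name, e.text_sha1)
            for e in sorted(entries, key=lambda e: e.file_id)]


def inventory_sha1(entries):
    return hashlib.sha1("".join(serialize(entries)).encode("utf-8")).hexdigest()


entries = [
    Entry("foo-id", "foo", "aaa"),
    Entry("bar-id", "bar", "bbb"),
]

# The "loose" last-modified values live outside the checksummed text.
last_modified_full = {"foo-id": "proj--base-0", "bar-id": "proj--base-0"}
last_modified_ghost = {"foo-id": "proj--patch-200", "bar-id": "proj--patch-200"}

sha1_full = inventory_sha1(entries)
sha1_ghost = inventory_sha1(entries)

print("inventory sha1s agree:  ", sha1_full == sha1_ghost)
print("side tables still differ:", last_modified_full != last_modified_ghost)
```

The cost is that the side table is no longer protected by the inventory
checksum, so it would need its own consistency story.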

I certainly think we can trigger this bug, just by doing one conversion
where we have the complete ancestry, and another conversion where we
stop at some archive boundary.

So we probably need some sort of resolution, since converters really are
a source of corruption for us.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFF9dmYJdeBCYSNAAMRAtJoAJ4rF6z7wR9A3sq/SIoHzeiXBCwXfACgwEhM
Oq9392/c6wEKw+64UliY0gI=
=4DHt
-----END PGP SIGNATURE-----


