[BUG] baz-import creating different inventory texts (ghosts? and corruption)

Robert Collins robertc at robertcollins.net
Tue Mar 13 02:59:59 GMT 2007


On Mon, 2007-03-12 at 21:31 -0500, Matthew D. Fuller wrote:
> 
> If I need to take out something a long way back in history, I need to
> re-create all the revs since then (without the bad data, 'course).
> That means any poor schmuck tracking the tree now has to re-download
> *REAMS* of data, even though most of it is identical to what he
> already has.  Referencing the file texts by hash would mean that he
> wouldn't need to re-download all those texts.  That's a big gain,
> considering those re-downloads could be gigabytes. 

Well, there are other ways of doing that. Right now that pain is because
the index key we add is the same as the revision id being created. That's
bogus in a number of ways, though it is good for manual inspection.

Fixing that would address this overhead, and I'd like to fix it anyhow.
What we basically need is fast mapping from inventory contents to file
versions, and we have that via file_ids_affected_by - the ids used in
the per file graph could be decoupled from the ids used in text storage.
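
To make the decoupling concrete, here is a minimal sketch of a toy model, not bzr code. All names (content_key, texts_to_fetch, by_revision, by_content) are hypothetical, and using a SHA-1 of the content is just one assumed choice of storage key that does not depend on the revision id. It counts how many texts a client has to fetch after history is re-created with new revision ids.

```python
import hashlib

def content_key(text):
    # Assumption: a storage key derived only from the text's bytes.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def by_revision(revision_id, text):
    # Storage key tied to the revision id that introduced the text.
    return revision_id

def by_content(revision_id, text):
    # Storage key decoupled from the revision id.
    return content_key(text)

def texts_to_fetch(history, already_have, key_for):
    """Return the storage keys a client is missing for one file.

    history: list of (revision_id, text), oldest first.
    already_have: set of storage keys the client holds.
    """
    needed = []
    for revision_id, text in history:
        key = key_for(revision_id, text)
        if key not in already_have and key not in needed:
            needed.append(key)
    return needed

old = [("rev-1", "secret\n"), ("rev-2", "a\n"), ("rev-3", "a\nb\n")]
# History re-created without the bad text: every revision id changes.
new = [("rev-1x", "clean\n"), ("rev-2x", "a\n"), ("rev-3x", "a\nb\n")]

have_by_rev = {by_revision(r, t) for r, t in old}
have_by_content = {by_content(r, t) for r, t in old}

refetch_by_rev = texts_to_fetch(new, have_by_rev, by_revision)
refetch_by_content = texts_to_fetch(new, have_by_content, by_content)
```

With revision-id keys all three texts look new to the client; with decoupled keys only the one text that actually changed does.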

Note though, that if we *did* use shas to identify texts, you'd still
have a problem with our current storage, because we map:
inventory -> file @ version
file @ version -> delta-chain

and then build the delta chain.

Unless we remove the per-file merge graph, the file @ version step would
still be there, and the graph from the nuclear incident outwards is
still invalidated.
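
A rough sketch of that two-step lookup, again as a toy model and not our actual storage code. The names (build_text, lookup, invalidated) and the append-only "delta" format are assumptions made purely for illustration. It shows that the per-file graph is keyed by file @ version, so everything from the bad version outwards has to be re-keyed no matter how the texts themselves are identified.

```python
def build_text(store, key):
    """Walk the delta chain back to a full text, then replay it.

    store: {(file_id, version): ("full", text)
                              | ("delta", parent_key, appended_text)}
    Deltas here only append text, a simplification for the sketch.
    """
    chain = []
    while True:
        entry = store[key]
        if entry[0] == "full":
            text = entry[1]
            break
        chain.append(entry[2])
        key = entry[1]
    for appended in reversed(chain):
        text += appended
    return text

def lookup(inventories, store, revision_id, file_id):
    version = inventories[revision_id][file_id]    # inventory -> file @ version
    return build_text(store, (file_id, version))   # file @ version -> delta chain

def invalidated(parents, bad):
    """Return every per-file graph node at or after `bad`."""
    hit = {bad}
    changed = True
    while changed:
        changed = False
        for key, parent_keys in parents.items():
            if key not in hit and any(p in hit for p in parent_keys):
                hit.add(key)
                changed = True
    return hit

store = {
    ("f", "rev-1"): ("full", "a\n"),
    ("f", "rev-2"): ("delta", ("f", "rev-1"), "b\n"),
    ("f", "rev-3"): ("delta", ("f", "rev-2"), "c\n"),
}
inventories = {"rev-3": {"f": "rev-3"}}
parents = {
    ("f", "rev-1"): [],
    ("f", "rev-2"): [("f", "rev-1")],
    ("f", "rev-3"): [("f", "rev-2")],
}

text_at_rev3 = lookup(inventories, store, "rev-3", "f")
stale = invalidated(parents, ("f", "rev-2"))
```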

hg uses sha1s to identify text versions, but they still have a per-file
graph, and AIUI they would indeed cause gigabytes to be redownloaded in
this scenario, because their sha1s include history, not just text (which
is good for a number of reasons).
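
To illustrate the difference between a text-only hash and one that includes history, here is a small sketch. The history_id function is modelled on my understanding of how hg derives a node id (SHA-1 over the two sorted parent ids followed by the text), so treat the details as an assumption. The names text_only_id, history_id and chain_ids are made up for this example.

```python
import hashlib

NULL_ID = b"\0" * 20  # placeholder id for "no parent"

def text_only_id(text):
    # Identifies the bytes alone; identical texts always share an id.
    return hashlib.sha1(text).digest()

def history_id(text, p1=NULL_ID, p2=NULL_ID):
    # Mixes the parent ids into the hash, sorted so parent order is irrelevant.
    lo, hi = sorted([p1, p2])
    h = hashlib.sha1(lo)
    h.update(hi)
    h.update(text)
    return h.digest()

def chain_ids(texts):
    """Return history-aware ids for a linear sequence of texts."""
    parent = NULL_ID
    ids = []
    for text in texts:
        parent = history_id(text, parent)
        ids.append(parent)
    return ids

old_ids = chain_ids([b"secret\n", b"a\n", b"a\nb\n"])
# Same later texts, but the first version was replaced.
new_ids = chain_ids([b"clean\n", b"a\n", b"a\nb\n"])

same_text_same_id = text_only_id(b"a\n") == text_only_id(b"a\n")
all_history_ids_differ = all(o != n for o, n in zip(old_ids, new_ids))
```

Changing one early text changes every later history-aware id, even where the text is byte-for-byte identical, which is why the later versions would be transferred again.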

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.