deleting InterDifferingSerializer

Robert Collins robert.collins at canonical.com
Tue May 5 01:13:37 BST 2009


On Mon, 2009-05-04 at 10:17 -0500, John Arbash Meinel wrote:

> There are many things that InterDifferingSerializer gives us that
> streaming fetch doesn't yet. If we can get streaming fetch to do it,
> then I'm fine getting rid of it.
> 
> 1) Incremental updates. IDS converts batches of 100 revs at a time,
> which also triggers autopacks at 1k revs. Streaming fetch is currently
> an all-or-nothing, which isn't appropriate (IMO) for conversions.
> Consider that conversion can take *days*, it is important to have
> something that can be stopped and resumed.

OTOH the way we convert is:
do many groups
do a full repack at the end

If streaming issued data in the right order we could go straight into
the final pack.
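
For readers following along, the batch-and-autopack behaviour from point 1
can be sketched roughly as below. This is only an illustration and not
bzrlib code: `convert_in_batches`, `convert_batch` and `autopack` are
hypothetical names, and the 100/1000 figures are simply the ones quoted
above. The point is that each batch is committed before the next starts, so
an interrupted run can resume by skipping whatever is already converted.

```python
def convert_in_batches(revision_ids, already_converted, convert_batch,
                       autopack, batch_size=100, autopack_every=1000):
    """Convert pending revisions in small, individually committed batches.

    All names here are hypothetical stand-ins, not bzrlib APIs:
      revision_ids      -- every revision to convert, in conversion order
      already_converted -- set of ids finished by an earlier (interrupted) run
      convert_batch     -- callable that converts and commits one batch
      autopack          -- callable that repacks the target repository
    Returns the number of revisions converted by this call.
    """
    # Resuming: anything a previous run finished is skipped.
    pending = [r for r in revision_ids if r not in already_converted]
    done_since_pack = 0
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        convert_batch(batch)
        # Record progress only after the batch has been committed.
        already_converted.update(batch)
        done_since_pack += len(batch)
        if done_since_pack >= autopack_every:
            autopack()
            done_since_pack = 0
    return len(pending)
```

A single ordered stream written straight into the final pack would avoid
the intermediate packs entirely, at the cost of the per-batch checkpoints
shown here.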

> 2) Also, auto-packing as you go avoids the case you ran into, where bzr
> bloats to 2.4GB before packing back to 25MB. We know the new format is
> even more sensitive to packing efficiency. Not to mention that a single
> big-stream generates a single large pack, it isn't directly obvious that
> we are being so inefficient.

^ See above :).

> 3) "delta based", this is another rather big issue. IIRC Streaming Fetch
> will send whole-inventories. 

Andrew and I have a patch to send inventory deltas; it's incomplete
pending a fix for the rather severe stacking-streaming issues.
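
To illustrate why deltas matter here, the following is a toy model of
sending a delta instead of a whole inventory. It is a sketch under loud
assumptions: an inventory is modelled as a plain dict mapping a file id to
a (path, last-changed revision) tuple, and `inventory_delta` and
`apply_delta` are hypothetical helpers, not the format or API the patch
uses.

```python
def inventory_delta(old, new):
    """Return the entries that differ between two toy inventories.

    Each inventory is assumed to be a dict of file_id -> (path, revision).
    The result is a list of (file_id, old_entry, new_entry) tuples, where a
    None entry means the file id is absent on that side.
    """
    delta = []
    for file_id in sorted(set(old) | set(new)):
        old_entry = old.get(file_id)
        new_entry = new.get(file_id)
        if old_entry != new_entry:
            delta.append((file_id, old_entry, new_entry))
    return delta


def apply_delta(old, delta):
    """Rebuild the new toy inventory from the old one plus a delta."""
    result = dict(old)
    for file_id, _old_entry, new_entry in delta:
        if new_entry is None:
            result.pop(file_id, None)  # file id was removed
        else:
            result[file_id] = new_entry  # file id was added or changed
    return result
```

For a large tree where a revision touches a handful of files, the delta
carries only those few entries, while a whole-inventory stream resends
every entry for every revision.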

> 4) Computing rich root information from the 'stream'. IIRC the
> "Streaming" code does yet-another pass over the inventory data (I think
> that makes 3) to determine the root-id and whether it has changed or not.

IDS does this too, and the delta based stream won't need to do that.

> So if you have the time to implement all of this for the streaming code,
> by all means get rid of IDS. But it isn't like it is a trivial fix to
> the smart fetch code to get it to provide all the benefits that we have
> IDS for *today*. 1 bug is easily traded off the 4 benefits above, IMO.
> 
> I fully agree that maintaining multiple code paths is crummy. But the
> streaming code doesn't lend itself easily to most of the things I
> described above.
> 
> So -1 for just removing IDS, without addressing at least some of the above.

I think the key one is being able to stop and restart. While you say it's
important, I don't think it's at all obvious to users how to do this: bzr
branch old new preserves format unless they init a repo first; if they
have done this, hitting ctrl-C will abort with no indicator that it is
partial.

And our builtin 'upgrade' command is definitely not incremental.

Anyhow, I'll leave it alone at least until we have the deltas and other
stuff in place; I'll bring up the issue again later.

-Rob