[MERGE] Support delta_closure=True with NetworkRecordStream to transmit deltas over the wire when full text extraction is required on the far end.

Robert Collins robertc at robertcollins.net
Tue Feb 17 07:01:20 GMT 2009


This patch is conceptually simple: allow serialisation of
get_record_stream to bytes and back.

Complicating this a little is the fact that some clients of
get_record_stream want to use it to insert_record_stream, and some want
to get the texts out.

So this tightens up the behaviour a little. It's not really documented
clearly yet outside of the tests, so I expect to revise this patch as
review feedback comes in, to improve on that.

The tightened up behaviour is that:
 - record.get_bytes_as('chunked') or record.get_bytes_as('fulltext')
   returns what it does today: the content the user requested.
 - record.get_bytes_as(record.storage_kind) now always returns a
   network-suitable bytestring.
 - it's now explicit (in that we do it :)) that you cannot filter records
   out of a network stream when you are transmitting the
   record.get_bytes_as(record.storage_kind) bytes over the network,
   because a stream generator may group many records together for
   efficiency/logical coherency on the wire. A trivial example would be
   group compress, where one bytestring encodes many texts. This patch
   uses that approach to encode the record_map that get_content_maps
   needs on the wire. This works great, but because we are encoding many
   different texts into the stream, adding bytes for each call to
   record.get_bytes_as doesn't fit that well.
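
To make the no-filtering point concrete, here is a toy sketch. Everything
in it is made up for illustration: ToyGroupRecord, the 'toy-group' and
'toy-group-ref' storage kinds, and the length-prefixed framing are
assumptions, not bzrlib classes or the wire format in the patch. It
assumes one possible arrangement, where the first record of a group
carries the bytes for the whole group and the rest carry nothing.

```python
import struct


def encode_group(texts):
    """Pack a {key: text} dict into one bytestring (made-up framing)."""
    parts = []
    for key, text in sorted(texts.items()):
        parts.append(struct.pack('>II', len(key), len(text)) + key + text)
    return b''.join(parts)


def decode_group(data):
    """Inverse of encode_group."""
    texts, pos = {}, 0
    while pos < len(data):
        key_len, text_len = struct.unpack_from('>II', data, pos)
        pos += 8
        key = data[pos:pos + key_len]
        pos += key_len
        texts[key] = data[pos:pos + text_len]
        pos += text_len
    return texts


class ToyGroupRecord(object):
    """Hypothetical record; the first one in a group carries the group."""

    def __init__(self, key, group, is_first):
        self.key = key
        self.storage_kind = 'toy-group' if is_first else 'toy-group-ref'
        self._group = group
        self._is_first = is_first

    def get_bytes_as(self, kind):
        if kind == 'fulltext':
            return self._group[self.key]
        if kind == 'chunked':
            return [self._group[self.key]]
        if kind == self.storage_kind:
            # Network form: only the first record has a payload.
            return encode_group(self._group) if self._is_first else b''
        raise ValueError('unavailable representation: %r' % (kind,))


def receive(wire_parts):
    """Rebuild every text from the transmitted bytestrings."""
    texts = {}
    for part in wire_parts:
        texts.update(decode_group(part))
    return texts


group = {b'rev-1': b'hello\n', b'rev-2': b'hello\nworld\n'}
records = [ToyGroupRecord(b'rev-1', group, True),
           ToyGroupRecord(b'rev-2', group, False)]

# Sending every record's network bytes lets the far end rebuild both texts.
complete = receive([r.get_bytes_as(r.storage_kind) for r in records])
# Filtering out the first record drops the payload the second depends on.
broken = receive([r.get_bytes_as(r.storage_kind) for r in records[1:]])
```

In this sketch the local 'fulltext' request still works for every record,
but a sender that skips the first record transmits nothing usable, which
is the reason filtering has to happen before the stream is generated.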

It may be a good idea to surface this grouping in get_record_stream
itself, by having a stream-of-streams approach, but I think it's ok as-is.

The good thing about this is that infinite-buffering isn't needed on the
server - the server buffers one raw_record_map, which is exactly what
we'd buffer on the client. It also means we don't have two
implementations of 'how much to buffer' and 'how to schedule IO' -
improvements we make on the client in terms of scheduling text
reconstruction should immediately benefit servers. Note that the encoded
record map *is* incrementally parsable, so if we get to the point of
having a streaming window and discarding things, we can hook that up
server side without a wire format change.
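
As a rough illustration of what "incrementally parsable" buys us, the
sketch below parses entries out of a byte stream as soon as each one is
complete, without waiting for the whole map. The framing (two 32-bit
lengths, then key, then payload) and the names IncrementalMapParser and
encode_entry are assumptions for the example, not the encoding used in
the patch.

```python
import struct


class IncrementalMapParser(object):
    """Hypothetical parser: hands back entries as they become complete."""

    HEADER = struct.Struct('>II')  # key length, payload length

    def __init__(self):
        self._buffer = b''

    def feed(self, data):
        """Add received bytes; return the list of entries now complete."""
        self._buffer += data
        done = []
        while len(self._buffer) >= self.HEADER.size:
            key_len, body_len = self.HEADER.unpack_from(self._buffer)
            end = self.HEADER.size + key_len + body_len
            if len(self._buffer) < end:
                break  # entry is still partial; wait for more bytes
            key_end = self.HEADER.size + key_len
            done.append((self._buffer[self.HEADER.size:key_end],
                         self._buffer[key_end:end]))
            # Discard what has been consumed, so memory use stays bounded
            # by the largest single entry rather than the whole map.
            self._buffer = self._buffer[end:]
        return done


def encode_entry(key, body):
    """Frame one (key, body) pair in the made-up format above."""
    return IncrementalMapParser.HEADER.pack(len(key), len(body)) + key + body


wire = (encode_entry(b'rev-1', b'hello\n') +
        encode_entry(b'rev-2', b'delta bytes'))

parser = IncrementalMapParser()
seen = []
# Deliver the bytes in 5-byte chunks, as a network read loop might.
for start in range(0, len(wire), 5):
    seen.extend(parser.feed(wire[start:start + 5]))
```

A windowed consumer would act on each entry returned by feed() and then
drop it, which is the kind of change that needs no new wire format.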

This is defining a wire protocol, so possibly we want explicit tests
about the encoding, though for now I'm comfortable with 'do not
change' :P.

Anyhow, this component will be used by Andrew and me to make the
streaming-push branch push deltas to branches of the same rich-root
format, and deltas-with-closure to branches with a different rich-root
value, without lots of choppy IO - so I think this is definitely a
success in terms of getting to that goal :).

-Rob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: VersionedFiles-NetworkRecordStream-4011.patch
Type: text/x-patch
Size: 66597 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20090217/8ebde1bd/attachment-0001.bin 