BundleReader, Containers, and file IO

Andrew Bennetts andrew at canonical.com
Thu Oct 25 14:43:57 BST 2007


Robert Collins wrote:
> 
> On Thu, 2007-10-25 at 20:13 +1000, Andrew Bennetts wrote:
[...]
> > It adds a bzrlib.packs.iter_records_from_file function you can use instead of
> > ContainerReader(somefile).iter_records().  I'm curious to find out how it goes
> > for you!
> 
> Packs only use readv; whats the best way to glue these together in your
> opinion ?

A quick grep suggests packs do this in a few places:

    reader = make_readv_reader(...)
    ...
    for record, bytes_func in reader.iter_records():
        ...

The lazy way would be to define a simple function like:

    def iter_records_from_transport_readv(transport, filename, requested_records):
        readv_blocks = [(0, len(FORMAT_ONE)+1)]
        readv_blocks.extend(requested_records)
        return iter_records_from_file(ReadVFile(
            transport.readv(filename, readv_blocks)))

(This is very similar to the existing make_readv_reader function.)
 
Then replace the snippets using make_readv_reader with:

    records_iter = iter_records_from_transport_readv(...)  # same args as before
    ...
    for record, bytes in records_iter:   # note bytes, not bytes_func
        ...
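
To make the role of the intermediate file-like object concrete, here is a minimal, self-contained sketch of such a wrapper. ToyReadVFile and the length-prefixed toy format are hypothetical names invented for illustration; this is not bzrlib's ReadVFile and may behave differently from it. It assumes only that the readv result is an iterable of (offset, data) pairs.

```python
class ToyReadVFile:
    """Hypothetical file-like wrapper over (offset, data) pairs.

    Not bzrlib's ReadVFile; it only illustrates the buffering involved.
    """

    def __init__(self, readv_result):
        self._chunks = iter(readv_result)
        self._buffer = b""

    def _fill(self):
        """Append the next chunk to the buffer; return False when exhausted."""
        for _offset, data in self._chunks:
            self._buffer += data
            return True
        return False

    def read(self, length):
        # Pull in chunks until enough bytes are buffered (or data runs out).
        while len(self._buffer) < length and self._fill():
            pass
        result = self._buffer[:length]
        self._buffer = self._buffer[length:]  # this slice is the "string splitting"
        return result

    def readline(self):
        # Pull in chunks until a newline is buffered (or data runs out).
        while b"\n" not in self._buffer and self._fill():
            pass
        newline = self._buffer.find(b"\n")
        end = len(self._buffer) if newline == -1 else newline + 1
        return self.read(end)


# Toy format (an assumption): a decimal length line, then that many bytes.
chunks = [(0, b"5\nhel"), (5, b"lo3\nfoo")]
f = ToyReadVFile(chunks)
first = f.read(int(f.readline()))
second = f.read(int(f.readline()))
```

Every read() here copies and re-slices the buffer, which is the kind of overhead a file-like layer adds on top of the readv result.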

A better definition of iter_records_from_transport_readv might be:

    def iter_records_from_transport_readv(transport, filename, requested_records):
        readv_blocks = [(0, len(FORMAT_ONE)+1)]
        readv_blocks.extend(requested_records)
        readv_result = transport.readv(filename, readv_blocks)
        parser = ContainerPushParser()
        for offset, bytes in readv_result:   # readv yields (offset, data) pairs
            parser.accept_bytes(bytes)
            for record in parser.read_pending_records():
                yield record
            if parser.finished:
                break

(This is pretty similar to my iter_records_from_file function.)

I'm guessing that, with the pack code, each readv range will typically contain
complete records?  If so, feeding in the readv blocks as-is will probably
minimise string splitting, and since this version doesn't use the intermediate
ReadVFile object, it's probably the faster way.
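
The push-parser pattern can be tried out without bzrlib using a toy stand-in. This is only a sketch: ToyPushParser, iter_records_from_blocks and the length-prefixed record format are hypothetical and are not the real ContainerPushParser or the pack container format. The blocks are assumed to be (offset, data) pairs like those a readv call yields.

```python
class ToyPushParser:
    """Hypothetical push parser; NOT bzrlib's ContainerPushParser.

    Toy format (an assumption): a decimal length line, then that many bytes.
    """

    def __init__(self):
        self._buffer = b""
        self._pending = []

    def accept_bytes(self, data):
        """Buffer new bytes and extract every record that is now complete."""
        self._buffer += data
        while True:
            header_end = self._buffer.find(b"\n")
            if header_end == -1:
                return  # length line not complete yet
            length = int(self._buffer[:header_end])
            body_start = header_end + 1
            body_end = body_start + length
            if len(self._buffer) < body_end:
                return  # record body not complete yet
            self._pending.append(self._buffer[body_start:body_end])
            self._buffer = self._buffer[body_end:]

    def read_pending_records(self):
        """Return the records parsed so far and clear the pending list."""
        records, self._pending = self._pending, []
        return records


def iter_records_from_blocks(blocks):
    """Feed each block to the parser as-is, yielding records as they complete."""
    parser = ToyPushParser()
    for _offset, data in blocks:
        parser.accept_bytes(data)
        for record in parser.read_pending_records():
            yield record


# The second record is deliberately split across two blocks.
blocks = [(0, b"5\nhello"), (7, b"3\nfo"), (11, b"o2\nhi")]
records = list(iter_records_from_blocks(blocks))
```

When a block holds only whole records, as the first one does here, nothing is left in the buffer afterwards. Leftover bytes are carried over only when a record straddles a block boundary.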

-Andrew.



