[MERGE/RFC] Use _read_records_iter_unchecked in _get_remaining_record_stream.
John Arbash Meinel
john at arbash-meinel.com
Fri Mar 6 21:37:06 GMT 2009
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> Hi all,
>
> I did a little profiling of the time spent in the
> Repository.get_stream RPC with Robert, and we saw that 16% of the time was due
> to knit record header parsing. Uncompressing and parsing those headers is
> basically wasted effort, at least for the network, because we don't use that
> data; we just send the raw bytes on the wire.
>
> So this patch changes _get_remaining_record_stream to call
> _read_records_iter_unchecked rather than _read_records_iter_raw, to skip the
> header parsing.
>
> I'm interested to know what the other developers think... is it reasonable
> to be a bit more lax about checking that the data we get out of a repository
> is a parseable knit record? Ideally the target repository would be checking
> it as it inserts it anyway (although I don't think this happens today).
>
> -Andrew.
>
>
I think we need to check as we stream. For the groupcompress code, I
plan on having it fully extract the texts (and then not regenerate the
deltas). The more we can catch before corruption gets silently
transmitted the better. We only check the header today because full
extraction was too expensive.
So as Robert mentioned, if you put a check in insert_record_stream, then
I'm fine with moving this away from the server during 'pull'. I think it is
better for the process actually inserting the content to be the one
checking it anyway.
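
As a rough illustration of checking on the inserting side, here is a
minimal sketch. It is not the bzrlib code: check_raw_record,
insert_checked, RecordCorrupt and the dict-like store are hypothetical
names, and the record layout (a gzip-compressed "version <key>
<line-count> <digest>" header, the content lines, then an "end <key>"
trailer) is a simplified assumption made for the example.

```python
import gzip
import zlib


class RecordCorrupt(Exception):
    """Raised when a raw record fails validation (hypothetical name)."""


def check_raw_record(key, raw_bytes):
    """Validate one gzip-compressed record before it is stored.

    Assumed, simplified layout (an illustration, not the real parser):
        version <key> <line-count> <digest>
        ...<line-count> content lines...
        end <key>
    """
    try:
        text = gzip.decompress(raw_bytes)
    except (OSError, EOFError, zlib.error) as e:
        raise RecordCorrupt("%r: cannot decompress: %s" % (key, e))
    lines = text.splitlines(True)
    if len(lines) < 2:
        raise RecordCorrupt("%r: record too short" % (key,))
    header = lines[0].split()
    if len(header) != 4 or header[0] != b"version":
        raise RecordCorrupt("%r: malformed header %r" % (key, lines[0]))
    if header[1] != key:
        raise RecordCorrupt("%r: header names %r" % (key, header[1]))
    try:
        count = int(header[2])
    except ValueError:
        raise RecordCorrupt("%r: bad line count %r" % (key, header[2]))
    # One header line, `count` content lines, one trailer line.
    if len(lines) != count + 2:
        raise RecordCorrupt("%r: expected %d content lines" % (key, count))
    if lines[-1] != b"end " + key + b"\n":
        raise RecordCorrupt("%r: bad trailer %r" % (key, lines[-1]))


def insert_checked(store, records):
    """Store (key, raw_bytes) pairs, refusing any that fail the check."""
    for key, raw_bytes in records:
        check_raw_record(key, raw_bytes)
        store[key] = raw_bytes
```

With something along these lines the sender can ship raw bytes
untouched, and a corrupt record is rejected at the point where it would
otherwise be written into the target repository.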
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkmxl4IACgkQJdeBCYSNAAN1awCfVi1iQOzVJTduy45Z/5y2Rms9
M7IAn2ovCowhzSOyG+myoE+dge5nUUtH
=pT8k
-----END PGP SIGNATURE-----