[merge] Weave checks sha1 sum while extracting
Robert Collins
robertc at robertcollins.net
Tue Dec 20 02:29:57 GMT 2005
On Mon, 2005-12-19 at 11:46 -0600, John A Meinel wrote:
> Right now, every text inside of a weave has a sha1 sum, but we never
> check it. Attached is a patch which checks the sha1 sum every time we
> extract a text.
> The interesting thing about our current weave code, is that doing the
> sha1 only adds a negligible amount of time. Sha1 is pretty fast, much
> faster than iterating over our weave format.
>
> I ran "bzr selftest" twice, with and without the change, and it runs in
> the same amount of time (114s,111s without vs 112s,113s with).
> I also ran bzr check twice
>
> I really think this should be included. The overhead is low, and it
> means we actually do something about the sha1 sums, rather than only at
> "bzr check" time. (which until now they were actually ignored as well,
> only the inventory => text sha1 was ever checked).
>
> I also ran "bzr check" and the difference was 27min without, and 28min
> with. So again, the overhead is minimal.
>
> I've added the change to my jam-integration branch. Because the
> performance hit is negligible, and it does improve the integrity checking.
Meta-question - what does jam-integration represent? Is it 'all past the
review process' changes, or is it 'stuff you are happy with' ? (this
just lets me know whether to cherry pick as things get a second +1 from
me (yours is implied ;), or whether its always safe to yank the entire
branch.) I've started only putting stuff that is passed the +1 from two
people requirement into my integration branch -> which means that
usually my changes do not land there. Ironically enough.
> John
> =:->
> plain text document attachment (weave-checks-sha1sum.patch)
> === modified file 'bzrlib/tests/test_weave.py'
> --- bzrlib/tests/test_weave.py
> +++ bzrlib/tests/test_weave.py
> @@ -881,3 +898,103 @@
> eq = self.assertEquals
> eq(sorted(wa.iter_names()), ['v1', 'v2', 'v3', 'x1',])
> eq(wa.get_text('x1'), 'line from x1\n')
> +
> +
> +
> +class Corruption(TestCase):
^^
two lines between top level code blocks please :).
> +
> + def test_detection(self):
> + """Test weaves detect corruption.
I think that docstrings on tests are actually questionable: they make it
harder to find the test with 'selftest -v' - a python path is much nicer
for finding the test. What do you think? (i.e.
def test_corruption_detection(self):
# check that a weave with content that results in bad
# checksums raises WeaveInvalidChecksum on text
# extraction.
> @@ -548,8 +549,24 @@
> def get_iter(self, name_or_index):
> """Yield lines for the specified version."""
> incls = [self.maybe_lookup(name_or_index)]
> + if len(incls) == 1:
> + index= incls[0]
> + s = sha.new()
whats s? Is it the current sha ? (single letter variable names hurt my
brain).
-0 with the 's' variable.
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20051220/0d562b23/attachment.pgp
More information about the bazaar
mailing list