[merge] Weave checks sha1 sum while extracting
John Arbash Meinel
john at arbash-meinel.com
Tue Dec 20 02:36:17 GMT 2005
Robert Collins wrote:
> On Mon, 2005-12-19 at 11:46 -0600, John A Meinel wrote:
>
>>Right now, every text inside of a weave has a sha1 sum, but we never
>>check it. Attached is a patch which checks the sha1 sum every time we
>>extract a text.
>>The interesting thing about our current weave code, is that doing the
>>sha1 only adds a negligible amount of time. Sha1 is pretty fast, much
>>faster than iterating over our weave format.
>>
>>I ran "bzr selftest" twice, with and without the change, and it runs in
>>the same amount of time (114s,111s without vs 112s,113s with).
>>I also ran bzr check twice
>>
>>I really think this should be included. The overhead is low, and it
>>means we actually do something about the sha1 sums, rather than only at
>>"bzr check" time. (which until now they were actually ignored as well,
>>only the inventory => text sha1 was ever checked).
>>
>>I also ran "bzr check" and the difference was 27min without, and 28min
>>with. So again, the overhead is minimal.
>>
>>I've added the change to my jam-integration branch. Because the
>>performance hit is negligible, and it does improve the integrity checking.
>
>
> Meta-question - what does jam-integration represent? Is it 'all past the
> review process' changes, or is it 'stuff you are happy with' ? (this
> just lets me know whether to cherry pick as things get a second +1 from
> me (yours is implied ;), or whether its always safe to yank the entire
> branch.) I've started only putting stuff that is passed the +1 from two
> people requirement into my integration branch -> which means that
> usually my changes do not land there. Ironically enough.
>
Actually, I was thinking about creating a jam-pending branch, just for
this sort of thing. So that jam-integration does become 'always ready'
and not something that would ever need to be cherrypicked.
Since you have made the request, I'll create one.
>
>>John
>>=:->
>>plain text document attachment (weave-checks-sha1sum.patch)
>
>
>>=== modified file 'bzrlib/tests/test_weave.py'
>>--- bzrlib/tests/test_weave.py
>>+++ bzrlib/tests/test_weave.py
>>@@ -881,3 +898,103 @@
>> eq = self.assertEquals
>> eq(sorted(wa.iter_names()), ['v1', 'v2', 'v3', 'x1',])
>> eq(wa.get_text('x1'), 'line from x1\n')
>>+
>>+
>>+
>>+class Corruption(TestCase):
>
>
> ^^
> two lines between top level code blocks please :).
I keep going around and cleaning up code, and still they sneak through :)
>
>
>>+
>>+ def test_detection(self):
>>+ """Test weaves detect corruption.
>
>
> I think that docstrings on tests are actually questionable: they make it
> harder to find the test with 'selftest -v' - a python path is much nicer
> for finding the test. What do you think? (i.e.
> def test_corruption_detection(self):
> # check that a weave with content that results in bad
> # checksums raises WeaveInvalidChecksum on text
> # extraction.
>
>
I have been doing comments rather than docstrings, but I'm realizing we
probably should just get rid of the "use first line if present" stuff,
because I really do prefer to always see the paths.
>
>>@@ -548,8 +549,24 @@
>> def get_iter(self, name_or_index):
>> """Yield lines for the specified version."""
>> incls = [self.maybe_lookup(name_or_index)]
>>+ if len(incls) == 1:
>>+ index= incls[0]
>>+ s = sha.new()
>
>
> whats s? Is it the current sha ? (single letter variable names hurt my
> brain).
>
> -0 with the 's' variable.
-0 overall, or just for 's'?
Personally, for short-lived variables, I like them, as sort of a Math
notation thing. But I'm not stuck on them. The problem is the nice names
"sha", "sha1" are frequently taken elsewhere, and
this_will_keep_track_of_the_sha_when_all_lines_are_read is a little bit
too long. :)
Seriously, though, for your sake, I will try to avoid vars shorter than
3 letters.
>
> Rob
John
=:->
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 256 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20051219/12daf13a/attachment.pgp
More information about the bazaar
mailing list