[MERGE] knit text adding reductions
Ian Clatworthy
ian.clatworthy at internode.on.net
Wed Sep 12 01:13:33 BST 2007
Robert Collins wrote:
> Knits no longer assert that content being inserted is correctly split
> and bytestrings. The overhead of doing that was nearly 1/3 the time it
> takes to gzip the content - a very high overhead, and one that is
> unlikely to find bugs in bzr's core code. I've added docstring changes
> to make it clear that this is not enforced by the API.
bb:tweak
As Aaron has expressed as well, safety vs performance in API design
leaves my torn. I really want both.
I think what you've done here with random_id is excellent: made the
default safe but provided a way for the client code to say "skip the
check - I know what I'm doing".
I'd like to suggest you do the same for the content checks. Add a
parameter called (say) be_paranoid, default it to True and use that
parameter to run the checks if set. In the call to add_lines_with_ghosts
in repository.py, we can set be_paranoid=False and get 99% of the speed
gain.
Some other questions/comments below.
> ------------------------------------------------------------------------
>
> # Bazaar merge directive format 2 (Bazaar 0.90)
> # revision_id: robertc at robertcollins.net-20070910035304-\
> # iwf9nmqoxeshjaki
> # target_branch: http://bazaar-vcs.org/bzr/bzr.dev
> # testament_sha1: 4830a4736b68330b212ffba5511d2f144f284ea9
> # timestamp: 2007-09-10 13:53:25 +1000
> # source_branch: http://people.ubuntu.com/~robertc/baz2.0/knits
> # base_revision_id: pqm at pqm.ubuntu.com-20070907145828-hjh5941jv7y8d9z8
> #
> # Begin patch
> === modified file 'NEWS'
> --- NEWS 2007-09-07 13:33:03 +0000
> +++ NEWS 2007-09-10 03:37:19 +0000
> @@ -186,6 +186,11 @@
> allows the avoidance of double-sha1 calculations during commit.
> (Robert Collins)
>
> + * The ``VersionedFile`` interface no longer protects against misuse when
> + lines that are not lines, or are not strings are supplied. This saves
> + nearly 30% of the minimum cost to store a version of a file.
> + (Robert Collins)
> +
> * ``Transport.should_cache`` has been removed. It was not called in the
> previous release. (Martin Pool)
Rather than 'no longer protects', reword as 'checks can be bypassed'
(assuming you agree with the be_paranoid approach).
> @@ -904,10 +898,12 @@
> options = []
> if lines:
> if lines[-1][-1] != '\n':
> + # copy the contents of lines.
> + lines = lines[:]
> options.append('no-eol')
> lines[-1] = lines[-1] + '\n'
Is this fixing a bug? It wasn't obvious to me why you needed to make a
copy of the lines (given the other changes I could see in this patch).
> === modified file 'bzrlib/tests/repository_implementations/test_commit_builder.py'
> --- bzrlib/tests/repository_implementations/test_commit_builder.py 2007-09-04 03:53:07 +0000
> +++ bzrlib/tests/repository_implementations/test_commit_builder.py 2007-09-10 01:27:10 +0000
> @@ -34,6 +34,7 @@
> builder = branch.repository.get_commit_builder(
> branch, [], branch.get_config())
> self.assertIsInstance(builder, CommitBuilder)
> + self.assertTrue(builder.random_revid)
> branch.repository.commit_write_group()
> branch.repository.unlock()
>
> @@ -101,6 +102,7 @@
> except CannotSetRevisionId:
> # This format doesn't support supplied revision ids
> return
> + self.assertFalse(builder.random_revid)
> self.record_root(builder, tree)
> builder.finish_inventory()
> self.assertEqual(revision_id, builder.commit('foo bar'))
I'm fine with this. I'm curious though why you felt it necessary to make
random_revid a "public" attribute of a CommitBuilder. Being internal and
not testing its value in unit tests would have been ok IMO. By making
it public, I gather you want each and every CommitBuilder from now on to
expose this attribute?
> === modified file 'bzrlib/tests/test_versionedfile.py'
> --- bzrlib/tests/test_versionedfile.py 2007-09-06 05:58:13 +0000
> +++ bzrlib/tests/test_versionedfile.py 2007-09-10 03:37:19 +0000
> @@ -115,16 +115,6 @@
> f = self.reopen_file()
> verify_file(f)
>
> - def test_add_unicode_content(self):
> - # unicode content is not permitted in versioned files.
> - # versioned files version sequences of bytes only.
> - vf = self.get_file()
> - self.assertRaises(errors.BzrBadParameterUnicode,
> - vf.add_lines, 'a', [], ['a\n', u'b\n', 'c\n'])
> - self.assertRaises(
> - (errors.BzrBadParameterUnicode, NotImplementedError),
> - vf.add_lines_with_ghosts, 'a', [], ['a\n', u'b\n', 'c\n'])
> -
> def test_add_follows_left_matching_blocks(self):
> """If we change left_matching_blocks, delta changes
>
> @@ -142,21 +132,6 @@
> left_matching_blocks=[(0, 2, 1), (1, 3, 0)])
> self.assertEqual(['a\n', 'a\n', 'a\n'], vf.get_lines('3'))
>
> - def test_inline_newline_throws(self):
> - # \r characters are not permitted in lines being added
> - vf = self.get_file()
> - self.assertRaises(errors.BzrBadParameterContainsNewline,
> - vf.add_lines, 'a', [], ['a\n\n'])
> - self.assertRaises(
> - (errors.BzrBadParameterContainsNewline, NotImplementedError),
> - vf.add_lines_with_ghosts, 'a', [], ['a\n\n'])
> - # but inline CR's are allowed
> - vf.add_lines('a', [], ['a\r\n'])
> - try:
> - vf.add_lines_with_ghosts('b', [], ['a\r\n'])
> - except NotImplementedError:
> - pass
> -
> def test_add_reserved(self):
> vf = self.get_file()
> self.assertRaises(errors.ReservedId,
I'd like to leave these tests in given they are still meaningful in the
be_paranoid=True case.
Ian C.
More information about the bazaar
mailing list