[MERGE] Treat assert statements in our code as a hard error
Andrew Bennetts
andrew at canonical.com
Tue Apr 29 08:49:15 BST 2008
Martin Pool wrote:
> Per the previous rfc, here is a long patch that removes all assert statements
> from bzrlib.
>
> It's not quite mechanical because I tried to make a reasonable judgement about
> what to do with each of them. Some should always be done with an if/raise, some
> should be done as unittest assertions, and some are look-before-you-leap or
> unlikely to ever fail.
bb:tweak
This patch looks good to me. I have a couple of comments on specific parts, but
nothing serious. I ran all the changed files through pyflakes and didn't find
any typos in the new code (I found a few old ones, though...).
[...]
> The test for not using assert uses the python parser to read all our files. It
> is a bit slow, taking 12s on my machine to load and search them. We could use
> a less precise but faster regexp, but I think the cost is reasonable to avoid
> false negatives or positives.
I think that 12s is acceptable for that.
> === modified file 'bzrlib/_dirstate_helpers_py.py'
> --- bzrlib/_dirstate_helpers_py.py 2007-10-09 03:35:33 +0000
> +++ bzrlib/_dirstate_helpers_py.py 2008-04-24 07:22:53 +0000
> @@ -200,7 +200,9 @@
> fields = text.split('\0')
> # Remove the last blank entry
> trailing = fields.pop()
> - assert trailing == ''
> + if trailing != '':
> + raise AssertionError("dirtstate line %r has trailing garbage: %r"
> + % (trailing,))
As Matt Nordhoff pointed out, you have a typo here: “dirtstate” :)
> === modified file 'bzrlib/_patiencediff_py.py'
> --- bzrlib/_patiencediff_py.py 2007-09-04 09:10:35 +0000
> +++ bzrlib/_patiencediff_py.py 2008-04-24 07:22:53 +0000
> @@ -197,8 +197,10 @@
> next_a = -1
> next_b = -1
> for a,b,match_len in answer:
> - assert a >= next_a, 'Non increasing matches for a'
> - assert b >= next_b, 'Not increasing matches for b'
> + if a < next_a:
> + raise ValueError('Non increasing matches for a')
> + if b < next_b:
> + raise ValueError('Not increasing matches for b')
Whereas here you kept an existing typo: “Non” vs. “Not”.
If you feel like doing an extra drive-by cleanup, PEP 8 says there should be
more whitespace in “for a,b,match_len”.
> === modified file 'bzrlib/dirstate.py'
This wasn't changed by your patch, but pyflakes shows that some other
AssertionErrors in this file have a typo in them:
bzrlib/dirstate.py:2583: undefined name 'dirblocks'
bzrlib/dirstate.py:2588: undefined name 'dirblocks'
The relevant code is:
if len(self._dirblocks) > 0:
if not self._dirblocks[0][0] == '':
raise AssertionError(
"dirblocks don't start with root block:\n" + \
pformat(dirblocks))
if len(self._dirblocks) > 1:
if not self._dirblocks[1][0] == '':
raise AssertionError(
"dirblocks missing root directory:\n" + \
pformat(dirblocks))
It'd be good to do the trivial fix (s/dirblocks/self._dirblocks/) on the
offending lines as part of this patch.
> --- bzrlib/dirstate.py 2008-03-06 18:21:43 +0000
> +++ bzrlib/dirstate.py 2008-04-24 07:22:53 +0000
> @@ -393,8 +393,9 @@
> # faster than three separate encodes.
> utf8path = (dirname + '/' + basename).strip('/').encode('utf8')
> dirname, basename = osutils.split(utf8path)
> - assert file_id.__class__ == str, \
> - "must be a utf8 file_id not %s" % (type(file_id))
> + if file_id.__class__ != str:
“if type(file_id) is not str:” is the usual idiom, and avoids the nasty
double-underscores. “is not” is faster than “!=” too.
Hmm, maybe this code is using __class__ for speed? Robert might remember. If
so, a comment to say so would be good.
[...]
> - assert path_utf8.__class__ == str, ('path_utf8 is not a str: %s %s'
> - % (type(path_utf8), path_utf8))
> + if not isinstance(path_utf8, str):
Again, “type(path_utf8) is not str” is preferable, I think.
> === modified file 'bzrlib/inventory.py'
> --- bzrlib/inventory.py 2008-04-08 03:39:43 +0000
> +++ bzrlib/inventory.py 2008-04-24 07:22:53 +0000
> @@ -112,7 +112,6 @@
> InventoryFile('2326', 'wibble.c', parent_id='2325', sha1=None, len=None)
> >>> for path, entry in i.iter_entries():
> ... print path
> - ... assert i.path2id(path)
This is effectively changing a test... I guess you've already carefully
considered the impact of this? The obvious alternative would be something like
“if not i.path2id(path): raise AssertionError()”.
> - @deprecated_method(symbol_versioning.one_zero)
> - def diff(self, text_diff, from_label, tree, to_label, to_entry, to_tree,
> - output_to, reverse=False):
Hooray for removing deprecated code! :)
> === modified file 'bzrlib/lazy_import.py'
[...]
> - assert import_str.startswith('import ')
> + if not import_str.startswith('import '):
> + raise ValueError('bad import string %r'
> + % (import_str,))
It's weird to wrap such a short line.
> === modified file 'bzrlib/merge.py'
[...]
> @@ -1236,7 +1236,6 @@
> a_cur = ai + l
> b_cur = bi + l
> for text_a, text_b in zip(plain_a[ai:a_cur], plain_b[bi:b_cur]):
> - assert text_a == text_b
> yield "unchanged", text_a
I feel slightly sad to lose some of these assertions, like this one. Even
though I don't expect it to ever fail, it helps the readability of the function
a little bit by reminding the reader of an invariant or precondition in the
code, especially algorithmic code. Not so much that it out-weighs the
drawbacks...
> === modified file 'bzrlib/merge3.py'
[...]
> for zmatch, zend, amatch, aend, bmatch, bend in self.find_sync_regions():
> - #print 'match base [%d:%d]' % (zmatch, zend)
> -
> matchlen = zend - zmatch
> - assert matchlen >= 0
> - assert matchlen == (aend - amatch)
> - assert matchlen == (bend - bmatch)
Ditto here...
Perhaps for very algorithmic code like these we should just have a comment like:
# Loop invariants:
# - matchlen >= 0
# - matchlen == (aend - amatch)
# - matchlen == (bend - amatch)
I haven't looked at this particular method outside of this diff, so I'm not sure
how helpful this would actually be. Just a thought.
> === modified file 'bzrlib/osutils.py'
[...]
> def sha_file(f):
> - if getattr(f, 'tell', None) is not None:
> - assert f.tell() == 0
> + """Calculate the hexdigest of an open file.
> +
> + The file cursor should be already at the start.
> + """
Thanks for adding a docstring!
> === modified file 'bzrlib/remote.py'
[...]
> - assert type(key) is str
> + if not isinstance(key, str):
> + raise ValueError(key)
Hmm, why change to isinstance here? Having a good reason to use a subclass of
str is a really unlikely... The extra flexibility strikes me as a YAGNI (and
changing type to isinstance later is cheap if I turn out to be wrong).
> === modified file 'bzrlib/tests/test_patches.py'
[...]
> -import unittest
[...]
> def test():
> patchesTestSuite = unittest.makeSuite(PatchesTester,'test')
There's still a few references to the “unittest” global in this module. Looks
like it's probably dead code, so just delete it I guess.
> === modified file 'bzrlib/status.py'
[...]
> - assert inner_merges[0] is None
> - inner_merges.pop(0)
> + if inner_merges.pop(0) != None:
> + raise AssertionError()
It's better style to compare None using “is”/“is not” than ==/!=. The is
operator can't be fooled by overriding a magic method, and it's quicker.
> === modified file 'bzrlib/repofmt/pack_repo.py'
Again, not the fault of your patch, but there's some pyflakes warnings in this
file that are coincidentally also in rarely-hit code paths (like a raising an
exception):
bzrlib/repofmt/pack_repo.py:1099: undefined name 'keys'
bzrlib/repofmt/pack_repo.py:1936: undefined name 'mutter_callsite'
bzrlib/repofmt/pack_repo.py:2175: undefined name 'RepositoryFormat'
-Andrew.
More information about the bazaar
mailing list