[patch] cleanups to trace/selftest

Fri Jul 7 03:10:58 BST 2006

Thanks to both of you for reviewing it.

On  6 Jul 2006, John Arbash Meinel <john at arbash-meinel.com> wrote:

> > Remove the --quiet global option, which only applied to one command and not
> > very well.
> 
> Didn't --quiet switch to using the SilentUIFactory instead of Text? If
> it didn't it probably should have. Which might be a reason to leave it in.

No, it didn't.  It basically just set one flag which was then used by
add, which was a kind of convoluted way to do it.

Possibly there should be a way to set the factory globally.  I'm not
sure we'd want it to be totally silent, but one that shows only
warnings/errors could be good.

> > It removes the dependency on logging, which may have a slight peformance or
> > load time gain.  (We probably want to set that up explicitly when loading
> > paramiko, which does use it, but that wasn't done perfectly before anyhow.)
> > 
> 
> logging costs about 12ms to load. So there is a gain, but not a huge one.

Right and as Aaron says it's also slower.  I haven't measured the
overall suite time, I might do that later.

> > === modified file 'bzrlib/builtins.py'
> > --- bzrlib/builtins.py	2006-07-04 15:04:11 +0000
> > +++ bzrlib/builtins.py	2006-07-05 07:53:58 +0000
> > @@ -31,7 +31,7 @@
> >  from bzrlib.bundle.apply_bundle import install_bundle, merge_bundle
> >  from bzrlib.conflicts import ConflictList
> >  from bzrlib.commands import Command, display_command
> > -from bzrlib.errors import (BzrError, BzrCheckError, BzrCommandError, 
> > +from bzrlib.errors import (BzrError, BzrCheckError, BzrCommandError,
> >                             NotBranchError, DivergedBranches, NotConflicted,
> >                             NoSuchFile, NoWorkingTree, FileInWrongBranch,
> >                             NotVersionedError, NotABundle)
> 
> ^-- What is your vim setup to detect trailing whitespace? These changes
> are confusing, but if we get them fixed, we'll see less of them.

It's 

  let python_highlight_builtins = 1
  let python_highlight_space_errors = 1
  let python_highlight_exceptions = 1

(well, really the second of those)

Actually it's kind of a bad habit to change them ad-hoc like this.  I
think for indents or cleanups like this it's better to just make one
batch change.  Similarly for the copyrights.

> > @@ -1912,7 +1915,7 @@
> >          print '%10s: %s' % ('bzr', osutils.realpath(sys.argv[0]))
> >          print '%10s: %s' % ('bzrlib', bzrlib.__path__[0])
> >          print
> > -        info('running tests...')
> > +        print 'running tests...'
> 
> ^- I kind of cringe at this.
> 1) info() used to go to stderr instead of stdout. Which for selftest,
> going to stdout is fine.
> By why 'print' instead of self.outf.write()?

Only because it's consistent with the lines above.  I can change all of
them if you like.

> ^- So is this just assuming that AmbiguousBase will be reported properly
> by the standard exception handler?

That's correct, the general plan is that errors know how to print themselves,
rather than being caught and printed.  (Though there may be special cases where
the error has a different meaning in a particular context.)  But more
importantly, this error is deprecated and not raised anymore.

> What are the FTP server tests doing? I can see that they are calling
> mutter, which was probably just a debugging thing for me. But because
> they just go through the standard test_transport_implementations, I
> don't believe the log is inspected for anything.

Actually they get a unicode error from inside the server, which was rather
surprising to me.  I can keep trying to track it down but wanted to get this
reviewed & landed before it touched too many more places.

I do think in general making the UI silent while running non-UI-related tests
is probably the way to go.

> I belive you need to have:
> 
> def mutter_to_test_log(fmt, *args):
>                             ^-important
> 
> So that you can pass multiple arguments. And so that if you pass a tuple
> as the first argument it gets formatter properly.

Actually the top-level mutter() method takes care of that and always passes a
tuple.

> ^- Would it be possible to not put the log files in /tmp, we haven't
> been very good about cleaning them up. (Though it may just be that we
> weren't cleaning up skipped tests). 'ls /tmp' sometime, you'll see all
> the old log messages.

We could put them in test0000.tmp, but then if we forget to close them we may
be unable to clean up on windows.  I think the only thing is to run cleanups
on skipped tests, and also make sure to run all the cleanups even if one
fails.

Anyhow once we fix the ftp bug we can get rid of this function, and only set
up a UI factory for tests that are actually going to use it.  Actually no
tests look at the result now.

> >      def _get_log(self):
> > -        """Return as a string the log for this test"""
> > +        """Return as a string the log for this test
> > +        
> > +        The log accumulates all diagnostic messages from the program.
> > +
> > +        This may be called multiple lines to look for lines arriving in 
> > +        the log.
> > +        """
> 
> ^- "may be called multiple times" not lines. I still wonder what the
> difference is between the debug log and the real log. And why when a
> test is run the contents are going to be scattered rather than serialized.

After doing this I'm more sure there really is a difference and it's this: the
regular log is things the process writes to stdout/stderr; if you are doing a
blackbox test and want to make assertions about what bzr writes out this is
the place to do it.  (But actually it's better to do that through
run_bzr_captured, not this method.)  The debug log is something we generally
shouldn't be making assertions about, but that should be recorded and
displayed if something goes wrong, and where we'll want to accumulate it
across the whole test, not just through particular calls.  So I am convinced
they should be separate.

> v- I think I would miss having the exact command arguments be logged.
> Because many times the code may be iterating over a list, and you don't
> know which one is failing.

OK, done.

> > -        self.log('run bzr: %r', argv)
> > -        # FIXME: don't call into logging here
> > -        handler = logging.StreamHandler(stderr)
> > -        handler.setLevel(logging.INFO)
> > -        logger = logging.getLogger('')
> > -        logger.addHandler(handler)
> > -        old_ui_factory = bzrlib.ui.ui_factory
> > -        bzrlib.ui.ui_factory = bzrlib.tests.blackbox.TestUIFactory(
> > +        ## stdout = codecs.getwriter(encoding)(tempfile.TemporaryFile())
> > +        ## stderr = codecs.getwriter(encoding)(tempfile.TemporaryFile())
> 
> ^-- stdout and stderr should really be defined as bytestreams. The
> automatic translation was hurting us, which is why I added 'self.outf'.
> Which will do the auto-translation, but let us have more control over
> it. I'm not sure what you were testing here.

I was just trying to get things to pass - some code seemed to have an
assumption they handled encodings, but I was wrong.  I agree they should be
bytestreams, same as the real streams.

> > -        out = stdout.getvalue()
> > -        err = stderr.getvalue()
> > +            ui.switch_ui_factory(old_ui_factory)
> > +        stdout.seek(0)
> > +        out = stdout.read()
> > +        stderr.seek(0)
> > +        err = stderr.read()
> 
> ^- Why seek instead of just using stdout.getvalue()?
> It looks like you used a temp file for a while. Maybe that is why.

Yes.  I put it back.

> > -        log = self._get_log()
> > -        self.failUnless('All changes applied successfully.\n' in log)
> > +        # we used to look in the UI output from merge, but this is a bit of a
> > +        # layering violation - only blackbox tests should look at that --
> > +        # merge should probably give some kind of structured report on what
> > +        # was done -- mbp 20060706
> > +        ## log = self._get_log()
> > +        ## self.failUnless('All changes applied successfully.\n' in log)
> 
> ^- I'm a little concerned that until merge does create a structured
> report, that we might miss stuff. Obviously the test is now not
> asserting the same consistency it used to.

OK, I might run that through a TestUIFactory and look at the results.  That
would be better than the previous test because it doesn't make sure the
message comes from the second call rather than the first - which is a case of
the thing above about not running together all of the ui output for the whole
test.

> >  class MetaTestLog(TestCase):
> >  
> >      def test_logging(self):
> > -        """Test logs are captured when a test fails."""
> > +        """Debug messages from tests are captured."""
> >          self.log('a test message')
> >          self._log_file.flush()
> > -        self.assertContainsRe(self._get_log(), 'a test message\n')
> > +        self.assertContainsRe(self._get_debug_log(), 'a test message\n')
> 
> ^- If you are flushing in _get_debug_log() do we need to flush here?

Done

> > +
> > +    def test_fails(self):
> > +        return
> > +        self.log('something from mutter')
> > +        self.fail()
> 
> ^- At least put a comment that this is for testing. Otherwise it looks
> like a whole lot of cruft.

Done

> > -class TestTrace(TestCase):
> > +class TestTrace(TestCase):   ## tests.TestCaseWithTransport):
> 
> ^- looks like something that should be cleaned up one way or another.

Done

> >      def test_format_sys_exception(self):
> >          try:
> > @@ -80,8 +81,6 @@
> >      def test_trace_unicode(self):
> >          """Write Unicode to trace log"""
> >          self.log(u'the unicode character for benzene is \N{BENZENE RING}')
> > -        self.assertContainsRe('the unicode character',
> > -                self._get_log())
> 
> ^- This was testing that we could log a unicode character and not die,
> and that the text was actually in the log. It seems like we should keep
> some sort of check here. Maybe it should be self._get_debug_log().

Yes, I put that in - though the parameters to assertContainsRe were
reversed and I'm not sure whow it worked before.

> It looks like self.log() is going through mutter(), which is now
> silenced unless we are running a run_bzr() command.

No, it's normally accumuated into the test case's _debug_log.  Which makes
sense, I think - a list of internal trace messages accumulated while
running this particular test.

> Something isn't sitting right with me about this. I do want a way to
> test that mutter(u'Unicode string') won't die.
> And never muttering anything doesn't seem the right fix.

So this test now makes sure that messages emitted while testing are
accumulated in the test.  We might also want a test that the non-test
behaviour works properly, which is a bit ticklish, but can be handled by a
test that restores the regular behaviour.

> > @@ -97,12 +96,18 @@
> >          # raise an exception
> >          mutter(u'Writing a greek mu (\xb5) works in a unicode string')
> >          mutter('But fails in an ascii string \xb5')
> > -        # TODO: jam 20051227 mutter() doesn't flush the log file, and
> > -        #       self._get_log() opens the file directly and reads it.
> > -        #       So we need to manually flush the log file
> > -        import bzrlib.trace
> > -        bzrlib.trace._trace_file.flush()
> > -        log = self._get_log()
> > -        self.assertContainsRe(log, 'Writing a greek mu')
> > -        self.assertContainsRe(log, 'UnicodeError')
> > -        self.assertContainsRe(log, "'But fails in an ascii string")
> > +        # this test just wants to make sure that writing them doesn't fail, so
> > +        # we don't check anywhere that they were written
> > +
> 
> But if they aren't written anywhere, we don't know if they would fail if
> they were *really* being written somewhere.
> We need some way to really test mutter.
> It may be that for these tests we need to hook mutter up to actually log
> stuff. But I really don't like having the tests passing because mutter()
> isn't doing anything.

I think the right way to test this is to add a -D option then run that
captured, and make sure the messages are seen on stderr.

> > +
> > +def note(*args):
> > +    """Write a diagnostic message to the user"""
> > +    ui.ui_factory.show_note((args[0] % args[1:]))
> 
> I think it is better to write these as:
> def note(fmt, *args):
> 	ui.ui_factory.show_note(fmt % args)
> 
> That way you get a TypeError about missing an argument, rather than an
> index error.

Done.

> Also, why the double parenthesis? Where you planning on passing a tuple?
> You need an extra comma for that.

Just a leftover.

> I'm also concerned that there is a cyclical import here. Where trace
> imports ui, and ui imports trace.

No, there's not.

> > === modified file 'bzrlib/ui/__init__.py'
> > --- bzrlib/ui/__init__.py	2006-06-20 03:57:11 +0000
> > +++ bzrlib/ui/__init__.py	2006-07-06 07:02:54 +0000
> > @@ -1,4 +1,4 @@
> > -# Copyright (C) 2005 Canonical Ltd
> > +# Copyright (C) 2005, 2006 Canonical Ltd
> >  
> >  # This program is free software; you can redistribute it and/or modify
> >  # it under the terms of the GNU General Public License as published by
> 
> ^- if you are fixing the copyright, shouldn't you fix the blank lines?

I'll do a batch update to them in a while.

> I think I like the overall direction of this. I'm just concerned about
> some of the entries which stop testing code, especially whether mutter()
> is unicode safe.

OK, thanks.  I'll add the ability to send mutter to stderr, then test
that.

-- 
Martin