Questions after testtools merge

Martin (gzlist) gzlist at googlemail.com
Tue Dec 29 09:31:04 GMT 2009


Sorry that some of this feedback has been confusing; it's just
things I happened to run into. Reading a bunch of new code to work out
why things have broken makes it hard to give clear reports.

On 29/12/2009, Robert Collins <robertc at robertcollins.net> wrote:
>
> MIME can represent bytes too; I chose text for tracebacks because python
> unittest treats backtraces as text: the first thing the stock unittest
> does is stringify backtraces: putting them in a mime container is a
> natural fit. Any encoding issues there are python 2.x issues, as 3.x
> doesn't have the horrible ambiguity that 2.x does.

Tracebacks and logs *are* conceptually text; they just don't have a
defined, consistent encoding prior to Python 3, so arbitrary bytes have
to be handled somehow. Currently, every non-UTF-8 byte in the log is
replaced with a question mark, and non-ASCII tracebacks cause the test
runner to fall over.

The code I already have handling logs and tracebacks and stderr output
is robust against any kind of bytestring and retains the original
bytes in a readable manner. I just want to give the raw bytes to my
smarter code, and testtools is getting in the way.
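
To illustrate the difference, here is a minimal sketch. It assumes
Python 3.5 or later (where bytes.decode accepts the backslashreplace
handler), and the sample bytes are made up. Decoding with 'replace'
throws the offending byte away, while a backslash escape keeps it
readable:

```python
# Hypothetical log line containing a Latin-1 byte that is not valid UTF-8.
raw_log = b"opening caf\xe9.txt"

# Lossy: the invalid byte becomes U+FFFD, which ends up as '?' once the
# text is squeezed into an ASCII-only terminal encoding.
lossy = raw_log.decode("utf-8", "replace")
as_ascii = lossy.encode("ascii", "replace").decode("ascii")

# Readable and recoverable: the invalid byte is shown as an escape.
readable = raw_log.decode("utf-8", "backslashreplace")
```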

Not being able to get the raw traceback with testtools is the most
annoying problem. I see no reason for addError et al. not to be passed
the exc_info they are given; the 'details' API is a step backwards.

> For logs, I started a thread on this list about what encoding the log
> should be in, and took the rough opinions from that thread into account
> in what I did. Note that prior to this patch landing we showed the log
> regardless, and if the encoding was wrong this could corrupt people's
> terminals - the decode-and-replace approach is much better IMO.

Not breaking terminals is good, but that should happen at the
printing-to-terminal stage, and the original, untampered-with bytes
should still be exposed to subclasses.
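
A minimal sketch of that separation, assuming Python 3. The class and
method names are hypothetical and not part of testtools or bzrlib, and
a fuller version would also escape control characters:

```python
import io


class LogKeepingResult:
    """Hypothetical result object: stores raw bytes, escapes only on output."""

    def __init__(self, stream):
        self.stream = stream
        self.raw_logs = {}

    def add_log(self, test_id, log_bytes):
        # Subclasses can read the untouched bytes from self.raw_logs.
        self.raw_logs[test_id] = log_bytes

    def print_log(self, test_id):
        # Non-ASCII bytes are escaped here, at the last moment before
        # anything reaches the terminal.
        text = self.raw_logs[test_id].decode("ascii", "backslashreplace")
        self.stream.write(text + "\n")


out = io.StringIO()
result = LogKeepingResult(out)
result.add_log("test_a", b"na\xefve log")
result.print_log("test_a")
```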

> As for the indirection: self.getDetails() returns a dict of Mime
> objects, each object has a content_type and iter_bytes() method (which
> returns a generator of the bytes of the object). text MIME objects have
> an iter_text() method too, which decodes using the appropriate codec.

Why not just use unicode values in the dict?
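
For reference, the shape being described is roughly the following. This
is a stand-in written for illustration in Python 3, not testtools'
actual implementation: the class name, constructor and charset argument
are assumptions, and only content_type, iter_bytes and iter_text come
from the description quoted above.

```python
class TextContent:
    """Stand-in for the MIME-style object described above (hypothetical)."""

    def __init__(self, content_type, charset, get_bytes):
        self.content_type = content_type
        self.charset = charset
        self._get_bytes = get_bytes

    def iter_bytes(self):
        # Yields the raw byte chunks unchanged.
        return iter(self._get_bytes())

    def iter_text(self):
        # Decodes each chunk with the declared codec; this assumes no
        # chunk boundary splits a multi-byte sequence.
        for chunk in self._get_bytes():
            yield chunk.decode(self.charset, "replace")


details = {"log": TextContent("text/plain", "utf-8", lambda: [b"caf\xc3\xa9\n"])}

raw = b"".join(details["log"].iter_bytes())
text = "".join(details["log"].iter_text())

# The simpler alternative: store the decoded text directly.
plain_details = {"log": "caf\u00e9\n"}
```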

> For the issue with _get_log, you haven't described what problem you're
> having, only said 'it's worse', so I really can't help you much.

That was a small aside, just a comment on the diff after I had to
look into it to restore the old behaviour. Duplicating the (dodgy)
UnicodeDecodeError handling isn't great, for a start.

> Can you
> at least describe what you want it to do different, and how what it does
> now does not work for you? setKeepLogFile is a bad function: it leads to
> leaked log files, and I really encourage you to stop using it. Perhaps
> you can explain why you want to use it? (Or just try the new code as-is
> - it may fix things for you).

I'm running the test suite in a (privilege-dropped) subprocess. Using
setKeepLogFile means the test runner can just tell the controller the
log file name rather than reserialising the log. It also means the log
stays around if something dies, which helps debugging. In other
words, leaking is a feature.
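
The arrangement looks roughly like this. It is a simplified sketch for
Python 3.7 or later that leaves out the privilege dropping and uses a
throwaway child script in place of the real test runner; none of the
names come from bzrlib:

```python
import os
import subprocess
import sys

# Child: writes its log to a file it does not delete, then reports
# only the path on stdout.
CHILD = r"""
import os, tempfile
fd, path = tempfile.mkstemp(suffix=".log")
with os.fdopen(fd, "wb") as log:
    log.write(b"raw log bytes \xff\n")
print(path)
"""

proc = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True, text=True, check=True,
)
log_path = proc.stdout.strip()

# Controller: reads the untouched bytes straight from the kept file.
with open(log_path, "rb") as log:
    log_bytes = log.read()

# Cleanup is the controller's decision; if the child had died, the
# file would still be there to inspect.
os.remove(log_path)
```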

Anyway, I've reported the bugs I care about, except the crash, which I
will need to look into again.

Martin


