<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
For what it's worth, distinguishing between tests based on the time
they take to run is borderline naive. The meaningful distinction is
what the test tests :D <br>
A unit test checks that the unit of work under test is doing what is
expected; <br>
integration tests check that we play well together; <br>
functional tests check behaviour; <br>
static analysis analyses the codebase to ensure conformity to agreed
policies. <br>
<br>
They all have meaning at different stages of development, and to
bundle them based on running time is to compromise these stages in
the long term.<br>
<br>
<div class="moz-cite-prefix">On 29/04/16 05:03, Nate Finch wrote:<br>
</div>
<blockquote
cite="mid:CAK=yn+t2d6wT5gHhom7+W9GwOvoM7SSd0L6RbX402Au3Wan_Pw@mail.gmail.com"
type="cite">
<div dir="ltr">Our full set of tests in <a moz-do-not-send="true"
href="http://github.com/juju/juu">github.com/juju/juu</a>
takes 10-15 minutes to run, depending on the speed of your
computer. It's no coincidence that our test pyramid looks more
like this ▽ than this △. Also, we have a lot of tests:
<div><br>
</div>
<div>
          <div>/home/nate/src/github.com/juju/juju$
            grep -r ") Test" . --include="*_test.go" | wc -l</div>
<div>9464</div>
<div><br>
</div>
<div>About small, medium, and large tests... I think that's a
good designation. Certainly 17 seconds is not a small
test. But I <i>think</i> it qualifies as medium (hopefully
most would be faster). Here's my suggestion, tying this
back into what I was talking about originally:</div>
<div><br>
</div>
<div>Small tests would be those that run with go test -short.
That gives you something you can run frequently during
development to give you an idea of whether or not you really
screwed up. Ideally each one should be less than 100ms to
run. (Note that even if all our tests ran this fast, it
would still take 15 minutes to run them, not including
compilation time).</div>
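          <div><br>
          </div>
          <div>For concreteness, here's a minimal sketch of that
            mechanism (the package and test names are made up):
            testing.Short() from the standard library reports whether
            -short was passed, and the test skips itself accordingly:</div>
          <pre>package sample_test // hypothetical package name

import "testing"

// TestSomethingSlow stands in for any medium or large test. A plain
// `go test` runs it; `go test -short` skips it via testing.Short().
func TestSomethingSlow(t *testing.T) {
	if testing.Short() {
		t.Skip("skipping slow test in -short mode")
	}
	// ... the slow, end-to-end part of the test would go here ...
}</pre>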
<div><br>
</div>
<div>Medium tests would also be run if you don't use -short.
Medium tests would still be something that an average
developer could run locally, and while she may want to get
up to grab a drink while they're running, she probably
wouldn't have time to run to the coffee shop to get said
drink. Medium tests would be anything more than 100ms, but
probably less than 15-20 seconds (and hopefully not many of
the latter). Medium tests would be run before making a PR,
and as a gating job.</div>
<div><br>
</div>
<div>Long tests should be relegated to CI, such as bringing up
instances in real clouds.</div>
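          <div><br>
          </div>
          <div>A minimal sketch of one way to fence those off (the build
            tag and names here are made up, not how juju is organized
            today): put CI-only tests behind a build tag so ordinary go
            test runs never compile them.</div>
          <pre>//go:build livecloud
// +build livecloud

package cloudtest // hypothetical package

import "testing"

// This file compiles only when invoked as: go test -tags livecloud ./...
// Local runs and the gating job never see it; the CI job opts in.
func TestProvisionRealInstance(t *testing.T) {
	// ... bring up a real cloud instance, exercise it, tear it down ...
}</pre>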
</div>
<div><br>
</div>
<div>I don't think it's terribly useful to divide tests up by
type of test. Who cares if it's a bug found with static
analysis or by executing the code? Either way, it's a bug.
The only thing that really matters is how long the tests take,
so we can avoid running slow tests over and over. I run go
          vet, golint, and go fmt on save in my editor. That's static
analysis, but they run far more often than I actually run
tests.... and that's because they're always super fast.</div>
<div><br>
</div>
<div>I think we all agree that all of these tests (except for CI
tests) should be used to gate landings. The question then is,
how do you run the tests, and how do you divide up the tests?
To me, the only useful metric for dividing them up is how long
they take to run. I'll run any kind of test you give me so
long as it's fast enough.</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Thu, Apr 28, 2016 at 12:39 PM Nicholas Skaggs
<<a moz-do-not-send="true"
href="mailto:nicholas.skaggs@canonical.com">nicholas.skaggs@canonical.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">On
04/28/2016 10:12 AM, Katherine Cox-Buday wrote:<br>
> On 04/27/2016 09:51 PM, Nate Finch wrote:<br>
>> So, this is exactly why I didn't want to mention the
nature of the<br>
>> test, because we'd get sidetracked. I'll make another
thread to talk<br>
>> about that specific test.<br>
Sorry I forced you into it, but it was important to this
discussion. I<br>
      wanted to understand your feelings towards a test you should be<br>
should be<br>
running regularly as you develop, aka a unit test, that took
more than a<br>
trivial amount of time to actually execute.<br>
>><br>
>> I do still want to talk about what we can do for unit
tests that take<br>
>> a long time. I think giving developers the option to
skip long tests<br>
>> is handy - getting a reasonable amount of coverage
when you're in the<br>
>> middle of the develop/test/fix cycle. It would be
really useful for<br>
>> when you're making changes that affect a lot of
packages and so you<br>
>> end up having to run full tests over and over. Of
course, running<br>
>> just the short tests would not give you 100%
confidence, but once<br>
>> you've fixed everything so the short tests pass,
*then* you could do<br>
>> a long run for thorough coverage.<br>
><br>
> I believe Cheryl has something like this in the works and
will be<br>
> sending a note out on it soon.<br>
><br>
Yes. It is imperative that developers can quickly (and I mean
quickly or<br>
      it won't happen!) run unit tests. We absolutely want test runs
to be a<br>
part of the code, build, run iteration loop.<br>
>> This is a very low friction way to increase developer
productivity,<br>
>> and something we can implement incrementally. It can
also lead to<br>
>> better test coverage over all. If you write 10 unit
tests that<br>
>> complete in milliseconds, but were thinking about
writing a couple<br>
>> longer-running unit tests that make sure things are
working<br>
>> end-to-end, you don't have the disincentive of "well,
this will make<br>
>> everyone's full test runs 30 seconds longer", since
you can always<br>
>> skip them with -short.<br>
>><br>
>> The only real negative I see is that it makes it less
painful to<br>
>> write long tests for no reason, which would still
affect landing<br>
>> times.... but hopefully everyone is still aware of
the impact of<br>
>> long-running tests, and will avoid them whenever
possible.<br>
><br>
> I will gently point out that we were prepared to land a
test that<br>
> takes ~17s to run without discussion. The motivations are
honest and<br>
> good, but how many others think the same? This is how our
test suite<br>
> grows to be unmanageable.<br>
><br>
> I also agree with Andrew that the nature of the test
should be the<br>
> delineating factor. Right now we tend to view everything
through the<br>
> lens of the Go testing suite; it's a hammer, and
everything is a nail.<br>
> Moving forward, I think we should try much harder to
delineate between<br>
> the different types of tests in the so-called test
pyramid,<br>
> <<a moz-do-not-send="true"
href="http://martinfowler.com/bliki/TestPyramid.html"
rel="noreferrer" target="_blank">http://martinfowler.com/bliki/TestPyramid.html</a>>
place like tests with<br>
> like tests, and then run classes of tests when and where
they're most<br>
> appropriate.<br>
I advocate for slotting things into the pyramid, and making
sure we are<br>
      right-sized in our testing. What sort of test counts would we come up<br>
      with for tests at each level? Would the base of the pyramid contain the<br>
      bulk of the tests? I suspect many of the juju unit tests are really<br>
      integration tests, and are part of the problem that exists now with<br>
      running the unit test suite. The other thing to note is that the higher<br>
      you go in the pyramid, several things work against making it easy for<br>
      developers. The higher a test sits on the pyramid, the more fragile it<br>
      is (more prone to intermittent failures and to breaking as code<br>
      changes), the harder it is to write, and the longer it takes to run.<br>
      Those tests at the top of the pyramid will absolutely require the most<br>
      investment and maintenance. This is why it's important for our test<br>
      suites to be right-sized, and for us to think carefully about what we<br>
      need to test and where / how we test it.<br>
<br>
To help with semantics, you can simply designate tests as
small, medium<br>
and large based upon how long they take to run. Small being
the bottom<br>
      of the pyramid, and large being the top. No need to argue scope, which<br>
can get tricky. So Nate, assuming your test in this case
wasn't static<br>
analysis or code checking (which by the way I would recommend
be<br>
'enforced' at the build bot level) but did require 17 seconds
to run, I<br>
would be hard pressed to place it in the small category. For a
codebase<br>
the size of juju, having even a small percentage of "unit"
tests run<br>
that long would quickly spiral to long overall runtimes. For
example,<br>
      even if only 5% of, say, 500 tests ran for 10 seconds, a full test run<br>
still takes over 4 minutes.<br>
<br>
<br>
Nicholas<br>
<br>
</blockquote>
</div>
<br>
<br>
</blockquote>
<br>
</body>
</html>