[MERGE] Automatic discovery of tests

Fri Nov 9 09:07:20 GMT 2007

Sorry if I offended you Alexander, that was not my intent. I
fully respect the work you've done so far to maintain bzr on
windows and it's far more in my view than just building the
installers.

Thanks for taking the time to reply with such details, I'll try
to do the same to explain my point.

>>>>> "bialix" == Alexander Belchenko <bialix at ukr.net> writes:

    bialix> Vincent Ladeuil writes:
    >>>>>>> "bialix" == Alexander Belchenko <bialix at ukr.net> writes:
    >> 
    >> <snip/>
    >> 
    >> >> I'm with Robert on that point. I understand Alexander concerns
    >> >> about chasing bad citizens with respect to windows compatibility,
    >> >> but I think in the long run, things will go worse if we stop
    >> >> embedding the tests.
    >> 
    bialix> Sorry, but I don't understand your point.
    >> 
    >> If you make it harder for people to run selftest, fewer people
    >> will run it.
    >> 
    >> If less people run the test, more and more will fail on windows
    >> and go undetected.

    bialix> For me as bzr users test is useless and I never like
    bialix> to run it on Windows in the current state, because
    bialix> they run very long (about 20-40 minutes on average
    bialix> machine, and during this time python process try to
    bialix> eat as much %CPU as he can, it intensively use hard
    bialix> disk [and some PC, especially notebooks don't like
    bialix> such stresses]), in the end it *always* fails because
    bialix> selftest has many windows-incompatibilities (on win98
    bialix> it just hangs on some http test),

I don't have any windows98 around but I'm interested in more
details. For example, did the fix for #158972 (bzr.dev at 2956) make
matters worse or better ?

And we need those 'selftest hangs' bugs fixed ! That's critical.

    bialix> selftest is hidden command and therefore only
    bialix> hardcore users/semi-bzr-developers (like me) know how
    bialix> to run it, during selftest run there is too much
    bialix> warning messages that regular users don't know how to
    bialix> interpret. And bzr.exe has some additional
    bialix> limitation, so some tests should be skipped for
    bialix> bzr.exe.

Ok. I get the point, so go ahead, drop selftest from bzr.exe,
you're right, the added benefit of having it available there is
not worth the time you can spend on enhancing the selftest for
the devs ;-)

    bialix> And most important: I see and I feel that nobody care
    bialix> about results of selftest on Windows.

You're damn wrong here. *I* care. Granted I spend less time on it
than on other areas, but I try to use my time for tasks where my
entropy is minimal.

    bialix> And I think I understand why. If you disagree with me
    bialix> than try to recall how 0.15 was delayed because of
    bialix> locking problems on Windows.

I realize (and *you* should too) how important your participation
was on that occasion.

    bialix> This is my strong opinion as bzr user. Not as bzr
    bialix> developer (even though I can't say I'm hardcore
    bialix> developer: my main role is just make windows
    bialix> releases).

I disagree here too, your main role is to remind other devs about
the status of bzr on windows, a side-effect being that you
produce the installers.

That's a *huge* difference in my eyes. I support you in your
efforts, as far as I can (which may not show ;).

<snip/>

    bialix> So I don't care about selftest, because many of bzr
    bialix> flaws on windows discovered by users without any test
    bialix> suite.

That's where the problem is *today*. Ideally the test suite
should be able to say: you lack symlink support you can't do
that, your locale is badly configured fix it, your filesystem
can't do that, etc.

Maybe we should define a smaller test suite focused on ensuring
the basics.

A possible way to get there is to attach more information to the
tests (like we began to do with ExpectedFailure).
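A minimal sketch of what "attaching more information" could look like,
written against Python's standard unittest module rather than bzr's own
test framework. The probe function and the test class are hypothetical
names used only for illustration:

```python
import os
import tempfile
import unittest


def has_symlinks():
    """Probe whether symlinks can really be created on this machine."""
    if not hasattr(os, "symlink"):
        return False
    with tempfile.TemporaryDirectory() as base:
        try:
            # A dangling link is enough to prove the capability.
            os.symlink(os.path.join(base, "target"),
                       os.path.join(base, "link"))
        except (OSError, NotImplementedError):
            return False
    return True


class TestSymlinkSupport(unittest.TestCase):

    # The reason string is the extra information: the report says *why*
    # the test did not run instead of showing a bare failure.
    @unittest.skipUnless(has_symlinks(), "this OS lacks symlink support")
    def test_symlink_roundtrip(self):
        with tempfile.TemporaryDirectory() as base:
            target = os.path.join(base, "target")
            link = os.path.join(base, "link")
            open(target, "w").close()
            os.symlink(target, link)
            self.assertEqual(os.readlink(link), target)
```

On a platform without symlinks the run reports one skipped test with
the reason attached, which is closer to "you lack symlink support, you
can't do that" than a traceback is.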

Tests *are* the specifications.

bzr runs on this os if this os matches the specifications.

If this os doesn't match, either it should be fixed or bzr should
be fixed (the latter may be easier though ;).

That's the theory.

In practice, what we have today is a bunch of tests failing.

But you have done significant work in the right direction so
far. For example, we now have a rule that we must use osutils
functions and not the os ones.

It may be possible to go further in that direction by forbidding the
use of 'import os' in bzr and forcing the use of osutils (and I
don't think performance is a concern here if we just provide
bindings).
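One way such a rule could be enforced is a small source check that
flags direct imports of os. This is only a sketch under my own
assumptions: the function name is hypothetical and nothing like it is
claimed to exist in bzr today.

```python
import ast


def find_os_imports(source, filename="<string>"):
    """Return sorted (line, statement) pairs for direct imports of os."""
    offenders = []
    for node in ast.walk(ast.parse(source, filename)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == "os" or alias.name.startswith("os."):
                    offenders.append((node.lineno, "import " + alias.name))
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            # level == 0 means an absolute import, not "from . import x".
            if node.level == 0 and (module == "os"
                                    or module.startswith("os.")):
                offenders.append(
                    (node.lineno, "from %s import ..." % module))
    return sorted(offenders)
```

Run over every module except osutils itself, a non-empty result would
be reported as a test failure.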

    bialix> Users just run bzr and see how it works. They see
    bialix> tracebacks, they post them to Launchpad or mailing
    bialix> lists, and nobody even run test suite. We don't need
    bialix> test suite to say that something broken when it
    bialix> broken.

We need the test suite to tell us that we don't make things worse
when developing new features or fixing bugs. And the more the
test suite tells us, the less anyone has to learn to fix it.

*Tests* are the specifications.

Look at all the people coming to participate, they all face
challenges, the most important being IMHO:

- how does that work, why is it failing ?

- how should I write the bug fix (tough one for people who know
  zero about TDD) ?

The more complete the test suite is (hence the specs), the easier
it becomes to find the holes and plug them.

The problem is that it's quite a challenge to write tests that,
while running on os1, will detect that the code breaks on os2.

    bialix> If you think I don't like tests at all -- you're
    bialix> wrong. As bzr developer -- I like it.

I damn well know that !

    bialix> I like to use TDD when I wrote some new
    bialix> functionality. But only when I have my dev hat
    bialix> on. But even then I hate to see that selftest on
    bialix> windows never pass cleanly.

I don't run the test suite on windows but I know the feeling on
OSX and Solaris even if the results are far better than on
windows !

And I *love* to use TDD when I *change* some existing
functionality !

Write once read hundreds applies to code.
Write once use hundreds applies to tests.

    bialix> It was, but now it don't. I hate that there is test
    bialix> that just incorrect (see
    bialix> https://bugs.launchpad.net/bzr/+bug/158596).

Add an ExpectedFailure in the test suite ! Please !

    bialix> So, I don't see how selftest will help
    bialix> users. Especially windows users.

<shudder>

Ideally they should be able to run it (even if it lasts a couple
of hours) and at the end bzr will tell them: sorry chap, this, this
and that, can't work, try the following workarounds or watch for
bug #nnn, #mmm to be fixed.

Realistically having expected test failures will help us fix the
bugs and *communicate* about that intent.

Having bugs fixed will help users.

Having the test suite run automatically for windows and
results posted somewhere daily will help us hugely too.

    bialix> I don't think it's silver bullet.

There is no silver bullet ;-)

But OOP and TDD are a huge improvement (especially the latter).

Look at bug #150860 for example.

A couple of people have been involved here, along with launchpad and
the test suite. The latter two are *very* important. I would never
have been able to write the summary in
https://bugs.edge.launchpad.net/bzr/+bug/150860/comments/2
without the test suite forcing everybody to respect the
specifications, and filing bugs helped to track the issue !
And just imagine how this same bug could have been handled
without the test suite...

    bialix> There is a bunch of problems currently not covered by
    bialix> test suite,

That should be fixed.

    bialix> and no matter how many tests bzr will have in the
    bialix> long term.

I disagree here: if bzr fails for <something> <somewhere> it
means that <somewhere> does not implement the <something>
specifications OR that <something> specifications are incorrect
(incomplete, bogus, whatever).

Theory again. But in practice it means adding more tests. It
could be failing tests, it could be
ExpectedFailingTestBecauseSomewhereIsUtterlyBroken or
ExpectedFailingTestBecauseWeForgotAboutSomewhereAGAIN.

    bialix> Some problems cannot be predicted in test suite and
    bialix> only real users face with these problems in real
    bialix> usage.

Great, they help us get good specs. We're human, we're
expected to fail but we also have the choice to try to do better.

    bialix> Users will try to checkout working tree with symlinks
    bialix> and get traceback: NameError: os module don't have
    bialix> name 'symlink'.

See above, let's forbid 'import os' and have checks in osutils
regarding all supported OSes.
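As an illustration of such a check, here is a sketch of a binding that
turns the missing-attribute traceback into an explicit message. The
exception class, the function signature and the _os parameter are
assumptions made for this example, not bzr's actual osutils API:

```python
import os


class UnsupportedOperation(Exception):
    """Hypothetical error: the running OS lacks a required feature."""


def symlink(source, link_name, _os=os):
    """Create a symlink, or say clearly why this platform cannot.

    _os exists only so the check can be exercised with a stand-in
    module that has no symlink() at all.
    """
    os_symlink = getattr(_os, "symlink", None)
    if os_symlink is None:
        raise UnsupportedOperation(
            "symbolic links are not supported on this platform (%s)"
            % _os.name)
    os_symlink(source, link_name)
```

Callers then get one well-defined error to catch and report, whatever
the platform.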

    bialix> Users will try to rename 'test.txt' to 'Test.txt' and will get error:
    bialix> C:\Temp\2>bzr mv test.txt Test.txt
    bialix> bzr: ERROR: Could not rename test.txt => Test.txt because both files exist. (Use --after to update
    bialix> the Bazaar id)
    bialix> and someone will teach them to rename by hands and then use `bzr mv --after`.

Great, add an ExpectedFailure test with a bug number.
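Roughly what I have in mind, sketched with the standard library's
unittest.expectedFailure instead of bzr's own machinery.
rename_allowed() is a toy stand-in invented for this example (it is
not how 'bzr mv' is implemented), and '#nnn' is a placeholder, not a
real bug number:

```python
import unittest


def rename_allowed(existing_names, old, new):
    """Toy stand-in for the rename check, with the flaw left in.

    It refuses any new name that matches an existing one ignoring
    case. The flaw: 'old' itself is not excluded, so a case-only
    rename such as test.txt => Test.txt is wrongly rejected.
    """
    lowered = {name.lower() for name in existing_names}
    return new.lower() not in lowered


class TestCaseOnlyRename(unittest.TestCase):

    @unittest.expectedFailure
    def test_case_only_rename(self):
        # Known to fail, tracked as bug #nnn (placeholder).
        self.assertTrue(
            rename_allowed({"test.txt"}, "test.txt", "Test.txt"))

    def test_plain_rename(self):
        self.assertTrue(
            rename_allowed({"test.txt"}, "test.txt", "other.txt"))
```

The run stays green while the flaw is documented, and once somebody
fixes it the test shows up as an unexpected success, which is the
signal to drop the decorator and close the bug.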

    bialix> Non-ascii users constantly will be annoyed by the bug:
    bialix> https://bugs.launchpad.net/bzr/+bug/54173
    bialix> (or similar behavior of diff)
    bialix> and then think that bzr will not support unicode properly.

Idem.

    bialix> Some users will be annoyed every time they setup bzr
    bialix> on clean machine to see in default ignore list such
    bialix> patterns as:
    bialix> *.a
    bialix> *.o
    bialix> *.so

    bialix> That have not many sense because GCC is not default
    bialix> compiler on Windows platform.

Don't tell me :-)

    bialix> So, Vincent, I really don't understand, what do you
    bialix> mean, when you said "in long term".  If 2 years is
    bialix> not enough long term, then I don't know.

10 years is long term :)
2 years is middle term.

We're talking VCS here, VCS is here to stay.

Some of the ideas I pointed out above may seem utopian, that's not a
problem, I use them as goals and try to find steps in the right
direction.

Here are some more, still around the test suite:

- offer a way to test a server against bzr (see problems
  encountered with ftp or http servers)

- the idea above could be targeted at file systems too,

- have the test suite run only the relevant tests (I change a bit
  of code, run the 3 (40, 1027...) impacted tests)

- tell me what sources are concerned by this test (who implements
  the feature I'm testing here)

In summary, I think we agree more than we disagree (I'm not
trying to force you to adopt my point of view, I'm just stating
mine regarding yours).

I have been faced with the root problem of the test suite not
fully passing since I began working on bzr.

I'm annoyed by the 'UserWarning: file bla bla was not explicitly
unlocked' noise.

I'm concerned by the test suite failing on OSX, Solaris *and*
windows.

I regret (but understand) that more time has been spent lately
on 'make it fast' than on 'make it right'.

The list goes on BUT don't touch my test suite ! Make it better
instead, that's a GIFT damn it and it's improving constantly.

         Vincent

P.S.: My remarks above point at actual flaws in the test suite, but
rest assured that overall I value the test suite more than bzr
itself on pure technical merits and as a demonstration of what
can be achieved with TDD.

P.P.S.: I'm a beginner in TDD but do not intend to stay one.