[MERGE] fix blackbox failures when run from windows executable

Wed Aug 20 13:49:47 BST 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Adrian Wilkins wrote:
> Alexander Belchenko wrote:
>> I have strong opinion that bzr.exe should be distributed without test
>> suite.
> 
> I concur on this point ; process creation overhead on Windows is large
> enough, which is one of the reasons why git is (IMHO) less practical on
> Windows than bzr. I've had instances where I've written a small *nix
> style tool to process large numbers of files, and running it through a
> shell pipeline on Windows over half the runtime is consumed by process
> startup and teardown.
> 

I would be fine with it, though I would like to see a real test of how much
overhead it actually causes.
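
A rough way to get such a number is to time how long a command takes to start
and exit, repeated a few times, and to compare two variants. The sketch below
is only an illustration: time_startup is a hypothetical helper, and
"import unittest" is a stand-in for whatever a bundled test suite would add,
not a measurement of bzr.exe itself.

```python
import subprocess
import sys
import time


def time_startup(argv, runs=5):
    """Return the best wall-clock time (seconds) to start and finish argv."""
    best = None
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.call(argv, stdout=subprocess.DEVNULL,
                        stderr=subprocess.DEVNULL)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    # Baseline: a bare interpreter that does nothing.
    bare = time_startup([sys.executable, "-c", "pass"])
    # Comparison: the same interpreter importing a test framework first.
    # unittest is only a stand-in here for a bundled test suite.
    with_tests = time_startup([sys.executable, "-c", "import unittest"])
    print("bare: %.3fs  with unittest: %.3fs" % (bare, with_tests))
```

Taking the best of several runs reduces noise from disk caching; the numbers
only mean something when both variants are run on the same Windows machine.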

>> And because these problems stays here for years, fixing test suite
>> should not be high priority. Mostly because test suite on Windows don't
>> running regularly opposing to PQM
> 
> I see the inverse of this point ; part of the reason there is little
> incentive to set up a Windows test server for PQM is that the test suite
> has so many failures on Windows. This means that rather than just
> checking that failures == 0, you have to specifically check what failed,
> whether known (but not expected) failures failed in a different way to
> last time, etc. This makes writing a useful automated PQM gatekeeper for
> Windows tests much harder, which is a huge disincentive to actually set
> one up. If the test suite passed on Windows, that important link in the
> quality chain has one less reason to be absent.
> 
> 

I think the big issue here is having someone willing to take up the task of
cleaning up the test suite on win32. I've taken point on that a few times,
especially before I was employed by Canonical. ATM I think there are only a
couple of issues, and fixing them would clean up a lot of the test suite. The
biggest is locking. On Linux it doesn't fail if a single process takes fcntl
locks on a file for both shared and exclusive access, which actually seems
like a bug on the Linux side.
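
The behaviour described above can be checked with a small sketch. It assumes a
POSIX system and Python's fcntl.lockf (which uses fcntl record locks);
same_process_lock_conflict is a hypothetical name chosen for illustration. On
Linux I would expect it to report no conflict, because fcntl locks are owned
by the process rather than by the individual file handle.

```python
import fcntl
import os
import tempfile


def same_process_lock_conflict(path):
    """Return True if a second fcntl lock in this process is refused."""
    reader = open(path, "rb")
    writer = open(path, "rb+")
    try:
        # Shared (read) lock through the first file object.
        fcntl.lockf(reader, fcntl.LOCK_SH | fcntl.LOCK_NB)
        try:
            # Exclusive (write) lock on the same file through a second one.
            fcntl.lockf(writer, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True
        return False
    finally:
        # Closing either file drops all fcntl locks this process holds on it.
        writer.close()
        reader.close()


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        print("conflict detected:", same_process_lock_conflict(path))
    finally:
        os.remove(path)
```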

I've written the code to cause the locking to error under those conditions,
but I haven't had the time to clean up the code that triggers it. If someone
took the time to clean up those code paths, then we could bring in the
exclusive locking and prevent regressions on Win32 even when only running the
test suite on Linux. (The biggest offender I know of is the merge code, because
it sometimes opens the same working tree more than once and doesn't realize
it.)

I believe Alexander tried here and there, but mostly felt overwhelmed. I think
Mark Hammond is making good strides in this direction. Perhaps with a bit of
extra help we could finally get a clean test suite on win32. And at that point
we can, indeed, make it easier to keep it clean.

One of the big things is to try to find where the differences happen, and
include explicit tests for the win32 behavior. I've managed to do that for
quite a few functions (which is why osutils has a lot of _win32_* functions so
that they can be tested on all platforms.)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIrBLrJdeBCYSNAAMRAs5sAJ9dQjq4BegyACT8YRJ0UJ+PE+YoFgCfa9dn
gFQeATb2pKGEUoaVHdq1VTw=
=Fc2v
-----END PGP SIGNATURE-----