Please run your tests with -race
roger peppe
roger.peppe at canonical.com
Thu May 21 08:00:09 UTC 2015
FWIW, at least one of those failures isn't a race condition; it happens
because you're running the tests with the Go tip compiler.
For example, this one:
check /tmp/go-build156522501/github.com/juju/juju/cmd/jujud/_test/jujud.test
[]string{"-test.run", "TestRunMain", "-run-main", "--", "remote"}
main_test.go:291:
c.Assert(output, gc.Matches, err)
... value string = "error: dial unix <nil>->/tmp/check-550970890984547152/10/bad.sock: connect: no such file or directory\n"
... regex string = "error: dial unix /tmp/check-550970890984547152/10/bad.sock: .*\n"
fails because error messages have changed in tip. (I have reported the
"<nil>->" issue on the golang-dev mailing list and have a fix pending
here: https://go-review.googlesource.com/#/c/10270/ )
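
To make the mismatch concrete, here is a minimal sketch using the standard
regexp package rather than the checker from the test. The shortened path and
the helper name matchesExpected are hypothetical, and the ^...$ anchoring is
my assumption about how the Matches checker treats the pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// expected mirrors the shape of the test's regex. The path is shortened and
// hypothetical, and the ^...$ anchoring is an assumption about the checker.
var expected = regexp.MustCompile(`^error: dial unix /tmp/check/bad\.sock: .*\n$`)

// matchesExpected reports whether msg satisfies the expected pattern.
func matchesExpected(msg string) bool {
	return expected.MatchString(msg)
}

func main() {
	// Message in the form the test was written against.
	fmt.Println(matchesExpected("error: dial unix /tmp/check/bad.sock: connect: no such file or directory\n"))
	// Message in the form tip currently produces, with the "<nil>->" prefix.
	fmt.Println(matchesExpected("error: dial unix <nil>->/tmp/check/bad.sock: connect: no such file or directory\n"))
}
```

The first call prints true and the second prints false: the "<nil>->" prefix
sits where the pattern expects the socket path to begin, so no race is
needed to explain that failure.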
On 21 May 2015 at 07:54, David Cheney <david.cheney at canonical.com> wrote:
> Here is a report of the current races. http://paste.ubuntu.com/11256308/
>
> On Thu, May 21, 2015 at 11:10 AM, Tim Penhey <tim.penhey at canonical.com> wrote:
>> Thanks Dave for this great write up.
>>
>> To drive the data races to zero and make a soon-to-be-added CI
>> test voting, I propose that as folks write tests in a package, the
>> races in at least that file, and possibly that package, should be fixed.
>>
>> Tim
>>
>> On 21/05/15 12:12, David Cheney wrote:
>>> Hello,
>>>
>>> TL;DR - juju has lots of problems with data races, please test your
>>> code with the -race flag to ensure it doesn't get worse while we try
>>> to fix the problem.
>>>
>>> Longer version:
>>>
>>> In debugging https://bugs.launchpad.net/bugs/1456398 I found that
>>> there are multiple data races in the Juju code base. It's long been
>>> suspected that the tests are racy; PatchValue makes it very easy to
>>> introduce a race if the workers started by a test have not all exited
>>> before the suite's tear down function runs.
>>>
>>> However, more serious races have been discovered, such as
>>> https://bugs.launchpad.net/bugs/1456857 which affects code all the way
>>> back to 1.22.
>>>
>>> Why is a data race bad?
>>>
>>> Ok, so you're looking at https://bugs.launchpad.net/bugs/1456857 and
>>> you're thinking, so maybe the tls code accidentally uses the wrong
>>> certificate for a little bit, how bad is that?
>>>
>>> The problem is data races affect the integrity of the structures that
>>> the garbage collector uses. In the example above replacing the
>>> certificate means one CPU can see the new value, and another
>>> potentially the old value. When it comes time to run the gc, depending
>>> on which CPU walks that chain of pointers it may think that the old
>>> certificate is still live, or the new certificate is unreachable --
>>> and boom that memory is marked as free and the certificate corrupted.
>>>
>>> The short version is: there are no safe data races, and Juju is not
>>> reliable until they have all been fixed.
>>>
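
For illustration only, one common way to make a replaceable value safe to
share is to guard it with a lock. This is a sketch under my own assumptions,
not the fix applied for the bug above; certificate and certStore are
hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// certificate is a hypothetical stand-in for the value being replaced.
type certificate struct{ name string }

// certStore guards the current certificate with a lock so that replacing
// it and reading it are ordered with respect to each other.
type certStore struct {
	mu   sync.RWMutex
	cert *certificate
}

// Set replaces the current certificate under the write lock.
func (s *certStore) Set(c *certificate) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cert = c
}

// Get returns the current certificate under the read lock.
func (s *certStore) Get() *certificate {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.cert
}

func main() {
	store := &certStore{cert: &certificate{name: "old"}}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Readers see either the old or the new certificate; the
			// lock orders each read against the Set below.
			_ = store.Get().name
		}()
	}
	store.Set(&certificate{name: "new"})
	wg.Wait()

	fmt.Println(store.Get().name)
}
```

Built with -race (for example "go run -race"), this version should stay
quiet. If Set and Get touched s.cert without the lock, the detector would be
expected to flag the concurrent read and write whenever they overlap in a run.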
>>> How to run the race detector?
>>>
>>> The race detector comes with Go and is available by adding the -race
>>> flag to invocations of go test, so what was
>>>
>>> go test github.com/juju/juju/...
>>>
>>> becomes
>>>
>>> go test -race github.com/juju/juju/...
>>>
>>> The downside of this is the race detector has significant overhead, at
>>> least 2x, so tests will be even slower.
>>>
>>> Thanks
>>>
>>> Dave
>>>
>>
>
> --
> Juju-dev mailing list
> Juju-dev at lists.ubuntu.com
> Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev