Please run your tests with -race

David Cheney david.cheney at canonical.com
Thu May 21 09:25:59 UTC 2015


Thanks for fixing the net errors on tip. Yes, there are other errors
in there besides races; they will all be fixed.
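
For anyone who has not met the race detector yet, here is a minimal
sketch of the certificate-swap pattern from the original message below,
written so that it stays clean under -race. The certHolder type and its
Set and Get methods are hypothetical names made up for illustration; they
are not Juju's code, and a mutex is only one of several ways to make the
swap safe.

```go
package main

import (
	"fmt"
	"sync"
)

// certHolder is a hypothetical stand-in for a value that one goroutine
// replaces while others read it. The names are illustrative, not Juju's.
type certHolder struct {
	mu   sync.RWMutex
	cert string
}

// Set replaces the certificate. Without the lock, this write would race
// with concurrent Get calls.
func (h *certHolder) Set(c string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.cert = c
}

// Get returns the current certificate under a read lock.
func (h *certHolder) Get() string {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.cert
}

func main() {
	h := &certHolder{cert: "old"}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = h.Get() // concurrent readers
		}()
	}

	h.Set("new") // concurrent writer
	wg.Wait()    // wait for every goroutine before moving on

	fmt.Println(h.Get())
}
```

If the locking is taken out of Set and Get, the unsynchronised read and
write are the kind of access that -race would typically report. The
wg.Wait() call reflects the PatchValue point: every goroutine started
here has exited before anything else touches the shared value.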

On Thu, May 21, 2015 at 6:00 PM, roger peppe <roger.peppe at canonical.com> wrote:
> FWIW at least one of those errors isn't a race condition; it happens
> because you're using the Go tip compiler to run the tests.
> For example, this one:
>
> check /tmp/go-build156522501/github.com/juju/juju/cmd/jujud/_test/jujud.test
> []string{"-test.run", "TestRunMain", "-run-main", "--", "remote"}
> main_test.go:291:
>     c.Assert(output, gc.Matches, err)
> ... value string = "error: dial unix
> <nil>->/tmp/check-550970890984547152/10/bad.sock: connect: no such
> file or directory\n"
> ... regex string = "error: dial unix
> /tmp/check-550970890984547152/10/bad.sock: .*\n"
>
> is because error messages have changed in tip. (I have reported the
> "<nil>->" issue on the golang-dev mailing list and have a fix
> pending here: https://go-review.googlesource.com/#/c/10270/ )
>
>
> On 21 May 2015 at 07:54, David Cheney <david.cheney at canonical.com> wrote:
>> Here is a report of the current races. http://paste.ubuntu.com/11256308/
>>
>> On Thu, May 21, 2015 at 11:10 AM, Tim Penhey <tim.penhey at canonical.com> wrote:
>>> Thanks Dave for this great write up.
>>>
>>> In order to drive the data races to zero, and to make a soon-to-be-added
>>> CI test voting, I propose that as folks write tests in a package, the
>>> races in at least that file, and possibly that package, should be fixed.
>>>
>>> Tim
>>>
>>> On 21/05/15 12:12, David Cheney wrote:
>>>> Hello,
>>>>
>>>> TL;DR - juju has lots of problems with data races, please test your
>>>> code with the -race flag to ensure it doesn't get worse while we try
>>>> to fix the problem.
>>>>
>>>> Longer version:
>>>>
>>>> In debugging https://bugs.launchpad.net/bugs/1456398 I found that
>>>> there are multiple data races in the Juju code base. It has long been
>>>> suspected that the tests are racy; PatchValue makes it very easy to
>>>> introduce a race if all the workers started by a test have not exited
>>>> before the suite's tear down function runs.
>>>>
>>>> However, more serious races have been discovered, such as
>>>> https://bugs.launchpad.net/bugs/1456857 which affects code all the way
>>>> back to 1.22.
>>>>
>>>> Why is a data race bad?
>>>>
>>>> Ok, so you're looking at https://bugs.launchpad.net/bugs/1456857 and
>>>> you're thinking: so maybe the TLS code accidentally uses the wrong
>>>> certificate for a little bit; how bad is that?
>>>>
>>>> The problem is data races affect the integrity of the structures that
>>>> the garbage collector uses. In the example above replacing the
>>>> certificate means one CPU can see the new value, and another
>>>> potentially the old value. When it comes time to run the gc, depending
>>>> on which CPU walks that chain of pointers it may think that the old
>>>> certificate is still live, or the new certificate is unreachable --
>>>> and boom that memory is marked as free and the certificate corrupted.
>>>>
>>>> The short version is: there are no safe data races, and Juju is not
>>>> reliable until they have all been fixed.
>>>>
>>>> How to run the race detector?
>>>>
>>>> The race detector comes with Go and is available by adding the -race
>>>> flag to invocations of go test, so what was
>>>>
>>>>     go test github.com/juju/juju/...
>>>>
>>>> becomes
>>>>
>>>>     go test -race github.com/juju/juju/...
>>>>
>>>> The downside of this is the race detector has significant overhead, at
>>>> least 2x, so tests will be even slower.
>>>>
>>>> Thanks
>>>>
>>>> Dave
>>>>
>>>
>>
>> --
>> Juju-dev mailing list
>> Juju-dev at lists.ubuntu.com
>> Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev


