Request for help / ideas to debug issue

Michael Hudson-Doyle michael.hudson at canonical.com
Sun Mar 12 08:38:25 UTC 2017


PS: I guess I should backport that Go fix to all supported Go releases?

On 12 March 2017 at 21:37, Michael Hudson-Doyle <
michael.hudson at canonical.com> wrote:

> Before we get into this, what is the actual problem here? Just the ugly
> messages?
>
> On 11 March 2017 at 02:58, Alfonso Sanchez-Beato <alfonso.sanchez-beato@
> canonical.com> wrote:
>
>> On Fri, Mar 10, 2017 at 10:22 AM, John Lenton <john.lenton at canonical.com>
>> wrote:
>>
>> > Hello!
>> >
>> > We're seeing a weird issue with either go, pthreads, or the kernel. If
>> > you're knowledgeable about one or more of those things, could you take
>> > a look? Thank you.
>> >
>> > The issue manifests as nasty warnings from the "snap run" command,
>> > which is also the first step into a snapped app or service. It looks
>> > like
>> >
>> > runtime/cgo: pthread_create failed: Resource temporarily unavailable
>> >
>> > A very stripped-down reproducer is http://pastebin.ubuntu.com/24150663/
>> >
>> > Build that, run it in a loop, and you'll see a bunch of those messages
>> > (and some weirder ones, but let's take it one step at a time).
>>
>
> Turns out this was fixed in Go 1.8:
> https://go-review.googlesource.com/c/33894/
>
>
>> > If you comment out the 'import "C"' line the message changes but the
>> > failure still happens, which makes me think that at least in part this
>> > is a Go issue (or that we're holding it wrong).
>>
>
> ... though only for the cgo case; in the non-cgo case you can
> (occasionally) still get messages like:
>
> runtime: failed to create new OS thread (have 5 already; errno=11)
> runtime: may need to increase max user processes (ulimit -u)
> fatal error: newosproc
>
> if you comment out the import "C". I guess we should report that upstream.
>
>
>> > Note that the exec does work; the warning seems to come from a
>> > different thread than the one doing the Exec (the other clue that
>> > points in this direction is that sometimes the message is truncated).
>> > You can verify the fact that it does run by changing the /bin/true to
>> > /bin/echo os.Args[1], but because this issue is obviously a race
>> > somewhere, this change makes it less likely to happen (from ~10% down
>> > to ~0.5% of runs, on my machines).
>> >
>> > One thing that makes this harder to debug is that strace'ing the
>> > process hangs (hard, kill -9 of strace to get out) before reproducing
>> > the issue. This probably means we need to trace it at a lower level,
>> > and I don't know enough about tracing a process group from inside the
>> > kernel to be able to do that; what I can find about kernel-level
>> > tracing is around syscalls or devices.
>> >
>> > Ideas?
>> >
>>
>> I found this related thread:
>>
>> https://groups.google.com/forum/#!msg/golang-nuts/8gszDBRZh_4/lhROTfN9TxIJ
>>
>> <<
>> I believe this can happen on GNU/Linux if your program uses cgo and if
>> thread A is in the Go runtime starting up a new thread B while thread
>> C is execing a program.  The underlying cause is that while one thread
>> is calling exec the Linux kernel will fail attempts by other threads
>> to call clone by returning EAGAIN.  (Look for uses of the in_exec
>> field in the kernel sources.)
>> >>
>>
>
> Yeah, this seems to be very accurate. It's also why this appears to be a
> cosmetic problem only: a thread that is not the one calling exec fails to
> start, but that thread was about to die anyway.
>
>
>> Something like adding a little sleep removes the traces, for instance:
>>
>> http://paste.ubuntu.com/24151637/
>>
>> where the program sleeps for 1 ms before calling Exec. For smaller delays
>> (say, 20 µs) the issue still happens.
>>
>> It looks to me that right before running main(), Go creates some threads
>> by calling clone(), probably hitting the race described in the thread.
>> Since you are calling Exec anyway, I guess the traces are harmless: you
>> do not need the Go threads. Nonetheless, I think that the Go runtime
>> should retry instead of printing that trace.
>
>
> Cheers,
> mwh
>



More information about the Snapcraft mailing list