[Bug 788998] Re: winbind hang after 10 hours, Kerberos and DNS related
Clint Byrum
clint at fewbar.com
Fri May 27 16:07:28 UTC 2011
Hi Keith. Thanks for taking the time to file a bug report and help us
make Ubuntu better.
It's not clear to me what you mean by "winbind is not happy with the
response it gets." Can you maybe upload a log file or paste some log
lines that will help us understand which failure condition is causing
the fallback to lmhosts/broadcast? To my knowledge, the DNS lookup is
simply a SRV lookup, and there should be no reason that it would fail to
respect a SRV record that it was given.
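For reference, here is a rough sketch of checking those SRV records by hand against each name server. It assumes `dig` is installed. The record names are the ones AD clients conventionally look up, which is an assumption about what winbind queries in this setup and not something taken from its logs. The helper names are made up for illustration.

```python
import subprocess
from collections import namedtuple

SrvRecord = namedtuple("SrvRecord", "priority weight port target")


def srv_names(realm):
    """Conventional AD locator names (assumed, not read from winbind logs)."""
    realm = realm.rstrip(".").lower()
    return [
        f"_kerberos._tcp.{realm}",
        f"_ldap._tcp.dc._msdcs.{realm}",
    ]


def parse_srv_lines(text):
    """Parse `dig +short SRV` output: 'priority weight port target.' per line."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not all(f.isdigit() for f in fields[:3]):
            continue  # skip CNAME lines or anything else unexpected
        priority, weight, port = (int(f) for f in fields[:3])
        records.append(SrvRecord(priority, weight, port, fields[3].rstrip(".")))
    # Lowest priority value first, which is the order clients should try.
    return sorted(records, key=lambda record: record.priority)


def query_srv(name, server=None):
    """Run dig, optionally against one specific name server, and parse it."""
    cmd = ["dig", "+short", "SRV", name]
    if server:
        cmd.insert(1, f"@{server}")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_srv_lines(result.stdout)
```

Running query_srv() for the same name against the Linux name server and against the DC, and comparing the two lists, would show whether the answers actually differ.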
It would help if you could also run 'apport-collect 788998' on the
affected machine so we get extra information like dependencies.
Marking Incomplete pending response from Keith.
** Changed in: samba (Ubuntu)
Status: New => Incomplete
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to samba in Ubuntu.
https://bugs.launchpad.net/bugs/788998
Title:
winbind hang after 10 hours, Kerberos and DNS related
Status in “samba” package in Ubuntu:
Incomplete
Bug description:
Binary package hint: samba
Ubuntu 10.04 (Samba 2:3.4.7~dfsg-1ubuntu3.6), Ubuntu 10.10 (Samba
2:3.5.4~dfsg-1ubuntu8.4)
Winbind is using a Windows active directory. /etc/resolv.conf is
pointing at a Linux based DNS which "replicates" data from the Windows
DC.
Problem:
winbind starts up, issues DNS requests, is happy with its responses.
winbind gets its Kerberos token [see below for a related problem].
Everything is happy for 10 hours (Kerberos token lifetime).
The Kerberos token expires.
winbind starts issuing DNS requests to the same Linux server to find
the Kerberos server. It gets responses but this time it is not happy
with the contents.
After a few tries at getting the DNS response, winbind gives up on DNS
and drops through to LMHOSTS (nothing there) and finally SMB broadcast
mode.
The winbind client is on a different network from the Windows DC, so
broadcast gets no response.
winbind stalls issuing broadcast packets but not doing any work
(Kerberos has expired).
winbind still accepts client connections over its pipe, each of which
gets allocated a file descriptor. Its clients all stall, waiting for a
response from winbind.
Eventually winbind runs out of file descriptors and dies.
If winbind needs to open a new log file it cannot, because all file
descriptors are in use.
Everything grinds to a halt waiting for a response from winbind, until
winbind is restarted.
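The lost log file in the sequence above is the kind of failure a spare descriptor can guard against. What follows is only a minimal sketch of that general technique in Python, not how winbind itself works. The class and method names are hypothetical: a process holds one descriptor open on the null device at startup and gives it up only when opening the log fails for lack of descriptors.

```python
import errno
import os


class ReservedDescriptor:
    """Hold one spare descriptor so a log file can still be opened later."""

    def __init__(self):
        # Occupy a slot in the descriptor table while slots are still free.
        self._spare = os.open(os.devnull, os.O_RDONLY)

    @property
    def has_spare(self):
        return self._spare is not None

    def open_log(self, path):
        """Open path for appending, sacrificing the spare slot if necessary."""
        flags = os.O_WRONLY | os.O_APPEND | os.O_CREAT
        try:
            return os.open(path, flags, 0o640)
        except OSError as exc:
            out_of_fds = exc.errno in (errno.EMFILE, errno.ENFILE)
            if not out_of_fds or self._spare is None:
                raise
        # Descriptor table is full: release the spare and retry once.
        os.close(self._spare)
        self._spare = None
        return os.open(path, flags, 0o640)
```

The spare is used at most once here. A longer-lived daemon would also need to stop accepting new client connections, or re-reserve the slot once descriptors free up.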
Workaround:
Point /etc/resolv.conf at the Windows DC instead of the Linux name
server.
Outstanding:
Why does winbind accept the DNS responses at start up but not when
Kerberos expires?
winbind needs to reserve a couple of file descriptors for its log
files. Losing the log file in the middle of this problem was
"interesting".
winbind constructs its own krb5.conf, removing all the local settings
that are set in /etc/krb5.conf. This results in the main winbind
process using values from /etc/krb5.conf, while the remaining
processes use potentially different values from its own krb5.conf.
Trying to test workarounds was frustrating because dropping the key
lifetime in /etc/krb5.conf had no effect. In the end I used a syscall
wrapper around rename to intercept and replace the contents of
winbind's copy of krb5.conf. As I say, it was "interesting".
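To see which settings differ between /etc/krb5.conf and the copy winbind writes, something like the following sketch may help. It only handles flat "key = value" lines in the [libdefaults] section, the function names are made up for illustration, and the location of winbind's generated file is not given in this report, so the text of both files has to be supplied by the caller.

```python
def parse_libdefaults(text):
    """Collect flat 'key = value' settings from the [libdefaults] section."""
    settings = {}
    in_libdefaults = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            in_libdefaults = line[1:-1].strip() == "libdefaults"
            continue
        if in_libdefaults and "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def diff_libdefaults(system_text, generated_text):
    """Map each differing key to (system value, generated value)."""
    system = parse_libdefaults(system_text)
    generated = parse_libdefaults(generated_text)
    return {
        key: (system.get(key), generated.get(key))
        for key in sorted(set(system) | set(generated))
        if system.get(key) != generated.get(key)
    }
```

A key that shows up with None on the generated side, such as a local ticket_lifetime, would be one of the settings that the generated copy dropped.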