[Bug 788998] [NEW] winbind hang after 10 hours, Kerberos and DNS related

Keith Owens kaos-ubuntu at ocs.com.au
Fri May 27 08:07:02 UTC 2011


Public bug reported:

Binary package hint: samba

Ubuntu 10.04 (Samba 2:3.4.7~dfsg-1ubuntu3.6), Ubuntu 10.10 (Samba
2:3.5.4~dfsg-1ubuntu8.4)

Winbind is joined to a Windows Active Directory domain. /etc/resolv.conf
points at a Linux-based DNS server which "replicates" data from the
Windows DC.

Problem:

winbind starts up, issues DNS requests, and is happy with the responses.

winbind gets its Kerberos ticket [see below for a related problem].

Everything is happy for 10 hours (Kerberos ticket lifetime).

The Kerberos ticket expires.

winbind starts issuing DNS requests to the same Linux server to find the
Kerberos server. It gets responses but this time it is not happy with
the contents.

After a few tries at getting the DNS response, winbind gives up on DNS
and drops through to LMHOSTS (nothing there) and finally SMB broadcast
mode.

The winbind client is on a different network from the Windows DC, so
broadcast gets no response.

winbind stalls, issuing broadcast packets but not doing any useful work
(its Kerberos credentials have expired).

winbind still accepts client connections over its pipe, each of which
gets allocated a file descriptor. Its clients all stall, waiting for a
response from winbind.

Eventually winbind runs out of file descriptors and dies.

If winbind needs to open a new log file it cannot, because all file
descriptors are in use.

Everything grinds to a halt waiting for a response from winbind, until
winbind is restarted.
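
The descriptor leak described above could be spotted before winbind
dies by watching its open descriptor count against its limit. The
following is only a rough sketch, assuming a Linux /proc filesystem and
enough privilege to read another process's /proc entries; the function
names and the 90% threshold are illustrative, not anything winbind
provides.

```python
import os


def fd_usage(pid, proc_root="/proc"):
    """Return (open_fd_count, soft_limit) for a process, read from /proc.

    Reading another user's process normally requires root.
    soft_limit is None if the limit is reported as "unlimited".
    """
    open_fds = len(os.listdir(os.path.join(proc_root, str(pid), "fd")))
    soft_limit = None
    with open(os.path.join(proc_root, str(pid), "limits")) as fh:
        for line in fh:
            # Line layout: "Max open files  <soft>  <hard>  files"
            if line.startswith("Max open files"):
                value = line.split()[3]
                soft_limit = None if value == "unlimited" else int(value)
                break
    return open_fds, soft_limit


def is_near_exhaustion(open_fds, soft_limit, threshold=0.9):
    """True once the open count reaches the given fraction of the limit."""
    if not soft_limit:
        return False
    return open_fds >= threshold * soft_limit
```

A cron job or monitoring check could call fd_usage() with the winbindd
pid and raise an alert when is_near_exhaustion() returns True.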

Workaround:

Point /etc/resolv.conf at the Windows DC instead of the Linux name
server.

Outstanding:

Why does winbind accept the DNS responses at startup but not when
Kerberos expires?
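
One way to narrow this down might be to send the same SRV query to both
name servers and compare the reply headers. The sketch below builds a
minimal DNS query by hand with the standard library only. The record
name _kerberos._tcp.EXAMPLE.COM and the server addresses are
placeholders; this report does not say which records winbind actually
asks for, so treat the name as an assumption and substitute whatever a
packet capture shows.

```python
import socket
import struct

SRV_TYPE = 33   # DNS record type SRV
IN_CLASS = 1    # DNS class IN


def build_srv_query(name, query_id=0x1234):
    """Build a single-question DNS query for the SRV record of `name`."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    return header + qname + struct.pack("!HH", SRV_TYPE, IN_CLASS)


def srv_summary(server, name, timeout=3.0):
    """Send the query over UDP and summarise the reply header."""
    query = build_srv_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (server, 53))
        reply, _ = sock.recvfrom(4096)
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    return {
        "rcode": flags & 0x000F,            # 0 = NOERROR, 3 = NXDOMAIN
        "truncated": bool(flags & 0x0200),  # TC bit: reply did not fit in UDP
        "answers": ancount,
    }


if __name__ == "__main__":
    # Placeholder addresses: the Linux name server and the Windows DC.
    for server in ("192.0.2.10", "192.0.2.20"):
        print(server, srv_summary(server, "_kerberos._tcp.EXAMPLE.COM"))
```

A difference in rcode, answer count or the truncated flag between the
two servers would at least show where the "replicated" data diverges.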

winbind needs to reserve a couple of file descriptors for its log files.
Losing the log file in the middle of this problem was "interesting".
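
The usual way to reserve a descriptor is to hold a spare one open and
give it up only when an open fails for lack of descriptors. The sketch
below illustrates the idea in Python; it is not how winbind (which is
written in C) handles its logs, and the class and method names are made
up for the example.

```python
import errno
import os


class ReservedFd:
    """Hold one spare descriptor so a log file can still be opened
    after the process has hit its descriptor limit."""

    def __init__(self):
        # The spare is opened early, while descriptors are plentiful.
        self._spare = os.open(os.devnull, os.O_RDONLY)

    def open_log(self, path):
        flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
        try:
            return os.open(path, flags, 0o644)
        except OSError as exc:
            out_of_fds = exc.errno in (errno.EMFILE, errno.ENFILE)
            if not out_of_fds or self._spare is None:
                raise
            # Release the spare and retry; the log takes its slot.
            os.close(self._spare)
            self._spare = None
            return os.open(path, flags, 0o644)
```

The reserve is spent after one use, so a real implementation would need
to re-establish it when the old log descriptor is closed.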

winbind constructs its own krb5.conf, removing all the local settings
that are set in /etc/krb5.conf. This results in the main winbind process
using values from /etc/krb5.conf, while the remaining processes use
potentially different values from its own krb5.conf. Trying to test
workarounds was frustrating because dropping the key lifetime in
/etc/krb5.conf had no effect. In the end I used a syscall wrapper around
rename to intercept and replace the contents of winbind's copy of
krb5.conf. As I say, it was "interesting".
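
To see which local settings go missing, the [libdefaults] section of
/etc/krb5.conf can be compared with the copy winbind generates. The
sketch below is a deliberately simple parser that only handles flat
"key = value" lines. The location of winbind's generated file is not
given in this report, so it is passed in by the caller, and
ticket_lifetime in the usage note is only an example of a setting that
might be dropped.

```python
def libdefaults(text):
    """Return the flat key/value settings from the [libdefaults] section."""
    settings = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "libdefaults" and "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def dropped_or_changed(system_text, generated_text):
    """Map each system setting that is missing or different in the
    generated file to a (system_value, generated_value) pair."""
    system = libdefaults(system_text)
    generated = libdefaults(generated_text)
    return {
        key: (value, generated.get(key))
        for key, value in system.items()
        if generated.get(key) != value
    }
```

Reading both files and passing their contents to dropped_or_changed()
would report, for instance, a ticket_lifetime entry that exists in
/etc/krb5.conf but not in winbind's copy as ("10m", None).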

** Affects: samba (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to samba in Ubuntu.
https://bugs.launchpad.net/bugs/788998
