[Bug 2085389] Re: File descriptor leak on /var/lib/sss/pipes/nss socket

Matthew Ruffell 2085389 at bugs.launchpad.net
Thu Nov 7 03:08:10 UTC 2024


Attached is the final debdiff that was sponsored for noble.

** Description changed:

  [Impact]
  
  In a multithreaded environment, each pthread that opens the
  /var/lib/sss/pipes/nss socket keeps the file descriptor in a
  thread-specific structure. This file descriptor should be closed when
  the thread is destroyed, but due to a bug it is left open, which leaks
  the descriptor.
  
  [Test Plan]
  
  Start two VMs: one will be an LDAP client, the other an LDAP
  server.
  
  On the server:
  
  $ sudo apt install slapd ldap-utils
  $ sudo dpkg-reconfigure slapd
  (When prompted, set the DNS domain name to example.com.)
  $ ldapsearch -x -LLL -H ldap:/// -b dc=example,dc=com dn
  dn: dc=example,dc=com
  $ vim add_content.ldif
  dn: ou=People,dc=example,dc=com
  objectClass: organizationalUnit
  ou: People
  
  dn: ou=Groups,dc=example,dc=com
  objectClass: organizationalUnit
  ou: Groups
  
  dn: cn=miners,ou=Groups,dc=example,dc=com
  objectClass: posixGroup
  cn: miners
  gidNumber: 5000
  
  dn: uid=john,ou=People,dc=example,dc=com
  objectClass: inetOrgPerson
  objectClass: posixAccount
  objectClass: shadowAccount
  uid: john
  sn: Doe
  givenName: John
  cn: John Doe
  displayName: John Doe
  uidNumber: 10000
  gidNumber: 5000
  userPassword: {CRYPT}x
  gecos: John Doe
  loginShell: /bin/bash
  homeDirectory: /home/john
  $ ldapadd -x -D cn=admin,dc=example,dc=com -W -f add_content.ldif
  $ ldapsearch -x -LLL -b dc=example,dc=com '(uid=john)' cn gidNumber
  dn: uid=john,ou=People,dc=example,dc=com
  cn: John Doe
  gidNumber: 5000
  
- ubuntu at noble-server:~$ ldappasswd -x -D cn=admin,dc=example,dc=com -W -S uid=john,ou=people,dc=example,dc=com
- New password: 
- Re-enter new password: 
- Enter LDAP Password: 
+ $ ldappasswd -x -D cn=admin,dc=example,dc=com -W -S uid=john,ou=people,dc=example,dc=com
+ New password:
+ Re-enter new password:
+ Enter LDAP Password:
  
  On the client, open /etc/hosts and add:
  
  $ sudo vim /etc/hosts
  192.168.122.150 ldap01.example.com
  $ sudo vim /etc/sssd/sssd.conf
  [sssd]
  config_file_version = 2
  domains = example.com
  
  [domain/example.com]
  id_provider = ldap
  auth_provider = ldap
  ldap_uri = ldap://ldap01.example.com
  cache_credentials = True
  ldap_search_base = dc=example,dc=com
  $ sudo chmod 600 /etc/sssd/sssd.conf
  $ sudo systemctl restart sssd
  $ getent passwd john
  john:*:10000:5000:John Doe:/home/john:/bin/bash
  
  Now that NSS is set up, try the reproducer. This code creates many
  threads, each of which opens the socket mentioned above.
  
  $ sudo apt install sssd build-essential
  $ cat > test_code.c << EOF
  #include <pwd.h>
  #include <unistd.h>
  #include <pthread.h>
  
  static void *client(void *arg)
  {
      int i = *((int *)arg);
      struct passwd pwd;
      char buf[10000];
      struct passwd *r;
  
      getpwuid_r(i, &pwd, buf, 10000, &r);
  
      return NULL;
  }
  
  int main(void)
  {
      pthread_t thread;
      int arg;
      void *t_ret;
  
      for (int i = 0; i < 1000; ++i) {
          arg = 100000+i;
          pthread_create(&thread, NULL, client, &arg);
          pthread_join(thread, &t_ret);
      }
  
      while (1) {
          sleep(1);
      }
  
      return 0;
  }
  EOF
  $ gcc -o test_code test_code.c -lpthread
  $ ./test_code
  
  To check for the leak, leave test_code running and, from a second
  terminal, count its open files:
  
  $ lsof -p `pidof test_code` | wc -l
  1015
  
  On an affected system the count can exceed a thousand; normally it
  should be no higher than about 20.
  
  [Where problems could occur]
  
  The patched code correctly accesses the thread-specific structure to
  get the file descriptor and close the socket. Previously it assumed
  the structure was NULL and did nothing. The only new problems that
  could occur are related to closing the socket, and those would be no
  worse than the previous situation.
  
  If a regression were to occur, it would affect most sssd users, since
  the change is in the core sssd component rather than a subcomponent.
  In the worst case, file descriptors would still leak, leading to
  intermittent crashes when processes hit their rlimits.
  
  [Other Info]
  
  This bug only affects Noble. This is the original github issue that was patched:
  https://github.com/SSSD/sssd/issues/7189
  
  Fixed in commit:
  commit b439847bc88ad7b89f0596af822c0ffbf2a579df
  From: Sumit Bose <sbose at redhat.com>
  Date: Tue, 23 Jan 2024 09:28:26 +0100
  Subject: sss-client: handle key value in destructor
  Link: https://github.com/SSSD/sssd/commit/b439847

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/2085389

Title:
  File descriptor leak on /var/lib/sss/pipes/nss socket

Status in sssd package in Ubuntu:
  Fix Released
Status in sssd source package in Noble:
  In Progress
Status in sssd source package in Oracular:
  Fix Released
Status in sssd source package in Plucky:
  Fix Released

Bug description:
  [Impact]

  In a multithreaded environment, each pthread that opens the
  /var/lib/sss/pipes/nss socket keeps the file descriptor in a
  thread-specific structure. This file descriptor should be closed when
  the thread is destroyed, but due to a bug it is left open, which leaks
  the descriptor.
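
  The mechanism described above can be sketched in C. This is a minimal
  illustration, not the sssd source: struct thread_state, remember_fd()
  and worker() are hypothetical names, and /dev/null stands in for the
  nss socket. POSIX resets a key's stored value to NULL before calling
  the destructor and passes the old value as the destructor's argument,
  so the destructor has to work from that argument.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-thread state; not the actual sssd structure. */
struct thread_state {
    int fd;
};

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;

/* Called by pthreads at thread exit with the thread's old key value. */
static void state_destructor(void *value)
{
    struct thread_state *state = value;

    if (state == NULL) {
        return;
    }
    if (state->fd >= 0) {
        close(state->fd);
    }
    free(state);
}

static void state_key_init(void)
{
    pthread_key_create(&state_key, state_destructor);
}

/* Attach fd to the calling thread's state; returns 0 on success. */
int remember_fd(int fd)
{
    struct thread_state *state;

    pthread_once(&state_once, state_key_init);

    state = malloc(sizeof(*state));
    if (state == NULL) {
        return -1;
    }
    state->fd = fd;
    if (pthread_setspecific(state_key, state) != 0) {
        free(state);
        return -1;
    }
    return 0;
}

/* Thread body: open a descriptor, hand it to the thread-specific state,
 * and report its number through arg (an int *). */
void *worker(void *arg)
{
    int *out_fd = arg;
    int fd = open("/dev/null", O_RDONLY);

    if (fd >= 0 && remember_fd(fd) != 0) {
        close(fd);
        fd = -1;
    }
    *out_fd = fd;
    return NULL;
}
```

  A thread running worker() has its descriptor closed by the destructor
  when it exits. A destructor that ignored its argument would leave the
  descriptor open after pthread_join() returns.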

  [Test Plan]

  Start two VMs: one will be an LDAP client, the other an LDAP
  server.

  On the server:

  $ sudo apt install slapd ldap-utils
  $ sudo dpkg-reconfigure slapd
  (When prompted, set the DNS domain name to example.com.)
  $ ldapsearch -x -LLL -H ldap:/// -b dc=example,dc=com dn
  dn: dc=example,dc=com
  $ vim add_content.ldif
  dn: ou=People,dc=example,dc=com
  objectClass: organizationalUnit
  ou: People

  dn: ou=Groups,dc=example,dc=com
  objectClass: organizationalUnit
  ou: Groups

  dn: cn=miners,ou=Groups,dc=example,dc=com
  objectClass: posixGroup
  cn: miners
  gidNumber: 5000

  dn: uid=john,ou=People,dc=example,dc=com
  objectClass: inetOrgPerson
  objectClass: posixAccount
  objectClass: shadowAccount
  uid: john
  sn: Doe
  givenName: John
  cn: John Doe
  displayName: John Doe
  uidNumber: 10000
  gidNumber: 5000
  userPassword: {CRYPT}x
  gecos: John Doe
  loginShell: /bin/bash
  homeDirectory: /home/john
  $ ldapadd -x -D cn=admin,dc=example,dc=com -W -f add_content.ldif
  $ ldapsearch -x -LLL -b dc=example,dc=com '(uid=john)' cn gidNumber
  dn: uid=john,ou=People,dc=example,dc=com
  cn: John Doe
  gidNumber: 5000

  $ ldappasswd -x -D cn=admin,dc=example,dc=com -W -S uid=john,ou=people,dc=example,dc=com
  New password:
  Re-enter new password:
  Enter LDAP Password:

  On the client, open /etc/hosts and add:

  $ sudo vim /etc/hosts
  192.168.122.150 ldap01.example.com
  $ sudo vim /etc/sssd/sssd.conf
  [sssd]
  config_file_version = 2
  domains = example.com

  [domain/example.com]
  id_provider = ldap
  auth_provider = ldap
  ldap_uri = ldap://ldap01.example.com
  cache_credentials = True
  ldap_search_base = dc=example,dc=com
  $ sudo chmod 600 /etc/sssd/sssd.conf
  $ sudo systemctl restart sssd
  $ getent passwd john
  john:*:10000:5000:John Doe:/home/john:/bin/bash

  Now that NSS is set up, try the reproducer. This code creates many
  threads, each of which opens the socket mentioned above.

  $ sudo apt install sssd build-essential
  $ cat > test_code.c << EOF
  #include <pwd.h>
  #include <unistd.h>
  #include <pthread.h>

  static void *client(void *arg)
  {
      int i = *((int *)arg);
      struct passwd pwd;
      char buf[10000];
      struct passwd *r;

      getpwuid_r(i, &pwd, buf, 10000, &r);

      return NULL;
  }

  int main(void)
  {
      pthread_t thread;
      int arg;
      void *t_ret;

      for (int i = 0; i < 1000; ++i) {
          arg = 100000+i;
          pthread_create(&thread, NULL, client, &arg);
          pthread_join(thread, &t_ret);
      }

      while (1) {
          sleep(1);
      }

      return 0;
  }
  EOF
  $ gcc -o test_code test_code.c -lpthread
  $ ./test_code

  To check for the leak, leave test_code running and, from a second
  terminal, count its open files:

  $ lsof -p `pidof test_code` | wc -l
  1015

  On an affected system the count can exceed a thousand; normally it
  should be no higher than about 20.

  [Where problems could occur]

  The patched code correctly accesses the thread-specific structure to
  get the file descriptor and close the socket. Previously it assumed
  the structure was NULL and did nothing. The only new problems that
  could occur are related to closing the socket, and those would be no
  worse than the previous situation.

  If a regression were to occur, it would affect most sssd users, since
  the change is in the core sssd component rather than a subcomponent.
  In the worst case, file descriptors would still leak, leading to
  intermittent crashes when processes hit their rlimits.
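
  The limit in question is RLIMIT_NOFILE. As a rough sketch of how a
  process could see how close it is to that limit, the hypothetical
  helpers below (fd_soft_limit() and fd_headroom() are illustrative
  names only) read the soft limit with getrlimit(). The soft limit is
  often 1024 by default, which is an assumption about the system rather
  than something stated in this report.

```c
#include <sys/resource.h>

/* Soft limit on open descriptors, or -1 on error or if unlimited. */
long fd_soft_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        return -1;
    }
    if (lim.rlim_cur == RLIM_INFINITY) {
        return -1;
    }
    return (long)lim.rlim_cur;
}

/* Number of further descriptors that can be opened given open_fds
 * already in use, or -1 if the limit cannot be determined. */
long fd_headroom(long open_fds)
{
    long limit = fd_soft_limit();

    if (limit < 0) {
        return -1;
    }
    return limit - open_fds;
}
```

  Once the headroom reaches zero, calls such as open() or socket() fail
  with EMFILE, which is the point at which a leaking process would
  start to misbehave.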

  [Other Info]

  This bug only affects Noble. This is the original github issue that was patched:
  https://github.com/SSSD/sssd/issues/7189

  Fixed in commit:
  commit b439847bc88ad7b89f0596af822c0ffbf2a579df
  From: Sumit Bose <sbose at redhat.com>
  Date: Tue, 23 Jan 2024 09:28:26 +0100
  Subject: sss-client: handle key value in destructor
  Link: https://github.com/SSSD/sssd/commit/b439847

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/2085389/+subscriptions



