[Bug 1347147] Re: krb5 database operations enter infinite loop
Tom Yu
tlyu at mit.edu
Thu Jul 31 16:06:08 UTC 2014
** Description changed:
- In some conditions, propagating a kerberos database to a slave KDC server or performing other database operations can stall. As we've investigated the issue, it looks like a database with more than a few hundred principals is very likely to run into this issue.
+ [Impact]
+
+ On krb5 KDC databases with more than a few hundred principals,
+ operations can enter an infinite loop in the database library. This
+ affects both read and write operations. If operators are fortunate,
+ they will encounter this bug while testing a migration. If they are not
+ so fortunate, they will encounter this bug in a production KDC when the
+ number of principals crosses the threshold where this bug manifests,
+ resulting in a service outage and possible database corruption.
+ Probably the only way to restore service in that situation is to install
+ a patched KDC or to downgrade to an unaffected version.
+
+ Both Trusty and Utopic amd64 have been verified to have this issue.
+
+ One concrete reported example is an invocation of kdb5_util load (as
+ part of a slave KDC propagation) spinning:
+
+ http://mailman.mit.edu/pipermail/kerberos/2014-July/020007.html
+
+ Additional failure modes are likely.
+
+ The proposed fix at
+ https://launchpad.net/~hartmans/+archive/ubuntu/ubuntu-fixes
+ works around a compiler optimizer bug in the gcc-4.8 series, which
+ incorrectly deduces that a strict aliasing violation has occurred and
+ miscompiles part of the bundled libdb2 library that the KDC database
+ back end depends upon. The miscompilation causes a data structure to
+ contain an inappropriate cycle, which leads to an infinite loop when the
+ structure is traversed.
+
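
The failure mechanism described above can be illustrated with a minimal sketch. It is not the libdb2 code: the node type, function names, and step limit are hypothetical and chosen only for illustration. It shows that a traversal which stops only when it reaches a sentinel never terminates once a cycle bypasses that sentinel; the step limit stands in for the infinite loop so the example stays runnable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not the libdb2 layout. */
struct node {
    struct node *next;
};

/*
 * Follow next pointers from `first` until `sentinel` is reached.
 * Returns the number of nodes visited before the sentinel, or -1 if
 * the sentinel was not reached within `limit` steps.
 */
long walk_to_sentinel(const struct node *first,
                      const struct node *sentinel, long limit)
{
    long steps = 0;
    const struct node *p;

    for (p = first; p != sentinel; p = p->next) {
        if (steps == limit)
            return -1;
        steps++;
    }
    return steps;
}

/* Link nodes[0..n-1] into a ring that closes through `sentinel`. */
void link_ring(struct node *sentinel, struct node *nodes, size_t n)
{
    size_t i;

    assert(n > 0);
    sentinel->next = &nodes[0];
    for (i = 0; i + 1 < n; i++)
        nodes[i].next = &nodes[i + 1];
    nodes[n - 1].next = sentinel;
}

/* Returns 1 if the sketch behaves as described, 0 otherwise. */
int demo_cycle(void)
{
    struct node sentinel, nodes[4];
    long healthy, broken;

    link_ring(&sentinel, nodes, 4);
    healthy = walk_to_sentinel(sentinel.next, &sentinel, 1000);

    /* Simulate the corruption: the last node skips the sentinel. */
    nodes[3].next = &nodes[0];
    broken = walk_to_sentinel(sentinel.next, &sentinel, 1000);

    return healthy == 4 && broken == -1;
}
```

In the healthy ring the walk visits four nodes and stops at the sentinel. After the simulated corruption the walk gives up at the step limit, which corresponds to the spin seen in the affected tools.
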
+ [Test Case]
+
+ apt-get install krb5-kdc krb5-admin-server
+ kdb5_util -W -r T create -s
+ awk 'BEGIN{ for (i = 0; i < 1024; i++) { printf("ank -randkey a%06d\n", i) } }' /dev/null | kadmin.local -r T
+
+ (Enter any password for the master key when requested.)
+
+ On platforms with this issue, kadmin.local spins consuming 100% CPU
+ after a few hundred principals have been created. (In two observed
+ cases, the stall occurred at principal "a000762".)
+
+ To clean up,
+
+ rm /etc/krb5kdc/principal*
+
+ or
+
+ kdb5_util -r T destroy
+
+ but the latter can possibly enter the same infinite loop.
+
+ [Regression Potential]
+
+ Negligible.
+
+ It is theoretically possible that our upstream workaround, which
+ involves using TAILQ macros instead of CIRCLEQ macros in the bundled
+ libdb2 that backs the KDC database, will have some as-yet undiscovered
+ bugs or compiler interactions with consequences worse than this current
+ issue. I think this is rather unlikely.
+
+ The patched libdb2 passes both the extensive libdb2 test suite and the
+ rest of the krb5 test suite. Prior to patching, compiling krb5 with an
+ affected gcc would cause the krb5 test suite to stall when it reached
+ the libdb2 test suite. (The test suite stall is how we became aware of
+ the gcc optimizer bug.)
+
+ The BSD TAILQ macros are generally considered to be safer than the
+ CIRCLEQ macros, and the various open-source BSD derivatives made the
+ corresponding change to their libdb sources years ago, with no
+ reported ill effects that I can see.
+
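
As a rough illustration of why the TAILQ family is regarded as safer, the sketch below uses the TAILQ macros from <sys/queue.h> (assumed to be available, as on glibc and the BSDs). The element type and function names are hypothetical and do not reflect the libdb2 sources. The point it shows is that a TAILQ traversal ends at a NULL pointer, so it does not depend on comparing an element pointer against the address of the list head.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type; libdb2's real buffer headers differ. */
struct page {
    int pgno;
    TAILQ_ENTRY(page) link;
};

TAILQ_HEAD(page_list, page);

/* Count elements by walking until the NULL end marker. */
int count_pages(struct page_list *list)
{
    struct page *p;
    int n = 0;

    TAILQ_FOREACH(p, list, link)
        n++;
    return n;
}

/* Build a three-element list, remove the middle one, return the count. */
int demo_tailq(void)
{
    struct page_list list;
    struct page pages[3];
    int i;

    TAILQ_INIT(&list);
    for (i = 0; i < 3; i++) {
        pages[i].pgno = i;
        TAILQ_INSERT_TAIL(&list, &pages[i], link);
    }

    /* The last element's successor is NULL, not the list head. */
    assert(TAILQ_NEXT(&pages[2], link) == NULL);

    TAILQ_REMOVE(&list, &pages[1], link);
    return count_pages(&list);
}
```

With CIRCLEQ, by contrast, the end of the queue is marked by a pointer to the head structure cast to the element type, which is the pattern the strict aliasing deductions are reported to interact with.
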
+
+ Original report from Ben Kaduk:
+
+ ==========
+
+ In some conditions, propagating a Kerberos database to a slave KDC server can stall.
This is due to a misoptimization by gcc 4.8 of the CIRCLEQ family of macros, apparently due to overzealous strict aliasing deductions.
One case of this stall is reported at
http://mailman.mit.edu/pipermail/kerberos/2014-July/020007.html (and the
rest of the thread), and there is an entry in the upstream bugtracker at
http://krbdev.mit.edu/rt/Ticket/Display.html?id=7860 .
gcc 4.9 (as used in Debian unstable at present) is not believed to
induce this problem. Upstream has patched their code to use the TAILQ
family of macros instead, as a workaround, but that workaround has not
yet appeared in an upstream release:
https://github.com/krb5/krb5/commit/26d8744129
- A branch is linked including this upstream workaround and two other
- patches to bugs already nominated for trusty applied to the krb5 in
- trusty. We believe the impact is significant because this is likely to
- be a problem for sites with a large database running trusty. The
- regression potential is very small. The upstream workaround changes
- from one family of queue macros that are stable and well-tested to
- another.
-
- For utopic, the simplest fix is to rebuild krb5 with the compiler
- currently in utopic. An alternative is to request that the Debian
- maintainers (both monitoring this bug for such a request) upload the
- upstream workaround to Debian and sync that. You could do an
- Ubuntu-specific upload but it seems undesirable to introduce a change
- between Ubuntu and Debian when all the right parties are happy to avoid it.
-
Because of the different compiler versions used on Debian and Ubuntu, I
am filing this as an Ubuntu-specific bug.
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to krb5 in Ubuntu.
https://bugs.launchpad.net/bugs/1347147