[Bug 1347147] [NEW] krb5 database operations enter infinite loop

Launchpad Bug Tracker 1347147 at bugs.launchpad.net
Thu Jul 31 18:39:00 UTC 2014


You have been subscribed to a public bug by Gunnar Hjalmarsson (gunnarhj):

[Impact]

On krb5 KDC databases with more than a few hundred principals,
operations can enter an infinite loop in the database library.  This
affects both read and write operations.  If operators are fortunate,
they will encounter this bug while testing a migration.  If they are not
so fortunate, they will encounter this bug in a production KDC when the
number of principals crosses the threshold where this bug manifests,
resulting in a service outage and possible database corruption.
Probably the only way to restore service in that situation is to install
a patched KDC or to downgrade to an unaffected version.

Both Trusty and Utopic amd64 have been verified to have this issue.

One concrete reported example is an invocation of kdb5_util load (as
part of a slave KDC propagation) spinning:

http://mailman.mit.edu/pipermail/kerberos/2014-July/020007.html

Additional failure modes are likely.

A linked branch applies the upstream workaround for this bug to the
krb5 in trusty, along with two other patches for bugs already nominated
for trusty.

For utopic, the simplest fix is to rebuild krb5 with the compiler
currently in utopic.  An alternative is to request that the Debian
maintainers (both monitoring this bug for such a request) upload the
upstream workaround to Debian and sync that.  An Ubuntu-specific upload
is also possible, but it seems undesirable to introduce a difference
between Ubuntu and Debian when all the right parties are happy to avoid
it.

The upstream patch works around a compiler optimizer bug in the gcc-4.8
series, which incorrectly deduces that a strict aliasing violation has
occurred and miscompiles part of the bundled libdb2 library that the KDC
database back end depends upon.  The miscompilation causes a data
structure to contain an inappropriate cycle, which leads to an infinite
loop when the structure is traversed.
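
To illustrate the failure mode only: the sketch below is not libdb2
code, and struct node, has_cycle and demo_chain are hypothetical names
made up for this example.  It builds a short chain, optionally links the
tail back to an earlier element, and uses Floyd's two-pointer check to
report whether a traversal that waits for NULL could ever finish.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not a libdb2 structure. */
struct node {
    struct node *next;
};

/*
 * Floyd's cycle check: returns 1 if following next pointers from
 * start can never reach NULL, 0 if the chain terminates.
 */
int has_cycle(const struct node *start)
{
    const struct node *slow = start, *fast = start;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;          /* advance one step */
        fast = fast->next->next;    /* advance two steps */
        if (slow == fast)
            return 1;               /* pointers met inside a cycle */
    }
    return 0;                       /* fast pointer reached the end */
}

/*
 * Build a three-node chain.  If corrupt is nonzero, link the tail
 * back to the middle node, imitating an inappropriate cycle.
 * Returns the result of has_cycle() on the final chain.
 */
int demo_chain(int corrupt)
{
    struct node n[3];

    n[0].next = &n[1];
    n[1].next = &n[2];
    n[2].next = NULL;
    assert(!has_cycle(&n[0]));      /* a well-formed chain ends at NULL */

    if (corrupt)
        n[2].next = &n[1];
    return has_cycle(&n[0]);
}
```

A plain loop such as for (p = start; p != NULL; p = p->next) run over
the corrupted chain would never exit, which is the kind of 100% CPU spin
seen in the test case.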

[Test Case]

apt-get install krb5-kdc krb5-admin-server
kdb5_util -W -r T create -s
awk 'BEGIN{ for (i = 0; i < 1024; i++) { printf("ank -randkey a%06d\n", i) } }' /dev/null | kadmin.local -r T

(Enter any password for the master key when requested.)

On platforms with this issue, kadmin.local spins consuming 100% CPU
after a few hundred principals have been created.  (In two observed
cases, this happened at principal "a000762".)

To clean up,

rm /etc/krb5kdc/principal*

or

kdb5_util -r T destroy

but the latter can possibly enter the same infinite loop.

[Regression Potential]

Negligible.

It is theoretically possible that our upstream workaround, which
involves using TAILQ macros instead of CIRCLEQ macros in the bundled
libdb2 that backs the KDC database, will have some as-yet undiscovered
bugs or compiler interactions with consequences worse than this current
issue.  I think this is rather unlikely.

The patched libdb2 passes both the extensive libdb2 test suite and the
rest of the krb5 test suite.  Prior to patching, compiling krb5 with an
affected gcc would cause the krb5 test suite to stall when it reached
the libdb2 test suite.  (The test suite stall is how we became aware of
the gcc optimizer bug.)

The BSD TAILQ macros are generally considered to be safer than the
CIRCLEQ macros, and the various open-source BSD derivatives made the
corresponding change to their libdb sources years ago, with no reported
ill effects that I can see.
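
For readers unfamiliar with the two macro families, here is a minimal
sketch of TAILQ usage, assuming a system that provides <sys/queue.h>.
struct page, page_list and sum_pages are hypothetical names for this
example and do not come from libdb2.  The relevant difference is the end
marker: a TAILQ traversal stops when the cursor becomes NULL, whereas
the CIRCLEQ macros store the address of the list head, cast to the
element type, as the end marker and stop when the cursor compares equal
to it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type; libdb2's real structures differ. */
struct page {
    int pgno;
    TAILQ_ENTRY(page) link;     /* linkage embedded in the element */
};

TAILQ_HEAD(page_list, page);    /* declares struct page_list */

/*
 * Number count elements of caller-provided storage 1..count, append
 * them to a fresh tail queue, then sum the numbers by traversal.
 * TAILQ_FOREACH ends when the cursor p becomes NULL.
 */
int sum_pages(struct page *storage, int count)
{
    struct page_list head;
    struct page *p;
    int i, sum = 0;

    assert(storage != NULL && count >= 0);

    TAILQ_INIT(&head);
    for (i = 0; i < count; i++) {
        storage[i].pgno = i + 1;
        TAILQ_INSERT_TAIL(&head, &storage[i], link);
    }

    TAILQ_FOREACH(p, &head, link)
        sum += p->pgno;
    return sum;
}
```

Calling sum_pages() on an array of four struct page elements walks all
four and returns the sum of their page numbers.  The same code written
with the CIRCLEQ macros would look nearly identical at the call sites,
which is presumably part of why the switch was a small change.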

Original report from Ben Kaduk:

==========

In some conditions, propagating a Kerberos database to a slave KDC server can stall.
This is due to a misoptimization by gcc 4.8 of the CIRCLEQ family of macros, apparently due to overzealous strict aliasing deductions.

One case of this stall is reported at
http://mailman.mit.edu/pipermail/kerberos/2014-July/020007.html (and the
rest of the thread), and there is an entry in the upstream bugtracker at
http://krbdev.mit.edu/rt/Ticket/Display.html?id=7860 .

gcc 4.9 (as used in Debian unstable at present) is not believed to
induce this problem.  Upstream has patched their code to use the TAILQ
family of macros instead, as a workaround, but that workaround has not
yet appeared in an upstream release:
https://github.com/krb5/krb5/commit/26d8744129

Because of the different compiler versions used on Debian and Ubuntu, I
am filing this as an Ubuntu-specific bug.

** Affects: gcc
     Importance: Unknown
         Status: Unknown

** Affects: kerberos
     Importance: Unknown
         Status: Unknown

** Affects: gcc-4.8 (Ubuntu)
     Importance: Undecided
         Status: Confirmed

** Affects: krb5 (Ubuntu)
     Importance: Undecided
         Status: Triaged


** Tags: regression-release testcase trusty utopic
-- 
krb5 database operations enter infinite loop
https://bugs.launchpad.net/bugs/1347147
You received this bug notification because you are a member of Ubuntu Sponsors Team, which is subscribed to the bug report.


