[Bug 235793] Re: Segmentation fault in ntpd when system has more than 1134 interface addresses

Bug Watch Updater 235793 at bugs.launchpad.net
Tue Aug 7 09:29:09 UTC 2012


Launchpad has imported 13 comments from the remote bug at
https://support.ntp.org/bugs/show_bug.cgi?id=1071.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2008-09-19T01:40:42+00:00 Jw-ntp wrote:

Hi:

We're running into the same problem on a Red Hat system that was previously
reported against Ubuntu.

I don't think I could write a better bug report, so I'll just include/reference
it here:

http://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg858457.html


$ uname -a
Linux xxxxx 2.6.24-17-generic #1 SMP Thu May 1 13:57:17 UTC 2008 x86_64 GNU/Linux

$ lsb_release -rd
Description:    Ubuntu 8.04
Release:        8.04

$ apt-cache policy ntp
ntp:
  Installed: 1:4.2.4p4+dfsg-3ubuntu2
  Candidate: 1:4.2.4p4+dfsg-3ubuntu2
  Version table:
 *** 1:4.2.4p4+dfsg-3ubuntu2 0
        500 http://gb.archive.ubuntu.com hardy/main Packages
        100 /var/lib/dpkg/status

On a system which has more than 1134 interface addresses (counting
both IPv4 and IPv6 addresses thus: 'ip addr ls | grep inet | wc -l'), ntpd
fails with a segmentation fault.
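
For readers who want the same count from inside a program, the following is
a minimal sketch in C using getifaddrs(). The helper name
count_inet_addresses is hypothetical and is not part of ntpd; the result
should roughly match the 'ip addr ls | grep inet | wc -l' figure, though
that equivalence is an assumption rather than something verified in this
report.

```c
#include <stddef.h>
#include <ifaddrs.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of ntpd): count the IPv4 and IPv6
 * addresses configured across all interfaces.
 * Returns the count, or -1 if getifaddrs() fails. */
static int count_inet_addresses(void)
{
    struct ifaddrs *list = NULL;
    const struct ifaddrs *ifa;
    int count = 0;

    if (getifaddrs(&list) != 0)
        return -1;

    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL)      /* some entries carry no address */
            continue;
        if (ifa->ifa_addr->sa_family == AF_INET ||
            ifa->ifa_addr->sa_family == AF_INET6)
            count++;
    }

    freeifaddrs(list);
    return count;
}
```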

I run a large network for a university (20,000 people), and we routinely
operate open-source routers with several thousand interface addresses
(every client machine lives in its own /30 subnet).


Whilst examining the problem I used this script to add interface addresses:

$ cat breakit.sh
#!/bin/sh
set -e
dev="vlan252"
sudo ifdown ${dev}                    # clears all old addr
naddr=`ip add ls | grep inet | wc -l` # number of addr already on system
echo "Pre-existing addresses on system: ${naddr}"
: ${nbreakit:=1217}                   # number of addr required to break ntp
sudo ifup ${dev}
total=${naddr}
for I in `seq 1 5` ; do
  for J in `seq 0 255` ; do
    sudo ip addr add 10.0.${I}.${J}/32 dev ${dev}
    total=$((${total}+1))
    if [ ${total} -ge ${nbreakit} ]; then break ; fi
  done
  if [ ${total} -ge ${nbreakit} ]; then break ; fi
done
echo "Added extra addresses: $((${total}-${naddr}))"
echo "Total addresses on system now: $(ip add ls | grep inet | wc -l)"
exit 0

$ grep -B1 -A2 vlan252 /etc/network/interfaces

#auto vlan252
iface vlan252 inet manual
  vlan-raw-device eth0

I could then add the desired number of interface addresses by doing:
$ nbreakit=1135 ../breakit.sh

and then run ntpd [1][2] under gdb by doing:

sudo sh -c "ulimit -n 8192 ; gdb --args /usr/sbin/ntpd -n -d -D3 -p /var/run/ntpd.pid -u 115:126 -g"

[1] You can't use '-d -D3' on the standard Ubuntu package - debugging is
disabled.
[2] It is necessary to raise the open-files ulimit above its default of 1024
in order to run ntpd with this many addresses.

In order to help me understand what was going wrong with the standard
package, I rebuilt it with debugging enabled and symbols not stripped:

$ mkdir work
$ cd work/
$ apt-get source ntp
Reading package lists... Done
Building dependency tree
Reading state information... Done
NOTICE: 'ntp' packaging is maintained in the 'Svn' version control system at:
svn://svn.debian.org/pkg-ntp/ntp/
Need to get 3120kB of source archives.
Get: 1 http://gb.archive.ubuntu.com hardy/main ntp 1:4.2.4p4+dfsg-3ubuntu2 (dsc) [1034B]
Get: 2 http://gb.archive.ubuntu.com hardy/main ntp 1:4.2.4p4+dfsg-3ubuntu2 (tar) [2835kB]
Get: 3 http://gb.archive.ubuntu.com hardy/main ntp 1:4.2.4p4+dfsg-3ubuntu2 (diff) [284kB]
Fetched 3120kB in 0s (7433kB/s)
dpkg-source: extracting ntp in ntp-4.2.4p4+dfsg
dpkg-source: unpacking ntp_4.2.4p4+dfsg.orig.tar.gz
dpkg-source: applying ./ntp_4.2.4p4+dfsg-3ubuntu2.diff.gz
$ cd ntp-4.2.4p4+dfsg/
$ cp -p debian/rules{,.orig}
$ vi debian/rules
$ diff -u debian/rules{.orig,}
--- debian/rules.orig   2008-05-28 17:52:21.000000000 +0100
+++ debian/rules        2008-05-28 17:53:21.000000000 +0100
@@ -21,7 +21,7 @@
        ./configure CFLAGS='$(CFLAGS)' \
                --prefix=/usr \
                --enable-all-clocks --enable-parse-clocks --enable-SHM \
-               --disable-debugging --sysconfdir=/var/lib/ntp \
+               --enable-debugging --sysconfdir=/var/lib/ntp \
                --with-sntp=no \
                --enable-linuxcaps \
                --disable-dependency-tracking
@@ -104,7 +104,7 @@
        dh_installlogcheck -a
        dh_installchangelogs -a
        dh_perl -a
-       dh_strip -a
+       #dh_strip -a
        dh_compress -a
        dh_fixperms -a
        dh_installdeb -a
$ dpkg-buildpackage -us -uc
...
dpkg-buildpackage: binary and diff upload (original source NOT included)
$ sudo dpkg -i ../ntp_4.2.4p4+dfsg-3ubuntu2_amd64.deb

Once one has 1135 interface addresses on the system you get this
segmentation fault:

0x000000000040c9d2 in update_interfaces (port=<value optimized out>,
    receiver=0, data=0x0) at ntp_io.c:769
769             ISC_LIST_UNLINK_TYPE(inter_list, interface, link, struct interface);

If one increases the number of interface addresses to 1215 you get a
different segmentation fault:

update_interfaces (port=<value optimized out>, receiver=0, data=0x0)
    at ntp_io.c:1325
1325                            if (!(interf->flags & (INT_WILDCARD|INT_MCASTIF))) {

and if one increases it to 1216 or more you finally get this one:

0x000000000040ba7f in add_interface (interface=0x7d13c0) at ntp_io.c:756
756             ISC_LIST_APPEND(inter_list, interface, link);

I had hoped to work around the problem by renaming the devices to
'vlan252:foo', since in ntpd/ntp_io.c:address_okay() the -L flag causes ntpd
to ignore "virtual addresses", which turns out to mean addresses on
interfaces whose names contain a ':'. Unfortunately, since the segmentation
fault occurs during interface enumeration (while building the linked list of
all interface+address pairs), this doesn't help.

Earlier versions of ntpd did not use a linked list for this purpose, but
rather a fixed-size array of 512 entries. By simply increasing the size of
the array I was able to run with large numbers of addresses. It seems to me
that, whilst the linked-list code ought to be made to work correctly, it is
in fact an unnecessary precaution (and overhead) to bind all addresses on a
Linux system - only the root user could bind a more specific address than *
on port 123, and if one has root the game is over anyway.

I haven't submitted this bug to the upstream package maintainer
(http://www.ntp.org/bugs.html) partly because Firefox won't let me see
their bugs site (invalid SSL certificate), and partly because it might
be better coming via the distribution. I note that Ubuntu has the
current version of ntp (4.2.4p4).

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/2

------------------------------------------------------------------------
On 2008-09-19T03:04:56+00:00 Mayer-r wrote:

Can you say why you need that many addresses on one system?

I haven't looked recently at what's going on, but I would recommend that you
turn off rescanning by setting -u 0 on the command line. The only way I can
think of for a linked list to fail is running out of memory, but then I don't
see why the crash would happen where you say, since it would be in the
malloc() call.

Does BIND9 run okay on this system?

Danny

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/3

------------------------------------------------------------------------
On 2008-09-19T03:12:03+00:00 Jw-ntp wrote:

Hi Danny:

The systems in question are part of an outbound mail farm running eCelerity
(http://messagesystems.com/). Clients get one or more IP addresses depending
on their volume/rate.

I'll see if -u 0 helps.

We don't run bind on those boxes, but I can try and see if it starts...

Thanks!

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/4

------------------------------------------------------------------------
On 2008-09-19T03:47:55+00:00 Stenn wrote:

Subject: Segmentation fault in ntpd when system has more than 1134
interface addresses

While I can appreciate that this is a rare case, the bottom line is we
seem to have a bug and it should be fixed.

We've been meaning to upgrade and use a more "stock" libisc/, perhaps
reporting some bugs along the way.

This looks like an opportunity...

-- 
Harlan Stenn <stenn at ntp.org>
http://ntpforum.isc.org  - be a member!


Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/5

------------------------------------------------------------------------
On 2008-09-19T14:42:28+00:00 Jw-ntp wrote:

-u 0 didn't make any difference. We'll try to see if bind starts.

We're happy to test a patch, if one becomes available.

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/6

------------------------------------------------------------------------
On 2008-12-06T01:46:09+00:00 Norbert-eder wrote:

It is clear that your additional options don't solve the problem.

You must use a capital U.

I used -U 0 and it solved this problem.
I opened Bug ID 1102.

bye,
Nobsi

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/7

------------------------------------------------------------------------
On 2009-05-07T21:21:38+00:00 Dave Hart wrote:

Thanks to a few hours of cooperative debugging with Sandu Adrian 
<dexter at d3xt3r01.tk> today, the faulty code has been identified.  In 
ntpd/ntp_io.c the last line of add_fd_to_list is FD_SET(fd, &activefds).  This 
(and the preceding maxactivefd maintenance) should be protected by a check that 
fd < FD_SETSIZE.  If you add code to check for that and msyslog a message and 
then exit, you will no longer crash ntpd starting with many interfaces.

Fixing the actual crash is easy.  The tougher part is figuring out how we are 
going to enumerate more interfaces than FD_SETSIZE, in other words, how to 
defer opening sockets for each until after enumeration, so that some mechanism 
can be used to select a subset of interfaces less numerous than FD_SETSIZE that 
ntpd can use.

I suggest we immediately add code to terminate with an error if we are about to 
try to add an fd >= FD_SETSIZE with FD_SET.  We could also add code to 
add_interface to catch corruption of inter_list.head sooner, as it helped to 
track this issue down.  That depends on the first entry in inter_list never 
being removed but I think that's a safe assumption.
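
The guard described above can be sketched as follows. This is an
illustration only, not the actual ntpd patch: the helper name guarded_fd_set
and its return convention are hypothetical, and where the comment proposes
logging via msyslog and exiting, this sketch prints to stderr and returns an
error instead.

```c
#include <stdio.h>
#include <sys/select.h>

/* Hypothetical helper (not the actual ntpd code): add fd to the set and
 * update the running maximum only when fd fits inside an fd_set.
 * Returns 0 on success, -1 if fd is negative or >= FD_SETSIZE. */
static int guarded_fd_set(int fd, fd_set *set, int *maxfd)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        /* FD_SET() on such an fd would write outside the fd_set. */
        fprintf(stderr, "fd %d is outside [0, %d); not added\n",
                fd, (int)FD_SETSIZE);
        return -1;
    }

    FD_SET(fd, set);
    if (fd > *maxfd)
        *maxfd = fd;
    return 0;
}
```

A caller would treat a -1 return as fatal, or skip that socket, rather than
letting FD_SET write past the end of the set.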



Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/8

------------------------------------------------------------------------
On 2009-05-08T01:49:31+00:00 Dave Hart wrote:

I have the two fixes described in Comment #6 ready; they fix the reported
crash.  Note that ntpd will simply refuse to start with more interfaces than
around FD_SETSIZE (1024 on Linux), so this is not a solution to the overall
problem, but it does fix the bug that caused the corruption and the fault.

pogo:~hart/ntp-stable-784-jjy

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/9

------------------------------------------------------------------------
On 2009-05-08T07:51:59+00:00 Stenn wrote:

Jeff,

Please check 4.2.4p7-RC6 (or later) and mark this bug as VERIFIED or
REOPENED, as appropriate.

Dave Hart and Sandu, thanks for your work in getting the root cause
identified and resolved.

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/10

------------------------------------------------------------------------
On 2009-05-09T04:47:05+00:00 Dave Hart wrote:

See Bug #1180 regarding a fix for starting ntpd with more than 1000
interfaces

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/11

------------------------------------------------------------------------
On 2009-05-09T17:22:57+00:00 Mayer-r wrote:

Just note that FD_SETSIZE can be any value: it may default to 1024 on SOME
flavors of Linux and be something totally different on other operating
systems, and it could be anything if overridden, so this check looks right.
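
Because the value is platform-dependent, a process can compare it with its
own open-files limit at run time. The following is a rough sketch under that
assumption; the helper name fds_may_exceed_fd_setsize is hypothetical and
nothing like it is claimed to exist in ntpd.

```c
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper: report whether this process could be handed a
 * descriptor too large for an fd_set.
 * Returns 1 if the soft RLIMIT_NOFILE limit exceeds FD_SETSIZE (or is
 * unlimited), 0 if every possible fd fits, -1 if getrlimit() fails. */
static int fds_may_exceed_fd_setsize(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    /* Descriptors are numbered 0 .. rlim_cur - 1. */
    return rl.rlim_cur > (rlim_t)FD_SETSIZE ? 1 : 0;
}
```

With the 'ulimit -n 8192' setting from the original report and an
FD_SETSIZE of 1024, this would return 1.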

Danny

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/13

------------------------------------------------------------------------
On 2009-10-15T16:13:23+00:00 Kostecke-8 wrote:

Please mark this bug as VERIFIED if you agree that it is fixed.

Or reopen it if further work is required.

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/18

------------------------------------------------------------------------
On 2012-08-07T05:27:33+00:00 Stenn wrote:

*** Bug 1102 has been marked as a duplicate of this bug. ***

Reply at: https://bugs.launchpad.net/ntp/+bug/235793/comments/19


** Changed in: ntp
   Importance: Unknown => High

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to ntp in Ubuntu.
https://bugs.launchpad.net/bugs/235793

Title:
  Segmentation fault in ntpd when system has more than 1134 interface
  addresses

To manage notifications about this bug go to:
https://bugs.launchpad.net/ntp/+bug/235793/+subscriptions


