[apparmor] wrong loggingtime in apparmorlog

John Johansen john.johansen at canonical.com
Fri Feb 6 09:22:25 UTC 2015


On 02/06/2015 01:06 AM, Hajo Locke wrote:
> Hello,
> 
> Am 05.02.2015 um 14:56 schrieb John Johansen:
>> On 02/05/2015 02:49 AM, Hajo Locke wrote:
>>> Hello,
>>>
>>>
>>> Am 05.02.2015 um 00:11 schrieb Seth Arnold:
>>>> On Wed, Feb 04, 2015 at 10:47:58AM +0100, Hajo Locke wrote:
>>>>> Feb  4 09:55:01 hostname CRON[3898]: (root)...
>>>>> Feb  4 11:30:50 hostname kernel: type=1400
>>>>> audit(1423040101.577:7785293): apparmor="DENIED"...
>>>>> Feb  4 09:55:01 hostname CRON[3911]: (root)...
>>>>>
>>>>> a relation to audit seems possible.  i found more kernel loglines in
>>>>> syslog with wrong date which are not in direct relation to apparmor.
>>>>> i think complete kern.log is wrong.
>>>>> may be more kernel related?  we upgrade regularly, current version
>>>>> on this machine is 3.13.0-43-generic (Ubuntu 14.04 LTS)
>>>>> Currently only rebooting solves this problem. after reboot
>>>>> timestamps in loglines are identical. But by increasing uptime the
>>>>> difference between realtime and kernel-log-time grows.
>>>>> one server with uptime ~2 days has kernel-time 53 seconds in future
>>>>> (3.13.0-45-generic).  server from loglines above has uptime 48 days
>>>>> and difference is more than 1,5 hours.
>>>> I've got a theory that's not very well thought through yet; to be honest,
>>>> this is an interesting problem that I've never seen before.
>>>>
>>>> Here's some source code that's very similar to the rsyslog daemon used in
>>>> Ubuntu 14.04 LTS:
>>>>
>>>> http://sources.debian.net/src/rsyslog/7.4.8-1%7Ebpo70%2B1/plugins/imklog/bsd.c/#L58
>>>>
>>>> Note this section:
>>>>
>>>> /* ... */
>>>>      clock_gettime(CLOCK_MONOTONIC, &monotonic);
>>>>      clock_gettime(CLOCK_REALTIME, &realtime);
>>>>      secOffs = realtime.tv_sec - monotonic.tv_sec;
>>>>      usecOffs = (realtime.tv_nsec - monotonic.tv_nsec) / 1000;
>>>>      if(usecOffs < 0) {
>>>>          secOffs--;
>>>>          usecOffs += 1000000l;
>>>>      }
>>>>           usecs += usecOffs;
>>>>      if(usecs > 999999l) {
>>>>          secs++;
>>>>          usecs -= 1000000l;
>>>>      }
>>>>      secs += secOffs;
>>>>      tv.tv_sec = secs;
>>>>      tv.tv_usec = usecs;
>>>>      tp = &tv;
>>>>
>>>> done:
>>>>      Syslog(pri, buf, tp);
>>>> /* ... */
>>>>
>>>> rsyslog is changing the timestamp supplied to its internal Syslog()
>>>> routine based on the time difference between the CLOCK_MONOTONIC and
>>>> CLOCK_REALTIME clocks. The CLOCK_MONOTONIC clock will only ever move
>>>> forward, while CLOCK_REALTIME can jump backwards if e.g. ntpdate(8) or
>>>> date(1) sets an entirely new time.
>>>>
>>>> So, my theory -- this system is using ntpdate(8) in a cronjob to set the
>>>> time rather than using ntpd(8), which would _skew_ the time in a fashion
>>>> that's more amenable to keeping CLOCK_MONOTONIC and CLOCK_REALTIME closer
>>>> to synchronized. (ntpd(8) can ask the system clock to speed up or slow
>>>> down so that seconds take more or less time, but when skewing it won't
>>>> jump the time forwards or backwards abruptly.)
>>>>
>>>> I've attached a little program that will output all the clocks documented
>>>> in clock_gettime(3) as well as the difference between the MONOTONIC clock
>>>> and REALTIME clock. If you've installed build-essential, you should be
>>>> able to compile it and run with just "make clocks ; ./clocks".
>>>>
>>>> If I'm correct, the difference between the MONOTONIC and the REALTIME
>>>> clock will get worse over the uptime of the system. And if I'm correct,
>>>> you can fix the issue by switching to ntpd rather than using ntpdate
>>>> periodically. You may also be able to use the klogParseKernelTimestamp
>>>> configuration variable from rsyslog:
>>>> http://www.rsyslog.com/doc/master/configuration/modules/imklog.html
>>> Thanks for your effort. i ran your program on affected machine and this is output:
>>>
>>>
>>>             CLOCK_REALTIME Thu Feb  5 10:36:40 2015
>>>      CLOCK_REALTIME_COARSE Thu Feb  5 10:36:40 2015
>>>            CLOCK_MONOTONIC Thu Feb 19 09:50:37 1970
>>>     CLOCK_MONOTONIC_COARSE Thu Feb 19 09:50:37 1970
>>>        CLOCK_MONOTONIC_RAW Thu Feb 19 09:51:01 1970
>>>             CLOCK_BOOTTIME Thu Feb 19 09:50:37 1970
>>>   CLOCK_PROCESS_CPUTIME_ID Thu Jan  1 01:00:00 1970
>>>    CLOCK_THREAD_CPUTIME_ID Thu Jan  1 01:00:00 1970
>>> CLOCK_REALTIME - CLOCK_MONOTONIC 1418863562.991217
>>>
>>>
>>> Current real-time is: Do 5. Feb 10:36:42 CET 2015
>>>
>>> we do not use rsyslog, for historical reasons we use syslog-ng. I dont know if this matters.
>> maybe they are different code bases. Overall I don't think it matters they
>> should both be handling time roughly the same or there would be lots of
>> complaints.
>>
>>> our systems run ntpd. In consultation with my fellows, nobody knows about using ntpdate. while running ntpd the use of ntpdate results in error.
>>> The difference between CLOCK_REALTIME - CLOCK_MONOTONIC is 1:46:02
>>>
>> ntpdate is just the old outdated way, it can result in large jumps where ntpd
>> keeps the time adjustments small. This can however be a problem when the skew
>> is large as it may not ever be able to catch up (well considering something is
>> introducing skew and maybe continuing to introduce skew).
>>
>> So
>> CLOCK_REALTIME can jump forwards and backwards in time
>> CLOCK_MONOTONIC will never jump backwards in time but is subject to skewing by
>>    ntpd
>> CLOCK_MONOTONIC_RAW is the unskewed MONOTONIC value
>>
>>> if i reload apparmor and use date-command to compare times, i see a difference of ~ 1:38
>>> Feb  5 13:00:05 hostname kernel: type=1400 audit(1423131731.158:8016925): apparmor...
>>> Do 5. Feb 11:22:11 CET 2015
>>>
>> I am not sure about the differences between the apparmor message drift and those
>> reported by Seth's little program. The difference could be some what accounted
>> for by CLOCK_MONOTONIC_RAW but I doubt that is actually it. How soon after
>> running seths program did you generate the apparmor message? ntpd could be doing
>> its thing and gradually trying to bring the clock in line.
> i generated the apparmor-messages ~46 Minutes after running seths program.
>>
>>
>>> hmm, i cant handle these results. I can do some more tests if you want.
>>>
>> hrmm so with the report that other kernel log messages are wrong I think it is
>> certainly dmesg that is the problem
>>
>> the time source used for dmesg time stamps is not guaranteed to be accurate, it
>> is not updated after suspend/resume etc. There is no real solution for this
>> except to have your logging daemon replace the dmesg time stamp with its own.
>> This will result in a slightly wrong time on the message but better than
>> the clock skew you are seeing.
>>
>> I think for syslog-ng you set the keep-timestamp option to no
>>
>> This of course will affect the logs from other sources as well.
>>
>> there are actually quite a few complaints about kernel log time being off
>> and running things like virtual machines just seem to exacerbate those
>> problems.
>>
>> The real question is why you are getting so much skew, I'd really like to
>> know what is causing it. Are they newer cpus with really deep sleep states,
>> that maybe are being handled correctly? Maybe older cpus/machines using the
>> htsc as the "high precision" source (lets just say it had lots of problems
>> with skew).
> hmm,  not very special servers. These are webservers which are running 24/7, permanent uptime.
> They never should really sleep. May be CPU sometimes idles, but in all this cases i would expect that the time from kernel logs is slower than realtime. But in all my cases the time drifts into the future.
> Hardware ist Standard, not very new AMD Phenom(tm) II X4 840 Processor in this case, but may be more other CPUs in other machines. AMD/Intel as well.
> I think we have hundreds of affected machines.
> A special hint may be that only Ubuntu14.04 seems to be affected.
> We have some completely new installed 14.04 Servers with this behaviour as well older servers which are upgraded from previous Ubuntu-Versions to 14.04.
> Currently i look in old syslogs of a upgraded server and i dont see this clock skew. It starts as this server was upgraded to Ubuntu14.04. But like i said also new installed machines show this behaviour, its not a problem of the upgrade.
> May be some bios-clocks have s skew, but this should not be a problem.
> 
> I tried the keep_timestamp(no) option for syslog-ng and this is working.  Real reason stays hidden, but workaround works.

I'm glad to hear that the work around is working.

I think it is very odd that you are seeing this drift and I think it is worth
filing an ubuntu bug against the kernel. Tell them that all kernel messages are
affected, that it something new in 14.04 and that you see it on AMD/Intel etc.
With a little luck they might have someone look at it. If you shoot the bug #
my way I can even try to nudge someone into looking into more. No promises
that someone will, the reality is they have hundreds and even thousands of
bugs to look at, and Ubuntu's kernel team isn't exactly teaming with hundreds
or even dozens of people.

But this is an interesting problem that might peek someones interest.





More information about the AppArmor mailing list