[PATCH 0/1] Switch to jiffies for native_sched_clock() when TSC warps

Thu Mar 11 16:54:19 UTC 2010

Chase Douglas wrote:
> On Thu, Mar 11, 2010 at 5:58 AM, Stefan Bader
> <stefan.bader at canonical.com> wrote:
>> Colin Ian King wrote:
>>> On Wed, 2010-03-10 at 17:56 -0500, Chase Douglas wrote:
>>>> There are other paths
>>>> that are causing oops messages [1]. Further bugs may be caused by TSC
>>>> warping that we just haven't seen yet.
>>> Here is an example of this: a slow I/O operation by default writes
>>> to port 0x80 for a small delay. However, the io_delay=udelay
>>> kernel parameter uses a 2 microsecond udelay(), so if the TSC warps
>>> forward then we may pop out of the delay prematurely which could be
>>> problematic.
>>>
>>> If we are *really* unlucky, it is hypothetically possible for the TSC
>>> to warp to 0xffffffffffffffff coming out of S3 and then immediately
>>> wrap to zero. I believe it may be then possible for a TSC based udelay()
>>> to get stuck in the delay loop for possibly years/centuries/millennia.
>> Right, so to me the best solution sounds like having something similar to the
>> macros in the clock framework (or use that) to handle wraps in general.
>> Like time_after or such things.
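
To make the wrap-handling idea above concrete, here is a minimal sketch in plain C, modelled on the signed-difference trick behind time_after(). The names cycles_after() and cycles_elapsed() are hypothetical and not taken from any kernel source. It assumes the two readings are less than 2^63 ticks apart, and it only helps callers that care about relative values: a forward warp by an arbitrary amount would still end a delay early.

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not a kernel API): true if counter reading 'a'
 * is later than reading 'b'. Correct across a 64-bit wrap as long as
 * the two readings are less than 2^63 ticks apart. The cast to a
 * signed type relies on two's complement conversion, which is what
 * the usual compilers provide.
 */
static inline bool cycles_after(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Ticks elapsed between two readings. Unsigned subtraction is done
 * modulo 2^64, so a single wrap between 'start' and 'now' is handled.
 */
static inline uint64_t cycles_elapsed(uint64_t start, uint64_t now)
{
	return now - start;
}
```

A delay loop written as "cycles_elapsed(start, now) >= loops" rather than "now >= start + loops" would not spin for years after a wrap to zero, though it cannot tell a wrap from a warp.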
> 
> So the question I have is: is the absolute value of the TSC relevant,
> or just the relative value? Having proper wrap checking would solve
> some of the issues if we only cared about the relative values.
> However, the output of native_sched_clock is supposed to be an
> absolute number of nanoseconds since system boot. I don't know myself
> whether the absolute value is expected by any calling functions to be
> correct.
> 
> Beyond that though, there still may be instances where it is expected
> that the time stamps not jump years into the future. I'd be afraid
> that some protocol stack, like TCP, that depends highly on proper
> timing would go awry in such situations.
> 
> I sent a message to linux-kernel last night asking about the
> possibility of switching to the jiffies count at runtime when a TSC
> warp is found [1]. No responses yet though.
> 
> -- Chase
> 
> [1] http://lkml.org/lkml/2010/3/10/437
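
For discussion, the runtime switch described above could be sketched roughly as follows, as a user-space model and not the actual patch. Everything here is an assumption made for illustration: the struct and function names, the tick rate of 250, and above all the warp test (the TSC going backwards, or jumping forward by more than max_step cycles between two calls). The sketch also ignores the discontinuity between the last TSC-based value and the first jiffies-based one, which a real change would have to deal with.

```c
#include <stdint.h>
#include <stdbool.h>

#define SKETCH_HZ	250ULL		/* assumed tick rate */
#define SKETCH_NSEC_PER_SEC	1000000000ULL

/* Hypothetical state for the sketch, not a kernel structure. */
struct sketch_clock {
	uint64_t last_tsc;	/* last TSC reading accepted as sane */
	uint64_t tsc_khz;	/* TSC frequency in kHz */
	uint64_t max_step;	/* largest believable jump, in cycles */
	bool warped;		/* set once, never cleared */
};

/* Convert cycles to nanoseconds without overflowing 64 bits. */
static uint64_t sketch_cycles_to_ns(uint64_t cycles, uint64_t khz)
{
	return (cycles / khz) * 1000000ULL +
	       (cycles % khz) * 1000000ULL / khz;
}

/*
 * Return nanoseconds since boot. Uses the TSC until a warp is seen,
 * then falls back to the (much coarser) jiffies count for good.
 */
static uint64_t sketch_sched_clock(struct sketch_clock *c,
				   uint64_t tsc, uint64_t jiffies)
{
	if (!c->warped) {
		if (tsc < c->last_tsc || tsc - c->last_tsc > c->max_step)
			c->warped = true;	/* assumed warp heuristic */
		else
			c->last_tsc = tsc;
	}

	if (c->warped)
		return jiffies * (SKETCH_NSEC_PER_SEC / SKETCH_HZ);

	return sketch_cycles_to_ns(tsc, c->tsc_khz);
}
```

The fallback gives up nanosecond resolution for one tick (4 ms at the assumed rate), so whether callers can live with that is part of the open question.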

What I saw in the code (delay.c)

 * Since we calibrate only once at boot, this
 * function should be set once at boot and not changed

and in (tsc.c)

 * But note that we still use it if the TSC is marked
 * unstable. We do this because unlike Time Of Day,
 * the scheduler clock tolerates small errors and it's
 * very important for it to be as fast as the platform
 * can achive it.

makes me feel that changing it should be handled with care.

Stefan