[PATCH 0/1] Switch to jiffies for native_sched_clock() when TSC warps

Thu Mar 11 10:58:50 UTC 2010

Colin Ian King wrote:
> On Wed, 2010-03-10 at 17:56 -0500, Chase Douglas wrote:
>> On Wed, Mar 10, 2010 at 5:37 PM, Stefan Bader
>> <stefan.bader at canonical.com> wrote:
>>> Chase Douglas wrote:
>>>> I took a look at the x86 code handling the clock to see what could be done
>>>> about the TSC warping coming out of resume on some of the newer processors. The
>>>> code includes a built-in fallback path that uses the jiffies count instead of
>>>> the TSC register if "notsc" is used on the command line. This patch merely sets
>>>> this option at runtime if two TSC time stamps differ by more than 6 years.
>>>>
>>>> I'm sending this here first because I've not touched clocking code before. I'm
>>>> not sure whether this is a feasible approach, and I would like feedback. Note
>>>> that the TSC warping hasn't caused any noticeable issues beyond triggering some
>>>> oops messages, so even if there's some skew in the switch from TSC to jiffies
>>>> it should hopefully not cause too much of an issue.
>>>>
>>>> The only truly negative outcome I foresee is that the clock won't be stable on
>>>> a single CPU.
> 
> ..and it's hard to determine which CPUs are buggy because they may/may
> not have BIOS loaded or kernel loaded microcode fixes.
> 
>>  Programs needing accurate clock timing can pin themselves to a
>>>> single CPU in order to get TSC time stamps that are monotonic and accurate (The
>>>> TSC register is per cpu, and there may be skew between CPUs).
> 
> ..believe me, if it can skew, it will skew.
> 
>>  However, if the
>>>> TSC has warped we are beyond that point anyways. If you have a warping
>>>> processor you should run with notsc if you care about accuracy, even though
>>>> precision would be reduced.
>>>>
> ..and "notsc" impacts on low-latency (see later).
>>> My feeling is that changing sched_clock to jiffies after resume does not
>>> sound like a good idea. What was wrong with Colin's approach of just fixing the math?
>> Colin's patch keeps soft lockup bugs from being fired. That's fixing
>> merely one symptom, but not the real problem.
> 
> Well, actually, it's a little more complex than that. Here are some
> extra things to throw into the discussion:
> 
> 1) On some processors, the TSC can set the top 32 bits to 0xffffffff
> when coming out of S3. This is a processor issue which may be possible
> to fix with a microcode update (loaded from a new BIOS upgrade) or maybe
> by installing the intel-microcode package.  So maybe, on some systems
> we can advise users to first try the intel-microcode update. If the CPU
> is misbehaving, perhaps that's the first thing to fix.
> 
> 2) While poking around I saw that we get spurious warnings from the
> softlockup detection code when the approximated seconds timing tends
> towards 0xffffffff because of a math overflow. This will happen whether
> or not we use the TSC.  So it's good to have this fixed anyhow,
> even if the bug only happens after thousands of years uptime.
> 
> 3) Disabling the use of the TSC impacts on low-latency. For example,
> when doing udelays the default is to use the TSC based delay which
> periodically yields to the scheduler rather than burning up cycles in a
> hard loop. The use of the TSC enables the delay loop to figure out how
> much delay is left after coming back from the scheduler.   With the
> non-TSC mode, we burn up cycles and don't yield, so low-latency users
> may/will object to this.
> 
> 
>> There are other paths
>> that are causing oops messages [1]. Further bugs may be caused by TSC
>> warping that we just haven't seen yet.
> 
> Here is an example of this: Doing a slow I/O operation by default uses
> a write to port 0x80 for a small delay. However, the io_delay=udelay
> kernel parameter uses a 2 microsecond udelay(), so if the TSC warps
> forward then we may pop out of the delay prematurely which could be
> problematic.  
> 
> If we are *really* unlucky, it is hypothetically possible for the TSC
> to warp to 0xffffffffffffffff coming out of S3 and then immediately
> wrap to zero. I believe it may be then possible for a TSC based udelay()
> to get stuck in the delay loop for possibly years/centuries/millennia.

Right, so to me the best solution sounds like having something similar to the
macros in the clock framework (or using those) to handle wraps in general,
such as time_after() or similar.
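
To illustrate the idea, a wrap-safe comparison in the style of time_after()
could look roughly like the sketch below. The names tsc_after() and
tsc_elapsed() are made up for this example and are not existing kernel
functions. The sketch only assumes 64-bit counter values and that the two
stamps are less than 2^63 cycles apart. It handles a counter that wraps
through zero, but it does not by itself repair a TSC that has jumped forward
by a huge amount.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns nonzero if stamp 'a'
 * is later than stamp 'b', even if the counter wrapped in between.
 * Same trick as time_after(): compare the signed difference with zero.
 * Only valid while the two stamps are less than 2^63 cycles apart.
 */
static inline int tsc_after(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Hypothetical helper: cycles elapsed from 'start' to 'now'.
 * Unsigned subtraction is modulo 2^64, so a single wrap through
 * zero still gives the right answer.
 */
static inline uint64_t tsc_elapsed(uint64_t start, uint64_t now)
{
	return now - start;
}
```

A delay loop written as "while (tsc_elapsed(start, now) < cycles)" rather than
"while (now < start + cycles)" should then survive a wrap through zero, as far
as I can tell.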

>> Also, the TSC warping issue seems more prevalent than first thought.
>> Originally, Colin believed the issue was confined to new Arrandale
>> processors, but we're seeing the issue hit Core 2 processors as well
>> [1].
>>
>> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/535077
>>
> 
>