Bug 263555: A fix for the root cause of e1000e issue

Tim Gardner tcanonical at tpi.com
Thu Oct 16 13:57:04 UTC 2008

Zhao, Yingying wrote:
> For the bug https://bugs.launchpad.net/ubuntu/+source/linux/+bug/263555
> Here's a patch which has passed Intel's testing and with this patch that
> issue can't be reproduced again now. It looks to be a work-around and
> with the .28 a fix for the root cause of the problem. 
> Please apply this patch for Intrepid.
> - Yingying
>> Date: Wed, 15 Oct 2008 18:21:44 -0400 (EDT)
>> From: Steven Rostedt <rostedt at goodmis.org>
>> To: LKML <linux-kernel at vger.kernel.org>, stable at kernel.org
>> cc: Linus Torvalds <torvalds at linux-foundation.org>,
>>         Andrew Morton <akpm at linux-foundation.org>,
>>         Arjan van de Ven <arjan at infradead.org>, gregkh at suse.de,
>>         jesse.brandeburg at intel.com, Thomas Gleixner
> <tglx at linutronix.de>,
>>         Ingo Molnar <mingo at elte.hu>
>> Subject: [PATCH -stable] disable CONFIG_DYNAMIC_FTRACE due to possible
> memory
>>         corruption on module unload
>> While debugging the e1000e corruption bug with Intel, we discovered
>> today that the dynamic ftrace code in mainline is the likely source of
>> this bug.
>> For the stable kernel we are providing the only viable fix patch:
> labeling
>> CONFIG_DYNAMIC_FTRACE as broken. (see the patch below)
>> We will follow up with a backport patch that contains the fixes. But
> since
>> the fixes are not a one liner, the safest approach for now is to
>> disable the code in question.
>> The cause of the bug is due to the way the current code in mainline
>> handles dynamic ftrace.  When dynamic ftrace is turned on, it also
>> turns on CONFIG_FTRACE which enables the -pg config in gcc that places
>> a call to mcount at every function call. With just CONFIG_FTRACE this
>> causes a noticeable overhead.  CONFIG_DYNAMIC_FTRACE works to ease
> this
>> overhead by dynamically updating the mcount call sites into nops.
>> The problem arises when we trace functions and modules are unloaded.
>> The first time a function is called, it will call mcount and the
> mcount
>> call will call ftrace_record_ip. This records the calling site and
>> stores it in a preallocated hash table. Later on a daemon will
>> wake up and call kstop_machine and convert any mcount callers into
>> nops.
>> The evolution of this code first tried to do this without the
> kstop_machine
>> and used cmpxchg to update the callers as they were called. But I
>> was informed that this is dangerous to do on SMP machines if another
>> CPU is running that same code. The solution was to do this with
>> kstop_machine.
>> We still used cmpxchg to test if the code that we are modifying is
>> indeed code that we expect to be before updating it - as a final
>> line of defense.
>> But on 32bit machines, ioremapped memory and modules share the same
>> address space. When a module would load its code into memory and
> execute
>> some code, that would register the function.
>> On module unload, ftrace incorrectly did not zap these functions from
>> its hash (this was the bug). The cmpxchg could have saved us in most
>> cases (via luck) - but with ioremap-ed memory that was exactly the
> wrong
>> thing to do - the results of cmpxchg on device memory are undefined.
>> (and will likely result in a write)
>> The pending .28 ftrace tree does not have this bug anymore, as a
> general push
>> towards more robustness of code patching, this is done differently: we
> do not
>> use cmpxchg and we do a WARN_ON and turn the tracer off if anything
> deviates
>> from its expected state. Furthermore, patch sites are statically
> identified
>> during build time so there's no runtime discovery of dynamic code
> areas
>> anymore, and no room for code unmaps to cause the hash to become out
> of date.
>> We believe the fragility of dynamic patching has been sufficiently
>> addressed in the development code via the static patching method, but
> further
>> suggestions to make it more robust are welcome.
>> Signed-off-by: Steven Rostedt <srostedt at goodmis.org>
>> Acked-by: Ingo Molnar <mingo at elte.hu>
>> Acked-by: Thomas Gleixner <tglx at linutronix.de>
>> ---
>>  kernel/trace/Kconfig |    3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> Index: linux-compile.git/kernel/trace/Kconfig
>> ===================================================================
>> --- linux-compile.git.orig/kernel/trace/Kconfig 2008-10-02
> 10:18:49.000000000
>> -0400
>> +++ linux-compile.git/kernel/trace/Kconfig      2008-10-15
> 17:29:34.000000000
>> -0400
>> @@ -103,7 +103,8 @@ config CONTEXT_SWITCH_TRACER
>>           all switching of tasks.
>> -       bool "enable/disable ftrace tracepoints dynamically"
>> +       bool "enable/disable ftrace tracepoints dynamically (BROKEN)"
>> +       depends on BROKEN
>>         depends on FTRACE
>>         depends on HAVE_DYNAMIC_FTRACE
>>         default y
>> --
>> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
>> the body of a message to majordomo at vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
> Best Regards, 
> Yingying 


I saw this on LKML in the 2.6.27.y stable tree announcement this
morning, so I've cherry-picked it from there. Thanks for your note.

Tim Gardner tim.gardner at canonical.com

More information about the kernel-team mailing list