[Maverick] [ti-omap4] SRU: A workaround for highmem issue on OMAP4 platform
Nicolas Pitre
nicolas.pitre at canonical.com
Fri Sep 24 18:03:16 UTC 2010
On Fri, 24 Sep 2010, Ricardo Salveti de Araujo wrote:
> On Fri, Sep 24, 2010 at 12:45:30AM -0300, Ricardo Salveti de Araujo wrote:
> > On Fri, Sep 24, 2010 at 11:21:01AM +0800, Bryan Wu wrote:
> > > SRU Justification:
> > >
> > > Impact:
> > > There is a critical highmem issue on our latest OMAP4 ES2.0 platform. When we
> > > build the kernel package natively on the ES2.0 platform with mem=1G and highmem
> > > enabled, gcc soon fails with 'Bus Error' corruption, and the kernel logs
> > > 'Unhandled imprecise external abort' oops messages. After that the whole
> > > system becomes very unstable.
> > >
> > > Fix: After some debugging, this issue is related to highmem. If we don't use
> > > mem=1G (no memory in highmem), the corruption is gone. So the workaround is
> > > CONFIG_VMSPLIT_2G=y, which makes the user/kernel memory split 2G:2G instead
> > > of the default 3G:1G. We can still use all of the 1G memory on ES2.0 without
> > > putting any of it in highmem. As a result, the issue is gone.
> >
> > Generally when using highmem we can reproduce this issue quickly, within 10 to 15
> > minutes of starting the kernel build. So far, without highmem, I have been able to
> > build the whole kernel 3 times without hitting this issue.
> >
> > I just started another batch that will run at least 5 more times overnight,
> > and will reply tomorrow with the test results.
> >
> > Meanwhile we're debugging the highmem issue with Nicolas's help.

Regardless of the issue, I suggest that my patch fixing VMALLOC_END
be merged. Without it, the CONFIG_VMSPLIT_2G option is simply
useless.

> Unfortunately something else doesn't seem to be right :-(
> After building it successfully once with -j 2, I changed to -j 3 and after 10
> minutes I got the following error:
>
> Bad mode in data abort handler detected
> Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
Oooooh! Here's what the source code has to say about this:
/*
 * bad_mode handles the impossible case in the vectors.  If you see one of
 * these, then it's extremely serious, and could mean you have buggy hardware.
 * It never returns, and never tries to sync.  We hope that we can at least
 * dump out some state information...
 */
The only way this can happen is by taking an exception while still in the
exception entry path for a previous exception. Needless to say,
this should normally be _impossible_.
Nicolas