[Maverick] [ti-omap4] SRU: A workaround for highmem issue on OMAP4 platform
Ricardo Salveti de Araujo
rsalveti at rsalveti.net
Fri Sep 24 06:04:10 UTC 2010
On Fri, Sep 24, 2010 at 12:45:30AM -0300, Ricardo Salveti de Araujo wrote:
> On Fri, Sep 24, 2010 at 11:21:01AM +0800, Bryan Wu wrote:
> > SRU Justification:
> >
> > Impact:
> > There is a critical highmem issue on our latest OMAP4 ES2.0 platform. When we
> > build the kernel package natively on the ES2.0 platform with mem=1G and highmem
> > enabled, gcc soon fails with a 'Bus Error', and the kernel logs 'Unhandled
> > imprecise external abort' oops messages. After that, the whole system becomes
> > very unstable.
> >
> > Fix: After some debugging, this issue is related to highmem. If we don't use
> > mem=1G (no memory in highmem), the corruption is gone. The workaround is
> > CONFIG_VMSPLIT_2G=y, which makes the user/kernel memory split 2G:2G instead
> > of the default 3G:1G. We can then use all 1G of memory on ES2.0 without
> > putting any of it in highmem. As a result, the issue is gone.
>
> Generally, when using highmem we can reproduce this issue quickly, within 10 to
> 15 minutes of starting the kernel build. Without highmem, I have already been
> able to build the whole kernel 3 times without hitting this issue.
>
> I just started another batch that will run at least 5 more times overnight,
> and I will reply tomorrow with the test results.
>
> Meanwhile we're debugging the highmem issue with Nicolas's help.
Unfortunately, something else doesn't seem to be right :-(
After building it successfully once with -j 2, I switched to -j 3, and after 10
minutes I got the following error:
Bad mode in data abort handler detected
Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
last sysfs file: /sys/devices/virtual/net/lo/type
Modules linked in: twl4030_pwrbutton sg usb_storage
CPU: 0 Not tainted (2.6.35.3+ #52)
PC is at 0xffff0010
LR is at 0x2abab896
pc : [<ffff0010>] lr : [<2abab896>] psr: 00000097
sp : bffcffb0 ip : 00000000 fp : 000b2de4
r10: 00000000 r9 : 000ac68c r8 : 00000050
r7 : 00000022 r6 : 004c7108 r5 : 0008cab0 r4 : ca797762
r3 : 004d6a68 r2 : 00000037 r1 : 0008cab0 r0 : 00000038
Flags: nzcv IRQs off FIQs on Mode ABT_32 ISA ARM Segment user
Control: 10c53c7d Table: bfffc04a DAC: 00000015
Process dhclient-script (pid: 6199, stack limit = 0xbffce2f8)
Stack: (0xbffcffb0 to 0xbffd0000)
ffa0: 00000038 0008cab0 00000037 004d6a68
ffc0: ca797762 0008cab0 004c7108 00000022 00000050 000ac68c 00000000 000b2de4
ffe0: 00000000 bffcffb0 2abab896 ffff0010 00000097 ffffffff 8102a021 8102a421
Code: ef9f0000 ea0000dd e59ff410 ea0000bb (ea00009a)
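As a side note, the register dump above can be cross-checked by hand. Below is a minimal sketch, assuming the standard ARM CPSR bit layout (mode in bits 0-4, T/F/I in bits 5/6/7) and the standard exception vector offsets. The helper names decode_psr and vector_name are made up for illustration and are not kernel functions:

```python
# Assumption: standard ARM CPSR mode encodings and vector table offsets.
MODES = {
    0x10: "USR", 0x11: "FIQ", 0x12: "IRQ", 0x13: "SVC",
    0x17: "ABT", 0x1B: "UND", 0x1F: "SYS",
}
VECTORS = {
    0x00: "reset", 0x04: "undefined instruction", 0x08: "supervisor call",
    0x0C: "prefetch abort", 0x10: "data abort", 0x18: "IRQ", 0x1C: "FIQ",
}

def decode_psr(psr):
    """Split a PSR value into the fields printed on the oops 'Flags:' line."""
    return {
        "mode": MODES.get(psr & 0x1F, "unknown"),
        "irqs_off": bool(psr & 0x80),   # I bit set means IRQs are masked
        "fiqs_off": bool(psr & 0x40),   # F bit set means FIQs are masked
        "thumb": bool(psr & 0x20),      # T bit clear means ARM ISA
    }

def vector_name(pc, base=0xFFFF0000):
    """Name the exception vector at pc, assuming high vectors at 'base'."""
    return VECTORS.get(pc - base, "not a vector entry")

print(decode_psr(0x00000097))
print(vector_name(0xFFFF0010))
```

Decoding psr 0x00000097 this way agrees with the Flags line (ABT mode, IRQs off, FIQs on, ARM ISA), and 0xffff0010 is the data abort entry in the vector page, which is consistent with the 'Bad mode in data abort handler' message.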
On the userspace side, I got a segfault in GCC instead of a bus error.
Kernel boot log showing that highmem is not being used:
Kernel command line: splash ro elevator=noop vram=32M root=/dev/sda5 fixrtc console=ttyO2,115200 mem=1G earlyprintk=ttyO2 omapdss.debug=1 loglevel=8 user_debug=16
PID hash table entries: 4096 (order: 2, 16384 bytes)
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
allocated 5242880 bytes of page_cgroup
please try 'cgroup_disable=memory' option if you don't want memory cgroups
Memory: 1024MB = 1024MB total
Memory: 975176k/975176k available, 73400k reserved, 0K highmem
Virtual kernel memory layout:
vector : 0xffff0000 - 0xffff1000 ( 4 kB)
fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB)
DMA : 0xffc00000 - 0xffe00000 ( 2 MB)
vmalloc : 0xc0800000 - 0xf8000000 ( 888 MB)
lowmem : 0x80000000 - 0xc0000000 (1024 MB)
pkmap : 0x7fe00000 - 0x80000000 ( 2 MB)
modules : 0x7f000000 - 0x7fe00000 ( 14 MB)
.init : 0x80008000 - 0x8003f000 ( 220 kB)
.text : 0x8003f000 - 0x806ef000 (6848 kB)
.data : 0x80728000 - 0x80798180 ( 449 kB)
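The numbers in this boot log can be checked with a little arithmetic. Below is a minimal sketch, with the addresses and sizes copied from the log above. The names layout and size_mib are only for illustration:

```python
KIB = 1024
MIB = 1024 * KIB

# (start, end) addresses copied from the "Virtual kernel memory layout" lines.
layout = {
    "vmalloc": (0xC0800000, 0xF8000000),
    "lowmem":  (0x80000000, 0xC0000000),
    "pkmap":   (0x7FE00000, 0x80000000),
    "modules": (0x7F000000, 0x7FE00000),
}

def size_mib(region):
    """Size of a layout region in MiB."""
    start, end = layout[region]
    return (end - start) // MIB

for name in layout:
    print(f"{name:8s} {size_mib(name):5d} MB")

# "Memory: 975176k/975176k available, 73400k reserved, 0K highmem"
available_kib = 975176
reserved_kib = 73400
total_mib = (available_kib + reserved_kib) * KIB // MIB
print("available + reserved =", total_mib, "MB")
```

The lowmem region spans exactly 1024 MB, so all of the mem=1G RAM is covered by lowmem, which matches the '0K highmem' line. Available plus reserved memory (975176k + 73400k) also adds up to exactly 1024 MB.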
I will also test it with only one CPU to see whether this could be related to
SMP issues.
For reference, I'm testing this on a Panda ES2 8-layer board.
Cheers,
--
Ricardo Salveti de Araujo