LKCD

Joseph Salisbury joseph.salisbury at canonical.com
Wed Dec 22 22:25:23 UTC 2010


On 12/22/2010 04:24 PM, Peter M. Petrakis wrote:
>
>
> On 12/22/2010 04:04 PM, Joseph Salisbury wrote:
>> On 12/22/2010 02:21 PM, Peter M. Petrakis wrote:
>>> Hi,
>>>
>>> On 12/21/2010 09:45 AM, Joseph Salisbury wrote:
>>>> Hello,
>>>>
>>>> I'm attempting to use linux-crashdump to debug an issue.  I've been
>>>> following the documentation at:
>>>>
>>>> https://wiki.ubuntu.com/Kernel/CrashdumpRecipe
>>>>
>>>> The exact steps I've done are:
>>>> Installed linux-crashdump:
>>>> sudo apt-get install linux-crashdump
>>>> Rebooted system to enable crashdump.
>>>>
>>>> My test to force a crash:
>>>> echo 1 | sudo tee /proc/sys/kernel/panic_on_oops
>>>> echo c | sudo tee /proc/sysrq-trigger
>>>>
>>>> However, I never get anything in /var/crash.  In fact the /var/crash
>>>> directory didn't exist until I created it.  I've tried this on Lucid,
>>>> Maverick and Natty with the same results.
>>>>
>>>> Has anyone successfully used linux-crashdump recently, or can anyone
>>>> suggest another tool like kdump?  Maybe I'm missing a step?
>>>
>>> No that's about right, it should work, but the crashdump package
>>> isn't very robust. Nor is it nearly as configurable as the RHEL
>>> variant, you have to customize it yourself. Judging from the bug
>>> list it doesn't appear to be getting much attention either.
>>>
>>> Yes we do use it, and it does work, but it doesn't always work
>>> out of the box. So a few things:
>>>
>>> 1) I've had issues using kexec in VirtualBox in the past, if you're
>>> trying to sandbox it there, try bare metal instead.
>>>
>>> 2) Can you do a "simple" kexec and succeed? See the man page on how
>>> to prepare it. Just take what you're booting now, load that, and
>>> kexec. If it works it'll be like a really fast reboot :)
>>>
>>> 3) kdump *is* linux-crashdump. The old, driver-specific dumping
>>> methods, like diskdump, are gone.
>>>
>>> 4) Not all drivers take kindly to being thrown through kdump/kexec.
>>> A lot of them you don't need. So if you have a serial console, start
>>> taking note of all the peripherals that give you problems, and compile
>>> a new kernel just for the purposes of kdump without those things enabled.
>>>
>>> 5) kexec/kdump doesn't always work, but with a solid, reproducible test
>>> case, probability will usually grant you a readable dump :)
>>>
>>>> Thanks,
>>>>
>>>> Joe
>>>
>>>
>>> Peter
>>>
>>
>> Thanks for the feedback, Peter.
>>
>> 1) Yes, I've tried bare metal as well as KVM VMs
>>
>> 2) I performed the following kexec, and it did do a really fast reboot :-)
>>
>> /sbin/kexec --command-line="BOOT_IMAGE=/boot/vmlinuz-2.6.37-11-generic
>> root=UUID=16a635bc-7110-4c13-97bf-1a3bb5931a96 ro vt.handoff=7  quiet
>> splash irqpoll maxcpus=1 nousb"
>> --initrd=/boot/initrd.img-2.6.37-11-generic
>> /boot/vmlinuz-2.6.37-11-generic
>
> Good!
>
>> So seems like kexec is working, but it is not triggered when I do an
>> "alt+sysrq c" or "echo c | sudo tee /proc/sysrq-trigger". In either case
>> the system just freezes.
>
> Just freezes... Damn that's interesting :) What does your /proc/cmdline
> look like before you issue the panic?

The following is /proc/cmdline before I initiated the panic:
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-2.6.37-11-generic
root=UUID=16a635bc-7110-4c13-97bf-1a3bb5931a96 ro vt.handoff=7
crashkernel=384M-2G:64M,2G-:128M quiet splash
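
In case it helps anyone reproduce this, here is a rough sketch of how
the two pieces could be checked: that crashkernel= is on the command
line, and that a crash kernel has actually been loaded.  The function
names are made up for illustration, and I'm assuming
/sys/kernel/kexec_crash_loaded is present on this kernel (it should read
1 once a panic kernel has been loaded with "kexec -p").

```shell
# Print the crashkernel= value found in a kernel command line file.
# $1: file to read (defaults to /proc/cmdline)
crashkernel_arg() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^crashkernel=//p'
}

# Succeed only if the status file says a crash kernel is loaded.
# $1: status file (defaults to /sys/kernel/kexec_crash_loaded, holding 0 or 1)
crash_kernel_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = "1" ]
}
```

Run without arguments, crashkernel_arg prints the reservation spec from
the running system, and crash_kernel_loaded can be used as
"crash_kernel_loaded && echo loaded || echo NOT loaded".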


>
> It doesn't take much from the kernel arg perspective, for example:
>
> (ignore all the casper stuff)
> $ cat /proc/cmdline
> initrd=/casper/initrd.lz boot=casper live-config toram persistent noprompt LIVEMEDIA=/dev/disk/by-label/DEBIAN_LIVE console=tty0 console=ttyS0,115200n8 apparmor=0 crashkernel=256M username=ubuntu hostname=ubuntu exposedroot  BOOT_IMAGE=/casper/vmlinuz
>
> The crashkernel= is really all you need; the init service, however, should have primed
> the kexec kernel + initrd too. There's no magic window (to my knowledge) of "when" you
> must prime the kdump kernel, because the memory has already been reserved. So, for example,
> you could disable the kdump init script and set things up manually, using the script as a
> guide, to check that the init script is doing the right thing.

I can try disabling the kdump init script and see if I can set things up 
manually like you suggest.
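
Roughly what I have in mind for the manual setup, as an untested sketch:
the helper below only prints a "kexec -p" command so it can be reviewed
before being run as root.  kdump_load_cmd is a name I made up, the
/boot paths follow the layout from my earlier kexec test, and the extra
arguments (irqpoll maxcpus=1 nousb) are simply copied from that test;
the real init script may well pass something different.

```shell
# Print (do not run) a kexec command that would load a crash kernel.
# $1: kernel release (defaults to the running kernel)
# $2: root= value for the crash kernel (placeholder by default)
kdump_load_cmd() {
    rel="${1:-$(uname -r)}"
    root="${2:-UUID=CHANGE-ME}"   # hypothetical placeholder, must be replaced
    printf 'kexec -p /boot/vmlinuz-%s --initrd=/boot/initrd.img-%s --append="root=%s ro irqpoll maxcpus=1 nousb"\n' \
        "$rel" "$rel" "$root"
}
```

The printed line can then be pasted after "sudo", and
/sys/kernel/kexec_crash_loaded checked afterwards, before triggering the
panic again.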

>
>   Also,
>
> - CPU make, model, and #
> - Current kernel

I'm running this on a cheap netbook, but I can also reproduce this on a
server if you prefer.  The netbook has a single CPU: Intel(R)
Atom(TM) CPU N455   @ 1.66GHz

I'm also running the latest Natty kernel:
$ uname -r
2.6.37-11-generic


>
> If you try booting with "nosmp" and then trigger the panic does it
> still hang?

Yes, I added nosmp to the end of GRUB_CMDLINE_LINUX_DEFAULT and ran 
update-grub.  The contents of /proc/cmdline changed to:

BOOT_IMAGE=/boot/vmlinuz-2.6.37-11-generic
root=UUID=16a635bc-7110-4c13-97bf-1a3bb5931a96 ro vt.handoff=7
crashkernel=384M-2G:64M,2G-:128M quiet splash nosmp
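
For reference, the edit amounts to something like the following.  This
is only a sketch: grub_with_option is a hypothetical helper, it prints
the edited file instead of modifying it in place, and it assumes the
option contains no "/" or "&" characters (they would confuse sed).

```shell
# Print a grub defaults file with one option appended to
# GRUB_CMDLINE_LINUX_DEFAULT; the file itself is left untouched.
# $1: option to append (e.g. nosmp)
# $2: file to read (defaults to /etc/default/grub)
grub_with_option() {
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $1\"/" \
        "${2:-/etc/default/grub}"
}
```

The output can be redirected to a scratch file, reviewed, copied over
/etc/default/grub, and followed by "sudo update-grub" and a reboot.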

>
> Perhaps you could send a magic sysrq key and dump the current process list?
>

I took a screenshot of the console after I triggered the panic.
Hopefully it is readable and/or useful.  I also attached the output of
"alt+sysrq t".

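One thing that may be worth ruling out on the keyboard side is the
SysRq mask.  The sketch below assumes the bitmask meaning described in
the kernel's sysrq documentation (1 enables everything, bit 8 enables
debugging dumps of processes such as "t"); sysrq_dump_allowed is a name
I invented.  As far as I know, writing to /proc/sysrq-trigger as root is
not restricted by this mask, so it only matters for the alt+sysrq keys.

```shell
# Succeed if keyboard SysRq process dumps (e.g. alt+sysrq t) are allowed.
# $1: file holding the sysrq bitmask (defaults to /proc/sys/kernel/sysrq)
sysrq_dump_allowed() {
    mask=$(cat "${1:-/proc/sys/kernel/sysrq}" 2>/dev/null) || return 1
    # 1 means "all functions enabled"; otherwise test the 0x8 (dump) bit.
    [ "$mask" -eq 1 ] || [ $(( mask & 8 )) -ne 0 ]
}
```

If it fails, "echo 1 | sudo tee /proc/sys/kernel/sysrq" enables all
SysRq functions until the next reboot.
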

>> 3) Thanks for the info about kdump.
>
> No problem. The more crashdump users the better.
>
>> 4) Thanks for the suggestions, I will try this.
>>
>> Thanks for the help, Peter!  I appreciate you taking the time, and
>> sending me a response.
>
> :)
>
>> Joe
>
> Peter
>

Thanks again!

Joe
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2010-12-22 16.47.09.jpg
Type: image/jpeg
Size: 1592041 bytes
Desc: not available
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20101222/85df23cf/attachment.jpg>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: altsysrq_t_output.txt
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20101222/85df23cf/attachment.txt>

