Major issues with Xen and Hardy

Mon Dec 8 22:54:08 UTC 2008

Have you tried upgrading the kernel to an official Ubuntu version?  The
bug that is mentioned in that howto doesn't exist anymore.  I have been
using Xen for a while now with the kernel out of the repos, without the
patch, and haven't had an issue.  Try that; it may fix your problems.
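
Before and after changing kernels, it may help to confirm which kernel
each domain is actually running. The following is only a rough sketch:
the helper name is_xen_kernel is made up for illustration, and the
"-xen" suffix check is an assumption based on the kernel names quoted
in this thread (2.6.24-16-xen, 2.6.24-19-xen), not an official test.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a kernel release string ends in "-xen",
# as the kernels named in this thread do. This is a naming heuristic only.
is_xen_kernel() {
    case "$1" in
        *-xen) return 0 ;;
        *)     return 1 ;;
    esac
}

# Report the kernel release this domain is running right now.
running="$(uname -r)"
if is_xen_kernel "$running"; then
    echo "running a -xen kernel: $running"
else
    echo "not a -xen kernel: $running"
fi

# On Debian/Ubuntu systems, list installed kernel image packages so the
# installed versions can be compared with the running one.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l 'linux-image*' 2>/dev/null | grep '^ii' || true
fi
```

Run it once in the dom0 and once inside each domU; a domU may be booting
a different kernel than the one you just installed in the dom0.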
-- 
V/R
Thomas E. Maleshafske
tmaleshafske at maleshafske.com
Helping People Take Control Over Their Computer!
http://www.skishosting.com
Instant message me at any one of the following:
jabber=gravity1187 at binaryfreedom.info
aim=gravity1187
yahoo=gravity_1187
IRC=gravity1187 on freenode.net


On Mon, 2008-12-08 at 15:23 -0700, Eujon Sellers wrote:
> I hope this is the right area for this. I've been having some major
> issues running Xen on Ubuntu Server 8.04 recently. I previously posted
> this on the forums without getting any responses, but here is what that
> post consisted of:
> 
> - Running Ubuntu 8.04 server as the dom0
> - Currently using the 2.6.24-19-xen kernel from the Ubuntu archives
> - domU in question is also Ubuntu 8.04 built using "xen-create-image"
> 
> About a week ago, the website that is run on this problem domU
> wouldn't respond and I couldn't ssh into the domU. I logged into my
> dom0 and attached myself to the domU console and found this error
> scrolling over and over:
> [259458.110263] __find_get_block_slow() failed. block=1360, b_blocknr=32148
> [259458.110265] b_state=0x00000029, b_size=4096
> [259458.110269] device blocksize: 4096
> 
> I tried to figure out what the cause was (a reboot was the only fix),
> but the only thing I could come up with is that it happened around
> 3am. As an FYI, this happened when I was running the same kernel from
> this howto: http://www.howtoforge.com/ubuntu-8.04-server-install-xen-from-ubuntu-repositories.
> So I never figured out what the cause was and assumed it was a one-off
> thing. Three days later, though, I ran into a similar situation (still
> with this other kernel). The domU wouldn't respond so I logged into
> the dom0 and attached myself to the domU console again. This time I
> found this error message:
> [231747.850967]  =======================
> [231759.665823] BUG: soft lockup - CPU#0 stuck for 11s! [apache2:1782]
> [231759.665826]
> [231759.665828] Pid: 1782, comm: apache2 Tainted: G      D (2.6.24-16-xen #2)
> [231759.665831] EIP: 0061:[<c0327dc7>] EFLAGS: 00000286 CPU: 0
> [231759.665835] EIP is at _spin_lock+0x7/0x10
> [231759.665837] EAX: cf802c18 EBX: cf802bd4 ECX: c1671ec0 EDX: 00000000
> [231759.665840] ESI: c1671ec0 EDI: 00000000 EBP: c0f15ddc ESP: c0f15c4c
> [231759.665843]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
> [231759.665846] CR0: 8005003b CR2: b7388e80 CR3: 01741000 CR4: 00000660
> [231759.665849] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [231759.665852] DR6: ffff0ff0 DR7: 00000400
> [231759.665854]  [<c01a730d>] try_to_free_buffers+0x2d/0x90
> [231759.665859]  [<c0167c95>] shrink_page_list+0x4b5/0x5f0
> [231759.665864]  [<c01777e3>] page_check_address+0x1d3/0x3d0
> [231759.665869]  [<c0166e20>] isolate_lru_pages+0x50/0x1c0
> [231759.665874]  [<c0167eeb>] shrink_inactive_list+0x11b/0x3b0
> [231759.665879]  [<c0168224>] shrink_zone+0xa4/0x100
> [231759.665883]  [<c0168d62>] try_to_free_pages+0x152/0x250
> [231759.665888]  [<c016305b>] __alloc_pages+0x12b/0x390
> [231759.665893]  [<c0171739>] handle_mm_fault+0x7f9/0x1330
> [231759.665899]  [<c0107deb>] local_clock+0x3b/0x80
> [231759.665903]  [<c010824b>] sched_clock+0x1b/0x60
> [231759.665908]  [<c03299ac>] do_page_fault+0x3bc/0xee0
> [231759.665912]  [<c011c6d0>] update_curr+0x70/0x110
> [231759.665916]  [<c03263d4>] schedule+0x244/0x600
> [231759.665921]  [<c0175824>] do_mmap_pgoff+0x314/0x330
> [231759.665925]  [<c0109ef5>] sys_mmap2+0x65/0xd0
> [231759.665930]  [<c03295f0>] do_page_fault+0x0/0xee0
> [231759.665934]  [<c0328285>] error_code+0x35/0x40
> [231759.665940]  =======================
> 
> Again, looking through the logs I couldn't find anything to set this
> off, just that it happened between midnight and 4am. At that point I
> suspected the kernel and updated to the 2.6.24-19-xen kernel from
> the Ubuntu repos. It worked for three days again, until this
> morning, when I noticed the site was not responding AGAIN. I logged into
> the console from the dom0 and found this error:
> [266281.773639] BUG: soft lockup - CPU#0 stuck for 11s! [webalizer:11884]
> 
> As of today, other domUs are starting to see similar problems. I lost
> one domU last night to the "CPU#0 stuck for 11s" error, and a
> different domU locked up this morning with the same issue. In all of
> these cases there are no entries in any of the log files showing a
> problem that sets these errors off. In addition, the dom0 never seems
> to have a problem; it's only the domUs that are having issues.
> 
> I've searched Launchpad and found multiple bugs that reference these
> various errors, but they are usually still open or don't offer a solution.
> Am I better off trying to build Xen and a Xen-enabled kernel from
> source? One bug report mentioned using the Debian
> linux-image-2.6.26-1-xen-686 kernel package, but in my testing that
> doesn't work because the domUs won't boot up with it (an error about
> "XENBUS: Waiting for devices to initialise" holds everything up). I'd
> really like to figure this out and keep Ubuntu, but the more I look at
> this the more it looks like I'll be heading back to CentOS.
> 
> If you've made it this far, thanks for reading everything. I appreciate
> any help or suggestions people can offer.
>