edubuntu thin client server crashed
Gavin McCullagh
gmccullagh at gmail.com
Mon Apr 30 12:50:22 UTC 2007
Hi,
I'm not sure what's to blame for this. It could be fuse/ltspfs, but I'm
not certain. We're running a standard Edubuntu Edgy installation as an
LTSP server.
I got a logcheck email today with the following in it:
Apr 30 11:48:43 medlycott kernel: [18274019.612000] BUG: soft lockup detected on CPU#1!
Apr 30 11:48:43 medlycott kernel: [18274019.612000] <c01491cf> softlockup_tick+0x9f/0xf0 <c012bee1> update_process_times+0x31/0x80
Apr 30 11:48:43 medlycott kernel: [18274019.612000] <c0114d13> smp_apic_timer_interrupt+0x53/0x60 <c010413c> apic_timer_interrupt+0x1c/0x30
Apr 30 11:48:43 medlycott kernel: [18274019.612000] <c014c25c> do_generic_mapping_read+0xcc/0x590 <c014d118> __generic_file_aio_read+0xf8/0x270
Apr 30 11:48:43 medlycott kernel: [18274019.612000] <c014b8a0> file_read_actor+0x0/0xf0 <c014e673> generic_file_read+0xa3/0xd0
Apr 30 11:48:43 medlycott kernel: [18274019.612000] <c0136180> autoremove_wake_function+0x0/0x50 <c012bd4a> run_timer_softirq+0x3a/0x1a0
Apr 30 11:48:43 medlycott kernel: [18274019.612000] <c016af6c> vfs_read+0xbc/0x180 <c014e5d0> generic_file_read+0x0/0xd0
Apr 30 11:48:43 medlycott kernel: [18274019.612000] <c016b4e1> sys_read+0x41/0x70 <c0102fbb> sysenter_past_esp+0x54/0x79
Apr 30 11:48:44 medlycott kernel: [18274019.880000] BUG: soft lockup detected on CPU#3!
Apr 30 11:48:44 medlycott kernel: [18274019.880000] <c01491cf> softlockup_tick+0x9f/0xf0 <c012bee1> update_process_times+0x31/0x80
Apr 30 11:48:44 medlycott kernel: [18274019.880000] <c0114d13> smp_apic_timer_interrupt+0x53/0x60 <c010413c> apic_timer_interrupt+0x1c/0x30
Apr 30 11:48:44 medlycott kernel: [18274019.880000] <c0173f29> generic_fillattr+0x69/0xb0 <f8a1da2f> fuse_getattr+0x5f/0x80 [fuse]
Apr 30 11:48:44 medlycott kernel: [18274019.880000] <f8a1d9d0> fuse_getattr+0x0/0x80 [fuse] <c01743ac> vfs_getattr+0x4c/0xd0
Apr 30 11:48:44 medlycott kernel: [18274019.880000] <c0174464> vfs_lstat_fd+0x34/0x50 <c01744cf> sys_lstat64+0xf/0x30
Apr 30 11:48:44 medlycott kernel: [18274019.880000] <c0102fbb> sysenter_past_esp+0x54/0x79
Apr 30 11:48:44 medlycott kernel: [18274020.332000] BUG: soft lockup detected on CPU#2!
Apr 30 11:48:44 medlycott kernel: [18274020.332000] <c01491cf> softlockup_tick+0x9f/0xf0 <c012bee1> update_process_times+0x31/0x80
Apr 30 11:48:44 medlycott kernel: [18274020.332000] <c0114d13> smp_apic_timer_interrupt+0x53/0x60 <c010413c> apic_timer_interrupt+0x1c/0x30
Apr 30 11:48:44 medlycott kernel: [18274020.332000] <c0173f38> generic_fillattr+0x78/0xb0 <f8a1da2f> fuse_getattr+0x5f/0x80 [fuse]
Apr 30 11:48:44 medlycott kernel: [18274020.332000] <f8a1d9d0> fuse_getattr+0x0/0x80 [fuse] <c01743ac> vfs_getattr+0x4c/0xd0
Apr 30 11:48:44 medlycott kernel: [18274020.332000] <c0174464> vfs_lstat_fd+0x34/0x50 <c011bde0> default_wake_function+0x0/0x10
Apr 30 11:48:44 medlycott kernel: [18274020.332000] <c01744cf> sys_lstat64+0xf/0x30 <c013a738> sys_futex+0x88/0x100
Apr 30 11:48:44 medlycott kernel: [18274020.332000] <c0102fbb> sysenter_past_esp+0x54/0x79
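A rough sketch for summarising traces like the ones above: it counts soft
lockup reports per CPU and counts stack frames that the kernel tagged with a
module name in square brackets (such as [fuse]). The function name
summarize_lockups is made up for illustration, and the script assumes the log
format shown in this message; it is not a diagnosis tool, only a quick way to
see whether one module keeps appearing in the traces.

```shell
#!/bin/sh
# Summarise "soft lockup" reports from a kernel log read on stdin.
# summarize_lockups is a hypothetical helper name, not an existing tool.
summarize_lockups() {
    awk '
        /BUG: soft lockup detected on CPU#/ {
            # Keep only the digits that follow "CPU#".
            cpu = $0
            sub(/.*CPU#/, "", cpu)
            sub(/[^0-9].*$/, "", cpu)
            print "lockup cpu" cpu
        }
        {
            # Module tags look like "[fuse]"; kernel timestamps such as
            # "[18274019.612000]" start with a digit and are skipped.
            line = $0
            while (match(line, /\[[a-z][a-z_0-9]*\]/)) {
                print "module " substr(line, RSTART + 1, RLENGTH - 2)
                line = substr(line, RSTART + RLENGTH)
            }
        }
    ' | sort | uniq -c | awk '{ print $2, $3, $1 }'
}
```

For example, `summarize_lockups < /var/log/kern.log` (the log path is an
assumption and may differ on your system) prints lines of the form
"lockup cpuN COUNT" and "module NAME COUNT".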
Worried, I logged in to see what was going on. According to top, a user's
nautilus session was going nuts and using 300% CPU. I tried to kill that
pid (eventually with -9), but that didn't work. That user had logged out,
so I looked at their process list and tried to kill a few of their
remaining processes:
gavinmc@medlycott:~$ ps aux | grep rbuc
rbuckley 3432 0.0 0.0 5884 3464 ? S 11:28 0:00 /usr/lib/libgconf2-4/gconfd-2 5
rbuckley 3547 139 0.8 93376 30700 ? RNsl 11:28 53:21 nautilus --no-default-window --sm-client-id default2
rbuckley 3558 0.0 0.0 105936 3264 ? Ssl 11:28 0:00 /usr/lib/bonobo-activation/bonobo-activation-server --ac-activate --ior-output-fd=16
rbuckley 3650 0.0 0.0 2340 860 ? S 11:28 0:00 /usr/lib/nautilus-cd-burner/mapping-daemon
rbuckley 14105 0.0 0.2 20804 7568 ? S 11:55 0:00 nautilus --no-desktop file:///home/teachers/rbuckley/Desktop
rbuckley 14147 0.0 0.2 20800 7572 ? S 11:55 0:00 nautilus --no-desktop file:///home/teachers/rbuckley/Desktop
rbuckley 14220 0.0 0.2 20800 7564 ? S 11:55 0:00 nautilus --no-desktop computer:
rbuckley 14676 0.0 0.2 20804 7568 ? S 11:56 0:00 nautilus --no-desktop file:///media/rbuckley/usbdisk-sda1
rbuckley 14740 0.0 0.2 20800 7572 ? S 11:57 0:00 nautilus --no-desktop file:///media/rbuckley/usbdisk-sda1
rbuckley 14822 0.0 0.2 20800 7564 ? S 11:57 0:00 nautilus --no-desktop file:///media/rbuckley/usbdisk-sda1
rbuckley 14862 0.0 0.2 20804 7576 ? S 11:57 0:00 nautilus --no-desktop file:///media/rbuckley/usbdisk-sda1
rbuckley 14896 0.0 0.2 20804 7568 ? S 11:57 0:00 nautilus --no-desktop file:///home/teachers/rbuckley/Desktop
rbuckley 15068 0.0 0.2 20800 7564 ? S 11:57 0:00 nautilus --no-desktop file:///media/rbuckley/usbdisk-sda1
rbuckley 15437 0.0 0.2 20804 7568 ? S 11:58 0:00 nautilus --no-desktop file:///home/teachers/rbuckley/Desktop
gavinmc 18777 0.0 0.0 2800 752 pts/0 R+ 12:06 0:00 grep rbuc
gavinmc@medlycott:~$ sudo kill 3432 3547 3558 3650
gavinmc@medlycott:~$ ps aux | grep rbuc
rbuckley 3547 141 0.8 93376 30700 ? RNsl 11:28 54:16 nautilus --no-default-window --sm-client-id default2
rbuckley 18891 0.0 0.0 4644 2032 ? S 12:06 0:00 /usr/lib/libgconf2-4/gconfd-2 16
gavinmc 18893 0.0 0.0 2796 752 pts/0 R+ 12:06 0:00 grep rbuc
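When a process survives kill -9 like pid 3547 here, it can help to check its
scheduler state and kernel wait channel. Signals are only acted on when a
process returns from kernel mode, so a process in state D (uninterruptible
sleep) or one spinning inside a system call will appear to ignore SIGKILL.
The sketch below is one way to look, assuming a Linux /proc filesystem; the
function name proc_state and its optional second argument (an alternative
proc root) are hypothetical conveniences, not part of any existing tool.

```shell
#!/bin/sh
# Print the scheduler state and kernel wait channel of a process.
# Usage: proc_state PID [PROC_ROOT]   (PROC_ROOT defaults to /proc)
# proc_state is a hypothetical helper name used for illustration only.
proc_state() {
    pid=$1
    root=${2:-/proc}
    if [ ! -r "$root/$pid/status" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # The "State:" line looks like "State:  R (running)"; keep the letter.
    state=$(awk '/^State:/ { print $2 }' "$root/$pid/status")
    # wchan names the kernel function the process is waiting in, if any.
    # It may be unreadable or uninformative on some systems.
    wchan=$(cat "$root/$pid/wchan" 2>/dev/null)
    echo "pid=$pid state=$state wchan=${wchan:-unknown}"
}
```

In this case one would run `proc_state 3547`. The ps output above already
shows the state as "RNsl" (running, low priority, session leader,
multi-threaded), which would be consistent with the process spinning in the
kernel rather than sleeping, though that alone does not identify the cause.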
At this point I ran top again and saw that nautilus was still running at
300%. The next command I typed (a small awk script) froze midway, after
which I could no longer ssh into the machine (though it still responded to
pings). I then had to get someone on-site to reboot the machine from the
console.
It has come back up fine, but I'm a little concerned that there's a nasty
bug here which allows a user to crash the system.
I don't suppose anyone here knows the cause? Should I be trying to track
this down, and if so, which project should I approach (Edubuntu, LTSP,
nautilus, Linux)?
Gavin