[Bug 1032550] Re: [multipath] failed to get sysfs information
Ronald Moesbergen
intercommit at gmail.com
Fri Dec 21 10:36:04 UTC 2012
Peter,
I got it to crash again, this time with a nice kernel dump. The dump can
be fetched here:
http://www.rmoesbergen.nl/linux-image-3.2.0-34-generic.0.crash.gz
The crash itself looked like this:
Dec 21 11:07:32 ealxs00161 kernel: [63272.392812] sd 4:0:1:1: emc: ALUA failover mode detected
Dec 21 11:07:32 ealxs00161 kernel: [63272.392820] sd 4:0:1:1: emc: at SP B Port 1 (owned, default SP B)
Dec 21 11:07:32 ealxs00161 kernel: [63272.393180] sd 3:0:0:1: emc: ALUA failover mode detected
Dec 21 11:07:32 ealxs00161 kernel: [63272.393187] sd 3:0:0:1: emc: at SP B Port 0 (owned, default SP B)
Dec 21 11:10:36 ealxs00161 kernel: [63455.641431] qla2xxx [0000:07:00.0]-500b:3: LOOP DOWN detected (2 3 0 0).
Dec 21 11:10:52 ealxs00161 multipathd: sdf: remove path (uevent)
Dec 21 11:10:52 ealxs00161 kernel: [63471.548255] rport-3:0-1: blocked FC remote port time out: removing target and saving binding
Dec 21 11:10:52 ealxs00161 kernel: [63471.676065] rport-3:0-0: blocked FC remote port time out: removing target and saving binding
Dec 21 11:11:08 ealxs00161 cimserver[2079]: Authentication failed for user=root.
Dec 21 11:11:10 ealxs00161 cimserver[2079]: Authentication failed for user=root.
Dec 21 11:13:28 ealxs00161 kernel: [63627.745648] INFO: task jbd2/dm-1-8:1530 blocked for more than 120 seconds.
Dec 21 11:13:28 ealxs00161 kernel: [63627.746025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 21 11:13:28 ealxs00161 kernel: [63627.756371] jbd2/dm-1-8 D ffff8803aa11a620 0 1530 2 0x00000000
Dec 21 11:13:28 ealxs00161 kernel: [63627.756380] ffff880416141ac0 0000000000000046 ffff880416141a60 ffff88042ee137c0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756388] ffff880416141fd8 ffff880416141fd8 ffff880416141fd8 00000000000137c0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756395] ffffffff81c0d020 ffff880415ef9700 ffff880416141a90 ffff88042ee14080
Dec 21 11:13:28 ealxs00161 kernel: [63627.756403] Call Trace:
Dec 21 11:13:28 ealxs00161 kernel: [63627.756416] [<ffffffff81117230>] ? __lock_page+0x70/0x70
Dec 21 11:13:28 ealxs00161 kernel: [63627.756431] [<ffffffff81659ebf>] schedule+0x3f/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756441] [<ffffffff81659f6f>] io_schedule+0x8f/0xd0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756451] [<ffffffff8111723e>] sleep_on_page+0xe/0x20
Dec 21 11:13:28 ealxs00161 kernel: [63627.756460] [<ffffffff8165a78f>] __wait_on_bit+0x5f/0x90
Dec 21 11:13:28 ealxs00161 kernel: [63627.756470] [<ffffffff811173a8>] wait_on_page_bit+0x78/0x80
Dec 21 11:13:28 ealxs00161 kernel: [63627.756481] [<ffffffff8108ad60>] ? autoremove_wake_function+0x40/0x40
Dec 21 11:13:28 ealxs00161 kernel: [63627.756492] [<ffffffff811174bc>] filemap_fdatawait_range+0x10c/0x1a0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756503] [<ffffffff8111757b>] filemap_fdatawait+0x2b/0x30
Dec 21 11:13:28 ealxs00161 kernel: [63627.756516] [<ffffffff81260ea0>] journal_finish_inode_data_buffers+0x70/0x170
Dec 21 11:13:28 ealxs00161 kernel: [63627.756528] [<ffffffff81261795>] jbd2_journal_commit_transaction+0x665/0x1240
Dec 21 11:13:28 ealxs00161 kernel: [63627.756538] [<ffffffff8108ad20>] ? add_wait_queue+0x60/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756548] [<ffffffff8126603b>] kjournald2+0xbb/0x220
Dec 21 11:13:28 ealxs00161 kernel: [63627.756557] [<ffffffff8108ad20>] ? add_wait_queue+0x60/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756566] [<ffffffff81265f80>] ? commit_timeout+0x10/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.756575] [<ffffffff8108a27c>] kthread+0x8c/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756587] [<ffffffff81666534>] kernel_thread_helper+0x4/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.756596] [<ffffffff8108a1f0>] ? flush_kthread_worker+0xa0/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756606] [<ffffffff81666530>] ? gs_change+0x13/0x13
Dec 21 11:13:28 ealxs00161 kernel: [63627.756612] Kernel panic - not syncing: hung_task: blocked tasks
Dec 21 11:13:28 ealxs00161 kernel: [63627.768425] Pid: 66, comm: khungtaskd Tainted: G W 3.2.0-34-generic #53-Ubuntu
Dec 21 11:13:28 ealxs00161 kernel: [63627.779691] Call Trace:
Dec 21 11:13:28 ealxs00161 kernel: [63627.790147] [<ffffffff81643128>] panic+0x91/0x1a4
Dec 21 11:13:28 ealxs00161 kernel: [63627.800888] [<ffffffff810d78f2>] check_hung_task+0xb2/0xc0
Dec 21 11:13:28 ealxs00161 kernel: [63627.811370] [<ffffffff810d7a1b>] check_hung_uninterruptible_tasks+0x11b/0x140
Dec 21 11:13:28 ealxs00161 kernel: [63627.821998] [<ffffffff810d7a40>] ? check_hung_uninterruptible_tasks+0x140/0x140
Dec 21 11:13:28 ealxs00161 kernel: [63627.833715] [<ffffffff810d7a8f>] watchdog+0x4f/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.844538] [<ffffffff8108a27c>] kthread+0x8c/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.855370] [<ffffffff81666534>] kernel_thread_helper+0x4/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.866367] [<ffffffff8108a1f0>] ? flush_kthread_worker+0xa0/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.877343] [<ffffffff81666530>] ? gs_change+0x13/0x13
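For anyone triaging a log like the one above, here is a minimal sketch (not part of the original report) that pulls the hung-task warnings out of syslog text. The function name and regular expression are illustrative assumptions; the pattern is modelled only on the "blocked for more than N seconds" line shown in this log. The final "Kernel panic - not syncing: hung_task: blocked tasks" line is what the kernel prints when the hung_task_panic sysctl is set, so the warning and the panic belong to the same event.

```python
import re

# Pattern modelled on the khungtaskd warning line in the log above (assumption:
# other kernel versions may word this message differently).
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (comm, pid, seconds) for each hung-task warning found."""
    hits = []
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            hits.append((m.group("comm"), int(m.group("pid")), int(m.group("secs"))))
    return hits


sample = (
    "Dec 21 11:13:28 ealxs00161 kernel: [63627.745648] "
    "INFO: task jbd2/dm-1-8:1530 blocked for more than 120 seconds."
)
print(find_hung_tasks(sample))
```

The reported PID can then be matched against the ps listing to confirm which device the blocked thread belongs to.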
Output of ps xa, just before the crash:
PID TTY STAT TIME COMMAND
1 ? Ss 0:02 /sbin/init
2 ? S 0:00 [kthreadd]
3 ? S 0:01 [ksoftirqd/0]
6 ? S 0:01 [migration/0]
7 ? S 0:00 [watchdog/0]
8 ? S 0:00 [migration/1]
10 ? S 0:00 [ksoftirqd/1]
12 ? S 0:00 [watchdog/1]
13 ? S 0:01 [migration/2]
15 ? S 0:00 [ksoftirqd/2]
16 ? S 0:00 [watchdog/2]
17 ? S 0:00 [migration/3]
19 ? S 0:00 [ksoftirqd/3]
20 ? S 0:00 [watchdog/3]
21 ? S 0:00 [migration/4]
23 ? S 0:00 [ksoftirqd/4]
24 ? S 0:00 [watchdog/4]
25 ? S 0:00 [migration/5]
27 ? S 0:00 [ksoftirqd/5]
28 ? S 0:00 [watchdog/5]
29 ? S 0:00 [migration/6]
30 ? S 0:00 [kworker/6:0]
31 ? S 0:00 [ksoftirqd/6]
32 ? S 0:00 [watchdog/6]
33 ? S 0:00 [migration/7]
35 ? S 0:00 [ksoftirqd/7]
36 ? S 0:00 [watchdog/7]
37 ? S 0:00 [migration/8]
38 ? S 0:00 [kworker/8:0]
39 ? S 0:00 [ksoftirqd/8]
40 ? S 0:00 [watchdog/8]
41 ? S 0:00 [migration/9]
42 ? S 0:00 [kworker/9:0]
43 ? S 0:00 [ksoftirqd/9]
44 ? S 0:00 [watchdog/9]
45 ? S 0:00 [migration/10]
47 ? S 0:00 [ksoftirqd/10]
48 ? S 0:00 [watchdog/10]
49 ? S 0:00 [migration/11]
51 ? S 0:00 [ksoftirqd/11]
52 ? S 0:00 [watchdog/11]
53 ? S< 0:00 [cpuset]
54 ? S< 0:00 [khelper]
55 ? S 0:00 [kdevtmpfs]
56 ? S< 0:00 [netns]
58 ? S 0:00 [sync_supers]
59 ? S 0:00 [bdi-default]
60 ? S< 0:00 [kintegrityd]
61 ? S< 0:00 [kblockd]
62 ? S< 0:00 [ata_sff]
63 ? S 0:00 [khubd]
64 ? S< 0:00 [md]
66 ? S 0:00 [khungtaskd]
67 ? S 0:14 [kswapd0]
68 ? SN 0:00 [ksmd]
69 ? SN 0:00 [khugepaged]
70 ? S 0:00 [fsnotify_mark]
71 ? S 0:00 [ecryptfs-kthrea]
72 ? S< 0:00 [crypto]
80 ? S< 0:00 [kthrotld]
81 ? S 0:00 [scsi_eh_0]
82 ? S 0:00 [scsi_eh_1]
104 ? S< 0:00 [devfreq_wq]
265 ? S 0:00 [scsi_eh_2]
267 ? S 0:00 [hpsa]
349 ? S 0:00 [kworker/6:1]
352 ? S 0:00 [kworker/9:1]
353 ? S 0:00 [kworker/10:1]
354 ? S 0:00 [kworker/4:1]
357 ? S< 0:00 [kdmflush]
365 ? S 0:00 [jbd2/sda1-8]
366 ? S< 0:00 [ext4-dio-unwrit]
458 ? S 0:00 upstart-udev-bridge --daemon
461 ? Ss 0:00 /sbin/udevd --daemon
547 ? S< 0:00 [kmpathd]
548 ? S< 0:00 [kmpath_handlerd]
626 ? S< 0:00 [edac-poller]
660 ? S 0:00 [scsi_eh_3]
702 ? S< 0:00 [kpsmoused]
861 ? S< 0:00 [qla2xxx_3_dpc]
862 ? Ss 0:00 rpcbind -w
864 ? S< 0:00 [scsi_wq_3]
877 ? S 0:00 [scsi_eh_4]
879 ? Ss 0:00 rpc.statd -L
888 ? S< 0:00 [rpciod]
891 ? S< 0:00 [nfsiod]
893 ? S 0:00 upstart-socket-bridge --daemon
895 ? S< 0:00 [qla2xxx_4_dpc]
896 ? S< 0:00 [scsi_wq_4]
902 ? S< 0:00 [bond0]
1054 ? S< 0:00 [kdmflush]
1109 ? S< 0:00 [kdmflush]
1490 ? S 0:06 [jbd2/dm-2-8]
1491 ? S< 0:00 [ext4-dio-unwrit]
1530 ? D 0:42 [jbd2/dm-1-8]
1531 ? S< 0:00 [ext4-dio-unwrit]
1573 ? Ss 0:00 /usr/sbin/sshd -D
1576 ? Ss 0:00 rpc.idmapd
1580 ? Ss 0:00 dbus-daemon --system --fork --activation=upstart
1603 ? Sl 0:02 rsyslogd -c5
1677 tty4 Ss+ 0:00 /sbin/getty -8 38400 tty4
1684 tty5 Ss+ 0:00 /sbin/getty -8 38400 tty5
1693 tty2 Ss+ 0:00 /sbin/getty -8 38400 tty2
1697 tty3 Ss+ 0:00 /sbin/getty -8 38400 tty3
1703 tty6 Ss+ 0:00 /sbin/getty -8 38400 tty6
1710 ? Ss 0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
1712 ? Ss 0:00 cron
1715 ? Ss 0:00 atd
1727 ? S 0:00 /usr/sbin/zabbix_agentd
1729 ? Ss 0:20 /usr/sbin/irqbalance
1733 ? Ssl 0:00 whoopsie
1738 ? Ssl 2:51 /usr/sbin/mysqld
1745 ? S 0:35 /usr/sbin/zabbix_agentd
1746 ? S 0:10 /usr/sbin/zabbix_agentd
1747 ? S 0:10 /usr/sbin/zabbix_agentd
1748 ? S 0:11 /usr/sbin/zabbix_agentd
1749 ? S 0:12 /usr/sbin/zabbix_agentd
1750 ? S 0:11 /usr/sbin/zabbix_agentd
1751 ? S 0:01 /usr/sbin/zabbix_agentd
1919 ? S 0:00 [kworker/5:2]
2004 ? S 0:00 [kworker/8:2]
2010 ? S 0:00 [kworker/11:2]
2011 ? S 0:00 [kworker/11:3]
2024 ? Sl 0:01 /opt/Unisphere/bin/hostagent -f /etc/Unisphere/agent.config
2046 ? SLl 0:08 /sbin/multipathd
2079 ? SLsl 0:19 /opt/microsoft/scx/bin/scxcimserver
2177 tty1 Ss+ 0:00 /sbin/getty -8 38400 tty1
2179 ? S 0:00 [flush-8:0]
2180 ? D 1:05 [flush-252:1]
2181 ? S 0:00 [flush-252:2]
2257 ? Ssl 0:28 /opt/microsoft/scx/bin/scxcimprovagt 0 9 12 root SCXCoreProviderModule
2422 ? Ss 0:02 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 109:115
2625 ? Ss 0:00 sshd: ronaldm [priv]
2824 ? S 0:00 sshd: ronaldm@pts/0
2825 pts/0 Ss 0:00 -bash
2924 pts/0 S 0:00 sudo -i
2929 pts/0 S 0:00 -bash
3194 ? Ssl 0:01 /opt/microsoft/scx/bin/scxcimprovagt 0 8 14 scoma SCXUserCoreProviderModule
3207 ? S 0:17 [kworker/1:3]
3614 ? Ss 0:00 sshd: ronaldm [priv]
3753 ? S 0:00 sshd: ronaldm@pts/2
3754 pts/2 Ss 0:00 -bash
3868 pts/2 S 0:00 sudo -i
3874 pts/2 S 0:00 -bash
4925 ? S 0:00 [kworker/7:3]
5248 ? S 0:00 [kworker/u:2]
5251 ? S 0:00 [kworker/u:3]
5348 ? S 0:00 [kworker/10:2]
5353 ? S 0:00 [kworker/1:1]
5361 ? S 0:00 [kworker/0:1]
5382 ? S 0:00 [kworker/3:0]
5383 ? S 0:00 [kworker/3:3]
5384 ? S 0:00 [kworker/5:3]
5387 ? S 0:00 [kworker/0:5]
5391 ? S 0:00 [kworker/1:2]
5691 ? S 0:00 [kworker/7:4]
6088 ? S 0:00 [kworker/1:4]
6221 ? S 0:00 [kworker/4:2]
6260 ? S 0:00 [kworker/2:0]
6261 ? S 0:00 [kworker/2:4]
6521 pts/0 D+ 0:17 bonnie++ -d . -u root
6655 ? S 0:00 /sbin/udevd --daemon
6656 ? S 0:00 /sbin/udevd --daemon
6910 ? S 0:00 [kworker/1:0]
6915 pts/2 R+ 0:00 ps xa
Acceptatie - DB01 (root@ealxs00161):~# ps xa | grep multi
2046 ? SLl 0:08 /sbin/multipathd
6917 pts/2 S+ 0:00 grep --color=auto multi
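The interesting rows in a listing this long are the ones in uninterruptible sleep (STAT beginning with D). As a rough sketch, assuming the five-column `ps xa` layout shown above, the following filters them out; the function name is made up for illustration.

```python
def uninterruptible(ps_text):
    """Return (pid, stat, command) for `ps xa` rows whose STAT starts with D."""
    rows = []
    for line in ps_text.splitlines():
        parts = line.split(None, 4)  # PID TTY STAT TIME COMMAND
        if len(parts) < 5 or not parts[0].isdigit():
            continue  # skip the header and anything that is not a process row
        pid, _tty, stat, _time, command = parts
        if stat.startswith("D"):
            rows.append((int(pid), stat, command))
    return rows


# A few rows copied from the listing above.
sample = """\
PID TTY STAT TIME COMMAND
1490 ? S 0:06 [jbd2/dm-2-8]
1530 ? D 0:42 [jbd2/dm-1-8]
2180 ? D 1:05 [flush-252:1]
6521 pts/0 D+ 0:17 bonnie++ -d . -u root
"""
for row in uninterruptible(sample):
    print(row)
```

In this listing the D-state entries are jbd2/dm-1-8, flush-252:1 and bonnie++. Device 252:1 is LUN-LOGGING according to the dmsetup output further down, so all three appear to be waiting on the same multipath device.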
Also, just before the crash:
Acceptatie - DB01 (root@ealxs00161):~# multipath -ll
LUN-DATABASE (36006016061e02e003cf1aca4ae07e211) dm-2 DGC,VRAID
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 4:0:1:1 sdi 8:128 active ready running
| `- #:#:#:# - #:# active faulty running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 4:0:0:1 sde 8:64 active ready running
  `- #:#:#:# - #:# active faulty running
LUN-LOGGING (36006016061e02e000286c1adae07e211) dm-1 DGC,VRAID
size=20G features='0' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 4:0:0:0 sdd 8:48 active ready running
| `- #:#:#:# - #:# active faulty running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 4:0:1:0 sdh 8:112 active ready running
  `- #:#:#:# - #:# active faulty running
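To summarise output like this across many maps, a small sketch along the following lines can count paths per checker state. It assumes the `multipath -ll` layout shown above (map header line, then tree-drawn path lines with six fields); the function and pattern names are illustrative, and other multipath-tools versions may format the output differently.

```python
import re

# Map header, e.g. "LUN-LOGGING (3600...) dm-1 DGC,VRAID"
MAP_RE = re.compile(r"^(\S+) \((\w+)\) (dm-\d+) ")
# Path line, e.g. "| |- 4:0:0:0 sdd 8:48 active ready running".
# Path-group lines ("|-+- policy=...") do not match because of the "+".
PATH_RE = re.compile(r"^[ |]*[|`]- (\S+) (\S+) (\S+) (\S+) (\S+) (\S+)")


def path_states(mp_text):
    """Map each multipath name to a dict of checker state -> path count."""
    result = {}
    current = None
    for line in mp_text.splitlines():
        m = MAP_RE.match(line)
        if m:
            current = m.group(1)
            result[current] = {}
            continue
        p = PATH_RE.match(line)
        if p and current is not None:
            state = p.group(5)  # fifth field: ready, faulty, ...
            result[current][state] = result[current].get(state, 0) + 1
    return result


sample = """\
LUN-LOGGING (36006016061e02e000286c1adae07e211) dm-1 DGC,VRAID
size=20G features='0' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 4:0:0:0 sdd 8:48 active ready running
| `- #:#:#:# - #:# active faulty running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 4:0:1:0 sdh 8:112 active ready running
  `- #:#:#:# - #:# active faulty running
"""
print(path_states(sample))
```

For the output above this reports two ready and two faulty paths per LUN, which matches the loss of everything behind host 3 after the LOOP DOWN event.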
Output of dmsetup table -v before starting the tests:
Name: vg-swap
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 2
Event number: 0
Major, minor: 252, 0
Number of targets: 1
UUID: LVM-BySGZfHLAZg250K7UjTxYBStGjTdkb2CE8b7q7HMxBUtJso72BPYfnAcLpxixYP4
0 3997696 linear 8:2 512
Name: LUN-DATABASE
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 4
Major, minor: 252, 2
Number of targets: 1
UUID: mpath-36006016061e02e003cf1aca4ae07e211
0 419430400 multipath 1 queue_if_no_path 1 emc 2 1 round-robin 0 2 1 8:128 1000 8:32 1000 round-robin 0 2 1 8:64 1000 8:96 1000
Name: LUN-LOGGING
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 4
Major, minor: 252, 1
Number of targets: 1
UUID: mpath-36006016061e02e000286c1adae07e211
0 41943040 multipath 1 queue_if_no_path 1 emc 2 1 round-robin 0 2 1 8:48 1000 8:80 1000 round-robin 0 2 1 8:112 1000 8:16 1000
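The multipath table lines are dense, so here is a hedged sketch that splits one into its features, hardware handler and path groups. It assumes the usual dm-multipath table layout (feature count and features, handler count and handler arguments, group count and initial group, then per group a selector with its arguments followed by the paths). The helper names are invented for illustration, and `sd_name` relies on the conventional numbering for block major 8 (16 minors per disk), which only covers sda through sdp.

```python
def sd_name(majmin):
    """Translate e.g. '8:128' to 'sdi'; return None outside major 8 whole disks."""
    major, minor = (int(x) for x in majmin.split(":"))
    if major != 8 or minor % 16 != 0:
        return None
    return "sd" + chr(ord("a") + minor // 16)


def parse_multipath_table(line):
    """Parse one `dmsetup table` line for a multipath target into a dict."""
    tok = line.split()
    if tok[2] != "multipath":
        raise ValueError("not a multipath target: " + tok[2])
    i = 3
    nfeat = int(tok[i])
    features = tok[i + 1:i + 1 + nfeat]
    i += 1 + nfeat
    nhw = int(tok[i])
    hwhandler = tok[i + 1:i + 1 + nhw]
    i += 1 + nhw
    ngroups, init_group = int(tok[i]), int(tok[i + 1])
    i += 2
    groups = []
    for _ in range(ngroups):
        nselargs = int(tok[i + 1])      # tok[i] is the selector name
        i += 2 + nselargs
        npaths, npathargs = int(tok[i]), int(tok[i + 1])
        i += 2
        paths = []
        for _ in range(npaths):
            paths.append(tok[i])        # major:minor of the path device
            i += 1 + npathargs          # skip per-path args (1000 here)
        groups.append(paths)
    return {
        "size_gib": int(tok[1]) * 512 / 2**30,  # length is in 512-byte sectors
        "features": features,
        "hwhandler": hwhandler,
        "initial_group": init_group,
        "groups": groups,
    }


line = ("0 419430400 multipath 1 queue_if_no_path 1 emc 2 1 "
        "round-robin 0 2 1 8:128 1000 8:32 1000 "
        "round-robin 0 2 1 8:64 1000 8:96 1000")
info = parse_multipath_table(line)
print(info["size_gib"], info["features"], info["hwhandler"])
print([[sd_name(p) for p in g] for g in info["groups"]])
```

For LUN-DATABASE this yields groups (sdi, sdc) and (sde, sdg), which lines up with the lsscsi listing: sdc and sdg sit behind host 3, the HBA that reported LOOP DOWN.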
Output of lsscsi -lv before starting the tests:
[2:0:0:0] storage HP P420i 3.04 -
state=running queue_depth=1020 scsi_level=6 type=12 device_blocked=0 timeout=0
dir: /sys/bus/scsi/devices/2:0:0:0 [/sys/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:0:0/2:0:0:0]
[2:0:0:1] disk HP LOGICAL VOLUME 3.04 /dev/sda
state=running queue_depth=1020 scsi_level=6 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/2:0:0:1 [/sys/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:0:0/2:0:0:1]
[3:0:0:0] disk DGC VRAID 0531 /dev/sdb
state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/3:0:0:0 [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-0/target3:0:0/3:0:0:0]
[3:0:0:1] disk DGC VRAID 0531 /dev/sdc
state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/3:0:0:1 [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-0/target3:0:0/3:0:0:1]
[3:0:1:0] disk DGC VRAID 0531 /dev/sdf
state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/3:0:1:0 [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-1/target3:0:1/3:0:1:0]
[3:0:1:1] disk DGC VRAID 0531 /dev/sdg
state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/3:0:1:1 [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-1/target3:0:1/3:0:1:1]
[4:0:0:0] disk DGC VRAID 0531 /dev/sdd
state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/4:0:0:0 [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-0/target4:0:0/4:0:0:0]
[4:0:0:1] disk DGC VRAID 0531 /dev/sde
state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/4:0:0:1 [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-0/target4:0:0/4:0:0:1]
[4:0:1:0] disk DGC VRAID 0531 /dev/sdh
state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/4:0:1:0 [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-1/target4:0:1/4:0:1:0]
[4:0:1:1] disk DGC VRAID 0531 /dev/sdi
state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
dir: /sys/bus/scsi/devices/4:0:1:1 [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-1/target4:0:1/4:0:1:1]
I hope this helps...
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to multipath-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1032550
Title:
[multipath] failed to get sysfs information
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1032550/+subscriptions