[Bug 1317833] Re: DRBD 8.4 kernel crash when using same resource for 2 minors

Bram Klein Gunnewiek 1317833 at bugs.launchpad.net
Thu May 15 09:12:31 UTC 2014


I can confirm this bug and have some additional information. When the
minor is specified as a device name (e.g. /dev/drbdX), drbdsetup
correctly rejects the second request with an error:

root@xxxx:~# drbdsetup new-minor r0 /dev/drbd0 0
root@xxxx:~# drbdsetup new-minor r0 /dev/drbd0 1
r0: Failure: (162) Invalid configuration request
additional info from kernel:
minor exists as different volume

When only the bare minor number is given (root@xxxx:~# drbdsetup
new-minor r0 1 0), drbdsetup hangs and the kernel logs an oops:

May 15 11:04:06 testnode1 kernel: [265432.796294] request: minor=0, volume=1; but that minor is volume 0 in r0
May 15 11:04:32 testnode1 kernel: [265458.326351] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
May 15 11:04:32 testnode1 kernel: [265458.328289] IP: [<ffffffff81351b38>] blk_throtl_drain+0x28/0x130
May 15 11:04:32 testnode1 kernel: [265458.329938] PGD d767b067 PUD d7411067 PMD 0 
May 15 11:04:32 testnode1 kernel: [265458.331067] Oops: 0000 [#1] SMP 
May 15 11:04:32 testnode1 kernel: [265458.331889] Modules linked in: drbd dm_snapshot lru_cache libcrc32c nfsd bridge auth_rpcgss stp nfs_acl llc nfs lockd sunrpc fscache gpio_ich coretemp kvm_intel kvm psmouse serio_raw joydev lpc_ich lp parport ioatdma dca i7core_edac edac_core mac_hid hid_generic usbhid hid e1000e floppy ptp pps_core [last unloaded: drbd]
May 15 11:04:32 testnode1 kernel: [265458.339787] CPU: 4 PID: 22159 Comm: drbdsetup Not tainted 3.13.0-24-generic #47-Ubuntu
May 15 11:04:32 testnode1 kernel: [265458.341696] Hardware name: Supermicro X8STi/X8STi, BIOS 2.0        09/17/10  
May 15 11:04:32 testnode1 kernel: [265458.343456] task: ffff88019260dfc0 ti: ffff88009b8b2000 task.ti: ffff88009b8b2000
May 15 11:04:32 testnode1 kernel: [265458.345260] RIP: 0010:[<ffffffff81351b38>]  [<ffffffff81351b38>] blk_throtl_drain+0x28/0x130
May 15 11:04:32 testnode1 kernel: [265458.347357] RSP: 0018:ffff88009b8b3b10  EFLAGS: 00010046
May 15 11:04:32 testnode1 kernel: [265458.348636] RAX: 0000000000000000 RBX: ffff88019258e148 RCX: 000000000000b562
May 15 11:04:32 testnode1 kernel: [265458.350641] RDX: 000000000000000e RSI: 0000000000000001 RDI: 0000000000000000
May 15 11:04:32 testnode1 kernel: [265458.352361] RBP: ffff88009b8b3b28 R08: 00000000000172a0 R09: ffff88019fc972a0
May 15 11:04:32 testnode1 kernel: [265458.354080] R10: ffffea0000d90000 R11: ffffffff81340f10 R12: ffff88019258e148
May 15 11:04:32 testnode1 kernel: [265458.355800] R13: ffff8800d7747300 R14: ffff88019258e798 R15: ffff880192180000
May 15 11:04:32 testnode1 kernel: [265458.357521] FS:  00007f18ff858740(0000) GS:ffff88019fc80000(0000) knlGS:0000000000000000
May 15 11:04:32 testnode1 kernel: [265458.359618] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 15 11:04:32 testnode1 kernel: [265458.361043] CR2: 0000000000000028 CR3: 00000000d5c44000 CR4: 00000000000007e0
May 15 11:04:32 testnode1 kernel: [265458.362797] Stack:
May 15 11:04:32 testnode1 kernel: [265458.363282]  ffff88019258e148 0000000000000001 ffff88019258e7a8 ffff88009b8b3b38
May 15 11:04:32 testnode1 kernel: [265458.365152]  ffffffff8134eb7e ffff88009b8b3b68 ffffffff81332c84 ffff88019258e148
May 15 11:04:32 testnode1 kernel: [265458.367060]  ffff88019436629c ffff88019258e948 ffff880194366020 ffff88009b8b3b90
May 15 11:04:32 testnode1 kernel: [265458.368930] Call Trace:
May 15 11:04:32 testnode1 kernel: [265458.369654]  [<ffffffff8134eb7e>] blkcg_drain_queue+0xe/0x10
May 15 11:04:32 testnode1 kernel: [265458.371239]  [<ffffffff81332c84>] __blk_drain_queue+0x74/0x180
May 15 11:04:32 testnode1 kernel: [265458.372648]  [<ffffffff81332eed>] blk_cleanup_queue+0x8d/0x180
May 15 11:04:32 testnode1 kernel: [265458.374064]  [<ffffffffa0354b35>] conn_new_minor+0x2d5/0x3c0 [drbd]
May 15 11:04:32 testnode1 kernel: [265458.375616]  [<ffffffffa035d787>] drbd_adm_add_minor+0xb7/0xc0 [drbd]
May 15 11:04:32 testnode1 kernel: [265458.377173]  [<ffffffff8164978d>] genl_family_rcv_msg+0x18d/0x370
May 15 11:04:32 testnode1 kernel: [265458.378683]  [<ffffffff81649970>] ? genl_family_rcv_msg+0x370/0x370
May 15 11:04:32 testnode1 kernel: [265458.380504]  [<ffffffff81649a01>] genl_rcv_msg+0x91/0xd0
May 15 11:04:32 testnode1 kernel: [265458.381785]  [<ffffffff81647a89>] netlink_rcv_skb+0xa9/0xc0
May 15 11:04:32 testnode1 kernel: [265458.383168]  [<ffffffff81647f88>] genl_rcv+0x28/0x40
May 15 11:04:32 testnode1 kernel: [265458.384366]  [<ffffffff816470b5>] netlink_unicast+0xd5/0x1b0
May 15 11:04:32 testnode1 kernel: [265458.385772]  [<ffffffff8164748f>] netlink_sendmsg+0x2ff/0x740
May 15 11:04:32 testnode1 kernel: [265458.387204]  [<ffffffff816016fe>] sock_aio_write+0xfe/0x130
May 15 11:04:32 testnode1 kernel: [265458.388548]  [<ffffffff8164664d>] ? netlink_insert+0x14d/0x240
May 15 11:04:32 testnode1 kernel: [265458.390115]  [<ffffffff811b8daa>] do_sync_write+0x5a/0x90
May 15 11:04:32 testnode1 kernel: [265458.471767]  [<ffffffff811b962d>] vfs_write+0x1ad/0x1f0
May 15 11:04:32 testnode1 kernel: [265458.550441]  [<ffffffff811b9f69>] SyS_write+0x49/0xa0
May 15 11:04:32 testnode1 kernel: [265458.632074]  [<ffffffff817266bf>] tracesys+0xe1/0xe6
May 15 11:04:32 testnode1 kernel: [265458.713392] Code: ff 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 49 89 fc 53 4c 8b af 70 08 00 00 49 8b 85 a0 00 00 00 31 ff 48 8b 80 c8 05 00 00 <48> 8b 70 28 e8 9f 92 d9 ff 48 85 c0 48 89 c3 74 61 0f 1f 80 00 
May 15 11:04:32 testnode1 kernel: [265458.881683] RIP  [<ffffffff81351b38>] blk_throtl_drain+0x28/0x130
May 15 11:04:32 testnode1 kernel: [265458.965565]  RSP <ffff88009b8b3b10>
May 15 11:04:32 testnode1 kernel: [265459.048030] CR2: 0000000000000028
May 15 11:04:32 testnode1 kernel: [265459.351201] ---[ end trace 0e04c0062e5baad0 ]---
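
The call path in the trace above runs from drbd_adm_add_minor through
conn_new_minor into blk_cleanup_queue. A short script can pull that path
out of a saved log. This is only a minimal sketch, assuming the syslog
format shown above; the names call_trace_symbols, FRAME_RE and SAMPLE
are illustrative and not part of any DRBD or kernel tooling.

```python
import re

# Matches stack-frame entries such as
#   "[<ffffffffa0354b35>] conn_new_minor+0x2d5/0x3c0 [drbd]"
# A leading "? " marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def call_trace_symbols(log_lines):
    """Return (symbol, module) for each reliable frame after 'Call Trace:'.

    module is None when the line carries no "[module]" suffix.
    """
    frames = []
    in_trace = False
    for line in log_lines:
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first non-frame line ends the trace
        if match.group("unreliable"):
            continue  # skip "? symbol" entries
        frames.append((match.group("symbol"), match.group("module")))
    return frames


# A few lines copied from the log above.
SAMPLE = [
    "May 15 11:04:32 testnode1 kernel: [265458.368930] Call Trace:",
    "May 15 11:04:32 testnode1 kernel: [265458.369654]  [<ffffffff8134eb7e>] blkcg_drain_queue+0xe/0x10",
    "May 15 11:04:32 testnode1 kernel: [265458.371239]  [<ffffffff81332c84>] __blk_drain_queue+0x74/0x180",
    "May 15 11:04:32 testnode1 kernel: [265458.372648]  [<ffffffff81332eed>] blk_cleanup_queue+0x8d/0x180",
    "May 15 11:04:32 testnode1 kernel: [265458.374064]  [<ffffffffa0354b35>] conn_new_minor+0x2d5/0x3c0 [drbd]",
    "May 15 11:04:32 testnode1 kernel: [265458.375616]  [<ffffffffa035d787>] drbd_adm_add_minor+0xb7/0xc0 [drbd]",
    "May 15 11:04:32 testnode1 kernel: [265458.377173]  [<ffffffff8164978d>] genl_family_rcv_msg+0x18d/0x370",
    "May 15 11:04:32 testnode1 kernel: [265458.378683]  [<ffffffff81649970>] ? genl_family_rcv_msg+0x370/0x370",
]

if __name__ == "__main__":
    for symbol, module in call_trace_symbols(SAMPLE):
        print(symbol, module or "-")
```

Frames are listed innermost first, so the two entries tagged with the
drbd module are the last DRBD code on the stack before control passes
to the block layer.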

The system is not completely dead at this point, but the remaining drbd
processes cannot be killed:

root     21842  0.0  0.0      0     0 ?        S<   11:00   0:00  \_ [drbd-reissue]
root     22007  0.0  0.0      0     0 ?        S<   11:03   0:00  \_ [drbd0_submit]
root     22203  0.0  0.0   4440   364 pts/0    D+   11:04   0:00  |       \_ drbdsetup new-minor r0 1 0
root     22545  0.0  0.0  11744   920 pts/3    S+   11:12   0:00          \_ grep --color=auto drbd
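
The D+ in the STAT column of the drbdsetup row means uninterruptible
sleep, which is consistent with the process not responding to kill. A
listing like the one above can be filtered for such rows with a few
lines of Python. This is a minimal sketch, assuming the usual "ps aux"
column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND);
the names uninterruptible and PS_SAMPLE are illustrative.

```python
def uninterruptible(ps_lines):
    """Return (pid, command) for rows whose STAT column starts with 'D'."""
    stuck = []
    for line in ps_lines:
        # 10 splits leave the whole command, spaces included, in fields[10].
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        pid, stat, command = fields[1], fields[7], fields[10]
        if stat.startswith("D"):
            # Strip the tree-drawing characters added by the forest view.
            stuck.append((int(pid), command.lstrip("|\\_ ")))
    return stuck


# Rows copied from the listing above (raw strings because of the "\_").
PS_SAMPLE = [
    r"root     21842  0.0  0.0      0     0 ?        S<   11:00   0:00  \_ [drbd-reissue]",
    r"root     22007  0.0  0.0      0     0 ?        S<   11:03   0:00  \_ [drbd0_submit]",
    r"root     22203  0.0  0.0   4440   364 pts/0    D+   11:04   0:00  |       \_ drbdsetup new-minor r0 1 0",
    r"root     22545  0.0  0.0  11744   920 pts/3    S+   11:12   0:00          \_ grep --color=auto drbd",
]

if __name__ == "__main__":
    for pid, command in uninterruptible(PS_SAMPLE):
        print(pid, command)
```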

To use DRBD again we need to reboot the system.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to drbd8 in Ubuntu.
https://bugs.launchpad.net/bugs/1317833
