[Bug 1920991] Re: Ubuntu 20.04 - NVMe/IB I/O error detected while manually resetting controller

Jennifer Duong 1920991 at bugs.launchpad.net
Tue May 18 15:16:21 UTC 2021


I spoke with a few of our controller firmware developers, and it sounds
like the controllers are returning the proper status whenever a single
controller is reset. However, it appears that Ubuntu may not have
retried the I/O on the alternate path.
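
For anyone who wants to rule out a host configuration issue before
digging into the retry behavior, here is a minimal sketch that reads the
nvme_core "multipath" module parameter from sysfs. The function name is
made up for illustration, and the sketch assumes the kernel was built
with native NVMe multipath support, so that the parameter file exists
and reads "Y" or "N". It only confirms that native multipathing is
switched on; it says nothing about whether I/O is actually retried.

```python
from pathlib import Path

# Module parameter exposed by nvme_core on kernels built with native
# NVMe multipath support (assumption: the file is absent otherwise).
MULTIPATH_PARAM = Path("/sys/module/nvme_core/parameters/multipath")


def native_multipath_enabled(param_path=MULTIPATH_PARAM):
    """Return True or False for "Y"/"N", or None if the file is unreadable."""
    try:
        value = Path(param_path).read_text().strip()
    except OSError:
        # Parameter missing or unreadable: the state cannot be determined.
        return None
    return value == "Y"


if __name__ == "__main__":
    state = native_multipath_enabled()
    if state is None:
        print("nvme_core multipath parameter not found")
    else:
        print("native NVMe multipath enabled:", state)
```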

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to nvme-cli in Ubuntu.
https://bugs.launchpad.net/bugs/1920991

Title:
  Ubuntu 20.04 - NVMe/IB I/O error detected while manually resetting
  controller

Status in nvme-cli package in Ubuntu:
  Incomplete

Bug description:
  On all four of my Ubuntu 20.04 hosts, an I/O error is detected almost
  immediately after I manually reset my E-Series storage controller. I
  am currently running Ubuntu 20.04 with kernel-5.4.0-67-generic,
  rdma-core-28.0-1ubuntu1, nvme-cli-1.9-1ubuntu0.1, and native NVMe
  multipathing enabled. These messages appear to coincide with when my
  test fails:

  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.616408] blk_update_request: I/O error, dev nvme0c0n12, sector 289440 op 0x1:(WRITE) flags 0x400c800 phys_seg 6 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.616433] blk_update_request: I/O error, dev nvme0c0n12, sector 291488 op 0x1:(WRITE) flags 0x4008800 phys_seg 134 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.617137] blk_update_request: I/O error, dev nvme0c0n12, sector 295048 op 0x1:(WRITE) flags 0x4008800 phys_seg 87 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.617184] blk_update_request: I/O error, dev nvme0c0n12, sector 293000 op 0x1:(WRITE) flags 0x400c800 phys_seg 180 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.617624] blk_update_request: I/O error, dev nvme0c0n12, sector 298608 op 0x1:(WRITE) flags 0x4008800 phys_seg 47 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.617678] blk_update_request: I/O error, dev nvme0c0n12, sector 296560 op 0x1:(WRITE) flags 0x400c800 phys_seg 62 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.618070] blk_update_request: I/O error, dev nvme0c0n12, sector 302160 op 0x1:(WRITE) flags 0x4008800 phys_seg 24 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.618084] blk_update_request: I/O error, dev nvme0c0n12, sector 300112 op 0x1:(WRITE) flags 0x400c800 phys_seg 47 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.618497] blk_update_request: I/O error, dev nvme0c0n12, sector 305712 op 0x1:(WRITE) flags 0x4008800 phys_seg 25 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.618521] blk_update_request: I/O error, dev nvme0c0n12, sector 303664 op 0x1:(WRITE) flags 0x400c800 phys_seg 63 prio class 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.640763] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.641099] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.641305] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.641317] ldm_validate_partition_table(): Disk read failed.
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.641551] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.641751] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.641955] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.642160] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.642172] Dev nvme0n12: unable to read RDB block 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.642394] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.642600] Buffer I/O error on dev nvme0n12, logical block 3, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.642802] Buffer I/O error on dev nvme0n12, logical block 0, async page read
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.643015] nvme0n12: unable to read partition table
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.653495] ldm_validate_partition_table(): Disk read failed.
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.654188] Dev nvme0n20: unable to read RDB block 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.654850] nvme0n20: unable to read partition table
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.665151] ldm_validate_partition_table(): Disk read failed.
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.665673] Dev nvme0n126: unable to read RDB block 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.666194] nvme0n126: unable to read partition table
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.685662] ldm_validate_partition_table(): Disk read failed.
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.686504] Dev nvme0n124: unable to read RDB block 0
  Mar 23 12:23:58 ICTM1605S01H1 kernel: [ 1232.687187] nvme0n124: unable to read partition table
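
The messages above can be tallied with a short script to see which
devices reported errors. This is only a rough triage sketch: the
function name and the regular expressions are my own and match just the
two message formats shown in this log. Under native multipathing, names
of the form nvmeXcYnZ normally refer to a single path and nvmeXnZ to the
multipath node, so counting the two separately may help show whether
errors stayed on one path or reached the multipath device.

```python
import re
from collections import Counter

# Matches lines such as:
#   blk_update_request: I/O error, dev nvme0c0n12, sector 289440 op 0x1:(WRITE) ...
BLK_ERROR = re.compile(
    r"blk_update_request: I/O error, dev (?P<dev>[^,\s]+), "
    r"sector (?P<sector>\d+) op (?P<opcode>0x[0-9a-fA-F]+):\((?P<op>\w+)\)"
)

# Matches lines such as:
#   Buffer I/O error on dev nvme0n12, logical block 0, async page read
BUFFER_ERROR = re.compile(r"Buffer I/O error on dev (?P<dev>\w+),")


def summarize_io_errors(lines):
    """Count block-layer errors per (device, operation) and buffer errors per device."""
    block_errors = Counter()
    buffer_errors = Counter()
    for line in lines:
        match = BLK_ERROR.search(line)
        if match:
            block_errors[(match["dev"], match["op"])] += 1
            continue
        match = BUFFER_ERROR.search(line)
        if match:
            buffer_errors[match["dev"]] += 1
    return block_errors, buffer_errors


if __name__ == "__main__":
    import sys

    # Usage: pass a saved log on stdin, e.g. a copy of the kernel log.
    block, buffered = summarize_io_errors(sys.stdin)
    for (dev, op), count in sorted(block.items()):
        print(f"{dev} {op}: {count} block-layer error(s)")
    for dev, count in sorted(buffered.items()):
        print(f"{dev}: {count} buffer I/O error(s)")
```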

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvme-cli/+bug/1920991/+subscriptions
