[Bug 2054872] Re: Cannot change IRQ 70 affinity: Input/output error

Robert Malz 2054872 at bugs.launchpad.net
Wed Mar 20 14:28:31 UTC 2024


irqbalance 1.9.3-2ubuntu4 verified on 6.8.0-11-generic
Issue no longer reproduces.

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/2054872

Title:
  Cannot change IRQ 70 affinity: Input/output error

Status in irqbalance package in Ubuntu:
  Fix Committed
Status in irqbalance source package in Noble:
  Fix Committed

Bug description:
  [ Impact ]

   * At runtime, irqbalance changes IRQ affinity by writing to the /proc/irq/<id>/smp_affinity file.
     The previous upstream implementation closed the file immediately after writing to it, which could cause random I/O errors.
   * Due to this issue, irqbalance could mark IRQs as unmigratable.
   * The issue is visible in irqbalance v1.9.3 packages because additional check logic was added (470a64b190628574c28a266bdcf8960291463191).
   * Before v1.9.3, irqbalance could still fail on fclose, but that did not lead to the IRQ being marked as unmigratable.

  [ Test Plan ]

   * Install and run irqbalance v1.9.3.
   * Whenever irqbalance decides to change an IRQ's affinity, the following error can occur:
  Mar 14 11:59:55 pre-noble irqbalance[1536]: Cannot change IRQ 27 affinity: Input/output error
  Mar 14 11:59:55 pre-noble irqbalance[1536]: IRQ 27 affinity is now unmanaged

  [ Where problems could occur ]

   * The issue is fixed by adding fflush before closing the file, to make sure the write operation has finished before the file is closed.
   * During local tests I did not observe any issues after adding the flush.
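
  The flush-before-close pattern described above can be sketched roughly as
  follows. This is only an illustration, not the irqbalance source or the
  upstream patch: the helper name write_affinity_mask and its return
  convention are assumptions made for this example. The point is that
  fprintf normally only fills the stdio buffer, so an explicit fflush lets
  a rejected write be detected while the stream is still open.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not taken from irqbalance): write a mask string to
 * a path such as /proc/irq/<id>/smp_affinity.
 * Returns 0 on success or a positive errno-style value on failure.
 */
int write_affinity_mask(const char *path, const char *mask)
{
    FILE *file = fopen(path, "w");
    if (file == NULL)
        return errno ? errno : EIO;

    int err = 0;

    /* fprintf usually only copies the data into the stdio buffer. */
    if (fprintf(file, "%s", mask) < 0)
        err = errno ? errno : EIO;

    /* fflush hands the buffered data to the kernel, so a failing write
     * is reported here rather than at fclose. */
    if (err == 0 && fflush(file) != 0)
        err = errno ? errno : EIO;

    /* Always close the stream; keep the first error that was seen. */
    if (fclose(file) != 0 && err == 0)
        err = errno ? errno : EIO;

    return err;
}
```

  A caller could then decide, based on the returned value, whether an IRQ
  should stay managed; which error values count as permanent is left open
  in this sketch.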
   
  [ Other Info ]
   
   * The fix for the issue is already merged upstream: https://github.com/Irqbalance/irqbalance/pull/302
   * The original description of the case follows:

  Hello Ubuntu Team,

  I noticed this today while using Ubuntu 24.04 Noble.
  These messages appear quite often in the systemd journal.

  My kernel is: 6.8.0-rc4-realtime-rt4

  # journalctl -b --no-pager --no-hostname | grep irqbalance
  Feb 24 13:21:02 systemd[1]: Started irqbalance.service - irqbalance daemon.
  Feb 24 13:21:02 (qbalance)[1209]: irqbalance.service: Referenced but unset environment variable evaluates to an empty string: IRQBALANCE_ARGS
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 73 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 73 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 63 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 63 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 71 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 71 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 61 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 61 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 68 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 68 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 58 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 58 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 66 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 66 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 64 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 64 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 72 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 72 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 62 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 62 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 70 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 70 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 60 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 60 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 69 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 69 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 59 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 59 affinity is now unmanaged
  Feb 24 13:21:12 irqbalance[1209]: Cannot change IRQ 67 affinity: Input/output error
  Feb 24 13:21:12 irqbalance[1209]: IRQ 67 affinity is now unmanaged
  Feb 24 13:22:22 irqbalance[1209]: Cannot change IRQ 65 affinity: Input/output error
  Feb 24 13:22:22 irqbalance[1209]: IRQ 65 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 73 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 73 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 63 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 63 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 61 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 61 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 68 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 68 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 66 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 66 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 64 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 64 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 72 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 72 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 62 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 62 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 80 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 80 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 60 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 60 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 79 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 79 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 69 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 69 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 67 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 67 affinity is now unmanaged
  Feb 24 13:40:36 irqbalance[1209]: Cannot change IRQ 65 affinity: Input/output error
  Feb 24 13:40:36 irqbalance[1209]: IRQ 65 affinity is now unmanaged
  Feb 24 13:40:46 irqbalance[1209]: Cannot change IRQ 71 affinity: Input/output error
  Feb 24 13:40:46 irqbalance[1209]: IRQ 71 affinity is now unmanaged
  Feb 24 14:23:06 irqbalance[1209]: Cannot change IRQ 70 affinity: Input/output error
  Feb 24 14:23:06 irqbalance[1209]: IRQ 70 affinity is now unmanaged

  The IRQs on this system are:

  # cat /proc/interrupts
              CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15
     0: 116 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-IO-APIC 2-edge timer
     6: 0 0 5405959 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-IO-APIC 6-edge AMDI0010:03
     7: 0 0 0 0 0 0 299947 0 0 0 0 0 0 0 0 0 IR-IO-APIC 7-fasteoi pinctrl_amd
     8: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-IO-APIC 8-edge rtc0
     9: 0 41 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-IO-APIC 9-fasteoi acpi
    25: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IOMMU-MSI 368-edge AMD-Vi0-Evt
    26: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IOMMU-MSI 376-edge AMD-Vi0-PPR
    27: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IOMMU-MSI 384-edge AMD-Vi0-GA
    28: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 amd_gpio 0 ACPI:Event
    29: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 amd_gpio 44 ACPI:Event
    30: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 amd_gpio 58 ACPI:Event
    31: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 amd_gpio 59 ACPI:Event
    32: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 amd_gpio 18 ACPI:Event
    33: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSI-0000:00:01.1 0-edge PCIe PME, pciehp
    34: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSI-0000:00:02.2 0-edge PCIe PME, pciehp
    35: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSI-0000:00:02.4 0-edge PCIe PME
    36: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSI-0000:00:08.1 0-edge PCIe PME
    37: 0 0 0 0 0 0 299946 0 0 0 0 0 0 0 0 0 amd_gpio 9 ELAN1201:00
    39: 0 0 0 0 0 0 0 0 0 0 23388 0 95 0 0 0 IR-PCI-MSIX-0000:04:00.3 0-edge xhci_hcd
    48: 0 0 0 0 0 0 0 0 0 0 0 0 0 8144 787089 0 IR-PCI-MSIX-0000:04:00.4 0-edge xhci_hcd
    57: 0 0 0 0 0 0 0 1302 0 0 0 0 0 0 0 0 IR-PCI-MSI-0000:04:00.6 0-edge snd_hda_intel:card2
    58: 0 192 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSI-0000:04:00.1 0-edge snd_hda_intel:card1
    59: 0 0 0 101 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 0-edge nvme0q0
    60: 22957 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 1-edge nvme0q1
    61: 0 20716 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 2-edge nvme0q2
    62: 0 0 24443 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 3-edge nvme0q3
    63: 0 0 0 25781 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 4-edge nvme0q4
    64: 0 0 0 0 26558 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 5-edge nvme0q5
    65: 0 0 0 0 0 21966 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 6-edge nvme0q6
    66: 0 0 0 0 0 0 23507 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 7-edge nvme0q7
    67: 0 0 0 0 0 0 0 19535 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 8-edge nvme0q8
    68: 0 0 0 0 0 0 0 0 27158 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 9-edge nvme0q9
    69: 0 0 0 0 0 0 0 0 0 24531 0 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 10-edge nvme0q10
    70: 0 0 0 0 0 0 0 0 0 0 23135 0 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 11-edge nvme0q11
    71: 0 0 0 0 0 0 0 0 0 0 0 21913 0 0 0 0 IR-PCI-MSIX-0000:03:00.0 12-edge nvme0q12
    72: 0 0 0 0 0 0 0 0 0 0 0 0 25956 0 0 0 IR-PCI-MSIX-0000:03:00.0 13-edge nvme0q13
    73: 0 0 0 0 0 0 0 0 0 0 0 0 0 21804 0 0 IR-PCI-MSIX-0000:03:00.0 14-edge nvme0q14
    75: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IR-PCI-MSIX-0000:04:00.2 0-edge psp-1
    77: 0 0 182 0 0 0 0 0 0 0 0 0 0 128 0 0 IR-IO-APIC 1-fasteoi snd_hda_intel:card0
    79: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 25693 0 IR-PCI-MSIX-0000:03:00.0 15-edge nvme0q15
    80: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 22773 IR-PCI-MSIX-0000:03:00.0 16-edge nvme0q16
    82: 0 0 0 18139 0 0 628301 0 236532 266 0 467208 0 0 0 0 IR-PCI-MSI-0000:02:00.0 0-edge mt7921e
    83: 0 0 0 338579 1458 0 0 0 0 1118391 0 0 0 0 0 0 IR-PCI-MSIX-0000:04:00.0 0-edge amdgpu
   NMI: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Non-maskable interrupts
   LOC: 1413160 1341033 1515330 1346117 1486268 1353018 1498988 1334907 1475044 1389584 1499673 1360128 1503094 1356648 1481898 1338550 Local timer interrupts
   SPU: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Spurious interrupts
   PMI: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Performance monitoring interrupts
   IWI: 52036 57306 36597 61733 53803 79504 67600 65707 99768 106673 39212 50191 49983 52515 24180 36440 IRQ work interrupts
   RTR: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 APIC ICR read retries
   RES: 1135734 961961 1102398 869315 1245660 933642 1169006 917816 1359995 763672 1220046 820725 1266840 869766 1094597 834027 Rescheduling interrupts
   CAL: 1063814 373555 207657 168710 192237 189824 188438 174955 233311 126152 160532 145524 171051 153168 119414 134045 Function call interrupts
   TLB: 94988 98433 115358 96450 118525 98758 124773 94916 123529 90824 119527 95110 120917 98504 107344 96577 TLB shootdowns
   TRM: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Thermal event interrupts
   THR: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Threshold APIC interrupts
   DFR: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Deferred Error APIC interrupts
   MCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Machine check exceptions
   MCP: 15 15 15 15 15 15 15 15 15 15 15 15 15 15 15 15 Machine check polls
   ERR: 1
   MIS: 0
   PIN: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Posted-interrupt notification event
   NPI: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Nested posted-interrupt event
   PIW: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Posted-interrupt wakeup event

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/irqbalance/+bug/2054872/+subscriptions



