[Bug 1987663] Re: cinder-volume: "Failed to re-export volume, setting to ERROR" with "tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected" on service startup
Mauricio Faria de Oliveira
1987663 at bugs.launchpad.net
Fri Apr 19 15:48:42 UTC 2024
The packages built successfully in Test PPA
ppa:mfo/cinder-lp1987663-lp1988942-lp1994521
All build-time unit tests passed in the Test PPA (as in the Ubuntu
Archive).
Mantic:
---
mantic-updates:
BUILDLOG='https://launchpad.net/ubuntu/+source/cinder/2:23.0.0-0ubuntu1.1/+build/27741243/+files/buildlog_ubuntu-mantic-amd64.cinder_2%3A23.0.0-0ubuntu1.1_BUILDING.txt.gz'
curl -sL "$BUILDLOG" -o - | gzip -dc | grep -e '^Ran .* tests' -e '^OK (skipped='
Ran 17293 tests in 343.725s
OK (skipped=44)
mantic test ppa:
BUILDLOG='https://launchpadlibrarian.net/725471761/buildlog_ubuntu-mantic-amd64.cinder_2%3A23.0.0-0ubuntu1.2_BUILDING.txt.gz'
curl -sL "$BUILDLOG" -o - | gzip -dc | grep -e '^Ran .* tests' -e '^OK (skipped='
Ran 17293 tests in 333.393s
OK (skipped=44)
Jammy:
---
jammy-updates:
BUILDLOG='https://launchpad.net/ubuntu/+source/cinder/2:20.3.1-0ubuntu1.1/+build/27741270/+files/buildlog_ubuntu-jammy-amd64.cinder_2%3A20.3.1-0ubuntu1.1_BUILDING.txt.gz'
curl -sL "$BUILDLOG" -o - | gzip -dc | grep -e '^Ran .* tests' -e '^OK (skipped='
Ran 16048 tests in 367.044s
OK (skipped=48)
jammy test ppa:
BUILDLOG='https://launchpad.net/~mfo/+archive/ubuntu/cinder-lp1987663-lp1988942-lp1994521/+build/28125465/+files/buildlog_ubuntu-jammy-amd64.cinder_2%3A20.3.1-0ubuntu1.2_BUILDING.txt.gz'
curl -sL "$BUILDLOG" -o - | gzip -dc | grep -e '^Ran .* tests' -e '^OK (skipped='
Ran 16054 tests in 343.184s
OK (skipped=48)
Focal:
---
focal-updates:
BUILDLOG='https://launchpadlibrarian.net/666807464/buildlog_ubuntu-focal-amd64.cinder_2%3A16.4.2-0ubuntu2.4_BUILDING.txt.gz'
curl -sL "$BUILDLOG" -o - | gzip -dc | sed -n -e '/^Ran:/,/^Sum of/p'
Ran: 12772 tests in 205.1574 sec.
- Passed: 12760
- Skipped: 12
- Expected Fail: 0
- Unexpected Success: 0
- Failed: 0
Sum of execute time for each test: 802.0120 sec.
focal test ppa:
BUILDLOG='https://launchpad.net/~mfo/+archive/ubuntu/cinder-lp1987663-lp1988942-lp1994521/+build/28125762/+files/buildlog_ubuntu-focal-amd64.cinder_2%3A16.4.2-0ubuntu2.5_BUILDING.txt.gz'
curl -sL "$BUILDLOG" -o - | gzip -dc | sed -n -e '/^Ran:/,/^Sum of/p'
Ran: 12772 tests in 228.6830 sec.
- Passed: 12760
- Skipped: 12
- Expected Fail: 0
- Unexpected Success: 0
- Failed: 0
Sum of execute time for each test: 866.5233 sec.
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to cinder in Ubuntu.
https://bugs.launchpad.net/bugs/1987663
Title:
cinder-volume: "Failed to re-export volume, setting to ERROR" with
"tgtadm: failed to send request hdr to tgt daemon, Transport endpoint
is not connected" on service startup
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive antelope series:
New
Status in Ubuntu Cloud Archive bobcat series:
New
Status in Ubuntu Cloud Archive caracal series:
Fix Released
Status in Ubuntu Cloud Archive ussuri series:
Incomplete
Status in Ubuntu Cloud Archive victoria series:
Won't Fix
Status in Ubuntu Cloud Archive wallaby series:
Won't Fix
Status in Ubuntu Cloud Archive xena series:
Won't Fix
Status in Ubuntu Cloud Archive yoga series:
New
Status in Ubuntu Cloud Archive zed series:
Won't Fix
Status in cinder package in Ubuntu:
Fix Released
Status in cinder source package in Bionic:
Won't Fix
Status in cinder source package in Focal:
New
Status in cinder source package in Jammy:
New
Status in cinder source package in Lunar:
Won't Fix
Status in cinder source package in Mantic:
New
Status in cinder source package in Noble:
Fix Released
Bug description:
[Impact]
* The cinder-volume service might fail to re-export volumes
in-use on startup if tgt.service isn't fully started yet.
* This affects the 'lvm' driver with 'tgtadm' target helper
(which runs 'tgtadm' commands that need the service ready).
* Snippets from /var/log/cinder/cinder-volume.log:
Failed to re-export volume, setting to ERROR.
...
Command: tgtadm --lld iscsi --op show --mode target
...
Stderr: 'tgtadm: failed to send request hdr to tgt daemon,
Transport endpoint is not connected\n'
* This issue is more common in openstack compute nodes
with networking (ovs/ovn) that takes a long time to start up,
which might delay tgt.service so that it starts _after_
cinder-volume.service.
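A quick way to check a node for this failure is to search the log for
the two messages quoted above. This is only a sketch: the function name
is made up for illustration, and the sample file holds just the lines
quoted in this report (real log lines carry timestamps and other fields
not shown here).

```shell
# Sketch: report whether a cinder-volume log shows the re-export failure.
# The patterns are the messages quoted in this report; the default path
# is the log file named above. The function name is hypothetical.
has_reexport_failure() {
    log="${1:-/var/log/cinder/cinder-volume.log}"
    grep -q 'Failed to re-export volume, setting to ERROR' "$log" &&
        grep -q 'tgtadm: failed to send request hdr to tgt daemon' "$log"
}

# Demo on a scratch file, so the real log is not needed.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Failed to re-export volume, setting to ERROR.
Command: tgtadm --lld iscsi --op show --mode target
Stderr: 'tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected\n'
EOF

if has_reexport_failure "$sample"; then
    echo "re-export failure found"
else
    echo "no re-export failure"
fi
rm -f "$sample"
```

On a storage node, calling has_reexport_failure without arguments reads
/var/log/cinder/cinder-volume.log (read access to the log is required).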
[Test Steps]
* Steps to reproduce are detailed in comment #3.
Summary:
* Install mysql, rabbitmq-server, keystone, and cinder
(controller and storage nodes; backup node unneeded).
* Configure cinder-volume (storage node) for LVM backend
and tgtadm iSCSI helper (tgt.service).
* Create a cinder volume, and configure it as 'in-use'.
* Simulate a start delay on tgt.service with a drop-in.
* Restart services: cinder-volume.service tgt.service
* Check sequence of service startup.
* Check status of the cinder volume:
'in-use' (expected) or 'error' (bug).
* Check log file /var/log/cinder/cinder-volume.log for
'tgtadm: failed to send request hdr to tgt daemon'.
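The start delay on tgt.service might be simulated along these lines.
This is a sketch, not necessarily the method used in comment #3: the
function name, the drop-in file name, the 30-second value, and the use
of 'ExecStartPre=/bin/sleep' are all assumptions made for illustration.

```shell
# Sketch: write a drop-in that delays tgt.service startup.
# $1 = systemd unit directory, $2 = delay in seconds.
# Function name, file name, and delay value are hypothetical.
write_tgt_delay_dropin() {
    dropin_dir="$1/tgt.service.d"
    mkdir -p "$dropin_dir" || return 1
    # systemd reads any *.conf file found in the unit's .d directory.
    cat > "$dropin_dir/delay-start.conf" <<EOF
[Service]
ExecStartPre=/bin/sleep $2
EOF
    printf '%s\n' "$dropin_dir/delay-start.conf"
}

# Demo in a scratch directory, so no root access is needed.
scratch="$(mktemp -d)"
dropin="$(write_tgt_delay_dropin "$scratch" 30)"
cat "$dropin"
rm -rf "$scratch"
```

On a storage node, the directory would be /etc/systemd/system (as root),
followed by a daemon reload and the restart from the steps above; the
activation timestamps then show the startup sequence:
$ systemctl daemon-reload
$ systemctl restart cinder-volume.service tgt.service
$ systemctl show -p ActiveEnterTimestamp cinder-volume.service tgt.service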
[Regression Potential]
* The fix introduces systemd unit 'After=' and 'Wants='
properties for tgt.service in cinder-volume.service,
thus might delay the boot process (multi-user.target).
$ systemctl show cinder-volume.service | grep WantedBy=
WantedBy=multi-user.target
* However, the boot process already waits on tgt.service
anyway, so the difference (if any) should be small,
and the new ordering provides more correct behavior.
$ systemctl show tgt.service | grep WantedBy=
WantedBy=multi-user.target
* If tgt.service is not present (tgt package not installed)
_no errors_ occur, as both 'After=' and 'Wants=' are weak
ordering/dependency properties (man 5 systemd.unit).
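The 'After=' and 'Wants=' properties described above could be expressed
in a drop-in like the one this sketch writes. The function name and the
drop-in file name are assumptions for illustration; the file shipped in
the package may be named and located differently.

```shell
# Sketch: write a drop-in that orders cinder-volume.service after
# tgt.service and pulls it in if present.
# $1 = systemd unit directory. Function and file names are hypothetical.
write_cinder_volume_dropin() {
    dropin_dir="$1/cinder-volume.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/tgt.conf" <<'EOF'
[Unit]
After=tgt.service
Wants=tgt.service
EOF
    printf '%s\n' "$dropin_dir/tgt.conf"
}

# Demo in a scratch directory, so no root access is needed.
scratch="$(mktemp -d)"
cat "$(write_cinder_volume_dropin "$scratch")"
rm -rf "$scratch"
```

Once such a drop-in is loaded (systemctl daemon-reload), both properties
should list tgt.service:
$ systemctl show -p After -p Wants cinder-volume.service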
[Other Info]
* The fix uses a systemd service drop-in snippet because
the service unit is generated by openstack-pkg-tools
(pkgos-gen-systemd-unit) based on the 'init' service,
and it only emits 'Wants=' for network-online.target.
* Changing that in openstack-pkg-tools would change behavior
in stable releases, and would only manifest at build time,
for many openstack packages that have no issues now.
* We'll continue to pursue the general improvement in
Debian, so it comes into Ubuntu development release,
but for the Ubuntu stable releases, this should do.
[Original Bug Description]
Real-world example in comment #2.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1987663/+subscriptions