[Bug 1683076] Please test proposed package
James Page
james.page at ubuntu.com
Wed Jul 19 06:09:18 UTC 2017
Hello Jill, or anyone else affected,
Accepted swift into mitaka-proposed. The package will build now and be
available in the Ubuntu Cloud Archive in a few hours, and then in the
-proposed repository.
Please help us by testing this new package. To enable the -proposed
repository:
sudo add-apt-repository cloud-archive:mitaka-proposed
sudo apt-get update
Your feedback will help us get this update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-mitaka-needed to verification-mitaka-done. If it does
not fix the bug for you, please add a comment stating that, and change
the tag to verification-mitaka-failed. In either case, details of your
testing will help us make a better decision.
Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!
** Changed in: cloud-archive/mitaka
Status: Triaged => Fix Committed
** Tags added: verification-mitaka-needed
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to swift in Ubuntu.
https://bugs.launchpad.net/bugs/1683076
Title:
Swift-storage dies if rsyslog is stopped
Status in OpenStack swift-storage charm:
Invalid
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive icehouse series:
Triaged
Status in Ubuntu Cloud Archive kilo series:
Triaged
Status in Ubuntu Cloud Archive mitaka series:
Fix Committed
Status in Ubuntu Cloud Archive newton series:
Fix Released
Status in swift package in Ubuntu:
Invalid
Status in swift source package in Trusty:
Fix Committed
Status in swift source package in Xenial:
Fix Committed
Bug description:
Trusty, Mitaka, Juju 1.25.11
We have a cloud where swift replicators are constantly falling over on
2 nodes. This occurs whenever rsyslog restarts, as in
https://bugs.launchpad.net/swift/+bug/1094230
https://review.openstack.org/#/c/24871
https://bugs.python.org/issue15179
rsyslog restarts are unfortunately frequent right now, due to
https://bugs.launchpad.net/juju-core/+bug/1683075
Nodes are landscape managed and up to date but still exhibit the
failure.
Running swift in verbose mode reveals little:
https://pastebin.canonical.com/185609/
sosreports are uploading to
https://private-fileshare.canonical.com/~jillr/sf00137831/
[Impact]
* Stopping rsyslog causes the swift daemons to crash with a call stack
overflow. When rsyslog stops, the /dev/log socket becomes unavailable,
so any attempt to write a log entry raises an exception. The swift
logging code then attempts to log that error, which raises another
exception. This repeats until the stack overflows and the swift
daemons crash. Once the daemons have crashed, object, container and
account data can no longer be replicated to other storage nodes, which
puts the integrity of data written to the system at risk.
* The patch should be backported to stable releases in order to
ensure that the data integrity of objects, accounts, and containers
within Swift is not adversely affected by a failed logging
subsystem.
* The uploaded patches fix the bug by only attempting to log an entry
to the logging subsystem if the current call stack does not include an
attempt to write to the logs. If the current call stack includes an
attempt to log to the logging subsystem, the log message is dropped,
avoiding the recursion.
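
The failure mode and the guard described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the uploaded patches: NaiveLogger, GuardedLogger, SocketDown and dead_sink are hypothetical names, and dead_sink merely stands in for a /dev/log socket that has gone away.

```python
import threading


class SocketDown(Exception):
    """Hypothetical stand-in for the error raised when /dev/log is gone."""


def dead_sink(message):
    # Simulates a log socket that is unavailable: every write fails.
    raise SocketDown("/dev/log unavailable")


class NaiveLogger:
    """Logs its own logging failures, so a dead sink recurses forever."""

    def __init__(self, sink):
        self.sink = sink

    def log(self, message):
        try:
            self.sink(message)
        except SocketDown as exc:
            # Reporting the failure goes through the same failing path.
            self.log("failed to log: %s" % exc)


class GuardedLogger(NaiveLogger):
    """Drops any message issued while a log call is already in progress."""

    def __init__(self, sink):
        super().__init__(sink)
        self._local = threading.local()  # per-thread "already logging" flag
        self.dropped = 0

    def log(self, message):
        if getattr(self._local, "active", False):
            # Already inside a log call on this thread: drop the message
            # instead of recursing.
            self.dropped += 1
            return
        self._local.active = True
        try:
            super().log(message)
        finally:
            self._local.active = False
```

With the dead sink, NaiveLogger.log() recurses until Python raises RecursionError, whereas GuardedLogger.log() returns normally after dropping the nested error message.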
[Test Case]
* Install swift storage cluster
* Log into one of the swift storage nodes
* Ensure the swift-{object,account,container}-replicator processes are running
* Stop the rsyslog service
* Wait a minute
* Observe the swift-{object,account,container}-replicator processes are no longer running
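
The "ensure running" and "observe no longer running" steps could be checked with a small helper such as the one below. It is a rough sketch that assumes a Linux node with procps `ps` and assumes the daemon names appear verbatim in the process command lines; missing_replicators and running_cmdlines are hypothetical helper names.

```python
import subprocess

# Replicator daemons named in the test case.
REPLICATORS = (
    "swift-object-replicator",
    "swift-account-replicator",
    "swift-container-replicator",
)


def missing_replicators(cmdlines):
    """Return the replicator names found in none of the command lines."""
    return [
        name
        for name in REPLICATORS
        if not any(name in line for line in cmdlines)
    ]


def running_cmdlines():
    """Return the command line of every process, one string per process."""
    result = subprocess.run(
        ["ps", "-eo", "args="],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling missing_replicators(running_cmdlines()) before stopping rsyslog should return an empty list; on an affected node, calling it again a minute after stopping rsyslog is expected to list the replicators that died.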
[Regression Potential]
* This affects the logging capabilities provided by the Swift code.
Regressions could occur in almost any subsystem, since logging is used
throughout the code base. At best a regression would lose log entries;
at worst it would crash the swift daemons. The regression potential is
mitigated by the fact that this patch has been included upstream for
over a year and no regressions have been reported against it since.
[Other Info]
* /dev/log is not provided by the rsyslog daemon in Xenial, but this
patch still applies: any persistent exception encountered when
writing to /dev/log will cause the call stack to overflow and crash
the swift daemons.
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-swift-storage/+bug/1683076/+subscriptions