[Bug 1477225] Re: ceph-radosgw restart fails
Liam Young
liam.young at canonical.com
Wed Sep 2 10:31:16 UTC 2015
** Description changed:
+ [Impact]
+
+ On 14.04 the restart target of the sysvinit script brings the service down
+ but almost always fails to bring the service back up again.
+
+ The proposed fix updates /etc/init.d/radosgw and so that the stop target
+ waits for up to 30 seconds for the service to stop cleanly.
+
+
+ [Test Case]
+
+ sudo apt-get install --yes radosgw
+ sudo mkdir /etc/ceph
+ cat <<-EOF > /etc/ceph/ceph.conf
+ [global]
+
+ auth cluster required = cephx
+ auth service required = cephx
+ auth client required = cephx
+
+ mon host = 127.0.0.1:6789
+
+ [client.radosgw.gateway]
+ host = $(hostname -s)
+ keyring = /etc/ceph/keyring.rados.gateway
+ rgw socket path = /tmp/radosgw.sock
+ log file = /var/log/ceph/radosgw.log
+ rgw frontends = civetweb port=70
+ EOF
+
+ cat <<-EOF > /etc/ceph/keyring.rados.gateway
+ [client.radosgw.gateway]
+ key = BBBBBBBBBBBBBBBBB/kkkkkkkkkkkkkkkkkkkk==
+ EOF
+
+ service radosgw stop
+ service radosgw start
+ service radosgw status
+ service radosgw restart
+ service radosgw status
+
+ At this point /usr/bin/radosgw will not be running
+
+ [Regression Potential]
+
+ * The only change in behaviour that would result from this change is that
+ running the stop target in the init script will wait for up to 30s before
+ exiting rather than returning immediately. I cannot think of any use cases
+ where this would be an issue.
+
+ [Original Bug Report]
job handler:
- Jul 22 16:03:44 job-handler-1 ERR Failed to execute job: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph at ubuntu.com to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'#012Traceback (most recent call last):#012 File "/opt/canonical/landscape/canonical/landscape/model/activity/jobrunner.py", line 38, in run#012 yield self._run_activity(account_id, activity_id)#012HTTPError: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph at ubuntu.com to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'
+ Jul 22 16:03:44 job-handler-1 ERR Failed to execute job: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph at ubuntu.com to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'#012Traceback (most recent call last):#012 File "/opt/canonical/landscape/canonical/landscape/model/activity/jobrunner.py", line 38, in run#012 yield self._run_activity(account_id, activity_id)#012HTTPError: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph at ubuntu.com to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'
Other logs attached.
** Description changed:
[Impact]
On 14.04 the restart target of the sysvinit script brings the service down
but almost always fails to bring the service back up again.
- The proposed fix updates /etc/init.d/radosgw and so that the stop target
+ The proposed fix updates /etc/init.d/radosgw so that the stop target
waits for up to 30 seconds for the service to stop cleanly.
-
[Test Case]
sudo apt-get install --yes radosgw
sudo mkdir /etc/ceph
cat <<-EOF > /etc/ceph/ceph.conf
[global]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
mon host = 127.0.0.1:6789
[client.radosgw.gateway]
host = $(hostname -s)
keyring = /etc/ceph/keyring.rados.gateway
rgw socket path = /tmp/radosgw.sock
log file = /var/log/ceph/radosgw.log
rgw frontends = civetweb port=70
EOF
cat <<-EOF > /etc/ceph/keyring.rados.gateway
[client.radosgw.gateway]
- key = BBBBBBBBBBBBBBBBB/kkkkkkkkkkkkkkkkkkkk==
+ key = BBBBBBBBBBBBBBBBB/kkkkkkkkkkkkkkkkkkkk==
EOF
service radosgw stop
service radosgw start
service radosgw status
service radosgw restart
service radosgw status
At this point /usr/bin/radosgw will not be running
[Regression Potential]
- * The only change in behaviour that would result from this change is that
- running the stop target in the init script will wait for up to 30s before
- exiting rather than returning immediately. I cannot think of any use cases
- where this would be an issue.
+ * The only change in behaviour that would result from this change is that
+ running the stop target in the init script will wait for up to 30s before
+ exiting rather than returning immediately. I cannot think of any use cases
+ where this would be an issue.
[Original Bug Report]
job handler:
Jul 22 16:03:44 job-handler-1 ERR Failed to execute job: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph at ubuntu.com to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'#012Traceback (most recent call last):#012 File "/opt/canonical/landscape/canonical/landscape/model/activity/jobrunner.py", line 38, in run#012 yield self._run_activity(account_id, activity_id)#012HTTPError: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph at ubuntu.com to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'
Other logs attached.
** Description changed:
[Impact]
On 14.04 the restart target of the sysvinit script brings the service down
but almost always fails to bring the service back up again.
The proposed fix updates /etc/init.d/radosgw so that the stop target
waits for up to 30 seconds for the service to stop cleanly.
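The patch itself is not quoted in this notification, so the following is only a minimal sketch of the kind of wait a stop target could perform, not the actual change to /etc/init.d/radosgw. The function name wait_for_exit and the use of kill -0 polling are assumptions made for illustration; the 30 second default mirrors the limit mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the real init script):
# wait up to $2 seconds (default 30) for the process with PID $1 to exit.
# Returns 0 once the process is gone, 1 if it is still alive at the timeout.
wait_for_exit() {
    pid=$1
    timeout=${2:-30}
    elapsed=0
    # kill -0 delivers no signal; it only tests whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

A stop target built this way would send the termination signal first and then call wait_for_exit with the daemon's PID, so that a following start (as in restart) only runs once the old process has actually gone away.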
[Test Case]
sudo apt-get install --yes radosgw
sudo mkdir /etc/ceph
+ sudo su -
cat <<-EOF > /etc/ceph/ceph.conf
[global]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
mon host = 127.0.0.1:6789
[client.radosgw.gateway]
host = $(hostname -s)
keyring = /etc/ceph/keyring.rados.gateway
rgw socket path = /tmp/radosgw.sock
log file = /var/log/ceph/radosgw.log
rgw frontends = civetweb port=70
EOF
cat <<-EOF > /etc/ceph/keyring.rados.gateway
[client.radosgw.gateway]
key = BBBBBBBBBBBBBBBBB/kkkkkkkkkkkkkkkkkkkk==
EOF
service radosgw stop
service radosgw start
service radosgw status
service radosgw restart
service radosgw status
At this point /usr/bin/radosgw will not be running
[Regression Potential]
* The only change in behaviour that would result from this change is that
running the stop target in the init script will wait for up to 30s before
exiting rather than returning immediately. I cannot think of any use cases
where this would be an issue.
[Original Bug Report]
job handler:
Jul 22 16:03:44 job-handler-1 ERR Failed to execute job: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph at ubuntu.com to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'#012Traceback (most recent call last):#012 File "/opt/canonical/landscape/canonical/landscape/model/activity/jobrunner.py", line 38, in run#012 yield self._run_activity(account_id, activity_id)#012HTTPError: PUT request for http://10.96.4.129:80/swift/v1/simplestreams failed with code 500 Internal Server Error: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>500 Internal Server Error</title>\n</head><body>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.</p>\n<p>Please contact the server administrator at \n ceph at ubuntu.com to inform them of the time this error occurred,\n and the actions you performed just before this error.</p>\n<p>More information about this error may be available\nin the server error log.</p>\n</body></html>\n'
Other logs attached.
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to ceph in Ubuntu.
https://bugs.launchpad.net/bugs/1477225
Title:
ceph-radosgw restart fails
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1477225/+subscriptions