terminating a failed post-start stanza

Michael Barrett loki77 at gmail.com
Tue Apr 16 19:43:50 UTC 2013


Hi, I'm working on converting my postgresql package to use upstart. While doing so, I found that postgres takes a few moments after starting before it actually accepts connections. I decided to use a post-start job to test whether the postgres server is up and available before letting upstart believe the service is ready.

Currently my upstart job looks like this: http://dpaste.com/1060864/

The issue I'm running into now is that whenever a problem with the postgres server causes it to crash, the post-start job hangs indefinitely (as you'd expect from the loop). I can't stop the job or restart it once I fix the problem. The only way I've been able to recover is to bring up postgres manually, which allows the post-start to finish.

I've tried putting a maximum number of retries in the post-start script and doing an exit 1 when that limit is exceeded, but the 'start postgresql' command still exits successfully, which causes issues in other places in my application (because postgres isn't actually running).
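
For illustration, the bounded-retry variant described above could look roughly like the sketch below. This is not the contents of the actual job file: the helper name wait_for_ready, the retry count and delay, and the psql check in the trailing comment are all assumptions made for the example.

```shell
# Hypothetical helper (not from the real job file): run a readiness
# check up to MAX_TRIES times, sleeping DELAY seconds after each failure.
# Usage: wait_for_ready MAX_TRIES DELAY COMMAND [ARGS...]
wait_for_ready() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        # The remaining arguments are the check command itself.
        if "$@" >/dev/null 2>&1; then
            return 0    # check succeeded, service is accepting requests
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1            # gave up after max_tries failed checks
}

# Assumed use inside a post-start script stanza (the check command
# is a guess; any command that fails until postgres is ready would do):
#   wait_for_ready 30 1 psql -U postgres -c 'SELECT 1' || exit 1
```

The loop itself terminates as intended; the open question is what upstart does with the non-zero exit from post-start.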

What is the suggested method for dealing with something like this? Is there any way to tell the post-start to terminate? Is there any way for post-start to signal that there was an issue starting the daemon after a number of retries? I'm sure I'm missing something key in my understanding, so any help would be appreciated.

Thanks!

--
Michael Barrett
loki77 at gmail.com


More information about the upstart-devel mailing list