Detecting and handling service failures

David Tong scarabus at gmail.com
Fri Aug 24 18:08:55 UTC 2012


I am familiar with SMF on Solaris. In particular, when SMF cannot start a
service, it marks the service as being in the maintenance state. I'm trying
to use Upstart to detect and report on similar conditions.

My understanding of the way that Upstart works is that if a service fails,
an event is emitted indicating the failure and the service is stopped. If
you don't catch the event then you don't know it has failed. If a user
queries the status of a service they only see that it is stopped; they
don't see the reason. Am I right in thinking that once a service is stopped
the only way to determine the cause is to view the system logs?

Now it's easy to configure Upstart to run a job when one particular job
(here, tongo) fails:
   start on stopped tongo RESULT=failed

But as far as I can work out, you would need to explicitly enumerate all
the jobs that you wanted to monitor. Or is there a wildcard option?
   start on stopped *ANY* RESULT=failed

What about the case where a new service is added? Obviously I also want to
be notified if that fails.
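
For concreteness, this is roughly the catch-all job I have in mind. It is an
untested sketch, not something I know to work: it assumes the stopped event
can be matched on RESULT alone without naming a job, and that a != match on
JOB keeps the handler from reacting to its own failure. The job name
failure-notify and the logger tag are made up for illustration.

```
# /etc/init/failure-notify.conf  (hypothetical file and job name)
description "log any job that stops with RESULT=failed"

# Assumption: leaving out the job name matches every job's stopped event,
# including jobs added later. JOB!=failure-notify is meant to exclude
# this handler itself so that it cannot retrigger on its own failure.
start on stopped RESULT=failed JOB!=failure-notify

# Run once per matching event and then exit.
task

script
    # JOB, INSTANCE, PROCESS and EXIT_STATUS or EXIT_SIGNAL are expected
    # to be passed in from the stopped event's environment.
    logger -t failure-notify \
        "job=$JOB instance=$INSTANCE process=$PROCESS exit=${EXIT_STATUS:-$EXIT_SIGNAL}"
end script
```

If matching works this way, a newly added service would be covered without
editing the handler, which is the behaviour I'm after.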

Specific RTFM pointers would be welcomed.
Thanks
Dave