server outage - which logs are relevant?

Dave Stevens geek at
Mon Jan 14 19:49:44 UTC 2013

Hello group,

I don't know if this is the best list, so if there are suggestions for
a better place, please let me know.

I help manage a VPS running Ubuntu 10.04.4. It is fully up to date with
patches and uses fail2ban to ward off nasties, apparently
successfully. Every night at midnight logwatch runs and mails me the
output. No surprises there, just long, boring details about DNS
resolution failures and failed login attempts.

But twice in six months the virtual machine has become unresponsive
during the daily backup, which runs at midnight and takes about three
minutes. Apart from these incidents the site has had 100% uptime.

A mail to the host and a quick reboot get it back in business. Two
people are notified by Pingdom if the server fails to serve pages,
but we are both in the same time zone, and in this case neither of us
noticed until several hours had gone by. This VPS hosts sites that
are not mission critical for anyone, but it is upsetting
nonetheless.

I've looked at the Apache access and error logs but see nothing very
informative there. Other suggestions?



If all the advertising in the world were to shut down tomorrow, would people
still go on buying more soap, eating more apples, giving their children more
vitamins, roughage, milk, olive oil, scooters and laxatives, learning more
languages by iPod, hearing more virtuosos by radio, re-decorating their
houses, refreshing themselves with more non-alcoholic thirst-quenchers,
cooking more new, appetizing dishes, affording themselves that little extra
touch which means so much? Or would the whole desperate whirligig slow
down, and the exhausted public relapse upon plain grub and elbow-grease?

--- Dorothy L Sayers, in Murder Must Advertise

More information about the ubuntu-users mailing list