[Bug 960276] [NEW] a bad AMI can hang an entire compute node

Nick Moffitt nick.moffitt at canonical.com
Tue Mar 20 14:56:50 UTC 2012


Public bug reported:

Booting the attached image (and others) causes the entire compute node to
hang somewhere between the image booting and its networking being
configured.  The running instance does have a console ring buffer output
file, but the output is problematic: it often looks as though the instance
never got a proper root filesystem, and it contains lots of "NO PTY"
errors.  The instance is also unpingable.

The only way to terminate these instances is to restart nova-compute so
that it consumes AMQP messages again, and then send the terminate
request.  This looks suspiciously like the compute code blocking in a
libvirt call of some sort.
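
To illustrate the kind of blocking suspected here, the following is a
minimal Python sketch, not nova's actual code.  It assumes a hypothetical
stand-in function, blocking_hypervisor_call(), in place of whatever
libvirt call may be hanging, and a hypothetical helper,
call_with_timeout(), that runs the call in a worker thread so the caller
can give up waiting and keep servicing other requests.  It uses only the
Python standard library.

```python
import concurrent.futures
import time


def blocking_hypervisor_call(seconds):
    """Hypothetical stand-in for a hypervisor call that may block for a long time."""
    time.sleep(seconds)
    return "done"


def call_with_timeout(func, timeout, *args):
    """Run func(*args) in a worker thread.

    Returns (True, result) if the call finishes within `timeout` seconds,
    or (False, None) if it does not.  The worker thread is not killed on
    timeout; the caller simply stops waiting for it.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return True, future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return False, None
    finally:
        # Do not wait for a stuck worker here, so the caller stays responsive.
        pool.shutdown(wait=False)
```

A caller that invokes blocking_hypervisor_call() directly is stuck for as
long as the call takes, which would match a service that stops picking up
messages.  With call_with_timeout(), the caller gets control back after
the timeout and can still act on a terminate request.  Whether this
matches what nova-compute is doing is an assumption.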

The same cluster booted an older Oneiric image with no problems
whatsoever.

This can effectively DoS an entire OpenStack installation through
nothing more than running instances.

Attached is the amd64 image from
http://cloud-images.ubuntu.com/precise/20120319/ which exhibited this
problem in our rc1 cloud.

** Affects: nova (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/960276




