[Bug 1071910] [NEW] lxc stop will hang forever

Tim iceczd at gmail.com
Fri Oct 26 21:19:10 UTC 2012


Public bug reported:

Background:
This issue occurs during an automated process, with roughly a 1-in-20 chance per iteration
I have one lxc-container on the machine
It is backed with an lvm2 snapshot
Running Ubuntu 12.10 on an EC2 small instance, upgraded from a fresh 12.04 instance
This is a new issue that has occurred after migrating my code from 11.10

Process:
create snapshot "lvcreate"
mount snapshot "mount"
lxc-start
do actions in container
lxc-stop
unmount snapshot "umount"
remove snapshot "lvremove"
-repeat
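
For anyone trying to reproduce this, the loop above could be sketched roughly as follows. This is only an illustration under assumptions, not the reporter's actual script: the volume group and volume names (vmg1, vm, snap) are taken from the lvremove output in this report, while the snapshot size, mount point, container name, and timeout are hypothetical placeholders. Note that subprocess.run kills the child on timeout and then waits for it, so a process stuck in the kernel may still block this script.

```python
import subprocess


def build_cycle(vg="vmg1", origin="vm", snap="snap", size="1G",
                mountpoint="/mnt/snap", container="c1"):
    """Return the commands for one iteration, in the order listed above.

    size, mountpoint and container are hypothetical placeholders.
    """
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        ["lvcreate", "--snapshot", "--name", snap, "--size", size,
         f"/dev/{vg}/{origin}"],
        ["mount", snap_dev, mountpoint],
        ["lxc-start", "-n", container, "-d"],
        # ... the actions performed inside the container would go here ...
        ["lxc-stop", "-n", container],
        ["umount", mountpoint],
        ["lvremove", "-f", snap_dev],
    ]


def run_cycle(commands, timeout=120, dry_run=True):
    """Run each command in order.

    Returns the first command that fails or times out, or None if all
    commands succeed. With dry_run=True the commands are only printed.
    """
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
            continue
        try:
            subprocess.run(cmd, check=True, timeout=timeout)
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
            return cmd
    return None


if __name__ == "__main__":
    # Dry run by default: prints the six commands without executing them.
    run_cycle(build_cycle())
```

Recording which command run_cycle returns on each failing iteration would show whether a given hang was at lxc-stop or at lvremove.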

The issue can occur at either lxc-stop or lvremove.

When it occurs with lxc-stop:
ps -A shows that lxc-start is still running, along with kdmflush, kjournald, and an init that appears to be the init process for the container
kdmflush, kjournald, init, and its sub-processes cannot be killed with "kill -9 pid", but lxc-start can

When it occurs with lvremove, the hang happens when lvremove is called a second time, after the first call has failed with this stderr output:
Using logical volume(s) on command line
    Archiving volume group "vmg1" metadata (seqno 272).
    Removing snapshot snap
    Found volume group "vmg1"
    Found volume group "vmg1"
    Loading vmg1-vm table (252:0)
    Loading vmg1-snap table (252:1)
  /sbin/dmeventd: stat failed: No such file or directory
    vmg1/snapshot0 already not monitored.
    Suspending vmg1-vm (252:0) with device flush
    Suspending vmg1-snap (252:1) with device flush
    Suspending vmg1-vm-real (252:2) with device flush
    Suspending vmg1-snap-cow (252:3) with device flush
    Found volume group "vmg1"
    Resuming vmg1-snap-cow (252:3)
    Resuming vmg1-vm-real (252:2)
    Resuming vmg1-snap (252:1)
    Removing vmg1-snap-cow (252:3)
  device-mapper: remove ioctl on  failed: Device or resource busy
  Unable to deactivate vmg1-snap-cow (252:3)
  Failed to resume snap.
  libdevmapper exiting with 1 device(s) still suspended.

lvremove spawns the lvm process, and neither can be killed with "kill -9
pid". That suggests to me that they are waiting on something in the
kernel, and I am guessing the cause is the same one that makes lxc-stop
hang and leaves the container's processes unkillable.
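
One way to check the "waiting on the kernel" guess would be to look for processes in uninterruptible sleep, which Linux reports as state "D" in /proc/<pid>/stat; a process in that state does not act on signals, including SIGKILL, until the kernel operation it is blocked in returns. The sketch below is a minimal illustration; the function names are made up for this example and nothing here comes from lxc or lvm2.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = stat_text.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while scanning, or unreadable entry
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

If lvm, lvremove, or the container's init show up in this list during a hang, that would be consistent with the guess above, though it would not by itself say what they are blocked on.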

This is all I can report for now, but I'll try getting some log info
from lxc next Friday, let me know if you have any suggestions in the
meantime.

** Affects: lxc (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: lvm2 lxc quantal

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to lxc in Ubuntu.
https://bugs.launchpad.net/bugs/1071910

Title:
  lxc stop will hang forever

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1071910/+subscriptions


