[Bug 1834875] Re: cloud-init growpart race with udev

Dan Watkins daniel.watkins at canonical.com
Fri Aug 23 20:56:20 UTC 2019


On Fri, Aug 23, 2019 at 08:23:06PM -0000, Tobias Koch wrote:
> I may be missing the point, but the symlink in question is eventually
> recreated, does that tell us anything? This here

Yes, this is more supporting evidence that this is a race condition; the
state of the system both before and some time after the resize is
consistent, the kernel/udev just lose track of the existence of the
partition for some amount of time after the resize happens.

> > Dan had put a udevadm settle in this spot like so
> >
> > def get_size(filename):
> >    util.subp(['udevadm', 'settle'])
> >    os.open(....)
> 
> looks to me like the event queue should be empty now, but how do you
> know userspace has acted on what came out of it?

The symlink exists before the resize, so if it's missing then we know
that the udev events have been processed.  The DEVLINKS line in the diff
in comment #24 shows that udev doesn't think that symlink should exist.
We can see the symlink being deleted in the logging in comment #22.
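
For anyone who wants to repeat that check by hand, here is a rough sketch of
comparing udev's view (the DEVLINKS property) against a given symlink.  It is
not cloud-init code; the function names parse_devlinks and udev_knows_link
are made up for illustration, and it assumes `udevadm info --query=property`
prints one KEY=VALUE pair per line with DEVLINKS as a space-separated list.

```python
import subprocess


def parse_devlinks(properties_text):
    """Return the symlink paths from a DEVLINKS= line, or [] if there is none."""
    for line in properties_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "DEVLINKS":
            # udev separates the individual links with whitespace.
            return value.split()
    return []


def udev_knows_link(device, link):
    """Query udev's properties for `device` and report whether `link` is listed.

    Hypothetical helper: requires udevadm on PATH and raises
    CalledProcessError if udev has no record of the device.
    """
    result = subprocess.run(
        ["udevadm", "info", "--query=property", "--name=" + device],
        capture_output=True,
        text=True,
        check=True,
    )
    return link in parse_devlinks(result.stdout)
```

If udev_knows_link() returns False for the by-partuuid path while the
partition itself is present, that matches the state described above: udev
has processed its events and concluded the symlink should not exist.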

> Is it strictly required that any event is cleared only after the
> corresponding action has completed? If yes, we can probably blame
> udev. If not, cloud-init should wait on the link to appear.

The link is not going to appear without something prompting a re-read of
the partition table from udev.  I don't believe this should be necessary
if the kernel/udev are behaving properly.

(Odds are that whatever causes it to be recreated later in boot would be
blocked by cloud-init waiting.)
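
For reference, the "wait on the link to appear" approach would look roughly
like the sketch below: settle, then poll with a timeout.  This is only an
illustration of the idea under discussion, not a proposed fix; wait_for_path
and its parameters are hypothetical, and for the reasons given above the
poll may simply time out if nothing triggers a re-read of the partition
table.

```python
import os
import subprocess
import time


def wait_for_path(path, timeout=5.0, interval=0.1,
                  settle_cmd=("udevadm", "settle")):
    """Return True once `path` exists, or False if `timeout` seconds pass first.

    Hypothetical helper. Pass settle_cmd=None to skip the settle step.
    """
    if settle_cmd:
        try:
            # Let the udev event queue drain before we start polling.
            subprocess.run(list(settle_cmd), check=False)
        except FileNotFoundError:
            pass  # udevadm is not installed; fall back to polling only
    deadline = time.monotonic() + timeout
    while True:
        # os.path.exists follows symlinks, so a dangling link counts as missing.
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would check the return value before opening the device, e.g.
`if not wait_for_path(partdev): ...`, rather than assuming the path is there.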

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1834875

Title:
  cloud-init growpart race with udev

Status in cloud-init:
  Incomplete
Status in systemd package in Ubuntu:
  New

Bug description:
  On Azure, cloud-init's growpart module regularly (20-30%) fails to
  extend the partition to full size.

  For example:

  ========================================

  2019-06-28 12:24:18,666 - util.py[DEBUG]: Running command ['growpart', '--dry-run', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
  2019-06-28 12:24:19,157 - util.py[DEBUG]: Running command ['growpart', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
  2019-06-28 12:24:19,726 - util.py[DEBUG]: resize_devices took 1.075 seconds
  2019-06-28 12:24:19,726 - handlers.py[DEBUG]: finish: init-network/config-growpart: FAIL: running config-growpart with frequency always
  2019-06-28 12:24:19,727 - util.py[WARNING]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed
  2019-06-28 12:24:19,727 - util.py[DEBUG]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed
  Traceback (most recent call last):
    File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 812, in _run_modules
      freq=freq)
    File "/usr/lib/python3/dist-packages/cloudinit/cloud.py", line 54, in run
      return self._runners.run(name, functor, args, freq, clear_on_fail)
    File "/usr/lib/python3/dist-packages/cloudinit/helpers.py", line 187, in run
      results = functor(*args)
    File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 351, in handle
      func=resize_devices, args=(resizer, devices))
    File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2521, in log_time
      ret = func(*args, **kwargs)
    File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 298, in resize_devices
      (old, new) = resizer.resize(disk, ptnum, blockdev)
    File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 159, in resize
      return (before, get_size(partdev))
    File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 198, in get_size
      fd = os.open(filename, os.O_RDONLY)
  FileNotFoundError: [Errno 2] No such file or directory: '/dev/disk/by-partuuid/a5f2b49f-abd6-427f-bbc4-ba5559235cf3'

  ========================================

  @rcj suggested this is a race with udev. This seems to only happen on
  Cosmic and later.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1834875/+subscriptions


