[Bug 1834875] Re: cloud-init growpart race with udev
Dan Watkins
daniel.watkins at canonical.com
Fri Aug 23 20:56:20 UTC 2019
On Fri, Aug 23, 2019 at 08:23:06PM -0000, Tobias Koch wrote:
> I may be missing the point, but the symlink in question is eventually
> recreated, does that tell us anything? This here
Yes, this is more supporting evidence that this is a race condition; the
state of the system both before and some time after the resize is
consistent, the kernel/udev just lose track of the existence of the
partition for some amount of time after the resize happens.
> > Dan had put a udevadm settle in this spot like so
> >
> > def get_size(filename)
> > util.subp(['udevadm', 'settle'])
> > os.open(....)
>
> looks to me like the event queue should be empty now, but how do you
> know userspace has acted on what came out of it?
The symlink exists before the resize, so if it's missing then we know
that the udev events have been processed. The DEVLINKS line in the diff
in comment #24 shows that udev doesn't think that symlink should exist.
We can see the symlink being deleted in the logging in comment #22.
> Is it strictly required that any event is cleared only after the
> corresponding action has completed? If yes, we can probably blame
> udev. If not, cloud-init should wait on the link to appear.
The link is not going to appear without something prompting a re-read of
the partition table from udev. I don't believe this should be necessary
if the kernel/udev are behaving properly.
(Odds are that whatever causes it to be recreated later in boot would be
blocked by cloud-init waiting.)
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1834875
Title:
cloud-init growpart race with udev
Status in cloud-init:
Incomplete
Status in systemd package in Ubuntu:
New
Bug description:
On Azure, it happens regularly (20-30%), that cloud-init's growpart
module fails to extend the partition to full size.
Such as in this example:
========================================
2019-06-28 12:24:18,666 - util.py[DEBUG]: Running command ['growpart', '--dry-run', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
2019-06-28 12:24:19,157 - util.py[DEBUG]: Running command ['growpart', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True)
2019-06-28 12:24:19,726 - util.py[DEBUG]: resize_devices took 1.075 seconds
2019-06-28 12:24:19,726 - handlers.py[DEBUG]: finish: init-network/config-growpart: FAIL: running config-growpart with frequency always
2019-06-28 12:24:19,727 - util.py[WARNING]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed
2019-06-28 12:24:19,727 - util.py[DEBUG]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 812, in _run_modules
freq=freq)
File "/usr/lib/python3/dist-packages/cloudinit/cloud.py", line 54, in run
return self._runners.run(name, functor, args, freq, clear_on_fail)
File "/usr/lib/python3/dist-packages/cloudinit/helpers.py", line 187, in run
results = functor(*args)
File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 351, in handle
func=resize_devices, args=(resizer, devices))
File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2521, in log_time
ret = func(*args, **kwargs)
File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 298, in resize_devices
(old, new) = resizer.resize(disk, ptnum, blockdev)
File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 159, in resize
return (before, get_size(partdev))
File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 198, in get_size
fd = os.open(filename, os.O_RDONLY)
FileNotFoundError: [Errno 2] No such file or directory: '/dev/disk/by-partuuid/a5f2b49f-abd6-427f-bbc4-ba5559235cf3'
========================================
@rcj suggested this is a race with udev. This seems to only happen on
Cosmic and later.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1834875/+subscriptions
More information about the foundations-bugs
mailing list