[Bug 1017715] Re: Serialization problem of udev events with DM_COOKIE set

Herton R. Krzesinski 1017715 at bugs.launchpad.net
Mon Jun 25 22:10:03 UTC 2012


** Patch added: "Patch for precise udev"
   https://bugs.launchpad.net/ubuntu/+source/udev/+bug/1017715/+attachment/3204171/+files/precise_udev.patch

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to udev in Ubuntu.
https://bugs.launchpad.net/bugs/1017715

Title:
  Serialization problem of udev events with DM_COOKIE set

Status in “udev” package in Ubuntu:
  In Progress

Bug description:
  Alex Lyakas reports:

  Hi Herton,

  we are doing initial testing of our software on stock ubuntu-precise
  3.2.0-23 - 36. It looks like we have identified a problem with the udevd
  code. Here are some details:
  We believe the problem was introduced by a fix to
  https://bugs.launchpad.net/ubuntu/+source/lvm2/+bug/802626. The fix
  introduced code like this:
  in event_queue_insert():
  /* run all events with a timeout set immediately, or in the case
   * it's a dm_cookie event being processed */
  if (udev_device_get_timeout(dev) > 0 ||
      udev_device_get_dm_cookie_set(dev)) {
      event_run(event, true);
      return 0;
  }
  Basically, what this fix does, if the udev event has a DM_COOKIE, then
  this event is dispatched immediately. Previously, this event was going
  through the event_queue_start(), which was checking that there are no
  in-flight events for the same device (or parent, or child etc.) by
  calling is_devpath_busy(). With the fix, this doesn’t happen and event
  is dispatched immediately.
  What we see that when a new device-mapper device is created in the
  system, an “add” event is fired, then “change” event comes in (that has
  a DM_COOKIE). As a result of the fix, the “change” processing is not
  delayed until the “add” processing completes. As a result, both events
  are handled concurrently by two different udev-workers. This causes
  several different unwanted effects that we observed.
  One such effect is in the attached log: dm-16 device is being created.
  “add” and “change” events are fired:
  May 24 20:26:37 vc-00-00-A-dev udevd[5859]: seq 100141 queued, 'add'
  'block'
  May 24 20:26:37 vc-00-00-A-dev udevd[5859]: seq 100142 queued, 'change'
  'block'
  May 24 20:26:37 vc-00-00-A-dev udevd[26633]: seq 100142 running
  The run in parallel and “change” processing starts first. Then “add”
  processing starts:
  May 24 20:26:37 vc-00-00-A-dev udevd[26544]: seq 100141 running
  “change” processing creates symbolic links
  May 24 20:26:37 vc-00-00-A-dev udevd[26633]: creating link '/dev/disk
  /by-id/dm-name-vpart-5555' to '/dev/dm-16'
  May 24 20:26:37 vc-00-00-A-dev udevd[26633]: creating symlink '/dev/disk
  /by-id/dm-name-vpart-5555' to '../../dm-16'
  May 24 20:26:37 vc-00-00-A-dev udevd[26633]: creating link
  '/dev/mapper/vpart-5555' to '/dev/dm-16'
  May 24 20:26:37 vc-00-00-A-dev udevd[26633]: creating symlink
  '/dev/mapper/vpart-5555' to '../dm-16'
  but then “add” processing removes them:
  May 24 20:26:37 vc-00-00-A-dev udevd[26544]: update old name, '/dev/disk
  /by-id/dm-name-vpart-5555' no longer belonging to
  '/devices/virtual/block/dm-16'
  May 24 20:26:37 vc-00-00-A-dev udevd[26544]: no reference left, remove
  '/dev/disk/by-id/dm-name-vpart-5555'
  May 24 20:26:37 vc-00-00-A-dev udevd[26544]: update old name,
  '/dev/mapper/vpart-5555' no longer belonging to
  '/devices/virtual/block/dm-16'
  May 24 20:26:37 vc-00-00-A-dev udevd[26544]: no reference left, remove
  '/dev/mapper/vpart-5555'
  As a result, there is no /dev/mapper/XXX symbolic link.
  We have also seen some other bad effects of this parallel processing
  like:
  # bad symbolic link is created in /dev/disk/by-path (this happens in
  “add” processing, when there is no DM_NAME property):
  ll /dev/disk/by-path:
  lrwxrwxrwx 1 root root 10 May 23 21:47 dm-name- -> ../../dm-4
  lrwxrwxrwx 1 root root 10 May 23 21:45 dm-name-ioerror -> ../../dm-0
  lrwxrwxrwx 1 root root 10 May 23 21:47 dm-name-ppart-100 -> ../../dm-5
  lrwxrwxrwx 1 root root 10 May 23 21:47 dm-name-ppart-103 -> ../../dm-3
  lrwxrwxrwx 1 root root 10 May 23 21:47 dm-name-ppart-105 -> ../../dm-2
  lrwxrwxrwx 1 root root 10 May 23 21:47 dm-name-ppart-108 -> ../../dm-1
  lrwxrwxrwx 1 root root 10 May 23 21:47 dm-name-ppart-109 -> ../../dm-6
  # devnode is created in /dev/mapper, instead of symbolic link:
  root at vc-00-00-A-dev:~# ll /dev/mapper/
  total 0
  drwxr-xr-x 2 root root 140 May 23 22:01 ./
  drwxr-xr-x 14 root root 9900 May 23 22:01 ../
  lrwxrwxrwx 1 root root 7 May 23 22:01 blabla -> ../dm-1
  crw------- 1 root root 10, 236 May 23 19:09 control
  lrwxrwxrwx 1 root root 7 May 23 21:53 ioerror -> ../dm-0
  lrwxrwxrwx 1 root root 7 May 23 22:01 ppart-4927362 -> ../dm-2
  brw-rw---- 1 root disk 252, 3 May 23 22:01 ppart-4927363  // This is a
  devnode, not symlink!
  and some others.
  We see this issue only when doing stress testing of the udev system,
  like creating and deleting devices, and then creating them again and
  deleting again etc.
  To resolve the issue, we disabled the code that handles the DM_COOKIE in
  udevd. Then we stopped seeing issues.
  All in all, our understanding is that two udev events for the same
  devpath should not be executing concurrently, which this fix violates
  for Device-Mapper devices. Also, this problem cannot be easily addressed
  by playing with udev rules, because the flaw is in the udevd code
  itself.

  Can you pls share your view on the problems we see?

  Thanks,
   Alex.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/udev/+bug/1017715/+subscriptions




More information about the foundations-bugs mailing list