[Bug 88746] Re: ehci_hcd module causes I/O errors in USB 2.0 devices

David Becker becker.david at hotmail.com
Sun Dec 21 15:30:10 UTC 2008

Sorry for the length ;)

From Wikipedia: "On Linux systems the kernel calls out to udev which in
turn provides notifications to HAL through a standard Unix domain socket
whenever a device plugs in."

So there's communication triggered by the device connecting, which is
relayed through udev on to hald. The Wikipedia page doesn't mention
this, but I'm also assuming communication when the device is unplugged.

Let's say a device plugs in. The kernel notices this and triggers something
in udev, which is relayed to hald, which looks up the device and mounts it.

Let's say a (previously connected) device unplugs. What would happen if
the (unplug) message doesn't arrive at hald?

Let's say a device plugs in, and then very quickly unplugs and repeats
this over and over.

It's possible that hald at one point thinks the device is plugged in
(thus available) while it's not or vice versa.

A (plug/unplug) message may get lost. If hald (or udev or even the
kernel) doesn't block pending signals correctly, or ignores a signal
when it should be blocking it, then a message may get lost.
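
To make the blocking-versus-ignoring distinction concrete, here is a
small userspace sketch in Python. It is not taken from hald, udev or the
kernel; SIGUSR1, on_usr1 and send_while are just names I picked for the
illustration, and it assumes a Unix system with the code running in the
main thread. A signal sent while it is blocked stays pending and is
delivered once unblocked; a signal sent while it is ignored is simply
discarded.

```python
import os
import signal

received = []  # signal numbers seen by the handler


def on_usr1(signum, frame):
    """Hypothetical handler: just record that the signal arrived."""
    received.append(signum)


def send_while(mode):
    """Send SIGUSR1 to ourselves while it is 'blocked' or 'ignored'.

    Returns how many times the handler ran afterwards.
    """
    received.clear()
    signal.signal(signal.SIGUSR1, on_usr1)
    if mode == "blocked":
        # Blocked: the signal is kept pending until we unblock it.
        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
        os.kill(os.getpid(), signal.SIGUSR1)
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
    else:
        # Ignored: the signal is thrown away, nothing stays pending.
        signal.signal(signal.SIGUSR1, signal.SIG_IGN)
        os.kill(os.getpid(), signal.SIGUSR1)
        signal.signal(signal.SIGUSR1, on_usr1)
    return len(received)
```

Calling send_while("blocked") should report one delivery, while
send_while("ignored") should report none, which is the kind of silent
loss I mean above.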

A lost message would quickly result in a mismatch between the actual
device state and what hald (or udev or the kernel) considers to be the
(actual) state.
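
A toy model of that mismatch, in Python. This is only a sketch of the
reasoning, not how hald actually tracks devices; the Daemon class and
the replay function are made up for the illustration, and the "lost"
messages are chosen by hand rather than produced by a real race.

```python
from dataclasses import dataclass


@dataclass
class Daemon:
    """Toy stand-in for hald: remembers whether it thinks the device is there."""
    believes_present: bool = False

    def handle(self, event: str) -> None:
        if event == "plug":
            self.believes_present = True
        elif event == "unplug":
            self.believes_present = False


def replay(events, dropped_indices=()):
    """Replay plug/unplug events; events at dropped_indices never reach the daemon.

    Returns (actual_state, believed_state) after the last event.
    """
    daemon = Daemon()
    actually_present = False
    for i, event in enumerate(events):
        actually_present = (event == "plug")  # the real bus state always changes
        if i in dropped_indices:
            continue  # message lost on the way to the daemon
        daemon.handle(event)
    return actually_present, daemon.believes_present
```

With events = ["plug", "unplug", "plug", "unplug"], replay(events) ends
with both states agreeing, but replay(events, {3}) leaves the daemon
believing the device is still plugged in after it has gone.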

- Disconnects are normal events on a USB bus.
- Windows (tends to) behave differently when filesystems are suddenly disconnected.
- These errors occur on various distros.
- Issue seems to have started from 2.6.21 (or .23)
- Kernel developers apparently don't believe it has anything to do with the kernel
- Issue doesn't (seem to) occur with ohci, thus the source is possibly a race condition.
- Previous reports of similar issues were somewhat resolved by introducing a usleep call in the kernel (which probably just delays the occurrence of the race condition).
- Lowering the potential throughput (decreasing max_sectors) seems to improve the situation, but doesn't really solve it (= postponement of the race condition).

What is thus involved with mounting? kernel, udev, hald and possibly

Just a hunch.


