[Bug 88746] Re: ehci_hcd module causes I/O errors in USB 2.0 devices

David Becker becker.david at hotmail.com
Sun Dec 21 10:54:08 UTC 2008


I seem to have solved (i.e., worked around) my previous issues. Note
that the "lost page write" errors I'm getting are likely due to the LVM
copy-on-write store I'm using. When the store overflows, all kinds of
strange things happen, but this is probably unrelated to the issues
going on here. "df" output also doesn't reflect the actual usage of the
store, so this error may occur when you're not expecting it, i.e. when
you think you have enough storage space but actually don't, because the
copy-on-write store has overflowed.
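
Since "df" does not show how full the store is, the fill level has to be
read from LVM itself. A minimal Python sketch, assuming the "lvs" tool
from LVM2 is installed and reports a "snap_percent" field; the function
names and the 80% threshold are illustrative only and not part of my
actual setup:

```python
import subprocess

def parse_lvs(text):
    """Parse 'name:percent' lines as printed by lvs; skip non-snapshot LVs."""
    usage = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(":")]
        if len(parts) != 2 or not parts[1]:
            continue  # empty percent column: not a snapshot
        # Some locales print a decimal comma; normalise it before parsing.
        usage[parts[0]] = float(parts[1].replace(",", "."))
    return usage

def nearly_full(usage, threshold=80.0):
    """Return names of snapshots at or above the (arbitrary) threshold."""
    return sorted(name for name, pct in usage.items() if pct >= threshold)

def read_snapshot_usage():
    """Run lvs (usually needs root). Assumes the snap_percent field exists."""
    out = subprocess.check_output(
        ["lvs", "--noheadings", "--separator", ":",
         "-o", "lv_name,snap_percent"],
        universal_newlines=True,
    )
    return parse_lvs(out)
```

Calling nearly_full(read_snapshot_usage()) before a large write would at
least warn that the store is about to overflow.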

Anyway, I've installed Linux to a USB drive without the copy-on-write
store and I no longer have any disconnect, lost page write or other
instability problems. Note that the USB drive and the computer(s) are
the same pieces of hardware that were previously producing the errors.
All AMD/ATI hardware.

But I also connected the same USB drive to a ProLiant, which has
(mostly) Intel hardware. Low-speed USB is uhci based, while high-speed
is ehci based.

Now I also got several errors with the ProLiant (ehci), which leads me
to believe that the errors could have something to do with deficiencies
in either hald or dbus communication.

I was preparing another live-cd-on-a-usb-stick. This process involves
mounting the USB drive, then mounting additional (tmpfs) filesystems to
the USB drive, then preparing the drive (formatting), then populating
the drive. The copy-on-write store is also involved at this stage.

Now during the preparation stage, with the USB drive mounted, I decided
to abort the process. Here's where the errors started occurring. I hit
ctrl-c on the process which was creating the livecd-usb and started
receiving disconnect errors. Note that a ctrl-c could be analogous to a
USB disconnect, although there are also significant differences (since
the signals originate from different sources). I manually unmounted the
filesystems involved in the preparation process. One would expect that I
would then be able to (physically) remove the drive, reinsert the drive
and then start the process over, but that wasn't the case. When I
reinserted the drive, I couldn't access the drive anymore. There were no
real error messages; it seemed as if the port was unavailable. I did
what I normally never have to do (with this machine): I rebooted.

After the reboot, I couldn't use that port anymore. I kept getting
disconnect errors. Things were going from bad to worse and I rebooted
the machine again. I then tried the same process on the front-side ports
(I was using a port on the back previously). I had no problem performing
the aforementioned process to completion on a front-side port.

I now have this funny feeling that something is going wrong with the
mount-state of the filesystems involved. This really reminds me of
removing a floppy drive on Sun workstations without having invoked the
"eject" command (which unmounts the floppy prior to physically ejecting
the floppy disk).
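
To check that suspicion, one could list everything still mounted at or
below the drive's mount point before pulling the drive. A rough Python
sketch, assuming the kernel's view in /proc/mounts is what matters; the
mount point /mnt/usb in the usage note is a made-up example, not the
path I actually used:

```python
def mounts_under(mounts_text, root):
    """Return mount points at or below root, deepest first (umount order)."""
    root = root.rstrip("/") or "/"
    prefix = root.rstrip("/") + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # /proc/mounts escapes spaces in paths as \040
        mount_point = fields[1].replace("\\040", " ")
        if mount_point == root or mount_point.startswith(prefix):
            found.append(mount_point)
    # Deeper paths contain more slashes and must be unmounted first.
    return sorted(found, key=lambda p: p.count("/"), reverse=True)

def current_mounts_under(root):
    """Same check against the live mount table."""
    with open("/proc/mounts") as f:
        return mounts_under(f.read(), root)
```

If current_mounts_under("/mnt/usb") still returns anything after the
manual unmounts, the drive is not yet safe to remove.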

It would seem that "disconnects" are quite normal on a USB bus. It does
however get very tricky with the dependencies once a disconnect occurs
(is it intermittent or permanent?). This likely differs between device
types (a printer daemon and a filesystem may respond differently), but
it seems as if a discrepancy arises between the device connection state
and how the dependent processes/modules (i.e. the mounting/filesystem or
printer subsystem) perceive that state. It wouldn't surprise me if the
connection state doesn't correspond with the actual device state. From
that moment on it's fubar until by chance the actual device state
corresponds with the perceived state (within the relevant
modules/processes).
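
One way to look for such a mismatch is to compare the mount table with
the device nodes that actually exist. A minimal Python sketch, assuming
udev removes the /dev node when the device disconnects (which I have not
verified for every case); the device names in the usage note are
hypothetical:

```python
import os

def stale_mounts(mounts_text, device_exists=os.path.exists):
    """Return (device, mount_point) pairs whose /dev node no longer exists."""
    stale = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].startswith("/dev/"):
            continue  # skip tmpfs, proc and other non-device mounts
        if not device_exists(fields[0]):
            stale.append((fields[0], fields[1]))
    return stale
```

Feeding it the contents of /proc/mounts after a disconnect would show
e.g. ("/dev/sdb1", "/mnt/usb") if the filesystem layer still believes in
a device that is gone.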

FWIW, possibly a long shot, I'd ask the people who are receiving errors
during large transfers to disable hald prior to initiating the transfer.
That is, have the device (auto) mounted, then disable hald, then start
the transfer. The same thing could be causing printer errors and even
usb wireless devices to reach a state of no go.

Disabling hald may obviously defeat your (other) purposes, but it could
isolate the problem (or possibly just rule out hald's involvement in
this ordeal).
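
To be sure hald really is out of the picture before starting the
transfer, one could scan the process table. A small Python sketch,
assuming a Linux /proc filesystem; the helper name is_running is mine
and the check only matches the executable's basename:

```python
import os

def is_running(name, proc_root="/proc"):
    """True if some process under proc_root has `name` as argv[0] basename."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                argv0 = f.read().split(b"\0")[0].decode(errors="replace")
        except (IOError, OSError):
            continue  # process exited while we were looking
        if os.path.basename(argv0) == name:
            return True
    return False
```

If is_running("hald") still returns True, the test says nothing about
hald's involvement.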

Hope this helps,

David

-- 
ehci_hcd module causes I/O errors in USB 2.0 devices
https://bugs.launchpad.net/bugs/88746
You received this bug notification because you are a member of Kernel
Bugs, which is subscribed to Linux.



