[Jaunty] [Karmic] Fix hostap interrupt handler oops on bootup

Colin Ian King colin.king at canonical.com
Wed Aug 19 17:11:12 UTC 2009


I don't want to make a meal out of this one, but see my comments below.

On Wed, 2009-08-19 at 09:42 -0600, Tim Gardner wrote:
> Colin Ian King wrote:
> > https://bugs.launchpad.net/ubuntu/+bug/254837
> > 
> > SRU Justification:
> > 
> > Impact: Booting with the Senao NL-2511CD (PRISM II compatible) wireless
> > card can generate a kernel oops in the hostap interrupt handler.
> > 
> > Spurious shared interrupts or early probing interrupts can cause the
> > hostap interrupt handler to oops before the driver has fully configured
> > the IO base port addresses. In some cases the oops occurs because
> > the hardware shares an interrupt line; in other cases it is due to a
> > race condition between probing for the hardware and configuring
> > the IO base port. The latter occurs because the probing is required to
> > determine the hardware port address which is only determined when the
> > probe can interrupt the hardware (catch 22).
> > 
> > Fix: This patch catches this pre-configured condition in the interrupt
> > handler to avoid the oops.
> > 
> > Testcase: Without the patch a kernel oops occurs on boot when the card
> > is installed. With the patch, there is no kernel oops and the wireless
> > card works.
> > 
> > I originally started debugging this with a user on a Jaunty kernel, and
> > now they have moved to Karmic. This patch has been tested on Karmic
> > by Dan Taylor Jr. and the patch will fix the problem for Jaunty too,
> > c.f. https://bugs.launchpad.net/ubuntu/+bug/254837/comments/56
> > 
> > Attached: The patch.
> > 
> > 
> Does this really handle the case where the IRQ is shared? For example,
> if hostap is chained in the front of the IRQ handlers list, it will
> return IRQ_HANDLED every time. The next device in the chain will never
> get an opportunity to service its devices and drive the IRQ level down.

Good point Tim. However, looking at handle_IRQ_event() in
kernel/irq/handle.c it appears that all the handlers on the interrupt
line get handled, so from my (possibly erroneous) understanding
returning IRQ_HANDLED does not stop handlers further down the chain from
being executed:

        do {
                ret = action->handler(irq, action->dev_id);
                if (ret == IRQ_HANDLED)
                        status |= action->flags;
                retval |= ret;
                action = action->next;
        } while (action);

...am I missing something more fundamental than that? Any views?
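
To make that reading of the loop concrete, here is a minimal user-space sketch. It is not kernel code: the types are simplified stand-ins, and count_and_claim(), run_chain() and second_handler_calls() are hypothetical names invented for the illustration. It only shows that, with a loop of this shape, a first handler that always returns IRQ_HANDLED does not prevent the next handler on the line from running.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

struct irqaction {
        irqreturn_t (*handler)(int irq, void *dev_id);
        void *dev_id;
        struct irqaction *next;
};

/* Hypothetical handler: counts its invocations and always claims the IRQ. */
static irqreturn_t count_and_claim(int irq, void *dev_id)
{
        (void)irq;
        ++*(int *)dev_id;
        return IRQ_HANDLED;
}

/* Same shape as the quoted loop: every action runs, results are ORed. */
static int run_chain(int irq, struct irqaction *action)
{
        int retval = 0;

        assert(action != NULL);
        do {
                retval |= action->handler(irq, action->dev_id);
                action = action->next;
        } while (action);

        return retval;
}

/* Returns how often the second handler ran after n interrupts. */
static int second_handler_calls(int n)
{
        int first = 0, second = 0;
        struct irqaction b = { count_and_claim, &second, NULL };
        struct irqaction a = { count_and_claim, &first, &b };

        for (int i = 0; i < n; i++)
                run_chain(9, &a);

        return second;
}
```

Calling second_handler_calls(5) reports five invocations of the second handler, even though the first handler claimed every interrupt.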

> Maybe you need some kind of counter that indicates "I've been here N
> times, so maybe it's not my IRQ to dismiss".

The correct choice of N is the moot point.
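
For reference, the counter idea could look roughly like the sketch below. This is not the hostap patch: struct fake_dev, its fields, fake_interrupt() and the value of EARLY_IRQ_LIMIT are all made up for illustration, and the limit of 16 is arbitrary, which is exactly the problem with picking N.

```c
/* Simplified stand-in for the kernel type; not the real definition. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

/* N: arbitrary value chosen for the illustration only. */
#define EARLY_IRQ_LIMIT 16

/* Hypothetical device state; base_addr is 0 until the IO base is set up. */
struct fake_dev {
        unsigned long base_addr;
        unsigned int early_irqs;
};

static irqreturn_t fake_interrupt(struct fake_dev *dev)
{
        if (dev->base_addr == 0) {
                /* Not configured yet: claim the first N early interrupts... */
                if (dev->early_irqs < EARLY_IRQ_LIMIT) {
                        dev->early_irqs++;
                        return IRQ_HANDLED;
                }
                /* ...then assume the IRQ belongs to another device. */
                return IRQ_NONE;
        }

        /* A real handler would query the hardware here before claiming. */
        dev->early_irqs = 0;
        return IRQ_HANDLED;
}
```

With an unconfigured device the first 16 calls return IRQ_HANDLED and every later call returns IRQ_NONE, until base_addr is set.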

Well, the other strategy is as follows:

For early interrupts when the IO base address has not yet been
configured, the H/W is either generating a spurious interrupt from the
probe or from early configuration of the H/W, so we can skip it and return
IRQ_NONE without any problem.

For other occasions, it's from a device sharing the same IRQ line, so
returning IRQ_NONE should be used.

My concern is that too many IRQ_NONEs may make the kernel block that IRQ
since it thinks it's being raised spuriously.
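
To put a rough shape on that concern, here is a small user-space model of that kind of spurious-interrupt accounting. It is a sketch only: the window and threshold values are my recollection of kernel/irq/spurious.c and should be treated as assumptions, and struct irq_stats, account_irq() and line_gets_disabled() are hypothetical names. Note that "handled" here means the ORed result of all handlers on the line, so an IRQ_NONE from one driver only counts against the line if nobody else claims the interrupt either.

```c
#include <stdbool.h>

/* Assumed values; check spurious.c for the kernel in question. */
#define WINDOW     100000u      /* interrupts per accounting window */
#define THRESHOLD   99900u      /* unhandled count that disables the line */

struct irq_stats {
        unsigned int count;
        unsigned int unhandled;
        bool disabled;
};

/* Account for one interrupt; handled is the ORed result of all handlers. */
static void account_irq(struct irq_stats *s, bool handled)
{
        if (s->disabled)
                return;
        if (!handled)
                s->unhandled++;
        if (++s->count < WINDOW)
                return;

        /* End of window: disable the line if nearly nothing was handled. */
        if (s->unhandled > THRESHOLD)
                s->disabled = true;
        s->count = 0;
        s->unhandled = 0;
}

/*
 * Simulate total interrupts where one in every period is handled
 * (period == 0 means none are handled). Returns true if the line
 * ends up disabled.
 */
static bool line_gets_disabled(unsigned int total, unsigned int period)
{
        struct irq_stats s = { 0, 0, false };

        for (unsigned int i = 0; i < total; i++)
                account_irq(&s, period != 0 && i % period == 0);

        return s.disabled;
}
```

Under these assumed numbers, a line only gets disabled when almost every interrupt in a full window goes unclaimed by all handlers; a burst of early IRQ_NONEs during probing is far below that.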

Either way, IRQ_NONE is probably the more sensible choice, but IRQ_HANDLED won't
break anything. Comments?

> rtg
> -- 
> Tim Gardner tim.gardner at canonical.com
