Re: Serial related oops

From: Michael K. Edwards
Date: Tue Feb 20 2007 - 00:00:29 EST


On 2/19/07, Robert Hancock <hancockr@xxxxxxx> wrote:
How do you propose to do this? Drivers can get loaded and unloaded at any
time. If you have a device generating spurious interrupts on a shared IRQ
line, there's no way you can use any device on that line until that interrupt
is shut off. Requiring all drivers to be loaded before any of them can use
interrupts is simply not practical.

Of course not. But dealing with a stuck IRQ line by locking up isn't
very practical either. IRQ sharing is stupid yet universal, and it
happens all the time that a device that has been sitting there minding
its own business since power-up, with no driver to drive it, decides
to assert its IRQ. Maybe it just got hot-plugged, maybe it just got
its first dribble of input, whatever. Other devices on the shared IRQ
are screwed (or at least semi-screwed; you could periodically
re-enable the IRQ long enough to make a run through the ISR chain
servicing the other devices). But if you run "lspci" (or whatever)
and load a driver for the newly awake device, everything goes back to
normal.

For devices compiled into the kernel, you shouldn't have to play these
games. If, that is, there were three stages of driver initialization,
called in successive passes:

1) installing an ISR with a fallback STFU path (device-specific but
not dependent on any particular pre-existing chip state), quiescing it
if you know how and registering for the IRQ if you know which it is;

2) going through the chip's soft-reset-wake-up-shut-up cycle and
populating driver data structures, possibly correcting the IRQ
registration along the way;

3) ready-as-we'll-ever-be, bring on the interrupts.

You probably can't help enabling the IRQ briefly during 2) so that you
can do tests like Russell's loopback. But it's a needless gamble to
do that without doing 1) for all compiled-in drivers and platform
devices first, in a previous discovery pass. And it's stupid to do 3)
in the same pass as 2), because you'll just open race condition
windows that will only bite when an all-the-way-live device raises its
IRQ at a moment when the writer of the wake-up-shut-up code wasn't
expecting it. All code has bugs and they're only a problem when they
bite in the field.

If a system has a device that generates interrupts before they're enabled,
and the firmware doesn't fix it, then some platform-specific quirk has to
handle it and shut off the interrupt before it allows any interrupts
to be enabled. (We have such a quirk for certain network controllers where
the boot ROM can leave the chip generating interrupts on bootup.)

You don't need quirks if your driver initialization is bomb-proof to
begin with. Devices that are quiet on power-up are purely
coincidental and should not be construed.

Cheers,
- Michael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/