Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

From: David P. Reed
Date: Tue Jan 01 2008 - 10:59:20 EST


Alan Cox wrote:
responds to reads differently than "unused" ports. In particular, an inb takes 1/2 the elapsed time compared to a read to "known" unused port 0xed - 792 tsc ticks for port 80 compared to about 1450 tsc ticks for port 0xed and other unused ports (tsc at 800 MHz).

Well at least we know where the port is now - thats too fast for an LPC
bus device, so it must be an SMI trap.

Only easy way to find out is to use the debugging event counters and see
how many instruction cycles are issued as part of the 0x80 port. If its
suprisingly high then you've got a firmware bug and can go spank HP.

Alan, thank you for the pointers. I have been doing variations on this testing theme for a while - I get intrigued by a good debugging challenge, and after all it's my machine...

Two relevant new data points, and then some more suggestions:

1. It appears to be a real port. SMI traps are not happening in the normal outb to 80. Hundreds of them execute perfectly with the expected instruction counts. If I can trace the particular event that creates the hard freeze (getting really creative, here) and stop before the freeze disables the entire computer, I will. That may be an SMI, or perhaps any other kind of interrupt or exception. Maybe someone knows how to safely trace through an impending SMI while doing printk's or something?

2. It appears to be the standard POST diagnostic port. On a whim, I disassembled my DSDT code, and studied it more closely. It turns out that there are a bunch of "Store(..., DBUG)" instructions scattered throughout, and when you look at what DBUG is defined as, it is defined as an IO Port at IO address DBGP, which is a 1-byte value = 0x80. So the ACPI BIOS thinks it has something to do with debugging. There's a little strangeness here, however, because the value sent to the port occasionally has something to do with arguments to the ACPI operations relating to sleep and wakeup ... could just be that those arguments are distinctive.

In thinking about this, I recognize a couple of things. ACPI is telling us something when it declares a reference to port 80 in its code. It's not telling us the function of this port on this machine, but it is telling us that it is being used by the BIOS. This could be a reason to put out a printk warning message... 'warning: port 80 is used by ACPI BIOS - if you are experiencing problems, you might try an alternate means of iodelay.'

Second, it seems likely that there are one of two possible reasons that the port 80 writes cause hang/freezes:

1. buffer overflow in such a device.

2. there is some "meaning" to certain byte values being written (the _PTS and _WAK use of arguments that come from callers to store into port 80 makes me suspicious.) That might mean that the freeze happens only when certain values are written, or when they are written closely in time to some other action - being used to communicate something to the SMM code). If there is some race in when Linux's port 80 writes happen that happen to change the meaning of a request to the hardware or to SMM, then we could be rarely stepping on



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/