Re: [PATCH v3 2/4] x86/PCI: Add $IRT PIRQ routing table support

From: Maciej W. Rozycki
Date: Wed Mar 16 2022 - 14:09:44 EST


On Tue, 15 Mar 2022, Dmitry Osipenko wrote:

> > Handle the $IRT PCI IRQ Routing Table format used by AMI for its BCP
> > (BIOS Configuration Program) external tool meant for tweaking BIOS
> > structures without the need to rebuild it from sources[1].
> >
> > The $IRT format was invented by AMI before Microsoft came up
> > with its $PIR format and a $IRT table is therefore there in some systems
> > that lack a $PIR table, such as the DataExpert EXP8449 mainboard based
> > on the ALi FinALi 486 chipset (M1489/M1487), which predates DMI 2.0 and
> > cannot therefore be easily identified at run time.
> >
> > Unlike with the $PIR format there is no alignment guarantee as to the
> > placement of the $IRT table, so scan the whole BIOS area bytewise.
[...]
> This patch broke crosvm using recent linux-next. The "ir = (struct
> irt_routing_table *)addr;" contains invalid pointer. Any ideas why?

This specific pointer refers to the BIOS area being iterated over:

	for (addr = (u8 *)__va(0xf0000);
	     addr < (u8 *)__va(0x100000);
	     addr++) {

and it is conceptually not new code, in that a piece similar to the one below:

	for (addr = (u8 *)__va(0xf0000);
	     addr < (u8 *)__va(0x100000);
	     addr += 16) {

used to be there before my change and even now it is executed earlier on
in `pirq_find_routing_table'.
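
For illustration only, here is a minimal user-space sketch of the same kind of
bytewise signature scan over a plain buffer. It is not the kernel code: the
function name `find_signature' and the `SIG_LEN' constant are hypothetical, and
the sketch only shows how the loop bound can be chosen so that a signature
read never crosses the end of the scanned area.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIG_LEN 4	/* Hypothetical: length of a signature such as "$IRT". */

/*
 * Return the offset of the first occurrence of sig in buf, or -1 if
 * there is none.  The scan advances one byte at a time, so no alignment
 * of the signature is assumed.
 */
static ptrdiff_t find_signature(const uint8_t *buf, size_t len,
				const char *sig)
{
	size_t off;

	if (len < SIG_LEN)
		return -1;
	/* Stop while SIG_LEN bytes still fit within the buffer. */
	for (off = 0; off <= len - SIG_LEN; off++) {
		if (memcmp(buf + off, sig, SIG_LEN) == 0)
			return (ptrdiff_t)off;
	}
	return -1;
}
```

A caller would pass the start and length of the area to search, and treat a
negative return value as "no table found".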

> PCI: Probing PCI hardware
> BUG: unable to handle page fault for address: ffffed1000020000
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 12fff4067 P4D 12fff4067 PUD 12fff3067 PMD 12fff2067 PTE 0
> Oops: 0000 [#1] PREEMPT SMP KASAN
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc7-next-20220310+ #226
> Hardware name: ChromiumOS crosvm, BIOS 0
> RIP: 0010:kasan_check_range+0xe6/0x1a0
> Code: 00 74 ee 48 89 c2 b8 01 00 00 00 48 85 d2 75 5d 5b 41 5c 41 5d 5d
> c3 48 85 d2 74 63 4c 01 e2 eb 09 48 83 c0 01 48 39 d0 74 55 <80> 38 00
> 74 f2 eb d2 41 bd 08 00 00 00 45 29 dd 4b 8d 54 25 00 eb

Thank you for your report and apologies for the trouble.

I don't know what a "ChromiumOS crosvm" is, but the mention of "Chromium"
indicates to me it is something reasonably recent that should be using
ACPI rather than legacy PCI IRQ routing, and even then it should be using
the standardised $PIR format rather than AMI's proprietary $IRT one. I am
more than surprised this code is active for x86-64 even, as this is solely
i386 legacy.

In any case we need to debug this and possibly work around it somehow, as
this BIOS is likely giving us rubbish information. Unfortunately without
access to your Linux build tree along with debug information I can do very
little. The faulting piece of code is as follows:

  21:	48 83 c0 01	add    $0x1,%rax
  25:	48 39 d0   	cmp    %rdx,%rax
  28:	74 55      	je     7f <foo+0x7f>
  2a:	80 38 00   	cmpb   $0x0,(%rax)
  2d:	74 f2      	je     21 <foo+0x21>

-- with the CMPB at 2a being the offender and further information required
as to what RAX holds at the moment.
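
As a rough cross-check only: the fault is in `kasan_check_range', so the
address reported is presumably a KASAN shadow address rather than the address
the PCI code itself touched. The sketch below maps a shadow address back,
assuming the usual x86-64 4-level defaults (shadow offset 0xdffffc0000000000,
scale shift 3, direct map at 0xffff888000000000, no memory layout
randomisation). None of these values is confirmed for the build in question,
and the helper and macro names are made up for this example.

```c
#include <stdint.h>

/* Assumed defaults, not confirmed for the reporter's configuration. */
#define KASAN_SHADOW_OFFSET_ASSUMED	 0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT_ASSUMED 3
#define PAGE_OFFSET_ASSUMED		 0xffff888000000000ULL

/* Map a KASAN shadow address back to the virtual address it covers. */
static uint64_t shadow_to_virt(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET_ASSUMED)
	       << KASAN_SHADOW_SCALE_SHIFT_ASSUMED;
}

/* Map a direct-map virtual address back to a physical address. */
static uint64_t virt_to_phys_assumed(uint64_t virt)
{
	return virt - PAGE_OFFSET_ASSUMED;
}
```

If those assumptions hold, ffffed1000020000 from the report maps to virtual
ffff888000100000, that is physical 0x100000, the exclusive upper bound of the
loops quoted earlier. That is only one possible reading and the debug log is
still needed.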

So as a first approximation I would like to see what your BIOS actually
tells Linux. Would you therefore please try the following debug patch,
boot with the `debug' kernel parameter and send me the resulting bootstrap
log?

Maciej

---
arch/x86/include/asm/pci_x86.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

linux-x86-pci-debug.diff
Index: linux-macro/arch/x86/include/asm/pci_x86.h
===================================================================
--- linux-macro.orig/arch/x86/include/asm/pci_x86.h
+++ linux-macro/arch/x86/include/asm/pci_x86.h
@@ -7,7 +7,7 @@

 #include <linux/ioport.h>

-#undef DEBUG
+#define DEBUG 1

 #ifdef DEBUG
 #define DBG(fmt, ...) printk(fmt, ##__VA_ARGS__)