Re: [RESEND BUGFIX PATCH 1/3] PCI/AER: fix pci_ops return NULL when hotplug a pci bus which was doing aer error inject

From: Chen Gong
Date: Mon Aug 27 2012 - 04:51:24 EST


On Sat, Aug 25, 2012 at 05:59:44PM +0800, Yijing Wang wrote:
> Date: Sat, 25 Aug 2012 17:59:44 +0800
> From: Yijing Wang <wangyijing@xxxxxxxxxx>
> To: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>, Rusty Russell
> <rusty@xxxxxxxxxxxxxxx>, Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
> CC: PCI <linux-pci@xxxxxxxxxxxxxxx>, Jiang Liu <liuj97@xxxxxxxxx>, Huang
> Ying <ying.huang@xxxxxxxxx>, Hanjun Guo <guohanjun@xxxxxxxxxx>,
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: [RESEND BUGFIX PATCH 1/3] PCI/AER: fix pci_ops return NULL when
> hotplug a pci bus which was doing aer error inject
> User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20120713
> Thunderbird/14.0
>
> When we inject AER errors into a target PCI device with the aer_inject module, the pci_ops of
> the PCI bus that the target device is on are replaced with pci_ops_aer. So if the target device
> is a bridge and we then hotplug the child bus that it bridges to, the child bus's pci_ops are
> set to pci_ops_aer too. Every access to a device on the child bus then results in a system
> panic, because the pci_ops lookup in pci_read_aer returns NULL.
> This patch fixes this.
>
> CallTrace:
> bash[5908]: NaT consumption 17179869216 [1]
> Modules linked in: aer_inject cpufreq_conservative cpufreq_userspace
> cpufreq_powersave acpi_cpufreq binfmt_misc fuse nls_iso8859_1 loop ipmi_si(+)
> ipmi_devintf ipmi_msghandler dm_mod ppdev iTCO_wdt iTCO_vendor_support sg igb
> parport_pc i2c_i801 mptctl i2c_core serio_raw hid_generic lpc_ich mfd_core
> parport button container usbhid hid uhci_hcd ehci_hcd usbcore usb_common
> sd_mod crc_t10dif ext3 mbcache jbd fan processor ide_pci_generic ide_core
> ata_piix libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal
> thermal_sys hwmon
>
[...]
>
> Signed-off-by: Yijing Wang <wangyijing@xxxxxxxxxx>
> Signed-off-by: Jiang Liu <liuj97@xxxxxxxxx>
> ---
> drivers/pci/pcie/aer/aer_inject.c | 21 +++++++++++++++++++++
> 1 files changed, 21 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer/aer_inject.c b/drivers/pci/pcie/aer/aer_inject.c
> index 5222986..fc28785 100644
> --- a/drivers/pci/pcie/aer/aer_inject.c
> +++ b/drivers/pci/pcie/aer/aer_inject.c
> @@ -109,6 +109,19 @@ static struct aer_error *__find_aer_error_by_dev(struct pci_dev *dev)
>  	return __find_aer_error((u16)domain, dev->bus->number, dev->devfn);
>  }
>
> +static bool pci_is_upstream_bus(struct pci_bus *bus, struct pci_bus *up_bus)
> +{
> +	struct pci_bus *pbus = bus->parent;
> +
> +	while (pbus) {
> +		if (pbus == up_bus)
> +			return true;
> +		pbus = pbus->parent;
> +	}
> +
> +	return false;
> +}
> +
>  /* inject_lock must be held before calling */
>  static struct pci_ops *__find_pci_bus_ops(struct pci_bus *bus)
>  {
> @@ -118,6 +131,13 @@ static struct pci_ops *__find_pci_bus_ops(struct pci_bus *bus)
>  		if (bus_ops->bus == bus)
>  			return bus_ops->ops;
>  	}
> +
> +	/* here can't find bus_ops, fall back to get bus_ops of upstream bus */
> +	list_for_each_entry(bus_ops, &pci_bus_ops_list, list) {
> +		if (pci_is_upstream_bus(bus, bus_ops->bus))
> +			return bus_ops->ops;
> +	}
> +
>  	return NULL;
>  }
>
>
At the least, whenever this still returns NULL, a proper check and protection are needed.
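
To make the two points in this thread concrete, here is a minimal userspace
sketch, not kernel code. It models the upstream-bus fallback from the quoted
patch and adds a NULL guard in the caller. The struct layouts, the fixed-size
table, and the names track_bus(), find_bus_ops(), guarded_read() and
stub_read() are all hypothetical stand-ins chosen for illustration; the real
aer_inject code uses a linked list protected by inject_lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct pci_ops {
	int (*read)(int where, int *val);
};

struct pci_bus {
	struct pci_bus *parent;	/* NULL for a root bus */
	int number;
};

/* One saved (bus, original ops) pair, like an entry of pci_bus_ops_list. */
struct pci_bus_ops {
	struct pci_bus *bus;
	struct pci_ops *ops;
};

#define MAX_TRACKED 4
static struct pci_bus_ops tracked[MAX_TRACKED];
static size_t n_tracked;

/* Remember the original ops of a bus (assumed helper, not in the patch). */
static void track_bus(struct pci_bus *bus, struct pci_ops *ops)
{
	assert(n_tracked < MAX_TRACKED);
	tracked[n_tracked].bus = bus;
	tracked[n_tracked].ops = ops;
	n_tracked++;
}

/* True if up_bus is a strict ancestor of bus, as in the quoted patch. */
static bool is_upstream_bus(const struct pci_bus *bus,
			    const struct pci_bus *up_bus)
{
	const struct pci_bus *pbus;

	for (pbus = bus->parent; pbus; pbus = pbus->parent)
		if (pbus == up_bus)
			return true;
	return false;
}

/* Exact match first, then fall back to a tracked upstream bus. */
static struct pci_ops *find_bus_ops(const struct pci_bus *bus)
{
	size_t i;

	for (i = 0; i < n_tracked; i++)
		if (tracked[i].bus == bus)
			return tracked[i].ops;

	for (i = 0; i < n_tracked; i++)
		if (is_upstream_bus(bus, tracked[i].bus))
			return tracked[i].ops;

	return NULL;
}

/* Caller-side protection: fail with an error instead of dereferencing NULL. */
static int guarded_read(const struct pci_bus *bus, int where, int *val)
{
	struct pci_ops *ops = find_bus_ops(bus);

	if (!ops || !ops->read)
		return -EINVAL;
	return ops->read(where, val);
}

/* Stub config read used only for this illustration. */
static int stub_read(int where, int *val)
{
	(void)where;
	*val = 0x1234;
	return 0;
}
```

In this model, a child or grandchild of a tracked bus resolves to the tracked
bus's ops, while a bus with no tracked ancestor makes guarded_read() return
-EINVAL rather than crash. Which error value the real callers should return is
an open choice here, not something the patch specifies.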
