Re: [PATCH] perf/x86/intel/uncore: Fix NULL pointer dereference issue in upi_fill_topology()

From: Alexander Antonov
Date: Mon Nov 20 2023 - 14:49:30 EST



On 11/15/2023 8:00 PM, Liang, Kan wrote:

On 2023-11-15 10:13 a.m., alexander.antonov@xxxxxxxxxxxxxxx wrote:
From: Alexander Antonov <alexander.antonov@xxxxxxxxxxxxxxx>

A NULL pointer dereference occurs in upi_fill_topology() when one of the
sockets on the system is disabled.

For example, if you disable the 2nd socket on a 4-socket system then
uncore_max_dies() returns 3 and inside pmu_alloc_topology() memory will
be allocated only for 3 sockets and stored in type->topology.
In discover_upi_topology() memory is accessed by socket id from CPUNODEID
registers which contain physical ids (from 0 to 3) and on the line:

    upi = &type->topology[nid][idx];

an out-of-bounds access occurs, and the resulting 'upi' pointer is passed to
upi_fill_topology(), where it is dereferenced.

To avoid this issue, convert the physical socket id to a logical socket id
in discover_upi_topology() before accessing memory.
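
The index mismatch can be illustrated outside the kernel. The following is
a user-space sketch only, not kernel code: present[], num_logical_pkgs() and
phys_to_logical() are hypothetical stand-ins for the platform state,
uncore_max_dies() and topology_phys_to_logical_pkg(). It assumes a 4-socket
system with physical socket 1 disabled and logical ids numbered densely in
ascending physical order, which the real kernel does not guarantee.

```c
#define MAX_PHYS_PKGS 4

/* Hypothetical platform state: physical socket 1 is disabled. */
static const int present[MAX_PHYS_PKGS] = { 1, 0, 1, 1 };

/* Number of populated sockets; stands in for uncore_max_dies(). */
static int num_logical_pkgs(void)
{
	int i, n = 0;

	for (i = 0; i < MAX_PHYS_PKGS; i++)
		n += present[i];
	return n;
}

/*
 * Stand-in for topology_phys_to_logical_pkg(): returns the dense logical
 * index of a physical socket, or -1 if the socket is absent or out of range.
 */
static int phys_to_logical(int phys)
{
	int i, logical = 0;

	if (phys < 0 || phys >= MAX_PHYS_PKGS || !present[phys])
		return -1;
	for (i = 0; i < phys; i++)
		logical += present[i];
	return logical;
}
```

With these assumptions, physical id 3 would index past an array sized by
num_logical_pkgs(), while phys_to_logical(3) stays within it.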

Fixes: f680b6e6062e ("perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server")
Reported-by: Kyle Meyer <kyle.meyer@xxxxxxx>
Tested-by: Kyle Meyer <kyle.meyer@xxxxxxx>
Signed-off-by: Alexander Antonov <alexander.antonov@xxxxxxxxxxxxxxx>
---
arch/x86/events/intel/uncore_snbep.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 8250f0f59c2b..49bc27ab26ad 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -5596,7 +5596,7 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
struct pci_dev *ubox = NULL;
struct pci_dev *dev = NULL;
u32 nid, gid;
- int i, idx, ret = -EPERM;
+ int i, idx, lgc_pkg, ret = -EPERM;
struct intel_uncore_topology *upi;
unsigned int devfn;
@@ -5614,8 +5614,13 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
for (i = 0; i < 8; i++) {
if (nid != GIDNIDMAP(gid, i))
continue;
+ lgc_pkg = topology_phys_to_logical_pkg(i);
+ if (lgc_pkg < 0) {
+ ret = -EPERM;
+ goto err;
+ }

snbep_pci2phy_map_init() contains similar code to find the logical die id.
Can we factor out a common function for both of them?

Thanks,
Kan

Hi Kan,

Thank you for your comment.
Yes, I think we can factor out the common loop where GIDNIDMAP is checked.
But snbep_pci2phy_map_init() uses a slightly different procedure, which
also does the following:

if (topology_max_die_per_package() > 1)
    die_id = i;

I think that keeping this code, at least in our case, could lead to the
same issue we are trying to fix. But of course we could parameterize
this check.
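
For illustration, here is a minimal user-space sketch of what such a
parameterized helper might look like. Everything in it is hypothetical and
not part of the patch: the names gidnid_to_die_id(), use_raw_index and
sample_to_logical() are made up, and GIDNIDMAP is assumed to extract a
3-bit node id per slot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: slot 'id' of the mapping register holds a 3-bit node id. */
#define GIDNIDMAP(config, id) (((config) >> (3 * (id))) & 0x7)

/* Hypothetical callback standing in for topology_phys_to_logical_pkg(). */
typedef int (*phys_to_logical_fn)(int phys);

/* Sample translation: 4 sockets, physical socket 1 disabled. */
static int sample_to_logical(int phys)
{
	static const int map[4] = { 0, -1, 1, 2 };

	if (phys < 0 || phys >= 4)
		return -1;
	return map[phys];
}

/*
 * Hypothetical shared helper: find the slot of 'gid' whose node id equals
 * 'nid'. If 'use_raw_index' is true, return the slot index unchanged (the
 * multi-die branch of snbep_pci2phy_map_init()); otherwise translate it
 * with 'to_logical'. Returns a negative value if no slot matches or the
 * translation fails.
 */
static int gidnid_to_die_id(uint32_t nid, uint32_t gid, bool use_raw_index,
			    phys_to_logical_fn to_logical)
{
	int i;

	for (i = 0; i < 8; i++) {
		if (nid != GIDNIDMAP(gid, i))
			continue;
		if (use_raw_index)
			return i;
		return to_logical(i);
	}
	return -1;
}
```

In this sketch discover_upi_topology() would always pass use_raw_index as
false, so the result can be used directly as an index into type->topology.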

What do you think?

Thanks,
Alexander

for (idx = 0; idx < type->num_boxes; idx++) {
- upi = &type->topology[nid][idx];
+ upi = &type->topology[lgc_pkg][idx];
devfn = PCI_DEVFN(dev_link0 + idx, ICX_UPI_REGS_ADDR_FUNCTION);
dev = pci_get_domain_bus_and_slot(pci_domain_nr(ubox->bus),
ubox->bus->number,
@@ -5626,6 +5631,7 @@ static int discover_upi_topology(struct intel_uncore_type *type, int ubox_did, i
goto err;
}
}
+ break;
}
}
err:

base-commit: 9bacdd8996c77c42ca004440be610692275ff9d0