Re: [PATCH 1/2] x86/hyperv: Expose an helper to map PCI interrupts

From: Stanislav Kinsburskii
Date: Fri Apr 14 2023 - 14:07:51 EST


On Fri, Apr 14, 2023 at 09:28:52AM +0200, Thomas Gleixner wrote:
> Stanislav!
>
> On Wed, Apr 12 2023 at 09:36, Stanislav Kinsburskii wrote:
> > On Wed, Apr 12, 2023 at 09:19:51AM -0700, Stanislav Kinsburskii wrote:
> >> > > + affinity = irq_data_get_effective_affinity_mask(data);
> >> > > + cpu = cpumask_first_and(affinity, cpu_online_mask);
> >> >
> >> > The effective affinity mask of MSI interrupts consists only of online
> >> > CPUs, to be accurate: it has exactly one online CPU set.
> >> >
> >> > But even if it would have only offline CPUs then the result would be:
> >> >
> >> > cpu = nr_cpu_ids
> >> >
> >> > which is definitely invalid. While a disabled vector targeted to an
> >> > offline CPU is not necessarily invalid.
> >
> > Although this patch only tosses the code and doens't make any functional
> > changes, I guess if the fix for the used cpu id is required, it has to
> > be in a separated patch.
>
> Correct, but if the interrupt _is_ masked at the MSI level then the
> hypervisor must not deliver an interrupt at all.
>
> The point is that it is valid to target a masked MSI entry to an offline
> CPU under the assumption that the hardware/emulation respects the
> masking. Whether that's a good idea or not is a different question.
>
> The kernel as of today does not do that. It targets unused but
> configured MSI[-x] entries towards MANAGED_IRQ_SHUTDOWN_VECTOR on CPU0
> for various reasons, one of them being paranoia.
>
> But in principle there is nothing wrong with that and it should either
> succeed or being rejected at the software level and not expose a
> completely invalid CPU number to the hypercall in the first place.
>
> So if you want to be defensive, then keep the _and(), but then check the
> result for being valid and emit something useful like a pr_warn_once()
> instead of blindly handing the invalid result to the hypercall and then
> have that reject it with some undecipherable error code.
>
> Actually it would not necessarily reach the hypercall because before
> that it dereferences cpumask_of(nr_cpu_ids) here:
>
> nr_bank = cpumask_to_vpset(&(intr_desc->target.vp_set), cpumask_of(cpu));
>
> and explode with a kernel pagefault. If not it will read some random
> adjacent data and try to create a vp_set from it. Neither of that is
> anywhere close to correct.
>

Thank you Thomas.
I sent a patch to address the problmes you highlighted:

"x86/hyperv: Fix IRQ effective cpu discovery for the interrupts unmasking"

I'll update this series after that patch is merged.

Thanks,
Stanislav

> Thanks,
>
> tglx