Re: [PATCH] Allocate DMAR fault interrupts locally

From: Thomas Gleixner
Date: Sun Mar 24 2024 - 16:05:54 EST


Dimitri!

On Mon, Mar 11 2024 at 15:38, Dimitri Sivanich wrote:
> On Thu, Feb 29, 2024 at 11:18:37PM +0100, Thomas Gleixner wrote:
>> What you really want is a cpu hotplug state in the CPUHP_BP_PREPARE_DYN
>> space which enables the interrupt for the node _before_ the first AP of
>> the node is brought up. That will solve the problem nicely w/o any of
>> the above issues.
>>
>
> Initially this sounds like a good approach. As things currently stand, however,
> there are (at least) several problems with attempting to allocate interrupts on
> cpus that are not running yet via the existing dmar_set_interrupt path.
>
> - The code relies on node_to_cpumask_map (cpumask_of_node()), which has been
> allocated, but not populated at the CPUHP_BP_PREPARE_DYN stage.
>
> - The irq_matrix cpumaps do not indicate being online or initialized yet, except
> for the boot cpu instance, of course.
>
> So things still revert to boot cpu allocation, until we exhaust the
> vectors.

I thought about the following:

CPUHP_BP_PREPARE_DYN allocates the hardware interrupt on the control
CPU (the boot CPU during early boot).

CPUHP_AP_ONLINE_DYN moves it over to the AP. This needs to set
affinity and then retrigger the interrupt so that the horrible
non-remapped MSI migration logic is invoked.

However, that does not work for parallel bringup, because the prepare
stage is then invoked for all CPUs before any of them reaches the online
phase, which obviously ends up with the same vector exhaustion problem
on the boot CPU.
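
To illustrate the ordering issue, here is a small user-space model, not
kernel code. It only counts how many fault interrupt vectors sit on the
boot CPU at any one time when a prepare-stage callback allocates there
and an online-stage callback moves the interrupt to the AP. All names
(bringup_model, prepare_cb, online_cb, peak_serial, peak_parallel) are
made up for this sketch, and it assumes one fault interrupt per node, a
fixed number of CPUs per node, and CPU 0 as the boot CPU.

```c
#include <assert.h>

#define MAX_NODES 256
#define BOOT_CPU  0

/* Hypothetical model state, not a kernel data structure. */
struct bringup_model {
	int nr_cpus;
	int cpus_per_node;
	int irq_cpu[MAX_NODES];   /* CPU targeted by the node's irq, -1 if none */
	int vectors_on_boot_cpu;  /* vectors currently parked on the boot CPU */
	int peak_on_boot_cpu;     /* highest value seen so far */
};

static void model_init(struct bringup_model *m, int nr_cpus, int cpus_per_node)
{
	int node;

	assert(cpus_per_node > 0 && nr_cpus > 0);
	assert(nr_cpus % cpus_per_node == 0);
	assert(nr_cpus / cpus_per_node <= MAX_NODES);

	m->nr_cpus = nr_cpus;
	m->cpus_per_node = cpus_per_node;
	m->vectors_on_boot_cpu = 0;
	m->peak_on_boot_cpu = 0;
	for (node = 0; node < MAX_NODES; node++)
		m->irq_cpu[node] = -1;
}

/* Models the prepare stage: runs on the control CPU, so allocate there. */
static void prepare_cb(struct bringup_model *m, int cpu)
{
	int node = cpu / m->cpus_per_node;

	/* The boot CPU's own node is not modelled; skip allocated nodes. */
	if (node == 0 || m->irq_cpu[node] != -1)
		return;

	m->irq_cpu[node] = BOOT_CPU;
	m->vectors_on_boot_cpu++;
	if (m->vectors_on_boot_cpu > m->peak_on_boot_cpu)
		m->peak_on_boot_cpu = m->vectors_on_boot_cpu;
}

/* Models the online stage: the first AP of a node takes over the irq. */
static void online_cb(struct bringup_model *m, int cpu)
{
	int node = cpu / m->cpus_per_node;

	if (node == 0 || m->irq_cpu[node] != BOOT_CPU)
		return;

	m->irq_cpu[node] = cpu;
	m->vectors_on_boot_cpu--;
}

/* Serial bringup: each AP runs prepare and online before the next starts. */
int peak_serial(int nr_cpus, int cpus_per_node)
{
	struct bringup_model m;
	int cpu;

	model_init(&m, nr_cpus, cpus_per_node);
	for (cpu = 1; cpu < nr_cpus; cpu++) {
		prepare_cb(&m, cpu);
		online_cb(&m, cpu);
	}
	return m.peak_on_boot_cpu;
}

/* Parallel bringup: prepare runs for all APs before any goes online. */
int peak_parallel(int nr_cpus, int cpus_per_node)
{
	struct bringup_model m;
	int cpu;

	model_init(&m, nr_cpus, cpus_per_node);
	for (cpu = 1; cpu < nr_cpus; cpu++)
		prepare_cb(&m, cpu);
	for (cpu = 1; cpu < nr_cpus; cpu++)
		online_cb(&m, cpu);
	return m.peak_on_boot_cpu;
}
```

In this model the serial case never parks more than one vector on the
boot CPU, while the parallel case parks one per remote node at the same
time, which is the exhaustion scenario again.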

> Of course, running the dmar_set_interrupt code from a CPUHP_AP_ONLINE_DYN state
> does work (although I believe there is a concurrency issue that could show up
> with the current dmar_set_interrupt path).

Which concurrency issue? CPU hotplug is fully serialized.

> So the code seems to have been designed based on the assumption that it will be
> run on an already active (though not necessarily fully onlined?) cpu. To make
> this work, any code based on that assumption would need to be fixed. Otherwise,
> a different approach is needed.

Yes, the interrupt vector code is designed that way, and for the
general case this is absolutely the right thing to do.

Thanks,

tglx