Re: x86/apic: MSI address malformed for "flat" driver

From: Thomas Gleixner
Date: Tue Sep 11 2018 - 08:30:02 EST


On Mon, 10 Sep 2018, Cyril Novikov wrote:
> On 9/7/2018 12:11 PM, Thomas Gleixner wrote:
> > On Thu, 6 Sep 2018, Philipp Eppelt wrote:
> > >
> > > The "flat" driver defines the MSI addressing scheme to be used as
> > > logical addressing in flat mode. The MSI msg address is composed
> > > accordingly, but sets MSI_ADDR_REDIRECTION_CPU which is a zero at bit[3].
> >
> > Correct. That's what it means:
> >
> > * When RH is 0, the interrupt is directed to the processor listed in the
> > Destination ID field.
> >
> > So for DM:
> >
> > * If RH is 0, then the DM bit is ignored and the message is sent ahead
> > independent of whether the physical or logical destination mode is
> > used.
> >
> > which means that the delivery does not do any magic redirections,
> > because the Redirection Hint is off. If RH is set, then the delivery can
> > redirect according to the rules in the DM section. We are not using that
> > because we want targeted single CPU delivery.
> >
> > The interpretation of the DID field is purely depending on the local APIC
> > itself by matching the APIC ID against the DID field. And the local APIC ID
> > of CPU0 is 1 << 0, i.e. 0x1 which matches the MSI message you see.
>
> I believe you are wrong here and the local APIC ID of CPU0 is 0.
>
> processor : 0
> vendor_id : GenuineIntel
> ...
> physical id : 0
> siblings : 8
> core id : 0
> cpu cores : 4
> apicid : 0
>
> The fact that the code works means that DM is not ignored when RH is 0. In
> other words, RH=0 DM=1 means logical destination mode.

Sorry, I did not explain it very well. Let me try again.

* If RH is 0, then the DM bit is ignored and the message is sent ahead
independent of whether the physical or logical destination mode is
used.

The PCI device simply writes the message data to that address. It does not
even know what the individual bits mean. It's a write of data to an address.

The write then gets directed by a translation unit to the APIC bus or the
Processor System Bus, depending on the CPU. The translated message, which
goes on the bus to which the APIC(s) are connected, contains the DM bit,
which is always evaluated by the local APICs for matching.

You can simply verify that by inverting the DM field. You will probably get
completely malfunctioning interrupts or, if you're lucky, they are delivered
to the wrong CPU.

Why? Because the APIC has two match mechanisms.

If the message on the system/apic bus has DM = 0, then it matches
the Physical APIC ID, which you can see in /proc/cpuinfo.

If the message on the system/apic bus has DM = 1 then it matches the
Logical APIC ID which is stored in the LDR register. apic flat sets that
to 1 << CPUNr, i.e. 0x01 for CPU0.
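
To make the two match mechanisms concrete, here is a rough C model of the
acceptance check in a single local APIC. It is only a sketch: the function
names are made up for illustration, it assumes an 8-bit destination field
and flat logical mode, and it ignores the physical broadcast ID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Logical ID as programmed into the LDR by the flat driver: one bit per CPU. */
static inline uint8_t flat_logical_id(unsigned int cpu)
{
	return (uint8_t)(1u << cpu);	/* only meaningful for cpu 0..7 */
}

/*
 * Hypothetical model of the match done by one local APIC.
 * dm == false: physical mode, dest must equal the physical APIC ID.
 * dm == true:  flat logical mode, accept if dest shares a bit with the LDR.
 */
static inline bool apic_accepts(uint8_t phys_id, uint8_t ldr, bool dm,
				uint8_t dest)
{
	if (!dm)
		return dest == phys_id;
	return (dest & ldr) != 0;
}
```

With dest = 0x01 and DM = 1, only the APIC whose LDR is 0x01 (CPU0) accepts.
With the same dest and DM = 0, the message would instead match whichever APIC
has physical ID 1, which is the "wrong CPU" case mentioned above.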

If RH is set in the address then the translation unit tries to be smart
about the delivery, i.e. by directing it to the processor which has the
lowest interrupt priority. In logical mode it chooses ONE processor out of
the destination ID bits, i.e. the resulting message on the system/apic bus
contains only a single bit. Physical mode is single CPU destination anyway,
so there is no real difference to RH=0.
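
The RH=1 selection in logical mode can be pictured roughly like this. The
helper and its prio[] array are hypothetical; the real arbitration happens in
hardware and its tie-breaking is not something this sketch tries to model
(here ties simply go to the lowest CPU number).

```c
#include <stdint.h>

/*
 * Hypothetical model: reduce a flat logical destination mask to the single
 * bit of the candidate CPU with the lowest priority value.
 * Returns 0 if dest_mask selects no CPU.
 */
static inline uint8_t pick_lowest_priority(uint8_t dest_mask,
					   const uint8_t prio[8])
{
	int best = -1;

	for (int cpu = 0; cpu < 8; cpu++) {
		if (!(dest_mask & (1u << cpu)))
			continue;	/* CPU is not a candidate */
		if (best < 0 || prio[cpu] < prio[best])
			best = cpu;
	}
	return best < 0 ? 0 : (uint8_t)(1u << best);
}
```

Whatever mask goes in, at most one bit comes out, so only one local APIC
matches the message on the bus.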

If RH is not set then the logic translates the message without
modifications, including the DM bit. If the destination ID had more than
a single bit set, then the interrupt would be delivered simultaneously
to all CPUs which have a matching bit in the LDR. That is not desired for
device interrupts, but the single CPU affinity of the vector allocation
guarantees that there is only one bit set. The kernel still uses multiple
bits for IPIs.
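
For reference, a sketch of how such an address is put together for the flat
driver case (RH=0, DM=1, one destination bit). The field positions follow the
MSI address layout in the Intel SDM; the macro and function names are made up
for this example and are not the kernel's own.

```c
#include <stdint.h>

/* Illustrative names, not the kernel's definitions. */
#define EX_MSI_BASE		0xfee00000u	/* bits 31:20 fixed */
#define EX_MSI_DEST_SHIFT	12		/* destination ID, bits 19:12 */
#define EX_MSI_RH		(1u << 3)	/* redirection hint */
#define EX_MSI_DM		(1u << 2)	/* destination mode, 1 = logical */

/* MSI address targeting a single CPU in flat logical mode, RH left at 0. */
static inline uint32_t ex_msi_addr_flat(unsigned int cpu)
{
	uint32_t dest = 1u << cpu;	/* same value as the LDR, cpu 0..7 */

	return EX_MSI_BASE | (dest << EX_MSI_DEST_SHIFT) | EX_MSI_DM;
}
```

For CPU0 this yields 0xfee01004: destination ID 0x01, DM set, RH clear.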

Yes, we could switch APIC flat to use physical mode in the MSI and the
IOAPIC case, but I did not see a reason to do so.

Hope that clarifies it.

Out of curiosity: What kind of problem are you trying to solve?

Thanks,

tglx