Re: [PATCH] x86/apic: reset LDR in clear_local_APIC

From: Thomas Gleixner
Date: Mon Aug 19 2019 - 19:10:40 EST


Bandan,

On Mon, 19 Aug 2019, Bandan Das wrote:
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
> > On Wed, 14 Aug 2019, Bandan Das wrote:
> >> On a 32 bit RHEL6 guest with greater than 8 cpus, the
> >> kdump kernel hangs when calibrating apic. This happens
> >> because when apic initializes bigsmp, it also initializes LDR
> >> even though it probably wouldn't be used.
> >
> > 'It probably wouldn't be used' is a not really a useful technical
> > statement.
> >
> > Either it is used, then it needs to be handled. Or it is unused then why is
> > it written in the first place?
> >
> > The bigsmp APIC uses physical destination mode because the logical flat
> > model only supports 8 APICs. So clearly bigsmp_init_apic_ldr() is bogus and
> > should be empty.
> >
>
> Your note above is what I meant by "it probably wouldn't be used" because
> I don't have much insight into the history behind why LDR is being initialized
> in the first place. The only evidence I found is a comment in apic.c that states:
> /*
> * Intel recommends to set DFR, LDR and TPR before enabling
> * an APIC. See e.g. "AP-388 82489DX User's Manual" (Intel
> * document number 292116). So here it goes...
> */

The physflat stuff is documented in the SDM and in the APIC code
(apic_flat_64.c):

static void physflat_init_apic_ldr(void)
{
/*
* LDR and DFR are not involved in physflat mode, rather:
* "In physical destination mode, the destination processor is
* specified by its local APIC ID [...]." (Intel SDM, 10.6.2.1)
*/
}

Why is LDR initialized in the bigsmp code? Probably histerical raisins and
I'm just too tired to consult the history git trees for an answer.

> That said, not initalizing the ldr in bigsmp_init_apic_ldr() should be
> enough to fix this. Would you prefer that change instead ?

That's surely something we want to get rid off. But for sanity sake we
should clear LDR as well after understanding it completely.

> >> When booting into kdump, KVM apic incorrectly reads the stale LDR
> >> values from the guest while building the logical destination map
> >> even for inactive vcpus. While KVM apic can be fixed to ignore apics
> >> that haven't been enabled, a simple guest only change can be to
> >> just clear out the LDR.
> >
> > This does not make much sense either. What has KVM to do with logical
> > destination maps while booting the kdump kernel? The kdump kernel is not
>
> This is the guest kernel and KVM takes care of injecting the interrupt to
> the right vcpu (recalculate_apic_map)() in lapic.c).

Yeah. I know that KVM injects interrupts. Still that does not explain the
issue properly.

The point is that when the kdump kernel boots in the guest and uses logical
destination mode then it will overwrite LDR _BEFORE_ the local APIC timer
calibration takes place. So no, I'm not bying this. Just because it makes
your problem disappear does not mean it's the proper explanation.

> For the KVM side change, please take a look at
> https://lore.kernel.org/kvm/aee50952-144d-78da-9036-045bd3838b59@xxxxxxxxxx/

That's the same text in diffferent form and not conclusive either.

> > going through the regular cold/warm boot process, so KVM does not even know
> > that the crashing kernel jumped into the kdump one.
> >
> > What builds the logical destination maps and what has LDR and the KVM APIC
> > to do with that?
> >
> > I'm not opposed to the change per se, but I'm not accepting change logs
> > which have the fairy tale smell.
> >
> Heh, no it's not.

Well, it's not an accurate technical description of the root cause either :)

Thanks,

tglx