Re: [PATCH -tip 2/2 resend] x86, traps: Drop nmi_reason_lock until it is really needed

From: Ingo Molnar
Date: Wed Mar 02 2011 - 11:03:47 EST



* Cyrill Gorcunov <gorcunov@xxxxxxxxxx> wrote:

> On 03/02/2011 06:46 PM, Ingo Molnar wrote:
> >
> > * Cyrill Gorcunov <gorcunov@xxxxxxxxxx> wrote:
> >
> >> At the moment we have only the BSP apic configured to listen
> >> for external NMIs. So there is no reason for an additional
> >> spinlock since only the BSP will receive them.
> >>
> >> We do still have UV chips which enable external NMIs on all
> >> cpus, but since the approach of retrieving the NMI reason on
> >> the BSP only was working pretty well before, I assume it still
> >> remains valid.
> >
> > I'm not sure I get the point here: we might get NMIs on non-BSP on UV
> > systems ... so we want to remove the spinlock?
> >
> > If UV systems can get NMIs on any CPU then the lock is needed.
> >
> > It might have worked before - but UV systems are rare and relatively
> > new - plus the race window is small, so it might not have been triggered
> > in practice.
>
> Well, it is incomplete anyway. As far as I can tell, even ordering such
> NMIs with a spinlock would not make the situation better, because another
> cpu might see an unknown NMI (i.e. two or more cpus get an NMI; handling
> starts on the first, which finds that it was, say, an MCE error, handles
> it and unlocks the spinlock; then the second cpu handles this NMI, the
> reason for which was already handled by the first cpu, and sees an
> unknown NMI). So this lock might simply be hiding a bug.

Well, the lock serializes the read-out of the 'NMI reason' port, the handling of
whatever known reason and then the reassertion of the NMI (on 32-bit).

EDAC has a callback in pci_serr_error() - and this lock serializes that. So we
cannot just remove a lock like that, if there's any chance of parallel execution on
multiple CPUs.
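
To make the serialization argument concrete, here is a minimal user-space
sketch, not kernel code. It assumes a pthread mutex standing in for
nmi_reason_lock, a plain variable standing in for the 'NMI reason' port, and
two threads standing in for two CPUs that take the same external NMI. All
names and the bit value are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins: a fake reason bit and a fake reason port. */
#define FAKE_REASON_SERR 0x80

static pthread_mutex_t fake_reason_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned char fake_reason_port;
static int serr_handled;   /* times the SERR path (the callback) ran */
static int unknown_nmis;   /* times a "CPU" found no reason to handle */

/* One "CPU" taking the NMI: read-out, handling and clearing form one
 * critical section, so the handler can never run on two CPUs at once. */
static void *fake_nmi(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&fake_reason_lock);
    unsigned char reason = fake_reason_port;   /* read-out of the port */
    if (reason & FAKE_REASON_SERR) {
        serr_handled++;                        /* where a callback would run */
        fake_reason_port = 0;                  /* reason is now consumed */
    } else {
        unknown_nmis++;                        /* reason already handled */
    }
    pthread_mutex_unlock(&fake_reason_lock);
    return NULL;
}

/* Deliver one pending reason to two "CPUs" and wait for both to finish. */
static void run_two_cpus(void)
{
    pthread_t cpu[2];

    fake_reason_port = FAKE_REASON_SERR;
    serr_handled = 0;
    unknown_nmis = 0;

    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&cpu[i], NULL, fake_nmi, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < 2; i++)
        pthread_join(cpu[i], NULL);
}
```

After run_two_cpus() returns, the handler has run once and the other thread
has counted one unknown NMI, whichever thread took the lock first. So in this
model the lock does keep the callback from running in parallel, and the second
CPU still ends up with an unknown NMI, which is the case described in the
quoted text above.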

Thanks,

Ingo