Re: [PATCH 2/2] x86, mce: rework use of TIF_MCE_NOTIFY

From: Tony Luck
Date: Tue Jun 14 2011 - 22:10:17 EST


On Tue, Jun 14, 2011 at 6:29 PM, Hidetoshi Seto
<seto.hidetoshi@xxxxxxxxxxxxxx> wrote:
> Or ... is it possible to push siginfo w/ addr and pop here?

I chatted to Peter Anvin about this over lunch ... his suggestion was that since
we know (for now) that the recovery case is always from user mode, we can
let all the non-involved cpus return from do_machine_check() ... but catch the
cpu with the problem and do a sideways stack jump from the machine check
stack to the normal trap stack. At that point we'll be executing in a context
that is effectively the same as a page fault - so we have plenty of safe options
for functions we can call, locks we can take, etc.

So perhaps we can change "void do_machine_check()" to "unsigned long
do_machine_check()" and have the bystander cpus "return 0;" and the
cpu that hit the error "return m.addr;" ... and then do the necessary magic
in entry_64.S to leap from stack to stack in one mighty leap (and then
on to a "handle_action_required(regs, addr)" function).
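
Something like the following userspace sketch shows the control flow I have
in mind. It is only an illustration, not kernel code: the "_sketch" structs,
the "affected" flag and machine_check_entry_sketch() are all made up here, and
the C dispatcher merely stands in for the assembly glue that would do the real
stack switch in entry_64.S. It also assumes a return value of 0 can safely
mean "nothing to do", i.e. that address 0 never needs recovery.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's pt_regs and struct mce. */
struct pt_regs_sketch {
	unsigned long ip;
};

struct mce_sketch {
	unsigned long addr;	/* address reported with the error */
	int affected;		/* assumed flag: did this cpu hit the error? */
};

/* Records the last address handled, so the flow can be observed. */
static unsigned long last_handled_addr;

/*
 * Proposed signature change: return 0 on bystander cpus and the failing
 * address on the cpu that hit the error.
 */
unsigned long do_machine_check_sketch(struct pt_regs_sketch *regs,
				      const struct mce_sketch *m)
{
	(void)regs;
	if (!m->affected)
		return 0;	/* bystander: nothing more to do */
	return m->addr;		/* affected cpu: hand back the address */
}

/* Would run on the normal trap stack, in a page-fault-like context. */
void handle_action_required_sketch(struct pt_regs_sketch *regs,
				   unsigned long addr)
{
	(void)regs;
	assert(addr != 0);	/* 0 is reserved for "no action" here */
	last_handled_addr = addr;
	printf("action required for address 0x%lx\n", addr);
}

/*
 * Models the entry glue: call the handler, and only if it returns a
 * non-zero address continue to the recovery function. The real version
 * would switch from the machine check stack to the trap stack first.
 */
unsigned long machine_check_entry_sketch(struct pt_regs_sketch *regs,
					 const struct mce_sketch *m)
{
	unsigned long addr = do_machine_check_sketch(regs, m);

	if (addr == 0)
		return 0;
	handle_action_required_sketch(regs, addr);
	return addr;
}
```

Calling machine_check_entry_sketch() with affected == 0 returns 0 and never
reaches the recovery function; with affected == 1 it passes m.addr through to
handle_action_required_sketch().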

-Tony