Re: [RFC][PATCH 0/3] x86/nmi: Print all cpu stacks from NMI safely

From: Steven Rostedt
Date: Thu Jun 19 2014 - 19:19:32 EST


On Fri, 20 Jun 2014 01:03:28 +0200 (CEST)
Jiri Kosina <jkosina@xxxxxxx> wrote:

> On Thu, 19 Jun 2014, Steven Rostedt wrote:
>
> > > The idea basically is to *switch* what arch_trigger_all_cpu_backtrace()
> > > and arch_trigger_all_cpu_backtrace_handler() are doing; i.e. use the NMI
> > > as a way to stop all the CPUs (one by one), and let the CPU that is
> > > sending the NMIs around to actually walk and dump the stacks of the CPUs
> > > receiving the NMI IPI.
> >
> > And this is cleaner? Stopping a CPU via NMI and then what happens if
> > something else goes wrong and that CPU never starts back up? This
> > sounds like something that can cause more problems than it was
> > reporting on.
>
> It's going to get NMI in exactly the same situations it does with the
> current arch_trigger_all_cpu_backtrace(), the only difference being that
> it doesn't try to invoke printk() from inside NMI. The IPI-NMI is used
> solely as a point of synchronization for the stack dumping.

Well, all CPUs are going to be spinning until the main CPU prints
everything out. That's not quite the same thing as what it used to do.

>
> > Then you also need to print out the data while the NMIs still spin.
>
> Exactly, that's actually the whole point.

But this stops everything with a big hammer, until everything gets
printed out, not just the one CPU that happens to be stuck.
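
For illustration only, the behaviour described above can be modelled in
user space. This is a rough sketch, not kernel code and not the patch
under discussion: threads stand in for CPUs, an atomic flag stands in
for the NMI rendezvous, and the names fake_nmi_handler(),
dump_all_cpus() and NR_FAKE_CPUS are hypothetical. It only shows that
every parked "CPU" stays held until the dumping side has printed
everything.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

#define NR_FAKE_CPUS 4          /* hypothetical: number of simulated CPUs */

static atomic_int parked;       /* simulated CPUs that entered the fake NMI */
static atomic_int release_all;  /* set once the dumper has printed everything */

/* Stand-in for the NMI handler: park and spin until released. */
static void *fake_nmi_handler(void *arg)
{
    (void)arg;
    atomic_fetch_add(&parked, 1);
    while (!atomic_load(&release_all))
        sched_yield();          /* a real CPU would simply spin here */
    return NULL;
}

/*
 * Stand-in for the CPU that sends the NMIs and does all the printing.
 * Returns how many simulated CPUs were held for the whole dump.
 */
int dump_all_cpus(void)
{
    pthread_t cpu[NR_FAKE_CPUS];
    int i, started = 0, held;

    atomic_store(&parked, 0);
    atomic_store(&release_all, 0);

    /* "Send the NMI": start one parked thread per simulated CPU. */
    for (i = 0; i < NR_FAKE_CPUS; i++) {
        if (pthread_create(&cpu[i], NULL, fake_nmi_handler, NULL) != 0)
            break;
        started++;
    }

    /* Wait until every simulated CPU is sitting in the handler. */
    while (atomic_load(&parked) < started)
        sched_yield();

    /* Print everything while all of them are still spinning. */
    for (i = 0; i < started; i++)
        printf("CPU %d: <stack would be walked and printed here>\n", i);

    held = atomic_load(&parked);

    /* Only now is anybody allowed to continue. */
    atomic_store(&release_all, 1);
    for (i = 0; i < started; i++)
        pthread_join(cpu[i], NULL);

    return held;
}
```

Calling dump_all_cpus() from a small main() prints one line per
simulated CPU and returns the number of them that were held for the
entire dump. In this model none of them can resume before the last line
is printed, which is the "big hammer" effect.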

-- Steve
