RE: [V5 PATCH 3/4] kexec: Fix race between panic() and crash_kexec() called directly

From: 河合英宏 / KAWAI, HIDEHIRO
Date: Wed Dec 02 2015 - 06:57:49 EST


Hello Borislav,

Sorry for the late reply to this mail.

> On Fri, Nov 20, 2015 at 06:36:48PM +0900, Hidehiro Kawai wrote:
...
> > +void crash_kexec(struct pt_regs *regs)
> > +{
> > +	int old_cpu, this_cpu;
> > +
> > +	/*
> > +	 * Only one CPU is allowed to execute the crash_kexec() code as with
> > +	 * panic(). Otherwise parallel calls of panic() and crash_kexec()
> > +	 * may stop each other. To exclude them, we use panic_cpu here too.
> > +	 */
> > +	this_cpu = raw_smp_processor_id();
> > +	old_cpu = atomic_cmpxchg(&panic_cpu, -1, this_cpu);
> > +	if (old_cpu == -1) {
> > +		/* This is the 1st CPU which comes here, so go ahead. */
> > +		__crash_kexec(regs);
> > +
> > +		/*
> > +		 * Reset panic_cpu to allow another panic()/crash_kexec()
> > +		 * call.
>
> So can we make __crash_kexec() return error values?
>
> * failed to grab kexec_mutex -> reset panic_cpu
>
> * no kexec_crash_image -> no need to reset it, all future crash_kexec()
> calls won't work so no need to run into that path anymore. However, this could
> be problematic if we want the other CPUs to panic. Do we care?
>
> * machine_kexec successful -> doesn't matter

We can do so, but I think always resetting panic_cpu would be
simpler and safer.

Although checking kexec_crash_image each time is pointless, it
doesn't cause any actual problem.
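
To illustrate the always-reset idea, here is a minimal userspace sketch,
not the actual patch. It assumes C11 atomics as a stand-in for the
kernel's atomic_cmpxchg(), and the names crash_kexec_sketch() and
fake_crash_kexec() are hypothetical. Whatever the reason the inner call
returns, the owner slot is reset, so no error codes are needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PANIC_CPU_INVALID (-1)

/* Stand-in for the kernel's panic_cpu; -1 means no CPU owns the crash path. */
static atomic_int panic_cpu = PANIC_CPU_INVALID;

/*
 * Hypothetical stand-in for __crash_kexec(). It returns false when it
 * could not boot the crash kernel (for example no image is loaded, or
 * the mutex could not be taken). The real function never returns on
 * success.
 */
static bool fake_crash_kexec(bool image_loaded)
{
	return image_loaded;
}

/* Returns 1 if this caller owned the crash path, 0 if another CPU did. */
static int crash_kexec_sketch(int this_cpu, bool image_loaded)
{
	int expected = PANIC_CPU_INVALID;

	assert(this_cpu >= 0);	/* -1 is reserved for "no owner" */

	/* Only the first caller wins the compare-and-exchange. */
	if (!atomic_compare_exchange_strong(&panic_cpu, &expected, this_cpu))
		return 0;	/* someone else is in panic()/crash_kexec() */

	(void)fake_crash_kexec(image_loaded);

	/* Always reset, regardless of why the call above returned. */
	atomic_store(&panic_cpu, PANIC_CPU_INVALID);
	return 1;
}
```

In this sketch a later caller can always take the path again after a
failed attempt, at the cost of re-checking the image each time.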

Regards,

--
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group