Re: [PATCH] panic, kexec: Don't mutex_trylock() in __crash_kexec()

From: Petr Mladek
Date: Fri Jun 17 2022 - 09:52:46 EST


On Fri 2022-06-17 12:52:05, Valentin Schneider wrote:
> Hi Tao,
>
> On 17/06/22 18:42, Tao Zhou wrote:
> > Hi Valentin,
> >
> > On Thu, Jun 16, 2022 at 01:37:09PM +0100, Valentin Schneider wrote:
> >> @@ -964,24 +966,31 @@ late_initcall(kexec_core_sysctl_init);
> >> */
> >> void __noclone __crash_kexec(struct pt_regs *regs)
> >> {
> >> - /* Take the kexec_mutex here to prevent sys_kexec_load
> >> - * running on one cpu from replacing the crash kernel
> >> - * we are using after a panic on a different cpu.
> >> + /*
> >> + * This should be taking kexec_mutex before doing anything with the
> >> + * kexec_crash_image, but this code can be run in NMI context which
> >> + * means we can't even trylock.
> >> *
> >> - * If the crash kernel was not located in a fixed area
> >> - * of memory the xchg(&kexec_crash_image) would be
> >> - * sufficient. But since I reuse the memory...
> >> + * Pairs with smp_mb() in do_kexec_load() and sys_kexec_file_load()
> >> */
> >> - if (mutex_trylock(&kexec_mutex)) {
> >> - if (kexec_crash_image) {
> >> - struct pt_regs fixed_regs;
> >> -
> >> - crash_setup_regs(&fixed_regs, regs);
> >> - crash_save_vmcoreinfo();
> >> - machine_crash_shutdown(&fixed_regs);
> >> - machine_kexec(kexec_crash_image);
> >> - }
> >> - mutex_unlock(&kexec_mutex);
> >> + WRITE_ONCE(panic_wants_kexec, true);
> >> + smp_mb();
> >> + /*
> >> + * If we're panic'ing while someone else is messing with the crash
> >> + * kernel, this isn't going to end well.
> >> + */
> >> + if (READ_ONCE(kexec_loading)) {
> >> + WRITE_ONCE(panic_wants_kexec, false);
> >> + return;
> >> + }
> >
> > So this is from NMI. The mutex guarantee that kexec_file_load() or
> > do_kexec_load() just one of them beat on cpu. NMI can happen on more
> > than one cpu. That means that here be cumulativity here IMHO.
> >
>
> If you look at __crash_kexec() in isolation yes, but if you look at panic()
> and nmi_panic() only a single NMI can get in there. I think that is also
> true for invocations via crash_kexec().

It is true that panic() could be called only once, see this code
in panic():

* Only one CPU is allowed to execute the panic code from here. For
this_cpu = raw_smp_processor_id();
old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu);

if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
panic_smp_self_stop();


One the other hand, the aproach with two variables is strage
and brings exactly these questions.

If a trylock is enough that the mutex can be replaced by two
simple atomic operations. The mutex would be needed only
when a user really would need to wait for another one.


atomic_t crash_kexec_lock;

/* trylock part */
if (atomic_cmpxchg_acquire(&crash_kexec_lock, 0, 1) != 0)
return -EBUSY;

/* do anything guarded by crash_kexec_lock */

/* release lock */
atomic_set_release(&crash_kexec_lock, 0);

The _acquire, _release variants will do the barriers correctly.

Best Regards,
Petr