Re: [PATCH v1] ACPI: sleep: Avoid breaking S3 wakeup due to might_sleep()

From: Peter Zijlstra
Date: Wed Jun 14 2023 - 04:48:02 EST


On Tue, Jun 13, 2023 at 05:25:07PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>
> The addition of might_sleep() to down_timeout() caused the latter to
> enable interrupts unconditionally in some cases, which in turn broke
> the ACPI S3 wakeup path in acpi_suspend_enter(), where down_timeout()
> is called by acpi_disable_all_gpes() via acpi_ut_acquire_mutex().
>
> Namely, if CONFIG_DEBUG_ATOMIC_SLEEP is set, might_sleep() causes
> might_resched() to be used and if CONFIG_PREEMPT_VOLUNTARY is set,
> this triggers __cond_resched() which may call preempt_schedule_common(),
> so __schedule() gets invoked and it ends up with enabled interrupts (in
> the prev == next case).

Urgh, so that code was relying on the lack of contention to not trigger
the schedule path -- with the added might_sleep() it triggers a
preemption point.
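
For illustration only, the effect can be mimicked with a userspace toy
model. This is a sketch, not kernel code: every toy_* name below is
hypothetical, and the assumption that the scheduler returns with
interrupts enabled is taken from the changelog quoted above.

```c
#include <stdbool.h>

/* Toy CPU state; all names here are hypothetical, not kernel symbols. */
struct toy_cpu {
	bool irqs_enabled;      /* false in the early S3 wakeup path */
	bool need_resched;      /* a reschedule has been requested */
	bool preempt_voluntary; /* stands in for CONFIG_PREEMPT_VOLUNTARY */
};

/*
 * Stand-in for __schedule(). Assumption from the changelog: it returns
 * with interrupts enabled, even in the prev == next case.
 */
static void toy_schedule(struct toy_cpu *cpu)
{
	cpu->need_resched = false;
	cpu->irqs_enabled = true;
}

/* Stand-in for might_sleep() acting as a voluntary preemption point. */
static void toy_might_sleep(struct toy_cpu *cpu)
{
	if (cpu->preempt_voluntary && cpu->need_resched)
		toy_schedule(cpu);
}

/* Uncontended acquire: the fast path itself never sleeps. */
static void toy_down_uncontended(struct toy_cpu *cpu, bool with_might_sleep)
{
	if (with_might_sleep)
		toy_might_sleep(cpu);
	/* fast path: the count would be decremented here, nothing blocks */
}

/* Report whether interrupts are on after an uncontended acquire. */
static bool toy_irqs_after_down(bool with_might_sleep, bool need_resched)
{
	struct toy_cpu cpu = {
		.irqs_enabled = false,
		.need_resched = need_resched,
		.preempt_voluntary = true,
	};

	toy_down_uncontended(&cpu, with_might_sleep);
	return cpu.irqs_enabled;
}
```

In this model, toy_irqs_after_down(false, true) leaves interrupts off,
while toy_irqs_after_down(true, true) turns them on even though the
lock was never contended.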

> Now, enabling interrupts early in the S3 wakeup path causes the kernel
> to crash.
>
> Address this by modifying acpi_suspend_enter() to disable GPEs without
> attempting to acquire the sleeping lock which is not needed in that code
> path anyway.
>
> Fixes: 99409b935c9a locking/semaphore: Add might_sleep() to down_*() family

$ git show -s --pretty='format:%h ("%s")' 99409b935c9a
99409b935c9a ("locking/semaphore: Add might_sleep() to down_*() family")

> Reported-by: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>

> ---
> drivers/acpi/acpica/achware.h | 2 --
> drivers/acpi/sleep.c | 16 ++++++++++++----
> include/acpi/acpixf.h | 1 +
> 3 files changed, 13 insertions(+), 6 deletions(-)
>
> Index: linux-pm/drivers/acpi/acpica/achware.h
> ===================================================================
> --- linux-pm.orig/drivers/acpi/acpica/achware.h
> +++ linux-pm/drivers/acpi/acpica/achware.h
> @@ -101,8 +101,6 @@ acpi_status
> acpi_hw_get_gpe_status(struct acpi_gpe_event_info *gpe_event_info,
> acpi_event_status *event_status);
>
> -acpi_status acpi_hw_disable_all_gpes(void);
> -
> acpi_status acpi_hw_enable_all_runtime_gpes(void);
>
> acpi_status acpi_hw_enable_all_wakeup_gpes(void);
> Index: linux-pm/include/acpi/acpixf.h
> ===================================================================
> --- linux-pm.orig/include/acpi/acpixf.h
> +++ linux-pm/include/acpi/acpixf.h
> @@ -761,6 +761,7 @@ ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_sta
> acpi_event_status
> *event_status))
> ACPI_HW_DEPENDENT_RETURN_UINT32(u32 acpi_dispatch_gpe(acpi_handle gpe_device, u32 gpe_number))
> +ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status acpi_hw_disable_all_gpes(void))
> ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status acpi_disable_all_gpes(void))
> ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status acpi_enable_all_runtime_gpes(void))
> ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status acpi_enable_all_wakeup_gpes(void))
> Index: linux-pm/drivers/acpi/sleep.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/sleep.c
> +++ linux-pm/drivers/acpi/sleep.c
> @@ -636,11 +636,19 @@ static int acpi_suspend_enter(suspend_st
> }
>
> /*
> - * Disable and clear GPE status before interrupt is enabled. Some GPEs
> - * (like wakeup GPE) haven't handler, this can avoid such GPE misfire.
> - * acpi_leave_sleep_state will reenable specific GPEs later
> + * Disable all GPE and clear their status bits before interrupts are
> + * enabled. Some GPEs (like wakeup GPEs) have no handlers and this can
> + * prevent them from producing spurious interrupts.
> + *
> + * acpi_leave_sleep_state() will reenable specific GPEs later.
> + *
> + * Because this code runs on one CPU with disabled interrupts (all of
> + * the other CPUs are offline at that time), it need not acquire any
> + * sleeping locks which may be harmful due to instrumentation even if
> + * those locks are not contended, so avoid doing that by using a low-
> + * level library routine here.

I'm not sure I'd call the implicit preemption point 'instrumentation'
but yeah, fair enough I suppose.

> */
> - acpi_disable_all_gpes();
> + acpi_hw_disable_all_gpes();
> /* Allow EC transactions to happen. */
> acpi_ec_unblock_transactions();
>