Re: Crash with 6.6.0-rc1-rt1 and several i915 locking call traces with v6.5.2-rt8 and gnome-shell on Alder Lake laptop

From: Sebastian Andrzej Siewior
Date: Mon Oct 02 2023 - 05:45:54 EST


On 2023-09-29 04:43:32 [-0400], John B. Wyatt IV wrote:
> For stock (non-rt) I do not see it with 6.6-rc2. This was compiled
> with the Stream 9 debug config.
>
> I was able to reproduce similar call traces once I tested again
> with 6.6-rc3-rt5 at [4] and [5].
>
> What would be the best way to determine if the warning is wrongly
> triggered?

I looked at the traces in this email and they originate from a
might_sleep() in guc_context_set_prio(). The reason is that they check
at the atomic/interrupt state to figure out if they can sleep or not.
Both checks don't work on RT as intended and the former has a not to not
be used in drivers…

The snippet below should cure this. Could you test, please.

Sebastian


diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 8dc291ff00935..5b8d084c9c58c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -317,7 +317,7 @@ static inline int intel_guc_send_busy_loop(struct intel_guc *guc,
{
int err;
unsigned int sleep_period_ms = 1;
- bool not_atomic = !in_atomic() && !irqs_disabled();
+ bool not_atomic = !in_atomic() && !irqs_disabled() && !rcu_preempt_depth();

/*
* FIXME: Have caller pass in if we are in an atomic context to avoid