Re: [PATCH][RFC] acpi: Prevent scheduling while atomic warning inearly boot

From: Andrew Morton
Date: Wed Oct 12 2011 - 16:07:26 EST


On Wed, 12 Oct 2011 14:49:02 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> I hit the following bug:
>
> ...
>
> The commit 0a7992c90828a65 acpi: fix bogus preemption logic
>
> tried again to fix the preempt logic by encapsulating the
> ACPI_PREEMPTION_POINT() with a #ifndef CONFIG_PREEMPT and only testing
> irqsoff. But when CONFIG_PREEMPT=n and CONFIG_DEBUG_ATOMIC_SLEEP=y, the
> preempt count is still active. This code is called at boot up when
> preemption is still disabled triggering the above dump.
>
> Ideally, in_atomic() should not be used in general code, but I'm not
> sure what should be used. This does silent the warning, and it should
> not be an issue while it is still encapsulated in #ifndef CONFIG_PREEMPT
>
> --- a/include/acpi/platform/aclinux.h
> +++ b/include/acpi/platform/aclinux.h
> @@ -59,6 +59,7 @@
> #include <linux/ctype.h>
> #include <linux/sched.h>
> #include <asm/system.h>
> +#include <linux/hardirq.h>
> #include <linux/atomic.h>
> #include <asm/div64.h>
> #include <asm/acpi.h>
> @@ -151,10 +152,14 @@ static inline void *acpi_os_acquire_object(acpi_cache_t * cache)
> /*
> * Used within ACPICA to show where it is safe to preempt execution
> * when CONFIG_PREEMPT=n
> + *
> + * Note we still test for !in_atomic() in case CONFIG_DEBUG_ATOMIC_SLEEP
> + * is set. In that case, preempt_count is still updated and scheduling
> + * here will cause a warning in early boot.
> */
> #define ACPI_PREEMPTION_POINT() \
> do { \
> - if (!irqs_disabled()) \
> + if (!irqs_disabled() && !in_atomic()) \
> cond_resched(); \
> } while (0)
> #endif

This macro *cannot be implemented correctly*.

If CONFIG_PREEMPT=n && CONFIG_PREEMPT_COUNT=n then there is no
indication available anywhere to tell you whether or not it is safe to
call schedule().

Please, delete it. Linux just doesn't work this way. The way the
kernel is designed is that the calling code must know what its own
state is. If you know you hold spinlocks then don't call
cond_resched(). If you know you're not holding spinlocks (or
irq_disable, etc) then it's safe to call cond_resched().


As it stands, ACPI at present might be calling schedule() while holding
spinlocks, which is deadlockable. It cunningly arranges to only
exhibit this buggy behaviour in situations where the kernel's
self-checking code is incapable of reporting on the bug :(

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/