Re: [PATCH] exit: Detect and fix irq disabled state in oops

From: Eric W. Biederman
Date: Fri Dec 23 2022 - 23:36:41 EST


"Nicholas Piggin" <npiggin@xxxxxxxxx> writes:

> On Tue Oct 4, 2022 at 7:44 PM AEST, Nicholas Piggin wrote:
>> If a task oopses with irqs disabled, this can cause various cascading
>> problems in the oops path such as sleep-from-invalid warnings, and
>> potentially worse.
>>
>> Since commit 0258b5fd7c712 ("coredump: Limit coredumps to a single
>> thread group"), the unconditional irq enable in coredump_task_exit()
>> will "fix" the irq state to be enabled early in do_exit(), so currently
>> this may not be triggerable, but that is coincidental and fragile.
>>
>> Detect and fix the irqs_disabled() condition in the oops path before
>> calling do_exit(), similarly to the way in_atomic() is handled.
>>
>> Reported-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
>> Signed-off-by: Nicholas Piggin <npiggin@xxxxxxxxx>
>
> Hey Eric, did you have any thoughts on this?

No strong thoughts.

I agree that unconditionally disabling and then enabling irqs in
coredump_task_exit means there is likely to be little change in real
behavior.

I also agree that this is fragile to depend upon, so making
our assumptions explicit seems good.
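
For anyone reading along, here is a minimal userspace sketch of the
ordering the patch makes explicit. It is only a model, not kernel code:
every model_* name is a hypothetical stand-in for the kernel function it
is named after, and a counter stands in for WARN_ON().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins for kernel state; not the kernel API. */
static bool model_irqs_off;	/* models what irqs_disabled() reports */
static int model_warnings;	/* counts would-be WARN_ON() hits */

static bool model_irqs_disabled(void)
{
	return model_irqs_off;
}

static void model_local_irq_enable(void)
{
	model_irqs_off = false;
}

/* Models do_exit(): with the patch it checks the irq state on entry. */
static void model_do_exit(void)
{
	if (model_irqs_disabled())
		model_warnings++;	/* stands in for WARN_ON(irqs_disabled()) */
}

/* Models make_task_dead(): report and repair the irq state, then exit. */
static void model_make_task_dead(const char *comm, int pid)
{
	if (model_irqs_disabled()) {
		printf("note: %s[%d] exited with irqs disabled\n", comm, pid);
		model_local_irq_enable();
	}
	model_do_exit();
}

/* Simulate an oops with the given irq state; return the warnings seen. */
static int model_oops(bool irqs_off)
{
	model_irqs_off = irqs_off;
	model_warnings = 0;
	model_make_task_dead("demo", 1234);
	return model_warnings;
}
```

In this model the oops path never trips the check in model_do_exit(),
because the state is repaired first. Calling model_do_exit() directly
with irqs off does trip it, which is the assumption the new WARN_ON()
documents.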

Acked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

>
> Thanks,
> Nick
>
>> ---
>> kernel/exit.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/kernel/exit.c b/kernel/exit.c
>> index 84021b24f79e..fa696765f694 100644
>> --- a/kernel/exit.c
>> +++ b/kernel/exit.c
>> @@ -738,6 +738,7 @@ void __noreturn do_exit(long code)
>>  	struct task_struct *tsk = current;
>>  	int group_dead;
>>
>> +	WARN_ON(irqs_disabled());
>>  	WARN_ON(tsk->plug);
>>
>>  	kcov_task_exit(tsk);
>> @@ -865,6 +866,11 @@ void __noreturn make_task_dead(int signr)
>>  	if (unlikely(!tsk->pid))
>>  		panic("Attempted to kill the idle task!");
>>
>> +	if (unlikely(irqs_disabled())) {
>> +		pr_info("note: %s[%d] exited with irqs disabled\n",
>> +			current->comm, task_pid_nr(current));
>> +		local_irq_enable();
>> +	}
>>  	if (unlikely(in_atomic())) {
>>  		pr_info("note: %s[%d] exited with preempt_count %d\n",
>>  			current->comm, task_pid_nr(current),
>> --
>> 2.37.2