Re: [PATCH 0/4] audit: refactor and fix for potential deadlock

From: Eiichi Tsukata
Date: Mon May 08 2023 - 21:49:48 EST




> On May 8, 2023, at 23:07, Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
>
> On Mon, May 8, 2023 at 3:58 AM Eiichi Tsukata
> <eiichi.tsukata@xxxxxxxxxxx> wrote:
>> Commit 7ffb8e317bae ("audit: we don't need to
>> __set_current_state(TASK_RUNNING)") accidentally moved queue full check
>> before add_wait_queue_exclusive() which introduced the following race:
>>
>> CPU1                            CPU2
>> ========                        ========
>> (in audit_log_start())          (in kauditd_thread())
>>
>> queue is full
>>                                 wake_up(&audit_backlog_wait)
>>                                 wait_event_freezable()
>> add_wait_queue_exclusive()
>> ...
>> schedule_timeout()
>>
>> Once this happens, both audit_log_start() and kauditd_thread() can cause
>> deadlock for up to backlog_wait_time waiting for each other. To prevent
>> the race, this patch adds queue full check after
>> prepare_to_wait_exclusive().
>
> Have you seen this occur in practice?

Yes, we have hit this issue multiple times, though it is pretty rare, as you note.
In our case, sshd got stuck in audit_log_user_message(), which caused SSH
connections to time out.
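
For reference, the ordering problem from the patch description can be
illustrated with a toy, single-threaded C model. This is only a sketch, not
the kernel code: the struct, the helper names, and the assumption that
kauditd has already drained the queue by the time it calls wake_up() are all
hypothetical and exist purely for illustration. It replays the same
interleaving twice, once with the check before registering on the wait queue
and once with the check after.

```c
#include <stdbool.h>

/* Toy model of the producer/consumer state (hypothetical, not kernel API). */
struct model {
    int  queue_len;     /* current backlog length */
    int  limit;         /* backlog limit */
    bool on_waitqueue;  /* producer is registered on the backlog wait queue */
    bool woken;         /* a wake-up actually reached the producer */
};

static bool queue_full(const struct model *m)
{
    return m->queue_len >= m->limit;
}

/*
 * Consumer side (stands in for kauditd): assumed to drain the queue and then
 * issue a wake-up, which only reaches waiters that are already registered.
 */
static void consumer_drain_and_wake(struct model *m)
{
    m->queue_len = 0;
    if (m->on_waitqueue)
        m->woken = true;
}

/* Check first, register later: the wake-up is lost. */
static bool buggy_order_sleeps_full_timeout(void)
{
    struct model m = { .queue_len = 4, .limit = 4 };

    bool must_wait = queue_full(&m);   /* "queue is full" */
    consumer_drain_and_wake(&m);       /* wake-up finds no waiter */
    m.on_waitqueue = true;             /* registered too late */

    /* Producer now sleeps and nothing is left to wake it. */
    return must_wait && !m.woken;
}

/* Register first, then check: the same interleaving is harmless. */
static bool fixed_order_sleeps_full_timeout(void)
{
    struct model m = { .queue_len = 4, .limit = 4 };

    m.on_waitqueue = true;             /* register before checking */
    consumer_drain_and_wake(&m);       /* wake-up reaches the waiter */
    bool must_wait = queue_full(&m);   /* check sees the drained queue */

    return must_wait && !m.woken;
}
```

In this model, buggy_order_sleeps_full_timeout() reports that the producer
would sleep for the whole timeout, while fixed_order_sleeps_full_timeout()
does not, because the condition is evaluated only after the waiter is visible
to the waker.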

Eiichi