Re: [PATCH v7] ath9k: let sleep be interrupted when unregistering hwrng

From: Toke Høiland-Jørgensen
Date: Wed Jun 29 2022 - 05:25:05 EST


"Jason A. Donenfeld" <Jason@xxxxxxxxx> writes:

> There are two deadlock scenarios that need addressing, which cause
> problems when the computer goes to sleep, the interface is set down, and
> hwrng_unregister() is called. When the deadlock is hit, sleep is delayed
> for tens of seconds, causing it to fail. These scenarios are:
>
> 1) The hwrng kthread can't be stopped while it's sleeping, because it
> uses msleep_interruptible() instead of schedule_timeout_interruptible().
> The fix is simply to switch to the correct function. At the same time,
> we should clean up a common and useless dmesg splat in the same area.
>
> 2) A normal user thread can't be interrupted by hwrng_unregister() while
> it's sleeping, because hwrng_unregister() is called from elsewhere.
> The solution here is to keep track of which thread is currently
> reading, and asleep, and signal that thread when it's time to
> unregister. There's a bit of bookkeeping required to prevent
> lifetime issues on current.
>
> Reported-by: Gregory Erwin <gregerwin256@xxxxxxxxx>
> Cc: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> Cc: Kalle Valo <kvalo@xxxxxxxxxx>
> Cc: Rui Salvaterra <rsalvaterra@xxxxxxxxx>
> Cc: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
> Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00giw@xxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXtent3HA@xxxxxxxxxxxxxx/
> Link: https://bugs.archlinux.org/task/75138
> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> ---
> drivers/char/hw_random/core.c | 30 ++++++++++++++++++++++++----
> drivers/net/wireless/ath/ath9k/rng.c | 19 +++++++-----------
> 2 files changed, 33 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index 16f227b995e8..df45c265878e 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -38,6 +38,8 @@ static LIST_HEAD(rng_list);
> static DEFINE_MUTEX(rng_mutex);
> /* Protects rng read functions, data_avail, rng_buffer and rng_fillbuf */
> static DEFINE_MUTEX(reading_mutex);
> +/* Keeps track of whoever is wait-reading it currently while holding reading_mutex. */
> +static struct task_struct *current_waiting_reader;
> static int data_avail;
> static u8 *rng_buffer, *rng_fillbuf;
> static unsigned short current_quality;
> @@ -208,6 +210,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> int err = 0;
> int bytes_read, len;
> struct hwrng *rng;
> + bool wait;
>
> while (size) {
> rng = get_current_rng();
> @@ -225,9 +228,15 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> goto out_put;
> }
> if (!data_avail) {
> + wait = !(filp->f_flags & O_NONBLOCK);
> + if (wait && cmpxchg(&current_waiting_reader, NULL, current) != NULL) {
> + err = -EINTR;
> + goto out_unlock_reading;
> + }
> bytes_read = rng_get_data(rng, rng_buffer,
> - rng_buffer_size(),
> - !(filp->f_flags & O_NONBLOCK));
> + rng_buffer_size(), wait);
> + if (wait && cmpxchg(&current_waiting_reader, current, NULL) != current)
> + synchronize_rcu();

So this synchronize_rcu() is to ensure the hwrng_unregister() thread has
exited the rcu_read_lock() section below? Isn't that a bit of a...
creative... use of RCU? :)
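
For concreteness, the reader-side handshake in the hunk above can be
modelled in userspace with C11 atomics standing in for the kernel's
cmpxchg(). This is only a sketch under stated assumptions, not the patch
itself: struct task, reader_claim(), reader_release_needs_sync() and
unregister_take() are hypothetical names, and the unregister side is not
part of the quoted hunk, so the model simply assumes it swaps the pointer
to NULL before signalling the reader.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct task_struct; only its address matters here. */
struct task { int id; };

/* Models current_waiting_reader from the patch. */
static _Atomic(struct task *) waiting_reader = NULL;

/*
 * Reader, step 1: claim the slot before blocking in the read.
 * Mirrors cmpxchg(&current_waiting_reader, NULL, current) != NULL.
 */
static int reader_claim(struct task *self)
{
	struct task *expected = NULL;

	if (!atomic_compare_exchange_strong(&waiting_reader, &expected, self))
		return -EINTR;	/* slot already claimed by another reader */
	return 0;
}

/*
 * Reader, step 2: release the slot after the read returns.
 * Returns true when the slot no longer holds self, i.e. the case in
 * which the patch goes on to call synchronize_rcu().
 */
static bool reader_release_needs_sync(struct task *self)
{
	struct task *expected = self;

	return !atomic_compare_exchange_strong(&waiting_reader, &expected, NULL);
}

/*
 * ASSUMPTION: the unregister side takes the pointer with an exchange
 * and then signals the task it got back. Only the exchange is modelled.
 */
static struct task *unregister_take(void)
{
	return atomic_exchange(&waiting_reader, NULL);
}
```

In this model the release only fails after unregister_take() has already
pulled the pointer out, which is the window the synchronize_rcu() call
appears meant to cover: the reader must not go away while the other side
may still be about to signal it.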

Also, synchronize_rcu() can take a while on a busy system; is it OK to
call it while holding reading_mutex?

-Toke