Re: rust nvme driver: potential sleep-in-atomic-context

From: Andreas Hindborg
Date: Thu Nov 03 2022 - 06:13:04 EST



Hi Dennis,

Dennis Dai <dzy.0424thu@xxxxxxxxx> writes:

> The rust nvme driver [1] (which is still pending to be merged into
> mainline [2]) has a potential sleep-in-atomic-context bug.
>
> The potentially buggy code is below:
>
> // drivers/block/nvme.rs:192
> dev.queues.lock().io.try_reserve(nr_io_queues as _)?;
> // drivers/block/nvme.rs:227
> dev.queues.lock().io.try_push(io_queue.clone())?;
>
> The queues field is wrapped in SpinLock, which means that we cannot
> sleep (or indirectly call any function that may sleep) when the lock
> is held.
> However try_reserve function may indirectly call krealloc with a
> sleepable flag GFP_KERNEL (that's default behaviour of the global rust
> allocator).
> The case is similar for try_push.
>
> I wonder if the bug could be confirmed.

Nice catch, I was not aware of that one. I will add a TODO. Did you
manage to trigger this bug or did you find it by review?

I am not sure if it has been decided how to pass flags to allocations
yet. There is a discussion about the interface for Box here [1] and
there is also some discussion on the list [2]. For reference, I use an
atomic box allocation here [3].
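
Until the flag-passing interface settles, one possible way around the
problem is to do the fallible allocation before taking the lock, and only
move the result in while the lock is held. Below is a minimal userspace
sketch of that pattern, not the driver code: std::sync::Mutex stands in
for the kernel SpinLock, Vec<u32> stands in for the vector of IO queues,
and the names Queues, reserve_io_queues and push_io_queue are made up for
illustration. It also assumes the vector is still empty when capacity is
reserved.

```rust
use std::collections::TryReserveError;
use std::sync::Mutex;

// Hypothetical stand-in for the driver's queue state; not the real type.
struct Queues {
    io: Vec<u32>,
}

// Reserve room for `nr_io_queues` entries without allocating under the lock.
// Assumption: `io` is still empty here (setup time); its old buffer is replaced.
fn reserve_io_queues(
    queues: &Mutex<Queues>,
    nr_io_queues: usize,
) -> Result<(), TryReserveError> {
    // The allocation (which may sleep in the kernel case) happens unlocked.
    let mut fresh: Vec<u32> = Vec::new();
    fresh.try_reserve(nr_io_queues)?;

    // Under the lock we only swap buffers; nothing is allocated or freed.
    let old = {
        let mut guard = queues.lock().unwrap();
        debug_assert!(guard.io.is_empty());
        std::mem::replace(&mut guard.io, fresh)
    };

    // The previous buffer is released after the lock has been dropped.
    drop(old);
    Ok(())
}

// Push under the lock, but refuse to grow the vector: if no spare capacity
// is left, hand the item back instead of reallocating while locked.
fn push_io_queue(queues: &Mutex<Queues>, queue: u32) -> Result<(), u32> {
    let mut guard = queues.lock().unwrap();
    if guard.io.len() == guard.io.capacity() {
        return Err(queue);
    }
    guard.io.push(queue);
    Ok(())
}
```

With capacity reserved up front, the later pushes stay within the existing
buffer and never reach the allocator while the lock is held. Whether this
shape fits the real driver depends on how the queues are set up, so treat
it as an illustration only.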

The NVMe driver is very much a prototype and I expect there are many
bugs like this still in it. So while I am not surprised, I really
appreciate the report :)

[1] https://github.com/Rust-for-Linux/linux/pull/815
[2] https://lore.kernel.org/rust-for-linux/Yyr5pKpjib%2Fyqk5e@xxxxxxxxx/T/#mb55cf54067002d503ca63c5ad0688d55c6184cca
[3] https://github.com/metaspace/rust-linux/blob/nvme/drivers/block/nvme_mq.rs#L261

Best regards,
Andreas Hindborg