Re: [PATCH v1] rcu: Fix and improve RCU read lock checks when !CONFIG_DEBUG_LOCK_ALLOC

From: Gao Xiang
Date: Thu Jul 13 2023 - 15:00:39 EST




On 2023/7/14 02:14, Paul E. McKenney wrote:
On Fri, Jul 14, 2023 at 12:09:27AM +0800, Alan Huang wrote:

...


From what Sandeep described, the code path is in an RCU reader. My
question is more, why doesn't it use SRCU instead since it clearly
does so if BLK_MQ_F_BLOCKING. What are the tradeoffs? IMHO, a deeper
dive needs to be made into that before concluding that the fix is to
use rcu_read_lock_any_held().

Copied from [1]:

"Background: Historically erofs would always schedule a kworker for
decompression which would incur the scheduling cost regardless of
the context. But z_erofs_decompressqueue_endio() may not always
be in atomic context and we could actually benefit from doing the
decompression in z_erofs_decompressqueue_endio() if we are in
thread context, for example when running with dm-verity.
This optimization was later added in patch [2] which has shown
improvement in performance benchmarks.”

I’m not sure if it is a design issue.

What do you mean a design issue, honestly? I feel uneasy to hear this.


I have no official opinion myself, but there are quite a few people
who firmly believe that any situation like this one (where driver or
file-system code needs to query the current context to see if blocking
is OK) constitutes a design flaw. Such people might argue that this
code path should have a clearly documented context, and that if that
documentation states that the code might be in atomic context, then the
driver/fs should assume atomic context. Alternatively, if driver/fs


I don't think a software decoder (for example, decompression) should be
left in the atomic context honestly.

Regardless of the decompression speed of some algorithm theirselves
(considering very slow decompression on very slow devices), it means
that we also don't have a way to vmap or likewise (considering
decompression + handle extra deduplication copies) in the atomic
context, even memory allocation has to be in an atomic way.


Especially now have more cases that decodes in the RCU reader context
apart from softirq contexts?


needs the context to be non-atomic, the callers should make it so.

See for example in_atomic() and its comment header:

/*
* Are we running in atomic context? WARNING: this macro cannot
* always detect atomic context; in particular, it cannot know about
* held spinlocks in non-preemptible kernels. Thus it should not be
* used in the general case to determine whether sleeping is possible.
* Do not use in_atomic() in driver code.
*/
#define in_atomic() (preempt_count() != 0)

In the immortal words of Dan Frye, this should be good clean fun! ;-)

Honestly, I think such helper (to show whether it's in the atomic context)
is useful to driver users, even it could cause some false positive in some
configuration but it's acceptable.


Anyway, I could "Always use a workqueue.", but again, the original commit
was raised by a real vendor (OPPO), and I think if dropping this, all
downstream users which use dm-verity will be impacted and individual end
users will not be happy as well.

Thanks,
Gao Xiang