Re: [PATCH v2 1/2] maple_tree: Disable mas_wr_append() when other readers are possible

From: Peng Zhang
Date: Thu Aug 31 2023 - 04:45:44 EST




在 2023/8/31 16:25, Geert Uytterhoeven 写道:
Hi Michael,

On Thu, Aug 31, 2023 at 7:39 AM Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote:
Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> writes:
On Fri, 18 Aug 2023, Liam R. Howlett wrote:
The current implementation of append may cause duplicate data and/or
incorrect ranges to be returned to a reader during an update. Although
this has not been reported or seen, disable the append write operation
while the tree is in rcu mode out of an abundance of caution.

During the analysis of the mas_next_slot() the following was
artificially created by separating the writer and reader code:

Writer: reader:
mas_wr_append
set end pivot
updates end metata
Detects write to last slot
last slot write is to start of slot
store current contents in slot
overwrite old end pivot
mas_next_slot():
read end metadata
read old end pivot
return with incorrect range
store new value

Alternatively:

Writer: reader:
mas_wr_append
set end pivot
updates end metata
Detects write to last slot
last lost write to end of slot
store value
mas_next_slot():
read end metadata
read old end pivot
read new end pivot
return with incorrect range
set old end pivot

There may be other accesses that are not safe since we are now updating
both metadata and pointers, so disabling append if there could be rcu
readers is the safest action.

Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>

Thanks for your patch, which is now commit cfeb6ae8bcb96ccf
("maple_tree: disable mas_wr_append() when other readers are
possible") in v6.5, and is being backported to stable.

On Renesas RZ/A1 and RZ/A2 (single-core Cortex-A9), this causes the
following warning:

clocksource: timer@e803b000: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 28958491609 ns
sched_clock: 32 bits at 66MHz, resolution 15ns, wraps every 32537631224ns
/soc/timer@e803b000: used for clocksource
/soc/timer@e803c000: used for clock events
+------------[ cut here ]------------
+WARNING: CPU: 0 PID: 0 at init/main.c:992 start_kernel+0x2f0/0x480
+Interrupts were enabled early
...

I do not see this issue on any other platform
(arm/arm64/risc-v/mips/sh/m68k), several of them use the same
RCU configuration.

There's something similar on pmac32 / mac99.

Do you have a clue?

It seems something in the maple tree code is setting TIF_NEED_RESCHED,
and that causes a subsequent call to cond_resched() to call schedule()
and enable interrupts.

On pmac32 enabling CONFIG_DEBUG_ATOMIC_SLEEP fixes/hides the problem.
But I don't see why.

Enabling CONFIG_DEBUG_ATOMIC_SLEEP on RZ/A1 and RZ/A2 does
fix the problem.
But there must be more to it, as some of my test configs had it enabled,
and others hadn't, while only RZ/A showed the issue.
I tried disabling it on R-Car M2-W (arm32) and R-Car H3 (arm64), and
that did not cause the problem to happen...

I guess this patch triggers a potential problem with the maple tree.
I don't have the environment to trigger the problem. Can you apply the
following patch to see if the problem still exists? This can help locate
the root cause. At least narrow down the scope of the code that has bug.

Thanks.

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index f723024e1426..1b4b6f6e3095 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4351,9 +4351,6 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas)
if (new_end == wr_mas->node_end && mas_wr_slot_store(wr_mas))
return;

- if (mas_wr_node_store(wr_mas, new_end))
- return;
-
if (mas_is_err(mas))
return;



Gr{oetje,eeting}s,

Geert