Re: [PATCH 2/2] membarrier: riscv: Provide core serializing command

From: Mathieu Desnoyers
Date: Wed Nov 29 2023 - 15:01:02 EST


On 2023-11-29 13:29, Andrea Parri wrote:
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 217fd4de61342..f63222513076d 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -323,6 +323,23 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	if (unlikely(prev == next))
 		return;
+#if defined(CONFIG_MEMBARRIER) && defined(CONFIG_SMP)
+	/*
+	 * The membarrier system call requires a full memory barrier
+	 * after storing to rq->curr, before going back to user-space.
+	 *
+	 * Only need the full barrier when switching between processes:
+	 * barrier when switching from kernel to userspace is not
+	 * required here, given that it is implied by mmdrop(); barrier
+	 * when switching from userspace to kernel is not needed after
+	 * store to rq->curr.
+	 */
+	if (unlikely(atomic_read(&next->membarrier_state) &
+		     (MEMBARRIER_STATE_PRIVATE_EXPEDITED |
+		      MEMBARRIER_STATE_GLOBAL_EXPEDITED)) && prev)
+		smp_mb();
+#endif

The approach looks good. Please implement it within a separate
membarrier_arch_switch_mm() as done on powerpc.

Will do. Thanks.

As regards the Fixes: tag, I guess it boils down to what we want or we
need to take for commit "riscv: Support membarrier private cmd". :-)

I'm not seeing this commit in the Linux master branch. Am I missing
something?

FWIW, a quick git-log search confirmed that MEMBARRIER has been around
for quite some time in the RISC-V world (though I'm not familiar with
any of its mainstream uses): commit 1464d00b27b2 says (at least) since
93917ad50972 ("RISC-V: Add support for restartable sequence"). I am
currently inclined to pick the latter commit (and check it w/ Palmer),
but other suggestions are welcome.

Supporting membarrier private expedited has not been optional since Linux 4.14:

see kernel/sched/core.c:

	membarrier_switch_mm(rq, prev->active_mm, next->mm);
	/*
	 * sys_membarrier() requires an smp_mb() between setting
	 * rq->curr / membarrier_switch_mm() and returning to userspace.
	 *
	 * The below provides this either through switch_mm(), or in
	 * case 'prev->active_mm == next->mm' through
	 * finish_task_switch()'s mmdrop().
	 */
	switch_mm_irqs_off(prev->active_mm, next->mm, next);

Failure to provide the required barrier is a bug in the architecture's
switch_mm implementation when CONFIG_MEMBARRIER=y.
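
To make the requirement concrete, here is a minimal userspace model of
what an arch hook along the lines of membarrier_arch_switch_mm() could
look like. It is only a sketch, not the kernel code: struct mm_struct is
a stub, the flag values are placeholders, smp_mb() is emulated with a
C11 fence, the barriers_issued counter exists only for demonstration,
and the task_struct argument of the real hook is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Placeholder values for this model; not the kernel's definitions. */
enum {
	MEMBARRIER_STATE_PRIVATE_EXPEDITED = 1 << 0,
	MEMBARRIER_STATE_GLOBAL_EXPEDITED  = 1 << 1,
};

/* Stub standing in for the kernel's struct mm_struct. */
struct mm_struct {
	atomic_int membarrier_state;
};

/* Demonstration-only counter so the model's behaviour can be observed. */
static int barriers_issued;

/* Userspace stand-in for the kernel's smp_mb(). */
static void smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
	barriers_issued++;
}

/*
 * Issue a full barrier only when switching between two user mms and
 * the incoming mm has registered for an expedited membarrier command.
 */
static void membarrier_arch_switch_mm(struct mm_struct *prev,
				      struct mm_struct *next)
{
	int state;

	assert(next != NULL);

	/* Coming from a kernel thread: assumed covered by mmdrop(). */
	if (!prev)
		return;

	state = atomic_load_explicit(&next->membarrier_state,
				     memory_order_relaxed);
	if (!(state & (MEMBARRIER_STATE_PRIVATE_EXPEDITED |
		       MEMBARRIER_STATE_GLOBAL_EXPEDITED)))
		return;

	smp_mb();
}
```

In this model, a switch into an mm with MEMBARRIER_STATE_PRIVATE_EXPEDITED
set issues exactly one barrier, while a switch with prev == NULL or into
an unregistered mm issues none.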

We should probably introduce a new
Documentation/features/sched/membarrier/arch-support.txt
to clarify this requirement.

Userspace code such as liburcu [1] heavily relies on membarrier private
expedited (when available) to speed up RCU read-side critical sections.
Various DNS servers, including BIND 9, use liburcu.
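
For reference, a minimal sketch of how a userspace process can register
for and issue the private expedited command through the raw system call.
This is not liburcu's code: the wrapper and helper names
(membarrier(), register_private_expedited(), heavy_barrier()) are made
up for illustration, and error handling is reduced to returning -1.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw system call (hypothetical helper name). */
static int membarrier(int cmd, unsigned int flags, int cpu_id)
{
	assert(cmd >= 0);	/* membarrier commands are non-negative */
	return (int)syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/*
 * Register the calling process for private expedited membarrier.
 * Returns 0 on success, -1 with errno set otherwise.
 */
static int register_private_expedited(void)
{
	int mask = membarrier(MEMBARRIER_CMD_QUERY, 0, 0);

	if (mask < 0)
		return -1;	/* membarrier unavailable or blocked */
	if (!(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED)) {
		errno = ENOSYS;	/* command not supported by this kernel */
		return -1;
	}
	return membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0, 0);
}

/*
 * Have every running thread of this process execute a full memory
 * barrier. Only valid after successful registration.
 */
static int heavy_barrier(void)
{
	return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0, 0);
}
```

A writer would call register_private_expedited() once at startup and
heavy_barrier() on its slow path, which is what lets the read side get
away with a compiler barrier instead of a full fence.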

Thanks,

Mathieu

[1] https://liburcu.org


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com