Re: [RFC PATCH v3] sched: Fix performance regression introduced by mm_cid

From: Peter Zijlstra
Date: Tue Apr 11 2023 - 07:53:50 EST


On Tue, Apr 11, 2023 at 01:03:45PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 07, 2023 at 09:14:36PM -0400, Mathieu Desnoyers wrote:


> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index bc0e1cd0d6ac..f3e7dc2cd1cc 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -3354,6 +3354,37 @@ static inline int mm_cid_get(struct mm_struct *mm)
> >  static inline void switch_mm_cid(struct task_struct *prev, struct task_struct *next)
> >  {
> > +	/*
> > +	 * Provide a memory barrier between rq->curr store and load of
> > +	 * {prev,next}->mm->pcpu_cid[cpu] on rq->curr->mm transition.
> > +	 *
> > +	 * Should be adapted if context_switch() is modified.
> > +	 */
> > +	if (!next->mm) {				// to kernel
> > +		/*
> > +		 * user -> kernel transition does not guarantee a barrier, but
> > +		 * we can use the fact that it performs an atomic operation in
> > +		 * mmgrab().
> > +		 */
> > +		if (prev->mm)				// from user
> > +			smp_mb__after_mmgrab();
> > +		/*
> > +		 * kernel -> kernel transition does not change rq->curr->mm
> > +		 * state. It stays NULL.
> > +		 */
> > +	} else {					// to user
> > +		/*
> > +		 * kernel -> user transition does not provide a barrier
> > +		 * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
> > +		 * Provide it here.
> > +		 */
> > +		if (!prev->mm)				// from kernel
> > +			smp_mb();
> > +		/*
> > +		 * user -> user transition guarantees a memory barrier through
> > +		 * switch_mm().
> > +		 */
> > +	}

The possibly nicer way to write all this is:

	if (!prev->mm != !next->mm)
		smp_mb();

And then clean up the mm{grab,drop}_lazy_tlb() helpers. But we can
always do that later if we indeed end up with all this ...
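
For reference, a minimal userspace sketch of why the two forms agree on
where a barrier is issued. This is not kernel code: the struct
definitions are hypothetical stand-ins that carry only ->mm, the helper
names barrier_verbose(), barrier_compact() and check_all_transitions()
are made up for illustration, and the barriers themselves are modelled
as a boolean "a barrier is issued here". It does not model the
difference in strength between smp_mb__after_mmgrab() and smp_mb().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only ->mm matters here. */
struct mm_struct { int unused; };
struct task_struct { struct mm_struct *mm; };

/* Branchy form from the quoted patch: true where a barrier is issued. */
static bool barrier_verbose(const struct task_struct *prev,
			    const struct task_struct *next)
{
	if (!next->mm) {		/* to kernel */
		if (prev->mm)		/* from user: smp_mb__after_mmgrab() */
			return true;
		return false;		/* kernel -> kernel: mm stays NULL */
	}
	if (!prev->mm)			/* kernel -> user: smp_mb() */
		return true;
	return false;			/* user -> user: left to switch_mm() */
}

/* Compact form: barrier only when exactly one side has a NULL mm. */
static bool barrier_compact(const struct task_struct *prev,
			    const struct task_struct *next)
{
	return !prev->mm != !next->mm;
}

/*
 * Walk every prev/next pairing of one kernel thread and two user tasks,
 * assert that both forms agree, and return how many pairings need a
 * barrier.
 */
static int check_all_transitions(void)
{
	struct mm_struct mm_a, mm_b;
	struct task_struct kthread = { .mm = NULL };
	struct task_struct user_a = { .mm = &mm_a };
	struct task_struct user_b = { .mm = &mm_b };
	const struct task_struct *tasks[] = { &kthread, &user_a, &user_b };
	int barriers = 0;

	for (size_t i = 0; i < 3; i++) {
		for (size_t j = 0; j < 3; j++) {
			bool v = barrier_verbose(tasks[i], tasks[j]);

			assert(v == barrier_compact(tasks[i], tasks[j]));
			barriers += v;
		}
	}
	return barriers;
}
```

Calling check_all_transitions() from a test harness exercises all four
transition kinds (user -> kernel, kernel -> kernel, kernel -> user,
user -> user). The compact test fires only on the two transitions where
rq->curr->mm changes between NULL and non-NULL, which are the same two
that the branchy version covers.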