Re: [RFC PATCH 0/2] Introduce serialized smp_call_function APIs

From: Avi Kivity
Date: Wed Mar 13 2024 - 18:30:58 EST


On Wed, 2024-03-13 at 18:06 -0400, Mathieu Desnoyers wrote:
> On 2024-03-13 17:14, Avi Kivity wrote:
> > On Wed, 2024-03-13 at 16:56 -0400, Mathieu Desnoyers wrote:
> > > commit 944d5fe50f3f ("sched/membarrier: reduce the ability to
> > > hammer on sys_membarrier") introduces a mutex over all membarrier
> > > operations to reduce its ability to slow down the rest of the
> > > system.
> > >
> > > This RFC series has two objectives:
> > >
> > > 1) Move this mutex to the smp_call_function APIs so other system
> > >    calls using smp_call_function IPIs are limited in the same way,
> > >
> > > 2) Restore scalability of MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ
> > >    with MEMBARRIER_CMD_FLAG_CPU, which targets specific CPUs with
> > >    IPIs. This may or may not be useful, and I would welcome
> > >    benchmarks from users of this feature to figure out if this is
> > >    worth it.
> > >
> > > This series applies on top of v6.8.
> > >
> >
> >
> > I see this doesn't restore scaling of
> > MEMBARRIER_CMD_PRIVATE_EXPEDITED, which I use (and wasn't aware was
> > broken).
>
> It's mainly a mitigation for IPI Storming: CVE-2024-26602 disclosed


Very interesting.


> as part of [1].
>
> >
> > I don't have comments on the patches, but do have ideas on how to
> > work around the problem in Seastar. So this was a useful heads-up
> > for me.
>
> Note that if you don't use membarrier private expedited too heavily,
> you should not notice any difference. But nevertheless I would be
> interested to hear about any regression on performance of real
> workloads resulting from commit 944d5fe50f3f.
>


In fact I did observe the claim in 944d5fe50f3f's commit message ("On
some systems, sys_membarrier can be very expensive, causing overall
slowdowns for everything") to be true [1]. So rather than causing a
regression, this commit prompted me to fix a problem.

The smp_call_function_many_cond() in [1] is very likely due to
sys_membarrier, and it's slow because the workload runs on a virtual
machine without posted-interrupt virtualization. We usually detect
virtual machines and call membarrier() less frequently, but on that
instance type (AWS d3en) the detection failed, triggering the IPI
storm.

My fix is simply to detect whether a concurrent membarrier() call is
already in flight and fall back to doing something else; I don't think
it's generally applicable.
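To make the shape of that workaround concrete, here is a minimal
sketch, not Seastar's actual code: a single atomic flag gates
membarrier() so that when one call is already in flight, concurrent
callers take a hypothetical cheaper fallback path instead of piling
more IPI-generating calls onto the system. The names
(try_begin_membarrier, end_membarrier) are illustrative assumptions.

```cpp
#include <atomic>

// Hypothetical helper (not Seastar's implementation): tracks whether a
// membarrier() system call is currently in flight.
static std::atomic<int> membarrier_in_flight{0};

// Returns true if the caller won the right to issue membarrier().
// A caller that gets false should use some cheaper fallback
// (e.g. per-thread signalling) rather than stacking another
// IPI-generating membarrier() on top of the running one.
inline bool try_begin_membarrier() {
    int expected = 0;
    return membarrier_in_flight.compare_exchange_strong(
        expected, 1, std::memory_order_acquire);
}

// Clears the flag once the membarrier() call has returned.
inline void end_membarrier() {
    membarrier_in_flight.store(0, std::memory_order_release);
}
```

A caller would wrap its membarrier() invocation between
try_begin_membarrier() and end_membarrier(); the compare-exchange
ensures exactly one thread at a time takes the expensive path.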

[1] https://github.com/scylladb/scylladb/issues/17207