Re: [RFC PATCH 2/4] swap: apply new local_schedule_work_on() interface

From: Leonardo Brás
Date: Mon Aug 28 2023 - 20:19:06 EST


On Tue, 2023-08-08 at 16:39 -0300, Marcelo Tosatti wrote:
> On Sat, Jul 29, 2023 at 05:37:33AM -0300, Leonardo Bras wrote:
> > Make use of the new local_*lock_n*() and local_schedule_work_on() interface
> > to improve performance & latency on PREEMTP_RT kernels.
> >
> > For functions that may be scheduled in a different cpu, replace
> > local_*lock*() by local_lock_n*(), and replace schedule_work_on() by
> > local_schedule_work_on(). The same happens for flush_work() and
> > local_flush_work().
> >
> > This should bring no relevant performance impact on non-RT kernels:
> > For functions that may be scheduled in a different cpu, the local_*lock's
> > this_cpu_ptr() becomes a per_cpu_ptr(smp_processor_id()).
> >
> > Signed-off-by: Leonardo Bras <leobras@xxxxxxxxxx>
> > ---
> > mm/swap.c | 18 +++++++++---------
> > 1 file changed, 9 insertions(+), 9 deletions(-)
>
> Leo,
>
> I think the interruptions should rather be removed for both
> CONFIG_PREEMPT_RT AND !CONFIG_PREEMPT_RT.
>
> The impact of grabbing locks must be properly analyzed and not
> "rejected blindly".

Yes, I agree with the idea, but we have been noticing a lot of rejection for
these ideas lately, as the maintainers perceive this as a big change.

My idea here is to provide a general way to improve the PREEMPT_RT scenarios CPU
Isolation, even though there is resistance in !PREEMPT_RT case.

As I commented, spinlocks are already held by PREEMPT_RT's local_locks hotpaths,
and yet are using schedule_work_on() to have remote work done. This patch tries
to solve this by using spin_lock(remote_cpu_lock) instead and save a lot of
cycles while decreasing IPI at the remote cpu.

It looks a simple solution, improving isolation and performance on PREEMPT_RT
with no visible drawbacks. I agree the interface is not ideal, and for that I
really need you guys help.

I understand that having this change merged, we will have more precedence to
discuss performance for the !PREEMPT_RT case.

What do you think on that?

Thanks!
Leo

>
> Example:
>
> commit 01b44456a7aa7c3b24fa9db7d1714b208b8ef3d8
> Author: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Date: Fri Jun 24 13:54:23 2022 +0100
>
> mm/page_alloc: replace local_lock with normal spinlock
>
> struct per_cpu_pages is no longer strictly local as PCP lists can be
> drained remotely using a lock for protection. While the use of local_lock
> works, it goes against the intent of local_lock which is for "pure CPU
> local concurrency control mechanisms and not suited for inter-CPU
> concurrency control" (Documentation/locking/locktypes.rst)
>
> local_lock protects against migration between when the percpu pointer is
> accessed and the pcp->lock acquired. The lock acquisition is a preemption
> point so in the worst case, a task could migrate to another NUMA node and
> accidentally allocate remote memory. The main requirement is to pin the
> task to a CPU that is suitable for PREEMPT_RT and !PREEMPT_RT.
>