Re: [RFC PATCH v2 1/4] rseq: Add sched_state field to struct rseq

From: Peter Zijlstra
Date: Thu Sep 28 2023 - 06:40:12 EST


On Mon, May 29, 2023 at 03:14:13PM -0400, Mathieu Desnoyers wrote:
> Expose the "on-cpu" state for each thread through struct rseq to allow
> adaptive mutexes to decide more accurately between busy-waiting and
> calling sys_futex() to release the CPU, based on the on-cpu state of the
> mutex owner.
>
> It is only provided as an optimization hint, because there is no
> guarantee that the page containing this field is in the page cache, and
> therefore the scheduler may very well fail to clear the on-cpu state on
> preemption. This is expected to be rare though, and is resolved as soon
> as the task returns to user-space.
>
> The goal is to improve use-cases where the duration of the critical
> sections for a given lock follows a multi-modal distribution, preventing
> statistical guesses from doing a good job at choosing between busy-wait
> and futex wait behavior.

As always, are syscalls really *that* expensive? Why can't we busy wait
in the kernel instead?

I mean, sure, Meltdown sucked, but most people should now be running
chips that are not affected by that particular horror show, no?
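
For reference, the userspace decision the quoted description is about can
be sketched roughly as below. This is only an illustration under stated
assumptions, not code from the patch: choose_wait(), enum wait_action and
spin_limit are hypothetical names, and owner_on_cpu stands in for whatever
a waiter would read from the lock owner's proposed sched_state field. The
spin bound is there because the description says the field is only a hint
and may be stale.

```c
#include <stdbool.h>

/* What a waiter should do next on a contended lock. */
enum wait_action { WAIT_SPIN, WAIT_BLOCK };

/*
 * Hypothetical helper, not part of the patch.
 *
 * owner_on_cpu: assumed to come from the owner's on-cpu hint.
 * spins_done:   how many spin iterations the waiter has already done.
 * spin_limit:   upper bound on spinning, since the hint may be stale.
 */
static enum wait_action choose_wait(bool owner_on_cpu,
				    unsigned int spins_done,
				    unsigned int spin_limit)
{
	/* Owner is not running: spinning cannot help, release the CPU. */
	if (!owner_on_cpu)
		return WAIT_BLOCK;

	/* Owner looks on-cpu, but do not trust the hint forever. */
	if (spins_done >= spin_limit)
		return WAIT_BLOCK;

	/* Owner is running and the budget is not spent: keep busy-waiting. */
	return WAIT_SPIN;
}
```

A lock slow path would call this once per iteration and fall back to a
futex wait as soon as it returns WAIT_BLOCK.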