Re: [RFC][PATCH 0/3] sched: User Managed Concurrency Groups

From: Peter Zijlstra
Date: Wed Dec 15 2021 - 13:19:28 EST


On Wed, Dec 15, 2021 at 09:56:06AM -0800, Peter Oskolkov wrote:

> > Right, so the problem I'm having is that a single idle server ptr like
> > before can trivially miss waking another idle server.
>
> I believe the approach I used in my patchset, suggested by Thierry
> Delisle, works.
>
> In short, there is a single idle server ptr for the kernel to work
> with. The userspace maintains a list of idle servers. If the ptr is
> NULL, the list is empty. When the kernel wakes the idle server it
> sees, the server reaps the runnable worker list and wakes another idle
> server from the userspace list, if available. This newly woken idle
> server repoints the ptr to itself, checks the runnable worker list, to
> avoid missing a woken worker, then goes to sleep.
>
> Why do you think this approach is not OK?

Suppose at least 4 servers, 2 idle, 2 working.

Now, the first of the working servers (let's call it S0) gets a wakeup
(say Ta), it finds the idle server (say S3) and consumes it, sticking Ta
on S3 and kicking it alive.

Concurrently, and losing the race, the other working server (S1) gets a
wake-up from Tb. As said, it lost: there is no idle server left to find,
so Tb goes on the queue of S1.

So then S3 wakes, finds Ta and they live happily ever after.

While S2 and Tb fail to meet one another and both linger in sadness.
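
To make the sequence concrete, here is a rough sequential model of the
scenario above. It is only a sketch: the struct layout, the names
(claim_idle, wake_task, run_scenario) and the plain load/store standing
in for an atomic xchg are all made up for illustration, and none of it
is the UMCG API or code from either patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; all names here are illustrative assumptions. */
enum { NR_SERVERS = 4, NO_TASK = -1, TA = 100, TB = 101 };

struct server {
	bool idle;	/* asleep, waiting to be kicked */
	int running;	/* task handed over directly, or NO_TASK */
	int queued;	/* task left on this server's queue, or NO_TASK */
};

struct model {
	struct server srv[NR_SERVERS];
	struct server *idle_ptr;	/* the single idle-server pointer */
};

/* Consume the advertised idle server; stands in for an atomic xchg(). */
static struct server *claim_idle(struct model *m)
{
	struct server *s = m->idle_ptr;

	m->idle_ptr = NULL;
	return s;
}

/* A wakeup of @task is handled on the working server @self. */
static void wake_task(struct model *m, struct server *self, int task)
{
	struct server *target;

	assert(self && !self->idle);

	target = claim_idle(m);
	if (target) {
		target->running = task;	/* stick the task on it ... */
		target->idle = false;	/* ... and kick it alive */
	} else {
		self->queued = task;	/* lost the race: queue locally */
	}
}

/* S0, S1 working; S2, S3 idle; only S3 is visible through idle_ptr. */
static void run_scenario(struct model *m)
{
	for (int i = 0; i < NR_SERVERS; i++)
		m->srv[i] = (struct server){ false, NO_TASK, NO_TASK };

	m->srv[2].idle = true;
	m->srv[3].idle = true;
	m->idle_ptr = &m->srv[3];

	wake_task(m, &m->srv[0], TA);	/* S0 wins, consumes S3 */
	wake_task(m, &m->srv[1], TB);	/* S1 loses, sees NULL */
}
```

After run_scenario(), S3 is running Ta, Tb sits on S1's queue, and S2 is
still marked idle with nothing in this model ever waking it, which is
the missed wakeup described above.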