Re: Ping: [PATCH v15 00/13] support "task_isolation" mode

From: Chris Metcalf
Date: Fri Sep 30 2016 - 21:55:35 EST


On 9/27/2016 10:35 AM, Frederic Weisbecker wrote:
> On 8/16/2016 5:19 PM, Chris Metcalf wrote:
>> Here is a respin of the task-isolation patch set.
>>
>> Again, I have been getting email asking me when and where this patch
>> will be upstreamed so folks can start using it. I had been thinking
>> the obvious path was via Frederic Weisbecker to Ingo as a NOHZ kind of
>> thing. But perhaps it touches enough other subsystems that that
>> doesn't really make sense? Andrew, would it make sense to take it
>> directly via your tree? Frederic, Ingo, what do you think?
> As it seems we are still debating a lot of things in this patchset that has already
> reached v15, I think you should split it in smaller steps in order to move forward
> and only get into the next step once the previous is merged.
>
> You could start with a first batch that introduces the prctl() and does the best effort
> one-shot isolation part. Which means the actions that only need to be performed once
> on the prctl call.

So combining this with my reply a moment ago to Andy about just
disabling all deferrable work creation on task isolation cores, that
means we just need a way of checking that the dyntick is off on return
from the prctl.

We could do this in the prctl() itself, but it feels a bit fragile, since
we could do the check for no dyntick and try to return success,
and then some kind of interrupt and/or schedule event might happen
and by the time we actually got back to userspace the dyntick might
be running again.

I think what we can do is arrange to set a bit in the process state
that says we are returning from prctl. Then, right as we are
returning to userspace with interrupts disabled, we can check whether
that bit is set. If it is, we check at that point whether the dyntick
is enabled, and if so, force the syscall return value to -EAGAIN.
We clear the bit regardless.
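
To make the ordering concrete, here is a minimal userspace model of that
return-path check. It is only a sketch, not code from the patch series:
struct task_model, TASK_FLAG_ISOL_PRCTL_RETURN and isolation_fixup_return()
are hypothetical names, and the "interrupts disabled" condition is simply
assumed by the caller.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag: "this task is returning from the isolation prctl". */
#define TASK_FLAG_ISOL_PRCTL_RETURN (1u << 0)

/* Hypothetical stand-in for the per-task state bits. */
struct task_model {
    unsigned int flags;
};

/*
 * Models the final check before returning to userspace. The caller is
 * assumed to have interrupts disabled, so tick_running cannot change
 * between this check and the actual return.
 */
static long isolation_fixup_return(struct task_model *t, bool tick_running,
                                   long retval)
{
    if (!(t->flags & TASK_FLAG_ISOL_PRCTL_RETURN))
        return retval;                          /* not our syscall: untouched */

    t->flags &= ~TASK_FLAG_ISOL_PRCTL_RETURN;   /* clear the bit regardless */

    if (tick_running)
        return -EAGAIN;                         /* userspace must retry */

    return retval;
}
```

The point of doing the check this late is that nothing can restart the
tick between the check and the return to userspace.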

Within the prctl() code itself, we check for hard prerequisites like being on
a task-isolation cpu, and fail with -EINVAL if not.

The upshot is that we end up spinning in a loop through userspace where
we keep retrying the prctl() until the timer quiesces.
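
From the application side, that retry loop might look roughly like the
sketch below. The names enter_isolation(), isolate_fn and fake_isolate()
are hypothetical; the callback stands in for one call to whatever prctl()
request this series ends up defining, and the retry bound is an addition
for the sake of the example.

```c
#include <errno.h>

/*
 * Hypothetical callback: one attempt to enable isolation. Returns 0 on
 * success or -1 with errno set, following the prctl(2) convention.
 */
typedef int (*isolate_fn)(void *ctx);

/* Retry while the kernel reports EAGAIN; stop on success or hard failure. */
static int enter_isolation(isolate_fn try_isolate, void *ctx, long max_tries)
{
    for (long i = 0; i < max_tries; i++) {
        if (try_isolate(ctx) == 0)
            return 0;           /* tick was quiesced on return */
        if (errno != EAGAIN)
            return -1;          /* e.g. EINVAL: not an isolation cpu */
    }
    errno = EAGAIN;             /* gave up while the tick was still running */
    return -1;
}

/*
 * Demo stand-in for the real syscall: fails with EAGAIN while *ctx is
 * positive (counting down), fails with EINVAL if *ctx is negative.
 */
static int fake_isolate(void *ctx)
{
    int *ticks_left = ctx;

    if (*ticks_left > 0) {
        (*ticks_left)--;
        errno = EAGAIN;
        return -1;
    }
    if (*ticks_left < 0) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

Only EAGAIN is retried; a hard prerequisite failure such as EINVAL is
reported to the caller immediately.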

> Once we get that merged we can focus on what needs to be performed on every return
> to userspace if that's really needed. Including possibly waiting on some completion.

So in NOSIG mode, instead of setting -EAGAIN in the return to
userspace path, we arrange to just wait. We can figure out in a
follow-on patch whether we want to wait by spinning in some way
or by actually waiting on a completion. For now I'll just include the
remainder of the patch (with spinning) as an RFC just so people
can have the next piece to look ahead to, but I like your idea of
breaking it out of the main patch series entirely.

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com