Re: [PATCH 26/30] sched: handle preempt=voluntary under PREEMPT_AUTO

From: Ankur Arora
Date: Thu Mar 07 2024 - 22:50:54 EST



Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> writes:

> On 3/7/2024 2:01 PM, Paul E. McKenney wrote:
>> On Wed, Mar 06, 2024 at 03:42:10PM -0500, Joel Fernandes wrote:
>>> Hi Ankur,
>>>
>>> On 3/5/2024 3:11 AM, Ankur Arora wrote:
>>>>
>>>> Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> writes:
>>>>
>>> [..]
>>>>> IMO, just kill 'voluntary' if PREEMPT_AUTO is enabled. There is no
>>>>> 'voluntary' business because
>>>>> 1. The behavior vs =none is to allow higher scheduling class to preempt, it
>>>>> is not about the old voluntary.
>>>>
>>>> What do you think about folding the higher scheduling class preemption logic
>>>> into preempt=none? As Juri pointed out, prioritization of at least the leftmost
>>>> deadline task needs to be done for correctness.
>>>>
>>>> (That'll get rid of the current preempt=voluntary model, at least until
>>>> there's a separate use for it.)
>>>
>>> Yes I am all in support for that. It's less confusing for the user as well, and
>>> scheduling higher priority class at the next tick for preempt=none sounds good
>>> to me. That is still an improvement for folks using SCHED_DEADLINE for whatever
>>> reason, with a vanilla CONFIG_PREEMPT_NONE=y kernel. :-P. If we want a new mode
>>> that is more aggressive, it could be added in the future.
>>
>> This would be something that happens only after removing cond_resched()
>> might_sleep() functionality from might_sleep(), correct?
>
> Firstly, maybe I misunderstood Ankur completely. Re-reading his comments above,
> he seems to be suggesting preempting instantly for higher scheduling CLASSES
> even for preempt=none mode, without having to wait till the next
> scheduling-clock interrupt.

Yes, that's what I was suggesting.

> Not sure if that makes sense to me, I was asking not
> to treat "higher class" any differently than "higher priority" for preempt=none.

Ah. Understood.

> And if SCHED_DEADLINE has a problem with that, then it already happens so with
> CONFIG_PREEMPT_NONE=y kernels, so no need special treatment for higher class any
> more than the treatment given to higher priority within same class. Ankur/Juri?

No. I think that behaviour might be worse for PREEMPT_AUTO.

PREEMPT_NONE=y (or PREEMPT_VOLUNTARY=y for that matter) doesn't
quite have a policy around when preemption happens. Preemption
might happen quickly or slowly, depending on when the next
preemption point is reached.

Under PREEMPT_AUTO, the preempt=none policy in this series will
always defer preemption to user exit or the next tick. So it seems
like higher scheduling classes would be worse off more often than
they are with PREEMPT_NONE=y today.
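
To make the comparison concrete, here is a toy user-space model, not
kernel code. Every name in it (enum model, enum event,
first_preempt_index()) is hypothetical and made up for illustration.
It assumes that with PREEMPT_NONE=y a pending reschedule is acted on at
the first preemption point or user exit, and that with PREEMPT_AUTO,
preempt=none it is acted on at the first user exit or tick, which is a
simplification of what is described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical models being compared; these are not kernel identifiers. */
enum model {
	MODEL_STATIC_NONE,	/* stands in for PREEMPT_NONE=y */
	MODEL_AUTO_NONE,	/* stands in for PREEMPT_AUTO, preempt=none */
};

/* Hypothetical events seen by the running task after a wakeup. */
enum event {
	EV_OTHER,		/* nothing relevant to preemption */
	EV_PREEMPT_POINT,	/* an explicit preemption point */
	EV_TICK,		/* scheduling-clock interrupt */
	EV_USER_EXIT,		/* return to user space */
};

/*
 * Return the index of the first event at which the pending reschedule
 * is acted on under the given model, or -1 if none of the events
 * qualifies.
 */
int first_preempt_index(enum model m, const enum event *ev, int n)
{
	assert(ev != NULL || n == 0);

	for (int i = 0; i < n; i++) {
		bool hit;

		if (m == MODEL_STATIC_NONE)
			/* Assumption: only preemption points and user exit count. */
			hit = ev[i] == EV_PREEMPT_POINT || ev[i] == EV_USER_EXIT;
		else
			/* Assumption: only user exit and the tick count. */
			hit = ev[i] == EV_USER_EXIT || ev[i] == EV_TICK;

		if (hit)
			return i;
	}
	return -1;
}
```

If a preemption point shows up before the tick, the first model acts
earlier than the second one. If no preemption point shows up for a
while, the second model is bounded by the tick and the first is not.
The concern above is about the former case.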

But, I wonder what Juri thinks about this.

--
ankur