Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED

From: Thomas Gleixner
Date: Wed Sep 20 2023 - 16:51:18 EST

Next message: Nicolin Chen: "[PATCH v4 0/2] iommu/arm-smmu-v3: Allow default substream bypass with a pasid support"
Previous message: Linus Torvalds: "Re: [RFC] Should writes to /dev/urandom immediately affect reads?"
In reply to: Ankur Arora: "Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED"
Next in thread: Thomas Gleixner: "Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED"

On Wed, Sep 20 2023 at 07:22, Ankur Arora wrote:
> Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
>
>> So the decision matrix would be:
>>
>>                   Ret2user   Ret2kernel   PreemptCnt=0
>>
>> NEED_RESCHED         Y           Y            Y
>> LAZY_RESCHED         Y           N            N
>>
>> That is completely independent of the preemption model and the
>> differentiation of the preemption models happens solely at the scheduler
>> level:
>
> This is relatively minor, but do we need two flags? Seems to me we
> can get to the same decision matrix by letting the scheduler fold
> into the preempt-count based on current preemption model.

You still need the TIF flags because there is no way to do remote
modification of preempt count.

The preempt count folding is an optimization which simplifies the
preempt_enable logic:

	if (!--preempt_count && need_resched())
		schedule();
to
	if (!--preempt_count)
		schedule();

i.e. a single conditional instead of two.
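
A rough userspace sketch of the folding trick, for illustration only. It
assumes the scheme x86 uses, where the need-resched state is kept as an
inverted bit in the preempt count, so the count only reaches zero when the
nesting depth is zero and a reschedule is pending. All names below
(NEED_RESCHED_INVERTED, fold_need_resched(), run_demo()) are made up for
this model and are not the kernel's implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model, not kernel code. */
#define NEED_RESCHED_INVERTED 0x80000000u /* bit set = nothing pending */

/* Nesting depth 0, no reschedule pending. */
uint32_t preempt_count = NEED_RESCHED_INVERTED;
int schedule_calls;

void schedule(void)
{
	schedule_calls++;
	/* Request served: mark "nothing pending" again. */
	preempt_count |= NEED_RESCHED_INVERTED;
}

void preempt_disable(void)
{
	preempt_count++;
}

/* Local CPU folds a pending request into the count by clearing the bit. */
void fold_need_resched(void)
{
	preempt_count &= ~NEED_RESCHED_INVERTED;
}

void preempt_enable(void)
{
	/* Catch unbalanced enable calls in this model. */
	assert((preempt_count & ~NEED_RESCHED_INVERTED) != 0);

	/* Zero means: depth is 0 AND a reschedule is pending. */
	if (--preempt_count == 0)
		schedule();
}

/* Walks through both cases and returns the number of schedule() calls. */
int run_demo(void)
{
	preempt_disable();
	preempt_enable();	/* nothing pending: no schedule() */

	preempt_disable();
	preempt_disable();
	fold_need_resched();	/* request arrives while preemption is off */
	preempt_enable();	/* depth 2 -> 1: still disabled */
	preempt_enable();	/* depth 1 -> 0, request pending: schedule() */

	return schedule_calls;
}
```

In this model run_demo() ends up calling schedule() exactly once, on the
last preempt_enable(). Note that the fold is done by the local CPU; a
remote CPU still has to signal the request through the TIF flag.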

The lazy bit is only evaluated in:

1) The return to user path

2) need_resched()

In neither case preempt_count is involved.

So it does not buy us anything. We might revisit that later, but for
simplicity's sake the extra TIF bit is way simpler.

Premature optimization is the enemy of correctness.
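
For reference, the first decision matrix quoted above can be written down
as a small predicate. This is only a sketch of the table; the enum and
function names are hypothetical and the flag values do not correspond to
any architecture's real TIF bit numbers:

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch, not real TIF bit numbers. */
enum resched_flag {
	FLAG_NEED_RESCHED = 1u << 0,
	FLAG_LAZY_RESCHED = 1u << 1,
};

/* The three columns of the decision matrix. */
enum resched_point {
	POINT_RET2USER,
	POINT_RET2KERNEL,
	POINT_PREEMPT_ENABLE,	/* preempt count dropped to 0 */
};

/* Returns true where the matrix has a 'Y' for a flag that is set. */
bool should_reschedule(unsigned int flags, enum resched_point point)
{
	/* Return to user honours both NEED_RESCHED and LAZY_RESCHED. */
	if (point == POINT_RET2USER)
		return (flags & (FLAG_NEED_RESCHED | FLAG_LAZY_RESCHED)) != 0;

	/* Return to kernel and preempt_enable() only honour NEED_RESCHED. */
	return (flags & FLAG_NEED_RESCHED) != 0;
}
```

The preempt count does not appear anywhere in the lazy bit handling, which
is why folding it would not simplify anything here.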

>> We should be able to merge the PREEMPT_NONE/VOLUNTARY behaviour so that we
>> only end up with two variants or even subsume PREEMPT_FULL into that
>> model because that's what is closer to the RT LAZY preempt behaviour,
>> which has two goals:
>>
>> 1) Make low latency guarantees for RT workloads
>>
>> 2) Preserve the throughput for non-RT workloads
>>
>> But in any case this decision happens solely in the core scheduler code
>> and nothing outside of it needs to be changed.
>>
>> So we not only get rid of the cond/might_resched() muck, we also get rid
>> of the static_call/static_key machinery which drives PREEMPT_DYNAMIC.
>> The only place which still needs that runtime tweaking is the scheduler
>> itself.
>
> True. The dynamic preemption could just become a scheduler tunable.

That's the point.

>> But they support PREEMPT_COUNT, so we might get away with a reduced
>> preemption point coverage:
>>
>>                   Ret2user   Ret2kernel   PreemptCnt=0
>>
>> NEED_RESCHED         Y           N            Y
>> LAZY_RESCHED         Y           N            N
>
> So from the discussion in the other thread, for the ARCH_NO_PREEMPT
> configs that don't support preemption, we probably need a fourth
> preemption model, say PREEMPT_UNSAFE.

As discussed they won't really notice the latency issues because the
museum pieces are not used for anything crucial and for UM that's the
least of the correctness worries.

So no, we don't need yet another knob. We keep them chugging along and
if they really want they can adapt to the new world order. :)

Thanks,

tglx
