Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED

From: Paul E. McKenney
Date: Fri Oct 20 2023 - 17:59:47 EST


On Thu, Oct 19, 2023 at 12:13:31PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 19, 2023 at 02:21:35AM +0200, Thomas Gleixner wrote:
> > On Wed, Oct 18 2023 at 10:19, Paul E. McKenney wrote:
> > > On Wed, Oct 18, 2023 at 03:16:12PM +0200, Thomas Gleixner wrote:
> > >> On Tue, Oct 17 2023 at 18:03, Paul E. McKenney wrote:

[ . . . ]

> > >> In the end there is no CONFIG_PREEMPT_XXX anymore. The only knob
> > >> remaining would be CONFIG_PREEMPT_RT, which should be renamed to
> > >> CONFIG_RT or such as it does not really change the preemption
> > >> model itself. RT just reduces the preemption disabled sections with the
> > >> lock conversions, forced interrupt threading and some more.
> > >
> > > Again, please, no.
> > >
> > > There are situations where we still need rcu_read_lock() and
> > > rcu_read_unlock() to be preempt_disable() and preempt_enable(),
> > > respectively. Those can be cases selected only by Kconfig option, not
> > > available in kernels compiled with CONFIG_PREEMPT_DYNAMIC=y.
> >
> > Why are you so fixated on making everything hardcoded instead of making
> > it a proper policy decision problem. See above.
>
> Because I am one of the people who will bear the consequences.
>
> In that same vein, why are you so opposed to continuing to provide
> the ability to build a kernel with CONFIG_PREEMPT_RCU=n? This code
> is already in place, is extremely well tested, and you need to handle
> preempt_disable()/preempt_enable() regions of code in any case. What is
> the real problem here?

I should hasten to add that from a conceptual viewpoint, I do support
the eventual elimination of CONFIG_PREEMPT_RCU=n code, but with emphasis
on the word "eventual". Although preemptible RCU is plenty reliable if
you are running only a few thousand servers (and maybe even a few tens
of thousands), it has some improving to do before I will be comfortable
recommending its use in large-scale datacenters.

And yes, I know about Android deployments. But those devices tend
to spend very little time in the kernel; in fact, many of them tend to
spend very little time powered up. Plus they tend to have relatively few
CPUs, at least by 2020s standards. So it takes a rather large number of
Android devices to impose the same stress on the kernel that is imposed
by a single mid-sized server.

And we are working on making preemptible RCU more reliable. One nice
change over the past 5-10 years is that more people are getting serious
about digging into the RCU code, testing it, and reporting and fixing the
resulting bugs. I am also continuing to make rcutorture more vicious,
and of course I am greatly helped by the easier availability of hardware
with which to test RCU.

If this level of activity continues for another five years, then maybe
preemptible RCU will be ready for large datacenter deployments.

But I am guessing that you had something in mind in addition to code
consolidation.

Thanx, Paul