Re: Arches that don't support PREEMPT

From: Ingo Molnar
Date: Wed Sep 20 2023 - 03:29:33 EST



* Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> On Tue, Sep 19 2023 at 10:25, Linus Torvalds wrote:
> > On Tue, 19 Sept 2023 at 06:48, John Paul Adrian Glaubitz
> > <glaubitz@xxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> As Geert pointed out, I'm not seeing anything particularly problematic with the
> >> architectures lacking CONFIG_PREEMPT at the moment. This seems to be more
> >> something about organizing KConfig files.
> >
> > It can definitely be problematic.
> >
> > Not the Kconfig file part, and not the preempt count part itself.
> >
> > But the fact that it has never been used and tested means that there
> > might be tons of "this architecture code knows it's not preemptible,
> > because this architecture doesn't support preemption".
> >
> > So you may have basic architecture code that simply doesn't have the
> > "preempt_disable()/enable()" pairs that it needs.
> >
> > PeterZ mentioned the generic entry code, which does this for the entry
> > path. But it actually goes much deeper: just do a
> >
> > git grep preempt_disable arch/x86/kernel
> >
> > and then do the same for some other architectures.
> >
> > Looking at alpha, for example, there *are* hits for it, so at least
> > some of the code there clearly *tries* to do it. But does it cover all
> > the required parts? If it's never been tested, I'd be surprised if
> > it's all just ready to go.
> >
> > I do think we'd need to basically continue to support ARCH_NO_PREEMPT
> > - and such architectures might end up with the worst-case latencies of
> > only scheduling at return to user space.
>
> The only thing these architectures should gain is the preempt counter
> itself, [...]

And if any of these machines are still used, there's the small benefit of
preempt_count increasing debuggability of scheduling in supposedly
preempt-off sections that were ignored silently previously, as most of
these architectures do not even enable CONFIG_DEBUG_ATOMIC_SLEEP=y in their
defconfigs:

$ for ARCH in alpha hexagon m68k um; do git grep DEBUG_ATOMIC_SLEEP arch/$ARCH; done
$

Plus the effectiveness of CONFIG_DEBUG_ATOMIC_SLEEP=y is much reduced on
non-PREEMPT kernels to begin with: it will basically only detect scheduling
in hardirqs-off critical sections.
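
To illustrate the difference, here is a toy user-space model. It is not
kernel code: every toy_* name is hypothetical, and it only assumes that a
sleep check can see whatever state the build actually tracks, i.e. the
hardirqs-off flag always, and the preempt-off nesting depth only when a
preemption counter is maintained.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical and nothing here is kernel API. */
struct toy_cpu {
	bool has_preempt_count;	/* does this build maintain the counter? */
	int  preempt_count;	/* nesting depth of preempt-off sections */
	bool irqs_disabled;	/* hardirqs-off state, always observable */
};

static void toy_preempt_disable(struct toy_cpu *c)
{
	/* Without a counter this is a no-op: the section leaves no trace. */
	if (c->has_preempt_count)
		c->preempt_count++;
}

static void toy_preempt_enable(struct toy_cpu *c)
{
	if (c->has_preempt_count) {
		/* An unbalanced enable is a bug in the caller. */
		assert(c->preempt_count > 0);
		c->preempt_count--;
	}
}

/* Returns true if a sleep attempted right now would be flagged. */
static bool toy_sleep_is_flagged(const struct toy_cpu *c)
{
	if (c->irqs_disabled)
		return true;
	return c->has_preempt_count && c->preempt_count > 0;
}
```

In this model, with has_preempt_count == false, a sleep between
toy_preempt_disable() and toy_preempt_enable() is flagged only if
irqs_disabled is also set; with has_preempt_count == true it is flagged
either way.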

So IMHO there's a distinct debuggability & robustness plus in enabling the
preemption count on all architectures, even if they don't or cannot use the
rescheduling points.

> [...] but yes the extra preemption points are not mandatory to have, i.e.
> we simply do not enable them for the nostalgia club.
>
> The removal of cond_resched() might cause latencies, but then I doubt
> that these museum pieces are used for real work :)

I'm not sure we should initially remove *explicit* legacy cond_resched()
points, except from high-freq paths where they hurt - and of course remove
them from might_sleep().

Thanks,

Ingo