Re: [RFC PATCH] kernel: allow to configure PREEMPT_NONE, PREEMPT_VOLUNTARY on kernel command line

From: Michal Hocko
Date: Fri Oct 09 2020 - 06:14:10 EST


On Fri 09-10-20 11:47:41, Peter Zijlstra wrote:
> On Wed, Oct 07, 2020 at 02:35:53PM +0200, Michal Hocko wrote:
> > On Wed 07-10-20 14:21:44, Peter Zijlstra wrote:
> > > On Wed, Oct 07, 2020 at 02:04:01PM +0200, Michal Hocko wrote:
> > > > I wanted to make sure that the idea is sound for maintainers first. The
> > > > next step would be extending the command line to support full preemption
> > > > as well but there is much more work in that area. Frederic has promised
> > > > to look into that.
> > >
> > > The sanest way there is to static_call() __preempt_schedule() I think.
> >
> > Yes, I have checked the code and identified a few other places like
> > irqentry_exit_cond_resched. We also need unconditional
> > CONFIG_PREEMPT_COUNT IIUC and there are quite some places guarded by
> > CONFIG_PREEMPTION that would need to be examined. Some of them are
> > likely pretending to be more clever than they really are/should be -
> > e.g. mm/slub.c. So there is likely a lot of leg work.
>
> The easiest way might be to introduce PREEMPT_DYNAMIC that
> depends/selects PREEMPT. That way you're basically running a PREEMPT=y
> kernel.
>
> Then have PREEMPT_DYNAMIC allow disabling the __preempt_schedule /
> preempt_schedule_irq() callsites using static_call/static_branch
> respectively.
>
> That is, work backwards (from PREEMPT back to VOLUNTARY) instead of the
> other way around.

My original idea was that the config would only define the default
preemption mode. The preempt_none parameter would then just act as an
override. That would mean that CONFIG_PREEMPTION would effectively be
gone from the kernel, because code outside of the scheduler shouldn't
really care about the preemption mode. I suspect this would prevent
dubious hacks and result in more robust code in the end.
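
To make the idea concrete, here is a rough userspace sketch of "config
picks the default, command line overrides it". It is not kernel code.
Every name in it (DEFAULT_PREEMPT_MODE, preempt_mode_setup,
voluntary_resched_enabled, the "preempt_voluntary" token) is made up for
illustration, and the exact parameter syntax is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; this only models the idea. */
enum preempt_mode {
	PREEMPT_MODE_NONE,
	PREEMPT_MODE_VOLUNTARY,
};

/* Stand-in for the build-time choice: the config only sets the default. */
#ifndef DEFAULT_PREEMPT_MODE
#define DEFAULT_PREEMPT_MODE PREEMPT_MODE_VOLUNTARY
#endif

static enum preempt_mode preempt_mode = DEFAULT_PREEMPT_MODE;

/*
 * Stand-in for a command line handler: a recognized token overrides the
 * default. Returns 1 if the token was handled, 0 otherwise.
 */
static int preempt_mode_setup(const char *arg)
{
	if (strcmp(arg, "preempt_none") == 0) {
		preempt_mode = PREEMPT_MODE_NONE;
		return 1;
	}
	if (strcmp(arg, "preempt_voluntary") == 0) {
		preempt_mode = PREEMPT_MODE_VOLUNTARY;
		return 1;
	}
	return 0; /* unknown token: keep the current mode */
}

/*
 * The only question the scheduler side would ask. Code outside the
 * scheduler never looks at preempt_mode directly.
 */
static int voluntary_resched_enabled(void)
{
	assert(preempt_mode == PREEMPT_MODE_NONE ||
	       preempt_mode == PREEMPT_MODE_VOLUNTARY);
	return preempt_mode == PREEMPT_MODE_VOLUNTARY;
}
```

The point of the sketch is that the mode is one runtime variable with a
single accessor, so there is no compile-time #ifdef for other subsystems
to key off.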

Does that sound reasonable?

--
Michal Hocko
SUSE Labs