Re: [RFC PATCH 00/86] Make the kernel preemptible

From: Christoph Lameter
Date: Tue Nov 07 2023 - 23:52:45 EST


On Tue, 7 Nov 2023, Ankur Arora wrote:

This came up in an earlier discussion (See
https://lore.kernel.org/lkml/87cyyfxd4k.ffs@tglx/) and Thomas mentioned
that preempt_enable/_disable() overhead was relatively minimal.

Is your point that always-on preempt_count is far too expensive?

Yes. Over the years, distros have traditionally shipped their kernels without preemption by default because of these issues. If the overhead has been minimized, that may have changed. Even so, a lot of code is still being generated that has questionable benefit and just bloats the kernel.

These are needed to avoid adding preempt_enable/disable to a lot of primitives
that are used for synchronization. You cannot remove those without changing a
lot of synchronization primitives to always have to consider being preempted
while operating.

I'm afraid I don't understand why you would need to change any
synchronization primitives. The code that does preempt_enable/_disable()
is compiled out because CONFIG_PREEMPT_NONE/_VOLUNTARY don't define
CONFIG_PREEMPT_COUNT.
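
For readers following along, the compile-out behaviour described above can be sketched in userspace C roughly as follows. This is only an illustration, not the kernel's header: every sketch_* name and the SKETCH_PREEMPT_COUNT macro are made up for the example, and the empty asm statement (GCC/Clang syntax) stands in for a compiler barrier.

```c
/* Hypothetical stand-in for the Kconfig symbol. Uncomment to build the
 * counting variant instead of the compiled-out one. */
/* #define SKETCH_PREEMPT_COUNT 1 */

/* Nesting depth; only ever modified when SKETCH_PREEMPT_COUNT is defined. */
static int sketch_preempt_count;

#ifdef SKETCH_PREEMPT_COUNT
/* Counting variant: each disable/enable pair touches the counter. */
#define sketch_preempt_disable() do { sketch_preempt_count++; } while (0)
#define sketch_preempt_enable()  do { sketch_preempt_count--; } while (0)
#else
/* Compiled-out variant: nothing but a compiler barrier remains, so no
 * instructions are generated for the pair (GCC/Clang asm syntax). */
#define sketch_preempt_disable() __asm__ __volatile__("" ::: "memory")
#define sketch_preempt_enable()  __asm__ __volatile__("" ::: "memory")
#endif

/* Returns the nesting depth observed inside the protected section. */
static int sketch_critical_section(void)
{
    int depth;

    sketch_preempt_disable();
    depth = sketch_preempt_count;
    sketch_preempt_enable();
    return depth;
}
```

With the macro left undefined, sketch_critical_section() returns 0 and the counter is never touched. With it defined, the same call returns 1.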

In the trivial cases it is as simple as that. But look, for example, at the #ifdef CONFIG_PREEMPTION section in the slub allocator. Overhead is added there to allow the cpu to change under us. There are likely other examples in the source.
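
A rough userspace sketch of that kind of overhead, loosely modeled on the transaction id idea and not taken from the slub source: each per-cpu structure carries a tid that is snapshotted before the operation and rechecked at commit time, so a move to another cpu in between is detected and the operation retried. All sketch_* names, the migrate_to parameter, and the simulated "current cpu" variable are assumptions made for the example.

```c
#define SKETCH_NR_CPUS  4
/* Step by the (power-of-two) CPU count so that tids generated on
 * different CPUs can never be equal. */
#define SKETCH_TID_STEP SKETCH_NR_CPUS

struct sketch_cpu_slab {
    unsigned long tid;      /* starts at the cpu number */
    int free_objects;
};

static struct sketch_cpu_slab sketch_slabs[SKETCH_NR_CPUS];
static int sketch_current_cpu;  /* simulated "cpu we are running on" */

static void sketch_init(void)
{
    for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        sketch_slabs[cpu].tid = (unsigned long)cpu;
        sketch_slabs[cpu].free_objects = 8;
    }
    sketch_current_cpu = 0;
}

/* Take one object from the current cpu's slab. A migrate_to >= 0
 * simulates being preempted and moved to that cpu between the snapshot
 * and the commit of the first attempt. Returns the number of attempts. */
static int sketch_alloc(int migrate_to)
{
    int attempts = 0;

    for (;;) {
        attempts++;

        /* Snapshot the tid of the cpu we believe we are on. */
        unsigned long tid = sketch_slabs[sketch_current_cpu].tid;

        if (migrate_to >= 0) {
            sketch_current_cpu = migrate_to;   /* simulated migration */
            migrate_to = -1;
        }

        /* Commit against whichever cpu we are on now. A real fast path
         * would do this check and update as one atomic compare-exchange. */
        struct sketch_cpu_slab *now = &sketch_slabs[sketch_current_cpu];
        if (now->tid != tid)
            continue;                          /* cpu changed: retry */

        now->free_objects--;
        now->tid += SKETCH_TID_STEP;
        return attempts;
    }
}
```

After sketch_init(), sketch_alloc(-1) completes in one attempt, while sketch_alloc(2) needs two because the first attempt notices the tid mismatch. The snapshot, the recheck and the retry loop are the extra work that a kernel which never moves the task mid-operation would not need.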

And the whole business of local data access via per cpu areas suffers if we cannot rely on two accesses in the same section seeing consistent values.
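
To illustrate with a simulated example (all sketch_* names are hypothetical, and migration is faked by changing a variable): two back-to-back per cpu updates land on different cpus if the task is moved in between, unless something keeps it in place for the duration.

```c
#define SKETCH_NR_CPUS 2

struct sketch_stats {
    long packets;
    long bytes;
};

static struct sketch_stats sketch_pcpu[SKETCH_NR_CPUS];
static int sketch_cpu;        /* simulated current cpu */
static int sketch_pin_depth;  /* > 0 means "may not be moved" */

/* Simulated preemption point: move to cpu `to` unless pinned. */
static void sketch_maybe_migrate(int to)
{
    if (to >= 0 && sketch_pin_depth == 0)
        sketch_cpu = to;
}

/* Two per cpu accesses with nothing keeping us on one cpu. */
static void sketch_account_unpinned(long len, int migrate_to)
{
    sketch_pcpu[sketch_cpu].packets += 1;
    sketch_maybe_migrate(migrate_to);
    sketch_pcpu[sketch_cpu].bytes += len;   /* may hit another cpu */
}

/* Same accesses, bracketed by a stand-in for preempt_disable/enable. */
static void sketch_account_pinned(long len, int migrate_to)
{
    sketch_pin_depth++;
    sketch_pcpu[sketch_cpu].packets += 1;
    sketch_maybe_migrate(migrate_to);       /* no effect while pinned */
    sketch_pcpu[sketch_cpu].bytes += len;
    sketch_pin_depth--;
}
```

Calling sketch_account_unpinned(100, 1) from cpu 0 leaves the packet counted on cpu 0 and the bytes on cpu 1. The pinned variant keeps both on cpu 0, at the cost of the extra bookkeeping around the section.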

The intent here is to always have CONFIG_PREEMPT_COUNT=y.

Just for fun? Code is most efficient if it does not have to consider too many side conditions, like suddenly running on a different processor. This introduces needless complexity into the code. It would be better to remove PREEMPT_COUNT for good and just rely on voluntary preemption. We could probably reduce the complexity of the kernel source significantly.

I have never noticed a need for preemption at every instruction in the kernel (if that is possible at all; locks etc. frequently prevent that ideal scenario). Preemption like that is more of a pipe dream.

High performance kernel solutions usually disable overhead like that.