Re: [RFC v2 33/34] mm, slub: use migrate_disable() on PREEMPT_RT

From: Sebastian Andrzej Siewior
Date: Mon Jun 14 2021 - 10:01:43 EST


On 2021-06-14 13:33:43 [+0200], Vlastimil Babka wrote:
> On 6/14/21 1:16 PM, Sebastian Andrzej Siewior wrote:
> > I haven't looked at the series and I have just this tiny question: why
> > did migrate_disable() crash for Mel on !RT and why do you expect that it
> > does not happen on PREEMPT_RT?
>
> Right, so it's because __slab_alloc() has this optimization to avoid re-reading
> 'c' in case there is no preemption enabled at all (or it's just voluntary).
>
> #ifdef CONFIG_PREEMPTION
> /*
> * We may have been preempted and rescheduled on a different
> * cpu before disabling preemption. Need to reload cpu area
> * pointer.
> */
> c = slub_get_cpu_ptr(s->cpu_slab);
> #endif
>
> Mel's config has CONFIG_PREEMPT_VOLUNTARY, which means CONFIG_PREEMPTION is not
> enabled.
>
> But then later in ___slab_alloc() we have
>
> slub_put_cpu_ptr(s->cpu_slab);
> page = new_slab(s, gfpflags, node);
> c = slub_get_cpu_ptr(s->cpu_slab);
>
> And this is not hidden under CONFIG_PREEMPTION, so with the #ifdef bug the
> slub_put_cpu_ptr did a migrate_enable() with Mel's config, without prior
> migrate_disable().

Ach, right. The update to this field is done with cmpxchg-double (if I
remember correctly) but I don't remember if this is also re-entry safe.

> If there wasn't the #ifdef PREEMPT_RT bug:
> - this slub_put_cpu_ptr() would translate to put_cpu_ptr() thus
> preempt_enable(), which on this config is just a barrier(), so it doesn't matter
> that there was no matching preempt_disable() before.
> - with PREEMPT_RT the CONFIG_PREEMPTION would be enabled, so the
> slub_get_cpu_ptr() would do a migrate_disable() and there's no imbalance.
>
> But now that I dig into this in detail, I can see there might be another
> instance of this imbalance bug, if CONFIG_PREEMPTION is disabled, but
> CONFIG_PREEMPT_COUNT is enabled, which seems to be possible in some debug
> scenarios. Because then preempt_disable()/preempt_enable() still manipulate the
> preempt counter and compiling them out in __slab_alloc() will cause imbalance.
>
> So I think the guards in __slab_alloc() should be using CONFIG_PREEMPT_COUNT
> instead of CONFIG_PREEMPT to be correct on all configs. I dare not remove them
> completely :)

:)

Sebastian