Re: [patch] v2 mm/slub: restore/expand unfreeze_partials() local exclusion scope

From: Vlastimil Babka
Date: Sun Jul 18 2021 - 17:19:53 EST


On 7/17/21 4:58 PM, Mike Galbraith wrote:
> On Thu, 2021-07-15 at 18:34 +0200, Mike Galbraith wrote:
>> Greetings crickets,
>>
>> Methinks the problem is the hole these patches opened only for RT.
>>
>> static void put_cpu_partial(struct kmem_cache *s, struct page *page,
>> int drain)
>> {
>> #ifdef CONFIG_SLUB_CPU_PARTIAL
>> struct page *oldpage;
>> int pages;
>> int pobjects;
>>
>> slub_get_cpu_ptr(s->cpu_slab);
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Bah, I'm tired of waiting to see what if anything mm folks do about
> this little bugger, so I'm gonna step on it my damn self and be done
> with it. Fly or die little patchlet.
>
> mm/slub: restore/expand unfreeze_partials() local exclusion scope
>
> 2180da7ea70a ("mm, slub: use migrate_disable() on PREEMPT_RT") replaced
> preempt_disable() in put_cpu_partial() with migrate_disable(), which when
> combined with ___slab_alloc() having become preemptible, leads to
> kmem_cache_free()/kfree() blowing through ___slab_alloc() unimpeded,
> and vice versa, resulting in PREEMPT_RT exclusive explosions in both
> paths while stress testing with both SLUB_CPU_PARTIAL/MEMCG enabled,
> ___slab_alloc() during allocation (duh), and __unfreeze_partials()
> during free, both while accessing an unmapped page->freelist.
>
> Serialize put_cpu_partial()/unfreeze_partials() on cpu_slab->lock to

Hm, you mention put_cpu_partial(), but your patch handles only the
unfreeze_partials() call from that function? If I understand the problem
correctly, all modifications of cpu_slab->partial have to be protected
on RT after the local_lock conversion, thus also the one that
put_cpu_partial() does by itself (via this_cpu_cmpxchg).

On the other hand the slub_cpu_dead() part should really be unnecessary,
as tglx pointed out.

How about the patch below? It also handles the recursion issue
differently, by not locking around __unfreeze_partials().
If that works, I can think of making it less ugly :/
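
To make the intent of the RT path easier to follow, here is a minimal
userspace sketch of the same pattern: every update of the list head happens
under one lock, and the expensive flush runs only after the list has been
detached and the lock dropped. This is an illustration under assumptions,
not kernel code. The names partial_list, put_partial() and flush_list() are
hypothetical, a pthread mutex stands in for cpu_slab->lock, and flush_list()
retakes the lock only to model a path that would recurse or deadlock if it
were called with the lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in mm/slub.c. */
struct node {
	struct node *next;
};

struct partial_list {
	pthread_mutex_t lock;	/* plays the role of cpu_slab->lock */
	struct node *head;	/* plays the role of cpu_slab->partial */
	int count;		/* nodes currently on the list */
	int limit;		/* flush once this many nodes are queued */
	int flushed;		/* total nodes handed to flush_list() */
};

/*
 * Models __unfreeze_partials(): works on a list that was already detached.
 * It takes pl->lock itself, so the caller must NOT hold the lock here.
 */
static void flush_list(struct partial_list *pl, struct node *list)
{
	int n = 0;

	while (list) {
		struct node *next = list->next;

		list->next = NULL;
		list = next;
		n++;
	}

	pthread_mutex_lock(&pl->lock);
	pl->flushed += n;
	pthread_mutex_unlock(&pl->lock);
}

/*
 * Models the RT branch of put_cpu_partial(): the head is only ever read or
 * written under the lock. A full list is detached under the lock, flushed
 * with the lock dropped, and the condition is rechecked after relocking
 * because another thread may have refilled the list in the meantime.
 */
static void put_partial(struct partial_list *pl, struct node *item)
{
	assert(item != NULL);

	pthread_mutex_lock(&pl->lock);
	while (pl->head && pl->count >= pl->limit) {
		struct node *old = pl->head;

		pl->head = NULL;
		pl->count = 0;
		pthread_mutex_unlock(&pl->lock);
		flush_list(pl, old);
		pthread_mutex_lock(&pl->lock);
	}
	item->next = pl->head;
	pl->head = item;
	pl->count++;
	pthread_mutex_unlock(&pl->lock);
}
```

With limit set to 2, pushing five nodes through put_partial() flushes twice
and leaves only the last node queued. The sketch says nothing about
preemption or migration, which is what local_lock actually has to deal with
on RT; it only shows the detach, unlock, flush, relock ordering.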

----8<----
diff --git a/mm/slub.c b/mm/slub.c
index 581004a5aca9..1c7a41460941 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2437,6 +2437,9 @@ static void unfreeze_partials(struct kmem_cache *s)
{
struct page *partial_page;

+#ifdef CONFIG_PREEMPT_RT
+ local_lock(&s->cpu_slab->lock);
+#endif
do {
partial_page = this_cpu_read(s->cpu_slab->partial);

@@ -2444,6 +2447,9 @@ static void unfreeze_partials(struct kmem_cache *s)
this_cpu_cmpxchg(s->cpu_slab->partial, partial_page, NULL)
!= partial_page);

+#ifdef CONFIG_PREEMPT_RT
+ local_unlock(&s->cpu_slab->lock);
+#endif
if (partial_page)
__unfreeze_partials(s, partial_page);
}
@@ -2482,7 +2488,11 @@ static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
int pages;
int pobjects;

- slub_get_cpu_ptr(s->cpu_slab);
+#ifndef CONFIG_PREEMPT_RT
+ get_cpu_ptr(s->cpu_slab);
+#else
+ local_lock(&s->cpu_slab->lock);
+#endif
do {
pages = 0;
pobjects = 0;
@@ -2496,7 +2506,15 @@ static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)
* partial array is full. Move the existing
* set to the per node partial list.
*/
+#ifndef CONFIG_PREEMPT_RT
unfreeze_partials(s);
+#else
+ this_cpu_write(s->cpu_slab->partial, NULL);
+ local_unlock(&s->cpu_slab->lock);
+ __unfreeze_partials(s, oldpage);
+ local_lock(&s->cpu_slab->lock);
+#endif
+
oldpage = NULL;
pobjects = 0;
pages = 0;
@@ -2513,7 +2531,11 @@ static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain)

} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page)
!= oldpage);
- slub_put_cpu_ptr(s->cpu_slab);
+#ifndef CONFIG_PREEMPT_RT
+ put_cpu_ptr(s->cpu_slab);
+#else
+ local_unlock(&s->cpu_slab->lock);
+#endif
#endif /* CONFIG_SLUB_CPU_PARTIAL */
}