Re: [PATCH v5 3/6] zswap: make shrinking memcg-aware

From: Nhat Pham
Date: Mon Nov 06 2023 - 18:28:23 EST


On Mon, Nov 6, 2023 at 12:58 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> > >
> > > This lock is only needed to synchronize updating pool->next_shrink,
> > > right? Can we just use atomic operations instead? (e.g. cmpxchg()).
> >
> > I'm not entirely sure. I think in the pool destroy path we also have
> > to put the next_shrink memcg, so there's that.
>
> We can use xchg() to replace it with NULL, then put the memcg ref, no?
>
> We can also just hold zswap_pools_lock while shrinking the memcg
> perhaps? It's not a contended lock anyway. It just feels weird to add
> a spinlock to protect one pointer.

Ah, this sounds good to me, I guess. I'm not opposed to this
simplification of the concurrency scheme.
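
Something like this, perhaps? Rough sketch, untested - just spelling out
the xchg() idea for the pool destroy path (the helper name is made up;
the fields are from this patch):

	/* Atomically detach the shrink cursor, then drop its memcg ref. */
	static void zswap_pool_put_next_shrink(struct zswap_pool *pool)
	{
		struct mem_cgroup *memcg = xchg(&pool->next_shrink, NULL);

		if (memcg)
			mem_cgroup_put(memcg);
	}

The offlining path could then similarly cmpxchg() the cursor forward only
if it still points at the memcg being offlined, so no one puts the same
reference twice.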

>
> >
> > >
> > > > +		if (pool->next_shrink == memcg)
> > > > +			pool->next_shrink =
> > > > +				mem_cgroup_iter(NULL, pool->next_shrink, NULL, true);
> > > > +		spin_unlock(&pool->next_shrink_lock);
> > > > +	}
> > > > +	spin_unlock(&zswap_pools_lock);
> > > > +}
> > > > +
> > > > /*********************************
> > > > * zswap entry functions
> > > > **********************************/
> > > > static struct kmem_cache *zswap_entry_cache;
> > > >
> > > > -static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp)
> > > > +static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp, int nid)
> > > > {
> > > > 	struct zswap_entry *entry;
> > > > -	entry = kmem_cache_alloc(zswap_entry_cache, gfp);
> > > > +	entry = kmem_cache_alloc_node(zswap_entry_cache, gfp, nid);
> > > > 	if (!entry)
> > > > 		return NULL;
> > > > 	entry->refcount = 1;
> > > [..]
> > > > @@ -1233,15 +1369,15 @@ bool zswap_store(struct folio *folio)
> > > > 		zswap_invalidate_entry(tree, dupentry);
> > > > 	}
> > > > 	spin_unlock(&tree->lock);
> > > > -
> > > > -	/*
> > > > -	 * XXX: zswap reclaim does not work with cgroups yet. Without a
> > > > -	 * cgroup-aware entry LRU, we will push out entries system-wide based on
> > > > -	 * local cgroup limits.
> > > > -	 */
> > > > 	objcg = get_obj_cgroup_from_folio(folio);
> > > > -	if (objcg && !obj_cgroup_may_zswap(objcg))
> > > > -		goto reject;
> > > > +	if (objcg && !obj_cgroup_may_zswap(objcg)) {
> > > > +		memcg = get_mem_cgroup_from_objcg(objcg);
> > > > +		if (shrink_memcg(memcg)) {
> > > > +			mem_cgroup_put(memcg);
> > > > +			goto reject;
> > > > +		}
> > > > +		mem_cgroup_put(memcg);
> > >
> > > Can we just use RCU here as well? (same around the
> > > memcg_list_lru_alloc() call below).
> >
> > For memcg_list_lru_alloc(): there's potentially sleeping in that piece
> > of code, I believe? At the very least, I think we'll have to use this
> > gfp_t mask for it to be rcu-safe:
> >
> > GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN
> > (not sure about the exact flags, though).
> >
> > Same goes for this particular place, IIRC - there's some sleeping done
> > in zswap_writeback_entry(), correct?
>
> Ah right, I missed this. My bad.
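
No worries! For reference, the constraint in a nutshell - a hypothetical
snippet, not from the patch, just illustrating why an RCU read-side
section can't cover these paths:

	rcu_read_lock();
	/*
	 * BAD: shrink_memcg() ends up in zswap_writeback_entry(), which
	 * can block (page allocation, I/O), and memcg_list_lru_alloc()
	 * may allocate with a sleepable gfp mask. Sleeping is not allowed
	 * inside an RCU read-side critical section.
	 */
	shrink_memcg(memcg);
	rcu_read_unlock();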