Re: [PATCH 1/5] mm/zswap: reuse dstmem when decompress

From: Nhat Pham
Date: Thu Dec 14 2023 - 15:33:53 EST


On Thu, Dec 14, 2023 at 9:59 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> On Tue, Dec 12, 2023 at 8:18 PM Chengming Zhou
> <zhouchengming@xxxxxxxxxxxxx> wrote:
> >
> > In the !zpool_can_sleep_mapped() case such as zsmalloc, we need to first
> > copy the entry->handle memory to a temporary memory, which is allocated
> > using kmalloc.
> >
> > Obviously we can reuse the per-compressor dstmem to avoid allocating
> > every time, since it's percpu-compressor and protected in mutex.
>
> You are trading more memory for more speed.
> Per-CPU data structures do not come for free. They are expensive in
> terms of memory on a big server with a lot of CPUs - think more than a
> few hundred CPUs. On such big servers, we might want to disable this
> optimization to save a few MB of RAM, depending on how much this
> optimization gains us.
> Do we have any benchmark suggesting how much CPU overhead or latency
> this per-CPU page buys us, compared to using kmalloc?

I think Chengming is reusing an existing per-CPU buffer here rather
than adding a new one. IIUC, it was previously only used for
compression (zswap_store); with this patch, Chengming leverages it for
decompression (load and writeback) too. That sounds fine to me tbh,
because both directions have to hold the mutex anyway, so the buffer
is already exclusively held - might as well use it.

We do end up doing a bit more work inside the mutex-protected section
(the memcpy and the handle (un)mapping) - but that still seems fine to me.

>
> Chris
>
> >
> > Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> > Reviewed-by: Nhat Pham <nphamcs@xxxxxxxxx>
> > ---
> >  mm/zswap.c | 29 +++++++++--------------------
> >  1 file changed, 9 insertions(+), 20 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index 7ee54a3d8281..edb8b45ed5a1 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -1772,9 +1772,9 @@ bool zswap_load(struct folio *folio)
> >  	struct zswap_entry *entry;
> >  	struct scatterlist input, output;
> >  	struct crypto_acomp_ctx *acomp_ctx;
> > -	u8 *src, *dst, *tmp;
> > +	unsigned int dlen = PAGE_SIZE;
> > +	u8 *src, *dst;
> >  	struct zpool *zpool;
> > -	unsigned int dlen;
> >  	bool ret;
> >
> >  	VM_WARN_ON_ONCE(!folio_test_locked(folio));
> > @@ -1796,27 +1796,18 @@ bool zswap_load(struct folio *folio)
> >  		goto stats;
> >  	}
> >
> > -	zpool = zswap_find_zpool(entry);
> > -	if (!zpool_can_sleep_mapped(zpool)) {
> > -		tmp = kmalloc(entry->length, GFP_KERNEL);
> > -		if (!tmp) {
> > -			ret = false;
> > -			goto freeentry;
> > -		}
> > -	}
> > -
> >  	/* decompress */
> > -	dlen = PAGE_SIZE;
> > -	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> > +	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
> > +	mutex_lock(acomp_ctx->mutex);
> >
> > +	zpool = zswap_find_zpool(entry);
> > +	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> >  	if (!zpool_can_sleep_mapped(zpool)) {
> > -		memcpy(tmp, src, entry->length);
> > -		src = tmp;
> > +		memcpy(acomp_ctx->dstmem, src, entry->length);
> > +		src = acomp_ctx->dstmem;
> >  		zpool_unmap_handle(zpool, entry->handle);
> >  	}
> >
> > -	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
> > -	mutex_lock(acomp_ctx->mutex);
> >  	sg_init_one(&input, src, entry->length);
> >  	sg_init_table(&output, 1);
> >  	sg_set_page(&output, page, PAGE_SIZE, 0);
> > @@ -1827,15 +1818,13 @@ bool zswap_load(struct folio *folio)
> >
> >  	if (zpool_can_sleep_mapped(zpool))
> >  		zpool_unmap_handle(zpool, entry->handle);
> > -	else
> > -		kfree(tmp);
> >
> >  	ret = true;
> >  stats:
> >  	count_vm_event(ZSWPIN);
> >  	if (entry->objcg)
> >  		count_objcg_event(entry->objcg, ZSWPIN);
> > -freeentry:
> > +
> >  	spin_lock(&tree->lock);
> >  	if (ret && zswap_exclusive_loads_enabled) {
> >  		zswap_invalidate_entry(tree, entry);
> >
> > --
> > b4 0.10.1