Re: [PATCH v6] zswap: memcontrol: implement zswap writeback disabling

From: Yosry Ahmed
Date: Mon Dec 18 2023 - 16:54:59 EST


On Mon, Dec 18, 2023 at 11:21 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>
> On Mon, Dec 18, 2023 at 6:44 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> >
> > On Fri, Dec 15, 2023 at 01:21:57PM -0800, Yosry Ahmed wrote:
> > > On Thu, Dec 7, 2023 at 11:24 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
> > > >
> > > > During our experiment with zswap, we sometimes observe swap IOs due to
> > > > occasional zswap store failures and writebacks-to-swap. These swapping
> > > > IOs prevent many users who cannot tolerate swapping from adopting zswap
> > > > to save memory and improve performance where possible.
> > > >
> > > > This patch adds the option to disable this behavior entirely: do not
> > > > writeback to backing swapping device when a zswap store attempt fail,
> > > > and do not write pages in the zswap pool back to the backing swap
> > > > device (both when the pool is full, and when the new zswap shrinker is
> > > > called).
> > > >
> > > > This new behavior can be opted-in/out on a per-cgroup basis via a new
> > > > cgroup file. By default, writebacks to swap device is enabled, which is
> > > > the previous behavior. Initially, writeback is enabled for the root
> > > > cgroup, and a newly created cgroup will inherit the current setting of
> > > > its parent.
> > > >
> > > > Note that this is subtly different from setting memory.swap.max to 0, as
> > > > it still allows for pages to be stored in the zswap pool (which itself
> > > > consumes swap space in its current form).
> > > >
> > > > This patch should be applied on top of the zswap shrinker series:
> > > >
> > > > https://lore.kernel.org/linux-mm/20231130194023.4102148-1-nphamcs@xxxxxxxxx/
> > > >
> > > > as it also disables the zswap shrinker, a major source of zswap
> > > > writebacks.
> > > >
> > > > Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > > > Signed-off-by: Nhat Pham <nphamcs@xxxxxxxxx>
> > > > Reviewed-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> > >
> > > Taking a step back from all the memory.swap.tiers vs.
> > > memory.zswap.writeback discussions, I think there may be a more
> > > fundamental problem here. If the zswap store failure is recurrent,
> > > pages can keep going back to the LRUs and then sent back to zswap
> > > eventually, only to be rejected again. For example, this can if zswap
> > > is above the acceptance threshold, but could be even worse if it's the
> > > allocator rejecting the page due to not compressing well enough. In
> > > the latter case, the page can keep going back and forth between zswap
> > > and LRUs indefinitely.
> > >
> > > You probably did not run into this as you're using zsmalloc, but it
>
> Which is why I recommend everyone to use zsmalloc, and change the
> default allocator to it in Kconfig :)
>

Internally, we have a cap on the compression ratio, after which we
reject pages because it doesn't make sense to store them (e.g.
zsmalloc will store them in a full page anyway, or the compressed size
+ metadata isn't worth it). I think this is where we should head
upstream as well, you proposed something in the right direction with
storing uncompressed pages in zswap. IOW, I think such pages should be
taken out of the LRUs one way or another.