Re: [PATCH v3] mm: zswap: multiple zpools support

From: Yosry Ahmed
Date: Tue Aug 15 2023 - 18:31:25 EST


On Tue, Aug 15, 2023 at 3:22 PM Chris Li <chrisl@xxxxxxxxxx> wrote:
>
> Hi Yosry,
>
> On Fri, Aug 11, 2023 at 4:21 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> >
> > On Fri, Aug 11, 2023 at 2:19 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Thu, 13 Jul 2023 03:35:25 -0700 Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> > >
> > > > >
> > > > > I'm experimenting with some other zswap changes - if I have
> > > > > extra cycles and resources I'll try to apply this patch and see how the
> > > > > numbers play out.
> > > >
> > > > That would be amazing. Looking forward to any numbers you can dig up :)
> > >
> > > So this patch seems stuck. I can keep it in mm.git until the fog
> > > clears, but would prefer not to. Can we please revisit and decide on a
> > > way forward?
> >
> > Johannes did not like a config option so I proposed it here as a
> > constant (like SWAP_CLUSTER_MAX and others we have). This is a value
> > that we have been using in our data centers for almost a decade, so it
>
> I dug up the previous V1 discussion and this V3 discussion thread.
> It seems obvious that having multiple pools has a lock contention advantage.
> The numbers do not lie.
>
> However, the number of pools is hard to decide at compile time.
>
> Regarding the per-CPU pool: that might work well for a small number of CPUs,
> but when the system has many CPUs, e.g. a few hundred, it means having
> hundreds of pools, which is a bad idea.
>
> How about just making it a runtime value (size/bits), where the pool
> (size/bits) can only be changed when zswap does not have any active stores?

I was hoping we could add the basic support for multiple zpools here,
and then later, if needed, extend it to support runtime dynamic tuning.
Adding that now would introduce more complexity, as we would need to
lock all trees, make sure there is no activity, and alloc/free zpools.
If a limitation of the compile-time constant is observed, we can do
that then; otherwise, let's keep it simple and incremental for now.
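
For context, the compile-time approach keeps the core change small.
Here is a minimal sketch (names modeled on the posted patch; details
may differ), where the zpool for an entry is picked by hashing the
entry pointer so that concurrent operations spread across zpools:

#include <linux/hash.h>   /* hash_ptr() */
#include <linux/log2.h>   /* ilog2() */
#include <linux/zpool.h>

#define ZSWAP_NR_ZPOOLS 32	/* compile-time constant, power of 2 */

struct zswap_pool {
	struct zpool *zpools[ZSWAP_NR_ZPOOLS];
	/* ... compressor, refcount, etc. ... */
};

struct zswap_entry {
	struct zswap_pool *pool;
	/* ... handle, length, etc. ... */
};

/*
 * Hash the entry pointer to pick one of the zpools, spreading
 * concurrent stores/loads (and their lock acquisitions) across
 * ZSWAP_NR_ZPOOLS instead of serializing on a single zpool.
 */
static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
{
	int i = 0;

	if (ZSWAP_NR_ZPOOLS > 1)
		i = hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS));

	return entry->pool->zpools[i];
}

With ZSWAP_NR_ZPOOLS == 1 the hash is compiled out and this reduces to
the current single-zpool behavior, which is part of why a constant is
simpler than a runtime knob.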

FWIW, we have been running with 32 zpools in Google's fleet for ~a
decade now and it seems to work well for various workloads and machine
configurations.

>
> Chris
>
> > has seen a ton of testing. I was hoping Johannes would get time to
> > take a look, or Nhat would get time to test it out, but neither of
> > these things happened.
> >
> > I obviously want it to be merged, but hopefully someone will chime in here :)
> >