Re: [PATCHv4 6/9] zsmalloc: pass limit on pages per-zspage to zs_create_pool()

From: Sergey Senozhatsky
Date: Fri Nov 11 2022 - 05:32:58 EST


On (22/11/10 18:10), Minchan Kim wrote:
> On Mon, Oct 31, 2022 at 02:41:05PM +0900, Sergey Senozhatsky wrote:
> > Allow zsmalloc pool owner to specify max number of pages
> > per-zspage (during pool creation), so that different pools
> > can have different characteristics.
> >
> > By default we pass ZS_DEFAULT_PAGES_PER_ZSPAGE which is 4
> > (matches the current order 2 zspages limit).
>
> How could user decide what's the best size for their workload?

[..]

For starters in a similar manner that I showed during our meeting.
They can run tests, gather stats (zsmalloc objects distribution),
analyze where most of the objects sit, how things change when we
have different cluster configurations, and so on.

But more importantly: they need lots of zramX mm_stat data, which is
perfectly traceable and collectable during fleet A/B testing: when a
number of devices get randomly assigned to different experiments and
receive different zspage len configuration, which they feed to zram
sysfs knobs during system startup (when init script configures zram).
And then look at statistically significant improvements or regressions.

This is how things done in ChromeOS and I'm sure in many other places.

In this regard, finding best zspage len value is not any different from
finding what is the best zram disksize, or what is the best compression
algorithm. Exactly same approach - feed different configuration to devices
and then analyze the data. Look at mm_stat-s before and after experiment,
per device class/type.

We can discuss in more details internally.

> > static void zs_zpool_destroy(void *pool)
> > @@ -2195,6 +2195,7 @@ static int zs_register_shrinker(struct zs_pool *pool)
> > /**
> > * zs_create_pool - Creates an allocation pool to work from.
> > * @name: pool name to be created
> > + * @num_pages: maximum number of pages per-zspage
>
> How about "max_page_chain:"?

OK.

Do you dislike idea of creating a `struct zs_tunables` which will hold
all fields that we can tune? And then zsmalloc users can pass that
struct (a pointer to) to zs_create_pool().

There can be various tunables. Like policy changes: do we use static
zspool configuration, or a dynamic one and so on.

On zram side, we can have a generic sysfs knob: allocator_tuning,
which will accept named params, the same way we did it for
recomp_algorithm and recompress.

echo "tuneable=VAL tunealbe=VAL" > /sys/block/zramX/allocator_tuning