Re: [PATCH] zram: add zstd to the supported algorithms list

From: Nick Terrell
Date: Thu Aug 24 2017 - 22:46:17 EST


On 8/24/17, 7:21 PM, "Sergey Senozhatsky" <sergey.senozhatsky.work@xxxxxxxxx> wrote:
> not really familiar either... I was thinking about having "zstd" and
> "zstd_dict" crypto_alg structs - one would be !dict, the other one would
> allocate dict and pass it to compress/decompress zstd callbacks. "zstd"
> vecrsion would invoke zstd_params() passing zeros as compress and dict
> sizes to ZSTD_getParams(), while "zstd_dict" would invoke, lets say,
> zstd_params_dict() passing PAGE_SIZE-s. hm... (0, PAGE_SIZE)? to
> ZSTD_getParams(). just a rough idea...

The way zstd dictionaries work is the user provides some data which gets
"prepended" to the data that is about to be compressed, without actually
writing it to output. That way zstd can find matches in the dictionary and
represent them for "free". That means the user has to pass the same data to
both the compressor and decompressor.

We could build a dictionary, say every 20 minutes, by sampling 512 B chunks
of the RAM and constructing a 16 KB dictionary. Then recompress all the
compressed RAM with the new dictionary. This is just a simple example of a
dictionary construction algorithm. You could imagine grouping pages by
application, and building a dictionary per application, since those pages
would likely be more similar.

Regarding the crypto API, I think it would be possible to experiment by
creating functions like
`zstd_comp_add_dictionary(void *ctx, void *data, size_t size)'
and `zstd_decomp_add_dictionary(void *ctx, void *data, size_t size)'
in the crypto zstd implementation and declare them in `zcomp.c'. If the
experiments prove that using zstd dictionaries (or LZ4 dictionaries) is
worthwhile, then we can figure out how we can make it work for real.