Re: [PATCH] pstore: Base compression input buffer size on estimated compressed size

From: Eric Biggers
Date: Thu Aug 31 2023 - 01:20:19 EST


On Wed, Aug 30, 2023 at 04:43:37PM -0700, Kees Cook wrote:
> On Wed, Aug 30, 2023 at 11:22:38PM +0200, Ard Biesheuvel wrote:
> > Commit 1756ddea6916 ("pstore: Remove worst-case compression size logic")
> > removed some clunky per-algorithm worst case size estimation routines on
> > the basis that we can always store pstore records uncompressed, and
> > these worst case estimations are about how much the size might
> > inadvertently *increase* due to encapsulation overhead when the input
> > cannot be compressed at all. So if compression results in a size
> > increase, we just store the original data instead.
>
> Does the Z_FINISH vs Z_SYNC_FLUSH thing need to be fixed as well, or
> does that become a non-issue with this change?

I haven't seen any real evidence that that issue actually exists.

> > void pstore_record_init(struct pstore_record *record,
> > @@ -305,7 +314,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
> > record.buf = psinfo->buf;
> >
> > dst = big_oops_buf ?: psinfo->buf;
> > - dst_size = psinfo->bufsize;
> > + dst_size = max_uncompressed_size ?: psinfo->bufsize;
> >
> > /* Write dump header. */
> > header_size = snprintf(dst, dst_size, "%s#%d Part%u\n", why,
> > @@ -326,8 +335,15 @@ static void pstore_dump(struct kmsg_dumper *dumper,
> > record.compressed = true;
> > record.size = zipped_len;
> > } else {
> > - record.size = header_size + dump_size;
> > - memcpy(psinfo->buf, dst, record.size);
> > + /*
> > + * Compression failed, so the buffer is most
> > + * likely filled with binary data that does not
> > + * compress as well as ASCII text. Copy as much
> > + * of the uncompressed data as possible into
> > + * the pstore record, and discard the rest.
> > + */
> > + record.size = psinfo->bufsize;
> > + memcpy(psinfo->buf, dst, psinfo->bufsize);
>
> I don't think this is "friendly" enough. :P
>
> In the compression failure case, we've got a larger dst_size (and
> dump_size, but technically it might not be true if something else went
> wrong) than psinfo->bufsize, so we want to take the trailing bytes
> (i.e. panic details are more likely at the end). And we should keep
> the header, which is already present in "dst". I think we need to do
> something like this:
>
> size_t buf_size_available = psinfo->bufsize - header_size;
> size_t dump_size_wanted = min(dump_size, buf_size_available);
>
> record.size = header_size + dump_size_wanted;
> memcpy(psinfo->buf, dst, header_size);
> memcpy(psinfo->buf + header_size,
> dst + header_size + (dump_size - dump_size_wanted),
> dump_size_wanted);
>
> My eyes, my eyes.
>

How hard would it be to write two uncompressed records when compression fails to
achieve the targeted 50% ratio?

- Eric