Re: [PATCH] lib/zstd: use div_u64() to let it build on 32-bit

From: Nick Terrell
Date: Wed Jun 28 2017 - 01:29:57 EST


> Please don't top post.

Sorry about that.

> Which function needs 1KB of stack space? That's quite a lot.

FSE_buildCTable_wksp(), FSE_compress_wksp(), and HUF_readDTableX4()
required over 1 KB of stack space.

> I can see in [1] that there are some on-stack buffers replaced by
> pointers to the workspace. That's good, but I would like to know if
> there's any hidden gem that grags the precious stack space.

I've been hunting down functions that use up the most stack trace and
replacing buffers with pointers to the workspace. I compiled the code
with -Wframe-larger-than=512 and reduced the stack usage of all offending
functions. In the next version of the patch, no function uses more than
400 B of stack space. We'll be porting the changes back upstream as well.

> Hm, I'd suggest to create a version optimized for kernel, eg. expecting
> that 4+ GB buffer will never be used and you can use the most fittin in
> type. This should affect only the function signatures, not the
> algorithm implementation, so porting future zstd changes should be
> straightforward.

If the functions were exposed, then I would agree 100%. However, since
these are internal functions, and the rest of zstd uses size_t to represent
buffer sizes, I think it would be awkward to change just FSE/HUF functions.
I also prefer size_t because it is friendlier to the optimizer, especially
the loop optimizer, since the compiler doesn't have to worry about unsigned
overflow.

On a related note, zstd performs automatic optimizations to improve
compression speed and reduce memory usage when given small sources, which
is the common case in the kernel.