Re: [PATCH RFC] mm/madvise: implement MADV_STOCKPILE (kswapd from user space)

From: Konstantin Khlebnikov
Date: Tue May 28 2019 - 05:02:13 EST


On 28.05.2019 11:42, Michal Hocko wrote:
On Tue 28-05-19 11:04:46, Konstantin Khlebnikov wrote:
On 28.05.2019 10:38, Michal Hocko wrote:
[...]
Could you define the exact semantic? Ideally something for the manual
page please?


Like kswapd, which works with thresholds of free memory, this one reclaims
until 'free' (i.e. memory which could be allocated without invoking
direct reclaim of any kind) is lower than the passed 'size' argument.

s@lower@higher@ I guess

Yep. My wording is still bad.
'size' argument should be called 'watermark' or 'threshold'.

I.e. reclaim while 'free' memory is lower than the passed 'threshold'.


Thus right after madvise(NULL, size, MADV_STOCKPILE) 'size' bytes
could be allocated in this memory cgroup without extra latency from
the reclaimer, if there are no other memory consumers.

Reclaimed memory is simply put onto the free lists in the common buddy
allocator; there are no reserves for a particular task or cgroup.

If the overall memory allocation rate is smooth, without rough spikes, then
calling MADV_STOCKPILE periodically in a loop provides enough room for
allocations and eliminates direct reclaim from all other tasks.
As a result this eliminates unpredictable delays caused by
direct reclaim in random places.
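
A minimal user-space sketch of the periodic loop described above, under
stated assumptions. MADV_STOCKPILE exists only in this RFC patch, so the
numeric value below is a placeholder chosen for illustration (it is not
taken from the patch), and the helper names stockpile_once() and
stockpile_periodically() are hypothetical. On a kernel without the patch
the call is expected to fail, and the loop simply stops.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/*
 * ASSUMPTION: placeholder advice value for illustration only. The real
 * value would come from the patched kernel's uapi headers.
 */
#ifndef MADV_STOCKPILE
#define MADV_STOCKPILE 0x5350
#endif

/*
 * Ask the kernel to reclaim until at least `threshold` bytes are free,
 * using the madvise(NULL, size, MADV_STOCKPILE) form from the thread.
 * Returns 0 on success or a negative errno value on failure.
 */
int stockpile_once(size_t threshold)
{
    if (madvise(NULL, threshold, MADV_STOCKPILE) == 0)
        return 0;
    return -errno;
}

/*
 * Call stockpile_once() up to `rounds` times, sleeping `interval_ms`
 * milliseconds after each successful call. Stops at the first failure.
 * Returns the number of successful calls.
 */
int stockpile_periodically(size_t threshold, long interval_ms, int rounds)
{
    struct timespec pause = {
        .tv_sec = interval_ms / 1000,
        .tv_nsec = (interval_ms % 1000) * 1000000L,
    };
    int ok = 0;

    for (int i = 0; i < rounds; i++) {
        if (stockpile_once(threshold) != 0)
            break;
        ok++;
        nanosleep(&pause, NULL);
    }
    return ok;
}
```

A latency-sensitive service might run stockpile_periodically() from a
helper thread, with the threshold sized to its expected allocation burst
between two calls. Both the threshold and the interval are tuning choices
that this thread does not specify.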

OK, this makes it more clear to me. Thanks for the clarification!
I have clearly misunderstood and misinterpreted the target as the reclaim
target rather than the free memory target. Sorry about the confusion.
I still think that this looks like an abuse of madvise, but if there
is a wider consensus that this is acceptable I will not stand in the way.