Re: [PATCH] mm/vmscan: add sysctl knobs for protecting the working set

From: Vlastimil Babka
Date: Fri Dec 03 2021 - 07:00:00 EST


On 12/2/21 22:58, Andrew Morton wrote:
> On Thu, 2 Dec 2021 21:05:01 +0300 ValdikSS <iam@xxxxxxxxxxxxxxx> wrote:
>
>> This patchset is surprisingly effective and very useful for low-end PC
>> with slow HDD, single-board ARM boards with slow storage, cheap Android
>> smartphones with limited amount of memory. It almost completely prevents
>> thrashing condition and aids in fast OOM killer invocation.
>>
>> The similar file-locking patch is used in ChromeOS for nearly 10 years
>> but not on stock Linux or Android. It would be very beneficial for
>> lower-performance Android phones, SBCs, old PCs and other devices.
>>
>> With this patch, combined with zram, I'm able to run the following
>> software on an old office PC from 2007 with __only 2GB of RAM__
>> simultaneously:
>>
>> * Firefox with 37 active tabs (all data in RAM, no tab unloading)
>> * Discord
>> * Skype
>> * LibreOffice with the document opened
>> * Two PDF files (14 and 47 megabytes in size)
>>
>> And the PC doesn't crawl like a snail, even with 2+ GB in zram!
>> Without the patch, this PC is barely usable.
>> Please watch the video:
>> https://notes.valdikss.org.ru/linux-for-old-pc-from-2007/en/
>>
>
> This is quite a condemnation of the current VM. It shouldn't crawl
> like a snail.
>
> The patch simply sets hard limits on page reclaim's malfunctioning.
> I'd prefer that reclaim not malfunction :(

+CC Johannes

I'd also like to know where that malfunction happens in this case. The
relatively well-known scenario is that memory-overloaded systems thrash
instead of going OOM quickly - something PSI should be able to help with.
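
For reference, a minimal sketch of how userspace could read the PSI memory
numbers to tell whether a system is stalling on memory. It assumes a kernel
built with CONFIG_PSI and the usual "some"/"full" line format of
/proc/pressure/memory. The helper names and the 10% threshold are
hypothetical, chosen only for illustration; they are not part of the patch
or of any existing tool.

```python
def parse_psi(text):
    """Parse the contents of a PSI file into {kind: {field: value}}.

    Expected input (one line per kind), for example:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
        full avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, raw = field.partition("=")
            # "total" is a stall-time counter in microseconds; avg* are percentages.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


def read_memory_pressure(path="/proc/pressure/memory"):
    """Read and parse memory pressure; the file exists only with PSI enabled."""
    with open(path) as f:
        return parse_psi(f.read())


def is_thrashing(psi, threshold=10.0):
    """Hypothetical policy: flag thrashing when all non-idle tasks were
    stalled on memory for at least `threshold` percent of the last 10s."""
    return psi.get("full", {}).get("avg10", 0.0) >= threshold
```

Calling is_thrashing(read_memory_pressure()) periodically is roughly what a
userspace OOM helper would do; the appropriate threshold is workload
dependent.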

But in your case, if there is no OOM due to the added protections, it would
mean that the system is in fact not overloaded, just that the normal reclaim
decisions lead to reclaiming something that should be left in memory, while
there is other memory that can be reclaimed without causing thrashing?
That's perhaps worse and worth investigating.

> That being said, I can see that a blunt instrument like this would be
> useful.
>
> I don't think that the limits should be "N bytes on the current node".
> Nodes can have different amounts of memory so I expect it should scale
> the hard limits on a per-node basis. And of course, the various zones
> have different size as well.
>
> We do already have a lot of sysctls for controlling these sort of
> things. Was much work put into attempting to utilize the existing
> sysctls to overcome these issues?
>
>