Re: [PATCH] mm: compaction: optimize compact_memory to comply with the admin-guide

From: Wen Yang
Date: Tue Apr 18 2023 - 10:12:11 EST



On 2023/4/17 19:13, Mel Gorman wrote:
On Sun, Apr 16, 2023 at 01:42:44AM +0800, Wen Yang wrote:
On 2023/4/13 00:54, Wen Yang wrote:
On 2023/4/12 04:48, Andrew Morton wrote:
On Wed, 12 Apr 2023 02:24:26 +0800 wenyang.linux@xxxxxxxxxxx wrote:

For the /proc/sys/vm/compact_memory file, the admin-guide states:
When 1 is written to the file, all zones are compacted such that free
memory is available in contiguous blocks where possible. This can be
important for example in the allocation of huge pages although processes
will also directly compact memory as required

But this was not strictly followed: writing any value would cause all
zones to be compacted. In some critical scenarios, some applications
operating it, such as echo 0, have caused serious problems.

Really?  You mean someone actually did this and didn't observe the
effect during their testing?

Thanks for your reply.

Since /proc/sys/vm/compact_memory has been well documented for over a
decade:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/admin-guide/sysctl/vm.rst#n109


it is widely believed that only writing 1 will trigger compaction of all
zones.

Application developers in particular tend to rely on the documentation
and generally do not read kernel code.  Moreover, such problems are not
easily detected by testing on lightly loaded machines.

Writing any value, meaningful or not, triggers compaction and affects
the entire server:

# echo 1 > /proc/sys/vm/compact_memory
# echo 0 > /proc/sys/vm/compact_memory
# echo dead > /proc/sys/vm/compact_memory
# echo "hello world" > /proc/sys/vm/compact_memory

The implementation of such a high-risk operation should follow the
admin-guide.
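
For application authors who want to stay within the documented interface,
a minimal userspace sketch in C might look like the following. The helper
name request_compaction and its path parameter are hypothetical and are
not part of the patch under discussion; the helper writes only the one
documented value, 1, and reports any error instead of ignoring it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (an illustration, not code from this thread):
 * request compaction of all zones by writing the single documented
 * value, "1", to the sysctl file.
 *
 * The path is a parameter so the helper can be tried against a scratch
 * file; on a real system it would be "/proc/sys/vm/compact_memory" and
 * the caller would need root privileges.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
int request_compaction(const char *path)
{
	static const char value[] = "1\n";
	const size_t len = sizeof(value) - 1;	/* exclude the trailing NUL */
	ssize_t written;
	int fd;
	int ret = 0;

	assert(path != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, value, len);
	if (written < 0)
		ret = -errno;
	else if ((size_t)written != len)
		ret = -EIO;	/* short write: treat as an I/O error */

	if (close(fd) < 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

A caller would invoke request_compaction("/proc/sys/vm/compact_memory")
and check the return value. The point of the sketch is that the value
written is fixed in the code rather than taken from a variable that
might hold 0 or arbitrary text.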

--

Best wishes,

Wen


Hello, do you think it's better to optimize the sysctl_compaction_handler
code or update the admin-guide document?

Enforce the 1 on the unlikely chance that the sysctl handler is ever
extended to do something different and expects a bitmask. The original
intent of the sysctl was debugging -- demonstrating a contiguous
allocation failure when aggressive compaction should have succeeded. Later
some machines dedicated to batch jobs used the compaction sysctl to compact
memory before a new job started to reduce startup latencies.
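
The "enforce the 1" suggestion can be illustrated with a small userspace
model in C. This is only a sketch under stated assumptions: the function
name model_compact_memory_write is hypothetical, it parses with strtol
rather than the kernel's sysctl helpers, and it is not the code of the
actual patch. It accepts the value 1 (with an optional trailing newline,
as echo would send) and rejects everything else with -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Userspace model (an assumption, not kernel code) of a write handler
 * that accepts only the documented value 1.
 *
 * Returns 0 when compaction would be triggered and -EINVAL for any
 * other input, including 0, other integers, and non-numeric text.
 */
int model_compact_memory_write(const char *buf)
{
	char *end;
	long val;

	assert(buf != NULL);

	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf || errno == ERANGE)
		return -EINVAL;	/* no digits at all, or out of range */

	/* Tolerate the single newline that "echo" appends. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;	/* trailing characters after the number */

	if (val != 1)
		return -EINVAL;	/* only the documented value is accepted */

	return 0;	/* a real handler would start compaction here */
}
```

With this model, "1\n" is accepted, while "0\n", "dead\n" and
"hello world\n" are all rejected. Rejecting every other value also
leaves them free to be given a meaning later, for example as a bitmask.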

Drop the justification "In some critical scenarios, some applications
operating it, such as echo 0, have caused serious problems." from the
changelog. I cannot imagine a sane "critical scenario" where an application
running as root is writing expected garbage to proc or sysfs files and
then surprised when something unexpected happens.

Thanks for your comments.

We will modify it according to your suggestion and then send v2.


--

Best wishes,

Wen