Re: [PATCH net-next 2/2] sock: Fix improper heuristic on raising memory

From: Abel Wu
Date: Fri Sep 22 2023 - 06:10:42 EST


On 9/22/23 4:36 PM, Abel Wu wrote:
On 9/22/23 3:01 AM, Shakeel Butt wrote:
On Wed, Sep 20, 2023 at 09:25:41PM +0800, Abel Wu wrote:
Before sockets became aware of net-memcg's memory pressure since
commit e1aab161e013 ("socket: initial cgroup code."), the memory
usage would be granted to raise if below average even when under
protocol's pressure. This provides fairness among the sockets of
same protocol.

That commit changes this because the heuristic will also be
effective when only memcg is under pressure which makes no sense.
Fix this by skipping this heuristic when under memcg pressure.

Fixes: e1aab161e013 ("socket: initial cgroup code.")
Signed-off-by: Abel Wu <wuyun.abel@xxxxxxxxxxxxx>
---
  net/core/sock.c | 10 +++++++++-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 379eb8b65562..ef5cf6250f17 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3093,8 +3093,16 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
      if (sk_has_memory_pressure(sk)) {
          u64 alloc;
-        if (!sk_under_memory_pressure(sk))
+        if (memcg && mem_cgroup_under_socket_pressure(memcg))
+            goto suppress_allocation;
+
+        if (!sk_under_global_memory_pressure(sk))
              return 1;

I am onboard with replacing sk_under_memory_pressure() with
sk_under_global_memory_pressure(). However suppressing on memcg pressure
is a behavior change from status quo and need more thought and testing.

I think there are three options for this hunk:

1. proposed patch
2. Consider memcg pressure only for !in_softirq().
3. Don't consider memcg pressure at all.

All three options are behavior change from the status quo but with
different risk levels. (1) may reintroduce the regression fixed by
720ca52bcef22 ("net-memcg: avoid stalls when under memory pressure").

Just for the record, it is same for the current upstream implementation
if the socket reaches average usage. Taking option 2 will fix this too.

(2) is more inlined with 720ca52bcef22. (3) has the risk to making memcg
limits ineffective.

IMHO we should go with (2) as there is already a precedence in
720ca52bcef22.

Yes, I agree. Actually applying option(2) would make this patch quite
similar to the previous version[a], except the below part:

     /* Under limit. */
     if (allocated <= sk_prot_mem_limits(sk, 0)) {
         sk_leave_memory_pressure(sk);
-        return 1;
+        if (!under_memcg_pressure)
+            return 1;
     }

After a second thought, it is still vague to me about the position
the memcg pressure should be in socket memory allocation. It lacks
convincing design. I think the above hunk helps, but not much.

I wonder if we should take option (3) first. Thoughts?

Thanks,
Abel


My original thought is to inherit the behavior of tcpmem pressure.
There are also 3 levels of memcg pressure named low/medium/critical,
but considering that the 'low' level is too much conservative for
socket allocation, I made the following match:

    PROTOCOL    MEMCG        ACTION
    -----------------------------------------------------
    low        <medium        allow allocation
    pressure    medium        be more conservative
    high        critical    throttle

which also seems align with the design[b] of memcg pressure. Anyway
I will take option (2) and post v2.

Thanks & Best,
    Abel

[a] https://lore.kernel.org/lkml/20230901062141.51972-4-wuyun.abel@xxxxxxxxxxxxx/
[b] https://docs.kernel.org/admin-guide/cgroup-v1/memory.html#memory-pressure