Re: [RFC PATCH] mm: memcontrol: don't account swap failures not due to cgroup limits

From: Johannes Weiner
Date: Tue Feb 07 2023 - 11:55:01 EST


On Mon, Feb 06, 2023 at 05:18:43PM +0100, Michal Koutný wrote:
> On Thu, Feb 02, 2023 at 10:30:40AM -0800, Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> > > b) Only count cgroup swap events when they are actually due to a
> > > cgroup's own limit. Exclude failures that are due to physical swap
> > > shortage or other system-level conditions (like !THP_SWAP). Also
> > > count them at the level where the limit is configured, which may be
> > > above the local cgroup that holds the page-to-be-swapped.
> > >
> > > This is in line with how memory.swap.high, memory.high and
> > > memory.max events are counted.
> > >
> > > However, it's a change in documented behavior.
> >
> > This option makes sense to me, but I can't speak to the change of
> > documented behavior. However, looking at the code, it seems like if we do this
> > the "max" & "fail" counters become effectively the same. "fail" would
> > not provide much value then.
> >
> > I wonder if it makes sense to have both, and clarify that "fail" -
> > "max" would be non-limit based failures (e.g. ran out of swap space),
> > or would this cause confusion as to whether those non-limit failures
> > were transient (THP fallback) or eventual?
>
> I somewhat second this.
>
> Perhaps, could the patch (and arguments) be split in two:
> 1) count .max events on respective limit's level (other limits consistency),

Okay, I'll split this one out. It's good to have regardless of what we
do with the fail counter.

Thanks