Re: [PATCH] mm: memcg: provide accurate stats for userspace reads

From: Yosry Ahmed
Date: Fri Aug 11 2023 - 22:12:26 EST


On Fri, Aug 11, 2023 at 7:08 PM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
>
> Hi all,
>
> (sorry for the late response, I was away)
>
> On Fri, Aug 11, 2023 at 1:40 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> >
> [...]
> > > > >
> > > > > Last note, for /proc/vmstat we have /proc/sys/vm/stat_refresh to trigger
> > > > > an explicit refresh. For those users who really need more accurate
> > > > > numbers we might consider an interface like that, or allow writes to
> > > > > the stat file and do the refresh in the write handler.
> > > >
> > > > This wouldn't be my first option, but if that's the only way to get
> > > > accurate stats I'll take it.
> > >
> > > To be honest, this would be my preferred option, for two reasons:
> > > a) we do not want to guarantee too much on the precision front, because
> > > that would just make maintenance much harder, with different
> > > people having different opinions on how much precision is enough, and b)
> > > it makes the rarer case (needing precise stats) the special case rather
> > > than the default.
> >
> > How about we go with the proposed approach in this patch (or the mutex
> > approach as it's much cleaner), and if someone complains about slow
> > reads we revert the change and introduce the refresh API? We might
> > just get away with making all reads accurate and avoid the hassle of
> > updating some userspace readers to do write-then-read. We don't know
> > for sure that something will regress.
> >
> > What do you think?
>
> Actually I am with Michal on this one. Since I see multiple regression
> reports for reading the stats, I am inclined towards rate limiting the
> sync stats flushing from user readable interfaces (through
> mem_cgroup_flush_stats_ratelimited()) and providing a separate
> interface, as suggested by Michal, to explicitly flush the stats for
> users who are OK with the cost. Since we flush the stats every 2 seconds,
> most users should be fine, and the users who care about accuracy can
> pay for it.

I am worried that writing to a stat file to trigger a flush and then
reading it will increase the staleness window, which is what we are
trying to reduce here. Would it be acceptable to add a separate
interface that explicitly reads flushed stats without having to write
first? If the distinction disappears in the future we can just
short-circuit both interfaces.
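
To make the concern concrete, here is a rough userspace sketch of the
write-then-read pattern. It is only an illustration: the helper names
(trigger_refresh, read_stats, refresh_then_read) are made up, and the
paths are parameters because no memcg refresh file exists today. The
existing /proc/sys/vm/stat_refresh and /proc/vmstat pair could be passed
as a stand-in (writing the former needs root).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Step 1: ask for an explicit refresh by writing to the refresh file.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
static int trigger_refresh(const char *refresh_path)
{
	int fd = open(refresh_path, O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = write(fd, "1\n", 2);
	close(fd);
	return n == 2 ? 0 : -1;
}

/* Step 2: read the stats file into buf, always NUL-terminating it.
 * Returns the number of bytes read, or -1 on failure. (Hypothetical helper.) */
static ssize_t read_stats(const char *stat_path, char *buf, size_t len)
{
	ssize_t total = 0, n = 0;
	int fd;

	assert(len > 0);
	fd = open(stat_path, O_RDONLY);
	if (fd < 0)
		return -1;
	while ((size_t)total < len - 1 &&
	       (n = read(fd, buf + total, len - 1 - (size_t)total)) > 0)
		total += n;
	close(fd);
	if (n < 0)
		return -1;
	buf[total] = '\0';
	return total;
}

/* Write-then-read: two separate syscalls, so the stats can keep changing
 * in the gap between the refresh and the read. */
static ssize_t refresh_then_read(const char *refresh_path,
				 const char *stat_path,
				 char *buf, size_t len)
{
	if (trigger_refresh(refresh_path) != 0)
		return -1;
	return read_stats(stat_path, buf, len);
}
```

Userspace cannot make the write() in step 1 and the read() in step 2
atomic, so whatever changes in between is already stale by the time it
is read. A single read of a dedicated flushed-stats file would not have
that gap.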