Re: Expensive memory.stat + cpu.stat reads

From: Ivan Babrou
Date: Mon Aug 14 2023 - 13:57:26 EST


On Fri, Aug 11, 2023 at 7:34 PM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
>
> Hi Ivan,
>
> (sorry for late response as I was away)
>
> On Fri, Aug 11, 2023 at 3:35 PM Ivan Babrou <ivan@xxxxxxxxxxxxxx> wrote:
> [...]
> > > > I spent some time looking into this and I think I landed on a fix:
> > > >
> > > > * https://github.com/bobrik/linux/commit/50b627811d54
> > > >
> > > > I'm not 100% sure if it's the right fix for the issue, but it reduces
> > > > the runtime significantly.
>
> In your patch, can you try to replace mem_cgroup_flush_stats() with
> mem_cgroup_flush_stats_ratelimited() instead of cgroup_rstat_flush().
> I wanted to see if you observe any stale stats issues.

We scrape cgroup metrics every 53s and at that scale I doubt we'd
notice any staleness. With 2s FLUSH_TIME it would be in the noise. We
can even remove any sort of flushing from memory.stat as long as
cpu.stat is read right before it as it does flushing for both.

I agree with Yosry that there's probably no reason to flush both cpu
and memory stats at the same time.

It doesn't seem right to use delayed flush for memory and direct flush
for cpu. Perhaps it doesn't matter as much to warrant a bigger rework
of how it's done, but maybe then my patch to use the same mechanism
between cpu and memory flushing makes sense?