Re: Expensive memory.stat + cpu.stat reads

From: Yosry Ahmed
Date: Mon Aug 14 2023 - 20:32:41 EST


On Mon, Aug 14, 2023 at 5:30 PM Ivan Babrou <ivan@xxxxxxxxxxxxxx> wrote:
>
> On Mon, Aug 14, 2023 at 5:18 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > On Fri, Aug 11, 2023 at 05:01:08PM -0700, Yosry Ahmed wrote:
> > > There have been a lot of problems coming from this global rstat lock:
> > > hard lockups (when we used to flush atomically), unified flushing
> > > being expensive, skipping flushing being inaccurate, etc.
> > >
> > > I wonder if it's time to rethink this lock and break it down into
> > > granular locks. Perhaps a per-cgroup lock, and develop a locking
> > > scheme where you always lock a parent then a child, then flush the
> > > child and unlock it and move to the next child, etc. This will allow
> > > concurrent flushing of non-root cgroups. Even when flushing the root,
> > > if we flush all its children first without locking the root, then only
> > > lock the root when flushing the top-level children, then some level of
> > > concurrency can be achieved.
> > >
> > > Maybe this is too complicated, I never tried to implement it, but I
> > > have been bouncing around this idea in my head for a while now.
> > >
> > > We can also split the update tree per controller. As far as I can tell
> > > there is no reason to flush cpu stats for example when someone wants
> > > to read memory stats.
> >
> > There's another thread. Let's continue there but I'm a bit skeptical whether
> > splitting the lock is a good solution here. Regardless of locking, we don't
> > want to run in an atomic context for that long anyway.
>
> Could you link to the other thread?

I supposedly CC'd you there, but I realized it didn't work for some reason:
https://lore.kernel.org/lkml/CAJD7tkYBFz-gZ2QsHxUMT=t0KNXs66S-zzMPebadHx9zaG0Q3w@xxxxxxxxxxxxxx/
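
For reference, here is a rough sketch of the per-cgroup locking idea quoted
above, written as userspace Python rather than kernel C. Every name in it
(Cgroup, update, flush, pending, for_parent) is hypothetical and not taken
from the rstat code. Updates take the node lock for simplicity, whereas the
kernel keeps per-CPU counters. The sketch only illustrates the lock ordering:
a child lock is taken only while the parent lock is held, so flushes of
sibling subtrees do not contend with each other.

```python
import threading


class Cgroup:
    """Hypothetical cgroup node with its own lock (not the kernel struct)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []          # assumed fixed once the tree is built
        self.lock = threading.Lock()
        self.pending = 0            # delta not yet folded into self.total
        self.for_parent = 0         # delta folded here, not yet handed upward
        self.total = 0              # aggregated value as of the last flush
        if parent is not None:
            parent.children.append(self)


def update(cg, delta):
    """Record a local stat change on one cgroup."""
    with cg.lock:
        cg.pending += delta


def _flush_locked(cg):
    """Flush the subtree rooted at cg. The caller must hold cg.lock."""
    for child in cg.children:
        # Lock order is always parent, then child.
        with child.lock:
            _flush_locked(child)
            cg.pending += child.for_parent
            child.for_parent = 0
        # Child is unlocked here before moving to the next one.
    cg.total += cg.pending
    cg.for_parent += cg.pending
    cg.pending = 0


def flush(cg):
    """Flush one subtree without touching its ancestors or siblings."""
    with cg.lock:
        _flush_locked(cg)


root = Cgroup("root")
a = Cgroup("a", root)
b = Cgroup("b", root)
a1 = Cgroup("a1", a)

update(a, 1)
update(a1, 3)
update(b, 4)

# Sibling subtrees can be flushed concurrently: they share no locks.
threads = [threading.Thread(target=flush, args=(cg,)) for cg in (a, b)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# A root flush then only collects what each child has left for it.
flush(root)
```

Since no flusher ever takes a parent lock while holding a child lock, two
flushers cannot deadlock. This version still holds the root lock for the
whole root flush; the variant that flushes the children first and only then
locks the root is not shown.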