Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg

From: Yosry Ahmed
Date: Fri Oct 20 2023 - 13:43:44 EST


On Fri, Oct 20, 2023 at 10:23 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
>
> On Fri, Oct 20, 2023 at 9:18 AM kernel test robot <oliver.sang@xxxxxxxxx> wrote:
> >
> >
> >
> > Hello,
> >
> > kernel test robot noticed a -25.8% regression of will-it-scale.per_thread_ops on:
> >
> >
> > commit: 51d74c18a9c61e7ee33bc90b522dd7f6e5b80bb5 ("[PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg")
> > url: https://github.com/intel-lab-lkp/linux/commits/Yosry-Ahmed/mm-memcg-change-flush_next_time-to-flush_last_time/20231010-112257
> > base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
> > patch link: https://lore.kernel.org/all/20231010032117.1577496-4-yosryahmed@xxxxxxxxxx/
> > patch subject: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg
> >
> > testcase: will-it-scale
> > test machine: 104 threads 2 sockets (Skylake) with 192G memory
> > parameters:
> >
> > nr_task: 100%
> > mode: thread
> > test: fallocate1
> > cpufreq_governor: performance
> >
> >
> > In addition to that, the commit also has significant impact on the following tests:
> >
> > +------------------+---------------------------------------------------------------+
> > | testcase: change | will-it-scale: will-it-scale.per_thread_ops -30.0% regression |
> > | test machine     | 104 threads 2 sockets (Skylake) with 192G memory              |
> > | test parameters  | cpufreq_governor=performance                                  |
> > |                  | mode=thread                                                   |
> > |                  | nr_task=50%                                                   |
> > |                  | test=fallocate1                                               |
> > +------------------+---------------------------------------------------------------+
> >
>
> Yosry, I don't think a 25% to 30% regression can be ignored. Unless
> there is a quick fix, IMO this series should be skipped for the
> upcoming kernel merge window.

I am currently looking into it. It's reasonable to skip the next merge
window if a quick fix isn't found soon.

I am surprised by the size of the regression given the following:
1.12 ± 5% +1.4 2.50 ± 2%
perf-profile.self.cycles-pp.__mod_memcg_lruvec_state

IIUC we are only spending about 1.4% more of the total cycles in
__mod_memcg_lruvec_state().
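
For reference, a minimal sketch of the arithmetic behind the profile line
quoted above. It assumes the three columns are the base self-cycles
percentage, the absolute change, and the patched percentage, which is my
reading of the report layout rather than something stated in it. The helper
name profile_delta is hypothetical and only for illustration.

```python
def profile_delta(base_pct, patched_pct):
    """Compare two perf-profile self-cycles percentages.

    Returns a tuple of:
      - the absolute change, in percentage points of total cycles
      - the relative ratio (patched / base)
    """
    absolute_change = patched_pct - base_pct
    relative_ratio = patched_pct / base_pct
    return absolute_change, relative_ratio


# Values taken from the quoted perf-profile line for
# __mod_memcg_lruvec_state: 1.12% before the patch, 2.50% after.
abs_change, ratio = profile_delta(1.12, 2.50)
print(f"absolute change: {abs_change:+.2f} percentage points")
print(f"relative ratio:  {ratio:.2f}x")
```

Under that reading, the function's own share of cycles roughly doubles in
relative terms but still accounts for well under 2 additional percentage
points of total cycles, which is why it looks too small on its own to
explain a 25.8% drop in per_thread_ops.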