Re: [PATCH 0/2] execve scalability issues, part 1

From: Jan Kara
Date: Wed Aug 23 2023 - 12:41:21 EST


On Wed 23-08-23 18:10:29, Mateusz Guzik wrote:
> On 8/23/23, Jan Kara <jack@xxxxxxx> wrote:
> > I didn't express myself well. Sure atomics are expensive compared to plain
> > arithmetic operations. But I wanted to say - we had atomics for RSS
> > counters before commit f1a7941243 ("mm: convert mm's rss stats into
> > percpu_counter") and people seemed happy with it until there were many CPUs
> > contending on the updates. So maybe RSS counters aren't used heavily enough
> > for the difference to matter in practice? Probably an operation like
> > faulting in (or unmapping) a tmpfs file has the highest chance of showing
> > the cost of RSS accounting compared to the cost of the remainder of the
> > operation...
> >
>
> These stats used to be decentralized by storing them in task_struct;
> the commit complains about the values deviating too much.
>
> The counters would get synced once per 64 events, from the diff:
> -/* sync counter once per 64 page faults */
> -#define TASK_RSS_EVENTS_THRESH	(64)
> -static void check_sync_rss_stat(struct task_struct *task)
> -{
> -	if (unlikely(task != current))
> -		return;
> -	if (unlikely(task->rss_stat.events++ > TASK_RSS_EVENTS_THRESH))
> -		sync_mm_rss(task->mm);
> -}
>
> Other than that it was a plain, non-atomic update in the thread's task_struct.
>
> -static void add_mm_counter_fast(struct mm_struct *mm, int member, int val)
> -{
> -	struct task_struct *task = current;
> -
> -	if (likely(task->mm == mm))
> -		task->rss_stat.count[member] += val;
> -	else
> -		add_mm_counter(mm, member, val);
> -}

Ah, I see. I had already forgotten these details since I was checking the
regression back in spring. Now I've just seen the atomic_long_t counters in
mm_struct and forgot there also used to be these per-thread counters in
task_struct. Thanks for refreshing my memory!
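
For the archives, the old per-thread batching you quote is roughly
equivalent to the user-space sketch below (all the names here are made up
for illustration, it is not the kernel code): each thread does plain,
non-atomic increments on thread-local state and folds them into the shared
atomic counter only once per 64 events.

/*
 * Toy model of per-thread RSS batching: the fast path is a plain add on
 * thread-local state, the shared atomic is only touched once per 64 events.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define RSS_EVENTS_THRESH	64
#define NR_THREADS		4

static atomic_long shared_rss;		/* the "mm-wide" counter */

static _Thread_local struct {
	long pending;			/* non-atomic per-thread delta */
	int events;
} local_rss;

static void rss_flush(void)
{
	atomic_fetch_add(&shared_rss, local_rss.pending);
	local_rss.pending = 0;
	local_rss.events = 0;
}

static void rss_add(long val)
{
	local_rss.pending += val;	/* no atomic op on the fast path */
	if (++local_rss.events > RSS_EVENTS_THRESH)
		rss_flush();
}

static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < 100000; i++)
		rss_add(1);
	rss_flush();			/* the kernel flushed e.g. on task exit */
	return NULL;
}

int main(void)
{
	pthread_t t[NR_THREADS];

	for (int i = 0; i < NR_THREADS; i++)
		pthread_create(&t[i], NULL, worker, NULL);
	for (int i = 0; i < NR_THREADS; i++)
		pthread_join(t[i], NULL);

	printf("rss = %ld\n", atomic_load(&shared_rss));
	return 0;
}

Which also makes it obvious where the drift the commit complains about
comes from - readers of the shared counter lag behind by whatever deltas
the other threads have not flushed yet.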

> So the question is how much this matters. My personal approach is
> that avoidable slowdowns (like the atomics here) only facilitate further
> avoidable slowdowns, as people can claim there is a minuscule change in
> % relative to the baseline. But if the baseline is already slow....

I get your point, but over the years I've also learned that premature
optimization isn't good either, as we will be dragging the maintenance
burden along for a long time ;) It's a balance.

Honza

--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR