Re: [RFC] mm, slab: reschedule cache_reap() on the same CPU

From: Christopher Lameter
Date: Tue Apr 10 2018 - 10:12:18 EST


On Tue, 10 Apr 2018, Vlastimil Babka wrote:

> cache_reap() is initially scheduled in start_cpu_timer() via
> schedule_delayed_work_on(). But then the next iterations are scheduled via
> schedule_delayed_work(), thus using WORK_CPU_UNBOUND.

That is a bug. cache_reap() must run on the same cpu since it deals with
the per cpu queues of the current cpu. schedule_delayed_work() used to
guarantee running on the same cpu.

> This patch makes sure schedule_delayed_work_on() is used with the proper cpu
> when scheduling the next iteration. The cpu is stored with delayed_work on a
> new slab_reap_work_struct super-structure.

The current cpu is readily available via smp_processor_id(). Why a
super structure?

> @@ -4074,7 +4086,8 @@ static void cache_reap(struct work_struct *w)
>  	next_reap_node();
>  out:
>  	/* Set up the next iteration */
> -	schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_AC));
> +	schedule_delayed_work_on(reap_work->cpu, work,
> +				round_jiffies_relative(REAPTIMEOUT_AC));

schedule_delayed_work_on(smp_processor_id(), work, round_jiffies_relative(REAPTIMEOUT_AC));

instead of all the other changes?
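
To illustrate the point, here is a rough user-space model, not kernel code.
Every name in it (model_schedule_on, model_cache_reap, model_run,
CPU_UNBOUND, cached_objects) is made up for this sketch. It assumes that an
unbound request lands on the neighbouring cpu, which exaggerates what the
real workqueue does; the only property that matters is that an unbound
request is not guaranteed to stay on the cpu that queued it.

```c
#include <assert.h>

#define NR_CPUS 4
#define CPU_UNBOUND (-1)	/* hypothetical stand-in for WORK_CPU_UNBOUND */

/* Hypothetical model state: the cpu the next reap iteration is queued on. */
struct reap_work {
	int next_cpu;
};

/* Stand-in for the per cpu queues that cache_reap() drains. */
static int cached_objects[NR_CPUS];

/*
 * Model of queueing the next iteration. An unbound request may run on any
 * cpu; this model assumes it lands on the neighbour so the drift is visible.
 */
static void model_schedule_on(struct reap_work *w, int cpu, int running_cpu)
{
	if (cpu == CPU_UNBOUND)
		cpu = (running_cpu + 1) % NR_CPUS;
	assert(cpu >= 0 && cpu < NR_CPUS);
	w->next_cpu = cpu;
}

/* One reap iteration: drain the queue of the cpu we run on, then requeue. */
static void model_cache_reap(struct reap_work *w, int rebind)
{
	int cpu = w->next_cpu;	/* plays the role of smp_processor_id() */

	cached_objects[cpu] = 0;
	model_schedule_on(w, rebind ? cpu : CPU_UNBOUND, cpu);
}

/* Run `rounds` iterations starting on `cpu`; return where the work ends up. */
static int model_run(int cpu, int rounds, int rebind)
{
	struct reap_work w = { .next_cpu = cpu };

	for (int i = 0; i < rounds; i++)
		model_cache_reap(&w, rebind);
	return w.next_cpu;
}
```

With rebind set, model_run(2, 5, 1) stays on cpu 2 for every iteration. With
rebind clear, the same call in this model ends on cpu 3, so the work has been
draining queues that belong to other cpus along the way.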