Re: [PATCH v2 6/7] sched: Shard per-LLC shared runqueues

From: David Vernet
Date: Tue Jul 11 2023 - 15:58:06 EST


On Tue, Jul 11, 2023 at 12:49:58PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 10, 2023 at 03:03:41PM -0500, David Vernet wrote:
>
> > +struct shared_runq_shard {
> > 	struct list_head list;
> > 	spinlock_t lock;
> > } ____cacheline_aligned;
> >
> > +struct shared_runq {
> > +	u32 num_shards;
> > +	struct shared_runq_shard shards[];
> > +} ____cacheline_aligned;
> > +
> > +/* This would likely work better as a configurable knob via debugfs */
> > +#define SHARED_RUNQ_SHARD_SZ 6
> > +
> > #ifdef CONFIG_SMP
> > static struct shared_runq *rq_shared_runq(struct rq *rq)
> > {
> > 	return rq->cfs.shared_runq;
> > }
> >
> > -static struct task_struct *shared_runq_pop_task(struct rq *rq)
> > +static struct shared_runq_shard *rq_shared_runq_shard(struct rq *rq)
> > +{
> > +	return rq->cfs.shard;
> > +}
> > +
> > +static int shared_runq_shard_idx(const struct shared_runq *runq, int cpu)
> > +{
> > +	return cpu % runq->num_shards;
>
> I would suggest either:
>
> (cpu >> 1) % num_shards
>
> or keeping num_shards even, to give SMT siblings a fighting chance to
> hit the same bucket.

Given that neither of these approaches guarantees that the SMT siblings
land in the same bucket on every architecture, I'll just go with your
first suggestion, (cpu >> 1) % num_shards, which is simpler.
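
Concretely, I'd expect the v3 change to be just the following (untested
sketch):

static int shared_runq_shard_idx(const struct shared_runq *runq, int cpu)
{
	/*
	 * Fold adjacent CPU IDs together so that SMT siblings which are
	 * enumerated consecutively (cpu 0/1, 2/3, ...) hash to the same
	 * shard. Siblings enumerated any other way get no such guarantee.
	 */
	return (cpu >> 1) % runq->num_shards;
}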

Seems inevitable that we'll want to have another debugfs knob to adjust
the number of shards, but IMO it's preferable to just apply your
suggestion in v3 and hold off on adding that complexity until we know we
need it.
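
If/when we do need it, I'd expect the knob itself to be trivial --
something like the below (illustrative only; the variable name is made
up), living next to the other tunables that sched_init_debug() creates
in kernel/sched/debug.c:

/* Hypothetical tunable replacing the SHARED_RUNQ_SHARD_SZ constant. */
__read_mostly u32 sysctl_sched_shared_runq_shard_sz = 6;

/* In sched_init_debug(), under the existing "sched" debugfs dir: */
debugfs_create_u32("shared_runq_shard_sz", 0644, debugfs_sched,
		   &sysctl_sched_shared_runq_shard_sz);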

> (I've no idea how SMT4 (or worse SMT8) is typically enumerated, so
> someone from the Power/Sparc/MIPS world would have to go play with that
> if they so care)

Yeah, no idea either. If SMT enumeration ends up varying a lot across
architectures, then we can look into making the shard assignment
architecture specific.
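
One way to do that would be a __weak default that an architecture with
an unusual SMT enumeration could override (purely illustrative; no such
hook exists in this series):

/* Default shard mapping; arch code may provide its own definition. */
int __weak arch_shared_runq_shard_idx(const struct shared_runq *runq, int cpu)
{
	return (cpu >> 1) % runq->num_shards;
}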

>
> > +}
>
> > +	num_shards = max(per_cpu(sd_llc_size, i) /
> > +			 SHARED_RUNQ_SHARD_SZ, 1);
>
> > +	shared_runq->num_shards = num_shards;
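
For the record, the sizing above means that with the default shard size
of 6, a 16-CPU LLC gets max(16 / 6, 1) == 2 shards, and any LLC with
fewer than 12 CPUs collapses to a single shard, i.e. it behaves exactly
like the unsharded shared_runq.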