Re: [PATCH v2 5/7] sched: Implement shared runqueue in CFS

From: David Vernet
Date: Thu Jul 13 2023 - 00:05:30 EST


On Thu, Jul 13, 2023 at 11:43:50AM +0800, Abel Wu wrote:
> On 7/13/23 6:16 AM, David Vernet wrote:
> > On Wed, Jul 12, 2023 at 06:47:26PM +0800, Abel Wu wrote:
> > > > + *
> > > > + * HOW
> > > > + * ===
> > > > + *
> > > > + * A shared_runq is composed of a list and a spinlock for synchronization.
> > > > + * Given that the critical section for a shared_runq is typically a fast list
> > > > + * operation, and that the shared_runq is localized to a single LLC, the
> > > > + * spinlock will typically only be contended on workloads that do little
> > > > + * other than hammer the runqueue.
> > >
> > > Would there be scalability issues on large LLCs?
> >
> > See the next patch in the series [0] where we shard the per-LLC shared
> > runqueues to avoid contention.
> >
> > [0]: https://lore.kernel.org/lkml/20230710200342.358255-7-void@xxxxxxxxxxxxx/
>
> Sorry, I should have read the cover letter more carefully. By sharding,
> the LLC is partitioned into several zones, which relieves contention.
> But sharding itself might be tricky. Keeping SMT siblings from crossing
> shards, as suggested by Peter, is generally a good thing. But I wonder
> whether any workloads might benefit from a different sharding scheme.

IMO for now the best thing to do is keep things simple until we observe
it being a problem in practice, or to at least plan to address it in a
follow-on patch set when we add support for dynamic shard sizing.
Proper shard sizing is required for optimal SMT placement anyway, and
in general will likely have more of an impact on performance.
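
For anyone reading along without the patches handy, the shape under
discussion is roughly the following. This is a simplified sketch rather
than the literal patch code, and the shard-selection helper is purely
illustrative (it assumes SMT siblings are enumerated adjacently):

	struct shared_runq_shard {
		struct list_head list;
		raw_spinlock_t lock;
	};

	struct shared_runq {
		unsigned int num_shards;
		struct shared_runq_shard shards[];
	};

	/* Hypothetical shard choice: shifting the CPU id right by one
	 * keeps adjacent SMT siblings in the same shard, per Peter's
	 * suggestion. */
	static struct shared_runq_shard *rq_shard(struct shared_runq *srq,
						  int cpu)
	{
		return &srq->shards[(cpu >> 1) % srq->num_shards];
	}

Whether a topology-driven mapping like this is always the right one is
exactly the open question Abel raises above.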

> > >
> > > > +
> > > > + task_rq_unlock(src_rq, p, &src_rf);
> > > > +
> > > > + raw_spin_rq_lock(rq);
> > > > + rq_repin_lock(rq, rf);
> > >
> > > By making it a bit uglier, we can save some cycles:
> > >
> > > 	if (src_rq != rq) {
> > > 		task_rq_unlock(src_rq, p, &src_rf);
> > > 	} else {
> > > 		rq_unpin_lock(src_rq, &src_rf);
> > > 		raw_spin_unlock_irqrestore(&p->pi_lock, src_rf.flags);
> > > 		rq_repin_lock(rq, rf);
> > > 	}
>
> I forgot the repin part when src_rq != rq, but I'm sure you already got
> my point :)

Yep, I got your main point, which was to avoid the extra dropping and
reacquiring of the same rq spinlock. FYI for posterity / anyone else
reading: the missing repin isn't sufficient on its own for the src_rq !=
rq path; we also need to reacquire the rq lock.
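
Spelled out, the combined sequence for both paths would look roughly
like the below (a sketch only, modulo whatever the next revision
actually does):

	if (src_rq != rq) {
		/* Different rq: drop the source rq lock entirely, then
		 * take and repin the local rq lock. */
		task_rq_unlock(src_rq, p, &src_rf);
		raw_spin_rq_lock(rq);
		rq_repin_lock(rq, rf);
	} else {
		/* Same rq: keep holding the rq lock; just unpin, drop
		 * p->pi_lock, and repin against the caller's rf. */
		rq_unpin_lock(src_rq, &src_rf);
		raw_spin_unlock_irqrestore(&p->pi_lock, src_rf.flags);
		rq_repin_lock(rq, rf);
	}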

Thanks again for the helpful reviews.

- David