Re: [RFC PATCH 2/3] sched: add yield_to function

From: Srivatsa Vaddagiri
Date: Fri Dec 03 2010 - 08:46:32 EST


On Fri, Dec 03, 2010 at 06:54:16AM +0100, Mike Galbraith wrote:
> > +void yield_to(struct task_struct *p)
> > +{
> > + unsigned long flags;
> > + struct sched_entity *se = &p->se;
> > + struct rq *rq;
> > + struct cfs_rq *cfs_rq;
> > + u64 remain = slice_remain(current);
>
> That "slice remaining" only shows the distance to last preempt, however
> brief. It shows nothing wrt tree position; the yielding task may well
> already be right of the task it wants to yield to, having been a buddy.

Good point.
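
For reference, my reading of what slice_remain() computes (rough sketch only,
the actual patch may differ), which does make your point clear: it measures
cpu time consumed since the task last got on cpu, and says nothing about its
vruntime/rbtree position:

static u64 slice_remain(struct task_struct *p)
{
	struct sched_entity *se = &p->se;
	/* unconsumed part of the current slice since last going on cpu */
	u64 slice = sched_slice(cfs_rq_of(se), se);
	u64 ran = se->sum_exec_runtime - se->prev_sum_exec_runtime;

	return slice > ran ? slice - ran : 0;
}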

> > + cfs_rq = cfs_rq_of(se);
> > + se->vruntime -= remain;
> > + if (se->vruntime < cfs_rq->min_vruntime)
> > + se->vruntime = cfs_rq->min_vruntime;
>
> (This is usually done using max_vruntime())
>
> If the recipient was already left of the fair stick (min_vruntime),
> clipping advances its vruntime, vaporizing entitlement from both donor
> and recipient.
>
> What if a task tries to yield to another not on the same cpu, and/or in
> the same task group?

In this case, the target of yield_to is a vcpu belonging to the same VM and hence
is expected to be in the same task group, but I agree it's good to put a check.
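
For concreteness, the hunk above with such a check and the max_vruntime()
idiom would look something like below (just a sketch, and it does not address
the clipping concern you raise):

	/* sketch only: restrict the target to the same cpu and task group */
	if (task_cpu(p) != task_cpu(current) ||
	    task_group(p) != task_group(current))
		return;

	cfs_rq = cfs_rq_of(se);
	/* clip with the existing helper instead of open-coding the compare */
	se->vruntime = max_vruntime(cfs_rq->min_vruntime,
				    se->vruntime - remain);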

> This would munge min_vruntime of other queues. I
> think you'd have to restrict this to same cpu, same group. If tasks can
> donate cross cfs_rq, (say) pinned task A cpu A running solo could donate
> vruntime to selected tasks pinned to cpu B, for as long as minuscule
> preemptions can resupply ammo. Would suck to not be the favored child.

IOW, starving the "non-favored" children?

> Maybe you could exchange vruntimes cooperatively (iff same cfs_rq)
> between threads, but I don't think donations with clipping works.

Can't that lead to starvation again (as I pointed out in a mail to Peterz):

cpu p0 -> A0 B0 A1	(A0/A1 being vcpus of one VM, B0 another task)

A0 and A1 get into a mutual yield_to(other) loop, each repeatedly yielding to
the other, which means we keep swapping their vruntimes, starving B0?
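
To make the scenario concrete, roughly what I understand the cooperative
exchange to be (hypothetical sketch, not from any posted patch):

static void yield_to_swap(struct sched_entity *a, struct sched_entity *b)
{
	/* only exchange between entities on the same cfs_rq */
	if (cfs_rq_of(a) != cfs_rq_of(b))
		return;

	swap(a->vruntime, b->vruntime);
	/* both entities would then need to be requeued to resort the rbtree */
}

Whether A0 and A1 repeatedly doing this to each other can keep B0 from ever
becoming leftmost is the question above.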

- vatsa