Re: [PATCH] locking/osq: Drop the overload of osq lock

From: Peter Zijlstra
Date: Sat Jun 25 2016 - 10:25:08 EST


On Sat, Jun 25, 2016 at 01:42:03PM -0400, Pan Xinhui wrote:
> An over-committed guest with more vCPUs than pCPUs has a heavy overload
> in osq_lock().
>
> This is because vCPU A holds the osq lock and gets yielded out, while
> vCPU B waits for the per-cpu node->locked to be set. IOW, vCPU B waits
> for vCPU A to run and unlock the osq lock. Even though there is a
> need_resched() check, it does not help in such a scenario.
>
> To fix this issue, add a threshold in one while-loop of osq_lock().
> The value of the threshold is roughly equal to SPIN_THRESHOLD.

Blergh, virt ...

So yes, lock holder preemption sucks. You would also want to limit the
immediate spin on owner.

Also; I really hate these random number spin-loop thresholds.

Is it at all possible to get feedback from your LPAR stuff that the vcpu
was preempted? Because at that point we can do something like:


	int vpc = vcpu_preempt_count();

	...

	for (;;) {

		/* the big spin loop */

		if (need_resched() || vpc != vcpu_preempt_count())
			/* bail */

	}


With a default implementation like:

	static inline int vcpu_preempt_count(void)
	{
		return 0;
	}

So the compiler can make it all go away.


But on virt muck it would stop spinning the moment the vcpu gets
preempted, which is the right moment I'm thinking.
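
To make the bail-out condition concrete, here is a rough userspace sketch
of the loop above. It is a simulation, not kernel code: every sim_* name is
hypothetical, vcpu_preempt_count() and need_resched() are stubbed to read
plain variables, and the "hypervisor" is just a callback that runs once per
spin iteration. How a real hypervisor would export such a counter is left
open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace simulation only. Every sim_* name is hypothetical; in the
 * kernel the counter would have to be provided by the hypervisor.
 */
static int sim_preemptions;	/* bumped each time "our" vCPU is preempted */
static bool sim_resched;	/* stand-in for the need_resched() flag */

static inline int vcpu_preempt_count(void) { return sim_preemptions; }
static inline bool need_resched(void) { return sim_resched; }

struct sim_node {
	int locked;		/* set to 1 when the lock is handed to us */
};

/* Advances the simulated outside world by one spin iteration. */
typedef void (*sim_tick_fn)(struct sim_node *node, unsigned long iter);

/* Returns true if the lock was handed over, false if we bailed out. */
static bool sim_spin(struct sim_node *node, sim_tick_fn tick)
{
	int vpc = vcpu_preempt_count();	/* snapshot taken before spinning */
	unsigned long iter = 0;

	assert(node != NULL && tick != NULL);

	for (;;) {
		if (node->locked)
			return true;

		/* Stop spinning as soon as we were preempted or must resched. */
		if (need_resched() || vpc != vcpu_preempt_count())
			return false;

		tick(node, ++iter);
	}
}

/* Scenario 1: the lock is handed over on iteration 3. */
static void sim_tick_handoff(struct sim_node *node, unsigned long iter)
{
	if (iter == 3)
		node->locked = 1;
}

/* Scenario 2: the hypervisor preempts our vCPU on iteration 2. */
static void sim_tick_preempt(struct sim_node *node, unsigned long iter)
{
	(void)node;
	if (iter == 2)
		sim_preemptions++;
}
```

Calling sim_spin() with sim_tick_handoff ends with the lock acquired;
calling it with sim_tick_preempt ends with a bail-out because the counter
no longer matches the snapshot. No spin threshold is involved in either
case.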