Re: [RFC PATCH 0/3] directed yield for Pause Loop Exiting

From: Balbir Singh
Date: Fri Dec 10 2010 - 00:03:59 EST


* Rik van Riel <riel@xxxxxxxxxx> [2010-12-02 14:41:29]:

> When running SMP virtual machines, it is possible for one VCPU to be
> spinning on a spinlock, while the VCPU that holds the spinlock is not
> currently running, because the host scheduler preempted it to run
> something else.
>
> Both Intel and AMD CPUs have a feature that detects when a virtual
> CPU is spinning on a lock and will trap to the host.
>
> The current KVM code sleeps for a bit whenever that happens, which
> results in eg. a 64 VCPU Windows guest taking forever and a bit to
> boot up. This is because the VCPU holding the lock is actually
> running and not sleeping, so the pause is counter-productive.
>
> In other workloads a pause can also be counter-productive, with
> spinlock detection resulting in one guest giving up its CPU time
> to the others. Instead of spinning, it ends up simply not running
> much at all.
>
> This patch series aims to fix that, by having a VCPU that spins
> give the remainder of its timeslice to another VCPU in the same
> guest before yielding the CPU - one that is runnable but got
> preempted, hopefully the lock holder.
>
> Scheduler people, please flame me with anything I may have done
> wrong, so I can do it right for a next version :)
>

This is a good problem statement. There are a few other things to
consider as well:

1. If a hard limit feature is enabled underneath, donating the
timeslice would probably not make much sense.
2. The implicit assumption is that spinning is bad, but for locks
held for short durations that assumption does not hold. I presume
from the problem statement above that the h/w decides when to
pause, but as you suggest above, that detection is not always correct.
3. With respect to donating timeslices, don't scheduler cgroups
and job isolation address that problem today?
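
To make point 1 concrete, here is a toy user-space sketch (Python, not
kernel code) of the target selection the series describes: a spinning
VCPU looks for another VCPU in the same guest that is runnable but not
currently running. Every name here (Vcpu, pick_yield_target,
guest_throttled) is made up for illustration and does not come from the
patches; the guest_throttled flag is only my assumption about how a hard
limit might be taken into account.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Vcpu:
    vcpu_id: int
    guest_id: int
    runnable: bool  # wants CPU time
    running: bool   # currently on a physical CPU


def pick_yield_target(spinner: Vcpu,
                      vcpus: Sequence[Vcpu],
                      guest_throttled: bool) -> Optional[Vcpu]:
    """Return a VCPU to donate the rest of the timeslice to, or None.

    Hypothetical policy: if the guest has hit a hard limit, donating
    time inside the guest is assumed to be pointless, so give up.
    """
    if guest_throttled:
        return None
    for v in vcpus:
        if v is spinner:
            continue
        if v.guest_id != spinner.guest_id:
            continue  # only donate within the same guest
        if v.runnable and not v.running:
            return v  # preempted sibling, hopefully the lock holder
    return None
```

Returning None stands for falling back to a plain yield. The sketch says
nothing about point 2, i.e. whether the spin was long enough to justify
yielding in the first place.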

--
Three Cheers,
Balbir