Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenariosin PLE handler

From: Peter Zijlstra
Date: Mon Sep 24 2012 - 12:03:28 EST


On Mon, 2012-09-24 at 17:51 +0200, Avi Kivity wrote:
> On 09/24/2012 03:54 PM, Peter Zijlstra wrote:
> > On Mon, 2012-09-24 at 18:59 +0530, Raghavendra K T wrote:
> >> However Rik had a genuine concern in the cases where runqueue is not
> >> equally distributed and lockholder might actually be on a different run
> >> queue but not running.
> >
> > Load should eventually get distributed equally -- that's what the
> > load-balancer is for -- so this is a temporary situation.
>
> What's the expected latency? This is the whole problem. Eventually the
> scheduler would pick the lock holder as well, the problem is that it's
> in the millisecond scale while lock hold times are in the microsecond
> scale, leading to a 1000x slowdown.

Yeah I know.. Heisenberg's uncertainty applied to SMP computing becomes
something like accurate or fast, never both.

> If we want to yield, we really want to boost someone.

Now if only you knew which someone ;-) This non-modified guest nonsense
is such a snake pit.. but you know how I feel about all that.

> > We already try and favour the non running vcpu in this case, that's what
> > yield_to_task_fair() is about. If its still not eligible to run, tough
> > luck.
>
> Crazy idea: instead of yielding, just run that other vcpu in the thread
> that would otherwise spin. I can see about a million objections to this
> already though.

Yah.. you want me to list a few? :-) It would require synchronization
with the other cpu to pull its task -- one really wants to avoid it also
running it.

Do this at a high enough frequency and you're dead too.

Anyway, you can do this inside the KVM stuff, simply flip the vcpu state
associated with a vcpu thread and use the preemption notifiers to sort
things against the scheduler or somesuch.

> >> Do you think instead of using rq->nr_running, we could get a global
> >> sense of load using avenrun (something like avenrun/num_onlinecpus)
> >
> > To what purpose? Also, global stuff is expensive, so you should try and
> > stay away from it as hard as you possibly can.
>
> Spinning is also expensive. How about we do the global stuff every N
> times, to amortize the cost (and reduce contention)?

Nah, spinning isn't expensive, its a waste of time, similar end result
for someone who wants to do useful work though, but not the same cause.

Pick N and I'll come up with a scenario for which its wrong ;-)

Anyway, its an ugly problem and one I really want to contain inside the
insanity that created it (virt), lets not taint the rest of the kernel
more than we need to.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/