Re: [PATCH] Prevent immediate process rescheduling

From: Avi Kivity
Date: Sat Sep 19 2009 - 04:32:04 EST


On 09/18/2009 11:03 PM, Peter Zijlstra wrote:
On Fri, 2009-09-18 at 21:54 +0200, Ingo Molnar wrote:

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 652e8bd..4fad08f 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -353,11 +353,25 @@ static void __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 static struct sched_entity *__pick_next_entity(struct cfs_rq *cfs_rq)
 {
 	struct rb_node *left = cfs_rq->rb_leftmost;
+	struct sched_entity *se, *curr;

 	if (!left)
 		return NULL;

-	return rb_entry(left, struct sched_entity, run_node);
+	se = rb_entry(left, struct sched_entity, run_node);
+	curr = &current->se;
+
+	/*
+	 * Don't select the entity who just tried to schedule away
+	 * if there's another entity available.
+	 */
+	if (unlikely(se == curr && cfs_rq->nr_running > 1)) {
+		struct rb_node *next_node = rb_next(&curr->run_node);
+		if (next_node)
+			se = rb_entry(next_node, struct sched_entity, run_node);
+	}
+
+	return se;
 }
Really hate this change though, doesn't seem right to not pick the same
task again if it's runnable. Bad for cache footprint.

The scenario is quite common for stuff like:

    CPU0                                    CPU1

    set_task_state(TASK_INTERRUPTIBLE)

    if (cond)
        goto out;
                                            <--- ttwu()
    schedule();
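
To make the objection concrete, here is a minimal user-space sketch, not kernel code. It assumes the task that called schedule() is still runnable (it was woken by ttwu() first) and is back on the queue as the leftmost entry. The runqueue is modelled as an array sorted by vruntime instead of an rbtree, and the names toy_entity, toy_rq, pick_leftmost and pick_skip_current are invented for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct sched_entity; only the sort key matters here. */
struct toy_entity {
	unsigned long long vruntime;
};

/* Toy runqueue: entities kept sorted by ascending vruntime. */
struct toy_rq {
	struct toy_entity **sorted;
	size_t nr_running;
};

/* Original behaviour: always return the leftmost (smallest vruntime). */
static struct toy_entity *pick_leftmost(const struct toy_rq *rq)
{
	return rq->nr_running ? rq->sorted[0] : NULL;
}

/*
 * Patched behaviour: if the leftmost entity is the one that just called
 * schedule() and something else is queued, return its successor instead.
 */
static struct toy_entity *pick_skip_current(const struct toy_rq *rq,
					    const struct toy_entity *curr)
{
	struct toy_entity *se = pick_leftmost(rq);

	if (se && se == curr && rq->nr_running > 1)
		se = rq->sorted[1];
	return se;
}
```

With a runnable task A at the front and a task B behind it, pick_leftmost() keeps A on the CPU, while pick_skip_current() switches to B even though A never asked to yield. That is the extra, cache-unfriendly switch being objected to.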


I agree, yielding should be explicitly requested.

Also, on a heavily overcommitted box an undirected yield might take quite a long time to find the thread that's holding the lock. I think a yield_to() will be a lot better:

- we can pick one of the vcpus belonging to the same guest to improve the probability that the lock actually gets released
- we avoid an issue when the other vcpus are on different runqueues (in which case the current patch does nothing)
- we can fix the accounting by donating vruntime from current to the yielded-to vcpu
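
A rough user-space sketch of what the three points above could look like, purely as an illustration. No yield_to() exists in the patch under discussion, so everything here is hypothetical: toy_vcpu, pick_sibling and toy_yield_to are made-up names, and "donating vruntime" is read as moving the target earlier in virtual time while charging the donor the same amount.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a vcpu thread; not a kernel structure. */
struct toy_vcpu {
	int guest_id;                 /* which guest this vcpu belongs to */
	int runnable;                 /* nonzero if waiting for a CPU */
	unsigned long long vruntime;  /* smaller value runs sooner */
};

/*
 * Pick a runnable sibling vcpu of the same guest, or NULL if none.
 * The array may span several runqueues; the search does not care.
 */
static struct toy_vcpu *pick_sibling(struct toy_vcpu *vcpus, size_t n,
				     const struct toy_vcpu *self)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_vcpu *v = &vcpus[i];

		if (v != self && v->guest_id == self->guest_id && v->runnable)
			return v;
	}
	return NULL;
}

/*
 * Donate up to `delta` of virtual time from `from` to `to`. The target
 * moves earlier, the donor moves later by the same amount, so the sum
 * of the two vruntimes is unchanged.
 */
static void toy_yield_to(struct toy_vcpu *from, struct toy_vcpu *to,
			 unsigned long long delta)
{
	assert(from && to && from != to);
	if (delta > to->vruntime)
		delta = to->vruntime;	/* do not underflow the target */
	to->vruntime -= delta;
	from->vruntime += delta;
}
```

The spinning vcpu would call pick_sibling() and then toy_yield_to() on the result; vcpus of other guests are never chosen and are left untouched.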


--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
