Re: [PATCH 9/9] oom: give the dying task a higher priority

From: Minchan Kim
Date: Wed Jun 16 2010 - 11:31:34 EST


On Wed, Jun 16, 2010 at 08:36:29PM +0900, KOSAKI Motohiro wrote:
>
> From: Luis Claudio R. Goncalves <lclaudio@xxxxxxxx>
>
> In a system under heavy load it was observed that even after the
> oom-killer selects a task to die, the task may take a long time to die.
>
> Right after sending a SIGKILL to the task selected by the oom-killer
> this task has its priority increased so that it can exit() soon,
> freeing memory. That is accomplished by:
>
> /*
> * We give our sacrificial lamb high priority and access to
> * all the memory it needs. That way it should be able to
> * exit() and clear out its resources quickly...
> */
> p->rt.time_slice = HZ;
> set_tsk_thread_flag(p, TIF_MEMDIE);
>
> It sounds plausible to give the dying task an even higher priority, so
> that it is scheduled sooner and frees the desired memory. Using
> SCHED_FIFO:1, the lowest RT priority, was suggested on LKML so that
> this task won't interfere with any running RT task.
>
> If the dying task is already an RT task, leave it untouched.
> Another good suggestion, implemented here, was to avoid boosting the
> dying task priority in case of mem_cgroup OOM.
>
> Signed-off-by: Luis Claudio R. Goncalves <lclaudio@xxxxxxxx>
> Cc: Minchan Kim <minchan.kim@xxxxxxxxx>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> ---
> mm/oom_kill.c | 38 +++++++++++++++++++++++++++++++++++---
> 1 files changed, 35 insertions(+), 3 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 7e9942d..1ecfc7a 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -82,6 +82,28 @@ static bool has_intersects_mems_allowed(struct task_struct *tsk,
> #endif /* CONFIG_NUMA */
>
> /*
> + * If this is a system OOM (not a memcg OOM) and the task selected to be
> + * killed is not already running at high (RT) priorities, speed up the
> + * recovery by boosting the dying task to the lowest FIFO priority.
> + * That helps with the recovery and avoids interfering with RT tasks.
> + */
> +static void boost_dying_task_prio(struct task_struct *p,
> + struct mem_cgroup *mem)
> +{
> + struct sched_param param = { .sched_priority = 1 };
> +
> + if (mem)
> + return;
> +
> + if (rt_task(p)) {
> + p->rt.time_slice = HZ;
> + return;

I have had a question about this for a long time.
If we change rt.time_slice _without_ calling setscheduler, does it have any effect?
I mean, does the scheduler pick this task sooner than other normal tasks?

> + }
> +
> + sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
> +}
> +
> +/*
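
For anyone following along, the policy described in the changelog can be
sketched in userspace roughly as below. This is only an illustration under
stated assumptions, not the kernel code: policy_is_rt(), should_boost() and
boost_pid() are made-up names, and the memcg_oom flag stands in for the
"mem" argument of boost_dying_task_prio(). Userspace cannot touch
rt.time_slice, so that part of the patch is left out.

```c
#include <errno.h>
#include <sched.h>
#include <stdbool.h>
#include <sys/types.h>

/* Fallback in case the libc headers hide this Linux-specific flag. */
#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Hypothetical helper: true if the policy is one of the RT policies. */
static bool policy_is_rt(int policy)
{
	/* sched_getscheduler() may report the reset-on-fork flag too. */
	policy &= ~SCHED_RESET_ON_FORK;
	return policy == SCHED_FIFO || policy == SCHED_RR;
}

/*
 * Hypothetical helper mirroring the decision in the changelog:
 * boost only on a system OOM (not a memcg OOM), and only when the
 * victim is not already an RT task.
 */
static bool should_boost(int policy, bool memcg_oom)
{
	if (memcg_oom)
		return false;
	return !policy_is_rt(policy);
}

/*
 * Userspace analogue of the boost: move the task to SCHED_FIFO at the
 * lowest RT priority (1). Returns 0 on success or when no boost is
 * wanted, and a negative errno value on failure.
 */
static int boost_pid(pid_t pid, bool memcg_oom)
{
	struct sched_param param = { .sched_priority = 1 };
	int policy = sched_getscheduler(pid);

	if (policy < 0)
		return -errno;
	if (!should_boost(policy, memcg_oom))
		return 0;
	if (sched_setscheduler(pid, SCHED_FIFO, &param) < 0)
		return -errno;
	return 0;
}
```

Calling boost_pid(pid, false) on a normal task needs CAP_SYS_NICE (or a
suitable RLIMIT_RTPRIO) and otherwise fails with -EPERM; a pid of 0 means
the calling process.
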
--
Kind regards,
Minchan Kim