Re: [BUG] sched_setattr() SCHED_DEADLINE hangs system

From: Michael Kerrisk (man-pages)
Date: Tue May 13 2014 - 08:46:54 EST


On 05/13/2014 02:11 PM, Juri Lelli wrote:
> On Tue, 13 May 2014 12:43:07 +0200
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
>> On Tue, May 13, 2014 at 11:57:49AM +0200, Juri Lelli wrote:
>>> static bool
>>> __checkparam_dl(const struct sched_attr *attr)
>>> {
>>> return attr && attr->sched_deadline != 0 &&
>>> (attr->sched_period == 0 ||
>>> - (s64)(attr->sched_period - attr->sched_deadline) >= 0) &&
>>> - (s64)(attr->sched_deadline - attr->sched_runtime ) >= 0 &&
>>> - attr->sched_runtime >= (2 << (DL_SCALE - 1));
>>> + (attr->sched_period >= attr->sched_deadline)) &&
>>> + (attr->sched_deadline >= attr->sched_runtime) &&
>>> + attr->sched_runtime >= (1ULL << DL_SCALE) &&
>>> + (attr->sched_deadline < (1ULL << 63) &&
>>> + attr->sched_period < (1ULL << 63));
>>> }
>>
>> Could we maybe rewrite that function to look less like a ioccc.org
>> submission?
>>
>
> Right.
>
>> static bool
>> __checkparam_dl(const struct sched_attr *attr)
>> {
>> if (!attr) /* possible at all? */
>> return false;
>>
>
> I'd say no, removed.
>
>> /* runtime <= deadline <= period */
>> if (attr->sched_period < attr->sched_deadline ||
>> attr->sched_deadline < attr->sched_runtime)
>> return false;
>>
>> /*
>> * Since we truncate DL_SCALE bits make sure we're at least that big,
>> * if runtime > (1 << DL_SCALE), so must the others be per the above
>> */
>> if (attr->sched_runtime <= (1ULL << DL_SCALE))
>> return false;
>>
>> /*
>> * Since we use the MSB for wrap-around and sign issues, make
>> * sure its not set, if period < 2^63, so must the others be.
>> */
>> if (attr->sched_period & (1ULL << 63))
>> return false;
>>
>> return true;
>> }
>>
>> Did I miss anything?

Thanks so much for suggesting the rewrite. It was rather impenetrable before.

> period can be 0, so we have to check also sched_deadline for MSB set.

Yes, I spotted that too as I tested the patch.

Anyway, I tested your version of the patch, Peter, and other than the
above problem, it works (and the error case I identified now fails with
EINVAL).

Cheers,

Michael


> ---
>>From e0c44d7614127f8dfbafc08c30681cb8098271e8 Mon Sep 17 00:00:00 2001
> From: Juri Lelli <juri.lelli@xxxxxxxxx>
> Date: Tue, 13 May 2014 10:15:59 +0200
> Subject: [PATCH] sched/deadline: restrict user params max value to 2^63 ns
>
> Michael Kerrisk noticed that creating SCHED_DEADLINE reservations
> with certain parameters (e.g, a runtime of something near 2^64 ns)
> can cause a system freeze for some amount of time.
>
> The problem is that in the interface we have
>
> u64 sched_runtime;
>
> while internally we need to have a signed runtime (to cope with
> budget overruns)
>
> s64 runtime;
>
> At the time we setup a new dl_entity we copy the first value in
> the second. The cast turns out with negative values when
> sched_runtime is too big, and this causes the scheduler to go crazy
> right from the start.
>
> Moreover, considering how we deal with deadlines wraparound
>
> (s64)(a - b) < 0
>
> we also have to restrict acceptable values for sched_{deadline,period}.
>
> This patch fixes the thing checking that user parameters are always
> below 2^63 ns (still large enough for everyone).
>
> It also rewrites other conditions that we check, since in
> __checkparam_dl we don't have to deal with deadline wraparounds
> and what we have now erroneously fails when the difference between
> values is too big.
>
> Reported-by: Michael Kerrisk <mtk.manpages@xxxxxxxxx>
> Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Signed-off-by: Juri Lelli <juri.lelli@xxxxxxxxx>
> ---
> kernel/sched/core.c | 37 ++++++++++++++++++++++++++++++-------
> 1 file changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d9d8ece..682a986 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3188,17 +3188,40 @@ __getparam_dl(struct task_struct *p, struct sched_attr *attr)
> * We ask for the deadline not being zero, and greater or equal
> * than the runtime, as well as the period of being zero or
> * greater than deadline. Furthermore, we have to be sure that
> - * user parameters are above the internal resolution (1us); we
> - * check sched_runtime only since it is always the smaller one.
> + * user parameters are above the internal resolution of 1us (we
> + * check sched_runtime only since it is always the smaller one) and
> + * below 2^63 ns (we have to check both sched_deadline and
> + * sched_period, as the latter can be zero).
> */
> static bool
> __checkparam_dl(const struct sched_attr *attr)
> {
> - return attr && attr->sched_deadline != 0 &&
> - (attr->sched_period == 0 ||
> - (s64)(attr->sched_period - attr->sched_deadline) >= 0) &&
> - (s64)(attr->sched_deadline - attr->sched_runtime ) >= 0 &&
> - attr->sched_runtime >= (2 << (DL_SCALE - 1));
> + /* deadline != 0 */
> + if (attr->sched_deadline == 0)
> + return false;
> +
> + /*
> + * Since we truncate DL_SCALE bits, make sure we're at least
> + * that big.
> + */
> + if (attr->sched_runtime < (1ULL << DL_SCALE))
> + return false;
> +
> + /*
> + * Since we use the MSB for wrap-around and sign issues, make
> + * sure it's not set (mind that period can be equal to zero).
> + */
> + if (attr->sched_deadline & (1ULL << 63) ||
> + attr->sched_period & (1ULL << 63))
> + return false;
> +
> + /* runtime <= deadline <= period (if period != 0) */
> + if ((attr->sched_period != 0 &&
> + attr->sched_period < attr->sched_deadline) ||
> + attr->sched_deadline < attr->sched_runtime)
> + return false;
> +
> + return true;
> }
>
> /*
>


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/