Re: [PATCH] kernel/time: Add hr_sleep syscall, a high-resolution sleep service

From: Andy Lutomirski
Date: Fri Jan 15 2021 - 13:53:54 EST


On Fri, Jan 15, 2021 at 10:14 AM Marco Faltelli
<marco.faltelli@xxxxxxxxxxx> wrote:
>
> hr_sleep is a new system call engineered for nanosecond time scale
> granularities.
> With respect to nanosleep, it uses a single value representation
> of the sleep period.
> hr_sleep achieves 15x improvement for microsecond scale timers
> w.r.t. nanosleep:

You need to explain what 15x means.

> +/**
> + * hr_sleep - a high-resolution sleep service for fine-grained timeouts
> + * @nanoseconds: the requested sleep period in nanoseconds
> + *
> + * Returns:
> + * 0 when the sleep request successfully terminated
> + * -EINVAL if a sleep period < 0 is requested
> + */
> +SYSCALL_DEFINE1(hr_sleep, long, nanoseconds)
> +{
> + DECLARE_WAIT_QUEUE_HEAD(the_queue);//here we use a private queue
> + control_record *control;
> + ktime_t ktime_interval;
> +
> + if (nanoseconds < 0)
> + return -EINVAL;
> +
> + if (nanoseconds == 0)
> + return 0;
> +
> + ktime_interval = ktime_set(0, nanoseconds);
> + control = (control_record *)((void *) current->stack + sizeof(struct thread_info));

What is this trying to do other than writing to memory you don't own?

In C, the way you create a structure on the stack is:

struct control_record control;

done, problem solved. (Also, on modern kernels, thread_info is not on
the stack, so your arithmetic makes no sense.

> + wait_event_interruptible(the_queue, control->awake == 1);

Once you start checking the return value (which is absolutely
necessary), you will discover that you need to do something sensible
when the return value is nonzero. At that point you will start
contemplating how to restart an interrupted wait, and you will have to
consider the merits of absolute vs relative timeouts and you might
discover that the complexity of nanosleep(2) is not an accident.

Sure, the seconds/nanoseconds split in nanosleep(2) is probably not
ideal, and a better API could probably be designed, but your hr_sleep
needs a lot of work to cover all the cases.


--Andy