Re: [PATCH v7] io_uring: Statistics of the true utilization of sq threads.

From: Jens Axboe
Date: Thu Jan 18 2024 - 14:34:35 EST


On 1/18/24 12:30 AM, Xiaobing Li wrote:
> diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
> index 976e9500f651..24a7452ed98e 100644
> --- a/io_uring/fdinfo.c
> +++ b/io_uring/fdinfo.c
> @@ -64,6 +64,7 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
> unsigned int sq_shift = 0;
> unsigned int sq_entries, cq_entries;
> int sq_pid = -1, sq_cpu = -1;
> + long long sq_total_time = 0, sq_work_time = 0;
> bool has_lock;
> unsigned int i;
>
> @@ -147,10 +148,17 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
>
> sq_pid = sq->task_pid;
> sq_cpu = sq->sq_cpu;
> + struct rusage r;
> +
> + getrusage(sq->thread, RUSAGE_SELF, &r);
> + sq_total_time = r.ru_stime.tv_sec * 1000000 + r.ru_stime.tv_usec;
> + sq_work_time = sq->work_time;
> }

I guess getrusage() is fine here, though I would probably just grab it
directly.

> diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
> index 65b5dbe3c850..f3e9fda72400 100644
> --- a/io_uring/sqpoll.c
> +++ b/io_uring/sqpoll.c
> @@ -251,6 +251,9 @@ static int io_sq_thread(void *data)
> }
>
> cap_entries = !list_is_singular(&sqd->ctx_list);
> + ktime_t start, diff;
> +
> + start = ktime_get();
> list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
> int ret = __io_sq_thread(ctx, cap_entries);

But why on earth is this part then not doing getrusage() as well?


> diff --git a/io_uring/sqpoll.h b/io_uring/sqpoll.h
> index 8df37e8c9149..c14c00240443 100644
> --- a/io_uring/sqpoll.h
> +++ b/io_uring/sqpoll.h
> @@ -16,6 +16,7 @@ struct io_sq_data {
> pid_t task_pid;
> pid_t task_tgid;
>
> + long long work_time;
> unsigned long state;
> struct completion exited;
> };

Probably just make that a u64.

As Pavel mentioned, I think we really need to consider if fdinfo is the
appropriate API for this. It's fine if you're running stuff directly and
you're just curious, but it's a very cumbersome API in general as you
need to know the pid of the task holding the ring, the fd of the ring,
and then you can get it as a textual description. If this is something
that is deemed useful, would it not make more sense to make it
programmatically available in addition, or even exclusively?
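
To make the point about cumbersomeness concrete, here is a rough userspace
sketch of what a consumer would have to do with the fdinfo approach: build
the /proc path from a known pid and ring fd, read the text, and scrape the
two counters back out. The field names "SqTotalTime" and "SqWorkTime" are
assumptions for illustration (the seq_printf() hunk is not quoted above),
and the helper names are hypothetical, not an existing API.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Read /proc/<pid>/fdinfo/<fd> into buf (NUL-terminated).
 * The caller must already know both the pid and the ring fd.
 * Returns 0 on success, -1 on failure.
 */
static int read_ring_fdinfo(pid_t pid, int fd, char *buf, size_t len)
{
	char path[64];
	size_t n;
	FILE *f;

	if (!buf || len == 0)
		return -1;
	snprintf(path, sizeof(path), "/proc/%d/fdinfo/%d", (int)pid, fd);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, len - 1, f);
	buf[n] = '\0';
	fclose(f);
	return 0;
}

/*
 * Find a line starting with "<key>:" and parse the unsigned decimal
 * value after it. Returns 0 on success, -1 if the key is missing or
 * has no numeric value on the same line.
 */
static int fdinfo_get_u64(const char *text, const char *key,
			  unsigned long long *out)
{
	const char *line = text;
	size_t klen;

	assert(text && key && out);
	klen = strlen(key);
	while (line && *line) {
		if (!strncmp(line, key, klen) && line[klen] == ':') {
			const char *val = line + klen + 1;

			/* skip blanks only, so we never run onto the next line */
			while (*val == ' ' || *val == '\t')
				val++;
			if (!isdigit((unsigned char)*val))
				return -1;
			*out = strtoull(val, NULL, 10);
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Fraction of the thread's total time that was spent doing work. */
static double sq_utilization(unsigned long long work,
			     unsigned long long total)
{
	return total ? (double)work / (double)total : 0.0;
}
```

A caller would chain read_ring_fdinfo(), two fdinfo_get_u64() lookups and
sq_utilization(), which is a fair amount of string handling just to get two
integers out of the kernel.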

--
Jens Axboe