Re: [PATCH v2] io_uring: Statistics of the true utilization of sq threads.

From: Jens Axboe
Date: Wed Nov 08 2023 - 10:26:12 EST


On 11/8/23 1:07 AM, Xiaobing Li wrote:
> Since the sq thread has a while(1) structure, during this process, there
> may be a lot of time that is not processing IO but does not exceed the
> timeout period, therefore, the sqpoll thread will keep running and will
> keep occupying the CPU. Obviously, the CPU is wasted at this time. Our
> goal is to count the part of the time that the sqpoll thread actually
> processes IO, so as to reflect the part of the CPU it uses to process
> IO, which can be used to help improve the actual utilization of the CPU
> in the future.

There should be an explanation in here on what 'work' and 'total' mean.

> The test results are as follows:
> cat /proc/11440/fdinfo/6
> pos: 0
> flags: 02000002
> mnt_id: 16
> ino: 94449
> SqMask: 0xf
> SqHead: 1845170
> SqTail: 1845170
> CachedSqHead: 1845170
> CqMask: 0xf
> CqHead: 1845154
> CqTail: 1845154
> CachedCqTail: 1845154
> SQEs: 0
> CQEs: 0
> SqThread: -1
> SqThreadCpu: -1
> UserFiles: 1
> UserBufs: 0
> PollList:
> CqOverflowList:
> PID: 11440
> work: 18794
> total: 19123

These should go with the other Sq thread related ones, eg be SqWork and
SqTotal. They're counted in jiffies right now, which is a bit odd in terms
of being exposed, as you'd need to know the tick rate (HZ) to interpret
them. But probably not much of a concern, as work/total is really the
metric you care about. Maybe it'd be better to expose it as a percentage,
and get rid of total? Eg just have SqBusy: xx% be the output.
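
To make the percentage idea concrete, here's a rough userspace sketch of
the arithmetic. The helper names are made up for illustration and nothing
here is existing io_uring code; it only shows the work/total ratio being
reduced to a single SqBusy line, with total == 0 treated as 0% busy.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not an existing kernel function): turn the 'work'
 * and 'total' counters into a whole-number busy percentage. Returns 0 when
 * total is 0, so a thread that has not run yet does not divide by zero.
 */
static unsigned int sq_busy_percent(unsigned long work, unsigned long total)
{
	if (!total)
		return 0;
	/* Widen first so work * 100 cannot overflow a 32-bit long. */
	return (unsigned int)(((unsigned long long)work * 100) / total);
}

/* Format an fdinfo-style line into buf; returns snprintf()'s result. */
static int format_sq_busy(char *buf, size_t len,
			  unsigned long work, unsigned long total)
{
	return snprintf(buf, len, "SqBusy:\t%u%%\n",
			sq_busy_percent(work, total));
}
```

With the numbers from the test output above (work 18794, total 19123),
this comes out as 98%. Integer division truncates, which seems fine for
a coarse utilization figure.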

> diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
> index f04a43044d91..f0b79c533062 100644
> --- a/io_uring/fdinfo.c
> +++ b/io_uring/fdinfo.c
> @@ -213,6 +213,12 @@ __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
>
> }
>
> +	if (ctx->sq_data) {
> +		seq_printf(m, "PID:\t%d\n", task_pid_nr(ctx->sq_data->thread));
> +		seq_printf(m, "work:\t%lu\n", ctx->sq_data->work);
> +		seq_printf(m, "total:\t%lu\n", ctx->sq_data->total);
> +	}
> +

This doesn't work, it needs proper locking. See how we get the other sq
values.
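
As a rough illustration of the shape this needs, here's a userspace model
using pthreads. It is not kernel code: the struct, its fields and the
function name are all hypothetical, and the assumed pattern is to take
the lock without blocking, check that the thread still exists, copy the
values out, and only print them after the lock is dropped.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread data; not the kernel struct. */
struct sq_stats {
	pthread_mutex_t lock;
	bool thread_alive;	/* models "the sq thread pointer is non-NULL" */
	unsigned long work;
	unsigned long total;
};

/*
 * Snapshot the counters only if the lock can be taken without blocking and
 * the thread still exists. Returns false and leaves the outputs untouched
 * otherwise, so the caller can simply skip printing the fields.
 */
static bool sq_stats_snapshot(struct sq_stats *sq,
			      unsigned long *work, unsigned long *total)
{
	bool ok = false;

	if (pthread_mutex_trylock(&sq->lock) != 0)
		return false;
	if (sq->thread_alive) {
		*work = sq->work;
		*total = sq->total;
		ok = true;
	}
	pthread_mutex_unlock(&sq->lock);
	return ok;
}
```

The point is that the thread pointer and the counters are only touched
while the lock is held, so the thread can't exit underneath the reader.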

--
Jens Axboe