[PATCH] io_uring: optimize buffered random writes

From: luhongfei
Date: Wed Apr 19 2023 - 05:22:57 EST


The buffered random write performance of io_uring is poor for the
following reason:
By default, io_sq_thread issues a write request via io_issue_sqe()
with IO_URING_F_NONBLOCK set. A buffered write cannot complete
without blocking, so it fails with -EAGAIN and the request is punted
to an iou-wrk worker, where io_wq_submit_work() calls io_issue_sqe()
again, with issue_flags set to IO_URING_F_UNLOCKED | IO_URING_F_IOWQ,
to complete the write. This asynchronous detour costs performance.
This patch checks whether a request is a buffered (non-IOCB_DIRECT)
write; if so, io_sq_thread calls io_issue_sqe(req, 0) to complete the
request inline instead of punting it to an iou-wrk worker.

Performance results:
With fio, the following results were obtained at a queue depth of 8
and a 4k block size:

random writes:
              without patch   with patch   libaio     psync
iops:         287k            560k         248k       324k
bw:           1123MB/s        2188MB/s     970MB/s    1267MB/s
clat:         52760ns         69918ns      28405ns    2109ns
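
The fio job file is not included; a command along these lines matches
the parameters above for the io_uring case (the file path and size
are illustrative, and sqthread_poll is assumed since io_sq_thread is
on the issue path; the libaio/psync columns use the corresponding
ioengines):

  fio --name=randwrite --ioengine=io_uring --sqthread_poll=1 \
      --rw=randwrite --bs=4k --iodepth=8 --direct=0 \
      --size=4G --filename=/mnt/test/fio.dat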

Signed-off-by: luhongfei <luhongfei@xxxxxxxx>
---
io_uring/io_uring.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
mode change 100644 => 100755 io_uring/io_uring.c

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 4a865f0e85d0..64bb91beb4d6
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2075,8 +2075,23 @@ static inline void io_queue_sqe(struct io_kiocb *req)
 	__must_hold(&req->ctx->uring_lock)
 {
 	int ret;
+	bool is_write;
 
-	ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
+	switch (req->opcode) {
+	case IORING_OP_WRITEV:
+	case IORING_OP_WRITE_FIXED:
+	case IORING_OP_WRITE:
+		is_write = true;
+		break;
+	default:
+		is_write = false;
+		break;
+	}
+
+	if (!is_write || (req->rw.kiocb.ki_flags & IOCB_DIRECT))
+		ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
+	else
+		ret = io_issue_sqe(req, 0);
 
 	/*
 	 * We async punt it if the file wasn't marked NOWAIT, or if the file
--
2.39.0