Re: [PATCH] virtio_blk: set the default scheduler to none

From: Paolo Bonzini
Date: Fri Dec 08 2023 - 06:07:55 EST


On Fri, Dec 8, 2023 at 6:54 AM Li Feng <fengli@xxxxxxxxxx> wrote:
>
>
> Hi,
>
> I have run all the I/O patterns on another host.
> Notes:
> q1 means fio iodepth = 1
> j1 means fio num jobs = 1
>
> VCPU = 4, VMEM = 2GiB, fio using directio.
>
> Most of the results are better than with the deadline scheduler, and some are worse.

I think this analysis is a bit simplistic. In particular:

At low queue depths the improvements are relatively small, and the worst
regressions also show up there:

workload | deadline | none | change
4k-randread-q1-j1 | 12325 | 13356 | 8.37%
256k-randread-q1-j1 | 1865 | 1883 | 0.97%
4k-randwrite-q1-j1 | 9923 | 10163 | 2.42%
256k-randwrite-q1-j1 | 2762 | 2833 | 2.57%
4k-read-q1-j1 | 21499 | 22223 | 3.37%
256k-read-q1-j1 | 1919 | 1951 | 1.67%
4k-write-q1-j1 | 10120 | 10262 | 1.40%
256k-write-q1-j1 | 2779 | 2744 | -1.26%
4k-randread-q1-j2 | 24238 | 25478 | 5.12%
256k-randread-q1-j2 | 3656 | 3649 | -0.19%
4k-randwrite-q1-j2 | 17096 | 18112 | 5.94%
256k-randwrite-q1-j2 | 5188 | 4914 | -5.28%
4k-read-q1-j2 | 36890 | 31768 | -13.88%
256k-read-q1-j2 | 3708 | 4028 | 8.63%
4k-write-q1-j2 | 17786 | 18519 | 4.12%
256k-write-q1-j2 | 4756 | 5035 | 5.87%

(I ran a paired t-test and it confirms that the improvements overall
are not statistically significant).
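
For reference, a paired t-test like that can be reproduced with a few
lines of Python; this is only a sketch, assuming the second and third
columns of the q1 table above are the deadline and "none" IOPS figures
(whether to test raw IOPS or per-row relative change is a judgment call):

from scipy import stats

# q1 rows from the table above (assumed order: deadline, then "none")
deadline = [12325, 1865, 9923, 2762, 21499, 1919, 10120, 2779,
            24238, 3656, 17096, 5188, 36890, 3708, 17786, 4756]
none_sched = [13356, 1883, 10163, 2833, 22223, 1951, 10262, 2744,
              25478, 3649, 18112, 4914, 31768, 4028, 18519, 5035]

t, p = stats.ttest_rel(none_sched, deadline)
print(f"t = {t:.2f}, p = {p:.3f}")  # p >= 0.05 -> not significant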

Small-block, high-queue-depth I/O is where the improvements are clearly
significant, but even there the deadline scheduler still seems to help in
the j2 cases:

workload | deadline | none | change
4k-randread-q128-j1 | 204739 | 319066 | 55.84%
4k-randwrite-q128-j1 | 137400 | 152081 | 10.68%
4k-read-q128-j1 | 158806 | 345269 | 117.42%
4k-write-q128-j1 | 47576 | 209236 | 339.79%
4k-randread-q128-j2 | 390090 | 577300 | 47.99%
4k-randwrite-q128-j2 | 143373 | 140560 | -1.96%
4k-read-q128-j2 | 399500 | 409857 | 2.59%
4k-write-q128-j2 | 175756 | 159109 | -9.47%

At larger block sizes, even the high-queue-depth results are highly
variable. There are clear improvements for sequential reads, but not so
much for everything else:

workload | deadline | none | change
256k-randread-q128-j1 | 24257 | 22851 | -5.80%
256k-randwrite-q128-j1 | 9353 | 9233 | -1.28%
256k-read-q128-j1 | 18918 | 23710 | 25.33%
256k-write-q128-j1 | 9199 | 9337 | 1.50%
256k-randread-q128-j2 | 21992 | 23437 | 6.57%
256k-randwrite-q128-j2 | 9423 | 9314 | -1.16%
256k-read-q128-j2 | 19360 | 21467 | 10.88%
256k-write-q128-j2 | 9292 | 9293 | 0.01%

I would focus on small I/O with varying queue depths, to understand at
which point the performance starts to improve; a queue depth of 128 may
not be representative of common usage, especially high-queue-depth
*sequential* access, which is where the biggest effects are visible.
Maybe you can look at improving the scheduler instead?
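
Something along these lines would show where the curves start to diverge
(just a sketch, not something I have run; the device name and runtime are
placeholders, and the qN/jN shorthand maps to fio's iodepth/numjobs):

import subprocess

DEV = "vdb"  # placeholder virtio-blk test device

for sched in ("mq-deadline", "none"):
    # select the I/O scheduler for the test device
    with open(f"/sys/block/{DEV}/queue/scheduler", "w") as f:
        f.write(sched)
    for qd in (1, 2, 4, 8, 16, 32, 64, 128):    # "qN" in the tables above
        subprocess.run([
            "fio", "--name=sweep", f"--filename=/dev/{DEV}",
            "--rw=randread", "--bs=4k", "--direct=1",
            "--ioengine=libaio", f"--iodepth={qd}",
            "--numjobs=1",                       # "jN" in the tables above
            "--runtime=30", "--time_based", "--group_reporting",
        ], check=True)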

Paolo