Re: [RFC v1] add new io-scheduler to use cgroup on high-speed device

From: sanbai
Date: Thu Jun 06 2013 - 23:10:20 EST


On 2013/06/05 21:30, Vivek Goyal wrote:
> On Wed, Jun 05, 2013 at 10:09:31AM +0800, Robin Dong wrote:
> > We want to use blkio.cgroup on high-speed devices (like fusionio) for our mysql clusters.
> > After testing different io-schedulers, we found that cfq is too slow and deadline can't be used with cgroups.
>
> So why not enhance deadline to be able to be used with cgroups instead of
> coming up with a new scheduler?
I think if we add cgroup support into deadline, it will no longer be suitable to call it "deadline"... so a new io-scheduler with a new name should avoid confusing users.

> > So we developed a new io-scheduler: tpps (Tiny Parallel Proportion Scheduler). It dispatches requests
> > using only each group's individual weight and the total weight (proportion), so it is simple and efficient.
>
> Can you give more details? Do you idle? Idling kills performance. If not,
> then without idling how do you achieve performance differentiation.
We don't idle. When it comes to .elevator_dispatch_fn, we just compute a quota for every group:

quota = nr_requests - rq_in_driver;
group_quota = quota * group_weight / total_weight;

and dispatch 'group_quota' requests for the corresponding group. Therefore a high-weight group
will dispatch more requests than a low-weight group.
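
Roughly, the idea in the dispatch path is something like the sketch below. This
is only an illustration of the calculation above: the tpps_data/tpps_group
structures, their fields and the tpps_dispatch_one() helper are made-up names
for this example, not the actual patch.

/* illustrative fragment, not the real tpps code */
static int tpps_dispatch(struct request_queue *q, int force)
{
	struct tpps_data *td = q->elevator->elevator_data;
	struct tpps_group *grp;
	int quota, dispatched = 0;

	/* free slots left before the device queue is full */
	quota = td->nr_requests - td->rq_in_driver;
	if (quota <= 0)
		return 0;

	list_for_each_entry(grp, &td->group_list, node) {
		/* this group's proportional share of the free slots */
		int group_quota = quota * grp->weight / td->total_weight;

		/*
		 * Move up to group_quota requests of this group to the
		 * dispatch list; tpps_dispatch_one() returns 1 on success.
		 */
		while (group_quota-- > 0 && !list_empty(&grp->fifo))
			dispatched += tpps_dispatch_one(q, grp);
	}

	return dispatched;
}

As a numeric example (picking nr_requests=128 and rq_in_driver=32 purely for
illustration), quota = 96; with the four weights below (total 2800), test1
gets 96 * 1000 / 2800 = 34 dispatches in that round while test4 gets
96 * 400 / 2800 = 13.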

> > Test case: fusionio card, 4 cgroups, iodepth=512
> >
> > groupname  weight
> > test1        1000
> > test2         800
> > test3         600
> > test4         400
>
> What's the workload used for this?

> > Use tpps, the result is:
> >
> > groupname   iops   avg-rt(ms)   max-rt(ms)
> > test1      30220           16           54
> > test2      28261           18           56
> > test3      26333           19           69
> > test4      20152           25           87
> >
> > Use cfq, the result is:
> >
> > groupname   iops   avg-rt(ms)   max-rt(ms)
> > test1      16478           30          242
> > test2      13015           39          347
> > test3       9300           54          371
> > test4       5806           87          393
> How do the results look with cfq if this is run with slice_idle=0 and
> quantum=128 or higher?
>
> cfq idles on 3 things: the queue (cfqq), the service tree and the cfq group.
> slice_idle will disable idling on the cfqq but not on the service tree. If
> we provide a knob for that, then idling on the service tree can be disabled
> too and we will be left with group idling only, and then it should
> be much better.
I did the test again for cfq (slice_idle=0, quantum=128) and tpps:

cfq (slice_idle=0, quantum=128)
groupname   iops   avg-rt(ms)   max-rt(ms)
test1      16148           15          188
test2      12756           20          117
test3       9778           26          268
test4       6198           41          209

tpps
groupname   iops   avg-rt(ms)   max-rt(ms)
test1      17292           14           65
test2      15221           16           80
test3      12080           21           66
test4       7995           32           90

Looks like cfq is much better than before.

My fio script is:
[global]
direct=1
ioengine=libaio
#ioengine=psync
runtime=30
bs=4k
rw=randread
iodepth=256

filename=/dev/fioa
numjobs=2
#group_reporting

[read1]
cgroup=test1
cgroup_weight=1000

[read2]
cgroup=test2
cgroup_weight=800

[read3]
cgroup=test3
cgroup_weight=600

[read4]
cgroup=test4
cgroup_weight=400



> Thanks
> Vivek


--

Robin Dong
email: sanbai@xxxxxxxxxx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/