[PATCH 0/3] BFQ I/O Scheduler (second take)

From: Fabio Checconi
Date: Tue Nov 11 2008 - 07:49:50 EST


Hi Jens, hi all,

this is an updated version of the patchset introducing the BFQ (Budget
Fair Queueing) I/O scheduler. It is based on CFQ and was developed to
improve the predictability and the fairness of the service provided,
without penalizing the aggregate throughput.

This new version hopefully addresses the issues pointed out after the
first submission[1]:

1) it introduces a bitmap to notify bfq/cfq of priority changes;
2) it introduces a timeout to put an upper limit on the time a
task can take to consume its budget;
3) it supports hierarchical scheduling.

With the introduction of a budget timeout, each task is still assigned
a budget measured in number of sectors instead of amount of time,
but a budget is not served to exhaustion if it is taking too long
to complete. Budgets are still scheduled using a modified version
of WF2Q+.
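
To make the interaction between budget and timeout concrete, here is a
minimal sketch of the expiration check described above. It is not taken
from the patches: the struct, the field names and slot_check() are
hypothetical, and times are plain millisecond counters rather than
jiffies.

```c
/* Hypothetical per-queue state for one service slot (not BFQ's types). */
struct queue_slot {
    unsigned long budget_sectors;  /* budget assigned, in sectors */
    unsigned long served_sectors;  /* sectors served so far in this slot */
    unsigned long start_ms;        /* time at which the slot started */
    unsigned long timeout_ms;      /* budget timeout */
};

enum expire_reason { SLOT_ACTIVE, BUDGET_EXHAUSTED, BUDGET_TIMEOUT };

/*
 * A slot ends when the sector budget has been consumed or, failing
 * that, when the task has held the device for timeout_ms.
 */
static enum expire_reason slot_check(const struct queue_slot *q,
                                     unsigned long now_ms)
{
    if (q->served_sectors >= q->budget_sectors)
        return BUDGET_EXHAUSTED;
    if (now_ms - q->start_ms >= q->timeout_ms)
        return BUDGET_TIMEOUT;
    return SLOT_ACTIVE;
}
```

A slow (e.g. seeky) task would typically leave through the
BUDGET_TIMEOUT branch, a fast sequential one through BUDGET_EXHAUSTED.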

Using huge values for max_budget turns the scheduler into a pure
time-domain proportional share one, and the timeout controls the duration
of the time quanta.
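
As a rough illustration of why this happens, the sketch below estimates
how long one service slot lasts for a task with a given service rate.
The function and its parameters are assumptions made up for this
example, and the integer arithmetic is only approximate.

```c
/*
 * Approximate duration of one service slot, in ms, for a task that
 * transfers rate_sectors_per_ms while it is being served.
 * Hypothetical helper, not part of the patchset.
 */
static unsigned long slot_duration_ms(unsigned long budget_sectors,
                                      unsigned long rate_sectors_per_ms,
                                      unsigned long timeout_ms)
{
    unsigned long to_exhaust;

    if (rate_sectors_per_ms == 0)
        return timeout_ms;  /* no progress: only the timeout ends the slot */

    to_exhaust = budget_sectors / rate_sectors_per_ms;
    return to_exhaust < timeout_ms ? to_exhaust : timeout_ms;
}
```

With a small budget the slot length depends on the task's rate; once
the budget is large enough that no task can exhaust it within the
timeout, every slot lasts exactly timeout_ms, whatever the rate.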

As for the support for hierarchical scheduling, which is being
discussed so much these days, this scheduler does not add anything
new: it supports proportional share distribution of disk bandwidth
among full cgroup hierarchies. The mapping from requests to cgroups
is correct only in the case of synchronous reads.
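
For readers less familiar with hierarchical proportional share, the
following sketch computes the fraction of disk bandwidth a group would
receive, assuming every group in the tree is continuously backlogged.
The struct and bandwidth_share() are hypothetical and do not reflect
the data structures used in the patches.

```c
#include <stddef.h>

/* Hypothetical node of a cgroup-like hierarchy (illustration only). */
struct group {
    const struct group *parent;    /* NULL for the root */
    unsigned int weight;           /* weight of this group */
    unsigned int siblings_weight;  /* sum of weights of the parent's
                                      children, this group included */
};

/*
 * Share of the total bandwidth for g: at each level the group gets
 * weight / siblings_weight of what its parent receives.
 */
static double bandwidth_share(const struct group *g)
{
    double share = 1.0;

    for (; g != NULL && g->parent != NULL; g = g->parent)
        share *= (double)g->weight / (double)g->siblings_weight;
    return share;
}
```

For example, two top-level groups with weights 100 and 300 get 1/4 and
3/4 of the bandwidth; a child holding half of the weight inside the
first group ends up with 1/8.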

The overall diffstat is the following:

 block/Kconfig.iosched         |   25 +
 block/Makefile                |    1 +
 block/bfq-cgroup.c            |  743 +++++++++++++++
 block/bfq-ioc.c               |  375 ++++++++
 block/bfq-iosched.c           | 2010 +++++++++++++++++++++++++++++++++++++++++
 block/bfq-sched.c             |  950 +++++++++++++++++++
 block/bfq.h                   |  514 +++++++++++
 block/blk-ioc.c               |   27 +-
 block/cfq-iosched.c           |   10 +-
 fs/ioprio.c                   |    9 +-
 include/linux/cgroup_subsys.h |    6 +
 include/linux/iocontext.h     |   18 +-
 12 files changed, 4669 insertions(+), 19 deletions(-)

For a complete description of the algorithm and of its properties
please refer to [2] (where extensive experiments on older hardware
can also be found); for some performance measurements and
implementation details, see [3].

[1] http://lkml.org/lkml/2008/4/1/234
[2] http://algo.ing.unimo.it/people/paolo/disk_sched/
[3] http://feanor.sssup.it/~fabio/linux/bfq/

As usual, any kind of feedback will be greatly appreciated. Thanks
in advance.

Fabio Checconi, Paolo Valente