Re: [patch 0/4] [RFC] Another proportional weight IO controller

From: Fabio Checconi
Date: Thu Nov 13 2008 - 08:47:33 EST


Hi,

> From: Nauman Rafique <nauman@xxxxxxxxxx>
> Date: Wed, Nov 12, 2008 01:20:13PM -0800
>
...
> >> CFQ can be trivially
> >> modified to do proportional division (i.e give time slices in
> >> proportion to weight instead of priority).
> >> And such a solution would
> >> avoid idleness problem like the one you mentioned above.
> >
> > Can you elaborate a little on how you get around the idleness problem?
> > If you don't create idleness, then if two tasks in two cgroups are doing
> > sequential IO, they might simply get into lockstep and we will not achieve
> > any differentiated service proportionate to their weight.
>
> I was thinking of a more cfq-like solution for proportional division
> at the elevator level (i.e. not a token based solution). There are two
> options for proportional bandwidth division at elevator level: 1)
> change the size of the time slice in proportion to the weights or 2)
> allocate equal time slice each time but allocate more slices to cgroup
> with more weight. For (2), we can actually keep track of time taken to
> serve requests and allocate time slices in such a way that the actual
> disk time is proportional to the weight. We can adopt a fair-queuing
> (http://lkml.org/lkml/2008/4/1/234) like approach for this if we want
> to go that way.
>
> I am not sure if the solutions mentioned above will have the lockstep
> problem you mentioned above or not. Since we are allocating time
> slices, and would have anticipation built in (just like cfq), we would
> have some level of idleness. But this idleness can be predicted based
> on a thread behavior.

If I understand correctly, the problem may arise whenever you
have to deal with *synchronous* I/O, where the streams of requests
generated by tasks may not appear continuously backlogged (while the
algorithm used to distribute bandwidth implicitly assumes that they
are, as in the cfq case).

A cfq-like solution with idling enabled should not, AFAIK, suffer from
this problem, as it creates backlog for the process being anticipated.
But anticipation is not always used: cfq currently disables it for
SSDs and in other cases where it may hurt performance (e.g., NCQ drives
in the presence of seeky loads). So, in these cases, something still
needs to be done if we want a proportional bandwidth distribution
without paying the extra cost of idling when it's not strictly
necessary.
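
To make the lockstep argument concrete, here is a toy simulation. It is
not cfq code, just a sketch under simplifying assumptions: every task is
synchronous (one request outstanding, the next one issued a short think
time after the previous completes), every request costs the same service
time, and the scheduler picks the eligible task with the smallest
weight-scaled service received so far. The names simulate(), service and
think are made up for illustration.

```python
def simulate(weights, idle, n_requests=3000, service=1.0, think=0.1):
    """Toy model of weighted fair sharing among synchronous tasks.

    weights : per-task weights (hypothetical, higher = larger share)
    idle    : if True, the disk may wait for a task whose next request
              has not arrived yet (a crude stand-in for anticipation);
              if False, only tasks with a pending request are eligible
    Returns the disk time given to each task.
    """
    n = len(weights)
    ready_at = [0.0] * n   # time at which each task's next request arrives
    vtime = [0.0] * n      # service received so far, divided by weight
    served = [0.0] * n     # disk time actually given to each task
    now = 0.0

    for _ in range(n_requests):
        if idle:
            # Every task is a candidate; we are willing to wait for it.
            candidates = list(range(n))
        else:
            # Work-conserving: only currently backlogged tasks count.
            candidates = [i for i in range(n) if ready_at[i] <= now]
            if not candidates:
                # Nothing pending at all: jump to the next arrival.
                now = min(ready_at)
                candidates = [i for i in range(n) if ready_at[i] <= now]

        # Smallest weight-scaled service first; ties go to the lower index.
        pick = min(candidates, key=lambda i: (vtime[i], i))

        now = max(now, ready_at[pick])   # idle until the request shows up
        now += service                   # serve one request
        served[pick] += service
        vtime[pick] += service / weights[pick]
        ready_at[pick] = now + think     # synchronous: next request comes later

    return served


if __name__ == "__main__":
    for idle in (False, True):
        a, b = simulate([2, 1], idle=idle)
        print("idle=%s: task A %.0f, task B %.0f" % (idle, a, b))
```

With weights 2:1 and idling disabled, the task that was just served is
never backlogged when the next decision is taken, so the two tasks simply
alternate and the split comes out close to 1:1 regardless of the weights.
With idling enabled the split is close to 2:1, at the price of leaving
the disk idle for the think time whenever the same task is served twice
in a row.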