Re: [RFC 0/3]block: An IOPS based ioscheduler

From: Vivek Goyal
Date: Sun Jan 15 2012 - 17:24:32 EST


On Wed, Jan 04, 2012 at 02:53:37PM +0800, Shaohua Li wrote:
> An IOPS based I/O scheduler
>
> Flash based storage has some different characteristics against rotate disk.
> 1. no I/O seek.
> 2. read and write I/O cost usually is much different.
> 3. Time which a request takes depends on request size.
> 4. High throughput and IOPS, low latency.
>
> CFQ iosched does well for rotate disk, for example fair dispatching, idle
> for sequential read. It also has optimization for flash based storage (for
> item 1 above), but overall it's not designed for flash based storage. It's
> a slice based algorithm. Since flash based storage request cost is very
> low, and drive has big queue_depth is quite popular now which makes
> dispatching cost even lower, CFQ's slice accounting (jiffy based)
> doesn't work well. CFQ doesn't consider above item 2 & 3.
>
> FIOPS (Fair IOPS) ioscheduler is trying to fix the gaps. It's IOPS based, so
> only targets for drive without I/O seek. It's quite similar like CFQ, but
> the dispatch decision is made according to IOPS instead of slice.

Hi Shaohua,

What problem are you trying to fix. If you just want to do provide
fairness in terms of IOPS instead of time, then I think existing code
should be easily modifiable instead of writing a new IO scheduler
altogether. It is just like switching the mode either based on tunable
or based on disk type.

If slice_idle is zero, we already have some code to effectively do
accounting in terms of IOPS. This is primarily for group level. We should
be able to extend it to queue level.

These are implementation details. I think what I am not able to understand
that what is the problem and what are you going to gain by doing
accounting in terms of IOPS.

Also, for flash based storage, isn't noop or deadline a good choice of
scheduler. Why do we want to come up with something which is CFQ like.
For this fast device any idling will kill the performance and if you
don't idle, then I think practically most of the workload will almost
become round robin kind of serving.

Thanks
Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/