Fix disk statistics reporting on each disk (logical)

From: Nikolaiev, Mike (
Date: Wed Aug 30 2000 - 12:13:13 EST

OK... I cannot resist...

Working with large database systems,
(700MHz 32P, 32GB ram, 500-18GB 15K drives)(Well I think it is big)
We often configure many logical drives on a disk array controller.
Each of these are represented under Linux as a minor dev number.
We may have 50-70 logical drives on the system, each with 14 to 56 drives

In order to balance the database load across all the spindles evenly, we
monitor # of reads/sec; # of writes/sec; min,max, and avg latency (service
and avg # of outstanding I/O's (queue length) on a per logical drive basis.

If we find some drives overloaded and others lightly loaded we balance the

Will we have the capabilities stated above?
If not, is it very difficult to put these in the kernel at this point?
I would envision us supporting systems of this size on Linux next year on
the 2.4 kernel
and we really need this kind of data to tune the system...

Thanks, Mike...

Michael V. Nikolaiev
Director, Database Engineering
Compaq Computer Corporation
Tel: (281) 518-9928 Fax: (281) 514-8375

-----Original Message-----
From: Jens Axboe []
Sent: Wednesday, August 30, 2000 9:12 AM
To: Linus Torvalds
Cc: Bill Wilson;
Subject: Re: [PATCH] Fix disk statistic reporting to include all disks

On Tue, Aug 29 2000, Linus Torvalds wrote:
> Ho humm.. I actually kind of expected the statistics to be done in
> generic_make_request(), so that we'd have
> .....
> do {
> q = blk_get_queue(bh->b_rdev);
> if (!q) {
> ... no such device ..
> }
> + statistics(q, rw, bh->b_size);
> } while (q->make_request_fn(q, rw, bh)));
> ...
> and just remove the drive_stat_acct() thing completely.

Agreed, I've implemented this and ripped the stats out of kstat. Patch
coming soon :-)

We do then loose the ability to tell when we've started a new request,
now we just account all buffers as a new request. Not a big deal, IMHO.
But do you just want per-queue stats, and not care about minors at all?

> That way, we'd have a per-queue "this many reads, this many writes".
> The /proc code would have to find all the queues, but that shouldn't be
> too bad either. At least we wouldn't have the current silly disk_index()
> stuff (which is all done in a timing-critical routine).

No big deal, for now I just have the proc code doing blk_get_queue for
every major. That code is hardly performance critical ;-)

> Then, as we get saner and saner queues, we'll get saner and saner
> statistics automatically. The above should already be quite sane for IDE.

Rigth, for IDE we'd get a per hwgroup stat. Sane for SCSI, too.

* Jens Axboe <>
* SuSE Labs
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
Please read the FAQ at
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
Please read the FAQ at

This archive was generated by hypermail 2b29 : Thu Aug 31 2000 - 21:00:25 EST