Re: Sharing SCSI disks

Simon Shapiro (Shimon@i-Connect.Net)
Fri, 21 Mar 1997 09:53:48 -0800 (PST)


Hi David S. Miller; On 20-Mar-97 you wrote:
> From: alan@lxorguk.ukuu.org.uk (Alan Cox)
> Date: Thu, 20 Mar 1997 21:50:28 +0000 (GMT)
>
> The right solution is simply to add raw devices to the Linux
> kernel, it doesn't look too bad or alternatively to add a raw
> semantic to the existing ones (which is basically how you'd do
> either)
>
> I brought up this with Linus once, the conversation was in reference
> to how we thought we could do on database benchmarks etc. which is
> essentially all over raw block devices these days. (which we both
> agreed was entirely stupid, the kernel should be doing buffer caching,
> not some silly Oracle disk I/O layer) In any event, it is just
> essentially page flipping every request to dma out to the disk. The
> cpu never touches any of this stuff (in the kernel that is).

Stupid?! A certain degree of modesty never hurt anyone...
I am not here to defend Oracle, but on the issue of raw devices you
need to do some more homework. It is NOT just Oracle. I have been in
this field for many (over 20) years and, at times, an application comes
along which requires raw access to disk.

I could take the space and time to ``defend'' or explain such a case
but think it is better to try and generalize the case:

Buffered I/O, like any algorithm, makes certain assumptions and behaves
in a certain way as a result. It is purely arrogant of an O/S designer
to assume that in one algorithm he/she may cover all cases and all uses.
Were this true, one would not need several editors, several
implementations of database access, several networking protocols, etc.
The same applies to disk I/O: in most cases, buffered I/O does fine.
It is an excellent choice for virtual memory implementation, for file
system use, but computers have more uses than kernels and emacs :-)

For example, we are building a very complex, high speed distributed
database engine. It has the capacity for over 2,000,000,000 records,
concurrent access is guaranteed for 200,000 transactions per second
and can peak at 300,000 tps. The system is fully redundant,
symmetrically distributed and very, very finely tuned. For many reasons
I will not waste your time on, we NEEDED non-buffered, totally raw I/O.
Linux cannot (and, I am told, will not) do that. We are using FreeBSD
instead.

While Linux is a fine operating system, it cannot become a universal
platform for high load commercial processing, mainly because it lacks
certain essential features. Yes, these are ``boring'', not
intellectually stimulating features, but essential nonetheless.

Now you have my two cents' worth on the subject.

Simon