Re: OFFTOPIC: Regarding NT vs Linux

Michael Weller (eowmob@exp-math.uni-essen.de)
Tue, 23 Sep 1997 15:11:59 +0200 (MESZ)


On Tue, 23 Sep 1997, John Kodis wrote:

> > On the other hand, [touching each page in the mmap'd file space]
> > will first pagefault as fast as possible, allowing the drive to get
> > all pages in one go, and then write the data from memory.
>
> ... unless the file is too big to fit into memory, in which case it
> gets paged in, swapped out, and paged back in again -- highly
> non-optimal.

Well, but what else would a copy_file do in this situation? Since the file
does not fit into memory, hohum, it has to read it again and copy it to
the network adapter. It would be nice if the SCSI adapter (we are not
talking about anything else, are we?) could write the data directly to the
network interface card. This is what I expect to happen in a workstation.
But I doubt it will ever be possible with PC hardware. (Hmm, but then,
with a memory-mapped network card and a busmaster... why not?)

So, the only advantage a copy_file can give is a sensible read_ahead on
the disk, using fewer buffers (optimally one buffer) for the data.

I would expect the pagefault overhead to be hidden by disk and net
latency, at least with an optimal read_ahead. We'll only get a page_fault
when the disk is too slow for read_ahead to stay ahead of us (or when
read_ahead is too aggressive and evicts something we have not yet read).

Optimizations in this area (if still required) would be of general benefit
for mmap. So just speed up mmap() and TCP/IP in general (fewer buffer
copies).
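
[ To make the mmap approach concrete, here is a minimal sketch of what a
user-space "copy file to socket" could look like. The function name
send_file_mmap is made up for illustration, and out_fd is assumed to be
an already connected socket (or any other writable descriptor). This is
only one way to do it, not a tuned implementation.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole file `path` to `out_fd` through a
 * read-only mapping. Returns the number of bytes written, or -1 on error. */
long send_file_mmap(const char *path, int out_fd)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    /* write() faults the pages in as it walks the mapping; there is no
     * intermediate user-space buffer. */
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(out_fd, map + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, just retry */
            munmap(map, len);
            return -1;
        }
        done += (size_t)n;
    }

    munmap(map, len);
    return (long)done;
}
```

Whether the page faults really disappear behind disk and net latency
depends on how well the kernel's read_ahead does on the mapping. ]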

THE ONLY possible advantage of a copy_file I can see is in the case where
it is possible to set up the IP headers on the network card and make the
host adapter write its data directly to the network card. That means the
adapter has to split the bytestream so that it fits nicely into
IP/ethernet packets, and someone has to add the necessary IP headers in
between. (Hmm. Either the network adapter is smart enough, or the kernel
has to stop the SCSI adapter, initiate a new IP packet and restart the
adapter => slow as hell.)

All in all, a big mess, if possible at all on PC hardware (and certainly
it will only work with certain SCSI/network card combinations). Easy on a
real workstation where the network card is as expensive as a PC, but
contains a CPU and does all IP protocol stuff.

[ I still like to tell the story of an RS6000 here routing between two
ethernets. We did a complete reinstall of the system and low-level
formatted the hard disk, but never power-cycled the machine, and did not
reconfigure the 2 ethernet cards during the reinstallation (except for the
final reboot). And the 2 ethernet cards continued to route between those
nets the whole time, even with no OS running on the machine at all. ]

I strongly doubt that NT will be able to do such an interface-to-interface
copy, esp. because the two proprietary, third-party drivers would need
to interact very closely (and because PC hardware is too dumb).

> The mmap trick looks like it's one technique that could be used to
> implement a Linux copy_file call, rather than a complete substitute
> for such a call.

I cannot follow you here. The only point is that copy_file() is simpler
to implement and optimize than the much more flexible mmap(). I think
this is actually the point here. It is a very limited function that can
be implemented with little effort and used as a marketing gimmick for a
very specific kind of application, thus perfect for NT.

copy_file() cannot create memory by magic. If the file does not fit into
memory it has to be reread from the disk again and again. There's no way
to avoid that. (Now a clever buffer strategy might decide to keep some
part of the file in memory and only reread the rest, instead of rereading
all of the file. Again, this applies to mmap too.)

Maybe madvise() needs to be improved (or implemented, if not yet done) to
set a nice buffer strategy for such a case. (Again, this would be of
general benefit.)
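
[ On a system that does provide the advice call (spelled madvise() on
the BSDs and on current Linux; whether a given kernel has it is an
assumption here), the hint for a read-once, front-to-back mapping could
be given like this. The wrapper name advise_sequential is made up.

```c
#define _DEFAULT_SOURCE   /* glibc: expose madvise() and MADV_* constants */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: tell the kernel that the mapping [map, map+len)
 * will be read once, in order. Returns 0 on success, -1 on error. */
int advise_sequential(void *map, size_t len)
{
    /* MADV_SEQUENTIAL: expect sequential page references, so pages may be
     * read ahead aggressively and freed soon after they are accessed. */
    return madvise(map, len, MADV_SEQUENTIAL);
}
```

It would be called right after mmap() and before the data is touched.
It is only advice; the kernel is free to ignore it. ]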

There's only one problem: virtual address space. You can do that only when
the file is smaller than 4GB (minus space for libs, the program, some
data and the kernel; probably just make it 2GB). But then you could still
mmap and munmap it piece by piece (losing some performance because the
read_ahead will get out of sync; well, that should be ok once every 2GB).
Also, such big files create quite a bunch of other problems which are not
solved on NT either.
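
[ A sketch of the mmap/munmap-in-pieces idea, for files that do not fit
into the address space in one go. The name send_file_windowed and the
window parameter are made up for illustration; window is assumed to be a
nonzero multiple of the page size, so that every mapping offset stays
page-aligned. On a 32-bit system the program would also need large-file
support (e.g. building with _FILE_OFFSET_BITS=64) for off_t to reach
beyond 2GB.

```c
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the first `size` bytes of `in_fd` to `out_fd`,
 * mapping at most `window` bytes at a time. `window` must be a nonzero
 * multiple of the page size. Returns 0 on success, -1 on error. */
int send_file_windowed(int in_fd, off_t size, int out_fd, size_t window)
{
    off_t off = 0;

    while (off < size) {
        /* Map the next window, or whatever is left of the file. */
        size_t len = window;
        if ((off_t)len > size - off)
            len = (size_t)(size - off);

        char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, in_fd, off);
        if (map == MAP_FAILED)
            return -1;

        size_t done = 0;
        while (done < len) {
            ssize_t n = write(out_fd, map + done, len - done);
            if (n < 0) {
                if (errno == EINTR)
                    continue;       /* interrupted, just retry */
                munmap(map, len);
                return -1;
            }
            done += (size_t)n;
        }

        /* Drop this window before mapping the next one. */
        munmap(map, len);
        off += (off_t)len;
    }
    return 0;
}
```

With a window of, say, 1GB the remap happens rarely enough that the
read_ahead hiccup at each boundary should not matter much. ]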

Just my thoughts,
Michael.

(eowmob@exp-math.uni-essen.de or eowmob@pollux.exp-math.uni-essen.de
Please do not use my vm or de0hrz1a accounts anymore. In case of real
problems reaching me try mat42b@spi.power.uni-essen.de instead.)