linux on netserver

David Mosberger-Tang (David.Mosberger@acm.org)
Thu, 29 Jan 1998 10:36:20 -0800


I've run into some strange performance problems while trying to get
Linux running on an HP NetServer (an LXe Pro, specifically). The odd
part is that Linux basically works fine; it's just that certain
operations are far slower than they should be. For example, copying
large files around is nice and fast (a couple of MB/s), but extracting
a tar file takes forever (unpacking the Linux kernel sources takes
several minutes). The really odd part is that the problem doesn't seem
to be due to the disk controller or its driver (the controller is an
aic7880, and the same controller runs just fine in a different Linux
box). For example, ftp'ing files from a remote site to /dev/null shows
the same kind of slow performance, so the problem doesn't appear to be
related to the disk subsystem at all.
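
The /dev/null test above takes the disk out of the picture. A minimal
sketch of the same idea, assuming a Python interpreter is available on
the box: push a fixed amount of data to /dev/null in chunks of a given
size and report the rate. The function name copy_throughput and its
parameters are hypothetical, chosen only for illustration; this is not
the measurement actually used above.

```python
import os
import time


def copy_throughput(total_bytes, chunk_size, sink_path=os.devnull):
    """Write total_bytes to sink_path in chunk_size pieces.

    Returns the observed rate in MB/s. With the default sink
    (/dev/null on Linux) no disk I/O is involved, so a low rate
    points away from the disk controller and its driver.
    """
    buf = memoryview(b"\0" * chunk_size)
    written = 0
    start = time.perf_counter()
    # buffering=0 so every write() is a real system call.
    with open(sink_path, "wb", buffering=0) as sink:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            sink.write(buf[:n])
            written += n
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very fast runs.
    return written / (1024 * 1024) / max(elapsed, 1e-9)
```

Comparing, say, copy_throughput(64 << 20, 512) against
copy_throughput(64 << 20, 1 << 20) on the slow box and on a known-good
box would show whether many small operations are penalized much more
than a few large ones.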

I have tried the obvious things on the machine: upgrading to the
latest BIOS, trying both Linux 2.0.31 and 2.1.80, booting in
single-CPU as well as 4-CPU mode, checking the IRQ assignments for
obvious errors, and disabling as much of the hardware as possible
(through the BIOS configuration). Nothing I have tried so far has
improved the situation in any way.

There are a couple of odd things I noticed:

- Linux seems to be unable to detect the amount of memory
  installed. I'm aware of the BIOS limitation that makes it
  impossible to report more than 64MB of RAM (the machine has
  ~576MB of RAM). In my case, however, the Linux kernel
  detects 0KB of RAM rather than 64MB, which I did not expect.
  When booted with the option mem=576MB, Linux comes up just
  fine, though.

- There is an unknown PCI chip in the box. The chip is from
  Intel (vendor id 0x8086) and its device id is 0x0008. The
  same chip seems to be present on Intel Alder motherboards
  as well, so I doubt that it is the source of the problem.
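
For the memory item above, one way to confirm that a mem= override
took effect is to compare what the kernel reports against what is
installed. The following is a minimal sketch, assuming a
/proc/meminfo that contains a "MemTotal: <n> kB" line (kernels of
this vintage may format the file differently). The function names
are hypothetical.

```python
def mem_total_kb(meminfo_text):
    """Return the MemTotal value in kB, or None if the line is absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Assumed line format: "MemTotal:   589824 kB"
            return int(line.split()[1])
    return None


def detected_fraction(meminfo_text, installed_mb):
    """Fraction of the installed RAM that the kernel reports."""
    total_kb = mem_total_kb(meminfo_text)
    if total_kb is None:
        return None
    return total_kb / (installed_mb * 1024)
```

Typical use would be
detected_fraction(open("/proc/meminfo").read(), 576). The kernel
reserves some memory for itself, so a value slightly below 1.0 is
expected; a value near 0 or near 64/576 would suggest the override
was not applied.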

Other than that, nothing odd stands out (nothing in /var/log/messages
or the like).

I suppose there could be problems with the memory subsystem (e.g.,
tons of correctable ECC errors), but the box performs just fine under
NT, so a physical hardware problem seems unlikely. If there were ECC
errors, is there a way to get notified when they occur? On Alpha
boxes, for example, you'd get a machine check interrupt; is there
something similar here?

Does anyone else on this list have experience running Linux on
NetServer boxes (with P6s in them), or otherwise have ideas about what
could explain this behavior? I still have a couple of things to try in
my not-so-ample spare time, but it is starting to look like the
problem might be a little deeper than I first thought.

--david