NFS bug (Linux 2.0.33 to HP): device files major num==0!

Mitch Davis (mjd@alphalink.com.au)
Mon, 05 Jan 1998 02:11:08 +1100


--------------789FAABE6503B28B6DD93F38
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Dear Linux kernel readers,

On Friday, I posted this message:

> I'm wondering if someone can help me with a problem: I'm trying to
> use NFSROOT to store all my Linux files on an HP/UX machine. But
> there seems to be a bug in either the HP/UX NFS server, or the Linux
> NFS client.

> The bug is that the major number of any character or block device
> which is stored on the HP box, is shown as "0" on the Linux box.

I've had a look at what is causing this problem, and I've worked out
a fix for it. I would like people to have a look at the attached
patch, and see if it's an appropriate solution.

The problem is this: In section 2.3.5 of RFC1094
(http://www.cis.ohio-state.edu/htbin/rfc/rfc1094.html), there's a
struct called "fattr". This struct has an unsigned int called "rdev".
The RFC has this to say about rdev:

"rdev" is the device number of the file if it is type NFCHR or
NFBLK;

The "rdev" field in the attributes structure is an operating
system specific device specifier. It will be removed and
generalized in the next revision of the protocol.

I put some code into fs/nfs/proc.c to show me what the returned
value of rdev is.

When I NFS mount the Linux box to itself, rdev for tty1 is 00 00 04 01.
But when I NFS mount the HP box onto the Linux box, the rdev value is
04 00 00 01.
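
To make the two dumps concrete, here is a small userspace sketch (not part of
the patch) that reads a host-order rdev word both ways. The helper names are
hypothetical. The Linux reading (major in bits 8..15, minor in bits 0..7)
matches the tty1 dump above; the HP reading (major in the top byte) is only an
assumption inferred from the second dump.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not kernel functions. */

/* Linux 2.0 style: major in bits 8..15, minor in bits 0..7. */
static unsigned linux_major(uint32_t rdev) { return (rdev >> 8) & 0xff; }
static unsigned linux_minor(uint32_t rdev) { return rdev & 0xff; }

/* Assumed HP-UX style, inferred from the dump: major in bits 24..31. */
static unsigned hp_major(uint32_t rdev) { return (rdev >> 24) & 0xff; }
```

Read the Linux way, 00 00 04 01 gives major 4, minor 1, while 04 00 00 01
gives major 0, minor 1, which is exactly the symptom reported.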

In other words, Linux and HP/UX have different ideas on how to
represent the major/minor device numbers within the unsigned int.
(Unfortunately this is allowed by the RFC, so I can't point any
fingers, other than to say that it works fine when I use Solaris
instead of HP/UX.)

As a hard-coded solution, swapping the 0th and 2nd bytes works fine.
But how to make this fix adaptive? My patch swaps the bytes only if
the major number (the 2nd byte) is zero (which might be valid, as the
"null device"), but the normally-zero 0th byte is not zero.
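
The same test can be sketched on the host-order value, after ntohl(), rather
than on the raw wire bytes. This is only an illustration of the heuristic, not
the code in the attached patch; fix_rdev is a hypothetical name. Wire byte 0
is bits 24..31 of the host-order word, and wire byte 2 is bits 8..15.

```c
#include <stdint.h>

/* Hypothetical helper: rdev is assumed to be in host order already. */
static uint32_t fix_rdev(uint32_t rdev)
{
    uint32_t top = (rdev >> 24) & 0xff;  /* wire byte 0, normally zero */
    uint32_t maj = (rdev >> 8) & 0xff;   /* wire byte 2, Linux major */

    /* Swap only when the major is zero and the top byte is not. */
    if (maj == 0 && top != 0)
        rdev = (rdev & 0x00ff00ffu) | (top << 8) | (maj << 24);

    return rdev;
}
```

With this, fix_rdev(0x04000001) yields 0x00000401, while 0x00000401 and
0x00000001 pass through unchanged. Working on the converted value avoids
casting the int pointer to char *, at the cost of an extra local variable.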

What do people think of this? It certainly solves my problem and it
should work as usual for everyone else. Comments?

Many thanks,

Mitch.

--------------789FAABE6503B28B6DD93F38
Content-Type: text/plain; charset=us-ascii; name="nfs-hp.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="nfs-hp.diff"

--- fs/nfs/proc.c.orig Wed Jul 3 20:52:09 1996
+++ fs/nfs/proc.c Mon Jan 5 02:07:48 1998
@@ -161,8 +161,18 @@
 	return p + QUADLEN(len);
 }
 
+/* Note: Fix for HP NFS servers which reorder the rdev bytes for device
+ * files. This fix cuts in if the major device is unexpectedly zero,
+ * and another usually zero byte is non-zero. If this is so, we swap them!
+ * The symptom of not having this fix is that device files all have major
+ * device numbers of 0. mjd@alphalink.com.au 4-Jan-98.
+ */
 static int *xdr_decode_fattr(int *p, struct nfs_fattr *fattr)
 {
+	char b, *bp;
+	u_int old_nfs_rdev, new_nfs_rdev;
+	static int rdev_fix_shown = 0;
+
 	fattr->type = (enum nfs_ftype) ntohl(*p++);
 	fattr->mode = ntohl(*p++);
 	fattr->nlink = ntohl(*p++);
@@ -170,6 +180,23 @@
 	fattr->gid = ntohl(*p++);
 	fattr->size = ntohl(*p++);
 	fattr->blocksize = ntohl(*p++);
+	/* fix rdev if we need to, for HP-UX servers */
+	old_nfs_rdev = *p;
+	if (fattr->type == NFBLK || fattr->type == NFCHR) {
+		bp = (char *)p; /* We need to swap bytes, but p is *int. */
+		if (bp[2] == 0 && bp[0] != 0) {
+			if (!rdev_fix_shown) {
+				printk(KERN_WARNING "NFS: Activating fix for swapped rdev bytes. (Needed for HP NFS servers)\n");
+				PRINTK("NFS xdr_decode_fattr: old_nfs_rdev=%08x\n", old_nfs_rdev);
+			}
+			b = bp[2]; bp[2] = bp[0]; bp[0] = b; /* do the swap */
+			if (!rdev_fix_shown) {
+				rdev_fix_shown = 1;
+				new_nfs_rdev = *p;
+				PRINTK("NFS xdr_decode_fattr: new_nfs_rdev=%08x\n", new_nfs_rdev);
+			}
+		}
+	}
 	fattr->rdev = ntohl(*p++);
 	fattr->blocks = ntohl(*p++);
 	fattr->fsid = ntohl(*p++);

--------------789FAABE6503B28B6DD93F38--