nfs-mounted binaries go boom, suspect ld.so?

kwrohrer@enteract.com
Sat, 24 Jan 1998 23:32:22 -0600 (CST)


I mount /home via NFS, and recently I noticed that binaries in ~/bin
*usually* die with a bus error or segmentation fault; on some boots
they seem to work okay. When I recently rebooted into an older kernel
(for other reasons), one which used to run NFS-mounted binaries
properly, I found I still had the problem.

This bootup, "strace ~kwrohrer/bin/xphoon" gives:
execve("/home/kwrohrer/bin/xphoon", ["/home/kwrohrer/bin/xphoon"], [/* 27 vars */]) = 0
mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40007000
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

Copying ~kwrohrer/bin/xphoon to /tmp, which is local, and running from there:
execve("/tmp/xphoon", ["/tmp/xphoon"], [/* 27 vars */]) = 0
mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40007000
mprotect(0x40000000, 20961, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
mprotect(0x8000000, 15396, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
stat("/etc/ld.so.cache", {st_mode=S_IFREG|0644, st_size=10743, ...}) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
[...]
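
Since the local copy runs and the NFS copy does not, one cheap check is
whether the two files actually read back identically on the client. The
following is only a minimal sketch of that comparison; the function name
first_difference and the 8192-byte chunk size (picked to match the rsize
in use) are assumptions for illustration, not anything taken from the
traces above.

```python
def first_difference(path_a, path_b, chunk_size=8192):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Pin down the exact byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No mismatch in the common part: one file is a prefix
                # of the other, so they differ where the shorter ends.
                return offset + min(len(a), len(b))
            if not a:
                # Both files hit end-of-file with no difference.
                return None
            offset += len(a)
```

Calling first_difference("/home/kwrohrer/bin/xphoon", "/tmp/xphoon")
returns None when the reads match. If they do match, corrupted file
data is less likely to be the cause, though this only tests read(), not
the mmap path that exec and the dynamic linker use.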

Attempting to run the program under gdb also gives SIGSEGV; the backtrace:
#0 0x40000c14 in ?? ()
Cannot access memory at address 0x8016a64.

Weirder and weirder, the program runs just fine when copied to a different
export of part of the same partition (dyheli:/usr; dyheli:/home ->
/usr/home), even when the copy was done on the server (so the client
couldn't be caching any of it). Killing and restarting rpc.nfsd and
rpc.mountd on the server (2.1.76 unfsd) doesn't change the situation;
mounting dyheli:/home via the old NE2000 cards (rather than the newer
tulip-clones) still gave the seg faults when I tried to run binaries!
The lines in /etc/exports for /home (broken) and /usr/local/archive
are identical except for the name...

/proc/mounts on the client (jadrek, fwiw) shows:

/dev/root / ext2 rw 0 0
/dev/sda2 /dos msdos rw,noexec,nosuid,nodev 0 0
/proc /proc proc rw 0 0
dyheli:/home /home nfs rw,rsize=8192,wsize=8192,soft,intr,addr=dyheli 0 0
dyheli:/usr/local/archive /archive nfs rw,nosuid,nodev,rsize=8192,wsize=8192,soft,intr,addr=dyheli 0 0
dyheli:/var/shared /var/shared nfs rw,noexec,nosuid,nodev,rsize=1024,wsize=1024,soft,intr,addr=dyheli 0 0
dyheli-10b2:/home /mnt/scratch/ nfs rw,addr=dyheli-10b2 0 0
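
For comparing a broken mount against a working one, the option lists in
the listing above can be diffed mechanically. This is a rough sketch
only: the helper names parse_mounts and option_diff are made up for the
example, and the sample text is just the /home and /archive lines
copied from the listing. On a live client the text would come from
reading /proc/mounts instead.

```python
SAMPLE = """\
dyheli:/home /home nfs rw,rsize=8192,wsize=8192,soft,intr,addr=dyheli 0 0
dyheli:/usr/local/archive /archive nfs rw,nosuid,nodev,rsize=8192,wsize=8192,soft,intr,addr=dyheli 0 0
"""

def parse_mounts(text):
    """Map each mount point to the set of its mount options."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        mounts[mount_point] = set(options.split(","))
    return mounts

def option_diff(mounts, a, b):
    """Return (options only on a, options only on b), each sorted."""
    return sorted(mounts[a] - mounts[b]), sorted(mounts[b] - mounts[a])

mounts = parse_mounts(SAMPLE)
only_home, only_archive = option_diff(mounts, "/home", "/archive")
print("only on /home:   ", only_home)
print("only on /archive:", only_archive)
```

For these two lines the only difference is that /archive carries nosuid
and nodev while /home does not; whether that has anything to do with
the crashes is not established here.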

Does anyone know what I'm doing wrong, or where to look for more clues?
The only thing I'm sure I have changed is ld.so, upgraded from 1.8.5
to 1.9.6...

Keith

-- 
"Quartz glyph jocks vend, fix, BMW."  -- 1990 IOCCC judges
The Decline of Western Civilization:  Native Americans revered Raven and 
Coyote.  Our parents watched Moose and Squirrel.  Our children drool on 
Microsoft Barney and Tickle Me Elmo.          http://www.enteract.com/~kwrohrer