showstopper race condition in sync() ??? [2.1.119|120]

Cyrille Chepelov (chepelov@rip.ens-cachan.fr)
Sun, 13 Sep 1998 13:35:02 +0200 (MET DST)


Hi all, I've been experiencing some trouble on an experimental server:

Everything works fine until someone tries to change his password. The
first ten users could change their passwords, but now, whenever anyone
does this, the "passwd" process locks up and cannot be killed (tried
kill -1 and -9), and subsequent calls to "sync" lock up with the same
symptoms.
When shutting down the machine, worse things happen: the machine locks up
after printing "unmounting remote file systems". Magic SysRq-S (sync) does
not work. SysRq-U does remount read-only, but that does not help.
Needless to say, after rebooting, everything is fsck'd up, and on a 6 GB
drive, it takes some time...

After a reboot, I tried to change a password under strace's control.
Passwd locked up while calling "sync()", after successfully changing
the password.
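
To check whether the hang is in sync() itself rather than in anything
passwd does, a small probe along these lines might help. This is only a
sketch, assuming a Python 3 interpreter on the box; the names probe and
SYNC_CMD are made up for illustration. It runs sync() in a child process
and gives up waiting after a timeout, so the shell that launches it does
not get stuck as well.

```python
import subprocess
import sys
import time

# Hypothetical child command: a fresh interpreter that only calls os.sync(),
# which wraps the sync() system call.
SYNC_CMD = [sys.executable, "-c", "import os; os.sync()"]


def probe(cmd=SYNC_CMD, timeout=30.0):
    """Run cmd in a child process; return (finished, seconds_waited)."""
    start = time.monotonic()
    child = subprocess.Popen(cmd)
    try:
        child.wait(timeout=timeout)
        finished = True
    except subprocess.TimeoutExpired:
        # No kill() here on purpose: a child wedged inside the kernel
        # ignores signals, and waiting for it would wedge the probe too.
        finished = False
    return finished, time.monotonic() - start


if __name__ == "__main__":
    finished, waited = probe()
    if finished:
        print("sync() returned after %.2f s" % waited)
    else:
        print("sync() still blocked after %.2f s" % waited)
```

Running it once right after boot and again after a password change would
show whether sync() only starts blocking after the change. The 30-second
timeout is an arbitrary choice, and a slow but healthy sync() on a busy
disk could exceed it.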

What puzzles me the most is that the first accounts didn't cause
problems...

(As a temporary workaround, I can use RH's control panel to change the
passwords, but that's neither comfortable nor a good long-term solution.)

The system:
K6/200, trash motherboard (VXPro chipset), 6 GB Seagate UDMA HD, 32 MB
SDRAM (of which 1 MB is claimed by the motherboard's integrated (useless)
video), 2 "SimpleNet SN-3200 PCI" ne2k clones.
Distro is RedHat 5.1, autorpm'd on ftp.lip6.fr's mirror.
Kernel compiled with both gcc-2.7.2.3 and egcs-1.1b.

I can give the passwd file if someone is motivated to try to reproduce the
problem (but I'd be happier without doing so <grin /> )

-- Cyrille

------------------------------
Zog Zog
