Re: Linux-2.2.2-pre2..

Linus Torvalds (torvalds@transmeta.com)
Sat, 6 Feb 1999 12:03:44 -0800 (PST)


On 6 Feb 1999, Philippe Troin wrote:
>
> I'm going to try that and tell you the results, but I think there's
> something deeper going on: if this was the only problem, the oops
> would only happen in a 'crashme' process, but the oops I sent happens
> in find (why would find get a hangup ?), but I also got it in cat (in
> while true; do find / ! -fstype nfs -type f -print0 | xargs -n0 cat;
> done), or in syslogd...

No, the hangup is done asynchronously at the next scheduling point, so the
hangup event can actually happen in a random process context. So it doesn't
matter who closes the file, the actual hangup might even happen on another
CPU..
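
As a rough model of why the oops can show up in find or cat, here is a
small user-space sketch of deferred work. It is not the kernel's actual
mechanism: the dq_* names and the pid argument are made up for
illustration. The closer only marks the hangup as pending, and whichever
process reaches the next scheduling point ends up running it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tty; not the real kernel structure. */
struct dq_tty {
    int hangup_pending;   /* set by whoever closed the other end */
    int hung_up_by;       /* pid whose context ran the hangup, -1 if none */
};

/* Called in the context of the process that closes the file.
 * It only records that a hangup is wanted; no hangup work runs here. */
static void dq_request_hangup(struct dq_tty *tty)
{
    assert(tty != NULL);
    tty->hangup_pending = 1;
}

/* Called at a scheduling point, in whatever process happens to be
 * running. That process pays for the hangup, related to the tty or not. */
static void dq_schedule_point(struct dq_tty *tty, int current_pid)
{
    assert(tty != NULL);
    if (!tty->hangup_pending)
        return;
    tty->hangup_pending = 0;
    tty->hung_up_by = current_pid;
}
```

In this model, calling dq_request_hangup() from one process and then
dq_schedule_point() with some unrelated pid leaves that unrelated pid in
hung_up_by, which is the same effect as seeing the oops attributed to a
bystander process.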

This is why we've had various problems with hangup in the past too: it's
just so asynchronous, and I think it tries to hang up a pty slave (because
we closed the master) at the same time the slave tries to open it.

Which is otherwise harmless, but some of the data structures have
obviously not been initialized. The reason it only happens under load is
that both parties do get the kernel lock, so in order to hit the window
one of them has to block on a memory allocation or similar.

Quite frankly, I only looked very quickly at this, and Ted should probably
see if maybe we could avoid the problem more prettily by just doing some
initialization in a different order, but from the quick look it basically
looks to me like the slave tty->termios structure just hasn't been
allocated yet, so when the hangup code tries to re-initialize it it fails
for obvious reasons.. Thus the simple test for NULL, while fairly ugly,
should do the trick.
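
A minimal sketch of the kind of NULL test meant here, written as
stand-alone C. The sketch_* types, the function name, and the default
flag value are assumptions for illustration and do not match the real
tty code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real kernel definitions. */
struct sketch_termios {
    unsigned int c_cflag;
};

struct sketch_tty {
    struct sketch_termios *termios;   /* NULL until open has allocated it */
};

#define SKETCH_DEFAULT_CFLAG 0x0bdu   /* arbitrary illustrative value */

/*
 * Re-initialize the line settings on hangup.
 * Returns 1 if the settings were reset, 0 if the tty was only half set
 * up (termios not allocated yet) and there was nothing to reset.
 */
static int sketch_hangup_reset(struct sketch_tty *tty)
{
    assert(tty != NULL);

    /* The ugly-but-simple guard: skip a slave whose open has not
     * finished allocating its termios yet. */
    if (tty->termios == NULL)
        return 0;

    tty->termios->c_cflag = SKETCH_DEFAULT_CFLAG;
    return 1;
}
```

The guard only covers this one dereference, so any other path that
touches the same pointer during a hangup would need a similar check.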

(For example, maybe we should set "filp->private_data" to point to the tty
only _after_ it has been initialized? Then the hangup code wouldn't ever
see half-way initialized tty's at all. But in the meantime the NULL
pointer test should be sufficient - with the caveat that there might be
other places you'd need the same check too).
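
The alternative ordering could look roughly like the sketch below: build
the tty completely, and only then store it where other code can find it.
This is user-space C with invented pub_* names, calloc() standing in for
the kernel allocation that may block, and no locking shown (it assumes
the same kernel lock discussed above is still held around each step).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real kernel definitions. */
struct pub_termios { unsigned int c_cflag; };
struct pub_tty     { struct pub_termios *termios; };
struct pub_file    { void *private_data; };

/* Open: allocate and initialize everything first, publish last.
 * Returns 0 on success, -1 if an allocation failed. */
static int pub_tty_open(struct pub_file *filp)
{
    struct pub_tty *tty;

    assert(filp != NULL);

    tty = calloc(1, sizeof(*tty));
    if (tty == NULL)
        return -1;

    /* In the kernel this allocation is where the opener might block,
     * which is the window the hangup can hit. */
    tty->termios = calloc(1, sizeof(*tty->termios));
    if (tty->termios == NULL) {
        free(tty);
        return -1;
    }

    /* Only now make the tty visible through the file. Anyone looking
     * at private_data sees either NULL or a fully set up tty. */
    filp->private_data = tty;
    return 0;
}

/* Release: unpublish first, then free. */
static void pub_tty_release(struct pub_file *filp)
{
    struct pub_tty *tty;

    assert(filp != NULL);

    tty = filp->private_data;
    filp->private_data = NULL;
    if (tty != NULL) {
        free(tty->termios);
        free(tty);
    }
}
```

With this ordering a failed open never leaves a half-built tty behind in
private_data, so the hangup side would not need the NULL test for this
particular field.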

Ted?

Linus
