>
> I have not yet tried any of the 2.1 series kernels with Netscape, so
> can't comment on that yet...
The netscape problem is known, but there is no solution yet (other than
switching to Mozilla, which seems to have fixed it, or using a proxy).
When you strace netscape while it is hanging you'll see that it is in an endless
write(fd,"GET ...",..) = -EAGAIN
loop. This happens when netscape tries to connect to a remote server which
does not send a SYN-ACK back. Linux returns EAGAIN when someone tries to
write to a socket that is not connected yet. This should only happen when
select() disagrees with write() [because netscape thinks a socket is writable,
but it is not]. The output routine sees the EAGAIN, thinks it is just a
temporary error, and tries again => endless loop.
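
To make the failure mode concrete, here is a rough sketch of an output routine
that does not spin. This is not netscape's code; write_bounded() and
set_nonblocking() are hypothetical names, and the retry cap and wait time are
arbitrary assumptions. The idea is just to sleep in select() between retries
and to give up after a bounded number of attempts, so that a write() which keeps
returning EAGAIN (or ENOTCONN on systems that use it) cannot hang the program
even when select() and write() disagree.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper (an illustration, not netscape's actual routine):
 * write len bytes to a non-blocking fd. On EAGAIN/ENOTCONN, wait up to
 * wait_ms in select() and retry, but at most max_waits times. Returns
 * len on success, or -1 with errno set (ETIMEDOUT if the cap was hit).
 */
ssize_t write_bounded(int fd, const char *buf, size_t len,
                      int max_waits, long wait_ms)
{
    size_t done = 0;
    int waits = 0;

    assert(buf != NULL && max_waits > 0);

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);

        if (n > 0) {
            done += (size_t)n;          /* made progress */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                   /* interrupted, just retry */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK &&
            errno != ENOTCONN)
            return -1;                  /* hard error, report it */

        /* Not writable (yet). Never retry without bound. */
        if (waits++ >= max_waits) {
            errno = ETIMEDOUT;
            return -1;
        }

        fd_set wfds;
        struct timeval tv;
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = wait_ms / 1000;
        tv.tv_usec = (wait_ms % 1000) * 1000;
        /* Result is ignored on purpose: even if select() wrongly
         * claims the fd is writable, the cap above ends the loop. */
        (void)select(fd + 1, NULL, &wfds, NULL, &tv);
    }
    return (ssize_t)done;
}
```

Something like write_bounded(fd, request, request_len, 30, 1000) would then
fail with ETIMEDOUT after roughly 30 attempts instead of looping forever, and
the caller can report a connect failure to the user.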
The EAGAIN return code in the not-connected-yet case is a bit unfortunate
[e.g. HP/UX uses ENOTCONN here, which is more appropriate IMHO], but changing
it would probably break too many programs, and Solaris returns EAGAIN too.
That it happens with 2.1 but not with 2.0 points to a difference in
the 2.0/2.1 select policies. Both David Miller and I looked over the
code but couldn't find any difference that would explain it yet.
Note that I never managed to catch netscape with strace before the incident
happened - bonus points to anybody who can supply the last 30 system calls
or so _before_ the endless loop happens. That would help a lot in tracking
it down.
Mozilla seems to have fixed it; I've never seen it with mozilla. Another
workaround is to use a proxy, because the connection to the proxy usually
completes fast, so the "dangerous" SYN_SENT phase is passed quickly.
-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu