Re: kernel panics at google

From: H. Peter Anvin (hpa@transmeta.com)
Date: Tue Jan 25 2000 - 00:39:19 EST


Followup to: <200001250315.TAA11520@giraffe.corp.google.com>
By author: David desJardins <desj@google.com>
In newsgroup: linux.dev.kernel
>
> Update: We (Google) set /proc/sys/net/ipv4/tcp_retrans_collapse to "0"
> on our webservers. Now, instead of a kernel panic, they seem to
> spontaneously reboot without any errors or explanation in
> /var/log/messages. For us, this is a significant improvement over the
> previous situation: at least we don't have to manually reboot them. But
> it still leaves unanswered the question of what is causing it.
>
> It also seems that the reboots are now happening significantly less
> often than the crashes did.
>
> Any suggestion on how we can collect more information about why the
> machines are rebooting?
>

By the way, if you set /proc/sys/kernel/panic to a nonzero value N (it's
a seconds count), panics will reboot the machine after N seconds.
Unfortunately it's a dirty reboot, but it's a helluva lot better than a
hang for a production server.
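
For anyone who wants to script this, here is a minimal sketch in Python.
It assumes it runs as root on a kernel that exposes
/proc/sys/kernel/panic. The helper name set_panic_timeout and the
30-second value in the usage note are illustrative only, not something
from the original report.

```python
from pathlib import Path

# Sysctl file named in the message above; writing it requires root.
PANIC_SYSCTL = Path("/proc/sys/kernel/panic")


def set_panic_timeout(seconds: int, path: Path = PANIC_SYSCTL) -> int:
    """Write the reboot-after-panic delay (in seconds) and return the
    value read back from the same file.

    A value of 0 leaves the machine hung after a panic; a nonzero value
    makes the kernel reboot that many seconds after the panic.
    The `path` parameter is a hypothetical hook so the function can be
    exercised against an ordinary file instead of /proc.
    """
    path.write_text(f"{int(seconds)}\n")
    return int(path.read_text().strip())
```

Calling set_panic_timeout(30) as root should ask the kernel to reboot 30
seconds after a panic. The setting does not survive a reboot, so it
would need to be reapplied from a boot script.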

     -hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."




This archive was generated by hypermail 2b29 : Mon Jan 31 2000 - 21:00:14 EST