Re: Clue on 2.0.33 crashes

Doug Ledford (dledford@dialnet.net)
Fri, 20 Feb 1998 07:25:34 -0600


Jos Vos wrote:
>
> Thomas Schenk wrote:
>
> > And I would add that we have found that prior to every freeze that our
> > systems have experienced under 2.0.33, we find a syslog message from xntpd
> > as follows:
> >
> > Feb 11 09:18:46 acme xntpd[151]: kernel pll status change 89
> >
> > These messages do not occur immediately prior to the crash, but we have
> > theorized that this status change is a precursor to the lockup and the
> > number of lockups that we have seen has been greatly reduced since we
> > stopped using xntpd and started using ntpdate with the -b option in a
> > cron job to sync the clocks.

Hmmmm, I'm running xntpd on about 5 systems and I've never had a lockup nor
seen this message. Of course, that could be merely luck. The real question
in my mind is what that message is all about. There were significant
updates to the kernel time code in 2.0.32 or 2.0.33 (can't remember which).
It seems (by memory) that the changes were written by either Ulrich
{Wendel,Drepper} (SP?). Maybe the author would comment?
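
As a rough aid for reading that message, the number can be treated as a
bitmask of the kernel clock status flags. The sketch below assumes the STA_*
values found in <linux/timex.h> (the 2.0.33 header may differ) and assumes
that xntpd logs the status in hexadecimal, which the log line itself does not
confirm. The names STA_FLAGS and decode_status are made up for illustration.

```python
# Clock status bits, assumed to match the STA_* values in <linux/timex.h>.
STA_FLAGS = {
    0x0001: "STA_PLL",        # PLL updates enabled
    0x0002: "STA_PPSFREQ",    # PPS frequency discipline enabled
    0x0004: "STA_PPSTIME",    # PPS time discipline enabled
    0x0008: "STA_FLL",        # FLL mode selected
    0x0010: "STA_INS",        # insert leap second
    0x0020: "STA_DEL",        # delete leap second
    0x0040: "STA_UNSYNC",     # clock unsynchronized
    0x0080: "STA_FREQHOLD",   # hold frequency
    0x0100: "STA_PPSSIGNAL",  # PPS signal present
    0x0200: "STA_PPSJITTER",  # PPS signal jitter exceeded
    0x0400: "STA_PPSWANDER",  # PPS signal wander exceeded
    0x0800: "STA_PPSERROR",   # PPS signal calibration error
    0x1000: "STA_CLOCKERR",   # clock hardware fault
}


def decode_status(value):
    """Return the names of the status bits set in value, lowest bit first."""
    return [name for bit, name in sorted(STA_FLAGS.items()) if value & bit]


# Assumption: the "89" in the syslog line is hexadecimal.
print(decode_status(0x89))  # ['STA_PLL', 'STA_FLL', 'STA_FREQHOLD']

# If the value were decimal instead, the reading would be quite different.
print(decode_status(89))    # ['STA_PLL', 'STA_FLL', 'STA_INS', 'STA_UNSYNC']
```

Either reading only names the flags that were set when the status changed; it
does not by itself say anything about a link to the lockups.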

>
> Hmmm, yes, we also get these messages every time xntpd reports that it is
> synchronized but without doing a time step. Although we have frequent
> mysterious system lockups, I can't find a relation with this (in time).
>
> What is interesting, of course, is whether all people getting these
> lockups also run xntpd.
>
> Another thing (because multicasting is mentioned): do all people
> with the lockups run gated (I do...)?

I can't speak for the people that sent me configs, but on my tulip systems,
gated runs fine with multicast enabled. On the other hand, I once installed
gated on a machine that had a 3c509 ethernet, multiple ppp, and an eql
interface running. The ppp devices were all slaves to the eql and gated was
configured to run across the eql and ethernet interfaces. Within about 10
minutes of enabling this config, I got a spontaneous reboot, so I disabled
gated and went back to static routes and that machine has not rebooted
spontaneously since. That's one of the reasons I mention multicast in the
suggestion email, as it may have an impact depending on the device drivers
over which multicast is running.

-- 

Doug Ledford <dledford@dialnet.net> Opinions expressed are my own, but they should be everybody's.
