2.2.19 kernel hang

From: Tim Peeler (thp@linux01.LinuxForce.net)
Date: Tue Jul 03 2001 - 15:47:12 EST


Summary:

   Kernel 2.2.19 hang [stuck on TLB IPI wait (CPU#0)]

Description:

   After we upgraded the kernel on a production server to 2.2.19 with
   the reiserfs patch and kernel-patch-2.2.19-ide from Andre Hedrick,
   the system hung. The server still responded to ping, but ssh and
   http service stopped working until we rebooted. The hang was not
   immediate; it occurred after several days of uptime. Looking through
   the logs, we noticed that "stuck on TLB IPI wait (CPU#0)" was logged
   128 times in one second. During that same second, the kernel
   rejected 4443 packets coming in on eth0, of which 2681 were aimed at
   port 137. These packets came from 60 addresses not on our local
   network (packets destined for port 137 on our local network are
   rejected, but not logged). Also of note was the kernel message
   "dst cache overflow", followed by "last message repeated 9 times".
   Since this did not crash the machine, there is no oops output. We
   were monitoring the server with the mon package from 3 other servers
   during this time, and shortly after the hang mon reported that
   several services were failing (ssh, http, etc). Even ping failed a
   few times.
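
   For anyone who wants to check their own logs for a similar burst,
   the per-second tally described above can be reproduced with a short
   script. This is only a sketch: it assumes classic syslog lines that
   begin with a fixed-width 15-character timestamp, and the function
   name, host name and sample lines are made up for illustration (they
   are not taken from the real log).

```python
from collections import Counter


def count_per_second(lines, needle):
    """Count lines containing `needle`, keyed by their syslog timestamp."""
    counts = Counter()
    for line in lines:
        if needle in line:
            # Assumption: the line starts with a 15-character timestamp
            # such as "Jul  1 00:00:01" (classic syslog layout).
            counts[line[:15]] += 1
    return counts


# Hypothetical sample lines, invented for illustration only.
sample = [
    "Jul  1 00:00:01 host kernel: stuck on TLB IPI wait (CPU#0)",
    "Jul  1 00:00:01 host kernel: stuck on TLB IPI wait (CPU#0)",
    "Jul  1 00:00:02 host kernel: dst cache overflow",
]

per_second = count_per_second(sample, "stuck on TLB IPI wait")
for stamp, n in per_second.most_common():
    print(stamp, n)
```

   Feeding it the lines of the real kernel log instead of the sample
   list would show which seconds carried the most messages.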

Kernel Version:

   Linux version 2.2.19 (cjf@linux00) (gcc version 2.95.2 20000220
   (Debian GNU/Linux)) #1 SMP Tue Jun 26 11:55:50 EDT 2001

Output from ver_linux:

Linux www.fi.edu 2.2.19 #1 SMP Tue Jun 26 11:55:50 EDT 2001 i686 unknown
 
Gnu C 2.95.2
Gnu make 3.79.1
binutils 2.9.5.0.37
util-linux 2.10f
modutils 2.3.11
e2fsprogs 1.18
Linux C Library 2.1.3
ldd: version 1.9.11
Procps 2.0.6
Net-tools 1.54
Console-tools 0.2.3
Sh-utils 2.0
Modules Loaded

CPU Info:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 3
cpu MHz : 933.040
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 3
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 psn mmx fxsr xmm
bogomips : 1854.66

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 3
cpu MHz : 933.040
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 3
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 psn mmx fxsr xmm
bogomips : 1861.22

SCSI Info:
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: ARCHIVE Model: Python 04106-XXX Rev: 743B
  Type: Sequential-Access ANSI SCSI revision: 02

Interrupts (/proc/interrupts):
           CPU0       CPU1
  0:    5674938    5598011    IO-APIC-edge  timer
  1:          1          1    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          0          1    IO-APIC-edge  rtc
 10:    1829310    1817099   IO-APIC-level  Mylex DAC960PTL1, eth0
 11:         18         18   IO-APIC-level  aic7xxx
 13:          1          0          XT-PIC  fpu
NMI:          0
ERR:          0



