OOPS swapon kernel 2.2.9

Ghozlane Toumi (gtoumi@messel.emse.fr)
Mon, 12 Jul 1999 09:09:10 +0200 (MET DST)


Hello all...
I saw my first "oops" yesterday.
(My second one, in fact, but you'll agree that with overclocking, my
very first one was not really a genuine oops ;)
Anyway, this one is not overclocking related (just because I don't overclock
any more...)

[1.] One line summary of the problem:
It seems that the kernel oopses when playing with swapfiles

[2.] Full description of the problem/report:
If you turn on 2 swapfiles (I mean swapon /swapfile1 ; swapon /swapfile2),
then turn them off, the next time you swapon a swapfile, the kernel oopses.

I tried to trace this oops down in mm/swapfile.c.
It appears that sys_swapon crashes while reading a swap_info[] record
with an incorrect index, at line 551:

	for (i = 0 ; i < nr_swapfiles ; i++) {
		if (i == type)
			continue;
		if (p->swap_device == swap_info[i].swap_device)
			goto bad_swap;
	}
The problem comes from too high a value of nr_swapfiles.

If nr_swapfiles is (as I guess) the number of active swapfiles (doh),
the problem is that the swapoff syscall fails to decrease nr_swapfiles.
Well... it's a guess... but I believe that it's much more complicated than
that :)
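
A minimal user-space sketch of this guess, not kernel code. It assumes that
a released slot keeps a NULL pointer and that a scan of the same shape as the
loop quoted above compares through that pointer (the quoted loop compares
swap_device directly, so this is an assumption). The oops report below shows
the fault at address 00000008 on "movl 0x8(%eax),%eax" with eax = 00000000,
which would be consistent with such a NULL pointer. The names model_file,
model_swap_slot, count_stale_slots and find_duplicate are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; names are illustrative. */
struct model_file {
    int inode;
};

struct model_swap_slot {
    const struct model_file *file;  /* assumed NULL once the slot is released */
};

/*
 * Count the slots below nr (other than self) whose pointer is NULL.
 * An unguarded scan over [0, nr) would dereference each of them.
 */
int count_stale_slots(const struct model_swap_slot *tab, int nr, int self)
{
    int i, stale = 0;

    for (i = 0; i < nr; i++) {
        if (i == self)
            continue;
        if (tab[i].file == NULL)
            stale++;
    }
    return stale;
}

/*
 * Guarded duplicate scan: skips released slots instead of reading
 * through them. Returns the index of a slot that uses the same inode
 * as slot self, or -1 if there is none.
 */
int find_duplicate(const struct model_swap_slot *tab, int nr, int self)
{
    int i;

    assert(self >= 0 && self < nr && tab[self].file != NULL);
    for (i = 0; i < nr; i++) {
        if (i == self || tab[i].file == NULL)
            continue;
        if (tab[i].file->inode == tab[self].file->inode)
            return i;
    }
    return -1;
}
```

With nr = 2, slot 0 being set up and slot 1 released, count_stale_slots()
reports one slot that an unguarded loop would read through, while
find_duplicate() simply skips it.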

I put a small (but efficient :) printk at the end of sys_swapon and
sys_swapoff, and these are the results of some tests. (I disabled my normal
swap partition in fstab.)

action            nr_swapfiles at the end of the syscall (thank you printk)

swapon  file1     1
swapon  file2     2
swapoff file1     2
swapoff file2     2
swapon  file1     oops...
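
The counter values in this sequence are what one would see if nr_swapfiles
behaved as a high-water mark over swap_info[] rather than as a count of
active swap areas. The sketch below is a user-space model of that
assumption only; it has not been checked against the 2.2.9 source, and
MODEL_MAX_SWAPFILES, model_state, model_swapon and model_swapoff are
hypothetical names.

```c
#include <assert.h>

#define MODEL_MAX_SWAPFILES 8   /* illustrative, not the kernel's limit */

struct model_state {
    int used[MODEL_MAX_SWAPFILES];  /* 1 while a slot is active */
    int nr_swapfiles;               /* only ever grows in this model */
};

/* Claim the first free slot; returns its index, or -1 if the table is full. */
int model_swapon(struct model_state *s)
{
    int type;

    for (type = 0; type < s->nr_swapfiles; type++)
        if (!s->used[type])
            break;
    if (type >= MODEL_MAX_SWAPFILES)
        return -1;
    if (type >= s->nr_swapfiles)
        s->nr_swapfiles = type + 1;
    s->used[type] = 1;
    return type;
}

/* Release a slot. The counter is deliberately left untouched. */
void model_swapoff(struct model_state *s, int type)
{
    assert(type >= 0 && type < s->nr_swapfiles);
    s->used[type] = 0;
}
```

Replaying the sequence above against this model gives 1, 2, 2, 2. The last
swapon then reuses slot 0 while the released slot 1 is still inside the
range [0, nr_swapfiles) that a scan would cover.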

Another sequence that will bring the oops (the original one, in fact) is
when you *hum* try to swapon an already used swapfile:

action            nr_swapfiles at the end of the syscall

swapon  file      1
swapon  file      2 (1)
swapoff file      2
swapon  file      oops...

(1) swapon returns "Device or resource busy", BUT still increases
nr_swapfiles. Looks like strange behaviour, imho...
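
One way this could happen is if the slot is claimed and the counter raised
before the "already in use" check runs, and the failure path releases the
slot without restoring the counter. The sketch below models that ordering
in user space. It is an assumption about the control flow, not a copy of
sys_swapon; busy_table, busy_swapon, the rollback flag and the error code
for a full table are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MODEL_MAX_SWAPFILES 8   /* illustrative, not the kernel's limit */

struct busy_slot {
    int used;
    const char *path;
};

struct busy_table {
    struct busy_slot slot[MODEL_MAX_SWAPFILES];
    int nr_swapfiles;
};

/*
 * Returns the slot index on success or a negative errno on failure.
 * With rollback == 0 the counter stays raised after a failed call;
 * with rollback != 0 it is restored to its value on entry.
 */
int busy_swapon(struct busy_table *t, const char *path, int rollback)
{
    int saved_nr = t->nr_swapfiles;
    int type, i;

    for (type = 0; type < t->nr_swapfiles; type++)
        if (!t->slot[type].used)
            break;
    if (type >= MODEL_MAX_SWAPFILES)
        return -ENOSPC;                 /* error code chosen for illustration */
    if (type >= t->nr_swapfiles)
        t->nr_swapfiles = type + 1;     /* raised before any validation */
    t->slot[type].used = 1;
    t->slot[type].path = path;

    for (i = 0; i < t->nr_swapfiles; i++) {
        if (i == type || !t->slot[i].used)
            continue;
        if (strcmp(t->slot[i].path, path) == 0) {
            /* Failure path: release the slot again. */
            t->slot[type].used = 0;
            t->slot[type].path = NULL;
            if (rollback)
                t->nr_swapfiles = saved_nr;
            return -EBUSY;
        }
    }
    return type;
}
```

Calling busy_swapon() twice on the same path with rollback == 0 leaves
nr_swapfiles at 2 after the -EBUSY, as in the table above; with
rollback != 0 it goes back to 1.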

Things are *almost* the same with a partition instead of a file.
(I didn't try with 2 swap partitions.)

[3.] Keywords (i.e., modules, networking, kernel):
kernel, swap, oops

[4.] Kernel version (from /proc/version):
Linux version 2.2.9 (root@gally.emse.fr) (gcc version 2.7.2.3) #111 Mon
Jul 12 07:53:02 CEST 1999

[5.] Output of Oops.. message with symbolic information resolved
(see Kernel Mailing List FAQ, Section 1.5):
$ dmesg |/usr/src/linux/scripts/ksymoops/ksymoops
Options used: -V (default)
-o /lib/modules/2.2.9/ (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-m /usr/src/linux/System.map (default)
-c 1 (default)

You did not tell me where to find symbol information. I will assume
that the log matches the kernel and modules that are running right now
and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

3c509.c:1.16 (2.2) 2/3/98 becker@cesdis.gsfc.nasa.gov.
Unable to handle kernel NULL pointer dereference at virtual address 00000008
current->tss.cr3 = 04a87000, %cr3 = 04a87000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0121f65>]
EFLAGS: 00010202
eax: 00000000 ebx: c01d3cc4 ecx: 00000002 edx: 00000001
esi: c4b1e5c0 edi: 00000000 ebp: c4b1e5c0 esp: c4a8bf28
ds: 0018 es: 0018 ss: 0018
Process swapon (pid: 537, process nr: 39, stackpage=c4a8b000)
Stack: 08048d19 bffffa50 c4e87f20 c0119cbc c0119d39 c4a851f4 0004a025 c4a8a000
       4007d760 c01d3cc4 fffffff0 c004a000 00000000 00001000 00000000 00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace: [<c0119cbc>] [<c0119d39>] [<c0108844>]
Code: 8b 40 08 39 45 08 0f 84 6b 04 00 00 42 39 ca 72 de 31 d2 b8
Warning: trailing garbage ignored on Code: line
Text: 'Code: 8b 40 08 39 45 08 0f 84 6b 04 00 00 42 39 ca 72 de 31 d2 b8
'
Garbage: ' '

>>EIP: c0121f65 <sys_swapon+295/7d0>
Trace: c0119cbc <do_no_page+54/e4>
Trace: c0119d39 <do_no_page+d1/e4>
Trace: c0108844 <system_call+34/38>
Code: c0121f65 <sys_swapon+295/7d0> 00000000 <_EIP>: <===
Code: c0121f65 <sys_swapon+295/7d0>  0: 8b 40 08          movl 0x8(%eax),%eax <===
Code: c0121f68 <sys_swapon+298/7d0>  3: 39 45 08          cmpl %eax,0x8(%ebp)
Code: c0121f6b <sys_swapon+29b/7d0>  6: 0f 84 6b 04 00    je c01223dc <sys_swapon+70c/7d0>
Code: c0121f70 <sys_swapon+2a0/7d0>  b: 00
Code: c0121f71 <sys_swapon+2a1/7d0>  c: 42                incl %edx
Code: c0121f72 <sys_swapon+2a2/7d0>  d: 39 ca             cmpl %ecx,%edx
Code: c0121f74 <sys_swapon+2a4/7d0>  f: 72 de             jb c0121f54 <sys_swapon+284/7d0>
Code: c0121f76 <sys_swapon+2a6/7d0> 11: 31 d2             xorl %edx,%edx
Code: c0121f78 <sys_swapon+2a8/7d0> 13: b8 00 00 00 00    movl $0x0,%eax

2 warnings issued. Results may not be reliable.

[6.] A small shell script or example program which triggers the
problem (if possible)
---8<------
#!/bin/sh

dd if=/dev/zero of=/tmp/sf1 bs=1k count=1k
cp /tmp/sf1 /tmp/sf2

mkswap /tmp/sf1
mkswap /tmp/sf2

swapon /tmp/sf1
swapon /tmp/sf2

swapoff /tmp/sf1
swapoff /tmp/sf2

# the last swapon will segfault, and the kernel oopses.
swapon /tmp/sf1

---8<-----

[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)

# /usr/src/linux/scripts/ver_linux
-- Versions installed: (if some fields are empty or looks
-- unusual then possibly you have very old versions)
Linux gally.emse.fr 2.2.9 #111 Mon Jul 12 07:53:02 CEST 1999 i586 unknown
Kernel modules 2.1.121
Gnu C 2.7.2.3
Binutils 2.9.1.0.15
Linux C Library 2.0.7
Dynamic linker ldd (GNU libc) 2.0.7
Linux C++ Library 2.8.0
Procps 1.2.9
Mount 2.9
Net-tools 1.50
Kbd 0.94
Sh-utils 1.16
Modules Loaded 3c509 nls_iso8859-1 nls_cp437 vfat fat sb uart401
sound soundlow soundcore

# swapon -V
swapon: mount-2.9

[7.2.] Processor information (from /proc/cpuinfo):

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 5
model : 4
model name : Pentium MMX
stepping : 3
cpu MHz : 200.456217
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : yes
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr mce cx8 mmx
bogomips : 399.77

[7.3.] Module information (from /proc/modules):
# cat /proc/modules
3c509 5372 1 (autoclean)
nls_iso8859-1 2024 1 (autoclean)
nls_cp437 3548 1 (autoclean)
vfat 11100 1 (autoclean)
fat 23912 1 (autoclean) [vfat]
sb 31416 0
uart401 5588 0 [sb]
sound 54456 0 [sb uart401]
soundlow 208 0 [sound]
soundcore 2100 6 [sb sound]

[7.4.] SCSI information (from /proc/scsi/scsi):
n/a (sigh)

[7.5.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):
Nothing comes to mind.

[X.] Other notes, patches, fixes, workarounds:
Well... I'm afraid I don't have the skills to do more than some printk...

I didn't have the time to check against a 2.2.10 kernel, but its patch to
mm/swapfile.c doesn't seem to fix the problem.

hope it helps.

and hope it is not old news...

ghoz

PS: Please excuse my poor English. In fact, I try very hard just to
be understood :)
