Re: Oops with tmpfs on both 2.4.22 & 2.6.0-test11

From: James W McMechan
Date: Sun Nov 30 2003 - 22:43:46 EST


> This is significantly different in nature from the 2.4 oops, since
> 2.4 hit NULL and this pointer is total garbage.
>
> Either it's a double bitflip or even worse is afoot.

Umm from include/linux/list.h
#define LIST_POISON1 ((void *) 0x00100100)
#define LIST_POISON2 ((void *) 0x00200200)
though perhaps we need a better poison
0xdead0001 for example but that might be valid

Were you thinking of a hardware fault?
The test program oops both a Athlon and a
PentiumMMX and I followed this in from a user
bugreport over on uml-devel

I single stepped through on a UML machine and it looked
like the prev pointer in the list is getting corrupted, I was
suspecting that fs/libfs.c:dcache_readdir:137
list_del(q);
list_add(q, &dentry->d_subdirs);
when q is a empty list entry this occurs when fpos is 2
and has no comment :(
there is a similar chunk at dcache_dir_lseek:90 with a
list_del(&cursor->d_child);
list_add_tail(&cursor->d_child, p);

I suspect that deleting from a empty? list and adding
back the deleted entry will mangle things...
The problem came from looping over roughly

dirfile = opendir(dirname)
seekdir(dirfile,pos)
ent = readdir(dirfile)
pos=telldir(dirfile)
closedir(dirfile)

it started with pos== 0
seekdir is fine
readdir returns "."
teldir returned 1 -> pos
seekdir is fine
readdir then got ".." and
teldir returned 2 -> pos
seekdir then blew up on the empty entry

Have you tried the test program?

________________________________________________________________
The best thing to hit the internet in years - Juno SpeedBand!
Surf the web up to FIVE TIMES FASTER!
Only $14.95/ month - visit www.juno.com to sign up today!
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/