Re: bug tracking question (VFS crash)

Alexander Viro (viro@math.psu.edu)
Sat, 24 Apr 1999 11:39:21 -0400 (EDT)


On Sat, 24 Apr 1999, Manfred Spraul wrote:

> Alexander Viro wrote:
> > Nope, just that something called fput() on the file in question
> > when it shouldn't.
> I've attached Koles Sandors last email.
> He has several messages
> "VFS: Close: file count is 0"
Same effect.

> I've checked fput()/ fget() calls, and I found the
> following problems:
> (I don't think that they are related to this crash):
>
> 1) binfmt_elf.c: line 819:
> do_load_elf_library():
> > fget(fd)
> > if(!file || !file->f_op)
> > goto out;
> ...
> > out: return
>
> possible memory leak: file != NULL, file->f_op == NULL

I'm not sure on that one - it definitely can't give the effect in question
(opposite direction), but I suspect that file->f_op can't be NULL at all.
Dunno, I'll look at it.

> 2) exec,c line 194:
> sys_uselib()
> > fput(file).
> if the fget() call about 15 lines above fails,
> then fput() will cause a NULL pointer access.
Yes, but it would give instant OOPS and it was not the case.
Besides, it should not happen (read: if it happens we are screwed
somewhere else. Big way).

> But I think that the fget() call will never fail.
>
> > What .config do you have? What kind of stuff ran on the box when (and
> > before) it happened?
> I'll ask him for his .config file.
>
> The server is a quad processor PPro, many processes,
> inode_limit 32000, files_max 8000.
> more details attached.

It's nothing but a gut feeling, but I suspect that we either have fput()
called outside of big lock (and thus SMP races) or net/socket.c playing
rough. I'll look at it ~6 hours later - right now I'm going down ;-/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/