[PATCH][2.5.74] devfs lookup deadlock/stack corruption combined patch

From: Andrey Borzenkov (arvidjaar@mail.ru)
Date: Mon Jul 07 2003 - 14:06:15 EST

On Monday 07 July 2003 04:54, you wrote:
> Actually, don't bother. This idea can be made to work, but
> we already have enough tricky stuff in the wait/wakeup area.
> Let's run with your original patch.

I finally hit a painfully trivial way to reproduce another long-standing devfs
problem - a deadlock between devfs_lookup and devfs_d_revalidate_wait. When
devfs_lookup releases the directory i_sem, devfs_d_revalidate_wait grabs it
(this does not happen on every path) and goes to sleep waiting to be woken up.
Unfortunately, devfs_lookup attempts to re-acquire the directory i_sem before
ever waking it up ...

To reproduce (2.5.74, UP or SMP kernel - it does not matter; single-CPU system):

ls /dev/foo & rm -f /dev/foo &

or possibly in a loop, but then it easily fills up the process table. In my
case it hangs 100% reliably - on 2.5 or 2.4.

The current fix is to move the re-acquire of i_sem to after all
devfs_d_revalidate_wait waiters have been woken up. A much better fix would be
to ensure that ->d_revalidate is either always called under i_sem or always
without it. But that means changing the very heart of the VFS, and I do not
dare to touch it.
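
For illustration only, here is a userspace sketch of the ordering used by the
fix, with pthreads standing in for the kernel primitives. It is not the devfs
code: struct dir_model, revalidate_waiter() and lookup_fixed_order() are
hypothetical names, a mutex stands in for i_sem and a condition variable for
the wait queue. The waiter models the path where ->d_revalidate runs with
i_sem held.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of one directory; not kernel code. */
struct dir_model {
    pthread_mutex_t i_sem;       /* stands in for the directory i_sem */
    pthread_mutex_t wait_lock;   /* protects the two flags below */
    pthread_cond_t  wait_queue;  /* stands in for the lookup wait queue */
    bool lookup_done;            /* set by lookup before waking waiters */
    bool waiter_finished;        /* set by the waiter once it was woken */
};

/* Models devfs_d_revalidate_wait on a path where i_sem is held. */
static void *revalidate_waiter(void *arg)
{
    struct dir_model *d = arg;

    pthread_mutex_lock(&d->i_sem);
    pthread_mutex_lock(&d->wait_lock);
    while (!d->lookup_done)      /* sleep until lookup wakes us */
        pthread_cond_wait(&d->wait_queue, &d->wait_lock);
    d->waiter_finished = true;
    pthread_mutex_unlock(&d->wait_lock);
    pthread_mutex_unlock(&d->i_sem);
    return NULL;
}

/* Models lookup with the fixed ordering.
 * Returns 1 if the waiter was woken and finished, -1 on setup failure. */
int lookup_fixed_order(void)
{
    struct dir_model d;
    pthread_t waiter;
    int result;

    pthread_mutex_init(&d.i_sem, NULL);
    pthread_mutex_init(&d.wait_lock, NULL);
    pthread_cond_init(&d.wait_queue, NULL);
    d.lookup_done = false;
    d.waiter_finished = false;

    pthread_mutex_lock(&d.i_sem);        /* lookup is entered under i_sem */
    if (pthread_create(&waiter, NULL, revalidate_waiter, &d) != 0) {
        pthread_mutex_unlock(&d.i_sem);
        return -1;
    }
    pthread_mutex_unlock(&d.i_sem);      /* lookup drops i_sem; the waiter
                                            may now grab it and go to sleep */

    /* Fixed order, step 1: wake all waiters while NOT holding i_sem. */
    pthread_mutex_lock(&d.wait_lock);
    d.lookup_done = true;
    pthread_cond_broadcast(&d.wait_queue);
    pthread_mutex_unlock(&d.wait_lock);

    /* Fixed order, step 2: only now re-acquire i_sem. Doing this before
     * the wake-up is the ordering that can hang when the waiter holds
     * i_sem. */
    pthread_mutex_lock(&d.i_sem);
    pthread_mutex_unlock(&d.i_sem);

    pthread_join(waiter, NULL);
    result = d.waiter_finished ? 1 : 0;

    pthread_cond_destroy(&d.wait_queue);
    pthread_mutex_destroy(&d.wait_lock);
    pthread_mutex_destroy(&d.i_sem);
    return result;
}
```

In this model, swapping the two steps lets the lookup side block on i_sem
while the waiter sleeps holding it, so neither side makes progress. With the
wake-up first, the sketch terminates whichever thread wins the race for i_sem.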

The fix has been tested on 2.4 (and is part of the unofficial Mandrake Club
kernel); I expected the same bug to be in 2.5; I just was too stupid to see
the way to reproduce it before.

Attached are the combined patch and the fix for the deadlock only (to show it
alone). Andrew, I slightly polished the original stack corruption version to
look more consistent with the rest of devfs; I also removed the NULL pointer
checks - let it just BUG in this case if it happens.

I have already sent the patch for 2.4 twice - please, could somebody finally
either apply it or explain what is wrong with it? Richard is apparently out of
reach, and the bug is real and seen by many people.



