Re: 2.6.4-rc1 oops on HPFS filesystem file rename

From: viro
Date: Sun Feb 29 2004 - 08:16:25 EST


On Sun, Feb 29, 2004 at 12:31:30PM +0100, Maurice van der Stee wrote:
> I think I tested most of the basic file system operations, rename,
> move, copy, edit, truncate etc. and found no problems with the second
> patch applied..

Actually, I've found a problem with the original - race between write_inode()
and cross-directory rename(), of all things. Mind testing the patchset on
ftp.linux.org.uk/pub/people/viro/HPFS*? First two patches were posted in
this thread, the rest goes on top of them.

Race in question is funny, BTW - write_inode() might have to find and
update directory entry in the parent, so it gets the parent inode and locks
it (that's what hpfs-private lock was for - it gave exclusion between
directory modifications and write_inode() directory entry update; directory
entries can move on insertion/removal, so we need to make sure that it
won't happen during that updating). The trouble being, it did not make
sure that we lock the right directory - if rename() had moved the file
in question while we were trying to lock the parent, we would end up with
very confused write_inode().

Similar race exists for unlink() vs. write_inode() - the latter checks
->i_nlink, but there is no warranty that it won't become 0 between the
check and locking the parent. Ditto for rmdir() and overwriting rename().

The fix is simple: we turn semaphore into sempahore + rwsem and
1) on mkdir/mknod/create/symlink grab rwsem on parent for writing
2) on rmdir/unlink/rename we grab semaphore on victim and object
being moved, then grab rwsem on parent(s) for writing.
3) on write_inode we grab semaphore on inode, then check that
it has non-zero ->i_nlink, get parent and grab rwsem on parent for reading.

Exclusion already provided by VFS makes order in rename() (there we might
get two semaphores and two rwsems) a non-issue - we already have all ->rename()
on given fs serialized. Order among semaphores and among rwsems, that is -
semaphores are always taken first.

Aside of that, patchset switches from use of iget() to use of iget_locked(),
which allows to kill the horrors in hpfs_lock_iget(), and does general
cleanup.

Unless there are complaints and bug reports, it will go to Linus in a couple
of days.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/