[PATCH]: autofs fix for 2.2

From: Kurt Garloff (garloff@suse.de)
Date: Tue Jul 25 2000 - 20:19:42 EST


Hi Alan, Al,

autofs in 2.2.1x shows a serious bug:

Imagine a setup with autofs and NFS. The users' home dirs live under an
automounted directory, and each is mounted individually via NFS when needed.
(Quite a standard setup, yes.)

If you ls a home dir which cannot be mounted due to an NFS failure, you get
a failure (-ENOENT) on the first attempt, and a success with an empty dir on
later attempts. mount and the autofs syslog show that the mount was not
successful and that the dir was rmdir()ed as it should be. You can even cd
to that dir.

You can provoke much worse behaviour: Just log in as such a user.
Boom! You'll end up with a number (twelve in my tests) of such ghost
directories. Of course, none is real.

On shutdown of automount, you can watch oopses in kmem_cache_free, called by
shrink_dcache_sb.

You guessed it, it's a dcache artefact. And, consequently, there is a
cleanup procedure: doing ls -R /usr clears the ghost directories, as the
dentry cache gets filled with other stuff.

Looking into it I found the following in fs/namei.c:
cached_lookup() calls d_revalidate() to check whether the cached dirs are
still valid. If not, it does a dput(), which actually results in
d_free()ing the dentries, as they are unhashed, and returns 0.
However, when cached_lookup() returns 0, real_lookup() is called. Here,
if the dir is found in the dcache after waiting on the dir semaphore,
no ->d_lookup() is done, but the contents are trusted. Even worse,
d_revalidate() is called, but the result is discarded. Uhh!

(And this explains why you only get the fatal error when logging in. The
 dir is then accessed many times, and sometimes you hit the race where the
 dentry was entered into the cache just while down() was waiting.)

This breaks autofs, as its revalidate() expects the caller to act on a
return value of 0. The attached patch does exactly this:
dput() and return ERR_PTR(-ENOENT);
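
To illustrate the idea, here is a small user-space toy model. It is not the
attached patch and not kernel code: every toy_* name is made up for this
sketch, the reference counting is reduced to a plain integer, and a small
result struct stands in for the kernel's ERR_PTR() convention. It only shows
the difference between discarding the revalidate result and acting on it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a dentry (hypothetical, not the kernel structure). */
struct toy_dentry {
    int refcount;   /* references currently held by callers */
    int valid;      /* what the filesystem's revalidate hook would report */
};

/* Lookup result: either a dentry or a negative errno, never both. */
struct toy_result {
    struct toy_dentry *dentry;
    int err;
};

/* Stand-in for the filesystem's revalidate hook: nonzero means "still valid". */
int toy_revalidate(const struct toy_dentry *d)
{
    return d->valid;
}

/* Stand-in for dput(): drop one reference. */
void toy_dput(struct toy_dentry *d)
{
    assert(d->refcount > 0);
    d->refcount--;
}

/* Behaviour described above: the dentry found in the cache is revalidated,
 * but the answer is thrown away and the dentry is returned regardless. */
struct toy_result toy_lookup_buggy(struct toy_dentry *cached)
{
    struct toy_result r = { cached, 0 };

    cached->refcount++;             /* reference taken by the cache lookup */
    (void)toy_revalidate(cached);   /* result discarded */
    return r;
}

/* Proposed behaviour: if revalidation fails, drop the reference and
 * report -ENOENT instead of handing out the stale dentry. */
struct toy_result toy_lookup_fixed(struct toy_dentry *cached)
{
    struct toy_result r = { cached, 0 };

    cached->refcount++;             /* reference taken by the cache lookup */
    if (!toy_revalidate(cached)) {
        toy_dput(cached);           /* give the reference back */
        r.dentry = NULL;
        r.err = -ENOENT;
    }
    return r;
}
```

With a stale entry (valid == 0), toy_lookup_buggy() hands the dentry back as
if nothing were wrong, which corresponds to the ghost directory, while
toy_lookup_fixed() returns -ENOENT and leaves the reference count unchanged.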

This fixes the trouble with autofs here. AFAICT, other FSes are unaffected.
(Are there braindead FSes out there always returning 0 on revalidate, but
 having mkdir()ed dirs residing in cache? Then you might find your new dir
 only on the second lookup(), if you started it before the mkdir(). That's
 the only issue I can currently think of. But, I'm not a dcache hacker...)

I'm awaiting your expert comments.

If it's OK, please apply!

Regards,

-- 
Kurt Garloff  <garloff@suse.de>                          Eindhoven, NL
GPG key: See mail header, key servers         Linux kernel development
SuSE GmbH, Nuernberg, FRG                               SCSI, Security
