Re: [PATCH v2 3/3] NFS: Fix a memory leak in nfs_readdir

From: Linus Torvalds
Date: Wed Dec 01 2010 - 18:32:11 EST


On Wed, Dec 1, 2010 at 2:38 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> OK, the stop_machine() plugs a lot of potential race-vs-module-unload
> things.  But Trond is referring to races against vmscan inode reclaim,
> unmount, etc.

So?

A filesystem module cannot be unloaded while it's still mounted.

And unmount doesn't succeed until all inodes are gone.

And getting rid of an inode doesn't succeed until all pages associated
with it are gone.

And getting rid of the pages involves locking them (whether in
truncate or vmscan) and removing them from all lists.

Ergo: vmscan holding a locked page guarantees that the filesystem
cannot be unmounted. And that, in turn, guarantees that the module
won't be unloaded until the machine has gone through an idle cycle.
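
As a rough illustration of that chain, here is a toy userspace model.
It is not kernel code: every toy_* name is hypothetical, and it returns
failure in places where the real kernel would wait for the page lock
instead. All it shows is that while anybody holds a page lock, unmount
cannot complete, and so module unload cannot either.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: all names are hypothetical. */
struct toy_page   { bool locked; bool attached; };
struct toy_inode  { struct toy_page *pages; size_t npages; bool live; };
struct toy_fs     { struct toy_inode *inodes; size_t ninodes; bool mounted; };
struct toy_module { struct toy_fs *fs; bool loaded; };

/* Truncate and reclaim both have to take the page lock first. */
bool toy_page_trylock(struct toy_page *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

void toy_page_unlock(struct toy_page *p)
{
	assert(p->locked);
	p->locked = false;
}

/* An inode goes away only after every page is locked and detached. */
bool toy_evict_inode(struct toy_inode *inode)
{
	for (size_t i = 0; i < inode->npages; i++) {
		struct toy_page *p = &inode->pages[i];

		if (!p->attached)
			continue;
		if (!toy_page_trylock(p))
			return false;	/* somebody else holds the page */
		p->attached = false;
		toy_page_unlock(p);
	}
	inode->live = false;
	return true;
}

/* Unmount succeeds only once all inodes are gone. */
bool toy_unmount(struct toy_fs *fs)
{
	for (size_t i = 0; i < fs->ninodes; i++) {
		if (fs->inodes[i].live && !toy_evict_inode(&fs->inodes[i]))
			return false;
	}
	fs->mounted = false;
	return true;
}

/* A filesystem module cannot be unloaded while still mounted. */
bool toy_unload_module(struct toy_module *m)
{
	if (m->fs->mounted)
		return false;
	m->loaded = false;
	return true;
}
```

Lock one page on behalf of a pretend "vmscan" and both toy_unmount()
and toy_unload_module() fail. Unlock it and both go through, in that
order. No per-page module reference is needed anywhere.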

It really is that simple. There's nothing subtle there. The reason
spin_unlock(&mapping->tree_lock) is safe is exactly the above trivial
chain of dependencies. And it's also exactly why
mapping->a_ops->freepage() would also be safe.

This is pretty much how all the module races are handled. Doing module
ref-counts per page (or per packet in flight for things like
networking) would be prohibitively expensive. There's no way we can
ever do that.

Linus