Re: Revoking filesystems [was Re: Sysfs attributes racing with unregistration]

From: Eric W. Biederman
Date: Thu Jan 05 2012 - 15:41:00 EST


Tejun Heo <tj@xxxxxxxxxx> writes:

> Hello, Ted.
>
> On Thu, Jan 05, 2012 at 01:27:52PM -0500, Ted Ts'o wrote:
>> So it's really more of a filesystem force-umount method. I could
>> imagine that this could also be used to extend the functionality of
>> umount(2) so that the MNT_FORCE flag could be used with non-NFS file
>> systems as well as NFS file systems.
>
> I think these are two separate mechanisms. Filesystems need to be
> able to handle IO errors no matter what and underlying device going
> away is the same situation. There's no reason to mix that with force
> unmount. That's a separate feature and whether to force unmount
> filesystem on device removal or permanent failure is a policy decision
> which belongs to userland - ie. if such behavior is desired, it should
> be implemented via udev/udisk instead of hard coded logic in kernel.
>
> I don't know enough to decide whether such forced unmount is a useful
> feature tho. It can be neat for development but is there any real
> necessity for the feature?
>
>> [1] Interesting question: do we convert an mmap region to an anonymous
>> region and perhaps notify the user out of band this has happened? Or
>> do we just make the mapping disappear and nuke the process with a SEGV
>> if it attempts to access it?
>
> FWIW, I vote for SIGBUS similarly to the way we handle mmap
> vs. truncate.

Agreed. SIGBUS is documented as the mapping exists but the backing
store has gone away, which seems to describe hotunplug very well.
Additionally we already do this for sysfs and it works well.
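
For reference, the mmap vs. truncate behaviour can be seen from
userspace with a small test. This is only a sketch: the function name
sigbus_after_truncate() and the temporary file path are made up for
illustration, and it shows the existing truncate case, not any new
hotunplug code.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf bus_jmp;

static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(bus_jmp, 1);	/* unwind out of the faulting access */
}

/*
 * Hypothetical helper: returns 1 if touching the mapping after the
 * file was truncated raised SIGBUS, 0 if the access succeeded, and
 * -1 if the setup failed.
 */
int sigbus_after_truncate(void)
{
	char path[] = "/tmp/sigbus-demo-XXXXXX";	/* illustrative path */
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	volatile int result = -1;
	char *map;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, page) != 0) {
		close(fd);
		return -1;
	}
	map = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGBUS, &sa, &old);

	/* The mapping still exists, but its backing store is now gone. */
	if (ftruncate(fd, 0) == 0) {
		if (sigsetjmp(bus_jmp, 1) == 0) {
			(void)*(volatile char *)map;	/* faults past EOF */
			result = 0;
		} else {
			result = 1;
		}
	}

	sigaction(SIGBUS, &old, NULL);
	munmap(map, page);
	close(fd);
	return result;
}
```

On Linux I would expect this to return 1, since the page being read
lies entirely beyond the new end of the file.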

So it appears that on a hotunplug it is desirable to wake all poll
waiters of a filesystem, invalidate all mmaps, and probably notify
all inotify watchers. And in general scream to userspace that the
filesystem is gone and should be left alone.
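
From the userspace side, a woken poll waiter would see an error or
hangup condition and stop using the descriptor. The sketch below uses
a pipe whose writer has gone away as a stand-in, because which events
a revoked filesystem would actually report is not settled here. The
names fd_is_gone() and poll_sees_hangup() are hypothetical.

```c
#include <poll.h>
#include <unistd.h>

/* Returns 1 if revents says the descriptor should be abandoned. */
int fd_is_gone(short revents)
{
	return (revents & (POLLERR | POLLHUP | POLLNVAL)) != 0;
}

/*
 * Stand-in for a waiter on a vanished filesystem: poll the read end
 * of a pipe after its only writer has been closed. Returns 1 if poll
 * reported the descriptor as gone, 0 if not, -1 on setup failure.
 */
int poll_sees_hangup(void)
{
	struct pollfd pfd;
	int p[2];
	int n, gone;

	if (pipe(p) != 0)
		return -1;
	close(p[1]);		/* the other side goes away */

	pfd.fd = p[0];
	pfd.events = POLLIN;
	pfd.revents = 0;
	n = poll(&pfd, 1, 1000);	/* returns at once, no data needed */

	gone = (n == 1) && fd_is_gone(pfd.revents);
	close(p[0]);
	return gone;
}
```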

That does require a notification from the block device going away
to the filesystem. Tejun, is there an existing mechanism that we
can plug into, or do we need to implement something new?

Ted, we can scream that the filesystem is going away without freeing
all of the filesystem data structures. To userspace there would
effectively be no difference, but internal to the kernel it should
allow us to skip the expensive logic of tracking every time a filesystem
method is invoked, so we do not penalize the fast path.
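
One way to picture the trade-off, as a userspace sketch with C11
atomics rather than kernel code: everything named demo_* below is
hypothetical and is not an existing kernel interface or the eventual
implementation. The flagged variant only reads a "dead" flag and
leaves the structures allocated; the tracked variant counts every
method entry and exit so that a revoker could wait for the count to
drain before freeing anything.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-filesystem state. */
struct demo_fs {
	atomic_bool dead;	/* set once when the device goes away */
	atomic_int active;	/* used only by the tracked variant */
	int data;
};

/* Cheap variant: one flag load per call, nothing is ever freed. */
int demo_read_flagged(struct demo_fs *fs)
{
	if (atomic_load_explicit(&fs->dead, memory_order_acquire))
		return -EIO;
	return fs->data;
}

/*
 * Tracked variant: two atomic read-modify-write operations on every
 * call, which is the fast path cost being avoided above.
 */
int demo_read_tracked(struct demo_fs *fs)
{
	int ret;

	atomic_fetch_add(&fs->active, 1);
	ret = atomic_load(&fs->dead) ? -EIO : fs->data;
	atomic_fetch_sub(&fs->active, 1);
	return ret;
}

/* Mark the filesystem as gone; later calls fail with -EIO. */
void demo_mark_dead(struct demo_fs *fs)
{
	atomic_store_explicit(&fs->dead, true, memory_order_release);
}
```

Both variants look the same to a caller; the difference is only in
what each call costs and in whether the structures can be freed
safely afterwards.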

If I don't have to provide a zero cost ability to track which filesystem
methods are active at any given time, I think I can whip up something
that is usable in a couple of days.

Eric