Re: Flames over -- Re: Which is simpler?

From: Phillip Susi
Date: Tue Feb 14 2006 - 22:07:23 EST

Next message: Nikolay N. Ivanov: "Re: 2.6.15.x - very slow disk-writing"
Previous message: Andrew Morton: "Re: 2.6.16-rc3-mm1"
In reply to: Kyle Moffett: "Re: Flames over -- Re: Which is simpler?"
Next in thread: Olivier Galibert: "Re: Flames over -- Re: Which is simpler?"

Kyle Moffett wrote:

On Feb 14, 2006, at 16:13, Phillip Susi wrote:
Because you cannot go yanking devices out from under the kernel without its knowledge or consent. This is no more acceptable than ejecting a floppy without first unmounting it; the only difference is that the floppy drive doesn't erroneously inform the kernel that you have done this simply because you suspend.

Yes, but causing it to overwrite data over random blockdevs when the unexpected happens is _not_ OK.

Why not? That is exactly what happens if you suspend and swap floppies. Just don't do that and you're fine. It is nice if you can handle the swapped case more gracefully, but not at the cost of data loss in the _normal_ case.

You've obviously never administered any kind of large-scale Linux lab. We had so many people just saving files on their USB sticks and pulling them that we would daily get people reporting that they couldn't _mount_ their USB stick because the last user hadn't unmounted it first. At one point we had to write a Perl script run from cron that ran every 10 minutes and verified whether USB-stick mounts were still good, and if not it

This is a very good argument for _correctly_ detecting device removal. I am not arguing against that. The problem is with _incorrectly_ detecting device removal.

forcefully unmounted them. Now obviously this is the kind of behavior that we want to avoid, but end-users are bound to do stupid stuff until it comes back and bites them at least twice (except on Linux with -o sync mounts it usually doesn't). Think about how most Linux distros automatically turn on -o sync on USB keys and the like; they do it for precisely this reason.
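The periodic check described above could look roughly like the sketch below. It is written in Python rather than the Perl the lab actually used, and it is not that script: the function name stale_mounts, the /dev/sd prefix, and the idea that a pulled stick shows up as a missing device node are all assumptions made for illustration. It also ignores the octal escapes that /proc/mounts uses for spaces in paths.

```python
def stale_mounts(mounts_text, device_exists, prefix="/dev/sd"):
    """Return mount points whose backing device node is gone.

    mounts_text:   contents in /proc/mounts format
                   (device, mount point, fstype, options, dump, pass)
    device_exists: callable taking a device path and returning True/False
                   (hypothetical hook; os.path.exists is one candidate)
    prefix:        only devices starting with this string are checked
                   (assumed naming for USB storage)
    """
    stale = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        device, mount_point = fields[0], fields[1]
        if device.startswith(prefix) and not device_exists(device):
            stale.append(mount_point)
    return stale
```

A cron job could feed it the contents of /proc/mounts with os.path.exists as the check, then unmount whatever it returns. Note that this only covers the case where removal is detected correctly; it says nothing about a device that merely appears to vanish across a suspend.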

Some distros do. Some do not. I side with those who do not, because doing so promotes stupid users (they won't learn until it bites them twice) and because it burns out the flash memory 10 times faster and slows down access.

I described a workable method to handle root-on-USB (and I'm not sure, but I doubt it works now). You would have early userspace find your USB filesystems again and pass device information to the resuming kernel, which would then restore RAM and then use the passed information to attach its filesystems again.

There is currently no such method of reconnecting a filesystem to a device once the device has been broken by an eject, is there? Also that amounts to much the same thing as I have been saying all along: the system should check to see if the device that is there now is the same as the one that was there before, and if so, resume using it.
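To make the "is it the same device?" check concrete, here is a minimal sketch in Python. Nothing in it is an existing kernel or userspace interface: DeviceIdentity, its fields (serial number, filesystem UUID, size), and devices_to_resume are hypothetical names, and which identifiers are actually trustworthy is an open assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceIdentity:
    """Facts recorded about a block device before suspend (hypothetical fields)."""
    serial: str       # hardware serial number reported by the device
    fs_uuid: str      # UUID stored in the filesystem superblock
    size_bytes: int   # total capacity in bytes


def same_device(before, after):
    """Return True only if every recorded identifier still matches."""
    return before == after


def devices_to_resume(saved, present):
    """Map each saved mount point to a present device node with the same identity.

    saved:   dict of mount point -> DeviceIdentity recorded before suspend
    present: dict of device node -> DeviceIdentity probed after resume
    Mount points with no matching device are left out, so the caller can
    treat those as genuinely removed.
    """
    resumed = {}
    for mount_point, identity in saved.items():
        for node, candidate in present.items():
            if same_device(identity, candidate):
                resumed[mount_point] = node
                break
    return resumed
```

A check like this would not notice media that was modified on another machine during the suspend, since the serial number, UUID, and size would all still match.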

One other alternative that just occurs to me now is to have a special stackable filesystem that can suspend all IO and unmount the filesystem underneath it, then have the filesystem be easily reattachable later on (the stackable layer would look up paths again on remount). You would have an mlocked program start during suspend that would create a new namespace with a basic tmpfs, procfs, and sysfs. It would be able to rescan all removable devices on resume and reattach them to the stackable mounts in the primary namespace through /proc/<pid>/root. This seems to me to be the most race-free and most flexible solution. It will even handle network suspend and resume without too much extra effort.

That seems rather complex with little benefit. The only advantage that would have over not breaking the mount in the first place is in the pedantic case where the user screws with the media during suspend. Seems like a high price to pay to cleanly deal with a pedantic error condition.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
