Re: Resend: Another 4.4 to 4.5 floppy issue

From: Mark Hounschell
Date: Wed Jul 13 2016 - 08:15:08 EST


On 07/12/2016 04:54 AM, Jiri Kosina wrote:
On Mon, 11 Jul 2016, Mark Hounschell wrote:

Well, all that was specified in my original post. I can no longer open the
floppy drive with no floppy media inserted. Worse, I can also no longer open a
floppy with media inserted that is not a "linux" recognized format. A floppy
drive is a removable media device and should be treated as such. The original
implementation of the O_NDELAY flag allowed it to be.

Any removable media device should be capable of being opened with no, or even
unrecognizable, media installed. The kernel and its utilities should not
"assume" too much when it comes to removable media. Consider a SCSI tape drive
or even a removable media SCSI disk drive. How would you explain an open
failure to someone trying to open a SCSI tape drive that had no tape or even a
"non-tar" formatted tape media in it???
Or better yet, trying to open a removable media device that was write protected
but didn't include O_RDONLY in the open?

Alright, so you are basically supplementing the O_NDELAY flag in order to
avoid check_disk_change() being called. It's rather a coincidence that it
has worked this way, but I agree with you that we can't ignore the fact
that there is userspace relying on this behavior.


I'm not supplementing anything. The driver _did_ this on its own. I just expect to be able to open the drive and get a handle without the kernel attempting to access the media. My apps manage disk changes on their own. I don't think it's check_disk_change() that causes my pain. Some probe fails when the inserted floppy is not in a "standard" format, and that makes the open fail, which is the biggest pain. Still, I should be able to get a handle with no media, or even unrecognized media, installed.
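
To make the expectation concrete, here is a minimal sketch of the kind of open described above. It is written in Python for brevity and is only an illustration, not code from any of the apps mentioned in this thread. The helper name open_drive_handle and the default path /dev/fd0 are assumptions chosen for the example. The sketch only shows how the flag is passed; whether the driver then touches the media is decided by the kernel, which is exactly what changed between 4.4 and 4.5.

```python
import errno
import os


def open_drive_handle(path="/dev/fd0"):
    """Try to get a file descriptor for a removable-media device.

    Returns (fd, None) on success or (None, errno_name) on failure.
    The function name and the default path are illustrative assumptions.
    """
    # O_NDELAY requests a non-blocking open. According to this thread,
    # the floppy driver before 4.5 also skipped touching the media
    # when this flag was set.
    flags = os.O_RDONLY | os.O_NDELAY
    try:
        return os.open(path, flags), None
    except OSError as exc:
        # Report the symbolic errno name (for example ENXIO or ENOENT).
        return None, errno.errorcode.get(exc.errno, str(exc.errno))


if __name__ == "__main__":
    fd, err = open_drive_handle()
    if fd is None:
        print(f"open failed: {err}")
    else:
        print("got a handle")
        os.close(fd)
```

With the behavior I am asking for, the call would return a descriptor even with an empty drive, and the application would then check for media on its own.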

Funny, though: even fdformat from the util-linux package won't let me format a floppy that is NOT already formatted in a supported format. Once I format a floppy to a non-standard format, fdformat will not let me reformat it back to a standard one. Doesn't make much sense, does it? "Unable to format a floppy that is not already formatted??" That is another issue though.

The original behavior of the floppy driver was correct. I have no idea
what BUG these changes were supposed to fix but the "fix" obviously
broke user land. Was this bug reported by some new ROBOT test or
something? The kernel floppy driver has been stable for years now

That's not really true; the code is a racy mess, and this was uncovered
only once virtualized floppy devices started to exist (because they are
much faster than real hardware, and the different timing reveals bugs
that were not visible before).


Forgive me here, as I'm ignorant about why any virtualized floppy would require the real physical kernel floppy driver to be involved at all. We also do virtualized floppies in our user land apps, but we certainly don't require any kernel floppy driver support to do it.

This particular fix was made because syzkaller found a way to easily corrupt
kernel memory by passing O_NDELAY to the floppy driver; see

https://lkml.org/lkml/2016/2/2/848

so I am really confused as to why these changes were introduced.

The floppy driver is in an orphan mode; no new "features" are added "just
because". Everything that's happening there is to fix real bugs in the
kernel.

I'll look into ways to fix this, but I am afraid this is going to be
really tricky. Therefore we'd very likely have to proceed asap with a revert
of 09954bad448 and come up with a workaround that'd still avoid the bug
reported by syzkaller.


I would be happy to do some testing for you if needed. At least with regard to our apps.

Regards
Mark