Regression in 5.17-rc1 on pata-falcon (was: Re: [PATCH] m68k: mm: Remove check for VM_IO to fix deferred I/O)

From: Michael Schmitz
Date: Fri Feb 04 2022 - 19:04:26 EST


Hi Jens,

commit 180dccb0dba4f5e84a4a70c1be1d34cbb6528b32 (blk-mq: fix tag_get wait task can't be awakened) does cause a regression on my m68k hardware test rig (m68k Falcon030, IDE disk attached through pata-falcon driver which does use polled IO instead of interrupts, so may be a little on the slow side).

Normally it takes 8 minutes for my system to boot to the point where the network driver is loaded, and another 10 minutes before I can ssh into the box, all the while with IO activity on the disk as seen from the disk activity LED. Since v5.17-rc1, the boot takes a few hours to complete, with IO activity only very rarely seen.

In the one case where I could log in remotely, I had to abort the attempted reboot after another few hours.

This problem occurs only on real hardware, and isn't seen on e.g. ARAnyM which is frequently used to test changes.

Bisection between v5.16 and v5.17-rc1 points to 180dccb0dba4f5e84a4a70c1be1d34cbb6528b32 as the culprit. This is corroborated by reverting that commit in v5.17-rc1, after which the system boots as rapidly as before.

I don't pretend to understand the purpose of the problematic commit, and cannot spot anything glaringly obvious with the change in logic in e.g. __blk_mq_tag_idle(). If there's anything you'd like me to test that could make that commit work for my unusual set-up, I'd be happy to help.

Cheers,

Michael


On 30.01.2022 at 19:57, Michael Schmitz wrote:
Hi Geert,

On 30.01.2022 at 13:32, Michael Schmitz wrote:
Hi Geert,

testing this patch on my Falcon 030, I'm seeing a weird error checking
and mounting the root filesystem (pata-falcon). The system appears to
sit idle, never completing the journal recovery and mount. Still
investigating that.

Belay that - not related to your patch, must be some other regression
since v5.16 that I'm seeing there.

Just ignore the noise ...

Cheers,

Michael


Can't see how that would be caused by your patch, just saying I could
not yet test it.

Cheers,

Michael


On 29.01.2022 at 06:30, Geert Uytterhoeven wrote:
When an application accesses a mapped frame buffer backed by deferred
I/O, it receives a segmentation fault. Fix this by removing the check
for VM_IO in do_page_fault().

Signed-off-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
---
This check was never present in a fault handler on any other
architecture than m68k.
Some digging revealed that it was added in v2.1.106, but I couldn't find
an email with a patch adding it. That same kernel version extended the
use of the hwreg_present() helper to HP9000/300, so the check might have
been needed there, perhaps only during development?
The Atari kernel relies heavily on hwreg_present() (both the success and
failure cases), and these still work, at least on ARAnyM.
---
arch/m68k/mm/fault.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index 1493cf5eac1e7a39..71aa9f6315dc8028 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -93,8 +93,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	vma = find_vma(mm, address);
 	if (!vma)
 		goto map_err;
-	if (vma->vm_flags & VM_IO)
-		goto acc_err;
 	if (vma->vm_start <= address)
 		goto good_area;
 	if (!(vma->vm_flags & VM_GROWSDOWN))
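
For readers following along, the effect of the hunk above can be modelled in user space. This is only a sketch, not kernel code: struct vma_model, classify_fault() and the check_vm_io switch are hypothetical names, the VM_IO value is illustrative, and anything that happens after the quoted hunk is reported as BEYOND_HUNK rather than guessed at.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; an assumption, not taken from the patch. */
#define VM_IO 0x4000UL

/* Hypothetical, stripped-down stand-in for the kernel's VMA structure. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_flags;
};

/* Outcomes corresponding to the labels visible in the quoted hunk. */
enum fault_path { MAP_ERR, ACC_ERR, GOOD_AREA, BEYOND_HUNK };

/*
 * Model of the checks in the quoted hunk.
 * check_vm_io == true  : behaviour before the patch
 * check_vm_io == false : behaviour after the patch
 */
static enum fault_path classify_fault(const struct vma_model *vma,
				      unsigned long address,
				      bool check_vm_io)
{
	if (!vma)
		return MAP_ERR;
	/* The two lines removed by the patch. */
	if (check_vm_io && (vma->vm_flags & VM_IO))
		return ACC_ERR;
	if (vma->vm_start <= address)
		return GOOD_AREA;
	/* The VM_GROWSDOWN handling continues past the quoted hunk. */
	return BEYOND_HUNK;
}
```

With check_vm_io set, an access to a mapping that carries VM_IO ends up in ACC_ERR, which matches the segmentation fault described in the patch; with it cleared, the same access reaches GOOD_AREA.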