Re: INFO: rcu detected stall in ext4_file_write_iter

From: Theodore Y. Ts'o
Date: Wed Feb 27 2019 - 16:58:13 EST


On Wed, Feb 27, 2019 at 10:58:50AM +0100, Dmitry Vyukov wrote:
> Peter, Ingo, do you have any updates on the
> perf_event_open/sched_setattr stalls? This bug causes assorted hangs
> throughout the kernel and so is nasty.
>
> syzkaller tries to remove all syscalls from reproducers one-by-one.
> Somehow without sched_setattr the hang did not reproduce (a bunch of
> repros have perf_event_open+sched_setattr so somehow they seem to be
> related)

FWIW, at least for me, the repro.c with sched_setattr commented out
(see the repro.c attached to a message[1] earlier in the thread) was
reproducing the hang reliably on a 2 CPU, 2 GB memory KVM guest using
the ext4.git tree (dev branch, 5.0-rc3 plus ext4 commits for the next
merge window) and a Debian stable-based VM image[2].

[1] https://groups.google.com/d/msg/syzkaller-bugs/ByPpM3WZw1s/li7SsaEyAgAJ
[2] https://mirrors.edge.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests/root_fs.img.amd64

> But even with perfect repros machines still won't be
> able to tell in all cases that even though the hang happened in ext4
> code, the root cause is actually another scheduler-related system
> call. So thanks for looking into this.

To be clear, there was *not* a scheduler-related system call in the
repro.c I was playing with (see [1]); just perf_event_open(2) and
sendfile(2).

Cheers,

- Ted