Re: exit_aio() hang after I/O failure

From: Bart Van Assche
Date: Sat Feb 11 2012 - 13:31:01 EST


On Mon, Jan 23, 2012 at 4:47 PM, Bart Van Assche <bvanassche@xxxxxxx> wrote:
> On Mon, Jan 23, 2012 at 4:15 PM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
> > Bart Van Assche <bvanassche@xxxxxxx> writes:
> > > Apparently processes can hang in exit_aio() with at least kernel 3.2.1
> > > after an I/O failure. Has anyone seen this before?
> > >
> > > This occurred after a SCSI device had been removed entirely (and hence
> > > after all I/O requests were killed by scsi_remove_host()).
> >
> > Fixed here:
> >
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=69e4747ee9727d660b88d7e1efe0f4afcb35db1b
>
> Thanks a lot for the feedback - I'll give this patch a try.

Bad news: I've been able to reproduce exactly the same call stack with
kernel 3.2.5, which already includes the aforementioned commit
(69e4747ee972), so that fix does not appear to be sufficient for this case.

# echo t >/proc/sysrq-trigger
[ ... ]
fio D 0000000000000001 0 25052 25008 0x00000004
ffff88001c32fb88 0000000000000046 ffff880000000000 ffff88007d949bc8
ffff88001c8e14d0 ffff88001c32ffd8 ffff88001c32ffd8 ffff88001c32ffd8
ffff880128b894d0 ffff88001c8e14d0 ffff88001c32fb88 000000018106f24d
Call Trace:
[<ffffffff813b683f>] schedule+0x3f/0x60
[<ffffffff813b68ef>] io_schedule+0x8f/0xd0
[<ffffffff81174410>] wait_for_all_aios+0xc0/0x100
[<ffffffff8103c3c0>] ? try_to_wake_up+0x270/0x270
[<ffffffff81175385>] exit_aio+0x55/0xc0
[<ffffffff810413cd>] mmput+0x2d/0x110
[<ffffffff81047c1d>] exit_mm+0x10d/0x130
[<ffffffff810482b1>] do_exit+0x671/0x860
[<ffffffff81033c1e>] ? finish_task_switch+0x4e/0xe0
[<ffffffff81048804>] do_group_exit+0x44/0xb0
[<ffffffff81058018>] get_signal_to_deliver+0x218/0x5a0
[<ffffffff81002065>] do_signal+0x65/0x700
[<ffffffff811740d0>] ? aio_read_evt+0x150/0x150
[<ffffffff8103c3c0>] ? try_to_wake_up+0x270/0x270
[<ffffffff81002785>] do_notify_resume+0x65/0x80
[<ffffffff811df84e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[<ffffffff813c0333>] int_signal+0x12/0x17
[ ... ]
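
For anyone reading the trace above: entries marked with "?" are addresses
that were found on the stack but not confirmed as part of the call chain,
so they can be skipped when following the hang. Below is a minimal sketch
in Python that filters them out of pasted sysrq-t output. The helper name
reliable_frames() and the regular expression are my own and not part of any
kernel tool; the sample input is copied from the trace above.

```python
import re

# One frame line of a kernel call trace looks like either of:
#   [<ffffffff81175385>] exit_aio+0x55/0xc0
#   [<ffffffff8103c3c0>] ? try_to_wake_up+0x270/0x270
# Group 1 captures the optional "?" marker, group 2 the symbol name.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def reliable_frames(trace):
    """Return the symbol names of frames not marked '?', innermost first.

    Hypothetical helper for illustration only.
    """
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m and m.group(1) is None:
            frames.append(m.group(2))
    return frames

# Excerpt of the trace posted above.
TRACE = """\
[<ffffffff813b683f>] schedule+0x3f/0x60
[<ffffffff813b68ef>] io_schedule+0x8f/0xd0
[<ffffffff81174410>] wait_for_all_aios+0xc0/0x100
[<ffffffff8103c3c0>] ? try_to_wake_up+0x270/0x270
[<ffffffff81175385>] exit_aio+0x55/0xc0
[<ffffffff810413cd>] mmput+0x2d/0x110
[<ffffffff81047c1d>] exit_mm+0x10d/0x130
[<ffffffff810482b1>] do_exit+0x671/0x860
"""

if __name__ == "__main__":
    # Print the confirmed call chain, innermost frame first.
    print(" <- ".join(reliable_frames(TRACE)))
```

Read that way, the task is sleeping in wait_for_all_aios(), called from
exit_aio() via mmput() while the process is exiting.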

Bart.