Re: Bug: fio traps into kernel without exiting because futex has a deadloop

From: Zhang, Yanmin
Date: Thu Jun 11 2009 - 02:05:43 EST


On Wed, 2009-06-10 at 22:58 -0700, Darren Hart wrote:
> Zhang, Yanmin wrote:
>
> Hi Zhang,
>
> > I investigated a fio hang issue. When I run fio multi-process
> > testing on many disks, fio traps into the kernel and doesn't exit
> > (mostly hit once after running sub test cases hundreds of times).
> >
> > Oprofile data shows the kernel spending its time in some futex functions.
> > The kill command couldn't kill the process, and a machine reboot also hangs.
> >
> > Eventually, I located the root cause as a bug in futex. The kernel enters
> > a deadloop between 'retry' and 'goto retry' in function futex_wake_op.
> > For an unknown reason (might be an issue of fio or glibc), parameter uaddr2
> > points to an area which is READONLY. So futex_atomic_op_inuser returns
> > -EFAULT when trying to change the data at uaddr2, but the later get_user
> > still succeeds because the area is READONLY and so can still be read.
> > Then it goes back to retry.
> >
> > I created a simple test case to trigger it, which just uses shmat to
> > attach a READONLY area as address uaddr2.
> >
> > It could be used as a DoS attack.
>
> Nice work on the diagnosis. I recall discussing something like this a
> couple weeks back. I thought this was fixed with a patch to ensure the
> pages were writable. Cc'ing Thomas G. to confirm.

> I didn't see a
> kernel version in your report, what are you running?
2.6.30-rc1 through 2.6.30-rc8.

