Re: bad page state in 3.13-rc4

From: Dave Jones
Date: Thu Dec 19 2013 - 20:01:29 EST


On Thu, Dec 19, 2013 at 06:38:54PM -0500, Benjamin LaHaise wrote:
> On Thu, Dec 19, 2013 at 03:24:16PM -0500, Dave Jones wrote:
> > Yes. Note the original trace in this thread was a VM_BUG_ON(atomic_read(&page->_count) <= 0);
> >
> > Right after these crashes btw, the box locks up solid. So bad that traces don't
> > always make it over usb-serial. Annoying.
>
> I think I finally have an idea what's going on now. Kent's changes in
> e34ecee2ae791df674dfb466ce40692ca6218e43 are broken and result in a memory
> leak of the aio kioctx. This eventually leads to the system running out of
> memory, which ends up triggering the otherwise hard to hit error paths in
> aio_setup_ring(). Linus' suggested changes should fix the badness in
> aio_setup_ring(), but more work has to be done to fix up the percpu
> reference counting tie-in with the aio code. I'll fix this up in the
> morning if nobody beats me to it overnight, as I'm just heading out right
> now.

That would explain why I'm having difficulty repeating it in a hurry if it
takes hours of runtime for the leak to reach a point where it becomes a problem.

Dave