Re: [PATCH] mm/memory-failure.c: fix memory leak by race between poison and unpoison

From: Andrew Morton
Date: Wed May 14 2014 - 18:10:42 EST


On Wed, 14 May 2014 11:21:31 -0400 Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> wrote:

> When a memory error happens on an in-use page or (free and in-use) hugepage,
> the victim page is isolated with its refcount set to one. When you try to
> unpoison it later, unpoison_memory() calls put_page() for it twice in order to
> bring the page back to the free page pool (buddy or free hugepage list).
> However, if another memory error occurs on the page which we are unpoisoning,
> memory_failure() returns without releasing the refcount which it took earlier
> in the same call, which results in a memory leak and inconsistent
> num_poisoned_pages statistics. This patch fixes it.
>
> ...
>
> --- next-20140512.orig/mm/memory-failure.c
> +++ next-20140512/mm/memory-failure.c
> @@ -1153,6 +1153,8 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
>  	 */
>  	if (!PageHWPoison(p)) {
>  		printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
> +		atomic_long_sub(nr_pages, &num_poisoned_pages);
> +		put_page(hpage);
>  		res = 0;
>  		goto out;
>  	}
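
For readers following along, here is a minimal user-space sketch of the accounting problem the changelog describes. It is not kernel code: struct toy_page, toy_num_poisoned and toy_memory_failure() are hypothetical stand-ins, and the model assumes (as the patch implies) that the reference and the counter increment both happen earlier in the same call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct toy_page {
	int refcount;
	bool hwpoison;
};

/* Hypothetical stand-in for num_poisoned_pages. */
long toy_num_poisoned;

/*
 * Toy model of the "just unpoisoned" early exit.
 * unpoison_races: a concurrent unpoison clears the flag mid-call.
 * apply_fix:      undo the counter and refcount before returning,
 *                 as the two added lines in the patch do.
 */
int toy_memory_failure(struct toy_page *p, long nr_pages,
		       bool unpoison_races, bool apply_fix)
{
	p->hwpoison = true;
	toy_num_poisoned += nr_pages;	/* accounted up front (assumption) */
	p->refcount++;			/* reference taken by this call */

	if (unpoison_races)
		p->hwpoison = false;	/* stand-in for the racing unpoison */

	if (!p->hwpoison) {
		if (apply_fix) {
			toy_num_poisoned -= nr_pages;
			p->refcount--;
		}
		return 0;
	}

	/* normal error handling elided in this sketch */
	return 0;
}
```

With unpoison_races set and apply_fix clear, the toy page keeps the extra reference and the counter stays one too high, which is the leak described above. With apply_fix set, both return to their values from before the call.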

Looking at the surrounding code...

	/*
	 * Lock the page and wait for writeback to finish.
	 * It's very difficult to mess with pages currently under IO
	 * and in many cases impossible, so we just avoid it here.
	 */
	lock_page(hpage);


lock_page() doesn't wait for writeback to finish -
wait_on_page_writeback() does that. Either the code or the comment
could do with fixing.
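
To illustrate the distinction, here is a rough user-space sketch. The toy_* names are hypothetical and only model the two steps as flags; the real lock_page() and wait_on_page_writeback() sleep rather than flip a bit, and the sketch says nothing about where memory_failure() should call them.

```c
#include <stdbool.h>

/* Hypothetical model of the two page states discussed; not struct page. */
struct toy_io_page {
	bool locked;
	bool writeback;
};

/* Models lock_page(): takes the page lock, leaves writeback alone. */
void toy_lock_page(struct toy_io_page *p)
{
	p->locked = true;
}

/* Models wait_on_page_writeback(): returns once writeback is done. */
void toy_wait_on_page_writeback(struct toy_io_page *p)
{
	p->writeback = false;	/* stand-in for sleeping until IO completes */
}

/* What the comment describes: lock, then wait for writeback. */
void toy_lock_and_wait(struct toy_io_page *p)
{
	toy_lock_page(p);
	toy_wait_on_page_writeback(p);
}
```

After toy_lock_page() alone the page is still under writeback; only the combined helper matches what the comment says. So either the comment should be moved to the point where the wait actually happens, or the wait should be added next to the lock.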

