Re: [PATCH 3/6] mm/hwpoison: fix num_poisoned_pages error statistics for thp

From: Naoya Horiguchi
Date: Thu Aug 22 2013 - 23:28:10 EST


Hi Wanpeng,

On Fri, Aug 23, 2013 at 07:52:40AM +0800, Wanpeng Li wrote:
> Hi Naoya,
> On Thu, Aug 22, 2013 at 12:43:08PM -0400, Naoya Horiguchi wrote:
> >On Thu, Aug 22, 2013 at 05:48:24PM +0800, Wanpeng Li wrote:
> >> There is a race between hwpoisoning and unpoisoning a page. memory_failure
> >> sets the page hwpoison and increases num_poisoned_pages without holding the
> >> page lock, and one page is accounted against num_poisoned_pages for a thp.
> >> However, unpoison can occur before memory_failure takes the page lock and
> >> splits the transparent hugepage. In that case unpoison decreases
> >> num_poisoned_pages by 1 << compound_order, since memory_failure has not yet
> >> split the transparent hugepage with the page lock held. That means we
> >> account one page for hwpoison and 1 << compound_order for unpoison. This
> >> patch fixes it by decreasing num_poisoned_pages by one in the
> >> non-hugetlbfs page case.
> >>
> >> Signed-off-by: Wanpeng Li <liwanp@xxxxxxxxxxxxxxxxxx>
> >
> >I think that a thp never becomes hwpoisoned without splitting, so "trying
> >to unpoison thp" never happens (I think that this implicit fact should be
>
> There is a race window here for hwpoison thp:

OK, thanks for the great explanation (it is worth writing in the patch
description). And I found my previous comment was completely pointless, sorry :(

>            A                                          B
> memory_failure
>   TestSetPageHWPoison(p);
>   if (PageHuge(p))
>           nr_pages = 1 << compound_order(hpage);
>   else
>           nr_pages = 1;
>   atomic_long_add(nr_pages, &num_poisoned_pages);
>                                            unpoison_memory
>                                              nr_pages = 1 << compound_trans_order(page);
>
>                                              if (TestClearPageHWPoison(p))
>                                                      atomic_long_sub(nr_pages, &num_poisoned_pages);
>   lock page
>   if (!PageHWPoison(p))
>           unlock page and return
>   hwpoison_user_mappings
>     if (PageTransHuge(hpage))
>             split_huge_page(hpage);
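
To make the miscount in this interleaving concrete, here is a minimal
user-space sketch that replays the steps above sequentially. It is only a
model under stated assumptions: the struct, the model_* names and the plain
long counter are hypothetical stand-ins and not kernel APIs, and a thp order
of 9 (512 base pages) is assumed purely for the arithmetic.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_page {
	int hwpoison;		/* stands in for the PageHWPoison flag */
	int is_hugetlb;		/* stands in for PageHuge() */
	unsigned int order;	/* compound order, 0 for a normal page */
};

static long model_num_poisoned;	/* stands in for num_poisoned_pages */

/* Thread A, first half: memory_failure() before it takes the page lock. */
static void model_memory_failure_mark(struct model_page *p)
{
	/* Only hugetlbfs pages are accounted by their full size here. */
	long nr_pages = p->is_hugetlb ? (1L << p->order) : 1;

	p->hwpoison = 1;
	model_num_poisoned += nr_pages;
}

/* Thread B: unpoison_memory() running before the thp has been split. */
static void model_unpoison(struct model_page *p)
{
	/* For an unsplit thp this is the full compound size. */
	long nr_pages = 1L << p->order;

	if (p->hwpoison) {
		p->hwpoison = 0;
		model_num_poisoned -= nr_pages;
	}
}

/* Replay A then B on an unsplit thp and return the resulting counter. */
long model_race_result(unsigned int thp_order)
{
	struct model_page thp = { 0, 0, thp_order };

	assert(thp_order < 31);
	model_num_poisoned = 0;
	model_memory_failure_mark(&thp);	/* A: counter += 1 */
	model_unpoison(&thp);			/* B: counter -= 1 << order */
	return model_num_poisoned;
}
```

With the assumed order of 9, model_race_result(9) returns 1 - 512 = -511,
so the counter ends up negative instead of returning to zero.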

When this race happens, our expectation is that num_poisoned_pages is
increased by 1, because thread A eventually succeeds in hwpoisoning one normal
page. So thread B should fail to unpoison, neither clearing PageHWPoison nor
decreasing num_poisoned_pages. My suggestion is to insert a PageTransHuge
check before doing TestClearPageHWPoison, as follows:

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 1cb3b7d..f551b72 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1336,6 +1336,16 @@ int unpoison_memory(unsigned long pfn)
 		return 0;
 	}

+	/*
+	 * unpoison_memory() can encounter thp only when the thp is being
+	 * worked by memory_failure() and the page lock is not held yet.
+	 * In such case, we yield to memory_failure() and make unpoison fail.
+	 */
+	if (PageTransHuge(page)) {
+		pr_info("MCE: Memory failure is now running on %#lx\n", pfn);
+		return 0;
+	}
+
 	nr_pages = 1 << compound_trans_order(page);

 	if (!get_page_unless_zero(page)) {
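
The effect of this check on the counter can be sketched in user space as
follows. This is a simplified model and not the kernel code: struct
sketch_page, sketch_unpoison() and sketch_race_with_check() are hypothetical
names, the counter is a plain long, and the interleaving is replayed
sequentially rather than with real threads.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct sketch_page {
	int hwpoison;		/* stands in for the PageHWPoison flag */
	int trans_huge;		/* stands in for PageTransHuge(): unsplit thp */
	unsigned int order;	/* compound order, 0 once the thp is split */
};

/*
 * Model of unpoison with the proposed check. Returns the number of pages
 * subtracted from *num_poisoned, or 0 if unpoison did nothing.
 */
long sketch_unpoison(struct sketch_page *p, long *num_poisoned)
{
	long nr_pages;

	if (!p->hwpoison)
		return 0;	/* already unpoisoned */
	if (p->trans_huge)
		return 0;	/* yield to memory_failure(), flag stays set */

	nr_pages = 1L << p->order;
	p->hwpoison = 0;
	*num_poisoned -= nr_pages;
	return nr_pages;
}

/* Replay A (mark), B (unpoison attempt), A (split); return the counter. */
long sketch_race_with_check(unsigned int thp_order)
{
	struct sketch_page thp = { 0, 1, thp_order };
	long num_poisoned = 0;

	thp.hwpoison = 1;			/* A: TestSetPageHWPoison */
	num_poisoned += 1;			/* A: one page accounted */
	sketch_unpoison(&thp, &num_poisoned);	/* B: bails out on the thp */
	thp.trans_huge = 0;			/* A: page still poisoned, split */
	thp.order = 0;
	return num_poisoned;
}
```

In this model sketch_race_with_check(9) returns 1, which matches the
expectation that exactly one normal page ends up hwpoisoned. A later
sketch_unpoison() on the split page subtracts exactly one page again.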


I think that replacing atomic_long_sub() with atomic_long_dec() still
makes sense, so you don't have to drop that.

>
> We increase the count by one page, but decrease it by 1 << compound_trans_order.
> The compound_trans_order you mentioned is used here for thp, that's why
> I don't drop it in patch 2/6.

I don't think that we have to use compound_trans_order() any more, because
with the above change we no longer calculate nr_pages for thp.
That also saves the cost of locking/unlocking compound_lock, as described in 2/6.

> >commented somewhere or asserted with VM_BUG_ON().)
>
> I will add the VM_BUG_ON() in unpoison_memory after lock page in next
> version.

Sorry, my previous suggestion didn't make sense.

Thank you!
Naoya Horiguchi

> >And nr_pages in unpoison_memory() can be greater than 1 for hugetlbfs page.
> >So does this patch break counting when unpoisoning free hugetlbfs pages?
> >
> >Thanks,
> >Naoya Horiguchi
> >
> >> ---
> >> mm/memory-failure.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> >> index 5092e06..6bfd51e 100644
> >> --- a/mm/memory-failure.c
> >> +++ b/mm/memory-failure.c
> >> @@ -1350,7 +1350,7 @@ int unpoison_memory(unsigned long pfn)
> >>  			return 0;
> >>  		}
> >>  		if (TestClearPageHWPoison(p))
> >> -			atomic_long_sub(nr_pages, &num_poisoned_pages);
> >> +			atomic_long_dec(&num_poisoned_pages);
> >>  		pr_info("MCE: Software-unpoisoned free page %#lx\n", pfn);
> >>  		return 0;
> >>  	}
> >> --
> >> 1.8.1.2
> >>
>