Re: [RFC PATCH v2 10/16] mm,hwpoison: Rework soft offline for free pages

From: Naoya Horiguchi
Date: Mon Oct 21 2019 - 04:17:16 EST


On Thu, Oct 17, 2019 at 04:21:17PM +0200, Oscar Salvador wrote:
> When trying to soft-offline a free page, we need to first take it off
> the buddy allocator.
> Once we know it is out of reach, we can safely flag it as poisoned.
>
> take_page_off_buddy will be used to take a page meant to be poisoned
> off the buddy allocator.
> take_page_off_buddy calls break_down_buddy_pages, which splits a
> higher-order page in case our page belongs to one.
>
> Once the page is under our control, we call page_set_poison to set it

I guess you mean page_handle_poison here.

> as poisoned and grab a refcount on it.
>
> Signed-off-by: Oscar Salvador <osalvador@xxxxxxx>
> ---
> mm/memory-failure.c | 20 +++++++++++-----
> mm/page_alloc.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 82 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 37b230b8cfe7..1d986580522d 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -78,6 +78,15 @@ EXPORT_SYMBOL_GPL(hwpoison_filter_dev_minor);
> EXPORT_SYMBOL_GPL(hwpoison_filter_flags_mask);
> EXPORT_SYMBOL_GPL(hwpoison_filter_flags_value);
>
> +extern bool take_page_off_buddy(struct page *page);
> +
> +static void page_handle_poison(struct page *page)

hwpoison is a separate concept from page poisoning, so I think it would be
better to name this something like page_handle_hwpoison().

> +{
> + SetPageHWPoison(page);
> + page_ref_inc(page);
> + num_poisoned_pages_inc();
> +}
> +
> static int hwpoison_filter_dev(struct page *p)
> {
> struct address_space *mapping;
> @@ -1830,14 +1839,13 @@ static int soft_offline_in_use_page(struct page *page)
>
> static int soft_offline_free_page(struct page *page)
> {
> - int rc = dissolve_free_huge_page(page);
> + int rc = -EBUSY;
>
> - if (!rc) {
> - if (set_hwpoison_free_buddy_page(page))
> - num_poisoned_pages_inc();
> - else
> - rc = -EBUSY;
> + if (!dissolve_free_huge_page(page) && take_page_off_buddy(page)) {
> + page_handle_poison(page);
> + rc = 0;
> }
> +
> return rc;
> }
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cd1dd0712624..255df0c76a40 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8632,6 +8632,74 @@ bool is_free_buddy_page(struct page *page)
>
> #ifdef CONFIG_MEMORY_FAILURE
> /*
> + * Break down a higher-order page in sub-pages, and keep our target out of
> + * buddy allocator.
> + */
> +static void break_down_buddy_pages(struct zone *zone, struct page *page,
> + struct page *target, int low, int high,
> + struct free_area *area, int migratetype)
> +{
> + unsigned long size = 1 << high;
> + struct page *current_buddy, *next_page;
> +
> + while (high > low) {
> + area--;
> + high--;
> + size >>= 1;
> +
> + if (target >= &page[size]) {
> + next_page = page + size;
> + current_buddy = page;
> + } else {
> + next_page = page;
> + current_buddy = page + size;
> + }
> +
> + if (set_page_guard(zone, current_buddy, high, migratetype))
> + continue;
> +
> + if (current_buddy != target) {
> + add_to_free_area(current_buddy, area, migratetype);
> + set_page_order(current_buddy, high);
> + page = next_page;
> + }
> + }
> +}
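
Not a review comment, just for anyone checking the arithmetic: the split
loop above can be modeled in userspace like this (the function name and
the freed[]/freed_order[] bookkeeping are mine for illustration, not
kernel API):

```c
#include <assert.h>

/*
 * Userspace model of the loop in break_down_buddy_pages(): halve an
 * order-`high` block starting at pfn `base` until only the order-`low`
 * block holding `target` remains, recording each half that would go
 * back on the free lists.  Returns the number of freed sub-blocks.
 */
static int split_around_target(unsigned long base, int high, int low,
			       unsigned long target,
			       unsigned long *freed, int *freed_order)
{
	int n = 0;
	unsigned long size = 1UL << high;

	while (high > low) {
		high--;
		size >>= 1;
		if (target >= base + size) {
			/* target is in the upper half: lower half is freed */
			freed[n] = base;
			freed_order[n++] = high;
			base += size;
		} else {
			/* target is in the lower half: upper half is freed */
			freed[n] = base + size;
			freed_order[n++] = high;
		}
	}
	return n;
}
```

For example, splitting an order-3 block at pfn 0 around target pfn 5
frees an order-2 block at pfn 0, an order-1 block at pfn 6 and an
order-0 page at pfn 4, leaving pfn 5 isolated.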
> +
> +/*
> + * Take a page that will be marked as poisoned off the buddy allocator.
> + */
> +bool take_page_off_buddy(struct page *page)
> + {
> + struct zone *zone = page_zone(page);
> + unsigned long pfn = page_to_pfn(page);
> + unsigned long flags;
> + unsigned int order;
> + bool ret = false;
> +
> + spin_lock_irqsave(&zone->lock, flags);
> + for (order = 0; order < MAX_ORDER; order++) {
> + struct page *page_head = page - (pfn & ((1 << order) - 1));
> + int buddy_order = page_order(page_head);
> + struct free_area *area = &(zone->free_area[buddy_order]);
> +
> + if (PageBuddy(page_head) && buddy_order >= order) {
> + unsigned long pfn_head = page_to_pfn(page_head);
> + int migratetype = get_pfnblock_migratetype(page_head,
> + pfn_head);
> +
> + del_page_from_free_area(page_head, area);
> + break_down_buddy_pages(zone, page_head, page, 0,
> + buddy_order, area, migratetype);
> + ret = true;
> + break;

Indented with spaces instead of tabs?
And you can find a few more coding style warnings with checkpatch.pl.

BTW, if we want the unpoison mechanism to keep up with the new semantics,
we will need the reverse operation of take_page_off_buddy().
Do you expect that part to come as separate work?
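
Just to make concrete what I mean by the reverse operation: it would
have to coalesce the isolated page back with its free buddies. A toy
userspace model of that coalescing (a flat (pfn, order) free list; all
names here are mine, not kernel API):

```c
#include <assert.h>

struct free_block { unsigned long pfn; int order; };

static int find_free(struct free_block *fl, int n,
		     unsigned long pfn, int order)
{
	for (int i = 0; i < n; i++)
		if (fl[i].pfn == pfn && fl[i].order == order)
			return i;
	return -1;
}

/*
 * Merge the block starting at *pfn with its free buddy at each order,
 * as long as that buddy is on the free list.  The buddy of an aligned
 * order-o block at pfn is at pfn ^ (1 << o).  Returns the order of the
 * final merged block; *pfn is updated to its start.
 */
static int coalesce(struct free_block *fl, int *n,
		    unsigned long *pfn, int max_order)
{
	int order = 0;

	while (order < max_order) {
		unsigned long buddy = *pfn ^ (1UL << order);
		int i = find_free(fl, *n, buddy, order);

		if (i < 0)
			break;
		fl[i] = fl[--(*n)];	/* take buddy off the free list */
		if (buddy < *pfn)
			*pfn = buddy;	/* merged block starts lower */
		order++;
	}
	return order;
}
```

Starting from a free list of (0, order 2), (6, order 1), (4, order 0)
-- what is left after isolating pfn 5 out of an order-3 block at pfn 0
-- coalescing from pfn 5 rebuilds the original order-3 block at pfn 0.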

Thanks,
Naoya Horiguchi

> + }
> + }
> + spin_unlock_irqrestore(&zone->lock, flags);
> + return ret;
> + }
> +
> +/*
> * Set PG_hwpoison flag if a given page is confirmed to be a free page. This
> * test is performed under the zone lock to prevent a race against page
> * allocation.
> --
> 2.12.3
>
>