Re: [RFC] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level

From: Aneesh Kumar K.V
Date: Wed Apr 19 2017 - 02:20:49 EST

Next message: Peter Zijlstra: "Re: [PATCH 2/3] jump_label: Provide static_key_slow_inc_nohp()"
Previous message: Stephen Rothwell: "linux-next: Tree for Apr 19"
In reply to: Anshuman Khandual: "[RFC] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level"
Next in thread: Anshuman Khandual: "Re: [RFC] mm/madvise: Enable (soft|hard) offline of HugeTLB pages at PGD level"

Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx> writes:

> Though migrating gigantic HugeTLB pages does not sound much like a
> real-world use case, they can be affected by memory errors. Hence
> migration of PGD-level HugeTLB pages should be supported just to enable
> soft and hard offline use cases.

In that case, do we want to isolate the entire 16GB range? Should we
just dequeue the page from the hugepage pool, convert it to regular 64K
pages, and then isolate the 64K page that had the memory error?
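
To put rough numbers on that: a 16GB page splits into 262144 pages of
64K, so isolating only the bad one would leave 262143 of them usable.
Below is a minimal userspace sketch of the arithmetic only, assuming
16GB gigantic pages and 64K base pages as above. The helper names are
hypothetical and are not kernel APIs; the actual dequeue and split would
have to be done by the hugetlb code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, taken from the 16GB / 64K figures in this mail. */
#define GIGANTIC_PAGE_SIZE (16ULL << 30) /* 16GB gigantic page */
#define BASE_PAGE_SIZE     (64ULL << 10) /* 64K base page */

/* Number of 64K base pages that one 16GB gigantic page splits into. */
static uint64_t subpages_per_gigantic_page(void)
{
	return GIGANTIC_PAGE_SIZE / BASE_PAGE_SIZE;
}

/*
 * Hypothetical helper: index of the 64K subpage that contains bad_addr,
 * where huge_base is the start address of the gigantic page.
 */
static uint64_t bad_subpage_index(uint64_t huge_base, uint64_t bad_addr)
{
	/* The failing address must lie inside the gigantic page. */
	assert(bad_addr >= huge_base);
	assert(bad_addr - huge_base < GIGANTIC_PAGE_SIZE);

	return (bad_addr - huge_base) / BASE_PAGE_SIZE;
}

/* Hypothetical helper: start address of the one 64K page to isolate. */
static uint64_t bad_subpage_start(uint64_t huge_base, uint64_t bad_addr)
{
	return huge_base +
	       bad_subpage_index(huge_base, bad_addr) * BASE_PAGE_SIZE;
}
```

For an error a few bytes into the sixth 64K page of a gigantic page,
bad_subpage_index() gives 5, and only the 64K range starting at
bad_subpage_start() would need to be isolated.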

>
> While allocating the new gigantic HugeTLB page, it should not matter
> whether new page comes from the same node or not. There would be very
> few gigantic pages on the system after all; we should not be bothered
> about node locality when trying to save a big page from crashing.
>
> This introduces a new HugeTLB allocator called alloc_gigantic_page()
> which will scan over all online nodes on the system and allocate a
> single HugeTLB page.
>

-aneesh
