Re: [PATCH 2/3] mm, compaction: raise compaction priority after it withdraws

From: Vlastimil Babka
Date: Mon Aug 05 2019 - 05:14:47 EST


On 8/3/19 12:39 AM, Mike Kravetz wrote:
> From: Vlastimil Babka <vbabka@xxxxxxx>
>
> Mike Kravetz reports that "hugetlb allocations could stall for minutes or hours
> when should_compact_retry() would return true more often than it should.
> Specifically, this was in the case where compact_result was COMPACT_DEFERRED
> and COMPACT_PARTIAL_SKIPPED and no progress was being made."
>
> The problem is that the compaction_withdrawn() test in should_compact_retry()
> includes compaction outcomes that are only possible on low compaction priority,
> and results in a retry without increasing the priority. This may result in
> further reclaim, and more incomplete compaction attempts.
>
> With this patch, compaction priority is raised when possible, or
> should_compact_retry() returns false.
>
> The COMPACT_SKIPPED result doesn't really fit together with the other outcomes
> in compaction_withdrawn(), as that's a result caused by insufficient order-0
> pages, not due to low compaction priority. With this patch, it is moved to
> a new compaction_needs_reclaim() function, and for that outcome we keep the
> current logic of retrying if it looks like reclaim will be able to help.
>
> Reported-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> Tested-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

There should also be your SOB, IIUC.

> ---
> include/linux/compaction.h | 22 +++++++++++++++++-----
> mm/page_alloc.c | 16 ++++++++++++----
> 2 files changed, 29 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index 9569e7c786d3..4b898cdbdf05 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -129,11 +129,8 @@ static inline bool compaction_failed(enum compact_result result)
> return false;
> }
>
> -/*
> - * Compaction has backed off for some reason. It might be throttling or
> - * lock contention. Retrying is still worthwhile.
> - */
> -static inline bool compaction_withdrawn(enum compact_result result)
> +/* Compaction needs reclaim to be performed first, so it can continue. */
> +static inline bool compaction_needs_reclaim(enum compact_result result)
> {
> /*
> * Compaction backed off due to watermark checks for order-0
> @@ -142,6 +139,16 @@ static inline bool compaction_withdrawn(enum compact_result result)
> if (result == COMPACT_SKIPPED)
> return true;
>
> + return false;
> +}
> +
> +/*
> + * Compaction has backed off for some reason after doing some work or none
> + * at all. It might be throttling or lock contention. Retrying might be still
> + * worthwhile, but with a higher priority if allowed.
> + */
> +static inline bool compaction_withdrawn(enum compact_result result)
> +{
> /*
> * If compaction is deferred for high-order allocations, it is
> * because sync compaction recently failed. If this is the case
> @@ -207,6 +214,11 @@ static inline bool compaction_failed(enum compact_result result)
> return false;
> }
>
> +static inline bool compaction_needs_reclaim(enum compact_result result)
> +{
> + return false;
> +}
> +
> static inline bool compaction_withdrawn(enum compact_result result)
> {
> return true;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d3bb601c461b..af29c05e23aa 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3965,15 +3965,23 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> if (compaction_failed(compact_result))
> goto check_priority;
>
> + /*
> + * compaction was skipped because there are not enough order-0 pages
> + * to work with, so we retry only if it looks like reclaim can help.
> + */
> + if (compaction_needs_reclaim(compact_result)) {
> + ret = compaction_zonelist_suitable(ac, order, alloc_flags);
> + goto out;
> + }
> +
> /*
> * make sure the compaction wasn't deferred or didn't bail out early
> * due to locks contention before we declare that we should give up.
> - * But do not retry if the given zonelist is not suitable for
> - * compaction.
> + * But the next retry should use a higher priority if allowed, so
> + * we don't just keep bailing out endlessly.
> */
> if (compaction_withdrawn(compact_result)) {
> - ret = compaction_zonelist_suitable(ac, order, alloc_flags);
> - goto out;
> + goto check_priority;
> }
>
> /*
>
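
For illustration, the retry decision the patch aims for can be modelled in
user space roughly as below. This is only a sketch under stated assumptions,
not the kernel code: the enums are simplified stand-ins, model_should_retry()
and the PRIO_* names are hypothetical, the result of
compaction_zonelist_suitable() is passed in as a plain bool, and the
compaction_failed() and retry-counting paths are left out.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel enums; values are illustrative only. */
enum compact_result {
	COMPACT_SKIPPED,
	COMPACT_DEFERRED,
	COMPACT_CONTENDED,
	COMPACT_PARTIAL_SKIPPED,
	COMPACT_COMPLETE,
};

/* Hypothetical names; a lower value means a higher compaction priority. */
enum compact_priority {
	PRIO_SYNC_FULL,
	PRIO_SYNC_LIGHT,
	PRIO_ASYNC,
};

/* Not enough order-0 pages: compaction needs reclaim before it can continue. */
static bool compaction_needs_reclaim(enum compact_result result)
{
	return result == COMPACT_SKIPPED;
}

/* Compaction backed off (deferred, contended, or only partially scanned). */
static bool compaction_withdrawn(enum compact_result result)
{
	return result == COMPACT_DEFERRED ||
	       result == COMPACT_CONTENDED ||
	       result == COMPACT_PARTIAL_SKIPPED;
}

/*
 * Hypothetical model of the patched decision:
 *  - needs reclaim: retry only if reclaim looks like it can help;
 *  - withdrawn: retry only if the priority can still be raised;
 *  - anything else: give up (a simplification of the real function).
 */
static bool model_should_retry(enum compact_result result,
			       bool zonelist_suitable,
			       enum compact_priority *prio,
			       enum compact_priority min_prio)
{
	if (compaction_needs_reclaim(result))
		return zonelist_suitable;

	if (compaction_withdrawn(result)) {
		if (*prio > min_prio) {
			/* Raise the priority by one step before retrying. */
			*prio = (enum compact_priority)(*prio - 1);
			return true;
		}
		return false;
	}

	return false;
}
```

In this model a COMPACT_DEFERRED or COMPACT_PARTIAL_SKIPPED result can cause
at most as many retries as there are priority steps left, so it cannot loop
indefinitely at the same priority, while COMPACT_SKIPPED still depends only on
whether reclaim is expected to help.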