Re: [PATCH] mm, compaction: properly signal and act upon lock and need_sched() contention

From: Joonsoo Kim
Date: Mon May 12 2014 - 20:42:23 EST


On Mon, May 12, 2014 at 04:15:11PM +0200, Vlastimil Babka wrote:
> Compaction uses compact_checklock_irqsave() function to periodically check for
> lock contention and need_resched() to either abort async compaction, or to
> free the lock, schedule and retake the lock. When aborting, cc->contended is
> set to signal the contended state to the caller. Two problems have been
> identified in this mechanism.
>
> First, compaction also calls directly cond_resched() in both scanners when no
> lock is yet taken. This call either does not abort async compaction, or set
> cc->contended appropriately. This patch introduces a new
> compact_check_resched() function to achieve both.
>
> Second, isolate_freepages() does not check if isolate_freepages_block()
> aborted due to contention, and advances to the next pageblock. This violates
> the principle of aborting on contention, and might result in pageblocks not
> being scanned completely, since the scanning cursor is advanced. This patch
> makes isolate_freepages_block() check the cc->contended flag and abort.
>
> Reported-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@xxxxxxxxxxx>
> Cc: Michal Nazarewicz <mina86@xxxxxxxxxx>
> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxx>
> ---
> mm/compaction.c | 40 +++++++++++++++++++++++++++++++++-------
> 1 file changed, 33 insertions(+), 7 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 83ca6f9..b34ab7c 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -222,6 +222,27 @@ static bool compact_checklock_irqsave(spinlock_t *lock, unsigned long *flags,
> return true;
> }
>
> +/*
> + * Similar to compact_checklock_irqsave() (see its comment) for places where
> + * a zone lock is not concerned.
> + *
> + * Returns false when compaction should abort.
> + */
> +static inline bool compact_check_resched(struct compact_control *cc)
> +{
> + /* async compaction aborts if contended */
> + if (need_resched()) {
> + if (cc->mode == MIGRATE_ASYNC) {
> + cc->contended = true;
> + return false;
> + }
> +
> + cond_resched();
> + }
> +
> + return true;
> +}
> +
> /* Returns true if the page is within a block suitable for migration to */
> static bool suitable_migration_target(struct page *page)
> {
> @@ -491,11 +512,8 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> return 0;
> }
>
> - if (cond_resched()) {
> - /* Async terminates prematurely on need_resched() */
> - if (cc->mode == MIGRATE_ASYNC)
> - return 0;
> - }
> + if (!compact_check_resched(cc))
> + return 0;
>
> /* Time to isolate some pages for migration */
> for (; low_pfn < end_pfn; low_pfn++) {
> @@ -718,9 +736,10 @@ static void isolate_freepages(struct zone *zone,
> /*
> * This can iterate a massively long zone without finding any
> * suitable migration targets, so periodically check if we need
> - * to schedule.
> + * to schedule, or even abort async compaction.
> */
> - cond_resched();
> + if (!compact_check_resched(cc))
> + break;
>
> if (!pfn_valid(block_start_pfn))
> continue;
> @@ -758,6 +777,13 @@ static void isolate_freepages(struct zone *zone,
> */
> if (isolated)
> cc->finished_update_free = true;
> +
> + /*
> + * isolate_freepages_block() might have aborted due to async
> + * compaction being contended
> + */
> + if (cc->contended)
> + break;
> }

Hello,

I think that we can do further.

The problem is that this cc->contended is checked only in
isolate_migratepages() to break out the compaction. So if there are
free pages we are already taken, compaction wouldn't stopped
immediately and isolate_freepages() could be invoked again on next
compaction_alloc(). If there is no contention at this time, we would try
to get free pages from one pageblock because cc->contended checking is
on bottom of the loop in isolate_migratepages() and will continue to
run compaction. AFAIK, we want to stop the compaction in this case.

Moreover, if this isolate_freepages() don't stop the compaction,
next isolate_migratepages() will be invoked and it would be stopped
by checking cc->contended after isolating some pages for migration.
This is useless overhead so should be removed.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/