Re: [PATCH 2/2] free_pcppages_bulk: prefetch buddy while not holding lock

From: Mel Gorman
Date: Wed Jan 24 2018 - 11:43:51 EST


On Wed, Jan 24, 2018 at 10:30:50AM +0800, Aaron Lu wrote:
> When a page is freed back to the global pool, its buddy will be checked
> to see if it's possible to do a merge. This requires accessing buddy's
> page structure and that access could take a long time if it's cache cold.
>
> This patch adds a prefetch of the to-be-freed page's buddy outside of
> zone->lock, in the hope that accessing the buddy's page structure later,
> under zone->lock, will be faster.
>
> Test with will-it-scale/page_fault1 full load:
>
> kernel      Broadwell(2S)    Skylake(2S)      Broadwell(4S)    Skylake(4S)
> v4.15-rc4    9037332          8000124         13642741         15728686
> patch1/2     9608786 +6.3%    8368915 +4.6%   14042169 +2.9%   17433559 +10.8%
> this patch  10462292 +8.9%    8602889 +2.8%   14802073 +5.4%   17624575 +1.1%
>
> Note: this patch's performance improvement percent is against patch1/2.
>

I'm less convinced by this for a microbenchmark. Prefetch has not been a
universal win in the past and we cannot be sure that it's a good idea on
all architectures or doesn't have other side-effects such as consuming
memory bandwidth for data we don't need or evicting cache hot data for
buddy information that is not used. Furthermore, we end up doing some
calculations twice without any guarantee that the prefetch can offset
the cost.
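
To make the "calculations twice" point concrete, here is a rough userspace
sketch of the pattern, not the code from the patch. It assumes the usual
buddy-allocator XOR calculation for the buddy pfn and the GCC/Clang
__builtin_prefetch() builtin. struct fake_page, the mem_map array and the
helper names are hypothetical stand-ins chosen for illustration only.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for struct page; just enough fields for the sketch. */
struct fake_page {
	int is_free;		/* non-zero if the block is on a free list */
	unsigned int order;	/* order of the free block it heads */
};

/* Buddy of a 2^order block: flip bit 'order' of the page frame number. */
static unsigned long find_buddy_pfn(unsigned long pfn, unsigned int order)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return pfn ^ (1UL << order);
}

/*
 * Step 1, before the lock is taken: compute the order-0 buddy and issue
 * a prefetch hint for its page structure. The hint may be ignored, or
 * the line may be evicted again before it is ever read.
 */
static void prefetch_buddy(struct fake_page *mem_map, unsigned long pfn)
{
	unsigned long buddy_pfn = find_buddy_pfn(pfn, 0);

	__builtin_prefetch(&mem_map[buddy_pfn]);
}

/*
 * Step 2, with the lock held: the merge check computes the same buddy
 * pfn a second time and actually reads the buddy's page structure.
 */
static int buddy_is_free_at(struct fake_page *mem_map, unsigned long pfn,
			    unsigned int order)
{
	unsigned long buddy_pfn = find_buddy_pfn(pfn, order);

	return mem_map[buddy_pfn].is_free && mem_map[buddy_pfn].order == order;
}
```

The XOR in step 1 is repeated in step 2, so the extra arithmetic is always
paid, while the benefit depends on whether the prefetched line is still
cached (and still wanted) by the time the merge check runs.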

It's not strong enough of an opinion to outright NAK it but I'm not
ACKing it either.

--
Mel Gorman
SUSE Labs