Re: [PATCH] list_lru: Prefetch neighboring list entries before acquiring lock

From: Waiman Long
Date: Fri Dec 01 2017 - 09:15:11 EST


On 11/30/2017 07:09 PM, Minchan Kim wrote:
> On Thu, Nov 30, 2017 at 12:47:36PM -0800, Andrew Morton wrote:
>> On Thu, 30 Nov 2017 08:54:04 -0500 Waiman Long <longman@xxxxxxxxxx> wrote:
>>
>>>> And, from that perspective, the racy shortcut in the proposed patch
>>>> is wrong, too. Prefetch is fine, but in general shortcutting list
>>>> empty checks outside the internal lock isn't.
>>> For the record, I add one more list_empty() check at the beginning of
>>> list_lru_del() in the patch for 2 purposes:
>>> 1. It allows the code to bail out early.
>>> 2. It makes sure the cacheline of the list_head entry itself is loaded.
>>>
>>> Other than that, I only add a likely() qualifier to the existing
>>> list_empty() check within the lock critical region.
>> But it sounds like Dave thinks that unlocked check should be removed?
>>
>> How does this addendum look?
>>
>> From: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Subject: list_lru-prefetch-neighboring-list-entries-before-acquiring-lock-fix
>>
>> include prefetch.h, remove unlocked list_empty() test, per Dave
>>
>> Cc: Dave Chinner <david@xxxxxxxxxxxxx>
>> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
>> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
>> Cc: Waiman Long <longman@xxxxxxxxxx>
>> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> ---
>>
>> mm/list_lru.c | 5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff -puN mm/list_lru.c~list_lru-prefetch-neighboring-list-entries-before-acquiring-lock-fix mm/list_lru.c
>> --- a/mm/list_lru.c~list_lru-prefetch-neighboring-list-entries-before-acquiring-lock-fix
>> +++ a/mm/list_lru.c
>> @@ -8,6 +8,7 @@
>> #include <linux/module.h>
>> #include <linux/mm.h>
>> #include <linux/list_lru.h>
>> +#include <linux/prefetch.h>
>> #include <linux/slab.h>
>> #include <linux/mutex.h>
>> #include <linux/memcontrol.h>
>> @@ -135,13 +136,11 @@ bool list_lru_del(struct list_lru *lru,
>> /*
>> * Prefetch the neighboring list entries to reduce lock hold time.
>> */
>> - if (unlikely(list_empty(item)))
>> - return false;
>> prefetchw(item->prev);
>> prefetchw(item->next);
>>
>> spin_lock(&nlru->lock);
>> - if (likely(!list_empty(item))) {
>> + if (!list_empty(item)) {
>> l = list_lru_from_kmem(nlru, item);
>> list_del_init(item);
>> l->nr_items--;
> If we cannot guarantee that !list_empty is likely, prefetching a NULL pointer
> would be harmful, per the lesson we have learned.
>
> https://lwn.net/Articles/444336/

FYI, when list_empty() is true, it just means that the links are pointing
to the list entry itself. The pointers will never be NULL, so that won't
cause the NULL prefetch problem mentioned in the article.
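
To illustrate the point, here is a minimal userspace sketch (not the kernel
code). The sketch_* helpers are hypothetical names that only mirror what
INIT_LIST_HEAD(), list_empty(), list_add_tail() and list_del_init() do, and
__builtin_prefetch() (GCC/Clang) stands in for prefetchw(). It shows that an
entry's prev/next pointers are self-referencing both before insertion and
after list_del_init(), so they are never NULL at prefetch time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (same two pointers). */
struct list_head {
	struct list_head *next, *prev;
};

/* Mirrors INIT_LIST_HEAD(): an unlinked entry points to itself. */
static void sketch_init(struct list_head *entry)
{
	entry->next = entry;
	entry->prev = entry;
}

/* Mirrors list_empty(): true when the entry links back to itself. */
static bool sketch_empty(const struct list_head *entry)
{
	return entry->next == entry;
}

/* Mirrors list_add_tail(): link item just before head. */
static void sketch_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Mirrors list_del_init(): unlink, then point the entry back at itself. */
static void sketch_del_init(struct list_head *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
	sketch_init(item);
}

/*
 * Hypothetical helper: prefetch both neighbors for write and report
 * whether both links were non-NULL when the prefetch was issued.
 */
static bool sketch_prefetch_neighbors(const struct list_head *item)
{
	__builtin_prefetch(item->prev, 1);
	__builtin_prefetch(item->next, 1);
	return item->prev != NULL && item->next != NULL;
}
```

This assumes the entry was initialized before use; an entry that was never
initialized at all is a separate bug that the sketch does not cover.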

> So, considering that list_lru_del is a generic library function, it cannot
> see whether a workload causes heavy lock contention or not.
> Maybe the right place for prefetching would be in the caller, not in the
> library itself.

Yes, the prefetch operations will add some overhead to the whole
deletion operation when the lock isn't contended, but that is usually
rather small compared with the atomic ops involved in the locking
operation itself. On the other hand, the performance gain will be
noticeable when the lock is contended. I will run some performance
measurements and report the results later.
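
For reference, the pattern under discussion can be sketched in userspace as
follows. This is only an illustration under stated assumptions: the sk_*
names are hypothetical, a C11 atomic_flag spin loop stands in for the
kernel spinlock, and __builtin_prefetch() (GCC/Clang) stands in for
prefetchw(). The neighbors are prefetched before the lock is taken, and the
only emptiness check that decides anything is the one made under the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct list_head and a per-node LRU list. */
struct sk_node {
	struct sk_node *next, *prev;
};

struct sk_lru {
	atomic_flag lock;	/* stand-in for the kernel spinlock */
	struct sk_node head;
	long nr_items;
};

static void sk_lock(atomic_flag *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void sk_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void sk_node_init(struct sk_node *n)
{
	n->next = n;
	n->prev = n;
}

static void sk_lru_init(struct sk_lru *lru)
{
	atomic_flag_clear(&lru->lock);
	sk_node_init(&lru->head);
	lru->nr_items = 0;
}

/* Add item at the tail if it is not already on a list. */
static bool sk_lru_add(struct sk_lru *lru, struct sk_node *item)
{
	bool added = false;

	sk_lock(&lru->lock);
	if (item->next == item) {
		item->prev = lru->head.prev;
		item->next = &lru->head;
		lru->head.prev->next = item;
		lru->head.prev = item;
		lru->nr_items++;
		added = true;
	}
	sk_unlock(&lru->lock);
	return added;
}

/* Delete item: prefetch neighbors first, decide only under the lock. */
static bool sk_lru_del(struct sk_lru *lru, struct sk_node *item)
{
	bool removed = false;

	assert(item != NULL);
	/* Warm the neighbors' cachelines outside the critical section. */
	__builtin_prefetch(item->prev, 1);
	__builtin_prefetch(item->next, 1);

	sk_lock(&lru->lock);
	if (item->next != item) {	/* the authoritative, locked check */
		item->prev->next = item->next;
		item->next->prev = item->prev;
		sk_node_init(item);
		lru->nr_items--;
		removed = true;
	}
	sk_unlock(&lru->lock);
	return removed;
}
```

Whether the extra prefetches pay off depends on how contended the lock is,
which this sketch does not measure.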

Cheers,
Longman