Re: [PATCH] mm: vmscan: do not pass reclaimed slab to vmpressure

From: zhong jiang
Date: Tue Jun 06 2017 - 09:08:10 EST


On 2017/1/31 7:40, Minchan Kim wrote:
> Hi Vinayak,
> Sorry for late response. It was Lunar New Year holidays.
>
> On Fri, Jan 27, 2017 at 01:43:23PM +0530, vinayak menon wrote:
>>> Thanks for the explanation. However, such a case can happen with THP pages
>>> as well as slab. In the case of a THP page, nr_scanned is 1 but nr_reclaimed
>>> could be 512, so I think vmpressure should have logic to prevent underflow
>>> regardless of slab shrinking.
>>>
>> I see. Going to send a vmpressure fix. But wouldn't the THP case result in
>> incorrect vmpressure reporting even if we fix the vmpressure underflow
>> problem?
> If a THP page is reclaimed, it reports lower pressure due to a bigger
> reclaim ratio (i.e., reclaimed/scanned) compared to normal pages, but
> it's not a problem, is it? The VM reclaimed more memory than
> expected, so memory pressure isn't severe now.
Hi Minchan,

If a THP LRU page is reclaimed, a bigger reclaim ratio makes sense. But reading the
code, I found that a THP is split into normal pages, which then go through the loop
again. The number of reclaimed pages should therefore not exceed nr_scanned, because
each loop iteration increments the nr_scanned counter.
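
To make the arithmetic in question concrete, here is a minimal user-space
sketch of the pressure calculation, not the kernel code itself. The function
and enum names are made up for illustration, and the 60/95 thresholds are an
assumption about the medium/critical defaults. It only shows what happens to
the unsigned math when reclaimed exceeds scanned (e.g. scanned = 1,
reclaimed = 512), with and without a guard:

```c
/* Hypothetical names; a simplified model of the calculation discussed above. */
enum level { LEVEL_LOW, LEVEL_MEDIUM, LEVEL_CRITICAL };

#define MEDIUM_PCT   60UL	/* assumed medium threshold, in percent */
#define CRITICAL_PCT 95UL	/* assumed critical threshold, in percent */

/*
 * Unguarded formula. The caller must pass scanned > 0.
 * When reclaimed > scanned, the subtraction wraps around because the
 * operands are unsigned, and the result becomes a very large number.
 */
static inline unsigned long pressure_unchecked(unsigned long scanned,
					       unsigned long reclaimed)
{
	unsigned long scale = scanned + reclaimed;
	unsigned long pressure;

	pressure = scale - (reclaimed * scale / scanned);
	return pressure * 100 / scale;
}

/*
 * Guarded variant: report zero pressure when at least as many pages were
 * reclaimed as were scanned. This also covers scanned == 0.
 */
static inline unsigned long pressure_checked(unsigned long scanned,
					     unsigned long reclaimed)
{
	if (reclaimed >= scanned)
		return 0;
	return pressure_unchecked(scanned, reclaimed);
}

/* Map a pressure percentage to a level. */
static inline enum level pressure_level(unsigned long pressure)
{
	if (pressure >= CRITICAL_PCT)
		return LEVEL_CRITICAL;
	if (pressure >= MEDIUM_PCT)
		return LEVEL_MEDIUM;
	return LEVEL_LOW;
}
```

With scanned = 1 and reclaimed = 512, pressure_unchecked() wraps and maps to
LEVEL_CRITICAL, while pressure_checked() returns 0 and maps to LEVEL_LOW. If
nr_scanned really is incremented for every split page, the guard would never
trigger for THP in the first place.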

I have likely missed something. Could you please point it out?

Thanks
zhongjiang
>>>>>> unsigned arithmetic results in the pressure value to be
>>>>>> huge, thus resulting in a critical event being sent to
>>>>>> root cgroup. Fix this by not passing the reclaimed slab
>>>>>> count to vmpressure, with the assumption that vmpressure
>>>>>> should show the actual pressure on LRU which is now
>>>>>> diluted by adding reclaimed slab without a corresponding
>>>>>> scanned value.
>>>>> I can't guess the justification for your assumption from the description.
>>>>> Why do we consider only LRU pages for vmpressure? Could you elaborate
>>>>> a bit?
>>>>>
>>>> When we encountered the false events from vmpressure, I thought the problem
>>>> could be that slab scanned is not included in sc->nr_scanned, like it is done
>>>> for reclaimed. But later I thought vmpressure works only on the pages scanned
>>>> and reclaimed from LRU. I can explain what I understand, let me know if this is
>>>> incorrect.
>>>> vmpressure is an index which tells the pressure on LRU, and thus an
>>>> indicator of thrashing. In shrink_node, when we come out of the inner do-while
>>>> loop after shrinking the lruvec, the scanned and reclaimed counts correspond to
>>>> the pressure felt on the LRUs, which in turn indicates the pressure on VM. The
>>>> moment we add the slab reclaimed pages to the reclaimed, we dilute the
>>>> actual pressure felt on LRUs. When slab scanned/reclaimed is not included
>>>> in the vmpressure, the values will indicate the actual pressure and if there
>>>> were a lot of slab reclaimed pages it will result in lesser pressure
>>>> on LRUs in the next run which will again be indicated by vmpressure. i.e. the
>>> I think there is no intention to exclude slab by design of vmpressure.
>>> Because slab is memory consumption, freeing slab pages really helps
>>> the memory pressure. Also, there might be slab-intensive workloads rather
>>> than LRU. It would be great if vmpressure worked well with that case.
>>> But the problem with involving slab in vmpressure is that it's not fair to
>>> LRU pages. LRU pages have a 1:1 cost model for scan:free, but slab shrinking
>>> depends on each slab's object population. It means it's impossible to
>>> get a stable cost model with the current slab shrinking model, unfortunately.
>>> So I don't object to this patch, although I want to see the slab shrink model
>>> change, which is heavy-handed work.
>>>
>> Looking at the code, the slab reclaimed pages started getting passed to
>> vmpressure after the commit ("mm: vmscan: invoke slab shrinkers from
>> shrink_zone()").
>> But as you said, this may be helpful for slab intensive workloads. But in its
>> current form I think it results in incorrect vmpressure reporting because of not
>> accounting the slab scanned pages. Resending the patch with a modified
>> commit msg since the underflow issue is fixed separately. Thanks Minchan.
> Makes sense.
>
> Thanks, Vinayak!
>