Re: [PATCH v2] proc/ksm: add ksm stats to /proc/pid/smaps

From: Stefan Roesch
Date: Wed Aug 16 2023 - 12:23:22 EST



David Hildenbrand <david@xxxxxxxxxx> writes:

> On 15.08.23 19:10, Stefan Roesch wrote:
>> David Hildenbrand <david@xxxxxxxxxx> writes:
>>
>>> Sorry for the late reply, Gmail once again decided to classify your mails as
>>> spam (for whatever reason).
>>>
>>> On 11.08.23 18:28, Stefan Roesch wrote:
>>>> With madvise and prctl, KSM can be enabled for different VMAs. Once it
>>>> is enabled, we can query how effective KSM is overall. However, we cannot
>>>> easily query whether an individual VMA benefits from KSM.
>>>> This commit adds a KSM section to the /proc/<pid>/smaps file. It reports
>>>> how many of the pages are KSM pages.
>>>> Here is a typical output:
>>>> 7f420a000000-7f421a000000 rw-p 00000000 00:00 0
>>>> Size: 262144 kB
>>>> KernelPageSize: 4 kB
>>>> MMUPageSize: 4 kB
>>>> Rss: 51212 kB
>>>> Pss: 8276 kB
>>>> Shared_Clean: 172 kB
>>>> Shared_Dirty: 42996 kB
>>>> Private_Clean: 196 kB
>>>> Private_Dirty: 7848 kB
>>>> Referenced: 15388 kB
>>>> Anonymous: 51212 kB
>>>> KSM: 41376 kB
>>>> LazyFree: 0 kB
>>>> AnonHugePages: 0 kB
>>>> ShmemPmdMapped: 0 kB
>>>> FilePmdMapped: 0 kB
>>>> Shared_Hugetlb: 0 kB
>>>> Private_Hugetlb: 0 kB
>>>> Swap: 202016 kB
>>>> SwapPss: 3882 kB
>>>> Locked: 0 kB
>>>> THPeligible: 0
>>>> ProtectionKey: 0
>>>> ksm_state: 0
>>>> ksm_skip_base: 0
>>>> ksm_skip_count: 0
>>>> VmFlags: rd wr mr mw me nr mg anon
>>>> This information also helps with the following workflow:
>>>> - First, enable KSM for all the VMAs of a process with prctl.
>>>> - Then use the above smaps report to analyze which VMAs benefit the most.
>>>> - Change the application (if possible) to add the corresponding madvise
>>>>   calls for the VMAs that benefit the most.
>>>> Signed-off-by: Stefan Roesch <shr@xxxxxxxxxxxx>
>>>> ---
>>>> Documentation/filesystems/proc.rst | 3 +++
>>>> fs/proc/task_mmu.c | 5 +++++
>>>> 2 files changed, 8 insertions(+)
>>>> diff --git a/Documentation/filesystems/proc.rst
>>>> b/Documentation/filesystems/proc.rst
>>>> index 7897a7dafcbc..4ef3c0bbf16a 100644
>>>> --- a/Documentation/filesystems/proc.rst
>>>> +++ b/Documentation/filesystems/proc.rst
>>>> @@ -461,6 +461,7 @@ Memory Area, or VMA) there is a series of lines such as the following::
>>>> Private_Dirty: 0 kB
>>>> Referenced: 892 kB
>>>> Anonymous: 0 kB
>>>> + KSM: 0 kB
>>>> LazyFree: 0 kB
>>>> AnonHugePages: 0 kB
>>>> ShmemPmdMapped: 0 kB
>>>> @@ -501,6 +502,8 @@ accessed.
>>>> a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE
>>>> and a page is modified, the file page is replaced by a private anonymous copy.
>>>> +"KSM" shows the amount of anonymous memory that has been de-duplicated.
>>>
>>>
>>> How do we want to treat memory that has been deduplicated into the shared
>>> zeropage?
>>>
>>> It would also match this description.
>>>
>>> See in mm-stable:
>>>
>>> commit 30ff6ed9a65c7e73545319fc15f7bcf9c52457eb
>>> Author: xu xin <xu.xin16@xxxxxxxxxx>
>>> Date: Tue Jun 13 11:09:28 2023 +0800
>>>
>>> ksm: support unsharing KSM-placed zero pages
>>>
>>> Patch series "ksm: support tracking KSM-placed zero-pages", v10.
>> I see two approaches to dealing with the zero page:
>> - If zero page is not enabled, it works as is
>> - If enabled
>>   - Document that zero page is accounted for the current vma or
>>   - Pass in the pte from smaps_pte_entry() to smaps_account() so we can
>>     determine if this is a zero page.
>
> That's probably the right thing to do: make the stat return the same value
> independent of the usage of the shared zeropage.
>

I'll update the documentation accordingly.

>> I'm not sure what to do about smaps_pmd_entry in that case. We
>> probably don't care about compound pages.
>
> No, KSM only places the shared zeropage for PTEs, no need to handle PMDs.
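
For reference, the per-VMA analysis step from the commit message (read
smaps, then find the VMAs with the most KSM pages) could look roughly like
the userspace sketch below. It assumes the field is named "KSM:" as in
this v2 patch, and that the caller has already read /proc/<pid>/smaps into
a NUL-terminated buffer. ksm_scan() and struct ksm_vma are hypothetical
names used for illustration only; they are not part of the patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: one VMA's address range and its KSM size. */
struct ksm_vma {
	unsigned long long start, end;
	unsigned long long ksm_kb;
};

/*
 * Scan smaps-formatted text. Returns the sum of all "KSM:" values in kB
 * and stores the VMA with the largest KSM value in *best (all zeroes if
 * no VMA reports a non-zero KSM value).
 */
static unsigned long long ksm_scan(const char *text, struct ksm_vma *best)
{
	struct ksm_vma cur = { 0, 0, 0 };
	unsigned long long total = 0;

	best->start = best->end = best->ksm_kb = 0;

	while (*text) {
		char line[256];
		size_t len = strcspn(text, "\n");
		size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
		unsigned long long a, b, kb;

		/* Copy one line so sscanf() cannot run into the next one. */
		memcpy(line, text, n);
		line[n] = '\0';

		if (sscanf(line, "%llx-%llx", &a, &b) == 2) {
			/* VMA header line: "start-end perms offset dev inode" */
			cur.start = a;
			cur.end = b;
			cur.ksm_kb = 0;
		} else if (sscanf(line, "KSM: %llu kB", &kb) == 1) {
			/* Field name as proposed in this patch (assumption). */
			cur.ksm_kb = kb;
			total += kb;
			if (kb > best->ksm_kb)
				*best = cur;
		}

		text += len;
		if (*text == '\n')
			text++;
	}
	return total;
}
```

Sorting all VMAs by their KSM value instead of tracking only the largest
one would give the candidate list for the madvise calls.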