[PATCH v2 0/2] psi: enhance psi with the help of ebpf

From: Yafang Shao
Date: Tue Mar 31 2020 - 06:05:28 EST


PSI gives us a powerful way to analyze memory pressure issues, but we can
make it more powerful with the help of tracepoints, kprobes, ebpf, etc.
With ebpf in particular we can flexibly get more details about the memory
pressure.

In order to achieve this goal, a new parameter is added to
psi_memstall_{enter, leave}, which indicates the specific type of a
memstall. There are ten types of memstall so far:
MEMSTALL_KSWAPD
MEMSTALL_RECLAIM_DIRECT
MEMSTALL_RECLAIM_MEMCG
MEMSTALL_RECLAIM_HIGH
MEMSTALL_KCOMPACTD
MEMSTALL_COMPACT
MEMSTALL_WORKINGSET_REFAULT
MEMSTALL_WORKINGSET_THRASH
MEMSTALL_MEMDELAY
MEMSTALL_SWAPIO
By tracing this newly added argument with a kprobe or tracepoint, we can
tell which type of memstall it is and then make the corresponding
improvement. It can also help us analyze latency spikes caused by
memory pressure.
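
As a rough illustration of how a tracing tool might decode the new
argument, here is a minimal userspace sketch. The enumerator names are the
ones listed above; their numeric values, the MEMSTALL_TYPES sentinel and
the memstall_name() helper are assumptions made for this example and may
not match what the patch defines.

```c
#include <stddef.h>

/*
 * Memstall types, named as in the list above. The ordering (and thus the
 * numeric values) and the MEMSTALL_TYPES sentinel are assumptions.
 */
enum memstall_type {
	MEMSTALL_KSWAPD,
	MEMSTALL_RECLAIM_DIRECT,
	MEMSTALL_RECLAIM_MEMCG,
	MEMSTALL_RECLAIM_HIGH,
	MEMSTALL_KCOMPACTD,
	MEMSTALL_COMPACT,
	MEMSTALL_WORKINGSET_REFAULT,
	MEMSTALL_WORKINGSET_THRASH,
	MEMSTALL_MEMDELAY,
	MEMSTALL_SWAPIO,
	MEMSTALL_TYPES,		/* hypothetical: number of types */
};

/* Printable names, indexed by type (hypothetical table). */
const char *const memstall_names[MEMSTALL_TYPES] = {
	[MEMSTALL_KSWAPD]		= "kswapd",
	[MEMSTALL_RECLAIM_DIRECT]	= "reclaim_direct",
	[MEMSTALL_RECLAIM_MEMCG]	= "reclaim_memcg",
	[MEMSTALL_RECLAIM_HIGH]		= "reclaim_high",
	[MEMSTALL_KCOMPACTD]		= "kcompactd",
	[MEMSTALL_COMPACT]		= "compact",
	[MEMSTALL_WORKINGSET_REFAULT]	= "workingset_refault",
	[MEMSTALL_WORKINGSET_THRASH]	= "workingset_thrash",
	[MEMSTALL_MEMDELAY]		= "memdelay",
	[MEMSTALL_SWAPIO]		= "swapio",
};

/* Map a traced type argument to a name; NULL if it is out of range. */
const char *memstall_name(int type)
{
	if (type < 0 || type >= MEMSTALL_TYPES)
		return NULL;
	return memstall_names[type];
}
```

A tool reading the traced argument could pass it to memstall_name() and
fall back to printing the raw number when NULL is returned.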

But note that we can't use it to build memory pressure for a specific type
of memstall, e.g. memcg pressure or compaction pressure, because it
doesn't implement per-type variants of task->in_memstall, e.g.
task->in_memcgstall or task->in_compactionstall.

Although there are already some tracepoints that can help us achieve this
goal, e.g.
vmscan:mm_vmscan_kswapd_{wake, sleep}
vmscan:mm_vmscan_direct_reclaim_{begin, end}
vmscan:mm_vmscan_memcg_reclaim_{begin, end}
/* no tracepoint for memcg high reclaim */
compaction:mm_compaction_kcompactd_{wake, sleep}
compaction:mm_compaction_{begin, end}
/* no tracepoint for workingset refault */
/* no tracepoint for workingset thrashing */
/* no tracepoint for use memdelay */
/* no tracepoint for swapio */
psi_memstall_{enter, leave} gives us a unified entry point for all
types of memstall, and we don't need to add the many begin and end
tracepoints that haven't been implemented yet.

Patch #2 gives an example of how to use it with ebpf. With the help of
ebpf we can trace a specific task, application, container, etc. It can
also help us analyze the spread of latencies and whether they were
clustered at a point in time or spread out over long periods of time.
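
To make the "spread of latencies" idea concrete, below is a small
userspace sketch of a power-of-two latency histogram, the kind of
aggregation a tracing program might keep between an enter and a leave
event. It is not taken from patch #2; LAT_SLOTS, struct lat_hist,
lat_slot() and lat_hist_add() are all hypothetical names, and timestamps
are assumed to be in microseconds.

```c
#include <stdint.h>

#define LAT_SLOTS 32	/* hypothetical: number of power-of-two buckets */

struct lat_hist {
	uint64_t slots[LAT_SLOTS];
};

/*
 * Bucket index is floor(log2(delta)), capped at the last slot.
 * Deltas of 0 and 1 both land in slot 0.
 */
unsigned int lat_slot(uint64_t delta_us)
{
	unsigned int slot = 0;

	while (delta_us > 1 && slot < LAT_SLOTS - 1) {
		delta_us >>= 1;
		slot++;
	}
	return slot;
}

/* Account one stall from the timestamps taken at enter and leave. */
void lat_hist_add(struct lat_hist *h, uint64_t enter_us, uint64_t leave_us)
{
	if (leave_us < enter_us)
		return;		/* ignore out-of-order pairs */
	h->slots[lat_slot(leave_us - enter_us)]++;
}
```

Keeping one such histogram per memstall type, or per time window, would
show whether stalls are clustered or spread out.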

To summarize, with the pressure data in /proc/pressure/memory we know that
the system is under memory pressure, and then with the tracing facility
newly added in this patchset we can find the reason for this memory
pressure and think about how to make a change.
The workflow can be illustrated as below.

                     REASON          ACTION
                   | compaction    | improve compaction |
                   | vmscan        | improve vmscan     |
 Memory pressure  -| workingset    | improve workingset |
                   | etc           | ...                |
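
For the first step of this workflow, the sketch below parses one line of
/proc/pressure/memory in userspace. It assumes the usual PSI line layout
("some" or "full", followed by avg10=, avg60=, avg300= and total=);
struct psi_line and psi_parse_line() are hypothetical names used only for
this example.

```c
#include <stdio.h>
#include <string.h>

struct psi_line {
	char kind[8];			/* "some" or "full" */
	double avg10, avg60, avg300;	/* percentage of time stalled */
	unsigned long long total_us;	/* cumulative stall time, in us */
};

/* Parse one line of the pressure file; 0 on success, -1 otherwise. */
int psi_parse_line(const char *line, struct psi_line *out)
{
	int n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		       out->kind, &out->avg10, &out->avg60, &out->avg300,
		       &out->total_us);

	if (n != 5)
		return -1;
	if (strcmp(out->kind, "some") && strcmp(out->kind, "full"))
		return -1;
	return 0;
}
```

A monitor could read the file with fgets(), call psi_parse_line() on each
line, and start tracing the memstall type only once avg10 crosses a
threshold of its choosing.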

Yafang Shao (2):
psi: introduce various types of memstall
psi, tracepoint: introduce tracepoints for psi_memstall_{enter, leave}

block/blk-cgroup.c | 4 ++--
block/blk-core.c | 4 ++--
include/linux/psi.h | 15 +++++++++++----
include/linux/psi_types.h | 13 +++++++++++++
include/trace/events/sched.h | 41 +++++++++++++++++++++++++++++++++++++++++
kernel/sched/psi.c | 14 ++++++++++++--
mm/compaction.c | 4 ++--
mm/filemap.c | 4 ++--
mm/memcontrol.c | 4 ++--
mm/page_alloc.c | 8 ++++----
mm/page_io.c | 4 ++--
mm/vmscan.c | 8 ++++----
12 files changed, 97 insertions(+), 26 deletions(-)

--
2.18.2