Re: [PATCH 3/3] mm/maps: read proc/pid/maps under RCU

From: SeongJae Park
Date: Tue Jan 23 2024 - 00:36:46 EST


Hi Suren,

On Sun, 21 Jan 2024 23:13:24 -0800 Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:

> With maple_tree supporting vma tree traversal under RCU and per-vma locks
> making vma access RCU-safe, /proc/pid/maps can be read under RCU and
> without the need to read-lock mmap_lock. However vma content can change
> from under us, therefore we make a copy of the vma and we pin pointer
> fields used when generating the output (currently only vm_file and
> anon_name). Afterwards we check for concurrent address space
> modifications, wait for them to end and retry. That last check is needed
> to avoid possibility of missing a vma during concurrent maple_tree
> node replacement, which might report a NULL when a vma is replaced
> with another one. While we take the mmap_lock for reading during such
> contention, we do that momentarily only to record new mm_wr_seq counter.
> This change is designed to reduce mmap_lock contention and prevent a
> process reading /proc/pid/maps files (often a low priority task, such as
> monitoring/data collection services) from blocking address space updates.
>
> Note that this change has a userspace visible disadvantage: it allows for
> sub-page data tearing as opposed to the previous mechanism where data
> tearing could happen only between pages of generated output data.
> Since current userspace considers data tearing between pages to be
> acceptable, we assume it will be able to handle sub-page data tearing
> as well.
>
> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> ---
> fs/proc/internal.h | 2 +
> fs/proc/task_mmu.c | 114 ++++++++++++++++++++++++++++++++++++++++++---
> 2 files changed, 109 insertions(+), 7 deletions(-)
>
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index a71ac5379584..e0247225bb68 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -290,6 +290,8 @@ struct proc_maps_private {
> struct task_struct *task;
> struct mm_struct *mm;
> struct vma_iterator iter;
> + unsigned long mm_wr_seq;
> + struct vm_area_struct vma_copy;
> #ifdef CONFIG_NUMA
> struct mempolicy *task_mempolicy;
> #endif
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 3f78ebbb795f..3886d04afc01 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -126,11 +126,96 @@ static void release_task_mempolicy(struct proc_maps_private *priv)
> }
> #endif
>
> -static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv,
> - loff_t *ppos)
> +#ifdef CONFIG_PER_VMA_LOCK
> +
> +static const struct seq_operations proc_pid_maps_op;
> +/*
> + * Take VMA snapshot and pin vm_file and anon_name as they are used by
> + * show_map_vma.
> + */
> +static int get_vma_snapshow(struct proc_maps_private *priv, struct vm_area_struct *vma)
> {
> + struct vm_area_struct *copy = &priv->vma_copy;
> + int ret = -EAGAIN;
> +
> + memcpy(copy, vma, sizeof(*vma));
> + if (copy->vm_file && !get_file_rcu(&copy->vm_file))
> + goto out;
> +
> + if (copy->anon_name && !anon_vma_name_get_rcu(copy))
> + goto put_file;

From today's updated mm-unstable, which contains this patch, I'm getting the
build error below when CONFIG_ANON_VMA_NAME is not set. It seems this patch
needs to handle that case?

.../linux/fs/proc/task_mmu.c: In function ‘get_vma_snapshow’:
.../linux/fs/proc/task_mmu.c:145:19: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
145 | if (copy->anon_name && !anon_vma_name_get_rcu(copy))
| ^~~~~~~~~
| anon_vma
.../linux/fs/proc/task_mmu.c:161:19: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
161 | if (copy->anon_name)
| ^~~~~~~~~
| anon_vma
.../linux/fs/proc/task_mmu.c:162:41: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
162 | anon_vma_name_put(copy->anon_name);
| ^~~~~~~~~
| anon_vma
.../linux/fs/proc/task_mmu.c: In function ‘put_vma_snapshot’:
.../linux/fs/proc/task_mmu.c:174:18: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
174 | if (vma->anon_name)
| ^~~~~~~~~
| anon_vma
.../linux/fs/proc/task_mmu.c:175:40: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
175 | anon_vma_name_put(vma->anon_name);
| ^~~~~~~~~
| anon_vma
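
For illustration only: one common way to handle a field that exists only under
a config option is to hide every access behind small helpers that compile to
no-ops when the option is off. Below is a minimal userspace sketch of that
pattern, not the actual fix. The struct is a mock, snapshot_pin_anon_name() and
snapshot_unpin_anon_name() are hypothetical names that are not in this patch or
the kernel, and the plain integer refcount is a stand-in for the real
reference counting.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace mock, NOT kernel code. Define CONFIG_ANON_VMA_NAME at build
 * time (e.g. -DCONFIG_ANON_VMA_NAME) to mimic the option being enabled;
 * leave it undefined to mimic the configuration that fails above.
 */
struct anon_vma_name {
	int refcount;	/* stand-in for the real reference count */
};

struct vm_area_struct {
	unsigned long vm_start;
	unsigned long vm_end;
#ifdef CONFIG_ANON_VMA_NAME
	struct anon_vma_name *anon_name;	/* only present when enabled */
#endif
};

#ifdef CONFIG_ANON_VMA_NAME
/* Hypothetical helper: returns 1 on success, 0 if the name is going away. */
static int snapshot_pin_anon_name(struct vm_area_struct *copy)
{
	assert(copy != NULL);
	if (!copy->anon_name)
		return 1;		/* nothing to pin */
	if (copy->anon_name->refcount == 0)
		return 0;		/* already released, caller retries */
	copy->anon_name->refcount++;
	return 1;
}

/* Hypothetical helper: drops the reference taken above, if any. */
static void snapshot_unpin_anon_name(struct vm_area_struct *copy)
{
	assert(copy != NULL);
	if (copy->anon_name)
		copy->anon_name->refcount--;
}
#else
/* Option disabled: the field does not exist, so both helpers are no-ops. */
static int snapshot_pin_anon_name(struct vm_area_struct *copy)
{
	assert(copy != NULL);
	return 1;
}

static void snapshot_unpin_anon_name(struct vm_area_struct *copy)
{
	assert(copy != NULL);
}
#endif
```

With helpers shaped like this, the snapshot code itself never names
->anon_name directly, so it builds the same way with the option on or off.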

[...]


Thanks,
SJ