Re: [PATCH] uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode()

From: Michal Hocko
Date: Mon May 19 2014 - 08:41:41 EST


On Mon 05-05-14 17:13:20, Oleg Nesterov wrote:

Ups, this really slipped through...
Sorry about that.

> Hugh says:
>
> The one I noticed was that it forgets all about memcg (because
> it was copied from KSM, and there the replacement page has already
> been charged to a memcg). See how mm/memory.c do_anonymous_page()
> does a mem_cgroup_charge_anon().
>
> Hopefully not a big problem, uprobes is a system-wide thing and only
> root can insert the probes. But I agree, should be fixed anyway.

I am not familiar with uprobes but AFAIU this can be a non-trivial of
(now) uncharged memory. The fact that only root is allowed to install
probes doesn't change the fact that this is still unaccounted memory and
so setups which are tuned to not trigger OOM killer would be broken.

> Add mem_cgroup_{un,}charge_anon() into uprobe_write_opcode(). To simplify
> the error handling (and avoid the new "uncharge" label) the patch also
> moves anon_vma_prepare() up before we alloc/charge the new page.
>
> While at it fix the comment about ->mmap_sem, it is held for write.

> Suggested-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>

Acked-by: Michal Hocko <mhocko@xxxxxxx>

> ---
> kernel/events/uprobes.c | 23 +++++++++++------------
> 1 files changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 7716c40..a13251e 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -279,18 +279,13 @@ static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t
> * supported by that architecture then we need to modify is_trap_at_addr and
> * uprobe_write_opcode accordingly. This would never be a problem for archs
> * that have fixed length instructions.
> - */
> -
> -/*
> + *
> * uprobe_write_opcode - write the opcode at a given virtual address.
> * @mm: the probed process address space.
> * @vaddr: the virtual address to store the opcode.
> * @opcode: opcode to be written at @vaddr.
> *
> - * Called with mm->mmap_sem held (for read and with a reference to
> - * mm).
> - *
> - * For mm @mm, write the opcode at @vaddr.
> + * Called with mm->mmap_sem held for write.
> * Return 0 (success) or a negative errno.
> */
> int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr,
> @@ -310,21 +305,25 @@ retry:
> if (ret <= 0)
> goto put_old;
>
> + ret = anon_vma_prepare(vma);
> + if (ret)
> + goto put_old;
> +
> ret = -ENOMEM;
> new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vaddr);
> if (!new_page)
> goto put_old;
>
> - __SetPageUptodate(new_page);
> + if (mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL))
> + goto put_new;
>
> + __SetPageUptodate(new_page);
> copy_highpage(new_page, old_page);
> copy_to_page(new_page, vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
>
> - ret = anon_vma_prepare(vma);
> - if (ret)
> - goto put_new;
> -
> ret = __replace_page(vma, vaddr, old_page, new_page);
> + if (ret)
> + mem_cgroup_uncharge_page(new_page);
>
> put_new:
> page_cache_release(new_page);
> --
> 1.5.5.1
>
>

--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/