Re: [RFC PATCH 3/3] mm/migrate: Create move_phys_pages syscall

From: Thomas Gleixner
Date: Mon Sep 18 2023 - 20:17:23 EST


On Thu, Sep 07 2023 at 03:54, Gregory Price wrote:
> Similar to the move_pages system call, instead of taking a pid and
> list of virtual addresses, this system call takes a list of physical
> addresses.

Silly question. Where are these physical addresses coming from?

In my naive understanding user space deals with virtual addresses for a
reason.

Exposing access to physical addresses is definitely helpful for writing
more powerful exploits, so what restrictions are applied to this?

> +/*
> + * Move a list of pages in the address space of the currently executing
> + * process.
> + */
> +static int kernel_move_phys_pages(unsigned long nr_pages,
> + const void __user * __user *pages,
> + const int __user *nodes,
> + int __user *status, int flags)
> +{
> + int err;
> + nodemask_t target_nodes;
> +
> + /* Check flags */

Documenting the obvious ...

> + if (flags & ~(MPOL_MF_MOVE|MPOL_MF_MOVE_ALL))
> + return -EINVAL;
> +
> + if ((flags & MPOL_MF_MOVE_ALL) && !capable(CAP_SYS_NICE))
> + return -EPERM;

According to this logic here MPOL_MF_MOVE is unrestricted, right?
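
To spell out what the two checks above let through, here is a minimal
user-space model of them. It is only a sketch: check_move_flags() and
has_cap_sys_nice are made-up names standing in for the quoted code and
for capable(CAP_SYS_NICE), and the flag values are copied from
include/uapi/linux/mempolicy.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Values as defined in include/uapi/linux/mempolicy.h */
#define MPOL_MF_MOVE     (1 << 1)
#define MPOL_MF_MOVE_ALL (1 << 2)

/*
 * Hypothetical model of the flag checks in the quoted hunk.
 * has_cap_sys_nice stands in for capable(CAP_SYS_NICE).
 */
static int check_move_flags(int flags, bool has_cap_sys_nice)
{
	/* Any bit other than MOVE/MOVE_ALL is rejected */
	if (flags & ~(MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))
		return -EINVAL;

	/* Only MOVE_ALL is gated on the capability */
	if ((flags & MPOL_MF_MOVE_ALL) && !has_cap_sys_nice)
		return -EPERM;

	return 0;
}
```

In this model check_move_flags(MPOL_MF_MOVE, false) returns 0, so the
plain MPOL_MF_MOVE case passes both checks without any capability.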

But how does an unprivileged process know which physical addresses the
pages have? Confused....
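
For reference, one existing place where user space can ask for a page
frame number is /proc/self/pagemap, and there the kernel zeroes the PFN
field for callers without CAP_SYS_ADMIN (since Linux 4.2). The sketch
below is not part of the patch and only illustrates that interface. The
helper names are made up; the bit layout (bit 63 = present, bits 0-54 =
PFN) follows Documentation/admin-guide/mm/pagemap.rst.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PM_PRESENT   (1ULL << 63)        /* page is present in RAM */
#define PM_PFN_MASK  ((1ULL << 55) - 1)  /* bits 0-54 hold the PFN */

/*
 * Hypothetical helper: extract the PFN from a pagemap entry.
 * Returns 0 when the page is not present. A present page also yields 0
 * when the kernel has zeroed the field for an unprivileged reader.
 */
static uint64_t pagemap_entry_to_pfn(uint64_t entry)
{
	if (!(entry & PM_PRESENT))
		return 0;
	return entry & PM_PFN_MASK;
}

/*
 * Hypothetical helper: read the pagemap entry that covers vaddr in the
 * calling process. Returns 0 on success, -1 on any error.
 */
static int read_pagemap_entry(const void *vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset;
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;

	/* One 64-bit entry per virtual page, indexed by page number */
	offset = (off_t)((uintptr_t)vaddr / (uintptr_t)page_size) *
		 (off_t)sizeof(uint64_t);

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, entry, sizeof(*entry), offset);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Run without CAP_SYS_ADMIN on a recent kernel, a present page would be
expected to report a PFN of 0 here, which is why the question of where
an unprivileged caller gets physical addresses matters.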

> + /* All tasks mapping each page is checked in phys_page_migratable */
> + nodes_setall(target_nodes);

How is the comment related to nodes_setall() and why is nodes_setall()
unconditional when target_nodes is only used in the @nodes != NULL case?

> + if (nodes)
> + err = do_pages_move(NULL, target_nodes, nr_pages, pages,
> + nodes, status, flags);
> + else
> + err = do_pages_stat(NULL, nr_pages, pages, status);

Thanks,

tglx