Re: [RFC 4/7] mm: factor out madvise's core functionality

From: Minchan Kim
Date: Mon May 20 2019 - 21:29:25 EST


Hi Oleksandr,

On Mon, May 20, 2019 at 04:26:33PM +0200, Oleksandr Natalenko wrote:
> Hi.
>
> On Mon, May 20, 2019 at 12:52:51PM +0900, Minchan Kim wrote:
> > This patch factor out madvise's core functionality so that upcoming
> > patch can reuse it without duplication.
> >
> > It shouldn't change any behavior.
> >
> > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
> > ---
> > mm/madvise.c | 168 +++++++++++++++++++++++++++------------------------
> > 1 file changed, 89 insertions(+), 79 deletions(-)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 9a6698b56845..119e82e1f065 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -742,7 +742,8 @@ static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
> > return 0;
> > }
> >
> > -static long madvise_dontneed_free(struct vm_area_struct *vma,
> > +static long madvise_dontneed_free(struct task_struct *tsk,
> > + struct vm_area_struct *vma,
> > struct vm_area_struct **prev,
> > unsigned long start, unsigned long end,
> > int behavior)
> > @@ -754,8 +755,8 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
> > if (!userfaultfd_remove(vma, start, end)) {
> > *prev = NULL; /* mmap_sem has been dropped, prev is stale */
> >
> > - down_read(&current->mm->mmap_sem);
> > - vma = find_vma(current->mm, start);
> > + down_read(&tsk->mm->mmap_sem);
> > + vma = find_vma(tsk->mm, start);
> > if (!vma)
> > return -ENOMEM;
> > if (start < vma->vm_start) {
> > @@ -802,7 +803,8 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
> > * Application wants to free up the pages and associated backing store.
> > * This is effectively punching a hole into the middle of a file.
> > */
> > -static long madvise_remove(struct vm_area_struct *vma,
> > +static long madvise_remove(struct task_struct *tsk,
> > + struct vm_area_struct *vma,
> > struct vm_area_struct **prev,
> > unsigned long start, unsigned long end)
> > {
> > @@ -836,13 +838,13 @@ static long madvise_remove(struct vm_area_struct *vma,
> > get_file(f);
> > if (userfaultfd_remove(vma, start, end)) {
> > /* mmap_sem was not released by userfaultfd_remove() */
> > - up_read(&current->mm->mmap_sem);
> > + up_read(&tsk->mm->mmap_sem);
> > }
> > error = vfs_fallocate(f,
> > FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> > offset, end - start);
> > fput(f);
> > - down_read(&current->mm->mmap_sem);
> > + down_read(&tsk->mm->mmap_sem);
> > return error;
> > }
> >
> > @@ -916,12 +918,13 @@ static int madvise_inject_error(int behavior,
> > #endif
>
> What about madvise_inject_error() and get_user_pages_fast() in it
> please?

Good point. Maybe, there more places where assume context is "current" so
I'm thinking to limit hints we could allow from external process.
It would be better for maintainance point of view in that we could know
the workload/usecases when someone ask new advises from external process
without making every hints works both contexts.

Thanks.