Re: [RFC 1/7] mm: introduce MADV_COOL

From: Michal Hocko
Date: Tue May 28 2019 - 12:14:35 EST


On Tue 28-05-19 23:38:11, Hillf Danton wrote:
>
> On Tue, 28 May 2019 20:39:36 +0800 Minchan Kim wrote:
> > On Tue, May 28, 2019 at 08:15:23PM +0800, Hillf Danton wrote:
> > < snip >
> > > > > > +	orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> > > > > > +	for (pte = orig_pte; addr < end; pte++, addr += PAGE_SIZE) {
> > > > >
> > > > > s/end/next/ ?
> > > >
> > > > Why do you think it should be next?
> > > >
> > > Simply based on the following line, and afraid that next != end
> > > > > > +	next = pmd_addr_end(addr, end);
> >
> > pmd_addr_end() can return an address smaller than end, so end is the
> > proper bound here.
> >
> Fair.
>
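
For anyone following the next-vs-end exchange above, here is a small
userspace model of pmd_addr_end(). It is only a sketch: the 4 KiB page
and 2 MiB PMD sizes are assumed (x86-64 defaults), and
range_within_one_pmd() is a made-up helper, not something from the patch.
It shows that the result never exceeds end, and equals end whenever
[addr, end) sits inside a single PMD. Whether next and end coincide in
the patch depends on how the caller clamps the range, which the quoted
hunk does not show.

```c
#include <assert.h>

/* Assumed for illustration: 4 KiB pages, 2 MiB covered per PMD entry. */
#define PAGE_SIZE 4096UL
#define PMD_SHIFT 21
#define PMD_SIZE  (1UL << PMD_SHIFT)
#define PMD_MASK  (~(PMD_SIZE - 1))

/*
 * Userspace model of pmd_addr_end(): the next PMD boundary after addr,
 * clamped to end. The "- 1" on both sides keeps the comparison correct
 * if the boundary wraps to 0 at the top of the address space.
 */
static unsigned long pmd_addr_end(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + PMD_SIZE) & PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Hypothetical helper: does [addr, end) fit under a single PMD entry? */
static int range_within_one_pmd(unsigned long addr, unsigned long end)
{
	assert(addr < end);
	return pmd_addr_end(addr, end) == end;
}
```

With these sizes, pmd_addr_end(0x1ff000, 0x600000) stops at the 0x200000
boundary, while pmd_addr_end(0x200000, 0x201000) simply returns end.
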
> > > > > > +static long madvise_cool(struct vm_area_struct *vma,
> > > > > > +			unsigned long start_addr, unsigned long end_addr)
> > > > > > +{
> > > > > > +	struct mm_struct *mm = vma->vm_mm;
> > > > > > +	struct mmu_gather tlb;
> > > > > > +
> > > > > > +	if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
> > > > > > +		return -EINVAL;
> > > > >
> > > > > No service in case of VM_IO?
> > > >
> > > > I don't know whether VM_IO would have regular LRU pages; I just
> > > > followed the normal convention for DONTNEED and FREE.
> > > > Do you have anything in mind?
> > > >
> > > I want to skip a mapping set up for DMA.
> >
> > Do you mean that the pages in a VM_IO vma are not on the LRU list?
>
> What concerns me is the case where there are IO pages on the LRU list.
> > Or
> > are pages in the vma always pinned, so not worth deactivating or reclaiming?
> >
> I will not be nervous or paranoid if they are pinned.
>
> In short, I prefer to skip IO mappings, since any kind of address range
> can be expected from userspace, and it may well cover an IO mapping.
> And things can get out of control if we reclaim some IO pages while
> the underlying device is trying to fill data into any of them, for instance.

What do you mean by IO pages, and what is the actual problem?
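
To make the request concrete: as I read it, the suggestion amounts to
adding VM_IO to the flag check quoted above. A userspace sketch of that
reading follows. struct vma_model and madvise_cool_check() are made-up
names, and the flag values are assumed for illustration rather than taken
from the patch. Whether such a check is needed at all is exactly the open
question.

```c
#include <assert.h>
#include <errno.h>

/* Assumed flag values, for this model only; just their distinctness matters. */
#define VM_PFNMAP  0x00000400UL
#define VM_LOCKED  0x00002000UL
#define VM_IO      0x00004000UL
#define VM_HUGETLB 0x00400000UL

/* Hypothetical stand-in for the one vm_area_struct field the check reads. */
struct vma_model {
	unsigned long vm_flags;
};

/*
 * Model of the eligibility test at the top of madvise_cool(), with VM_IO
 * added to the rejected set as requested in the quoted discussion.
 * Returns 0 if the vma would be processed, -EINVAL if it is refused.
 */
static long madvise_cool_check(const struct vma_model *vma)
{
	assert(vma != 0);
	if (vma->vm_flags & (VM_LOCKED | VM_HUGETLB | VM_PFNMAP | VM_IO))
		return -EINVAL;
	return 0;
}
```

In this model a vma carrying only VM_IO is refused with -EINVAL, whereas
the check as posted would let it through.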
--
Michal Hocko
SUSE Labs