Re: [PATCH 2/2] mm: thp: kill the bogus ->def_flags check in hugepage_madvise()

From: Gerald Schaefer
Date: Fri Jan 24 2014 - 09:19:26 EST


On Thu, 23 Jan 2014 17:43:34 +0100
Oleg Nesterov <oleg@xxxxxxxxxx> wrote:

> On 01/22, Hugh Dickins wrote:
> >
> > On Wed, 22 Jan 2014, Oleg Nesterov wrote:
> >
> > > hugepage_madvise() checks "mm->def_flags & VM_NOHUGEPAGE" but
> > > this can be never true, currently mm->def_flags can only have
> > > VM_LOCKED.
> >
> > But line 1087 of arch/s390/mm/pgtable.c says
> > mm->def_flags |= VM_NOHUGEPAGE;
> > from 3eabaee998c787e7e1565574821652548f7fc003
> > "KVM: s390: allow sie enablement for multi-threaded programs".
>
> Argh. Thanks Hugh!
>
> Another case when I forgot about /bin/grep. So the patch is wrong,
> at least the changelog certainly is. I am stupid.
>
> But, perhaps, this all still can work? Looks like, s390 already
> implements PR_SET_THP_DISABLE using the same idea, it would be
> nice to avoid another hack.
>
> Gerald, any chance we can revert 8e72033f2a489 "thp: make
> MADV_HUGEPAGE check for mm->def_flags" ? The changelog says "In order
> to also prevent MADV_HUGEPAGE on such an mm", is it really important?
> I mean, if the application calls madvise(MADV_HUGEPAGE) it should
> probably know what it does, and this can be useful after
> PR_SET_THP_DISABLE or KVM_S390_ENABLE_SIE. Of course I do not
> understand this code, perhaps MADV_HUGEPAGE is simply impossible.

Yes, after discussion with Martin, I think that commit 8e72033f2a489 can
be reverted if we add a small add-on patch to the s390 gmap code instead,
like this:

diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 3584ed9..a87cdb4 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -504,6 +504,9 @@ static int gmap_connect_pgtable(unsigned long address, unsigned long segment,
 		if (!pmd_present(*pmd) &&
 		    __pte_alloc(mm, vma, pmd, vmaddr))
 			return -ENOMEM;
+		/* large pmds cannot yet be handled */
+		if (pmd_large(*pmd))
+			return -EFAULT;
 		/* pmd now points to a valid segment table entry. */
 		rmap = kmalloc(sizeof(*rmap), GFP_KERNEL|__GFP_REPEAT);
 		if (!rmap)
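
To show the intended ordering of the two checks in isolation, here is a
minimal userspace sketch. It is not kernel code: struct mock_pmd,
mock_gmap_connect() and the two mock_alloc_*() callbacks are hypothetical
stand-ins for pmd_t, gmap_connect_pgtable() and __pte_alloc(), and only
the error paths touched by the patch are modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pmd entry; not the kernel's pmd_t. */
struct mock_pmd {
	bool present;	/* entry is populated */
	bool large;	/* entry maps a huge page directly */
};

/* Hypothetical allocator hook: returns nonzero on allocation failure. */
typedef int (*mock_pte_alloc_fn)(struct mock_pmd *pmd);

/* Mirrors the order of checks in the patched hunk. */
static int mock_gmap_connect(struct mock_pmd *pmd, mock_pte_alloc_fn alloc)
{
	/* Allocate a page table only if the entry is still empty. */
	if (!pmd->present && alloc(pmd))
		return -ENOMEM;
	/* large pmds cannot yet be handled */
	if (pmd->large)
		return -EFAULT;
	/* pmd now points to a valid (small-page) table. */
	return 0;
}

/* Allocation succeeds and populates the entry with a small-page table. */
static int mock_alloc_ok(struct mock_pmd *pmd)
{
	pmd->present = true;
	return 0;
}

/* Allocation fails and leaves the entry untouched. */
static int mock_alloc_fail(struct mock_pmd *pmd)
{
	(void)pmd;
	return 1;
}
```

In this model an empty entry with a working allocator returns 0, an empty
entry with a failing allocator returns -ENOMEM, and an entry that is
already present and large returns -EFAULT without calling the allocator.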
