[PATCH 0/2] Close race leading to pagetable corruption using hugetlbfs

From: Mel Gorman
Date: Fri Jul 27 2012 - 06:46:16 EST


This is a two-patch series to fix a bug where messages like the following
appear in the kernel log:

[ ..........] Lots of bad pmd messages followed by this
[ 127.164256] mm/memory.c:391: bad pmd ffff880412e04fe8(80000003de4000e7).
[ 127.164257] mm/memory.c:391: bad pmd ffff880412e04ff0(80000003de6000e7).
[ 127.164258] mm/memory.c:391: bad pmd ffff880412e04ff8(80000003de0000e7).
[ 127.186778] ------------[ cut here ]------------
[ 127.186781] kernel BUG at mm/filemap.c:134!
[ 127.186782] invalid opcode: 0000 [#1] SMP
[ 127.186783] CPU 7
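
As a reading aid for the values in parentheses above, here is a minimal
sketch that decodes them, assuming the standard x86-64 page-table entry bit
layout. The macro and function names are illustrative only and are not the
kernel's own definitions. It shows that each logged value has the PSE bit
set, which is consistent with a huge-page entry being found where a pointer
to a page table was expected.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* x86-64 page-table entry flag bits (illustrative names, not kernel macros). */
#define PTE_PRESENT  (1ULL << 0)
#define PTE_RW       (1ULL << 1)
#define PTE_USER     (1ULL << 2)
#define PTE_ACCESSED (1ULL << 5)
#define PTE_DIRTY    (1ULL << 6)
#define PTE_PSE      (1ULL << 7)   /* entry maps a large page directly */
#define PTE_NX       (1ULL << 63)  /* no-execute */

/* Physical address field of a PMD entry that maps a 2MB page: bits 51..21. */
#define PMD_HUGE_ADDR_MASK 0x000FFFFFFFE00000ULL

/* Non-zero if the entry is present and maps a huge page. */
static int pmd_is_huge(uint64_t pmd)
{
	return (pmd & PTE_PRESENT) && (pmd & PTE_PSE);
}

/* Physical address of the 2MB page mapped by a huge PMD entry. */
static uint64_t pmd_huge_phys(uint64_t pmd)
{
	return pmd & PMD_HUGE_ADDR_MASK;
}

/* Print the flags and mapped physical address of a raw PMD value. */
static void describe_pmd(uint64_t pmd)
{
	printf("pmd %016" PRIx64 ":%s%s%s%s%s%s%s phys=0x%" PRIx64 "\n",
	       pmd,
	       (pmd & PTE_PRESENT)  ? " present"  : "",
	       (pmd & PTE_RW)       ? " rw"       : "",
	       (pmd & PTE_USER)     ? " user"     : "",
	       (pmd & PTE_ACCESSED) ? " accessed" : "",
	       (pmd & PTE_DIRTY)    ? " dirty"    : "",
	       (pmd & PTE_PSE)      ? " pse"      : "",
	       (pmd & PTE_NX)       ? " nx"       : "",
	       pmd_huge_phys(pmd));
}
```

Calling describe_pmd(0x80000003de4000e7ULL) on the first logged value reports
a present, writable, user, accessed, dirty entry with PSE and NX set that
maps the 2MB page at physical address 0x3de400000.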

The messy details of the bug are in patch 2. Patch 1 of the series reverts
a patch that is currently in mmotm. That patch avoids taking i_mmap_mutex,
but the mutex is required to stabilise the page count during unsharing. The
mmotm patch looks like a mistake and it should be dealt with sooner rather
than later.

There is a potentially large snag with patch 2, but I'm sending it now anyway
because patch 1 of the series has to be dealt with. The snag is that while
patch 2 works for me with the test case included in the patch, Larry Woodman
reports that it does *not* fix the bug for him. We have yet to establish
whether this is because of something RHEL-specific or because my test machine
is simply unable to reproduce the race with the patch applied.

 include/linux/hugetlb.h |    3 +++
 mm/hugetlb.c            |   28 ++++++++++++++++++++++++++--
 mm/memory.c             |    7 +++++--
 3 files changed, 34 insertions(+), 4 deletions(-)

--
1.7.9.2
