Re: 3.15-rc8 oops in copy_page_rep after page fault.

From: Linus Torvalds
Date: Fri Jun 06 2014 - 14:26:19 EST


On Fri, Jun 6, 2014 at 10:43 AM, Dave Jones <davej@xxxxxxxxxx> wrote:
>
> RIP: 0010:[<ffffffff8b3287b5>] [<ffffffff8b3287b5>] copy_page_rep+0x5/0x10

Ok, it's the first iteration of "rep movsq" (%rcx is still 0x200) for
copying a page, and the pages are

RSI: ffff880052766000
RDI: ffff880014efe000

which both look like reasonable kernel addresses. So I'm assuming it's
DEBUG_PAGEALLOC that makes this trigger, and since the error code is
0, and the CR2 value matches RSI, it's the source page that seems to
have been freed.
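
As a rough sketch of the reasoning above (not kernel code): the macro and helper names below are made up for illustration, and a 4 KiB page size is assumed. The bit meanings are the architectural x86 page-fault error code bits. An error code of 0 decodes to a kernel-mode read of a not-present page, and a count of 0x200 quadwords is exactly one 4096-byte page, so no quadword has been copied yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits.
 * The macro names are illustrative, not the kernel's own. */
#define PF_BIT_PROT  (1u << 0)  /* 0: page not present, 1: protection fault */
#define PF_BIT_WRITE (1u << 1)  /* 0: read access,      1: write access     */
#define PF_BIT_USER  (1u << 2)  /* 0: kernel mode,      1: user mode        */

/* Hypothetical helper type holding the decoded fault properties. */
struct pf_info {
    bool not_present;
    bool is_read;
    bool kernel_mode;
};

static inline struct pf_info decode_pf_error_code(unsigned long error_code)
{
    struct pf_info info = {
        .not_present = !(error_code & PF_BIT_PROT),
        .is_read     = !(error_code & PF_BIT_WRITE),
        .kernel_mode = !(error_code & PF_BIT_USER),
    };
    return info;
}

/* "rep movsq" moves 8 bytes per iteration and decrements %rcx each time,
 * so a count that still covers the whole page means nothing was copied. */
static inline bool rep_movsq_at_first_iteration(uint64_t rcx, uint64_t page_size)
{
    return rcx * 8 == page_size;
}

/* The faulting address (CR2) equal to the source pointer (%rsi) points at
 * the source page rather than the destination. */
static inline bool fault_is_on_source(uint64_t cr2, uint64_t rsi)
{
    return cr2 == rsi;
}
```

With the values from the oops, decode_pf_error_code(0) reports a kernel read of a not-present page, and rep_movsq_at_first_iteration(0x200, 4096) holds, which is what you would expect if DEBUG_PAGEALLOC had unmapped a freed source page.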

And I see absolutely _zero_ reason for why your 64k mmap_min_addr
should make any difference what-so-ever. That's just odd.

Anyway, can you try to figure out _which_ copy_user_highpage() it is
(by looking at what is around the call-site at
"handle_mm_fault+0x1e0")? The fact that we have a stale
do_huge_pmd_wp_page() on the stack makes me suspect that we have hit
that VM_FAULT_FALLBACK case and this is related to splitting. Adding a
few more people explicitly to the cc in case anybody sees anything
(original email on lkml and linux-mm for context, guys).

Linus