Re: [PATCH v2 0/4] x86: sigcontext fixes, again

From: Stas Sergeev
Date: Sat Oct 31 2015 - 08:43:31 EST


29.10.2015 01:51, Toshi Kani wrote:
On Wed, 2015-10-28 at 13:22 -0600, Toshi Kani wrote:
On Wed, 2015-10-28 at 10:34 -0600, Toshi Kani wrote:
On Wed, 2015-10-28 at 12:53 +0300, Stas Sergeev wrote:
28.10.2015 03:04, Toshi Kani wrote:
On Wed, 2015-10-28 at 07:37 +0900, Linus Torvalds wrote:
On Tue, Oct 27, 2015 at 11:05 PM, Stas Sergeev <stsp@xxxxxxx>
wrote:
I can't easily post an Oops: under X it doesn't even appear -
machine freezes immediately, and under non-KMS console it is
possible to get one, but difficult to screen-shot (using bare
metal, not VM). Also the Oops was seemingly unrelated.
And if you run "dosemu -s" under non-KMS console, you'll also
reproduce this one:
https://bugzilla.kernel.org/show_bug.cgi?id=97321
Hmm. Andrew Morton responded to that initially, but then nothing
happened, and now it's been another six months. Andrew?

The arch/x86/mm/pat.c error handling does seem to be suspect. This
is all code several years old, so none of this is new, and I think
Suresh is gone. Adding a few other people with recent sign-offs to
that file, in the hope that somebody feels like they own it..
In the case of PFNMAP, the range should always be mapped. So, I
wonder why follow_phys() failed with the !pte_present() check.

Stas, do you have a test program that can reproduce 97321?
Get dosemu2 from here:
https://github.com/stsp/dosemu2/releases
or from git, or get dosemu1.
Then boot your kernel with "nomodeset=1" to get a text console.
Run

dosemu -s

and you'll get the bug.
I looked at the dosemu code and was able to reproduce the issue with a test
program. This problem happens when mremap() to /dev/mem (or PFNMAP) is
called with MREMAP_FIXED.

In this case, mremap calls move_vma(), which first calls move_page_tables()
to remap the translation and then calls do_munmap() to remove the original
mapping. Hence, when untrack_pfn() is called from do_munmap(), the
original map is already removed, and follow_phys() fails with the
!pte_present() check.
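
For readers who want to try the sequence described above, a rough user-space
sketch follows. It is not the test program mentioned in this message (that
program was not posted here); the helper name remap_fixed() and its parameters
are hypothetical, and the destination address is simply reserved with an
anonymous PROT_NONE mapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* needed for mremap() and MREMAP_FIXED */
#endif
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): map `len` bytes of `fd` at
 * offset `off`, then move that mapping to a reserved address using
 * MREMAP_MAYMOVE | MREMAP_FIXED. With fd < 0 an anonymous mapping is used,
 * which shows the call sequence but does not involve PFNMAP.
 * Returns the new address, or MAP_FAILED on error.
 */
static void *remap_fixed(int fd, off_t off, size_t len)
{
    int flags = (fd < 0) ? (MAP_PRIVATE | MAP_ANONYMOUS) : MAP_SHARED;
    void *old_addr, *dst, *moved;

    assert(len > 0);         /* mmap() rejects a zero length */

    old_addr = mmap(NULL, len, PROT_READ, flags, fd, off);
    if (old_addr == MAP_FAILED)
        return MAP_FAILED;

    /* Reserve a destination range so the fixed target is known to be free. */
    dst = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst == MAP_FAILED) {
        munmap(old_addr, len);
        return MAP_FAILED;
    }

    /* MREMAP_FIXED moves the mapping onto dst, replacing what was there. */
    moved = mremap(old_addr, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
    if (moved == MAP_FAILED) {
        munmap(old_addr, len);
        munmap(dst, len);
    }
    return moved;
}
```

Passing a descriptor opened on /dev/mem (which needs root) should exercise the
PFNMAP path discussed here and may trigger the bug on an affected kernel, so
it is best tried on a disposable machine. Passing -1 only demonstrates the
mmap/mremap sequence on anonymous memory.
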

I think there are a couple of issues:
- If untrack_pfn() ignores an error from follow_phys() and skips
free_pfn_range(), PAT continues to track the original map that is removed.
- untrack_pfn() calls free_pfn_range() to untrack a given free range.
However, rbt_memtype_erase() requires the freed range to match the tracked
range exactly. This does not support mremap, which needs to free up part
of the tracked range.
- PAT does not track a new translation specified by mremap() with
MREMAP_FIXED.
Thinking further, I think the 1st and 3rd items are non-issues. mremap remaps
the virtual address, but keeps the same cache type and pfns. So, PAT does not have
to change the tracked pfns in this case. The 2nd item is still a problem,
though.
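
The 2nd item can be made concrete with a toy user-space model of an
exact-match range tracker. This is only an illustration under stated
assumptions: it is not the kernel's rbt_memtype_erase(), it uses a flat array
rather than an rbtree, and the names track(), erase_exact() and
struct tracked_range are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 8

/* Hypothetical stand-in for one tracked [start, end) pfn range. */
struct tracked_range {
    uint64_t start;
    uint64_t end;
    int in_use;
};

static struct tracked_range ranges[MAX_RANGES];

/* Record a range; returns 0 on success, -1 if the table is full. */
static int track(uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < MAX_RANGES; i++) {
        if (!ranges[i].in_use) {
            ranges[i].start = start;
            ranges[i].end = end;
            ranges[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/*
 * Erase succeeds only when [start, end) equals a tracked range exactly.
 * A request that covers just part of a tracked range is rejected, which is
 * the limitation described for the partial-free case.
 */
static int erase_exact(uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < MAX_RANGES; i++) {
        if (ranges[i].in_use &&
            ranges[i].start == start && ranges[i].end == end) {
            ranges[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}
```

In this model, tracking 0x1000-0x3000 and then asking to erase only
0x1000-0x2000 fails, while erasing the full 0x1000-0x3000 succeeds.
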
Hello Toshi, thanks for your analysis.
Now, since you do not seem to be preparing a fix, how
about attaching your test case to the bug report for
others to reuse? Or maybe you can even make it a
part of the kernel's test suite - I suppose this will help.