Re: [PATCH v1] x86/mm/pat: fix VM_PAT handling in COW mappings

From: David Hildenbrand
Date: Thu Mar 14 2024 - 12:42:32 EST


On 12.03.24 20:38, David Hildenbrand wrote:
> On 12.03.24 20:22, Matthew Wilcox wrote:
>> On Tue, Mar 12, 2024 at 07:11:18PM +0100, David Hildenbrand wrote:
>>> PAT handling won't do the right thing in COW mappings: the first PTE
>>> (or, in fact, all PTEs) can be replaced during write faults to point at
>>> anon folios. Reliably recovering the correct PFN and cachemode using
>>> follow_phys() from PTEs will not work in COW mappings.
>>
>> I guess the first question is: Why do we want to support COW mappings
>> of VM_PAT areas? What breaks if we just disallow it?
>
> Well, that was my first approach. Then I decided to be less radical (IOW,
> make my life easier by breaking less user space) and "fix it" with
> minimal effort.
>
> Breaking some weird user space is possible, although I believe that for
> most such mappings MAP_PRIVATE doesn't make too much sense.
>
> Nasty COW support for VM_PFNMAP mappings dates back forever. So does PAT
> support.
>
> I can try digging through some possible user space users tomorrow.

As discussed, MAP_PRIVATE doesn't make too much sense for most PFNMAP mappings.

However, /dev/mem and /proc/vmcore are still used with MAP_PRIVATE in some cases.

Side note: /proc/vmcore is a bit weird: mmap_vmcore() sets VM_MIXEDMAP, and then we might call remap_pfn_range(), which additionally sets VM_PFNMAP. I'm not sure that's what we want to happen ...

As far as I can see, makedumpfile always mmaps the memory to be dumped (/dev/mem, /proc/vmcore) using PROT_READ+MAP_PRIVATE, resulting in a COW mapping.


In my opinion, we should use this fairly simple fix to keep things working for now, and look into disallowing MAP_PRIVATE mappings of VM_PFNMAP areas separately, for all architectures.

But I'll leave the decision to x86 maintainers.

--
Cheers,

David / dhildenb