Re: [PATCH v1] mm/gup: disallow FOLL_FORCE|FOLL_WRITE on hugetlb mappings

From: David Hildenbrand
Date: Tue Nov 22 2022 - 04:06:41 EST


On 21.11.22 22:33, Andrew Morton wrote:
On Mon, 21 Nov 2022 09:05:43 +0100 David Hildenbrand <david@xxxxxxxxxx> wrote:

MikeK do you have test cases?

Sorry, I do not have any test cases.

I can ask one of our product groups about their usage. But, that would
certainly not be a comprehensive view.

With

https://lkml.kernel.org/r/20221116102659.70287-1-david@xxxxxxxxxx

on its way, the RDMA concern should be gone, hopefully.

@Andrew, can you queue this one? Thanks.

This is all a little tricky.

It's not good that 6.0 and earlier permit unprivileged userspace to
trigger a WARN. But we cannot backport this fix into earlier kernels
because it requires the series "mm/gup: remove FOLL_FORCE usage from
drivers (reliable R/O long-term pinning)".

Is it possible to come up with a fix for 6.1 and earlier which won't
break RDMA?

Let's recap:

(1) Nobody has reported an RDMA regression so far, it was all pure
speculation. The only report we saw was via ptrace when fuzzing
syscalls.

(2) To trigger it, one would need a hugetlb MAP_PRIVATE mapping without
PROT_WRITE. For example:

  mmap(0, SIZE, PROT_READ,
       MAP_PRIVATE|MAP_ANON|MAP_HUGETLB|MAP_HUGE_2MB, -1, 0)
 or
  mmap(0, SIZE, PROT_READ, MAP_PRIVATE, hugetlbfd, 0)

While that's certainly valid, it's not the common use case with
hugetlb pages.

(3) Before 1d8d14641fd9 (< v6.0), it "worked by accident" but was wrong:
pages would get mapped writable into page tables, even though we did
not have VM_WRITE. FOLL_FORCE support is essentially absent but not
fenced properly.

(4) With 1d8d14641fd9 (v6.0 + v6.1-rc), it results in a warning instead.

(5) This patch silences the warning.
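
To make the scenario in (2) concrete, here is a rough userspace sketch.
It is not the fuzzer's reproducer. It assumes that a write through
/proc/self/mem takes the same FOLL_FORCE|FOLL_WRITE path as ptrace, that
2 MiB huge pages are reserved, and the helper name try_forced_write()
is made up for illustration.

```c
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older libc headers; values match the Linux uapi headers. */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB (21 << MAP_HUGE_SHIFT)
#endif

#define HUGE_SZ (2UL << 20)

/*
 * Hypothetical helper (not from the patch or the original report).
 *
 * Returns -1 if the setup failed (e.g. no huge pages reserved),
 *          0 if the forced write was refused,
 *          1 if the forced write went through.
 */
static int try_forced_write(void)
{
	char byte = 1;
	ssize_t n;
	void *p;
	int fd;

	/* The mapping address is passed as a file offset below. */
	assert(sizeof(off_t) >= sizeof(void *));

	fd = open("/proc/self/mem", O_RDWR);
	if (fd < 0) {
		perror("open /proc/self/mem");
		return -1;
	}

	/* Private hugetlb mapping without PROT_WRITE, as in (2). */
	p = mmap(NULL, HUGE_SZ, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_2MB,
		 -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return -1;
	}

	/* Write into the read-only mapping behind the page protections. */
	n = pwrite(fd, &byte, 1, (off_t)(unsigned long)p);

	munmap(p, HUGE_SZ);
	close(fd);
	return n == 1 ? 1 : 0;
}
```

Calling try_forced_write() once from main() and printing the result is
enough. On a kernel as described in (4), I would expect the call to be
accompanied by the warning in dmesg, so only run it on a test machine.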

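Items (3) to (5) can also be written down as a small decision table.
This is only a userspace model of the recap, not kernel code: all type
and function names are made up for illustration, and "rejected" for (5)
is my reading of the patch subject (disallow FOLL_FORCE|FOLL_WRITE on
hugetlb mappings).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; none of these are kernel identifiers. */
enum kernel_era {
	ERA_BEFORE_1D8D14641FD9,	/* < v6.0 */
	ERA_WITH_1D8D14641FD9,		/* v6.0 + v6.1-rc */
	ERA_WITH_THIS_PATCH
};

enum outcome {
	OUTCOME_NOT_COVERED,		/* case not discussed in the recap */
	OUTCOME_MAPPED_WRITABLE,	/* (3): "worked by accident", but wrong */
	OUTCOME_WARNING,		/* (4) */
	OUTCOME_REJECTED		/* (5) */
};

struct request {
	bool is_hugetlb;
	bool is_private;	/* MAP_PRIVATE (COW) mapping */
	bool vma_writable;	/* mapping has VM_WRITE */
	bool foll_write;
	bool foll_force;
};

static enum outcome forced_write_outcome(enum kernel_era era,
					 struct request r)
{
	/* Only the case from (2) is modelled: forced write, no VM_WRITE. */
	bool affected = r.is_hugetlb && r.is_private && !r.vma_writable &&
			r.foll_write && r.foll_force;

	if (!affected)
		return OUTCOME_NOT_COVERED;

	switch (era) {
	case ERA_BEFORE_1D8D14641FD9:
		return OUTCOME_MAPPED_WRITABLE;
	case ERA_WITH_1D8D14641FD9:
		return OUTCOME_WARNING;
	case ERA_WITH_THIS_PATCH:
		return OUTCOME_REJECTED;
	}
	return OUTCOME_NOT_COVERED;
}

/* Usage example: one check per recap item. */
static void recap_self_check(void)
{
	struct request r = {
		.is_hugetlb = true, .is_private = true,
		.vma_writable = false,
		.foll_write = true, .foll_force = true,
	};

	assert(forced_write_outcome(ERA_BEFORE_1D8D14641FD9, r) ==
	       OUTCOME_MAPPED_WRITABLE);
	assert(forced_write_outcome(ERA_WITH_1D8D14641FD9, r) ==
	       OUTCOME_WARNING);
	assert(forced_write_outcome(ERA_WITH_THIS_PATCH, r) ==
	       OUTCOME_REJECTED);
}
```

Everything outside the forced-write-without-VM_WRITE case is
deliberately reported as "not covered", because the recap makes no
statement about it.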

Ways forward are:

(1) Implement FOLL_FORCE for hugetlb and backport that. Fixes the
warning in 6.0 and wrong behavior before that. The functionality,
however, might not be required in 6.2 at all anymore: the last
remaining use case would be ptrace (for which, again, no actual
users have reported breakages).

(2) Use this patch and backport it into 6.0/6.1 to fix the warning. RDMA
will be handled properly in 6.2 via reliable long-term pinnings.

(3) Use this patch and backport it into 6.0/6.1 to fix the warning.
Further, backport the reliable long-term pinning changes into
6.0/6.1 if there are user reports.

(4) On user report regarding RDMA in 6.0 and 6.1, revert the sanity
check that triggers the warning and restore previous (wrong)
behavior.


To summarize, the benefit of (1) would be to have ptrace on hugetlb COW
mappings working. As stated, I'd like to minimize FOLL_FORCE
implementations if there are no legacy users because FOLL_FORCE has a
proven record of security issues. Further, backports to < 6.0 might not
be straightforward.

I'd suggest (2), but I'm happy to hear other opinions.

--
Thanks,

David / dhildenb