Re: [PATCH v5] mm/gup: disallow GUP writing to file-backed mappings by default

From: Jason Gunthorpe
Date: Fri Apr 28 2023 - 09:17:23 EST


On Fri, Apr 28, 2023 at 12:42:32AM +0100, Lorenzo Stoakes wrote:
> Writing to file-backed mappings which require folio dirty tracking using
> GUP is a fundamentally broken operation, as kernel write access to GUP
> mappings does not adhere to the semantics expected by a file system.
>
> A GUP caller uses the direct mapping to access the folio, which does not
> cause write notify to trigger, nor does it enforce that the caller marks
> the folio dirty.
>
> The problem arises when, after an initial write to the folio, writeback
> results in the folio being cleaned and then the caller, via the GUP
> interface, writes to the folio again.
>
> Because this secondary, direct mapping to the folio is used, no write
> notify will occur, and if the caller does mark the folio dirty, it does
> so unexpectedly.
>
> For example, consider the following scenario:
>
> 1. A folio is written to via GUP which write-faults the memory, notifying
> the file system and dirtying the folio.
> 2. Later, writeback is triggered, resulting in the folio being cleaned and
> the PTE being marked read-only.
> 3. The GUP caller writes to the folio, as it is mapped read/write via the
> direct mapping.
> 4. The GUP caller, now done with the page, unpins it and sets it dirty
> (though it does not have to).
>
> This results both in data being written to a folio without write notify,
> and in the folio being dirtied unexpectedly (if the caller decides to do
> so).
>
> This issue was first reported by Jan Kara [1] in 2018, where the problem
> resulted in file system crashes.
>
> This is only relevant when the mappings are file-backed and the underlying
> file system requires folio dirty tracking. File systems which do not, such
> as shmem or hugetlb, are not at risk and therefore can be written to
> without issue.
>
> Unfortunately this limitation of GUP has been present for some time and
> requires future rework of the GUP API in order to provide correct write
> access to such mappings.
>
> However, for the time being we introduce this check to prevent the most
> egregious case of this occurring, use of the FOLL_LONGTERM pin.
>
> These mappings are considerably more likely to be written to after
> folios are cleaned and thus simply must not be permitted to do so.
>
> As part of this change we separate out vma_needs_dirty_tracking() as a
> helper function to determine whether a VMA requires folio dirty
> tracking, distinct from vma_wants_writenotify(), which is specific to
> determining which PTE flags to set.
>
> [1]:https://lore.kernel.org/linux-mm/20180103100430.GE4911@xxxxxxxxxxxxxx/
>
> Suggested-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
> ---
> include/linux/mm.h | 1 +
> mm/gup.c | 41 ++++++++++++++++++++++++++++++++++++++++-
> mm/mmap.c | 36 +++++++++++++++++++++++++++---------
> 3 files changed, 68 insertions(+), 10 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>

Jason