Re: [PATCH v5 19/25] arm64/mm: Wire up PTE_CONT for user mappings

From: David Hildenbrand
Date: Tue Feb 13 2024 - 09:05:30 EST


On 13.02.24 15:02, Ryan Roberts wrote:
On 13/02/2024 13:45, David Hildenbrand wrote:
On 13.02.24 14:33, Ard Biesheuvel wrote:
On Tue, 13 Feb 2024 at 14:21, Ryan Roberts <ryan.roberts@xxxxxxx> wrote:

On 13/02/2024 13:13, David Hildenbrand wrote:
On 13.02.24 14:06, Ryan Roberts wrote:
On 13/02/2024 12:19, David Hildenbrand wrote:
On 13.02.24 13:06, Ryan Roberts wrote:
On 12/02/2024 20:38, Ryan Roberts wrote:
[...]

+static inline bool mm_is_user(struct mm_struct *mm)
+{
+    /*
+     * Don't attempt to apply the contig bit to kernel mappings,
because
+     * dynamically adding/removing the contig bit can cause page
faults.
+     * These racing faults are ok for user space, since they get
serialized
+     * on the PTL. But kernel mappings can't tolerate faults.
+     */
+    return mm != &init_mm;
+}

We also have the efi_mm as a non-user mm, though I don't think we
manipulate
that while it is live, and I'm not sure if that needs any special
handling.

Well we never need this function in the hot (order-0 folio) path, so I
think I
could add a check for efi_mm here with performance implication. It's
probably
safest to explicitly exclude it? What do you think?

Oops: This should have read "I think I could add a check for efi_mm here
*without* performance implication"

It turns out that efi_mm is only defined when CONFIG_EFI is enabled I
can do
this:

return mm != &init_mm && (!IS_ENABLED(CONFIG_EFI) || mm != &efi_mm);

Is that acceptable? This is my preference, but nothing else outside of efi
references this symbol currently.

Or perhaps I can convince myself that its safe to treat efi_mm like
userspace.
There are a couple of things that need to be garanteed for it to be safe:

     - The PFNs of present ptes either need to have an associated struct
page or
       need to have the PTE_SPECIAL bit set (either pte_mkspecial() or
       pte_mkdevmap())

     - Live mappings must either be static (no changes that could cause
fold/unfold
       while live) or the system must be able to tolerate a temporary fault

Mark suggests efi_mm is not manipulated while live, so that meets the
latter
requirement, but I'm not sure about the former?

I've gone through all the efi code, and conclude that, as Mark suggests, the
mappings are indeed static. And additionally, the ptes are populated
using only
the _private_ ptep API, so there is no issue here. As just discussed with
Mark,
my prefereence is to not make any changes to code, and just add a comment
describing why efi_mm is safe.

Details:

* Registered with ptdump
       * ptep_get_lockless()
* efi_create_mapping -> create_pgd_mapping … -> init_pte:
       * __ptep_get()
       * __set_pte()
* efi_memattr_apply_permissions -> efi_set_mapping_permissions … ->
set_permissions
       * __ptep_get()
       * __set_pte()

Sound good. We could add some VM_WARN_ON if we ever get the efi_mm via the
"official" APIs.

We could, but that would lead to the same linkage issue, which I'm trying to
avoid in the first place:

VM_WARN_ON(IS_ENABLED(CONFIG_EFI) && mm == efi_mm);

This creates new source code dependencies, which I would rather avoid if
possible.

Just a thought, you could have a is_efi_mm() function that abstracts all that.

diff --git a/include/linux/efi.h b/include/linux/efi.h
index c74f47711f0b..152f5fa66a2a 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -692,6 +692,15 @@ extern struct efi {

  extern struct mm_struct efi_mm;

+static inline void is_efi_mm(struct mm_struct *mm)
+{
+#ifdef CONFIG_EFI
+       return mm == &efi_mm;
+#else
+       return false;
+#endif
+}
+
  static inline int
  efi_guidcmp (efi_guid_t left, efi_guid_t right)
  {



That would definitely work, but in that case, I might as well just check for it
in mm_is_user() (and personally I would change the name to mm_is_efi()):


static inline bool mm_is_user(struct mm_struct *mm)
{
         return mm != &init_mm && !mm_is_efi(mm);
}

Any objections?


Any reason not to use IS_ENABLED(CONFIG_EFI) in the above? The extern
declaration is visible to the compiler, and any references should
disappear before the linker could notice that efi_mm does not exist.


Sure, as long as the linker is happy why not. I'll let Ryan mess with that :)

I'm not sure if you are suggesting dropping the mm_is_efi() helper and just use
IS_ENABLED(CONFIG_EFI) in mm_is_user() to guard efi_mm, or if you are suggesting
using IS_ENABLED(CONFIG_EFI) in mm_is_efi() instead of the ifdefery?

The former was what I did initially; It works great, but I didn't like that I
was introducing a new code dependecy between efi and arm64 (nothing else outside
of efi references efi_mm).

So then concluded that it is safe to not worry about efi_mm (thanks for your
confirmation). But then David wanted a VM_WARN check, which reintroduces the
code dependency. So he suggested the mm_is_efi() helper to hide that... This is
all starting to feel circular...

I think Ard meant that inside mm_is_efi(), we could avoid the #ifdef and simply use IS_ENABLED().

--
Cheers,

David / dhildenb