Re: [PATCH 4/4] Documentation/vm: Rework "Temporary Virtual Mappings"

From: Ira Weiny
Date: Sun Apr 24 2022 - 20:59:57 EST


On Thu, Apr 21, 2022 at 08:02:00PM +0200, Fabio M. De Francesco wrote:
> Extend and rework the "Temporary Virtual Mappings" section of the highmem.rst
> documentation.
>
> Despite the local kmaps were introduced by Thomas Gleixner in October 2020,
> documentation was still missing information about them. These additions rely
> largely on Gleixner's patches, Jonathan Corbet's LWN articles, comments by
> Ira Weiny and Matthew Wilcox, and in-code comments from
> ./include/linux/highmem.h.
>
> 1) Add a paragraph to document kmap_local_page().
> 2) Reorder the list of functions by decreasing order of preference of
> use.
> 3) Rework part of the kmap() entry in list.
>
> Cc: Jonathan Corbet <corbet@xxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Ira Weiny <ira.weiny@xxxxxxxxx>
> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@xxxxxxxxx>
> ---
> Documentation/vm/highmem.rst | 71 ++++++++++++++++++++++++++++++------
> 1 file changed, 60 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/vm/highmem.rst b/Documentation/vm/highmem.rst
> index e05bf5524174..960f61e7a552 100644
> --- a/Documentation/vm/highmem.rst
> +++ b/Documentation/vm/highmem.rst
> @@ -50,26 +50,75 @@ space when they use mm context tags.
> Temporary Virtual Mappings
> ==========================
>
> -The kernel contains several ways of creating temporary mappings:
> +The kernel contains several ways of creating temporary mappings. The following
> +list shows them in order of preference of use.
>
> -* vmap(). This can be used to make a long duration mapping of multiple
> - physical pages into a contiguous virtual space. It needs global
> - synchronization to unmap.
> +* kmap_local_page(). This function is used to require short term mappings.
> + It can be invoked from any context (including interrupts) but the mappings
> + can only be used in the context which acquired them.
> +
> + This function should be preferred, where feasible, over all the others.
>
> -* kmap(). This permits a short duration mapping of a single page. It needs
> - global synchronization, but is amortized somewhat. It is also prone to
> - deadlocks when using in a nested fashion, and so it is not recommended for
> - new code.
> + These mappings are per thread, CPU local (i.e., migration from one CPU to
> + another is disabled - this is why they are called "local"), but they don't
> + disable preemption. It's valid to take pagefaults in a local kmap region,
> + unless the context in which the local mapping is acquired does not allow
> + it for other reasons.
> +
> + It is assumed that kmap_local_page() always returns the virtual address

kmap_local_page() does return a kernel virtual address. Why 'assume' this?

Do you mean it returns an address in the direct map?

> + of the mapping, therefore they won't ever fail.

I don't think that returning a virtual address has anything to do with the
assumption they will not fail.

Why do you say this?

> +
> + If a task holding local kmaps is preempted, the maps are removed on context
> + switch and restored when the task comes back on the CPU. As the maps are
> + strictly CPU local, it is guaranteed that the task stays on the CPU and
> + that the CPU cannot be unplugged until the local kmaps are released.
> +
> + Nesting kmap_local_page() and kmap_atomic() mappings is allowed to a certain
> + extent (up to KMAP_TYPE_NR) but their invocations have to be strictly ordered
> + because the map implementation is stack based.

I think I would reference the kmap_local_page() for more details on the
ordering because there have been some conversions I've done which were
complicated by this.

>
> * kmap_atomic(). This permits a very short duration mapping of a single
> page. Since the mapping is restricted to the CPU that issued it, it
> performs well, but the issuing task is therefore required to stay on that
> CPU until it has finished, lest some other task displace its mappings.
>
> - kmap_atomic() may also be used by interrupt contexts, since it is does not
> - sleep and the caller may not sleep until after kunmap_atomic() is called.
> + kmap_atomic() may also be used by interrupt contexts, since it does not
> + sleep and the callers too may not sleep until after kunmap_atomic() is
> + called.
> +
> + Each call of kmap_atomic() in the kernel creates a non-preemptible section
> + and disable pagefaults. This could be a source of unwanted latency, so it
> + should be only used if it is absolutely required, otherwise kmap_local_page()
> + should be used where it is feasible.
>
> - It may be assumed that k[un]map_atomic() won't fail.
> + It is assumed that k[un]map_atomic() won't fail.
> +
> +* kmap(). This should be used to make short duration mapping of a single
> + page with no restrictions on preemption or migration. It comes with an
> + overhead as mapping space is restricted and protected by a global lock
> + for synchronization. When mapping is no more needed, the address that
^^^^^^^^
no longer

> + the page was mapped to must be released with kunmap().
> +
> + Mapping changes must be propagated across all the CPUs. kmap() also
> + requires global TLB invalidation when the kmap's pool wraps and it might
> + block when the mapping space is fully utilized until a slot becomes
> + available. Therefore, kmap() is only callable from preemptible context.
> +
> + All the above work is necessary if a mapping must last for a relatively
> + long time but the bulk of high-memory mappings in the kernel are
> + short-lived and only used in one place. This means that the cost of
> + kmap() is mostly wasted in such cases. kmap() was not intended for long
> + term mappings but it has morphed in that direction and its use is
> + strongly discouraged in newer code and the set of the preceding functions
> + should be preferred.

Nice!

Ira

> +
> + On 64-bit systems, calls to kmap_local_page(), kmap_atomic() and kmap() have
> + no real work to do because a 64-bit address space is more than sufficient to
> + address all the physical memory whose pages are permanently mapped.
> +
> +* vmap(). This can be used to make a long duration mapping of multiple
> + physical pages into a contiguous virtual space. It needs global
> + synchronization to unmap.
>
>
> Cost of Temporary Mappings
> --
> 2.34.1
>