Re: [QUESTION FOR ARM64 TLB] performance issue and implementation difference of TLB flush

From: Gang Li
Date: Mon May 15 2023 - 23:16:33 EST


Hi all!

On 2023/5/5 20:28, Gang Li wrote:
Hi,

I found that in `ghes_unmap`, which runs while holding a spinlock, arm64 and x86
use different strategies for flushing the TLB.

# arm64 call trace:
```
holding a spin lock
ghes_unmap
 clear_fixmap
  __set_fixmap
   flush_tlb_kernel_range
```
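
For reference, this is roughly what the arm64 side does; a simplified paraphrase
of __set_fixmap() from arch/arm64/mm/mmu.c (not the exact upstream code, and
details may differ between kernel versions), showing that clearing a fixmap
entry ends with a kernel-range TLB flush:

```
/*
 * Simplified sketch of arm64's __set_fixmap() (arch/arm64/mm/mmu.c);
 * BUG_ON checks and some details are omitted.
 */
void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags)
{
	unsigned long addr = __fix_to_virt(idx);
	pte_t *ptep = fixmap_pte(addr);

	if (pgprot_val(flags)) {
		/* mapping a page: just install the PTE */
		set_pte(ptep, pfn_pte(phys >> PAGE_SHIFT, flags));
	} else {
		/* clear_fixmap(): tear down the PTE ... */
		pte_clear(&init_mm, addr, ptep);
		/* ... and flush this VA on all CPUs via broadcast TLBI */
		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
	}
}
```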

# x86 call trace:
```
holding a spin lock
ghes_unmap
 clear_fixmap
  __set_fixmap
   mmu.set_fixmap
    native_set_fixmap
     __native_set_fixmap
      set_pte_vaddr
       set_pte_vaddr_p4d
        __set_pte_vaddr
         flush_tlb_one_kernel
```
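
For comparison, the tail of the x86 path; again a simplified paraphrase, this
time of __set_pte_vaddr() from arch/x86/mm/init_64.c. flush_tlb_one_kernel()
ends up as an INVLPG on the local CPU only, with no IPI sent to other CPUs:

```
/*
 * Simplified sketch of x86's __set_pte_vaddr() (arch/x86/mm/init_64.c);
 * not the exact upstream code.
 */
void __set_pte_vaddr(pud_t *pud, unsigned long vaddr, pte_t new_pte)
{
	pmd_t *pmd = pmd_offset(pud, vaddr);
	pte_t *pte = pte_offset_kernel(pmd, vaddr);

	set_pte(pte, new_pte);

	/*
	 * Flush only this one mapping, and only on the local CPU:
	 * flush_tlb_one_kernel() boils down to INVLPG here, without an
	 * IPI-based shootdown of the other CPUs' TLBs.
	 */
	flush_tlb_one_kernel(vaddr);
}
```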

arm64 broadcasts TLB invalidation in ghes_unmap, because a TLB entry can be
allocated for an address regardless of whether the CPU explicitly accesses
that memory.
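
The broadcast happens in hardware: flush_tlb_kernel_range()
(arch/arm64/include/asm/tlbflush.h) issues TLBI VAALE1IS, which invalidates the
entry on every CPU in the inner shareable domain without any IPI. Roughly (a
paraphrase, with helper macros simplified):

```
/*
 * Rough sketch of arm64's flush_tlb_kernel_range()
 * (arch/arm64/include/asm/tlbflush.h).
 */
static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	unsigned long addr;

	/* for large ranges, nuke the whole TLB instead of iterating */
	if ((end - start) > (MAX_TLBI_OPS * PAGE_SIZE)) {
		flush_tlb_all();
		return;
	}

	start = __TLBI_VADDR(start, 0);
	end = __TLBI_VADDR(end, 0);

	dsb(ishst);
	for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12))
		/* "is" = inner shareable: broadcast to all CPUs in hardware */
		__tlbi(vaale1is, addr);
	dsb(ish);
	isb();
}
```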

Why doesn't x86 broadcast TLB invalidation in ghes_unmap? Is there any
difference between x86 and arm64 in their TLB allocation and invalidation
strategies?


I found this in Intel® 64 and IA-32 Architectures Software Developer
Manuals:

4.10.2.3 Details of TLB Use
Subject to the limitations given in the previous paragraph, the
processor may cache a translation for any linear address, even if that
address is not used to access memory. For example, the processor may
cache translations required for prefetches and for accesses that result
from speculative execution that would never actually occur in the
executed code path.

So both x86 and arm64 can cache TLB entries for prefetches and speculative
execution. Why, then, are their flush policies different?

Thanks,
Gang Li