Re: [PATCH] x86/mm/tlb: Revert retpoline avoidance approach

From: Nadav Amit
Date: Sat Mar 19 2022 - 03:20:53 EST



> On Mar 18, 2022, at 9:33 AM, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> wrote:
>
> 0day reported a regression on a microbenchmark which is intended to
> stress the TLB flushing path:
>
> https://lore.kernel.org/all/20220317090415.GE735@xsang-OptiPlex-9020/
>
> It pointed at a commit from Nadav which intended to remove retpoline
> overhead in the TLB flushing path by taking the 'cond'-ition in
> on_each_cpu_cond_mask(), pre-calculating it, and incorporating it into
> 'cpumask'. That allowed the code to use a bunch of earlier direct
> calls instead of later indirect calls that need a retpoline.
>
> But, in practice, threads can go idle (and into lazy TLB mode where
> they don't need to flush their TLB) between the early and late calls.
> It works in this direction and not in the other because TLB-flushing
> threads tend to hold mmap_lock for write. Contention on that lock
> causes threads to _go_ idle right in this early/late window.
>
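
For readers following along, the early/late window described above can be
sketched in userspace. This is only an illustration under stated assumptions,
not the kernel code: every name here (cpu_is_lazy, should_flush,
build_flush_mask, send_ipis_from_mask, send_ipis_late_check, simulate) is
hypothetical, a plain bitmask stands in for a cpumask, and counting stands in
for sending IPIs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical per-CPU state; stands in for "this CPU is in lazy TLB mode". */
static bool cpu_is_lazy[NR_CPUS];

/* The 'cond': a CPU needs a flush IPI only if it is not lazy. */
static bool should_flush(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return !cpu_is_lazy[cpu];
}

/* Early evaluation: fold the condition into a mask before sending. */
static unsigned int build_flush_mask(void)
{
	unsigned int mask = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (should_flush(cpu))	/* direct call */
			mask |= 1u << cpu;
	return mask;
}

/* Count one IPI per CPU in a precomputed mask, with no re-check. */
static int send_ipis_from_mask(unsigned int mask)
{
	int sent = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			sent++;
	return sent;
}

/* Late evaluation: check the condition per CPU at send time. */
static int send_ipis_late_check(bool (*cond)(int))
{
	int sent = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cond(cpu))	/* indirect call through the callback */
			sent++;
	return sent;
}

/*
 * All CPUs start busy, the mask is computed, then CPUs 1 and 2 go idle
 * (lazy) inside the window before the IPIs are counted.
 */
int simulate(bool early)
{
	unsigned int mask;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_is_lazy[cpu] = false;

	mask = build_flush_mask();
	cpu_is_lazy[1] = true;
	cpu_is_lazy[2] = true;

	return early ? send_ipis_from_mask(mask)
		     : send_ipis_late_check(should_flush);
}
```

In this toy model simulate(true) counts IPIs for all four CPUs, including the
two that went lazy in the window, while simulate(false) counts only the two
that are still busy. The late check pays for an indirect call per CPU but
skips the CPUs that went idle in between.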

Acked-by: Nadav Amit <namit@xxxxxxxxxx>