[PATCH] arm64: entry: Improve the performance of system calls

From: Zhen Lei
Date: Fri Sep 03 2021 - 08:22:53 EST


Commit 582f95835a8f ("arm64: entry: convert el0_sync to C") converted many
functions from assembly to C, which greatly improves readability. However,
el0_svc() and el0_svc_compat() handle system call requests from user mode
and may be on the hot path.

Although SVC is the first case of the switch statement in the C code, the
compiler optimizes the switch statement as a whole, so the SVC case gets no
priority over the other cases.

Use likely() so that the SVC handler is invoked directly after a single
comparison, without going through the switch table lookup.
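
The pattern can be sketched in user space as follows. This is only an
illustration, not the kernel code: the handler enum and the function names
dispatch() and dispatch_selftest() are hypothetical, and likely() is assumed
to expand to __builtin_expect() as it does with GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition: hint that the condition is usually true. */
#define likely(x) __builtin_expect(!!(x), 1)

/* Exception class values from the Arm architecture. */
#define EC_SVC64    0x15u
#define EC_IABT_LOW 0x20u
#define EC_DABT_LOW 0x24u

/* Hypothetical stand-ins for el0_svc(), el0_da(), el0_ia(), el0_inv(). */
enum handler { H_SVC, H_DABT, H_IABT, H_INV };

/* Test the hot case with one compare; only the rest reaches the switch. */
static enum handler dispatch(uint32_t ec)
{
	if (likely(ec == EC_SVC64))
		return H_SVC;

	switch (ec) {
	case EC_DABT_LOW:
		return H_DABT;
	case EC_IABT_LOW:
		return H_IABT;
	default:
		return H_INV;
	}
}

/* Usage example: every class still reaches the same handler as before. */
void dispatch_selftest(void)
{
	assert(dispatch(EC_SVC64) == H_SVC);
	assert(dispatch(EC_DABT_LOW) == H_DABT);
	assert(dispatch(EC_IABT_LOW) == H_IABT);
	assert(dispatch(0x00u) == H_INV);
}
```

The behavior is unchanged; only the order in which the comparisons are
emitted is influenced.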

Disassembly after this patch:
0000000000000ff0 <el0t_64_sync_handler>:
ff0: d503245f bti c
ff4: d503233f paciasp
ff8: a9bf7bfd stp x29, x30, [sp, #-16]!
ffc: 910003fd mov x29, sp
1000: d5385201 mrs x1, esr_el1
1004: 531a7c22 lsr w2, w1, #26
1008: f100545f cmp x2, #0x15
100c: 540000a1 b.ne 1020 <el0t_64_sync_handler+0x30>
1010: 97fffe14 bl 860 <el0_svc>
1014: a8c17bfd ldp x29, x30, [sp], #16
1018: d50323bf autiasp
101c: d65f03c0 ret
1020: f100705f cmp x2, #0x1c
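
The "lsr w2, w1, #26" and "cmp x2, #0x15" pair above corresponds to
extracting the exception class from bits [31:26] of ESR_EL1 and comparing it
with the SVC64 class. A minimal user-space sketch of that decoding, with
hypothetical helper names (esr_to_ec(), is_svc64(), esr_selftest()) and
constants written out by hand rather than taken from the kernel headers:

```c
#include <assert.h>
#include <stdint.h>

/* EC occupies ESR bits [31:26]; values written out here as assumptions. */
#define ESR_EC_SHIFT 26
#define ESR_EC_MASK  0x3fu
#define EC_SVC64     0x15u
#define EC_DABT_LOW  0x24u

/* Shift the syndrome value down and keep the 6-bit exception class. */
static uint32_t esr_to_ec(uint64_t esr)
{
	return (uint32_t)(esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

/* The single comparison that the fast path performs. */
static int is_svc64(uint64_t esr)
{
	return esr_to_ec(esr) == EC_SVC64;
}

/* Usage example with two hand-built syndrome values. */
void esr_selftest(void)
{
	/* EC 0x15 with the IL bit (bit 25) set and a zero immediate. */
	assert(esr_to_ec(0x56000000u) == EC_SVC64);
	assert(is_svc64(0x56000000u));

	/* EC 0x24 (data abort from a lower EL) must not take the fast path. */
	assert(esr_to_ec(0x92000007u) == EC_DABT_LOW);
	assert(!is_svc64(0x92000007u));
}
```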

Running "./lat_syscall null" on my board (BogoMIPS : 200.00) shows a saving
of about 10ns per system call.

Before:
Simple syscall: 0.2365 microseconds
Simple syscall: 0.2354 microseconds
Simple syscall: 0.2339 microseconds

After:
Simple syscall: 0.2255 microseconds
Simple syscall: 0.2254 microseconds
Simple syscall: 0.2256 microseconds
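
For reference, the figure of about 10ns can be checked from the six samples
above. The sketch below only averages those numbers; the helper names
mean3() and saving_ns() are made up for this illustration. It gives a mean
saving of roughly 9.8ns, or about 4% of the original latency.

```c
#include <assert.h>

/* Samples copied from the lat_syscall output above, in microseconds. */
static const double before_us[3] = { 0.2365, 0.2354, 0.2339 };
static const double after_us[3]  = { 0.2255, 0.2254, 0.2256 };

/* Arithmetic mean of three samples. */
static double mean3(const double v[3])
{
	return (v[0] + v[1] + v[2]) / 3.0;
}

/* Mean saving per call, converted from microseconds to nanoseconds. */
double saving_ns(void)
{
	return (mean3(before_us) - mean3(after_us)) * 1000.0;
}

/* Mean saving as a percentage of the mean latency before the patch. */
double saving_percent(void)
{
	return saving_ns() / (mean3(before_us) * 1000.0) * 100.0;
}
```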

Signed-off-by: Zhen Lei <thunder.leizhen@xxxxxxxxxx>
---
arch/arm64/kernel/entry-common.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 32f9796c4ffe77b..062eb5a895ec6f3 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -607,11 +607,14 @@ static void noinstr el0_fpac(struct pt_regs *regs, unsigned long esr)
 asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
 {
 	unsigned long esr = read_sysreg(esr_el1);
+	unsigned long ec = ESR_ELx_EC(esr);

-	switch (ESR_ELx_EC(esr)) {
-	case ESR_ELx_EC_SVC64:
+	if (likely(ec == ESR_ELx_EC_SVC64)) {
 		el0_svc(regs);
-		break;
+		return;
+	}
+
+	switch (ec) {
 	case ESR_ELx_EC_DABT_LOW:
 		el0_da(regs, esr);
 		break;
@@ -730,11 +733,14 @@ static void noinstr el0_svc_compat(struct pt_regs *regs)
 asmlinkage void noinstr el0t_32_sync_handler(struct pt_regs *regs)
 {
 	unsigned long esr = read_sysreg(esr_el1);
+	unsigned long ec = ESR_ELx_EC(esr);

-	switch (ESR_ELx_EC(esr)) {
-	case ESR_ELx_EC_SVC32:
+	if (likely(ec == ESR_ELx_EC_SVC32)) {
 		el0_svc_compat(regs);
-		break;
+		return;
+	}
+
+	switch (ec) {
 	case ESR_ELx_EC_DABT_LOW:
 		el0_da(regs, esr);
 		break;
--
2.25.1