[PATCH] Respect system call number changes by sys_enter probes

From: André Rösti
Date: Sat Mar 09 2024 - 00:57:34 EST


When a probe is registered at the `trace_sys_enter` tracepoint and
that probe changes the system call number, the old system call still
gets executed on x86_64 (and potentially other architectures). This
is inconsistent with how ARM64 (and potentially other architectures)
handles this case, and inconsistent with the tracepoint semantics
prior to commit b6ec41346103 ("core/entry: Report syscall correctly
for trace and audit").

With this patch, the semantics are restored to what they were before
commit b6ec41346103 (and thus made consistent with ARM64). The patch
adds one line that re-reads the system call number register into the
`syscall` variable after the tracepoint has run. Because the number
is read both before and after the tracepoint, the benefits of commit
b6ec41346103 are kept.

There should be no performance impact if no sys_enter tracepoints are
registered, since the system call number is re-read from `regs` only
if the tracepoint is in use. If a probe is registered, the
performance impact should still be minimal, since the additional call
to `syscall_get_nr` amounts to only an inlined read of
`regs->orig_ax` (on x86_64).

Signed-off-by: André Rösti <an.roesti@xxxxxxxxx>
---
@Thomas Gleixner: You may have received this e-mail twice. My apologies!
This is my first attempt to contribute, and I made a mistake using git
send-email. Thanks for your work maintaining this and sorry again.
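
For illustration only (not part of the patch): the control flow
described in the commit message can be modelled in plain userspace C.
Everything prefixed `model_` below is a hypothetical stand-in, not
kernel code; `struct model_regs` mimics only the `orig_ax` field of
`struct pt_regs`, and the `reread` flag switches between the
behaviour before and after this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pt_regs: only the field the model needs. */
struct model_regs {
	long orig_ax;	/* holds the system call number on x86_64 */
};

/* Models syscall_get_nr(): a plain read of regs->orig_ax. */
static inline long model_syscall_get_nr(const struct model_regs *regs)
{
	return regs->orig_ax;
}

/* Number the hypothetical probe below writes into the register. */
static long model_probe_new_nr;

/* Hypothetical sys_enter probe that rewrites the system call number. */
static void model_probe(struct model_regs *regs, long syscall)
{
	(void)syscall;	/* the probe ignores the number it was handed */
	regs->orig_ax = model_probe_new_nr;
}

/*
 * Models the tail of syscall_trace_enter() and returns the number that
 * would be dispatched. With reread == false the value read before the
 * tracepoint is used; with reread == true it is read again afterwards.
 */
static long model_trace_enter(struct model_regs *regs, bool tracepoint_on,
			      bool reread)
{
	long syscall = model_syscall_get_nr(regs);

	if (tracepoint_on) {
		model_probe(regs, syscall);
		if (reread)
			syscall = model_syscall_get_nr(regs);
	}

	return syscall;
}

/* Small self-check that walks through the three cases of interest. */
static void model_selftest(void)
{
	struct model_regs regs = { .orig_ax = 1 };

	model_probe_new_nr = 39;

	/* Without the re-read, the probe's change is ignored. */
	assert(model_trace_enter(&regs, true, false) == 1);

	/* With the re-read, the probe's change takes effect. */
	regs.orig_ax = 1;
	assert(model_trace_enter(&regs, true, true) == 39);

	/* With no tracepoint in use, nothing is re-read or changed. */
	regs.orig_ax = 1;
	assert(model_trace_enter(&regs, false, true) == 1);
}
```

Calling `model_selftest()` from a `main()` exercises all three cases.
The numbers 1 and 39 are arbitrary values chosen for the model.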
---
kernel/entry/common.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 88cb3c88aaa5..89b14ba9ed14 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -57,8 +57,11 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
 	/* Either of the above might have changed the syscall number */
 	syscall = syscall_get_nr(current, regs);

-	if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT))
+	if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
 		trace_sys_enter(regs, syscall);
+		/* Tracers may have changed system call number as well */
+		syscall = syscall_get_nr(current, regs);
+	}

 	syscall_enter_audit(regs, syscall);


base-commit: 221a164035fd8b554a44bd7c4bf8e7715a497561
--
2.34.1