diff mbox series

[v3,5/7] arm64: syscall: Expand the comment about ptrace and syscall(-1)

Message ID 20200710130702.30658-6-will@kernel.org (mailing list archive)
State Mainlined
Commit 139dbe5d8ed383cbd1ada56c78dbbbd35bf6a9d3
Headers show
Series arm64: Fix single-step handling and syscall tracing | expand

Commit Message

Will Deacon July 10, 2020, 1:07 p.m. UTC
If a task executes syscall(-1), we intercept this early and force x0 to
be -ENOSYS so that we don't need to distinguish this scenario from one
where the scno is -1 because a tracer wants to skip the system call
using ptrace. With the return value set, the return path is the same as
the skip case.

Although there is a one-line comment noting this in el0_svc_common(), it
misses out most of the detail. Expand the comment to describe a bit more
about what is going on.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Keno Fischer <keno@juliacomputing.com>
Cc: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/syscall.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)
diff mbox series

Patch

diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 98a26d4e7b0c..5f0c04863d2c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -124,7 +124,21 @@  static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
 	user_exit();
 
 	if (has_syscall_work(flags)) {
-		/* set default errno for user-issued syscall(-1) */
+		/*
+		 * The de-facto standard way to skip a system call using ptrace
+		 * is to set the system call to -1 (NO_SYSCALL) and set x0 to a
+		 * suitable error code for consumption by userspace. However,
+		 * this cannot be distinguished from a user-issued syscall(-1)
+		 * and so we must set x0 to -ENOSYS here in case the tracer doesn't
+		 * issue the skip and we fall into trace_exit with x0 preserved.
+		 *
+		 * This is slightly odd because it also means that if a tracer
+		 * sets the system call number to -1 but does not initialise x0,
+		 * then x0 will be preserved for all system calls apart from a
+		 * user-issued syscall(-1). However, requesting a skip and not
+		 * setting the return value is unlikely to do anything sensible
+		 * anyway.
+		 */
 		if (scno == NO_SYSCALL)
 			regs->regs[0] = -ENOSYS;
 		scno = syscall_trace_enter(regs);