Message ID | 3a1c8280967b491bf6917a18fbff6c9b52e8df24.1641398395.git.fweimer@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v3,1/3] x86: Implement arch_prctl(ARCH_VSYSCALL_CONTROL) to disable vsyscall | expand |
* Florian Weimer: > Distributions struggle with changing the default for vsyscall > emulation because it is a clear break of userspace ABI, something > that should not happen. > > The legacy vsyscall interface is supposed to be used by libcs only, > not by applications. This commit adds a new arch_prctl request, > ARCH_VSYSCALL_CONTROL, with one argument. If the argument is 0, > executing vsyscalls will cause the process to terminate. Argument 1 > turns vsyscall back on (this is mostly for a largely theoretical > CRIU use case). > > Newer libcs can use a zero ARCH_VSYSCALL_CONTROL at startup to disable > vsyscall for the process. Legacy libcs do not perform this call, so > vsyscall remains enabled for them. This approach should achieves > backwards compatibility (perfect compatibility if the assumption that > only libcs use vsyscall is accurate), and it provides full hardening > for new binaries. > > The chosen value of ARCH_VSYSCALL_CONTROL should avoid conflicts > with other x86-64 arch_prctl requests. The fact that with > vsyscall=emulate, reading the vsyscall region is still possible > even after a zero ARCH_VSYSCALL_CONTROL is considered limitation > in the current implementation and may change in a future kernel > version. > > Future arch_prctls requests commonly used at process startup can imply > ARCH_VSYSCALL_CONTROL with a zero argument, so that a separate system > call for disabling vsyscall is avoided. > > Signed-off-by: Florian Weimer <fweimer@redhat.com> > Acked-by: Andrei Vagin <avagin@gmail.com> > --- > v3: Remove warning log message. Split out test. > v2: ARCH_VSYSCALL_CONTROL instead of ARCH_VSYSCALL_LOCKOUT. New tests > for the toggle behavior. Implement hiding [vsyscall] in > /proc/PID/maps and test it. Various other test fixes cleanups > (e.g., fixed missing second argument to gettimeofday). > > arch/x86/entry/vsyscall/vsyscall_64.c | 7 ++++++- > arch/x86/include/asm/mmu.h | 6 ++++++ > arch/x86/include/uapi/asm/prctl.h | 2 ++ > arch/x86/kernel/process_64.c | 7 +++++++ > 4 files changed, 21 insertions(+), 1 deletion(-) Hello, sorry to bother you again. What can I do to move this forward? Thanks, Florian
On 1/5/22 08:02, Florian Weimer wrote: > Distributions struggle with changing the default for vsyscall > emulation because it is a clear break of userspace ABI, something > that should not happen. > > The legacy vsyscall interface is supposed to be used by libcs only, > not by applications. This commit adds a new arch_prctl request, > ARCH_VSYSCALL_CONTROL, with one argument. If the argument is 0, > executing vsyscalls will cause the process to terminate. Argument 1 > turns vsyscall back on (this is mostly for a largely theoretical > CRIU use case). > > Newer libcs can use a zero ARCH_VSYSCALL_CONTROL at startup to disable > vsyscall for the process. Legacy libcs do not perform this call, so > vsyscall remains enabled for them. This approach should achieves > backwards compatibility (perfect compatibility if the assumption that > only libcs use vsyscall is accurate), and it provides full hardening > for new binaries. > > The chosen value of ARCH_VSYSCALL_CONTROL should avoid conflicts > with other x86-64 arch_prctl requests. The fact that with > vsyscall=emulate, reading the vsyscall region is still possible > even after a zero ARCH_VSYSCALL_CONTROL is considered limitation > in the current implementation and may change in a future kernel > version. > > Future arch_prctls requests commonly used at process startup can imply > ARCH_VSYSCALL_CONTROL with a zero argument, so that a separate system > call for disabling vsyscall is avoided. > > Signed-off-by: Florian Weimer <fweimer@redhat.com> > Acked-by: Andrei Vagin <avagin@gmail.com> > --- > v3: Remove warning log message. Split out test. > v2: ARCH_VSYSCALL_CONTROL instead of ARCH_VSYSCALL_LOCKOUT. New tests > for the toggle behavior. Implement hiding [vsyscall] in > /proc/PID/maps and test it. Various other test fixes cleanups > (e.g., fixed missing second argument to gettimeofday). > > arch/x86/entry/vsyscall/vsyscall_64.c | 7 ++++++- > arch/x86/include/asm/mmu.h | 6 ++++++ > arch/x86/include/uapi/asm/prctl.h | 2 ++ > arch/x86/kernel/process_64.c | 7 +++++++ > 4 files changed, 21 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c > index fd2ee9408e91..6fc524b9f232 100644 > --- a/arch/x86/entry/vsyscall/vsyscall_64.c > +++ b/arch/x86/entry/vsyscall/vsyscall_64.c > @@ -174,6 +174,9 @@ bool emulate_vsyscall(unsigned long error_code, > > tsk = current; > > + if (tsk->mm->context.vsyscall_disabled) > + goto sigsegv; > + Is there a reason you didn't just change the check earlier in the function to: if (vsyscall_mode == NONE || current->mm->context.vsyscall_disabled) Also, I still think the prctl should not be available if vsyscall=emulate. Either we should fully implement it or we should not implement. We could even do: pr_warn_once("userspace vsyscall hardening request ignored because you have vsyscall=emulate. Unless you absolutely need vsyscall=emulate, update your system to use vsyscall=xonly.\n"); and thus encourage good behavior. --Andy
* Andy Lutomirski: > Is there a reason you didn't just change the check earlier in the > function to: > > if (vsyscall_mode == NONE || current->mm->context.vsyscall_disabled) Andrei requested that I don't print anything if vsyscall was disabled. The original patch used a different message for better diagnostics. > Also, I still think the prctl should not be available if > vsyscall=emulate. Either we should fully implement it or we should > not implement. We could even do: > > pr_warn_once("userspace vsyscall hardening request ignored because you > have vsyscall=emulate. Unless you absolutely need vsyscall=emulate, > update your system to use vsyscall=xonly.\n"); > > and thus encourage good behavior. I think there is still some hardening applied even with vsyscall=emulate. The question is what is more important: the additional hardening, or clean, easily described behavior of the interface. Maybe ARCH_VSYSCALL_CONTROL could return different values based on to what degree it could disable vsyscall? The pr_warn_once does not seem particularly useful. Anyone who upgrades glibc and still uses vsyscall=emulate will see that, with no way to disable it. Thanks, Florian
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c index fd2ee9408e91..6fc524b9f232 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -174,6 +174,9 @@ bool emulate_vsyscall(unsigned long error_code, tsk = current; + if (tsk->mm->context.vsyscall_disabled) + goto sigsegv; + /* * Check for access_ok violations and find the syscall nr. * @@ -316,8 +319,10 @@ static struct vm_area_struct gate_vma __ro_after_init = { struct vm_area_struct *get_gate_vma(struct mm_struct *mm) { + if (!mm || mm->context.vsyscall_disabled) + return NULL; #ifdef CONFIG_COMPAT - if (!mm || !(mm->context.flags & MM_CONTEXT_HAS_VSYSCALL)) + if (!(mm->context.flags & MM_CONTEXT_HAS_VSYSCALL)) return NULL; #endif if (vsyscall_mode == NONE) diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 5d7494631ea9..3934d6907910 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -41,6 +41,12 @@ typedef struct { #ifdef CONFIG_X86_64 unsigned short flags; #endif +#ifdef CONFIG_X86_VSYSCALL_EMULATION + /* + * Changed by arch_prctl(ARCH_VSYSCALL_CONTROL). + */ + bool vsyscall_disabled; +#endif struct mutex lock; void __user *vdso; /* vdso base address */ diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h index 754a07856817..aad0bcfbf49f 100644 --- a/arch/x86/include/uapi/asm/prctl.h +++ b/arch/x86/include/uapi/asm/prctl.h @@ -18,4 +18,6 @@ #define ARCH_MAP_VDSO_32 0x2002 #define ARCH_MAP_VDSO_64 0x2003 +#define ARCH_VSYSCALL_CONTROL 0x5001 + #endif /* _ASM_X86_PRCTL_H */ diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 3402edec236c..834bad068211 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -816,6 +816,13 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2) ret = put_user(base, (unsigned long __user *)arg2); break; } +#ifdef CONFIG_X86_VSYSCALL_EMULATION + case ARCH_VSYSCALL_CONTROL: + if (unlikely(arg2 > 1)) + return -EINVAL; + current->mm->context.vsyscall_disabled = !arg2; + break; +#endif #ifdef CONFIG_CHECKPOINT_RESTORE # ifdef CONFIG_X86_X32_ABI