Message ID | 20230109153348.5625-4-gregory.price@memverge.com (mailing list archive) |
---|---|
State | New |
Series | Checkpoint Support for Syscall User Dispatch |
On Mon, Jan 09, 2023 at 10:33:48AM -0500, Gregory Price wrote:
> This patch implements simple getter interface for syscall user dispatch
> configuration info.
>
> To support checkpoint/resume of a syscall user dispatch process,
> the prctl settings for syscall user dispatch must be fetchable.
> Presently, these settings are write-only, making it impossible to
> implement transparent checkpoint (coordination with the software is
> required).
>
> As Syscall User Dispatch is explicitly not for secure-container
> development, exposing the configuration state via prctl does not
> violate the original design intent.
>
> Signed-off-by: Gregory Price <gregory.price@memverge.com>
> ---
>  .../admin-guide/syscall-user-dispatch.rst     | 18 +++++++
>  include/linux/syscall_user_dispatch.h         |  7 +++
>  include/uapi/linux/prctl.h                    |  3 ++
>  kernel/entry/syscall_user_dispatch.c          | 14 +++++
>  kernel/sys.c                                  |  4 ++
>  .../syscall_user_dispatch/sud_test.c          | 54 +++++++++++++++++++
>  6 files changed, 100 insertions(+)
>
> diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
> index 60314953c728..8b2c8b6441b7 100644
> --- a/Documentation/admin-guide/syscall-user-dispatch.rst
> +++ b/Documentation/admin-guide/syscall-user-dispatch.rst
> @@ -45,6 +45,10 @@ only the syscall dispatcher address and the userspace key.
>  As the ABI of these intercepted syscalls is unknown to Linux, these
>  syscalls are not instrumentable via ptrace or the syscall tracepoints.
>
> +A getter interface is supplied for the purpose of userland
> +checkpoint/restore software being able to suspend and restore the
> +current state of the system.
> +
>  Interface
>  ---------
>
> @@ -73,6 +77,20 @@ thread-wide, without the need to invoke the kernel directly. selector
>  can be set to SYSCALL_DISPATCH_FILTER_ALLOW or SYSCALL_DISPATCH_FILTER_BLOCK.
>  Any other value should terminate the program with a SIGSYS.
>
> +
> +A thread can fetch the current Syscall User Dispatch configuration with the following prctl:
> +
> +  prctl(PR_GET_SYSCALL_USER_DISPATCH, <dispatch_config>))
> +
> +<dispatch_config> is a pointer to a ``struct syscall_user_dispatch`` as defined in ``linux/include/linux/syscall_user_dispatch.h``::

syscall_user_dispatch.h isn't a part of uapi, so I am not sure that it
is a good idea to use it here.

For criu, it is much more convenient to have a ptrace interface to get
this sort of parameter. prctl requires executing a system call from the
context of the target process. That is tricky, so we want to minimize the
number of such calls.

Thanks,
Andrei
On Thu, Jan 12, 2023 at 10:15:39AM -0800, Andrei Vagin wrote:
> On Mon, Jan 09, 2023 at 10:33:48AM -0500, Gregory Price wrote:
> > This patch implements simple getter interface for syscall user dispatch
> > configuration info.
> >
> > To support checkpoint/resume of a syscall user dispatch process,
> > the prctl settings for syscall user dispatch must be fetchable.
> > Presently, these settings are write-only, making it impossible to
> > implement transparent checkpoint (coordination with the software is
> > required).
> >
> > As Syscall User Dispatch is explicitly not for secure-container
> > development, exposing the configuration state via prctl does not
> > violate the original design intent.
> >
> > Signed-off-by: Gregory Price <gregory.price@memverge.com>
> > ---
> >  .../admin-guide/syscall-user-dispatch.rst     | 18 +++++++
> >  include/linux/syscall_user_dispatch.h         |  7 +++
> >  include/uapi/linux/prctl.h                    |  3 ++
> >  kernel/entry/syscall_user_dispatch.c          | 14 +++++
> >  kernel/sys.c                                  |  4 ++
> >  .../syscall_user_dispatch/sud_test.c          | 54 +++++++++++++++++++
> >  6 files changed, 100 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
> > index 60314953c728..8b2c8b6441b7 100644
> > --- a/Documentation/admin-guide/syscall-user-dispatch.rst
> > +++ b/Documentation/admin-guide/syscall-user-dispatch.rst
> > @@ -45,6 +45,10 @@ only the syscall dispatcher address and the userspace key.
> >  As the ABI of these intercepted syscalls is unknown to Linux, these
> >  syscalls are not instrumentable via ptrace or the syscall tracepoints.
> >
> > +A getter interface is supplied for the purpose of userland
> > +checkpoint/restore software being able to suspend and restore the
> > +current state of the system.
> > +
> >  Interface
> >  ---------
> >
> > @@ -73,6 +77,20 @@ thread-wide, without the need to invoke the kernel directly. selector
> >  can be set to SYSCALL_DISPATCH_FILTER_ALLOW or SYSCALL_DISPATCH_FILTER_BLOCK.
> >  Any other value should terminate the program with a SIGSYS.
> >
> > +
> > +A thread can fetch the current Syscall User Dispatch configuration with the following prctl:
> > +
> > +  prctl(PR_GET_SYSCALL_USER_DISPATCH, <dispatch_config>))
> > +
> > +<dispatch_config> is a pointer to a ``struct syscall_user_dispatch`` as defined in ``linux/include/linux/syscall_user_dispatch.h``::
>
> syscall_user_dispatch.h isn't a part of uapi, so I am not sure that it
> is a good idea to use it here.
>
> For criu, it is much more convenient to have a ptrace interface to get
> this sort of parameter. prctl requires executing a system call from the
> context of the target process. That is tricky, so we want to minimize the
> number of such calls.
>
> Thanks,
> Andrei

Thank you for the feedback. I think you're right. A ptrace interface for
this seems more in line with the seccomp filter export that CRIU uses too.
I'll look at implementing that instead.
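To illustrate the ptrace direction discussed above, here is a rough sketch of a tracer reading a stopped child's Syscall User Dispatch configuration from the outside, with no system call injected into the target. It is not part of this patch. The request number SUD_PTRACE_GET_CONFIG (0x4210), the struct sud_ptrace_config layout, and the convention of passing the struct size in the addr argument are all assumptions, modeled on the PTRACE_GET_SYSCALL_USER_DISPATCH_CONFIG request that later kernels are believed to expose; check your own uapi headers before relying on any of them. The helper name sud_peek_child_config() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* ASSUMPTION: request number of a ptrace-based SUD getter; not from this patch. */
#define SUD_PTRACE_GET_CONFIG 0x4210

/* ASSUMPTION: fixed-width layout the kernel fills in for the tracer. */
struct sud_ptrace_config {
	uint64_t mode;      /* dispatch on/off */
	uint64_t selector;  /* tracee address of the selector byte */
	uint64_t offset;    /* start of the always-allowed region */
	uint64_t len;       /* length of the always-allowed region */
};

/*
 * Hypothetical helper: fork a child, let it stop under ptrace, and read its
 * SUD configuration from the parent. Returns 0 on success or a negative
 * errno value (for example -EIO on kernels without the request).
 */
int sud_peek_child_config(struct sud_ptrace_config *cfg)
{
	int status;
	int rc = 0;
	pid_t pid;

	assert(cfg != NULL);
	memset(cfg, 0, sizeof(*cfg));

	pid = fork();
	if (pid == -1)
		return -errno;

	if (pid == 0) {
		/* Child: ask to be traced, then stop so the parent can inspect us. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
			_exit(1);
		raise(SIGSTOP);
		_exit(0);
	}

	if (waitpid(pid, &status, 0) == -1)
		return -errno;
	if (!WIFSTOPPED(status))
		return -ECHILD;	/* child exited before it could be traced */

	/* The struct size travels in addr, the destination buffer in data. */
	if (ptrace(SUD_PTRACE_GET_CONFIG, pid,
		   (void *)(uintptr_t)sizeof(*cfg), cfg) == -1)
		rc = -errno;

	/* Clean up the stopped child regardless of the outcome. */
	kill(pid, SIGKILL);
	waitpid(pid, &status, 0);
	return rc;
}
```

The point of the sketch is the calling context: everything runs in the tracer, which is what makes this shape attractive for CRIU compared with a prctl that must execute inside the target.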
On Mon, Jan 09, 2023 at 10:33:48AM -0500, Gregory Price wrote:
> This patch implements simple getter interface for syscall user dispatch
> configuration info.

s/This patch implements/Implement/

> +
> +A thread can fetch the current Syscall User Dispatch configuration with the following prctl:

This should end with a double colon (::) so that the code below becomes a
code block, consistent with the syscall_user_dispatch definition below.

> +
> +  prctl(PR_GET_SYSCALL_USER_DISPATCH, <dispatch_config>))
> +
> +<dispatch_config> is a pointer to a ``struct syscall_user_dispatch`` as defined in ``linux/include/linux/syscall_user_dispatch.h``::
> +
> +  struct syscall_user_dispatch {
> +    char __user *selector;
> +    unsigned long offset;
> +    unsigned long len;
> +    bool on_dispatch;
> +  };
> +

Thanks.
diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
index 60314953c728..8b2c8b6441b7 100644
--- a/Documentation/admin-guide/syscall-user-dispatch.rst
+++ b/Documentation/admin-guide/syscall-user-dispatch.rst
@@ -45,6 +45,10 @@ only the syscall dispatcher address and the userspace key.
 As the ABI of these intercepted syscalls is unknown to Linux, these
 syscalls are not instrumentable via ptrace or the syscall tracepoints.
 
+A getter interface is supplied for the purpose of userland
+checkpoint/restore software being able to suspend and restore the
+current state of the system.
+
 Interface
 ---------
 
@@ -73,6 +77,20 @@ thread-wide, without the need to invoke the kernel directly. selector
 can be set to SYSCALL_DISPATCH_FILTER_ALLOW or SYSCALL_DISPATCH_FILTER_BLOCK.
 Any other value should terminate the program with a SIGSYS.
 
+
+A thread can fetch the current Syscall User Dispatch configuration with the following prctl:
+
+  prctl(PR_GET_SYSCALL_USER_DISPATCH, <dispatch_config>))
+
+<dispatch_config> is a pointer to a ``struct syscall_user_dispatch`` as defined in ``linux/include/linux/syscall_user_dispatch.h``::
+
+  struct syscall_user_dispatch {
+    char __user *selector;
+    unsigned long offset;
+    unsigned long len;
+    bool on_dispatch;
+  };
+
 Security Notes
 --------------
 
diff --git a/include/linux/syscall_user_dispatch.h b/include/linux/syscall_user_dispatch.h
index a0ae443fb7df..aab25e5b6496 100644
--- a/include/linux/syscall_user_dispatch.h
+++ b/include/linux/syscall_user_dispatch.h
@@ -16,6 +16,7 @@ struct syscall_user_dispatch {
 	bool on_dispatch;
 };
 
+int get_syscall_user_dispatch(struct syscall_user_dispatch __user *usd);
 int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
 			      unsigned long len, char __user *selector);
 
@@ -25,6 +26,12 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
 #else
 struct syscall_user_dispatch {};
 
+static inline int get_syscall_user_dispatch(
+	struct syscall_user_dispatch __user *usd)
+{
+	return -EINVAL;
+}
+
 static inline int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
 					    unsigned long len, char __user *selector)
 {
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index a5e06dcbba13..221c0e369cc0 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -284,4 +284,7 @@ struct prctl_mm_map {
 #define PR_SET_VMA		0x53564d41
 # define PR_SET_VMA_ANON_NAME		0
 
+/* Get Syscall User Dispatch configuraiton settings */
+#define PR_GET_SYSCALL_USER_DISPATCH	65
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
index f097c06224c9..71441664571a 100644
--- a/kernel/entry/syscall_user_dispatch.c
+++ b/kernel/entry/syscall_user_dispatch.c
@@ -73,6 +73,20 @@ bool syscall_user_dispatch(struct pt_regs *regs)
 	return true;
 }
 
+int get_syscall_user_dispatch(struct syscall_user_dispatch __user *usd)
+{
+	struct syscall_user_dispatch *sd = &current->syscall_dispatch;
+
+	if (usd) {
+		if (copy_to_user(usd, sd, sizeof(*sd)))
+			return -EFAULT;
+	} else {
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
 			      unsigned long len, char __user *selector)
 {
diff --git a/kernel/sys.c b/kernel/sys.c
index 5fd54bf0e886..b762c49fc424 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2618,6 +2618,10 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		error = set_syscall_user_dispatch(arg2, arg3, arg4,
 						  (char __user *) arg5);
 		break;
+	case PR_GET_SYSCALL_USER_DISPATCH:
+		error = get_syscall_user_dispatch(
+			(struct syscall_user_dispatch __user *) arg2);
+		break;
 #ifdef CONFIG_SCHED_CORE
 	case PR_SCHED_CORE:
 		error = sched_core_share_pid(arg2, arg3, arg4, arg5);
diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
index b5d592d4099e..555912f3c192 100644
--- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c
+++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
@@ -35,6 +35,16 @@
 #define SYSCALL_DISPATCH_ON(x) ((x) = SYSCALL_DISPATCH_FILTER_BLOCK)
 #define SYSCALL_DISPATCH_OFF(x) ((x) = SYSCALL_DISPATCH_FILTER_ALLOW)
 
+#ifndef PR_GET_SYSCALL_USER_DISPATCH
+#define PR_GET_SYSCALL_USER_DISPATCH 65
+#endif
+struct syscall_user_dispatch {
+	char *selector;
+	unsigned long offset;
+	unsigned long len;
+	bool on_dispatch;
+};
+
 /* Test Summary:
  *
  * - dispatch_trigger_sigsys: Verify if PR_SET_SYSCALL_USER_DISPATCH is
@@ -309,4 +319,48 @@ TEST(direct_dispatch_range)
 	}
 }
 
+
+TEST(get_dispatch_settings)
+{
+	int ret = 0;
+	struct syscall_user_dispatch usd;
+
+	glob_sel = SYSCALL_DISPATCH_FILTER_ALLOW;
+
+	/* Check the negative paths - bad user pointer */
+	ret = prctl(PR_GET_SYSCALL_USER_DISPATCH, NULL);
+	ASSERT_EQ(-1, ret) {
+		TH_LOG("Kernel reported success to accessing a NULL pointer");
+	}
+	ASSERT_EQ(EINVAL, errno);
+
+	/* Get the settings prior to it being activated */
+	ret = prctl(PR_GET_SYSCALL_USER_DISPATCH, &usd);
+	ASSERT_EQ(0, ret) {
+		TH_LOG("Kernel failed to fetch syscall user dispatch settings");
+	}
+
+	/* Make sure selector is off prior to prctl. */
+	SYSCALL_DISPATCH_OFF(glob_sel);
+	ret = prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_ON, 0, 0L, &glob_sel);
+	ASSERT_EQ(0, ret) {
+		TH_LOG("Failed to get Syscall User Dispatch settings");
+	}
+
+	/* sanity check the settings */
+	ret = prctl(PR_GET_SYSCALL_USER_DISPATCH, &usd);
+	ASSERT_EQ(0, ret) {
+		TH_LOG("Failed to get Syscall User Dispatch settings");
+	}
+	ASSERT_EQ(&glob_sel, usd.selector) {
+		TH_LOG("Selector is an unexpected pointer");
+	}
+	ASSERT_EQ(0, usd.offset) {
+		TH_LOG("Offset is an unexpected value");
+	}
+	ASSERT_EQ(0, usd.len) {
+		TH_LOG("Length is an unexpected value");
+	}
+}
+
 TEST_HARNESS_MAIN
This patch implements simple getter interface for syscall user dispatch
configuration info.

To support checkpoint/resume of a syscall user dispatch process,
the prctl settings for syscall user dispatch must be fetchable.
Presently, these settings are write-only, making it impossible to
implement transparent checkpoint (coordination with the software is
required).

As Syscall User Dispatch is explicitly not for secure-container
development, exposing the configuration state via prctl does not
violate the original design intent.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
 .../admin-guide/syscall-user-dispatch.rst     | 18 +++++++
 include/linux/syscall_user_dispatch.h         |  7 +++
 include/uapi/linux/prctl.h                    |  3 ++
 kernel/entry/syscall_user_dispatch.c          | 14 +++++
 kernel/sys.c                                  |  4 ++
 .../syscall_user_dispatch/sud_test.c          | 54 +++++++++++++++++++
 6 files changed, 100 insertions(+)
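
To make the "write-only" point concrete, here is a minimal sketch of how a thread configures Syscall User Dispatch through the existing PR_SET_SYSCALL_USER_DISPATCH prctl. The helper names sud_enable_passthrough() and sud_disable() are hypothetical, and the fallback #defines assume the values found in include/uapi/linux/prctl.h. The selector is left at SYSCALL_DISPATCH_FILTER_ALLOW, so every syscall still goes straight to the kernel and the calls are safe to try.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values assumed from include/uapi/linux/prctl.h. */
#ifndef PR_SET_SYSCALL_USER_DISPATCH
#define PR_SET_SYSCALL_USER_DISPATCH 59
#endif
#ifndef PR_SYS_DISPATCH_OFF
#define PR_SYS_DISPATCH_OFF 0
#endif
#ifndef PR_SYS_DISPATCH_ON
#define PR_SYS_DISPATCH_ON 1
#endif
#ifndef SYSCALL_DISPATCH_FILTER_ALLOW
#define SYSCALL_DISPATCH_FILTER_ALLOW 0
#endif

/*
 * Hypothetical helper: turn dispatch on for the calling thread with an empty
 * always-allowed region (offset 0, len 0), as the selftest does. The selector
 * byte must stay valid for as long as dispatch is enabled.
 * Returns 0 on success or a negative errno value.
 */
int sud_enable_passthrough(char *selector)
{
	assert(selector != NULL);

	/* ALLOW means syscalls are not redirected to a SIGSYS handler. */
	*selector = SYSCALL_DISPATCH_FILTER_ALLOW;

	if (prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_ON,
		  0UL, 0UL, selector) == -1)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper: turn dispatch off again. The kernel expects the
 * remaining arguments to be zero in this mode.
 */
int sud_disable(void)
{
	if (prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_OFF,
		  0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}
```

After sud_enable_passthrough() returns, the selector address, offset, and length live only in the kernel's per-task state. Without a getter, a checkpointing tool can recover them only if the application cooperates, which is the gap described above.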