Message ID | 20210205163752.11932-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | kernel: Expose SYS_kcmp by default | expand |
Am Freitag, dem 05.02.2021 um 16:37 +0000 schrieb Chris Wilson: > Userspace has discovered the functionality offered by SYS_kcmp and has > started to depend upon it. In particular, Mesa uses SYS_kcmp for > os_same_file_description() in order to identify when two fd (e.g. device > or dmabuf) point to the same struct file. Since they depend on it for > core functionality, lift SYS_kcmp out of the non-default > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Kees Cook <keescook@chromium.org> > Cc: Andy Lutomirski <luto@amacapital.net> > Cc: Will Drewry <wad@chromium.org> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Dave Airlie <airlied@gmail.com> > Cc: Daniel Vetter <daniel@ffwll.ch> > Cc: Lucas Stach <l.stach@pengutronix.de> > --- > init/Kconfig | 11 +++++++++++ > kernel/Makefile | 2 +- > tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- > 3 files changed, 13 insertions(+), 2 deletions(-) > > diff --git a/init/Kconfig b/init/Kconfig > index b77c60f8b963..f62fca13ac5b 100644 > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -1194,6 +1194,7 @@ endif # NAMESPACES > config CHECKPOINT_RESTORE > bool "Checkpoint/restore support" > select PROC_CHILDREN > + select KCMP > default n > help > Enables additional kernel features in a sake of checkpoint/restore. > @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS > config ARCH_HAS_MEMBARRIER_SYNC_CORE > bool > > > > > +config KCMP > + bool "Enable kcmp() system call" if EXPERT > + default y > + help > + Enable the file descriptor comparison system call. It provides > + user-space with the ability to compare two fd to see if they > + point to the same file, and check other attributes. This description undersells the abilities of kcmp, while fd compare is the only thing used by the graphics stack, kcmp can compare a handful of other system resources, see man 2 kcmp. I think the helptext should at least try to cover this fact somewhat. Regards, Lucas > + > + If unsure, say Y. > + > config RSEQ > bool "Enable rseq() system call" if EXPERT > default y > diff --git a/kernel/Makefile b/kernel/Makefile > index aa7368c7eabf..320f1f3941b7 100644 > --- a/kernel/Makefile > +++ b/kernel/Makefile > @@ -51,7 +51,7 @@ obj-y += livepatch/ > obj-y += dma/ > obj-y += entry/ > > > > > -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o > +obj-$(CONFIG_KCMP) += kcmp.o > obj-$(CONFIG_FREEZER) += freezer.o > obj-$(CONFIG_PROFILING) += profile.o > obj-$(CONFIG_STACKTRACE) += stacktrace.o > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c > index 26c72f2b61b1..1b6c7d33c4ff 100644 > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c > @@ -315,7 +315,7 @@ TEST(kcmp) > ret = __filecmp(getpid(), getpid(), 1, 1); > EXPECT_EQ(ret, 0); > if (ret != 0 && errno == ENOSYS) > - SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)"); > + SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)"); > } > > > > > TEST(mode_strict_support)
On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote: > Userspace has discovered the functionality offered by SYS_kcmp and has > started to depend upon it. In particular, Mesa uses SYS_kcmp for > os_same_file_description() in order to identify when two fd (e.g. device > or dmabuf) point to the same struct file. Since they depend on it for > core functionality, lift SYS_kcmp out of the non-default > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Kees Cook <keescook@chromium.org> > Cc: Andy Lutomirski <luto@amacapital.net> > Cc: Will Drewry <wad@chromium.org> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Dave Airlie <airlied@gmail.com> > Cc: Daniel Vetter <daniel@ffwll.ch> > Cc: Lucas Stach <l.stach@pengutronix.de> > --- > init/Kconfig | 11 +++++++++++ > kernel/Makefile | 2 +- > tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- > 3 files changed, 13 insertions(+), 2 deletions(-) > > diff --git a/init/Kconfig b/init/Kconfig > index b77c60f8b963..f62fca13ac5b 100644 > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -1194,6 +1194,7 @@ endif # NAMESPACES > config CHECKPOINT_RESTORE > bool "Checkpoint/restore support" > select PROC_CHILDREN > + select KCMP > default n > help > Enables additional kernel features in a sake of checkpoint/restore. > @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS > config ARCH_HAS_MEMBARRIER_SYNC_CORE > bool > > +config KCMP > + bool "Enable kcmp() system call" if EXPERT > + default y I would expect this to be not default-y, especially if CHECKPOINT_RESTORE does a "select" on it. This is a really powerful syscall, but it is bounded by ptrace access controls, and uses pointer address obfuscation, so it may be okay to expose this. As it is, at least Ubuntu already has CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much difference on exposure. So, if you drop the "default y", I'm fine with this. -Kees > + help > + Enable the file descriptor comparison system call. It provides > + user-space with the ability to compare two fd to see if they > + point to the same file, and check other attributes. > + > + If unsure, say Y. > + > config RSEQ > bool "Enable rseq() system call" if EXPERT > default y > diff --git a/kernel/Makefile b/kernel/Makefile > index aa7368c7eabf..320f1f3941b7 100644 > --- a/kernel/Makefile > +++ b/kernel/Makefile > @@ -51,7 +51,7 @@ obj-y += livepatch/ > obj-y += dma/ > obj-y += entry/ > > -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o > +obj-$(CONFIG_KCMP) += kcmp.o > obj-$(CONFIG_FREEZER) += freezer.o > obj-$(CONFIG_PROFILING) += profile.o > obj-$(CONFIG_STACKTRACE) += stacktrace.o > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c > index 26c72f2b61b1..1b6c7d33c4ff 100644 > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c > @@ -315,7 +315,7 @@ TEST(kcmp) > ret = __filecmp(getpid(), getpid(), 1, 1); > EXPECT_EQ(ret, 0); > if (ret != 0 && errno == ENOSYS) > - SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)"); > + SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)"); > } > > TEST(mode_strict_support) > -- > 2.20.1 >
Hi Chris, Thank you for the patch! Yet something to improve: [auto build test ERROR on linux/master] [also build test ERROR on kees/for-next/seccomp kees/for-next/pstore linus/master v5.11-rc6 next-20210125] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Chris-Wilson/kernel-Expose-SYS_kcmp-by-default/20210206-004006 base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2 config: powerpc-randconfig-r023-20210205 (attached as .config) compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install powerpc cross compiling tool for clang build # apt-get install binutils-powerpc-linux-gnu # https://github.com/0day-ci/linux/commit/f7694e48ae81aac5a226e74421dbda1dcdc3ca92 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Chris-Wilson/kernel-Expose-SYS_kcmp-by-default/20210206-004006 git checkout f7694e48ae81aac5a226e74421dbda1dcdc3ca92 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): In file included from arch/powerpc/include/asm/hardirq.h:6: In file included from include/linux/irq.h:20: In file included from include/linux/io.h:13: In file included from arch/powerpc/include/asm/io.h:619: arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET' __do_##name al; \ ^~~~~~~~~~~~~~ <scratch space>:190:1: note: expanded from here __do_insw ^ arch/powerpc/include/asm/io.h:557:56: note: expanded from macro '__do_insw' #define __do_insw(p, b, n) readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n)) ~~~~~~~~~~~~~~~~~~~~~^ In file included from kernel/kcmp.c:3: In file included from include/linux/syscalls.h:84: In file included from include/trace/syscall.h:7: In file included from include/linux/trace_events.h:9: In file included from include/linux/hardirq.h:10: In file included from arch/powerpc/include/asm/hardirq.h:6: In file included from include/linux/irq.h:20: In file included from include/linux/io.h:13: In file included from arch/powerpc/include/asm/io.h:619: arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET' __do_##name al; \ ^~~~~~~~~~~~~~ <scratch space>:192:1: note: expanded from here __do_insl ^ arch/powerpc/include/asm/io.h:558:56: note: expanded from macro '__do_insl' #define __do_insl(p, b, n) readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n)) ~~~~~~~~~~~~~~~~~~~~~^ In file included from kernel/kcmp.c:3: In file included from include/linux/syscalls.h:84: In file included from include/trace/syscall.h:7: In file included from include/linux/trace_events.h:9: In file included from include/linux/hardirq.h:10: In file included from arch/powerpc/include/asm/hardirq.h:6: In file included from include/linux/irq.h:20: In file included from include/linux/io.h:13: In file included from arch/powerpc/include/asm/io.h:619: arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET' __do_##name al; \ ^~~~~~~~~~~~~~ <scratch space>:194:1: note: expanded from here __do_outsb ^ arch/powerpc/include/asm/io.h:559:58: note: expanded from macro '__do_outsb' #define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n)) ~~~~~~~~~~~~~~~~~~~~~^ In file included from kernel/kcmp.c:3: In file included from include/linux/syscalls.h:84: In file included from include/trace/syscall.h:7: In file included from include/linux/trace_events.h:9: In file included from include/linux/hardirq.h:10: In file included from arch/powerpc/include/asm/hardirq.h:6: In file included from include/linux/irq.h:20: In file included from include/linux/io.h:13: In file included from arch/powerpc/include/asm/io.h:619: arch/powerpc/include/asm/io-defs.h:51:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] DEF_PCI_AC_NORET(outsw, (unsigned long p, const void *b, unsigned long c), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET' __do_##name al; \ ^~~~~~~~~~~~~~ <scratch space>:196:1: note: expanded from here __do_outsw ^ arch/powerpc/include/asm/io.h:560:58: note: expanded from macro '__do_outsw' #define __do_outsw(p, b, n) writesw((PCI_IO_ADDR)_IO_BASE+(p),(b),(n)) ~~~~~~~~~~~~~~~~~~~~~^ In file included from kernel/kcmp.c:3: In file included from include/linux/syscalls.h:84: In file included from include/trace/syscall.h:7: In file included from include/linux/trace_events.h:9: In file included from include/linux/hardirq.h:10: In file included from arch/powerpc/include/asm/hardirq.h:6: In file included from include/linux/irq.h:20: In file included from include/linux/io.h:13: In file included from arch/powerpc/include/asm/io.h:619: arch/powerpc/include/asm/io-defs.h:53:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] DEF_PCI_AC_NORET(outsl, (unsigned long p, const void *b, unsigned long c), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET' __do_##name al; \ ^~~~~~~~~~~~~~ <scratch space>:198:1: note: expanded from here __do_outsl ^ arch/powerpc/include/asm/io.h:561:58: note: expanded from macro '__do_outsl' #define __do_outsl(p, b, n) writesl((PCI_IO_ADDR)_IO_BASE+(p),(b),(n)) ~~~~~~~~~~~~~~~~~~~~~^ >> kernel/kcmp.c:117:13: error: implicit declaration of function 'get_epoll_tfile_raw_ptr' [-Werror,-Wimplicit-function-declaration] filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); ^ kernel/kcmp.c:117:13: note: did you mean 'get_file_raw_ptr'? kernel/kcmp.c:62:1: note: 'get_file_raw_ptr' declared here get_file_raw_ptr(struct task_struct *task, unsigned int idx) ^ >> kernel/kcmp.c:117:11: warning: incompatible integer to pointer conversion assigning to 'struct file *' from 'int' [-Wint-conversion] filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 13 warnings and 1 error generated. vim +/get_epoll_tfile_raw_ptr +117 kernel/kcmp.c d97b46a64674a2 Cyrill Gorcunov 2012-05-31 96 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 97 #ifdef CONFIG_EPOLL 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 98 static int kcmp_epoll_target(struct task_struct *task1, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 99 struct task_struct *task2, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 100 unsigned long idx1, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 101 struct kcmp_epoll_slot __user *uslot) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 102 { 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 103 struct file *filp, *filp_epoll, *filp_tgt; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 104 struct kcmp_epoll_slot slot; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 105 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 106 if (copy_from_user(&slot, uslot, sizeof(slot))) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 107 return -EFAULT; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 108 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 109 filp = get_file_raw_ptr(task1, idx1); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 110 if (!filp) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 111 return -EBADF; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 112 f43c283a89a7dc Eric W. Biederman 2020-11-20 113 filp_epoll = fget_task(task2, slot.efd); f43c283a89a7dc Eric W. Biederman 2020-11-20 114 if (!filp_epoll) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 115 return -EBADF; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 116 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 @117 filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 118 fput(filp_epoll); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 119 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 120 if (IS_ERR(filp_tgt)) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 121 return PTR_ERR(filp_tgt); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 122 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 123 return kcmp_ptr(filp, filp_tgt, KCMP_FILE); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 124 } 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 125 #else 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 126 static int kcmp_epoll_target(struct task_struct *task1, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 127 struct task_struct *task2, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 128 unsigned long idx1, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 129 struct kcmp_epoll_slot __user *uslot) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 130 { 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 131 return -EOPNOTSUPP; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 132 } 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 133 #endif 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 134 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Chris, Thank you for the patch! Yet something to improve: [auto build test ERROR on linux/master] [also build test ERROR on kees/for-next/seccomp kees/for-next/pstore linus/master v5.11-rc6 next-20210125] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Chris-Wilson/kernel-Expose-SYS_kcmp-by-default/20210206-004006 base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2ab38c17aac10bf55ab3efde4c4db3893d8691d2 config: i386-randconfig-s002-20210205 (attached as .config) compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 reproduce: # apt-get install sparse # sparse version: v0.6.3-215-g0fb77bb6-dirty # https://github.com/0day-ci/linux/commit/f7694e48ae81aac5a226e74421dbda1dcdc3ca92 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Chris-Wilson/kernel-Expose-SYS_kcmp-by-default/20210206-004006 git checkout f7694e48ae81aac5a226e74421dbda1dcdc3ca92 # save the attached .config to linux build tree make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): kernel/kcmp.c: In function 'kcmp_epoll_target': >> kernel/kcmp.c:117:13: error: implicit declaration of function 'get_epoll_tfile_raw_ptr'; did you mean 'get_file_raw_ptr'? [-Werror=implicit-function-declaration] 117 | filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); | ^~~~~~~~~~~~~~~~~~~~~~~ | get_file_raw_ptr >> kernel/kcmp.c:117:11: warning: assignment to 'struct file *' from 'int' makes pointer from integer without a cast [-Wint-conversion] 117 | filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); | ^ cc1: some warnings being treated as errors vim +117 kernel/kcmp.c d97b46a64674a2 Cyrill Gorcunov 2012-05-31 96 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 97 #ifdef CONFIG_EPOLL 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 98 static int kcmp_epoll_target(struct task_struct *task1, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 99 struct task_struct *task2, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 100 unsigned long idx1, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 101 struct kcmp_epoll_slot __user *uslot) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 102 { 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 103 struct file *filp, *filp_epoll, *filp_tgt; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 104 struct kcmp_epoll_slot slot; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 105 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 106 if (copy_from_user(&slot, uslot, sizeof(slot))) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 107 return -EFAULT; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 108 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 109 filp = get_file_raw_ptr(task1, idx1); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 110 if (!filp) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 111 return -EBADF; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 112 f43c283a89a7dc Eric W. Biederman 2020-11-20 113 filp_epoll = fget_task(task2, slot.efd); f43c283a89a7dc Eric W. Biederman 2020-11-20 114 if (!filp_epoll) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 115 return -EBADF; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 116 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 @117 filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 118 fput(filp_epoll); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 119 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 120 if (IS_ERR(filp_tgt)) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 121 return PTR_ERR(filp_tgt); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 122 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 123 return kcmp_ptr(filp, filp_tgt, KCMP_FILE); 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 124 } 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 125 #else 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 126 static int kcmp_epoll_target(struct task_struct *task1, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 127 struct task_struct *task2, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 128 unsigned long idx1, 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 129 struct kcmp_epoll_slot __user *uslot) 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 130 { 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 131 return -EOPNOTSUPP; 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 132 } 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 133 #endif 0791e3644e5ef2 Cyrill Gorcunov 2017-07-12 134 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote: > > On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote: > > Userspace has discovered the functionality offered by SYS_kcmp and has > > started to depend upon it. In particular, Mesa uses SYS_kcmp for > > os_same_file_description() in order to identify when two fd (e.g. device > > or dmabuf) point to the same struct file. Since they depend on it for > > core functionality, lift SYS_kcmp out of the non-default > > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Kees Cook <keescook@chromium.org> > > Cc: Andy Lutomirski <luto@amacapital.net> > > Cc: Will Drewry <wad@chromium.org> > > Cc: Andrew Morton <akpm@linux-foundation.org> > > Cc: Dave Airlie <airlied@gmail.com> > > Cc: Daniel Vetter <daniel@ffwll.ch> > > Cc: Lucas Stach <l.stach@pengutronix.de> > > --- > > init/Kconfig | 11 +++++++++++ > > kernel/Makefile | 2 +- > > tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- > > 3 files changed, 13 insertions(+), 2 deletions(-) > > > > diff --git a/init/Kconfig b/init/Kconfig > > index b77c60f8b963..f62fca13ac5b 100644 > > --- a/init/Kconfig > > +++ b/init/Kconfig > > @@ -1194,6 +1194,7 @@ endif # NAMESPACES > > config CHECKPOINT_RESTORE > > bool "Checkpoint/restore support" > > select PROC_CHILDREN > > + select KCMP > > default n > > help > > Enables additional kernel features in a sake of checkpoint/restore. > > @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS > > config ARCH_HAS_MEMBARRIER_SYNC_CORE > > bool > > > > +config KCMP > > + bool "Enable kcmp() system call" if EXPERT > > + default y > > I would expect this to be not default-y, especially if > CHECKPOINT_RESTORE does a "select" on it. > > This is a really powerful syscall, but it is bounded by ptrace access > controls, and uses pointer address obfuscation, so it may be okay to > expose this. As it is, at least Ubuntu already has > CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much > difference on exposure. > > So, if you drop the "default y", I'm fine with this. It was maybe stupid, but our userspace started relying on fd comaprison through sys_kcomp. So for better or worse, if you want to run the mesa3d gl/vk stacks, you need this. Was maybe not the brighest ideas, but since enough distros had this enabled by defaults, it wasn't really discovered, and now we're shipping this everywhere. Ofc we can leave the default n, but the select if CONFIG_DRM is unfortunately needed I think. For that part: Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Also adding Dave Airlie for his take. -Daniel > > -Kees > > > + help > > + Enable the file descriptor comparison system call. It provides > > + user-space with the ability to compare two fd to see if they > > + point to the same file, and check other attributes. > > + > > + If unsure, say Y. > > + > > config RSEQ > > bool "Enable rseq() system call" if EXPERT > > default y > > diff --git a/kernel/Makefile b/kernel/Makefile > > index aa7368c7eabf..320f1f3941b7 100644 > > --- a/kernel/Makefile > > +++ b/kernel/Makefile > > @@ -51,7 +51,7 @@ obj-y += livepatch/ > > obj-y += dma/ > > obj-y += entry/ > > > > -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o > > +obj-$(CONFIG_KCMP) += kcmp.o > > obj-$(CONFIG_FREEZER) += freezer.o > > obj-$(CONFIG_PROFILING) += profile.o > > obj-$(CONFIG_STACKTRACE) += stacktrace.o > > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c > > index 26c72f2b61b1..1b6c7d33c4ff 100644 > > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c > > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c > > @@ -315,7 +315,7 @@ TEST(kcmp) > > ret = __filecmp(getpid(), getpid(), 1, 1); > > EXPECT_EQ(ret, 0); > > if (ret != 0 && errno == ENOSYS) > > - SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)"); > > + SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)"); > > } > > > > TEST(mode_strict_support) > > -- > > 2.20.1 > > > > -- > Kees Cook
On 2021-02-05 9:53 p.m., Daniel Vetter wrote: > On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote: >> >> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote: >>> Userspace has discovered the functionality offered by SYS_kcmp and has >>> started to depend upon it. In particular, Mesa uses SYS_kcmp for >>> os_same_file_description() in order to identify when two fd (e.g. device >>> or dmabuf) point to the same struct file. Since they depend on it for >>> core functionality, lift SYS_kcmp out of the non-default >>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. >>> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >>> Cc: Kees Cook <keescook@chromium.org> >>> Cc: Andy Lutomirski <luto@amacapital.net> >>> Cc: Will Drewry <wad@chromium.org> >>> Cc: Andrew Morton <akpm@linux-foundation.org> >>> Cc: Dave Airlie <airlied@gmail.com> >>> Cc: Daniel Vetter <daniel@ffwll.ch> >>> Cc: Lucas Stach <l.stach@pengutronix.de> >>> --- >>> init/Kconfig | 11 +++++++++++ >>> kernel/Makefile | 2 +- >>> tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- >>> 3 files changed, 13 insertions(+), 2 deletions(-) >>> >>> diff --git a/init/Kconfig b/init/Kconfig >>> index b77c60f8b963..f62fca13ac5b 100644 >>> --- a/init/Kconfig >>> +++ b/init/Kconfig >>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES >>> config CHECKPOINT_RESTORE >>> bool "Checkpoint/restore support" >>> select PROC_CHILDREN >>> + select KCMP >>> default n >>> help >>> Enables additional kernel features in a sake of checkpoint/restore. >>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS >>> config ARCH_HAS_MEMBARRIER_SYNC_CORE >>> bool >>> >>> +config KCMP >>> + bool "Enable kcmp() system call" if EXPERT >>> + default y >> >> I would expect this to be not default-y, especially if >> CHECKPOINT_RESTORE does a "select" on it. >> >> This is a really powerful syscall, but it is bounded by ptrace access >> controls, and uses pointer address obfuscation, so it may be okay to >> expose this. As it is, at least Ubuntu already has >> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much >> difference on exposure. >> >> So, if you drop the "default y", I'm fine with this. > > It was maybe stupid, but our userspace started relying on fd > comaprison through sys_kcomp. So for better or worse, if you want to > run the mesa3d gl/vk stacks, you need this. That's overstating things somewhat. The vast majority of applications will work fine regardless (as they did before Mesa started using this functionality). Only some special ones will run into issues, because the user-space drivers incorrectly assume two file descriptors reference different descriptions. > Was maybe not the brighest ideas, but since enough distros had this > enabled by defaults, Right, that (and the above) is why I considered it fair game to use. What should I have done instead? (TBH I was surprised that this functionality isn't generally available) > it wasn't really discovered, and now we're > shipping this everywhere. You're making it sound like this snuck in secretly somehow, which is not true of course. > Ofc we can leave the default n, but the select if CONFIG_DRM is > unfortunately needed I think. Per above, not sure this is really true.
On 2021-02-08 12:49 p.m., Michel Dänzer wrote: > On 2021-02-05 9:53 p.m., Daniel Vetter wrote: >> On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote: >>> >>> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote: >>>> Userspace has discovered the functionality offered by SYS_kcmp and has >>>> started to depend upon it. In particular, Mesa uses SYS_kcmp for >>>> os_same_file_description() in order to identify when two fd (e.g. >>>> device >>>> or dmabuf) point to the same struct file. Since they depend on it for >>>> core functionality, lift SYS_kcmp out of the non-default >>>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. >>>> >>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >>>> Cc: Kees Cook <keescook@chromium.org> >>>> Cc: Andy Lutomirski <luto@amacapital.net> >>>> Cc: Will Drewry <wad@chromium.org> >>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>> Cc: Dave Airlie <airlied@gmail.com> >>>> Cc: Daniel Vetter <daniel@ffwll.ch> >>>> Cc: Lucas Stach <l.stach@pengutronix.de> >>>> --- >>>> init/Kconfig | 11 +++++++++++ >>>> kernel/Makefile | 2 +- >>>> tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- >>>> 3 files changed, 13 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/init/Kconfig b/init/Kconfig >>>> index b77c60f8b963..f62fca13ac5b 100644 >>>> --- a/init/Kconfig >>>> +++ b/init/Kconfig >>>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES >>>> config CHECKPOINT_RESTORE >>>> bool "Checkpoint/restore support" >>>> select PROC_CHILDREN >>>> + select KCMP >>>> default n >>>> help >>>> Enables additional kernel features in a sake of >>>> checkpoint/restore. >>>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS >>>> config ARCH_HAS_MEMBARRIER_SYNC_CORE >>>> bool >>>> >>>> +config KCMP >>>> + bool "Enable kcmp() system call" if EXPERT >>>> + default y >>> >>> I would expect this to be not default-y, especially if >>> CHECKPOINT_RESTORE does a "select" on it. >>> >>> This is a really powerful syscall, but it is bounded by ptrace access >>> controls, and uses pointer address obfuscation, so it may be okay to >>> expose this. As it is, at least Ubuntu already has >>> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much >>> difference on exposure. >>> >>> So, if you drop the "default y", I'm fine with this. >> >> It was maybe stupid, but our userspace started relying on fd >> comaprison through sys_kcomp. So for better or worse, if you want to >> run the mesa3d gl/vk stacks, you need this. > > That's overstating things somewhat. The vast majority of applications > will work fine regardless (as they did before Mesa started using this > functionality). Only some special ones will run into issues, because the > user-space drivers incorrectly assume two file descriptors reference > different descriptions. > > >> Was maybe not the brighest ideas, but since enough distros had this >> enabled by defaults, > > Right, that (and the above) is why I considered it fair game to use. > What should I have done instead? (TBH I was surprised that this > functionality isn't generally available) In that spirit, an alternative might be to make KCMP_FILE available unconditionally, and the rest of SYS_kcmp only with CHECKPOINT_RESTORE as before. (Or maybe other parts of SYS_kcmp are generally useful as well?)
On Mon, Feb 8, 2021 at 12:49 PM Michel Dänzer <michel@daenzer.net> wrote: > > On 2021-02-05 9:53 p.m., Daniel Vetter wrote: > > On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote: > >> > >> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote: > >>> Userspace has discovered the functionality offered by SYS_kcmp and has > >>> started to depend upon it. In particular, Mesa uses SYS_kcmp for > >>> os_same_file_description() in order to identify when two fd (e.g. device > >>> or dmabuf) point to the same struct file. Since they depend on it for > >>> core functionality, lift SYS_kcmp out of the non-default > >>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. > >>> > >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > >>> Cc: Kees Cook <keescook@chromium.org> > >>> Cc: Andy Lutomirski <luto@amacapital.net> > >>> Cc: Will Drewry <wad@chromium.org> > >>> Cc: Andrew Morton <akpm@linux-foundation.org> > >>> Cc: Dave Airlie <airlied@gmail.com> > >>> Cc: Daniel Vetter <daniel@ffwll.ch> > >>> Cc: Lucas Stach <l.stach@pengutronix.de> > >>> --- > >>> init/Kconfig | 11 +++++++++++ > >>> kernel/Makefile | 2 +- > >>> tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- > >>> 3 files changed, 13 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/init/Kconfig b/init/Kconfig > >>> index b77c60f8b963..f62fca13ac5b 100644 > >>> --- a/init/Kconfig > >>> +++ b/init/Kconfig > >>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES > >>> config CHECKPOINT_RESTORE > >>> bool "Checkpoint/restore support" > >>> select PROC_CHILDREN > >>> + select KCMP > >>> default n > >>> help > >>> Enables additional kernel features in a sake of checkpoint/restore. > >>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS > >>> config ARCH_HAS_MEMBARRIER_SYNC_CORE > >>> bool > >>> > >>> +config KCMP > >>> + bool "Enable kcmp() system call" if EXPERT > >>> + default y > >> > >> I would expect this to be not default-y, especially if > >> CHECKPOINT_RESTORE does a "select" on it. > >> > >> This is a really powerful syscall, but it is bounded by ptrace access > >> controls, and uses pointer address obfuscation, so it may be okay to > >> expose this. As it is, at least Ubuntu already has > >> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much > >> difference on exposure. > >> > >> So, if you drop the "default y", I'm fine with this. > > > > It was maybe stupid, but our userspace started relying on fd > > comaprison through sys_kcomp. So for better or worse, if you want to > > run the mesa3d gl/vk stacks, you need this. > > That's overstating things somewhat. The vast majority of applications > will work fine regardless (as they did before Mesa started using this > functionality). Only some special ones will run into issues, because the > user-space drivers incorrectly assume two file descriptors reference > different descriptions. > > > > Was maybe not the brighest ideas, but since enough distros had this > > enabled by defaults, > > Right, that (and the above) is why I considered it fair game to use. > What should I have done instead? (TBH I was surprised that this > functionality isn't generally available) Yeah that one is fine, but I thought we've discussed (irc or something) more uses for de-duping dma-buf and stuff like that. But quick grep says that hasn't landed yet, so I got a bit confused (or just dreamt). Looking at this again I'm kinda surprised the drmfd de-duping blows up on normal linux distros, but I guess it can all happen. > > it wasn't really discovered, and now we're > > shipping this everywhere. > > You're making it sound like this snuck in secretly somehow, which is not > true of course. > > > > Ofc we can leave the default n, but the select if CONFIG_DRM is > > unfortunately needed I think. > > Per above, not sure this is really true. We seem to be going boom on linux distros now, maybe userspace got more creative in abusing stuff? The entire thing is small enough that imo we don't really have to care, e.g. we also unconditionally select dma-buf, despite that on most systems there's only 1 gpu, and you're never going to end up with a buffer sharing case that needs any of that code (aside from the "here's an fd" part). But I guess we can limit to just KCMP_FILE like you suggest in another reply. Just feels a bit like overkill. -Daniel
On 2021-02-08 2:34 p.m., Daniel Vetter wrote: > On Mon, Feb 8, 2021 at 12:49 PM Michel Dänzer <michel@daenzer.net> wrote: >> >> On 2021-02-05 9:53 p.m., Daniel Vetter wrote: >>> On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@chromium.org> wrote: >>>> >>>> On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote: >>>>> Userspace has discovered the functionality offered by SYS_kcmp and has >>>>> started to depend upon it. In particular, Mesa uses SYS_kcmp for >>>>> os_same_file_description() in order to identify when two fd (e.g. device >>>>> or dmabuf) point to the same struct file. Since they depend on it for >>>>> core functionality, lift SYS_kcmp out of the non-default >>>>> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. >>>>> >>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >>>>> Cc: Kees Cook <keescook@chromium.org> >>>>> Cc: Andy Lutomirski <luto@amacapital.net> >>>>> Cc: Will Drewry <wad@chromium.org> >>>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>>> Cc: Dave Airlie <airlied@gmail.com> >>>>> Cc: Daniel Vetter <daniel@ffwll.ch> >>>>> Cc: Lucas Stach <l.stach@pengutronix.de> >>>>> --- >>>>> init/Kconfig | 11 +++++++++++ >>>>> kernel/Makefile | 2 +- >>>>> tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- >>>>> 3 files changed, 13 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/init/Kconfig b/init/Kconfig >>>>> index b77c60f8b963..f62fca13ac5b 100644 >>>>> --- a/init/Kconfig >>>>> +++ b/init/Kconfig >>>>> @@ -1194,6 +1194,7 @@ endif # NAMESPACES >>>>> config CHECKPOINT_RESTORE >>>>> bool "Checkpoint/restore support" >>>>> select PROC_CHILDREN >>>>> + select KCMP >>>>> default n >>>>> help >>>>> Enables additional kernel features in a sake of checkpoint/restore. >>>>> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS >>>>> config ARCH_HAS_MEMBARRIER_SYNC_CORE >>>>> bool >>>>> >>>>> +config KCMP >>>>> + bool "Enable kcmp() system call" if EXPERT >>>>> + default y >>>> >>>> I would expect this to be not default-y, especially if >>>> CHECKPOINT_RESTORE does a "select" on it. >>>> >>>> This is a really powerful syscall, but it is bounded by ptrace access >>>> controls, and uses pointer address obfuscation, so it may be okay to >>>> expose this. As it is, at least Ubuntu already has >>>> CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much >>>> difference on exposure. >>>> >>>> So, if you drop the "default y", I'm fine with this. >>> >>> It was maybe stupid, but our userspace started relying on fd >>> comaprison through sys_kcomp. So for better or worse, if you want to >>> run the mesa3d gl/vk stacks, you need this. >> >> That's overstating things somewhat. The vast majority of applications >> will work fine regardless (as they did before Mesa started using this >> functionality). Only some special ones will run into issues, because the >> user-space drivers incorrectly assume two file descriptors reference >> different descriptions. >> >> >>> Was maybe not the brighest ideas, but since enough distros had this >>> enabled by defaults, >> >> Right, that (and the above) is why I considered it fair game to use. >> What should I have done instead? (TBH I was surprised that this >> functionality isn't generally available) > > Yeah that one is fine, but I thought we've discussed (irc or > something) more uses for de-duping dma-buf and stuff like that. But > quick grep says that hasn't landed yet, so I got a bit confused (or > just dreamt). Looking at this again I'm kinda surprised the drmfd > de-duping blows up on normal linux distros, but I guess it can all > happen. One example: GEM handle name-spaces are per file description. If user-space incorrectly assumes two DRM fds are independent, when they actually reference the same file description, closing a GEM handle with one file descriptor will make it unusable with the other file descriptor as well. >>> Ofc we can leave the default n, but the select if CONFIG_DRM is >>> unfortunately needed I think. >> >> Per above, not sure this is really true. > > We seem to be going boom on linux distros now, maybe userspace got > more creative in abusing stuff? I don't know what you're referring to. I've only seen maybe two or three reports from people who didn't enable CHECKPOINT_RESTORE in their self-built kernels. > The entire thing is small enough that imo we don't really have to care, > e.g. we also unconditionally select dma-buf, despite that on most > systems there's only 1 gpu, and you're never going to end up with a > buffer sharing case that needs any of that code (aside from the > "here's an fd" part). > > But I guess we can limit to just KCMP_FILE like you suggest in another > reply. Just feels a bit like overkill. Making KCMP_FILE gated by DRM makes as little sense to me as by CHECKPOINT_RESTORE.
Hi! > Userspace has discovered the functionality offered by SYS_kcmp and has > started to depend upon it. In particular, Mesa uses SYS_kcmp for > os_same_file_description() in order to identify when two fd (e.g. device > or dmabuf) point to the same struct file. Since they depend on it for > core functionality, lift SYS_kcmp out of the non-default > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. Is it good idea to enable everything because Mesa uses it for file descriptors? This is really interesting syscall... Best regards, Pavel
Am Samstag, dem 13.02.2021 um 18:40 +0100 schrieb Pavel Machek: > Hi! > > > Userspace has discovered the functionality offered by SYS_kcmp and has > > started to depend upon it. In particular, Mesa uses SYS_kcmp for > > os_same_file_description() in order to identify when two fd (e.g. device > > or dmabuf) point to the same struct file. Since they depend on it for > > core functionality, lift SYS_kcmp out of the non-default > > CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. > > Is it good idea to enable everything because Mesa uses it for file > descriptors? > > This is really interesting syscall... As Debian/Ubuntu and Fedora are already shipping with CONFIG_CHECKPOINT_RESTORE=y in their kernel configs, I don't really see the need to add further restrictions here. Or this discussion should have happened a while ago... Regards, Lucas
diff --git a/init/Kconfig b/init/Kconfig index b77c60f8b963..f62fca13ac5b 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1194,6 +1194,7 @@ endif # NAMESPACES config CHECKPOINT_RESTORE bool "Checkpoint/restore support" select PROC_CHILDREN + select KCMP default n help Enables additional kernel features in a sake of checkpoint/restore. @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS config ARCH_HAS_MEMBARRIER_SYNC_CORE bool +config KCMP + bool "Enable kcmp() system call" if EXPERT + default y + help + Enable the file descriptor comparison system call. It provides + user-space with the ability to compare two fd to see if they + point to the same file, and check other attributes. + + If unsure, say Y. + config RSEQ bool "Enable rseq() system call" if EXPERT default y diff --git a/kernel/Makefile b/kernel/Makefile index aa7368c7eabf..320f1f3941b7 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -51,7 +51,7 @@ obj-y += livepatch/ obj-y += dma/ obj-y += entry/ -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o +obj-$(CONFIG_KCMP) += kcmp.o obj-$(CONFIG_FREEZER) += freezer.o obj-$(CONFIG_PROFILING) += profile.o obj-$(CONFIG_STACKTRACE) += stacktrace.o diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c index 26c72f2b61b1..1b6c7d33c4ff 100644 --- a/tools/testing/selftests/seccomp/seccomp_bpf.c +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c @@ -315,7 +315,7 @@ TEST(kcmp) ret = __filecmp(getpid(), getpid(), 1, 1); EXPECT_EQ(ret, 0); if (ret != 0 && errno == ENOSYS) - SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)"); + SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)"); } TEST(mode_strict_support)
Userspace has discovered the functionality offered by SYS_kcmp and has started to depend upon it. In particular, Mesa uses SYS_kcmp for os_same_file_description() in order to identify when two fd (e.g. device or dmabuf) point to the same struct file. Since they depend on it for core functionality, lift SYS_kcmp out of the non-default CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Lucas Stach <l.stach@pengutronix.de> --- init/Kconfig | 11 +++++++++++ kernel/Makefile | 2 +- tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +- 3 files changed, 13 insertions(+), 2 deletions(-)