Message ID | 42da416106d5c1cf92bda1e058434fe240b35f44.1585898438.git.christophe.leroy@c-s.fr (mailing list archive)
---|---
State | New, archived
Series | [v2,1/5] uaccess: Add user_read_access_begin/end and user_write_access_begin/end
On Fri, Apr 3, 2020 at 12:21 AM Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>
> Now we have user_read_access_begin() and user_write_access_begin()
> in addition to user_access_begin().

I realize Al asked for this, but I don't think it really adds anything
to the series.

The "full" makes the names longer, but not really any more legible.

So I like 1-4, but am unconvinced about 5 and would prefer that to be
dropped. Sorry for the bikeshedding.

And I like this series much better without the cookie that was
discussed, and just making the hard rule be that they can't nest.

Some architecture may obviously use a cookie internally if they have
some nesting behavior of their own, but it doesn't look like we have
any major reason to expose that as the actual interface.

The only other question is how to synchronize this? I'm ok with it
going through the ppc tree, for example, and just let others build on
that. Maybe using a shared immutable branch with 5.6 as a base?

            Linus
On Fri, Apr 03, 2020 at 11:01:10AM -0700, Linus Torvalds wrote:
> On Fri, Apr 3, 2020 at 12:21 AM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
> >
> > Now we have user_read_access_begin() and user_write_access_begin()
> > in addition to user_access_begin().
>
> I realize Al asked for this, but I don't think it really adds anything
> to the series.
>
> The "full" makes the names longer, but not really any more legible.
>
> So I like 1-4, but am unconvinced about 5 and would prefer that to be
> dropped. Sorry for the bikeshedding.
>
> And I like this series much better without the cookie that was
> discussed, and just making the hard rule be that they can't nest.
>
> Some architecture may obviously use a cookie internally if they have
> some nesting behavior of their own, but it doesn't look like we have
> any major reason to expose that as the actual interface.
>
> The only other question is how to synchronize this? I'm ok with it
> going through the ppc tree, for example, and just let others build on
> that. Maybe using a shared immutable branch with 5.6 as a base?

I can do a 5.7-rc1-based branch with that; depending upon what we end
up doing for arm and s390 we can always change the calling conventions
come next cycle ;-/

My impressions after digging through the arm side of things:

1) the only instance of nesting I'd found there (so far) is a mistake.
The rule should be "no fucking nesting, TYVM".

2) I'm really unhappy about the uaccess_with_memcpy thing. Besides
being fucking ugly, it kills any hope of lifting user_access_begin/end
out of raw_copy_to_user() - the things done in that bastard are
obviously *NOT* fit for a uaccess block. Including wonders like

	/* the mmap semaphore is taken only if not in an atomic context */
	atomic = faulthandler_disabled();
	if (!atomic)
		down_read(&current->mm->mmap_sem);

which, IMO, deserves to be taken out and shot.

Not that pin_page_for_write() in the same file
(arch/arm/lib/uaccess_with_memcpy.c) did not deserve the same
treatment...

As far as I can tell, the whole point of that thing is that, well,
memcpy() is optimized better than raw_copy_to_user()... So what's
wrong with taking the damn optimized memcpy and using it for
raw_copy_to_user() instead? Is it the lack of an STRT analogue that
would store several registers? Because AFAICS

commit f441882a5229ffaef61a47bccd4518f7e2749cbc
Author: Vincent Whitchurch <vincent.whitchurch@axis.com>
Date:   Fri Nov 9 10:09:48 2018 +0100

    ARM: 8812/1: Optimise copy_{from/to}_user for !CPU_USE_DOMAINS

makes for a much saner solution... I realize that it's v6+ and this
thing is specifically for a v5 variant, but...

3) how much do we need to keep the old DACR value in a register for
uaccess_restore()? AFAICS, if we prohibit nesting it becomes a
function of the USER_DS/KERNEL_DS setting (and that - only for
CPU_USE_DOMAINS), doesn't it? And we had to have fetched it recently,
back when access_ok() had been done, so shouldn't it be in cache?
Hi Christophe,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on next-20200403]
[cannot apply to powerpc/next drm-intel/for-linux-next tip/x86/core v5.6]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest using the '--base' option to specify
the base tree in git format-patch; please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/uaccess-Add-user_read_access_begin-end-and-user_write_access_begin-end/20200404-080555
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 5364abc57993b3bf60c41923cb98a8f1a594e749
config: x86_64-randconfig-s1-20200404 (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
If you fix the issue, kindly add the following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
arch/x86/ia32/ia32_signal.c: In function 'ia32_setup_frame':
>> arch/x86/ia32/ia32_signal.c:265:7: error: implicit declaration of function 'user_access_begin'; did you mean 'user_access_end'? [-Werror=implicit-function-declaration]
if (!user_access_begin(frame, sizeof(*frame)))
^~~~~~~~~~~~~~~~~
user_access_end
cc1: some warnings being treated as errors
vim +265 arch/x86/ia32/ia32_signal.c
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 233
235b80226b986d arch/x86/ia32/ia32_signal.c Al Viro 2012-11-09 234 int ia32_setup_frame(int sig, struct ksignal *ksig,
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 235 compat_sigset_t *set, struct pt_regs *regs)
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 236 {
3b0d29ee1c73b6 arch/x86/ia32/ia32_signal.c Hiroshi Shimamoto 2008-12-17 237 struct sigframe_ia32 __user *frame;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 238 void __user *restorer;
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 239 void __user *fp = NULL;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 240
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 241 /* copy_to_user optimizes that into a single 8 byte store */
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 242 static const struct {
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 243 u16 poplmovl;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 244 u32 val;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 245 u16 int80;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 246 } __attribute__((packed)) code = {
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 247 0xb858, /* popl %eax ; movl $...,%eax */
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 248 __NR_ia32_sigreturn,
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 249 0x80cd, /* int $0x80 */
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 250 };
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 251
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 252 frame = get_sigframe(ksig, regs, sizeof(*frame), &fp);
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 253
235b80226b986d arch/x86/ia32/ia32_signal.c Al Viro 2012-11-09 254 if (ksig->ka.sa.sa_flags & SA_RESTORER) {
235b80226b986d arch/x86/ia32/ia32_signal.c Al Viro 2012-11-09 255 restorer = ksig->ka.sa.sa_restorer;
af65d64845a90c arch/x86/ia32/ia32_signal.c Roland McGrath 2008-01-30 256 } else {
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 257 /* Return stub is in 32bit vsyscall page */
1a3e4ca41c5a38 arch/x86/ia32/ia32_signal.c Roland McGrath 2008-04-09 258 if (current->mm->context.vdso)
6f121e548f8367 arch/x86/ia32/ia32_signal.c Andy Lutomirski 2014-05-05 259 restorer = current->mm->context.vdso +
0a6d1fa0d2b48f arch/x86/ia32/ia32_signal.c Andy Lutomirski 2015-10-05 260 vdso_image_32.sym___kernel_sigreturn;
9fbbd4dd17d071 arch/x86_64/ia32/ia32_signal.c Andi Kleen 2007-02-13 261 else
ade1af77129dea arch/x86/ia32/ia32_signal.c Jan Engelhardt 2008-01-30 262 restorer = &frame->retcode;
af65d64845a90c arch/x86/ia32/ia32_signal.c Roland McGrath 2008-01-30 263 }
3b4b75700a245d arch/x86/ia32/ia32_signal.c Hiroshi Shimamoto 2009-01-23 264
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 @265 if (!user_access_begin(frame, sizeof(*frame)))
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 266 return -EFAULT;
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 267
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 268 unsafe_put_user(sig, &frame->sig, Efault);
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 269 unsafe_put_sigcontext32(&frame->sc, fp, regs, set, Efault);
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 270 unsafe_put_user(set->sig[1], &frame->extramask[0], Efault);
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 271 unsafe_put_user(ptr_to_compat(restorer), &frame->pretcode, Efault);
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 272 /*
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 273 * These are actually not used anymore, but left because some
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 274 * gdb versions depend on them as a marker.
99b9cdf758af70 arch/x86/ia32/ia32_signal.c Thomas Gleixner 2008-01-30 275 */
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 276 unsafe_put_user(*((u64 *)&code), (u64 __user *)frame->retcode, Efault);
e2390741053e49 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 277 user_access_end();
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 278
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 279 /* Set up registers for signal handler */
65ea5b03499035 arch/x86/ia32/ia32_signal.c H. Peter Anvin 2008-01-30 280 regs->sp = (unsigned long) frame;
235b80226b986d arch/x86/ia32/ia32_signal.c Al Viro 2012-11-09 281 regs->ip = (unsigned long) ksig->ka.sa.sa_handler;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 282
536e3ee4fed13d arch/x86_64/ia32/ia32_signal.c Andi Kleen 2006-09-26 283 /* Make -mregparm=3 work */
65ea5b03499035 arch/x86/ia32/ia32_signal.c H. Peter Anvin 2008-01-30 284 regs->ax = sig;
65ea5b03499035 arch/x86/ia32/ia32_signal.c H. Peter Anvin 2008-01-30 285 regs->dx = 0;
65ea5b03499035 arch/x86/ia32/ia32_signal.c H. Peter Anvin 2008-01-30 286 regs->cx = 0;
536e3ee4fed13d arch/x86_64/ia32/ia32_signal.c Andi Kleen 2006-09-26 287
b6edbb1e045a71 arch/x86/ia32/ia32_signal.c Jeremy Fitzhardinge 2008-08-19 288 loadsegment(ds, __USER32_DS);
b6edbb1e045a71 arch/x86/ia32/ia32_signal.c Jeremy Fitzhardinge 2008-08-19 289 loadsegment(es, __USER32_DS);
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 290
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 291 regs->cs = __USER32_CS;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 292 regs->ss = __USER32_DS;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 293
1d001df19d5323 arch/x86_64/ia32/ia32_signal.c Andi Kleen 2006-09-26 294 return 0;
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 295 Efault:
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 296 user_access_end();
44a1d996325982 arch/x86/ia32/ia32_signal.c Al Viro 2020-02-15 297 return -EFAULT;
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 298 }
^1da177e4c3f41 arch/x86_64/ia32/ia32_signal.c Linus Torvalds 2005-04-16 299
:::::: The code at line 265 was first introduced by commit
:::::: e2390741053e4931841650b5eadac32697aa88aa x86: ia32_setup_frame(): consolidate uaccess areas
:::::: TO: Al Viro <viro@zeniv.linux.org.uk>
:::::: CC: Al Viro <viro@zeniv.linux.org.uk>
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Christophe,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on next-20200403]
[cannot apply to powerpc/next drm-intel/for-linux-next tip/x86/core v5.6]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest using the '--base' option to specify
the base tree in git format-patch; please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/uaccess-Add-user_read_access_begin-end-and-user_write_access_begin-end/20200404-080555
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 5364abc57993b3bf60c41923cb98a8f1a594e749
config: i386-allyesconfig (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
If you fix the issue, kindly add the following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
arch/x86/kernel/vm86_32.c: In function 'save_v86_state':
>> arch/x86/kernel/vm86_32.c:116:7: error: implicit declaration of function 'user_access_begin'; did you mean 'user_access_end'? [-Werror=implicit-function-declaration]
if (!user_access_begin(user, vm86->vm86plus.is_vm86pus ?
^~~~~~~~~~~~~~~~~
user_access_end
cc1: some warnings being treated as errors
vim +116 arch/x86/kernel/vm86_32.c
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 95
5ed92a8ab71f88 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 96 void save_v86_state(struct kernel_vm86_regs *regs, int retval)
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 97 {
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 98 struct task_struct *tsk = current;
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 99 struct vm86plus_struct __user *user;
9fda6a0681e070 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 100 struct vm86 *vm86 = current->thread.vm86;
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 101
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 102 /*
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 103 * This gets called from entry.S with interrupts disabled, but
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 104 * from process context. Enable interrupts here, before trying
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 105 * to access user space.
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 106 */
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 107 local_irq_enable();
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 108
1342635638cba9 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 109 if (!vm86 || !vm86->user_vm86) {
1342635638cba9 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 110 pr_alert("no user_vm86: BAD\n");
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 111 do_exit(SIGSEGV);
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 112 }
decd275e62d5ee arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 113 set_flags(regs->pt.flags, VEFLAGS, X86_EFLAGS_VIF | vm86->veflags_mask);
1342635638cba9 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 114 user = vm86->user_vm86;
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 115
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 @116 if (!user_access_begin(user, vm86->vm86plus.is_vm86pus ?
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 117 sizeof(struct vm86plus_struct) :
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 118 sizeof(struct vm86_struct)))
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 119 goto Efault;
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 120
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 121 unsafe_put_user(regs->pt.bx, &user->regs.ebx, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 122 unsafe_put_user(regs->pt.cx, &user->regs.ecx, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 123 unsafe_put_user(regs->pt.dx, &user->regs.edx, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 124 unsafe_put_user(regs->pt.si, &user->regs.esi, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 125 unsafe_put_user(regs->pt.di, &user->regs.edi, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 126 unsafe_put_user(regs->pt.bp, &user->regs.ebp, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 127 unsafe_put_user(regs->pt.ax, &user->regs.eax, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 128 unsafe_put_user(regs->pt.ip, &user->regs.eip, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 129 unsafe_put_user(regs->pt.cs, &user->regs.cs, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 130 unsafe_put_user(regs->pt.flags, &user->regs.eflags, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 131 unsafe_put_user(regs->pt.sp, &user->regs.esp, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 132 unsafe_put_user(regs->pt.ss, &user->regs.ss, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 133 unsafe_put_user(regs->es, &user->regs.es, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 134 unsafe_put_user(regs->ds, &user->regs.ds, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 135 unsafe_put_user(regs->fs, &user->regs.fs, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 136 unsafe_put_user(regs->gs, &user->regs.gs, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 137 unsafe_put_user(vm86->screen_bitmap, &user->screen_bitmap, Efault_end);
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 138
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 139 user_access_end();
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 140
da51da189a24bb arch/x86/kernel/vm86_32.c Andy Lutomirski 2017-11-02 141 preempt_disable();
9fda6a0681e070 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 142 tsk->thread.sp0 = vm86->saved_sp0;
ed0b2edb61ba4e arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-19 143 tsk->thread.sysenter_cs = __KERNEL_CS;
252e1a0526304f arch/x86/kernel/vm86_32.c Joerg Roedel 2018-07-18 144 update_task_stack(tsk);
bd7dc5a6afac71 arch/x86/kernel/vm86_32.c Andy Lutomirski 2017-11-02 145 refresh_sysenter_cs(&tsk->thread);
9fda6a0681e070 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 146 vm86->saved_sp0 = 0;
da51da189a24bb arch/x86/kernel/vm86_32.c Andy Lutomirski 2017-11-02 147 preempt_enable();
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 148
5ed92a8ab71f88 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 149 memcpy(&regs->pt, &vm86->regs32, sizeof(struct pt_regs));
49d26b6eaa8e97 arch/i386/kernel/vm86.c Jeremy Fitzhardinge 2006-12-07 150
5ed92a8ab71f88 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 151 lazy_load_gs(vm86->regs32.gs);
49d26b6eaa8e97 arch/i386/kernel/vm86.c Jeremy Fitzhardinge 2006-12-07 152
5ed92a8ab71f88 arch/x86/kernel/vm86_32.c Brian Gerst 2015-07-29 153 regs->pt.ax = retval;
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 154 return;
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 155
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 156 Efault_end:
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 157 user_access_end();
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 158 Efault:
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 159 pr_alert("could not access userspace vm86 info\n");
a37d01ead405e3 arch/x86/kernel/vm86_32.c Al Viro 2020-02-15 160 do_exit(SIGSEGV);
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 161 }
^1da177e4c3f41 arch/i386/kernel/vm86.c Linus Torvalds 2005-04-16 162
:::::: The code at line 116 was first introduced by commit
:::::: a37d01ead405e3aa14d72d284721fe46422b3b63 x86: switch save_v86_state() to unsafe_put_user()
:::::: TO: Al Viro <viro@zeniv.linux.org.uk>
:::::: CC: Al Viro <viro@zeniv.linux.org.uk>
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On 03/04/2020 at 20:01, Linus Torvalds wrote:
> On Fri, Apr 3, 2020 at 12:21 AM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
>>
>> Now we have user_read_access_begin() and user_write_access_begin()
>> in addition to user_access_begin().
>
> I realize Al asked for this, but I don't think it really adds anything
> to the series.
>
> The "full" makes the names longer, but not really any more legible.
>
> So I like 1-4, but am unconvinced about 5 and would prefer that to be
> dropped. Sorry for the bikeshedding.

Yes, I was not sure about it; that's the reason why I added it as the
last patch of the series.

And in the meantime we see robots reporting build failures due to
additional uses of user_access_begin() in parallel to this change, so I
guess it would anyway be a challenge to perform such a change without
coordination.

> And I like this series much better without the cookie that was
> discussed, and just making the hard rule be that they can't nest.
>
> Some architecture may obviously use a cookie internally if they have
> some nesting behavior of their own, but it doesn't look like we have
> any major reason to expose that as the actual interface.
>
> The only other question is how to synchronize this? I'm ok with it
> going through the ppc tree, for example, and just let others build on
> that. Maybe using a shared immutable branch with 5.6 as a base?

Michael, can you take patches 1 to 4? Otherwise, can you ack patch 4 to
enable merging through another tree?

Christophe
[rmk Cc'd]

On Fri, Apr 03, 2020 at 09:52:05PM +0100, Al Viro wrote:
> I can do a 5.7-rc1-based branch with that; depending upon what we end
> up doing for arm and s390 we can always change the calling conventions
> come next cycle ;-/
>
> My impressions after digging through the arm side of things:
>
> 1) the only instance of nesting I'd found there (so far) is a mistake.
> The rule should be "no fucking nesting, TYVM".

OK, after quite a bit of digging:

1) everything outside of arm is quite happy with not passing anything
to user_access_end(). s390 is a red herring in that respect.

2) on arm we definitely can get rid of nesting. However, there are
some unpleasant sides of the logics in there. What we have is an MMU
register; everything except for two 2-bit fields in it is constant.
One of those fields is a function of get_fs(), the other might serve
as an analogue of x86 EFLAGS.AC. Rules:

	DACR.USER is 0 if CONFIG_SW_DOMAIN_PAN is enabled and we are
	*not* in a uaccess section; otherwise it's 1.

	DACR.KERNEL is 3 if CONFIG_USE_DOMAINS is enabled and we are
	under KERNEL_DS; otherwise it's 1.

[USE_DOMAINS is forced to "yes" on v5 and earlier, configurable on v6+]
[SW_DOMAIN_PAN is forced to "no" on v7 if we want support of huge
physical space, configurable with default to "yes" otherwise]

On entry into the kernel we get into the USER_DS state before we get
out of the asm glue. Original settings are restored on return. That
goes both for ->addr_limit (the get_fs() value) and for DACR.KERNEL
contents. DACR.USER ("uaccess allowed") is switched to the "disabled"
state before we reach C code and restored on return from the kernel.

The costs are interesting; setting the register is costly, in the same
manner STAC/CLAC is. Reading it... hell knows; I don't see any
explicit information about that.

As it is, both set_fs() and starting a uaccess block
(uaccess_save_and_enable() - the thing that would've gone into
user_access_begin()) do both a read and a write of the register; with
minimal massage we could get rid of reading the damn thing in
set_fs(). The user_access_end() candidate does a plain write to the
register, with the value kept around since the beginning of the
uaccess block.

*IF* a read from that register is cheap, we can trivially get rid of
passing the cookie there - it's a matter of reading the register and
clearing one bit in it before writing it back. If that is costly,
though... We can easily calculate it from ->addr_limit, which we
already have in cache at that point, or will need shortly anyway. In
that case it would probably make sense to do the same to
user_access_begin() and set_fs(). Note that I'm not suggesting to do
anything of that sort in switch_to() - the existing mechanism doesn't
need any changes, and neither does the asm glue in entry*.S.

The only source I'd been able to find speaks of >= 60 cycles (and
possibly much more) for non-pipelined coprocessor instructions; the
list of such does contain loads and stores to a bunch of registers.
However, the register in question (p15/c3) has only store mentioned
there, so loads might be cheap; no obvious reasons for those to be
slow. That's a question to arm folks, I'm afraid... rmk?

Note that we can keep the current variant (i.e. user_access_begin()
being just the check for access_ok(), user_access_end() being empty
and uaccess_save_and_enable()/uaccess_restore() done manually inside
the primitives); after all, a lot of architectures don't _have_
anything of that sort. It's just that decisions regarding the calling
conventions for these primitives will be much harder to change later
on...

Again, arm (the 32-bit one) is the only architecture that has
something of that sort and needs to pass a cookie from the beginning
to the end of uaccess blocks. Everything else splits into several
classes:

1) has MMU, shared address space for kernel/userland, no stac
analogues: alpha, arc, csky, hexagon, itanic, nds32, nios32, openrisc,
sh, sparc32, unicore32, xtensa/MMU, microblaze/MMU, mips/MMU,
m68k/MMU/COLDFIRE. No way to do anything other than a plain
access_ok() for user_access_begin().

2) has MMU, shared address space for kernel/userland, has a stac
analogue, possibly with separate "for read" and "for write" variants.
Can live without passing any cookies: arm64, powerpc, riscv, x86. The
current variant with the changes in this patchset covers those.

3) non-MMU, uses memcpy() for everything, or at least ought to: c6x,
h8300, m68k/!MMU, xtensa/!MMU(?), microblaze/!MMU(?), mips/!MMU(?),
arm/!MMU. No memory protection of any sort...

4) sparc-like: MMU, separate address spaces for userland and kernel,
has explicit insns for uaccess + some register(s) to choose what those
insns actually hit: sparc64, parisc, m68k/MMU/!COLDFIRE. No stac/clac
analogue would make sense.

5) s390: a weird one - there is an stac analogue as far as the
hardware is concerned, but it can't be separated from the inline asm
where the actual uaccess insns are. From the kernel POV it's
sparc-like. Nothing that would reasonably map to
user_access_begin/user_access_end.

6) um: no uaccess, in the sense of dereferencing non-kernel pointers.
What it does is a simulation of the page table walk + an explicit call
of the #PF handler on missing pages + kmap_atomic() to get a kernel
alias.
On Tue, Apr 21, 2020 at 03:49:19AM +0100, Al Viro wrote:
> The only source I'd been able to find speaks of >= 60 cycles
> (and possibly much more) for non-pipelined coprocessor instructions;
> the list of such does contain loads and stores to a bunch of registers.
> However, the register in question (p15/c3) has only store mentioned there,
> so loads might be cheap; no obvious reasons for those to be slow.
> That's a question to arm folks, I'm afraid... rmk?

I have no information on that; instruction timings are not defined at
the architecture level (architecture reference manual), nor do I find
the information in the CPU technical reference manual (which would be
specific to the CPU). Instruction timings tend to be implementation
dependent.

I've always consulted Will Deacon when I've needed to know whether an
instruction is expensive or not.
On Mon, Apr 20, 2020 at 7:49 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> The only source I'd been able to find speaks of >= 60 cycles
> (and possibly much more) for non-pipelined coprocessor instructions;
> the list of such does contain loads and stores to a bunch of registers.
> However, the register in question (p15/c3) has only store mentioned there,
> so loads might be cheap; no obvious reasons for those to be slow.
> That's a question to arm folks, I'm afraid... rmk?

_If_ it turns out to be expensive, is there any reason we couldn't
just cache the value in general?

That's what x86 tends to do with expensive system registers. One
example would be "msr_misc_features_shadow".

But maybe that's something to worry about when/if it turns out to
actually be a problem?

            Linus
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 4427d419eb1d..7fe799e081f2 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -456,14 +456,15 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
 extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
 				   size_t len);
 
-static __must_check inline bool user_access_begin(const void __user *ptr, size_t len)
+static __must_check inline bool
+user_full_access_begin(const void __user *ptr, size_t len)
 {
 	if (unlikely(!access_ok(ptr, len)))
 		return false;
 	allow_read_write_user((void __user *)ptr, ptr, len);
 	return true;
 }
-#define user_access_begin	user_access_begin
+#define user_full_access_begin	user_full_access_begin
 #define user_access_end		prevent_current_access_user
 #define user_access_save	prevent_user_access_return
 #define user_access_restore	restore_user_access
diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h
index f9c00110a69a..9eefea374bd4 100644
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -56,7 +56,7 @@ do { \
 static __always_inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
 		u32 __user *uaddr)
 {
-	if (!user_access_begin(uaddr, sizeof(u32)))
+	if (!user_full_access_begin(uaddr, sizeof(u32)))
 		return -EFAULT;
 
 	switch (op) {
@@ -92,7 +92,7 @@ static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 {
 	int ret = 0;
 
-	if (!user_access_begin(uaddr, sizeof(u32)))
+	if (!user_full_access_begin(uaddr, sizeof(u32)))
 		return -EFAULT;
 	asm volatile("\n"
 		"1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index d8f283b9a569..8776e815f215 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -473,16 +473,17 @@ extern struct movsl_mask {
  * The "unsafe" user accesses aren't really "unsafe", but the naming
  * is a big fat warning: you have to not only do the access_ok()
  * checking before using them, but you have to surround them with the
- * user_access_begin/end() pair.
+ * user_full_access_begin/end() pair.
  */
-static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len)
+static __must_check __always_inline bool
+user_full_access_begin(const void __user *ptr, size_t len)
 {
 	if (unlikely(!access_ok(ptr,len)))
 		return 0;
 	__uaccess_begin_nospec();
 	return 1;
 }
-#define user_access_begin(a,b)	user_access_begin(a,b)
+#define user_full_access_begin(a,b)	user_full_access_begin(a,b)
 #define user_access_end()	__uaccess_end()
 
 #define user_access_save()	smap_save()
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 9861c89f93be..5be9bc930342 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -368,8 +368,8 @@ extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count);
 #define probe_kernel_address(addr, retval)		\
 	probe_kernel_read(&retval, addr, sizeof(retval))
 
-#ifndef user_access_begin
-#define user_access_begin(ptr,len) access_ok(ptr, len)
+#ifndef user_full_access_begin
+#define user_full_access_begin(ptr,len) access_ok(ptr, len)
 #define user_access_end() do { } while (0)
 #define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0)
 #define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user(x,p),e)
@@ -379,11 +379,11 @@ static inline unsigned long user_access_save(void) { return 0UL; }
 static inline void user_access_restore(unsigned long flags) { }
 #endif
 
 #ifndef user_write_access_begin
-#define user_write_access_begin user_access_begin
+#define user_write_access_begin user_full_access_begin
 #define user_write_access_end user_access_end
 #endif
 #ifndef user_read_access_begin
-#define user_read_access_begin user_access_begin
+#define user_read_access_begin user_full_access_begin
 #define user_read_access_end user_access_end
 #endif
Now we have user_read_access_begin() and user_write_access_begin() in
addition to user_access_begin().

Make it explicit that user_access_begin() provides both read and write
by renaming it user_full_access_begin(). And the same for
user_access_end() which becomes user_full_access_end().

Done with the following command, then two overly long lines split by
hand:

	sed -i s/user_access_begin/user_full_access_begin/g `git grep -l user_access_begin`

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: New, based on a remark from Al Viro.
---
 arch/powerpc/include/asm/uaccess.h | 5 +++--
 arch/x86/include/asm/futex.h       | 4 ++--
 arch/x86/include/asm/uaccess.h     | 7 ++++---
 include/linux/uaccess.h            | 8 ++++----
 4 files changed, 13 insertions(+), 11 deletions(-)