Message ID | e11a8f0670251267f87e3114e0bdbacb1eb72980.1579783936.git.christophe.leroy@c-s.fr (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3,1/7] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() | expand |
On Thu, 23 Jan 2020, Christophe Leroy <christophe.leroy@c-s.fr> wrote: > On 32 bits powerPC (book3s/32), only write accesses to user are > protected and there is no point spending time on unlocking for reads. > > On 64 bits powerpc (book3s/64 at least), access can be granted > read only, write only or read/write. > > Add an argument to user_access_begin() to tell when it's for write and > return an opaque key that will be used by user_access_end() to know > what was done by user_access_begin(). IMHO an opaque key is a prime example of a case where the use of an opaque typedef is warranted. Nobody needs to know or care it's specifically an unsigned long. BR, Jani.
On Thu, Jan 23, 2020 at 4:59 AM Christophe Leroy <christophe.leroy@c-s.fr> wrote: > > On 32 bits powerPC (book3s/32), only write accesses to user are > protected and there is no point spending time on unlocking for reads. Honestly, I'm starting to think that 32-bit ppc just needs to look more like everybody else, than make these changes. We used to have a read/write argument to the old "verify_area()" and "access_ok()" model, and it was a mistake. It was due to odd i386 user access issues. We got rid of it. I'm not convinced this is any better - it looks very similar and for odd ppc access issues. But if we really do want to do this, then: > Add an argument to user_access_begin() to tell when it's for write and > return an opaque key that will be used by user_access_end() to know > what was done by user_access_begin(). You should make it more opaque than "unsigned long". Also, it shouldn't be a "is this a write". What if it's a read _and_ a write? Only a write? Only a read? Linus
Le 23/01/2020 à 19:02, Linus Torvalds a écrit : > On Thu, Jan 23, 2020 at 4:59 AM Christophe Leroy > <christophe.leroy@c-s.fr> wrote: >> >> On 32 bits powerPC (book3s/32), only write accesses to user are >> protected and there is no point spending time on unlocking for reads. > > Honestly, I'm starting to think that 32-bit ppc just needs to look > more like everybody else, than make these changes. Well, beside ppc32, I was also seen it as an opportunity for the modern ppc64. On it, you can unlock either read or write or both. And this is what is done for get_user() / put_user() and friends: unlock only reads for get_user() and only writes for put_user(). Could also be a compromise between performance and security: keeping reads allowed at all time and only protect against writes on modern architectures which support it like ppc64. > > We used to have a read/write argument to the old "verify_area()" and > "access_ok()" model, and it was a mistake. It was due to odd i386 user > access issues. We got rid of it. I'm not convinced this is any better > - it looks very similar and for odd ppc access issues. I'm going to leave it aside, at least for the time being, and do it as a second step later after evaluating the real performance impact. I'll respin tomorrow in that way. > > But if we really do want to do this, then: Indeed I took the idea from a discussion in last Octobre (Subject: "book3s/32 KUAP (was Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user())" ) https://lore.kernel.org/lkml/87h84avffi.fsf@mpe.ellerman.id.au/ > >> Add an argument to user_access_begin() to tell when it's for write and >> return an opaque key that will be used by user_access_end() to know >> what was done by user_access_begin(). > > You should make it more opaque than "unsigned long". > > Also, it shouldn't be a "is this a write". What if it's a read _and_ a > write? Only a write? Only a read? Indeed that was more: does it includes a write. It's either RO or RW Christophe
On Thu, Jan 23, 2020 at 11:47 AM christophe leroy <christophe.leroy@c-s.fr> wrote: > > I'm going to leave it aside, at least for the time being, and do it as a > second step later after evaluating the real performance impact. I'll > respin tomorrow in that way. Ok, good. From a "narrow the access window type" standpoint it does seem to be a good idea to specify what kind of user accesses will be done, so I don't hate the idea, it's more that I'm not convinced it matters enough. On x86, we have made the rule that user_access_begin/end() can contain _very_ few operations, and objtool really does enforce that. With objtool and KASAN, you really end up with very small ranges of user_access_begin/end(). And since we actually verify it statically on x86-64, I would say that the added benefit of narrowing by access type is fairly small. We're not going to have complicated code in that user access region, at least in generic code. > > Also, it shouldn't be a "is this a write". What if it's a read _and_ a > > write? Only a write? Only a read? > > Indeed that was more: does it includes a write. It's either RO or RW I would expect that most actual users would be RO or WO, so it's a bit odd to have those choices. Of course, often writing ends up requiring read permissions anyway if the architecture has problems with alignment handling or similar, but still... The real RW case does exist conceptually (we have "copy_in_user()", after all), but still feels like it shouldn't be seen as the only _interface_ choice. IOW, an architecture may decide to turn WO into RW because of architecture limitations (or, like x86 and arm, ignore the whole RO/RW/WO _entirely_ because there's just a single "allow user space accesses" flag), but on an interface layer if we add this flag, I really think it should be an explicit "read or write or both". So thus my "let's try to avoid doing it in the first place, but if we _do_ do this, then do it right" plea. Linus
On January 23, 2020 11:57:57 AM PST, Linus Torvalds <torvalds@linux-foundation.org> wrote: >On Thu, Jan 23, 2020 at 11:47 AM christophe leroy ><christophe.leroy@c-s.fr> wrote: >> >> I'm going to leave it aside, at least for the time being, and do it >as a >> second step later after evaluating the real performance impact. I'll >> respin tomorrow in that way. > >Ok, good. > >From a "narrow the access window type" standpoint it does seem to be a >good idea to specify what kind of user accesses will be done, so I >don't hate the idea, it's more that I'm not convinced it matters >enough. > >On x86, we have made the rule that user_access_begin/end() can contain >_very_ few operations, and objtool really does enforce that. With >objtool and KASAN, you really end up with very small ranges of >user_access_begin/end(). > >And since we actually verify it statically on x86-64, I would say that >the added benefit of narrowing by access type is fairly small. We're >not going to have complicated code in that user access region, at >least in generic code. > >> > Also, it shouldn't be a "is this a write". What if it's a read >_and_ a >> > write? Only a write? Only a read? >> >> Indeed that was more: does it includes a write. It's either RO or RW > >I would expect that most actual users would be RO or WO, so it's a bit >odd to have those choices. > >Of course, often writing ends up requiring read permissions anyway if >the architecture has problems with alignment handling or similar, but >still... The real RW case does exist conceptually (we have >"copy_in_user()", after all), but still feels like it shouldn't be >seen as the only _interface_ choice. > >IOW, an architecture may decide to turn WO into RW because of >architecture limitations (or, like x86 and arm, ignore the whole >RO/RW/WO _entirely_ because there's just a single "allow user space >accesses" flag), but on an interface layer if we add this flag, I >really think it should be an explicit "read or write or both". > >So thus my "let's try to avoid doing it in the first place, but if we >_do_ do this, then do it right" plea. > > Linus I'm wondering if we should make it a static part of the API instead of a variable. I have *deep* concern with carrying state in a "key" variable: it's a direct attack vector for a crowbar attack, especially since it is by definition live inside a user access region. One major reason x86 restricts the regions like this is to minimize the amount of unconstrained state: we don't save and restore the state around, but enter and exit unconditionally, which means that a leaked state will end up having a limited lifespan. Nor is there any state inside the user access region which could be corrupted to leave the region open.
On Thu, Jan 23, 2020 at 10:03 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > We used to have a read/write argument to the old "verify_area()" and > "access_ok()" model, and it was a mistake. It was due to odd i386 user > access issues. We got rid of it. I'm not convinced this is any better > - it looks very similar and for odd ppc access issues. If the mode (read or write) were made visible to the trap handler, I'd find that useful for machine check recovery. If I'm in the middle of a copy_from_user() and I get a machine check reading poison from a user address ... then I could try to recover in the same way as for the user accessing the poison (offline the page, SIGBUS the task). But if the poison is in kernel memory and we are doing a copy_to_user(), then we are hosed (or would need some more complex recovery plan). [Note that we only get recoverable machine checks on loads... writes are posted, so if something goes wrong it isn't synchronous with the store instruction that initiated it] -Tony
Hi Christophe, Thank you for the patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on tip/x86/core drm-intel/for-linux-next v5.5-rc7] [cannot apply to linus/master next-20200124] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/fs-readdir-Fix-filldir-and-filldir64-use-of-user_access_begin/20200125-070606 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: x86_64-randconfig-s0-20200125 (attached as .config) compiler: gcc-7 (Debian 7.5.0-3) 7.5.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 If you fix the issue, kindly add following tag Reported-by: kbuild test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): kernel/exit.c: In function '__do_sys_waitid': >> kernel/exit.c:1567:53: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(infop, sizeof(*infop), true); ^ >> kernel/exit.c:1567:6: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(infop, sizeof(*infop), true); ^ kernel/exit.c: In function '__do_compat_sys_waitid': kernel/exit.c:1697:53: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(infop, sizeof(*infop), true); ^ kernel/exit.c:1697:6: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(infop, sizeof(*infop), true); ^ -- kernel/compat.c: In function 'compat_get_bitmap': >> kernel/compat.c:267:55: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(umask, bitmap_size / 8, false); ^ >> kernel/compat.c:267:6: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(umask, bitmap_size / 8, false); ^ kernel/compat.c: In function 'compat_put_bitmap': kernel/compat.c:298:54: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(umask, bitmap_size / 8, true); ^ kernel/compat.c:298:6: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(umask, bitmap_size / 8, true); ^ -- fs/readdir.c: In function 'filldir': >> fs/readdir.c:242:58: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(prev, reclen + prev_reclen, true); ^ >> fs/readdir.c:242:6: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(prev, reclen + prev_reclen, true); ^ fs/readdir.c: In function 'filldir64': fs/readdir.c:329:58: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(prev, reclen + prev_reclen, true); ^ fs/readdir.c:329:6: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(prev, reclen + prev_reclen, true); ^ -- lib/usercopy.c: In function 'check_zeroed_user': >> lib/usercopy.c:62:43: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(from, size, false); ^ >> lib/usercopy.c:62:6: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(from, size, false); ^ -- lib/strncpy_from_user.c: In function 'strncpy_from_user': >> lib/strncpy_from_user.c:120:42: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(src, max, false); ^ >> lib/strncpy_from_user.c:120:7: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(src, max, false); ^ -- lib/strnlen_user.c: In function 'strnlen_user': >> lib/strnlen_user.c:113:42: error: macro "user_access_begin" passed 3 arguments, but takes just 2 key = user_access_begin(str, max, false); ^ >> lib/strnlen_user.c:113:7: warning: assignment makes integer from pointer without a cast [-Wint-conversion] key = user_access_begin(str, max, false); ^ vim +/user_access_begin +1567 kernel/exit.c 1548 1549 SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *, 1550 infop, int, options, struct rusage __user *, ru) 1551 { 1552 struct rusage r; 1553 struct waitid_info info = {.status = 0}; 1554 long err = kernel_waitid(which, upid, &info, options, ru ? &r : NULL); 1555 int signo = 0; 1556 unsigned long key; 1557 1558 if (err > 0) { 1559 signo = SIGCHLD; 1560 err = 0; 1561 if (ru && copy_to_user(ru, &r, sizeof(struct rusage))) 1562 return -EFAULT; 1563 } 1564 if (!infop) 1565 return err; 1566 > 1567 key = user_access_begin(infop, sizeof(*infop), true); 1568 if (!key) 1569 return -EFAULT; 1570 1571 unsafe_put_user(signo, &infop->si_signo, Efault); 1572 unsafe_put_user(0, &infop->si_errno, Efault); 1573 unsafe_put_user(info.cause, &infop->si_code, Efault); 1574 unsafe_put_user(info.pid, &infop->si_pid, Efault); 1575 unsafe_put_user(info.uid, &infop->si_uid, Efault); 1576 unsafe_put_user(info.status, &infop->si_status, Efault); 1577 user_access_end(key); 1578 return err; 1579 Efault: 1580 user_access_end(key); 1581 return -EFAULT; 1582 } 1583 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 61d93f062a36..05eccdc0366a 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -709,7 +709,8 @@ extern struct movsl_mask { * checking before using them, but you have to surround them with the * user_access_begin/end() pair. */ -static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len) +static __must_check __always_inline unsigned long +user_access_begin(const void __user *ptr, size_t len, bool write) { if (unlikely(!access_ok(ptr,len))) return 0; @@ -717,7 +718,7 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt return 1; } #define user_access_begin(a,b) user_access_begin(a,b) -#define user_access_end() __uaccess_end() +#define user_access_end(x) __uaccess_end() #define user_access_save() smap_save() #define user_access_restore(x) smap_restore(x) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index bc3a67226163..509bfb6116ac 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1615,6 +1615,7 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb) const unsigned int count = eb->buffer_count; unsigned int i; int err; + unsigned long key; for (i = 0; i < count; i++) { const unsigned int nreloc = eb->exec[i].relocation_count; @@ -1662,14 +1663,15 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb) * happened we would make the mistake of assuming that the * relocations were valid. */ - if (!user_access_begin(urelocs, size)) + key = user_access_begin(urelocs, size, true); + if (!key) goto end; for (copied = 0; copied < nreloc; copied++) unsafe_put_user(-1, &urelocs[copied].presumed_offset, end_user); - user_access_end(); + user_access_end(key); eb->exec[i].relocs_ptr = (uintptr_t)relocs; } @@ -1677,7 +1679,7 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb) return 0; end_user: - user_access_end(); + user_access_end(key); end: kvfree(relocs); err = -EFAULT; @@ -2906,6 +2908,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, struct drm_i915_gem_exec_object2 __user *user_exec_list = u64_to_user_ptr(args->buffers_ptr); unsigned int i; + unsigned long key; /* Copy the new buffer offsets back to the user's exec list. */ /* @@ -2915,7 +2918,9 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, * And this range already got effectively checked earlier * when we did the "copy_from_user()" above. */ - if (!user_access_begin(user_exec_list, count * sizeof(*user_exec_list))) + key = user_access_begin(user_exec_list, + count * sizeof(*user_exec_list), true); + if (!key) goto end; for (i = 0; i < args->buffer_count; i++) { @@ -2929,7 +2934,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, end_user); } end_user: - user_access_end(); + user_access_end(key); end:; } diff --git a/fs/readdir.c b/fs/readdir.c index 4b466cbb0f3a..47b9ef97e16e 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -221,6 +221,7 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen, int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2, sizeof(long)); int prev_reclen; + unsigned long key; buf->error = verify_dirent_name(name, namlen); if (unlikely(buf->error)) @@ -238,7 +239,8 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen, return -EINTR; dirent = buf->current_dir; prev = (void __user *)dirent - prev_reclen; - if (!user_access_begin(prev, reclen + prev_reclen)) + key = user_access_begin(prev, reclen + prev_reclen, true); + if (!key) goto efault; /* This might be 'dirent->d_off', but if so it will get overwritten */ @@ -247,14 +249,14 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen, unsafe_put_user(reclen, &dirent->d_reclen, efault_end); unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end); unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end); - user_access_end(); + user_access_end(key); buf->current_dir = (void __user *)dirent + reclen; buf->prev_reclen = reclen; buf->count -= reclen; return 0; efault_end: - user_access_end(); + user_access_end(key); efault: buf->error = -EFAULT; return -EFAULT; @@ -311,6 +313,7 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1, sizeof(u64)); int prev_reclen; + unsigned long key; buf->error = verify_dirent_name(name, namlen); if (unlikely(buf->error)) @@ -323,7 +326,8 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, return -EINTR; dirent = buf->current_dir; prev = (void __user *)dirent - prev_reclen; - if (!user_access_begin(prev, reclen + prev_reclen)) + key = user_access_begin(prev, reclen + prev_reclen, true); + if (!key) goto efault; /* This might be 'dirent->d_off', but if so it will get overwritten */ @@ -332,7 +336,7 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, unsafe_put_user(reclen, &dirent->d_reclen, efault_end); unsafe_put_user(d_type, &dirent->d_type, efault_end); unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end); - user_access_end(); + user_access_end(key); buf->prev_reclen = reclen; dirent = (void __user *)dirent + reclen; @@ -341,7 +345,7 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, return 0; efault_end: - user_access_end(); + user_access_end(key); efault: buf->error = -EFAULT; return -EFAULT; diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 67f016010aad..394f5029a727 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -369,8 +369,8 @@ extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count); probe_kernel_read(&retval, addr, sizeof(retval)) #ifndef user_access_begin -#define user_access_begin(ptr,len) access_ok(ptr, len) -#define user_access_end() do { } while (0) +#define user_access_begin(ptr, len, write) access_ok(ptr, len) +#define user_access_end(x) do { } while (0) #define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0) #define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user(x,p),e) #define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user(x,p),e) diff --git a/kernel/compat.c b/kernel/compat.c index 95005f849c68..4bcbe1cd761b 100644 --- a/kernel/compat.c +++ b/kernel/compat.c @@ -258,12 +258,14 @@ long compat_get_bitmap(unsigned long *mask, const compat_ulong_t __user *umask, unsigned long bitmap_size) { unsigned long nr_compat_longs; + unsigned long key; /* align bitmap up to nearest compat_long_t boundary */ bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG); nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size); - if (!user_access_begin(umask, bitmap_size / 8)) + key = user_access_begin(umask, bitmap_size / 8, false); + if (!key) return -EFAULT; while (nr_compat_longs > 1) { @@ -275,11 +277,11 @@ long compat_get_bitmap(unsigned long *mask, const compat_ulong_t __user *umask, } if (nr_compat_longs) unsafe_get_user(*mask, umask++, Efault); - user_access_end(); + user_access_end(key); return 0; Efault: - user_access_end(); + user_access_end(key); return -EFAULT; } @@ -287,12 +289,14 @@ long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask, unsigned long bitmap_size) { unsigned long nr_compat_longs; + unsigned long key; /* align bitmap up to nearest compat_long_t boundary */ bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG); nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size); - if (!user_access_begin(umask, bitmap_size / 8)) + key = user_access_begin(umask, bitmap_size / 8, true); + if (!key) return -EFAULT; while (nr_compat_longs > 1) { @@ -303,10 +307,10 @@ long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask, } if (nr_compat_longs) unsafe_put_user((compat_ulong_t)*mask, umask++, Efault); - user_access_end(); + user_access_end(key); return 0; Efault: - user_access_end(); + user_access_end(key); return -EFAULT; } diff --git a/kernel/exit.c b/kernel/exit.c index 2833ffb0c211..1cb9c8a879d2 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -1553,6 +1553,7 @@ SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *, struct waitid_info info = {.status = 0}; long err = kernel_waitid(which, upid, &info, options, ru ? &r : NULL); int signo = 0; + unsigned long key; if (err > 0) { signo = SIGCHLD; @@ -1563,7 +1564,8 @@ SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *, if (!infop) return err; - if (!user_access_begin(infop, sizeof(*infop))) + key = user_access_begin(infop, sizeof(*infop), true); + if (!key) return -EFAULT; unsafe_put_user(signo, &infop->si_signo, Efault); @@ -1572,10 +1574,10 @@ SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *, unsafe_put_user(info.pid, &infop->si_pid, Efault); unsafe_put_user(info.uid, &infop->si_uid, Efault); unsafe_put_user(info.status, &infop->si_status, Efault); - user_access_end(); + user_access_end(key); return err; Efault: - user_access_end(); + user_access_end(key); return -EFAULT; } @@ -1673,6 +1675,8 @@ COMPAT_SYSCALL_DEFINE5(waitid, struct waitid_info info = {.status = 0}; long err = kernel_waitid(which, pid, &info, options, uru ? &ru : NULL); int signo = 0; + unsigned long key; + if (err > 0) { signo = SIGCHLD; err = 0; @@ -1690,7 +1694,8 @@ COMPAT_SYSCALL_DEFINE5(waitid, if (!infop) return err; - if (!user_access_begin(infop, sizeof(*infop))) + key = user_access_begin(infop, sizeof(*infop), true); + if (!key) return -EFAULT; unsafe_put_user(signo, &infop->si_signo, Efault); @@ -1699,10 +1704,10 @@ COMPAT_SYSCALL_DEFINE5(waitid, unsafe_put_user(info.pid, &infop->si_pid, Efault); unsafe_put_user(info.uid, &infop->si_uid, Efault); unsafe_put_user(info.status, &infop->si_status, Efault); - user_access_end(); + user_access_end(key); return err; Efault: - user_access_end(); + user_access_end(key); return -EFAULT; } #endif diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c index dccb95af6003..7184fb766439 100644 --- a/lib/strncpy_from_user.c +++ b/lib/strncpy_from_user.c @@ -113,12 +113,14 @@ long strncpy_from_user(char *dst, const char __user *src, long count) if (likely(src_addr < max_addr)) { unsigned long max = max_addr - src_addr; long retval; + unsigned long key; kasan_check_write(dst, count); check_object_size(dst, count, false); - if (user_access_begin(src, max)) { + key = user_access_begin(src, max, false); + if (key) { retval = do_strncpy_from_user(dst, src, count, max); - user_access_end(); + user_access_end(key); return retval; } } diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c index 6c0005d5dd5c..819e355b8608 100644 --- a/lib/strnlen_user.c +++ b/lib/strnlen_user.c @@ -108,10 +108,12 @@ long strnlen_user(const char __user *str, long count) if (likely(src_addr < max_addr)) { unsigned long max = max_addr - src_addr; long retval; + unsigned long key; - if (user_access_begin(str, max)) { + key = user_access_begin(str, max, false); + if (key) { retval = do_strnlen_user(str, count, max); - user_access_end(); + user_access_end(key); return retval; } } diff --git a/lib/usercopy.c b/lib/usercopy.c index cbb4d9ec00f2..9e03ca88ad32 100644 --- a/lib/usercopy.c +++ b/lib/usercopy.c @@ -51,6 +51,7 @@ int check_zeroed_user(const void __user *from, size_t size) { unsigned long val; uintptr_t align = (uintptr_t) from % sizeof(unsigned long); + unsigned long key; if (unlikely(size == 0)) return 1; @@ -58,7 +59,8 @@ int check_zeroed_user(const void __user *from, size_t size) from -= align; size += align; - if (!user_access_begin(from, size)) + key = user_access_begin(from, size, false); + if (!key) return -EFAULT; unsafe_get_user(val, (unsigned long __user *) from, err_fault); @@ -79,10 +81,10 @@ int check_zeroed_user(const void __user *from, size_t size) val &= aligned_byte_mask(size); done: - user_access_end(); + user_access_end(key); return (val == 0); err_fault: - user_access_end(); + user_access_end(key); return -EFAULT; } EXPORT_SYMBOL(check_zeroed_user);
On 32 bits powerPC (book3s/32), only write accesses to user are protected and there is no point spending time on unlocking for reads. On 64 bits powerpc (book3s/64 at least), access can be granted read only, write only or read/write. Add an argument to user_access_begin() to tell when it's for write and return an opaque key that will be used by user_access_end() to know what was done by user_access_begin(). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> --- v3: new --- arch/x86/include/asm/uaccess.h | 5 +++-- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 15 ++++++++++----- fs/readdir.c | 16 ++++++++++------ include/linux/uaccess.h | 4 ++-- kernel/compat.c | 16 ++++++++++------ kernel/exit.c | 17 +++++++++++------ lib/strncpy_from_user.c | 6 ++++-- lib/strnlen_user.c | 6 ++++-- lib/usercopy.c | 8 +++++--- 9 files changed, 59 insertions(+), 34 deletions(-)