mbox series

[v2,0/9] ARM: remove set_fs callers and implementation

Message ID 20200918124624.1469673-1-arnd@arndb.de (mailing list archive)
Headers show
Series ARM: remove set_fs callers and implementation | expand

Message

Arnd Bergmann Sept. 18, 2020, 12:46 p.m. UTC
Hi Christoph, Russell,

Here is an updated series for removing set_fs() from arch/arm,
based on the previous feedback.

I have tested the oabi-compat changes using the LTP tests for the three
modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
and I have lightly tested the get_kernel_nofault infrastructure by
loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.

     Arnd

Arnd Bergmann (9):
  mm/maccess: fix unaligned copy_{from,to}_kernel_nofault
  ARM: traps: use get_kernel_nofault instead of set_fs()
  ARM: oabi-compat: add epoll_pwait handler
  ARM: syscall: always store thread_info->syscall
  ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation
  ARM: oabi-compat: rework sys_semtimedop emulation
  ARM: oabi-compat: rework fcntl64() emulation
  ARM: uaccess: add __{get,put}_kernel_nofault
  ARM: uaccess: remove set_fs() implementation

 arch/arm/Kconfig                   |   1 -
 arch/arm/include/asm/ptrace.h      |   1 -
 arch/arm/include/asm/syscall.h     |  16 ++-
 arch/arm/include/asm/thread_info.h |   4 -
 arch/arm/include/asm/uaccess-asm.h |   6 -
 arch/arm/include/asm/uaccess.h     | 169 ++++++++++++++-------------
 arch/arm/kernel/asm-offsets.c      |   3 +-
 arch/arm/kernel/entry-common.S     |  16 +--
 arch/arm/kernel/process.c          |   7 +-
 arch/arm/kernel/ptrace.c           |   4 +-
 arch/arm/kernel/signal.c           |   8 --
 arch/arm/kernel/sys_oabi-compat.c  | 181 ++++++++++++++++-------------
 arch/arm/kernel/traps.c            |  47 +++-----
 arch/arm/lib/copy_from_user.S      |   3 +-
 arch/arm/lib/copy_to_user.S        |   3 +-
 arch/arm/tools/syscall.tbl         |   2 +-
 fs/eventpoll.c                     |   5 +-
 include/linux/eventpoll.h          |  18 +++
 include/linux/syscalls.h           |   3 +
 ipc/sem.c                          |  84 ++++++++-----
 mm/maccess.c                       |  28 ++++-
 21 files changed, 328 insertions(+), 281 deletions(-)

Comments

Christoph Hellwig Sept. 19, 2020, 5:27 a.m. UTC | #1
On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> Hi Christoph, Russell,
> 
> Here is an updated series for removing set_fs() from arch/arm,
> based on the previous feedback.
> 
> I have tested the oabi-compat changes using the LTP tests for the three
> modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> and I have lightly tested the get_kernel_nofault infrastructure by
> loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.

What is the base line?  Just the base.set_fs branch in Als tree, or do
you need anything from my RISC-V series?
Russell King (Oracle) Sept. 19, 2020, 8:19 a.m. UTC | #2
On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> Hi Christoph, Russell,
> 
> Here is an updated series for removing set_fs() from arch/arm,
> based on the previous feedback.
> 
> I have tested the oabi-compat changes using the LTP tests for the three
> modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> and I have lightly tested the get_kernel_nofault infrastructure by
> loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.

I'm not too keen on always saving the syscall number, but for the gain
of getting rid of set_fs() I think it's worth it. However...

I think there are some things to check - what value do you end up
with as the first number in /proc/self/syscall when you do:

strace cat /proc/self/syscall

?

It should be 3, not 0x900003. I suspect you're getting the latter
with these changes.  IIRC, task_thread_info(task)->syscall needs to
be the value _without_ the offset, otherwise tracing will break.
Arnd Bergmann Sept. 25, 2020, 1:40 p.m. UTC | #3
On Sat, Sep 19, 2020 at 7:27 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> > Hi Christoph, Russell,
> >
> > Here is an updated series for removing set_fs() from arch/arm,
> > based on the previous feedback.
> >
> > I have tested the oabi-compat changes using the LTP tests for the three
> > modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> > and I have lightly tested the get_kernel_nofault infrastructure by
> > loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.
>
> What is the base line?  Just the base.set_fs branch in Als tree, or do
> you need anything from my RISC-V series?

I imported these additional patches from you:

e0d17576790e quota: simplify the quotactl compat handling
b0f8a0c4046f compat: add a compat_need_64bit_alignment_fixup() helper
ed8af9335e19 compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
ce526c75bbe2 uaccess: provide a generic TASK_SIZE_MAX definition

I think I only actually needed the last one of those for the Arm
patches, the other ones are dependencies for my other patches
I have on the same branch.

      Arnd
Arnd Bergmann Sept. 25, 2020, 2:08 p.m. UTC | #4
On Sat, Sep 19, 2020 at 10:19 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> > Hi Christoph, Russell,
> >
> > Here is an updated series for removing set_fs() from arch/arm,
> > based on the previous feedback.
> >
> > I have tested the oabi-compat changes using the LTP tests for the three
> > modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> > and I have lightly tested the get_kernel_nofault infrastructure by
> > loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.
>
> I'm not too keen on always saving the syscall number, but for the gain
> of getting rid of set_fs() I think it's worth it. However...
>
> I think there are some things to check - what value do you end up
> with as the first number in /proc/self/syscall when you do:
>
> strace cat /proc/self/syscall
>
> ?

> It should be 3, not 0x900003. I suspect you're getting the latter
> with these changes.  IIRC, task_thread_info(task)->syscall needs to
> be the value _without_ the offset, otherwise tracing will break.

It seems broken in different ways, depending on the combination
of kernel and userland:

1. EABI armv5-versatile kernel, EABI Debian 5:
$ cat /proc/self/syscall
0 0x1500000000003 0x1500000000400 0x1500000000400 0x60000013c7800480
0xc0008668c0112f8c 0xc0112d14c68e1f68 0xbeab06f8 0xb6e80d4c
$ strace -f cat /proc/self/syscall
execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) =
-1 EINTR (Interrupted system call)
dup(2)                                  = -1 EINTR (Interrupted system call)
write(2, "strace: exec: Interrupted system "..., 38) = -1 EINTR
(Interrupted system call)
exit_group(1)                           = ?

2. EABI kernel, OABI Debian 5:
$ cat /proc/self/syscall
3 0x1500000000003 0x13ccc00000400 0x1500000000400 0x60000013c7800480
0xc0008de0c0112f8c 0xc0112d14c7313f68 0xbeed27d0 0xb6eab324
$ strace cat /proc/self/syscall
execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) = -1090648236
--- SIGILL (Illegal instruction) @ 0 (0) ---
+++ killed by SIGILL +++

3. OABI kernel, OABI Debian 5:
 cat /proc/self/syscall
9437187 0x1500000000003 0x13ccc00000400 0x1500000000400 0x100060000013
0x15000c72cff6c 0xc72cfe9000000000 0xbece27d0 0xb6f2f324
$ strace cat /proc/self/syscall
execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) = -1095141548
--- SIGILL (Illegal instruction) @ 0 (0) ---
+++ killed by SIGILL +++

I suspect the OABI strace in Debian is broken since it crashes on
both kernels. I'll look into fixing the output without strace first then.

       Arnd
Arnd Bergmann Sept. 25, 2020, 3:30 p.m. UTC | #5
On Fri, Sep 25, 2020 at 4:08 PM Arnd Bergmann <arnd@arndb.de> wrote:
> On Sat, Sep 19, 2020 at 10:19 AM Russell King - ARM Linux admin
> <linux@armlinux.org.uk> wrote:
> >
> > On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> > > Hi Christoph, Russell,
> > >
> > > Here is an updated series for removing set_fs() from arch/arm,
> > > based on the previous feedback.
> > >
> > > I have tested the oabi-compat changes using the LTP tests for the three
> > > modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> > > and I have lightly tested the get_kernel_nofault infrastructure by
> > > loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.
> >
> > I'm not too keen on always saving the syscall number, but for the gain
> > of getting rid of set_fs() I think it's worth it. However...
> >
> > I think there are some things to check - what value do you end up
> > with as the first number in /proc/self/syscall when you do:
> >
> > strace cat /proc/self/syscall
> >
> > ?
>
> > It should be 3, not 0x900003. I suspect you're getting the latter
> > with these changes.  IIRC, task_thread_info(task)->syscall needs to
> > be the value _without_ the offset, otherwise tracing will break.
>
> It seems broken in different ways, depending on the combination
> of kernel and userland:
>
> 1. EABI armv5-versatile kernel, EABI Debian 5:
> $ cat /proc/self/syscall
> 0 0x1500000000003 0x1500000000400 0x1500000000400 0x60000013c7800480
> 0xc0008668c0112f8c 0xc0112d14c68e1f68 0xbeab06f8 0xb6e80d4c
> $ strace -f cat /proc/self/syscall
> execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) =
> -1 EINTR (Interrupted system call)
> dup(2)                                  = -1 EINTR (Interrupted system call)
> write(2, "strace: exec: Interrupted system "..., 38) = -1 EINTR
> (Interrupted system call)
> exit_group(1)                           = ?

Both the missing number and the broken strace are fixed with

diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 610e32273c81..2c0bde14fba6 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -226,7 +226,8 @@ ENTRY(vector_swi)
         * get the old ABI syscall table address.
         */
        bics    r10, r10, #0xff000000
-       str     r10, [tsk, #TI_SYSCALL]
+       strne   r10, [tsk, #TI_SYSCALL]
+       streq   scno, [tsk, #TI_SYSCALL]
        eorne   scno, r10, #__NR_OABI_SYSCALL_BASE
        ldrne   tbl, =sys_oabi_call_table
 #elif !defined(CONFIG_AEABI)

It was already working with CONFIG_AEABI=y and
CONFIG_OABI_COMPAT=n

> 2. EABI kernel, OABI Debian 5:
> $ cat /proc/self/syscall
> 3 0x1500000000003 0x13ccc00000400 0x1500000000400 0x60000013c7800480
> 0xc0008de0c0112f8c 0xc0112d14c7313f68 0xbeed27d0 0xb6eab324
> $ strace cat /proc/self/syscall
> execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) = -1090648236
> --- SIGILL (Illegal instruction) @ 0 (0) ---
> +++ killed by SIGILL +++

This was caused by me after all, here is my fix:

--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -25,6 +25,7 @@
 #include <linux/tracehook.h>
 #include <linux/unistd.h>

+#include <asm/syscall.h>
 #include <asm/traps.h>

 #define CREATE_TRACE_POINTS
@@ -898,11 +899,11 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
                return -1;
 #else
        /* XXX: remove this once OABI gets fixed */
-       secure_computing_strict(current_thread_info()->syscall);
+       secure_computing_strict(syscall_get_nr(current, regs));
 #endif

        /* Tracer or seccomp may have changed syscall. */
-       scno = current_thread_info()->syscall;
+       scno = syscall_get_nr(current, regs);

        if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
                trace_sys_enter(regs, scno);

> 3. OABI kernel, OABI Debian 5:
>  cat /proc/self/syscall
> 9437187 0x1500000000003 0x13ccc00000400 0x1500000000400 0x100060000013
> 0x15000c72cff6c 0xc72cfe9000000000 0xbece27d0 0xb6f2f324

This one is fixed by

--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -22,7 +22,7 @@ extern const unsigned long sys_call_table[];
 static inline int syscall_get_nr(struct task_struct *task,
                                 struct pt_regs *regs)
 {
-       if (!IS_ENABLED(CONFIG_OABI_COMPAT))
+       if (IS_ENABLED(CONFIG_AEABI) && !IS_ENABLED(CONFIG_OABI_COMPAT))
                return task_thread_info(task)->syscall;

        return task_thread_info(task)->syscall & ~__NR_OABI_SYSCALL_BASE;



I'll send an updated patch once I've addressed Christoph's comments.

      Arnd
Christoph Hellwig Sept. 26, 2020, 6:49 a.m. UTC | #6
On Fri, Sep 25, 2020 at 03:40:06PM +0200, Arnd Bergmann wrote:
> e0d17576790e quota: simplify the quotactl compat handling
> b0f8a0c4046f compat: add a compat_need_64bit_alignment_fixup() helper
> ed8af9335e19 compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
> ce526c75bbe2 uaccess: provide a generic TASK_SIZE_MAX definition
> 
> I think I only actually needed the last one of those for the Arm
> patches, the other ones are dependencies for my other patches
> I have on the same branch.

I still haven't gotten anyone to pick up the TASK_SIZE_MAX one.
So if you want to minimize dependencies just define TASK_SIZE_MAX
in the arm code for now, and we can remove redundant definitions later.
Christoph Hellwig July 5, 2021, 6:01 a.m. UTC | #7
Just a ping if we're making some progress on this.  The m68knommu
set_fs implementation is actively misleading, which lead to a problem
this merge window so I'd really like to see this set merged.

Also your improvements to the copy_from_kernel_nofaul loop would be
Arnd Bergmann July 22, 2021, 5:27 p.m. UTC | #8
On Mon, Jul 5, 2021 at 8:01 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> Just a ping if we're making some progress on this.  The m68knommu
> set_fs implementation is actively misleading, which lead to a problem
> this merge window so I'd really like to see this set merged.
>
> Also your improvements to the copy_from_kernel_nofaul loop would be

I've rebased the series now, and rearranged a little in order to address
Russell's concerns. I still have one regression that I need to fix before
sending the new version.

        Arnd