Message ID | 20210115070326.294332-1-Sonicadvance1@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Adds a new ioctl32 syscall for backwards compatibility layers |
From: sonicadvance1@gmail.com
> Sent: 15 January 2021 07:03
> Problem presented:
> A backwards compatibility layer that allows running x86-64 and x86
> processes inside of an AArch64 process.
>   - CPU is emulated
>   - Syscall interface is mostly passthrough
>   - Some syscalls require patching or emulation depending on behaviour
>   - Not viable from the emulator design to use an AArch32 host process
>

You are going to need to add all the x86 compatibility code into
your arm64 kernel.
This is likely to be different from the 32bit arm compatibility
because 64bit items are only aligned on 32bit boundaries.
The x86 x32 compatibility will be more like the 32bit arm 'compat'
code - I'm pretty sure arm32 64bit aligned 64bit data.

You'll then need to remember how the process entered the kernel
to work out which compatibility code to invoke.
This is what x86 does.
It allows a single process to do all three types of system call.

Trying to 'patch up' structures outside the kernel, or in the
syscall interface code will always cause grief somewhere.
The only sane place is in the code that uses the structures.
Which, for ioctls, means inside the driver that parses them.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
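The alignment point in the message above can be illustrated with a small layout sketch. The struct and type names below are hypothetical and do not come from the patch. The reduced-alignment typedef is a GCC/Clang extension, similar in spirit to how the x86 compat headers describe 64-bit fields for i386 callers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument as a native 64-bit ABI lays it out.
 * On typical 64-bit ABIs, 4 bytes of padding precede 'offset'. */
struct demo_arg_native {
	uint32_t flags;
	uint64_t offset;
};

/* i386 aligns 64-bit integers inside structs on 4-byte boundaries.
 * A typedef with reduced alignment reproduces that layout on any host
 * (GCC/Clang extension; the name is an assumption for this sketch). */
typedef uint64_t __attribute__((aligned(4))) demo_u64_i386;

/* The same argument as an i386 guest would lay it out: no padding. */
struct demo_arg_i386 {
	uint32_t flags;
	demo_u64_i386 offset;
};

/* Compile-time checks documenting the i386 layout. */
_Static_assert(offsetof(struct demo_arg_i386, offset) == 4,
	       "i386 places the 64-bit member directly after the 32-bit one");
_Static_assert(sizeof(struct demo_arg_i386) == 12,
	       "i386 layout has no tail padding");
```

Any structure containing such a member has a different size and different member offsets in the two ABIs, so it cannot be passed through unchanged.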
On Fri, Jan 15, 2021 at 9:01 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: sonicadvance1@gmail.com
> > Sent: 15 January 2021 07:03
> > Problem presented:
> > A backwards compatibility layer that allows running x86-64 and x86
> > processes inside of an AArch64 process.
> >   - CPU is emulated
> >   - Syscall interface is mostly passthrough
> >   - Some syscalls require patching or emulation depending on behaviour
> >   - Not viable from the emulator design to use an AArch32 host process
> >
>
> You are going to need to add all the x86 compatibility code into
> your arm64 kernel.
> This is likely to be different from the 32bit arm compatibility
> because 64bit items are only aligned on 32bit boundaries.
> The x86 x32 compatibility will be more like the 32bit arm 'compat'
> code - I'm pretty sure arm32 64bit aligned 64bit data.

All other architectures that have both 32-bit and 64-bit variants
use the same alignment for all types, except for x86.

There are additional differences though, especially if one
were to try to generalize the interface to all architectures.
A subset of the issues includes

- x32 has 64-bit types in places of some types that are
  32 bit everywhere else (time_t, ino_t, off_t, clock_t, ...)

- m68k aligns struct members to at most 16 bits

- uid_t/gid_t/ino_t/dev_t/... are

> You'll then need to remember how the process entered the kernel
> to work out which compatibility code to invoke.
> This is what x86 does.
> It allows a single process to do all three types of system call.
>
> Trying to 'patch up' structures outside the kernel, or in the
> syscall interface code will always cause grief somewhere.
> The only sane place is in the code that uses the structures.
> Which, for ioctls, means inside the driver that parses them.

He's already doing the system call emulation for all the system
calls other than ioctl in user space though. In my experience,
there are actually fairly few ioctl commands that are different
between architectures -- most of them have no misaligned
or architecture-defined struct members at all.

Once you have conversion functions to deal with the 32/64-bit
interface differences and architecture specifics of sockets,
sysvipc, signals, stat, and input_event, handling the
x86-32 specific ioctl commands is comparably easy.

      Arnd
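As a rough illustration of the "conversion functions" mentioned above, here is a minimal userspace sketch that widens and narrows one argument structure between a 32-bit guest layout and the native layout. The structure and function names are hypothetical, chosen only for this example; they do not correspond to any real ioctl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guest (32-bit) layout: pointers and longs are 32 bits wide. */
struct demo_req32 {
	uint32_t buf;	/* guest pointer, stored as an integer */
	int32_t  len;	/* guest 'long' */
};

/* The same request in the host's native layout. */
struct demo_req {
	void *buf;
	long  len;
};

/* Guest -> host: zero-extend the pointer, sign-extend the long. */
void demo_req_from32(struct demo_req *dst, const struct demo_req32 *src)
{
	dst->buf = (void *)(uintptr_t)src->buf;
	dst->len = src->len;
}

/* Host -> guest: only valid if the values still fit in 32 bits. */
void demo_req_to32(struct demo_req32 *dst, const struct demo_req *src)
{
	assert((uintptr_t)src->buf <= UINT32_MAX);
	assert(src->len >= INT32_MIN && src->len <= INT32_MAX);
	dst->buf = (uint32_t)(uintptr_t)src->buf;
	dst->len = (int32_t)src->len;
}
```

An emulator would call demo_req_from32() before issuing the native ioctl and demo_req_to32() afterwards for commands that write results back. The narrowing direction is where the assumptions live: the emulator has to guarantee that host pointers handed back to the guest fit in 32 bits.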
On Fri, Jan 15, 2021 at 11:17:09PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 15, 2021 at 9:01 PM David Laight <David.Laight@aculab.com> wrote:
> >
> > From: sonicadvance1@gmail.com
> > > Sent: 15 January 2021 07:03
> > > Problem presented:
> > > A backwards compatibility layer that allows running x86-64 and x86
> > > processes inside of an AArch64 process.
> > >   - CPU is emulated
> > >   - Syscall interface is mostly passthrough
> > >   - Some syscalls require patching or emulation depending on behaviour
> > >   - Not viable from the emulator design to use an AArch32 host process
> > >
> >
> > You are going to need to add all the x86 compatibility code into
> > your arm64 kernel.
> > This is likely to be different from the 32bit arm compatibility
> > because 64bit items are only aligned on 32bit boundaries.
> > The x86 x32 compatibility will be more like the 32bit arm 'compat'
> > code - I'm pretty sure arm32 64bit aligned 64bit data.
>
> All other architectures that have both 32-bit and 64-bit variants
> use the same alignment for all types, except for x86.
>
> There are additional differences though, especially if one
> were to try to generalize the interface to all architectures.
> A subset of the issues includes
>
> - x32 has 64-bit types in places of some types that are
>   32 bit everywhere else (time_t, ino_t, off_t, clock_t, ...)
>
> - m68k aligns struct members to at most 16 bits
>
> - uid_t/gid_t/ino_t/dev_t/... are
>
> > You'll then need to remember how the process entered the kernel
> > to work out which compatibility code to invoke.
> > This is what x86 does.
> > It allows a single process to do all three types of system call.
> >
> > Trying to 'patch up' structures outside the kernel, or in the
> > syscall interface code will always cause grief somewhere.
> > The only sane place is in the code that uses the structures.
> > Which, for ioctls, means inside the driver that parses them.
>
> He's already doing the system call emulation for all the system
> calls other than ioctl in user space though. In my experience,
> there are actually fairly few ioctl commands that are different
> between architectures -- most of them have no misaligned
> or architecture-defined struct members at all.
>
> Once you have conversion functions to deal with the 32/64-bit
> interface differences and architecture specifics of sockets,
> sysvipc, signals, stat, and input_event, handling the
> x86-32 specific ioctl commands is comparably easy.

Indeed, all of this should just be done in userspace. Note (as you of
course know, but others on CC probably don't) that we did this in musl
libc for the sake of being able to run a time64 userspace on a
pre-time64 kernel, with translation from the new time64 ioctl
structures to the versions needed by the old ioctls and back using a
fairly simple table:

https://git.musl-libc.org/cgit/musl/tree/src/misc/ioctl.c?id=v1.2.2

I imagine there's a fair bit more to be done for 32-/64-bit mismatch
in size/long/pointer types and different alignments, but the problem
is almost certainly tractable, and much easier than what they already
have to be doing for syscalls.

Rich
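The table-driven approach described above can be sketched roughly as follows. This is not the musl implementation linked in the message; the command numbers, structure names, and helper functions are all made up for illustration, and a real table would carry more metadata (direction, sizes, field lists).

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*demo_conv_fn)(void *dst, const void *src);

/* One table row per ioctl whose argument layout differs (hypothetical). */
struct demo_ioctl_map {
	unsigned long guest_cmd;	/* number the 32-bit guest issues */
	unsigned long host_cmd;		/* number the 64-bit host expects */
	demo_conv_fn  to_host;		/* guest struct -> host struct */
	demo_conv_fn  to_guest;		/* host struct -> guest struct */
};

struct demo_ts32 { int32_t sec; int32_t nsec; };	/* guest layout */
struct demo_ts64 { int64_t sec; int64_t nsec; };	/* host layout */

void demo_ts_to_host(void *dst, const void *src)
{
	const struct demo_ts32 *g = src;
	struct demo_ts64 *h = dst;

	h->sec = g->sec;	/* sign-extends */
	h->nsec = g->nsec;
}

void demo_ts_to_guest(void *dst, const void *src)
{
	const struct demo_ts64 *h = src;
	struct demo_ts32 *g = dst;

	g->sec = (int32_t)h->sec;	/* truncates; caller must check range */
	g->nsec = (int32_t)h->nsec;
}

/* Command numbers here are placeholders, not real ioctl numbers. */
static const struct demo_ioctl_map demo_table[] = {
	{ 0x1001, 0x2001, demo_ts_to_host, demo_ts_to_guest },
};

/* Returns the matching row, or NULL when the ioctl needs no translation. */
const struct demo_ioctl_map *demo_lookup(unsigned long guest_cmd)
{
	for (size_t i = 0; i < sizeof demo_table / sizeof demo_table[0]; i++)
		if (demo_table[i].guest_cmd == guest_cmd)
			return &demo_table[i];
	return NULL;
}
```

Commands that are absent from the table are passed through unchanged, which matches the observation in the thread that most ioctl commands have identical layouts across architectures.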
...
> He's already doing the system call emulation for all the system
> calls other than ioctl in user space though. In my experience,
> there are actually fairly few ioctl commands that are different
> between architectures -- most of them have no misaligned
> or architecture-defined struct members at all.

Aren't there also some intractable issues with socket options?
IIRC the kernel code that tried to change them to 64bit was
horribly broken in some obscure cases.

Pushing the conversion down the stack not only identified
the issues, it also made them easier to fix.

If you change the kernel so a 64bit process can execute 32bit
system calls then a lot of the problems do go away.
This is probably easiest done by setting a high bit on the
system call number - as x86_64 does for x32 calls.

You still have to solve the different alignment of 64bit data
on i386.

Of course the system call numbers are different - but that is
just a lookup.

	David
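The "high bit on the system call number" idea can be sketched as a decode step. The bit value below matches __X32_SYSCALL_BIT from the x86 UAPI headers; everything else (the enum, the struct, the function name) is hypothetical and only shows the shape of such a dispatch, not how any kernel entry path is actually written.

```c
/* x86_64 marks x32 system calls by OR-ing this bit into the number
 * (the value of __X32_SYSCALL_BIT in the x86 UAPI headers). */
#define DEMO_SYSCALL_ABI_BIT 0x40000000UL

/* Hypothetical ABI tags a kernel entry path might distinguish. */
enum demo_abi { DEMO_ABI_NATIVE, DEMO_ABI_COMPAT };

struct demo_call {
	enum demo_abi abi;	/* which set of handlers to use */
	unsigned long nr;	/* number with the flag bit stripped */
};

/* Split a raw syscall number into the ABI selector and the table index. */
struct demo_call demo_decode(unsigned long raw_nr)
{
	struct demo_call c;

	c.abi = (raw_nr & DEMO_SYSCALL_ABI_BIT) ? DEMO_ABI_COMPAT
						: DEMO_ABI_NATIVE;
	c.nr = raw_nr & ~DEMO_SYSCALL_ABI_BIT;
	return c;
}
```

With a scheme like this a single 64-bit process can issue both kinds of call, and the remaining work is the per-ABI number lookup and the structure layout handling discussed earlier in the thread.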
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index a6617067dbe6..81e70fd241d7 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -481,3 +481,4 @@
 549	common	faccessat2	sys_faccessat2
 550	common	process_madvise	sys_process_madvise
 551	common	epoll_pwait2	sys_epoll_pwait2
+552	common	ioctl32	sys_ni_syscall
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 20e1170e2e0a..98fbf1af1169 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -455,3 +455,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2
+442	common	ioctl32	sys_ni_syscall
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 86a9d7b3eabe..949788f5ba40 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		442
+#define __NR_compat_syscalls		443
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index cccfbbefbf95..35e3bc83dbdc 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -891,6 +891,8 @@ __SYSCALL(__NR_faccessat2, sys_faccessat2)
 __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SYSCALL(__NR_epoll_pwait2, compat_sys_epoll_pwait2)
+#define __NR_ioctl32 442
+__SYSCALL(__NR_ioctl32, compat_sys_ioctl)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index bfc00f2bd437..087fc9627357 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -362,3 +362,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2
+442	common	sys_ioctl32	sys_ioctl32
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 7fe4e45c864c..502b2f87ab60 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -441,3 +441,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2
+442	common	ioctl32	sys_ni_syscall
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index a522adf194ab..e69be6c836d2 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -447,3 +447,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2
+442	common	ioctl32	sys_ni_syscall
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 0f03ad223f33..ba395218446f 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -380,3 +380,4 @@
 439	n32	faccessat2	sys_faccessat2
 440	n32	process_madvise	sys_process_madvise
 441	n32	epoll_pwait2	compat_sys_epoll_pwait2
+442	n32	ioctl32	sys_ni_syscall
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 91649690b52f..f42f939702e2 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -356,3 +356,5 @@
 439	n64	faccessat2	sys_faccessat2
 440	n64	process_madvise	sys_process_madvise
 441	n64	epoll_pwait2	sys_epoll_pwait2
+441	n64	epoll_pwait2	sys_epoll_pwait2
+442	n64	ioctl32	sys_ioctl32
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 4bad0c40aed6..b08ff6066f06 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -429,3 +429,4 @@
 439	o32	faccessat2	sys_faccessat2
 440	o32	process_madvise	sys_process_madvise
 441	o32	epoll_pwait2	sys_epoll_pwait2	compat_sys_epoll_pwait2
+442	o32	ioctl32	sys_ni_syscall
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 6bcc31966b44..84d2b88d92fa 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2	compat_sys_epoll_pwait2
+442	64	ioctl32	sys_ioctl32
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index f744eb5cba88..9f04d73cf649 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -531,3 +531,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2	compat_sys_epoll_pwait2
+442	64	sys_ioctl32	sys_ioctl32
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index d443423495e5..2c90c0ecb5c7 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -444,3 +444,4 @@
 439	common	faccessat2	sys_faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2	compat_sys_epoll_pwait2
+442	64	sys_ioctl32	sys_ni_syscall
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 9df40ac0ebc0..1e02a13fa049 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -444,3 +444,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2
+442	common	ioctl32	sys_ni_syscall
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 40d8c7cd8298..f7d24678d0b1 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -487,3 +487,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2	compat_sys_epoll_pwait2
+442	64	sys_ioctl32	sys_ioctl32
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 874aeacde2dd..b1a3461e1e20 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -446,3 +446,4 @@
 439	i386	faccessat2	sys_faccessat2
 440	i386	process_madvise	sys_process_madvise
 441	i386	epoll_pwait2	sys_epoll_pwait2	compat_sys_epoll_pwait2
+442	i386	ioctl32	sys_ni_syscall
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 78672124d28b..0250a04df0df 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -363,6 +363,7 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2
+442	64	ioctl32	sys_ioctl32
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 46116a28eeed..34b653b36b7b 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -412,3 +412,4 @@
 439	common	faccessat2	sys_faccessat2
 440	common	process_madvise	sys_process_madvise
 441	common	epoll_pwait2	sys_epoll_pwait2
+442	common	ioctl32	sys_ni_syscall
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 4e6cc0a7d69c..7b324a21a257 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -790,8 +790,8 @@ long compat_ptr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 }
 EXPORT_SYMBOL(compat_ptr_ioctl);
 
-COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
-		       compat_ulong_t, arg)
+long do_ioctl32(unsigned int fd, unsigned int cmd,
+			compat_ulong_t arg)
 {
 	struct fd f = fdget(fd);
 	int error;
@@ -850,4 +850,18 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
 
 	return error;
 }
+
+COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
+		       compat_ulong_t, arg)
+{
+	return do_ioctl32(fd, cmd, arg);
+}
+
+#if BITS_PER_LONG == 64
+SYSCALL_DEFINE3(ioctl32, unsigned int, fd, unsigned int, cmd,
+			compat_ulong_t, arg)
+{
+	return do_ioctl32(fd, cmd, arg);
+}
+#endif
 #endif
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f3929aff39cf..fb7bac17167a 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -386,6 +386,10 @@ asmlinkage long sys_inotify_rm_watch(int fd, __s32 wd);
 /* fs/ioctl.c */
 asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd,
 				unsigned long arg);
+#if defined(CONFIG_COMPAT) && BITS_PER_LONG == 64
+asmlinkage long sys_ioctl32(unsigned int fd, unsigned int cmd,
+				compat_ulong_t arg);
+#endif
 
 /* fs/ioprio.c */
 asmlinkage long sys_ioprio_set(int which, int who, int ioprio);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 728752917785..18279e5b7b4f 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -862,8 +862,15 @@ __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)
 
+#define __NR_ioctl32 442
+#ifdef CONFIG_COMPAT
+__SC_COMP(__NR_ioctl32, sys_ioctl32, compat_sys_ioctl)
+#else
+__SC_COMP(__NR_ioctl32, sys_ni_syscall, sys_ni_syscall)
+#endif
+
 #undef __NR_syscalls
-#define __NR_syscalls 442
+#define __NR_syscalls 443
 
 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 19aa806890d5..5a2f25eb341c 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -302,6 +302,9 @@ COND_SYSCALL(recvmmsg_time32);
 COND_SYSCALL_COMPAT(recvmmsg_time32);
 COND_SYSCALL_COMPAT(recvmmsg_time64);
 
+COND_SYSCALL(ioctl32);
+COND_SYSCALL_COMPAT(ioctl32);
+
 /*
  * Architecture specific syscalls: see further below
  */
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 728752917785..18279e5b7b4f 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -862,8 +862,15 @@ __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)
 
+#define __NR_ioctl32 442
+#ifdef CONFIG_COMPAT
+__SC_COMP(__NR_ioctl32, sys_ioctl32, compat_sys_ioctl)
+#else
+__SC_COMP(__NR_ioctl32, sys_ni_syscall, sys_ni_syscall)
+#endif
+
 #undef __NR_syscalls
-#define __NR_syscalls 442
+#define __NR_syscalls 443
 
 /*
  * 32 bit systems traditionally used different