Message ID | 1446507046-24604-14-git-send-email-ynorov@caviumnetworks.com (mailing list archive) |
---|---|
State | New, archived |
On Tuesday 03 November 2015 02:30:42 Yury Norov wrote:
> From: Andrew Pinski <apinski@cavium.com>
>
> Add a separate syscall-table for ILP32, which dispatches either to native
> LP64 system call implementation or to compat-syscalls, as appropriate.

The uapi/asm-generic/unistd.h already contains a list of compat syscalls
that should work by default, I think it would be better to use that
list and override only the ones that differ between normal compat
mode and the new mode, e.g. when you require a wrapper or want to
use the native syscall entry.

> +/* We need to make sure the pointer gets copied correctly. */
> +asmlinkage long ilp32_sys_mq_notify(mqd_t mqdes, const struct sigevent __user *u_notification)
> +{
> +	struct sigevent __user *p = NULL;
> +	if (u_notification) {
> +		struct sigevent n;
> +		p = compat_alloc_user_space(sizeof(*p));
> +		if (copy_from_user(&n, u_notification, sizeof(*p)))
> +			return -EFAULT;
> +		if (n.sigev_notify == SIGEV_THREAD)
> +			n.sigev_value.sival_ptr = compat_ptr((uintptr_t)n.sigev_value.sival_ptr);
> +		if (copy_to_user(p, &n, sizeof(*p)))
> +			return -EFAULT;
> +	}
> +	return sys_mq_notify(mqdes, p);
> +}

Could this be avoided by defining sigval_t in a way that is compatible?

> +/* sigevent contains sigval_t which is now 64bit always
> +   but need special handling due to padding for SIGEV_THREAD. */
> +#define sys_mq_notify ilp32_sys_mq_notify
> +
> +/* sigaltstack needs some special handling as the
> +   padding for stack_t might not be non-zero. */
> +long ilp32_sys_sigaltstack(const stack_t __user *uss_ptr,
> +		stack_t __user *uoss_ptr)

asmlinkage?

	Arnd
Yury Norov <ynorov@caviumnetworks.com> writes: > +#define sys_clock_gettime compat_sys_clock_gettime > +#define sys_clock_settime compat_sys_clock_settime You also need to redirect sys_clock_nanosleep. Andreas.
On Wednesday 11 November 2015 18:54:00 Andreas Schwab wrote: > Yury Norov <ynorov@caviumnetworks.com> writes: > > > +#define sys_clock_gettime compat_sys_clock_gettime > > +#define sys_clock_settime compat_sys_clock_settime > > You also need to redirect sys_clock_nanosleep. Note that based on my comment, that table would be turned around, and only the syscalls get overridden that do not have the normal compat mode behavior (mostly the ones that pass a 64-bit register). Arnd
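Editorial sketch of the "turned around" table described above: every syscall number resolves to the compat entry point by default, and only a short list of exceptions is redirected to a native handler. This is a minimal user-space illustration, not the kernel's actual table code; all `demo_*` names, the table size, and the syscall numbers 3 and 5 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_SYSCALLS 8 /* hypothetical table size */

typedef long (*demo_syscall_fn)(void);

/* Stand-ins for a compat_sys_* handler and a native sys_* handler. */
static long demo_compat_entry(void) { return 1; }
static long demo_native_entry(void) { return 2; }

/* Only the syscalls that differ from normal compat behaviour are listed. */
struct demo_override {
	unsigned int nr;
	demo_syscall_fn fn;
};

static const struct demo_override demo_overrides[] = {
	{ 3, demo_native_entry }, /* hypothetical: passes a 64-bit register */
	{ 5, demo_native_entry }, /* hypothetical: passes a 64-bit register */
};

/* Return the override if one exists, otherwise the compat default. */
static demo_syscall_fn demo_lookup(unsigned int nr)
{
	size_t i;

	for (i = 0; i < sizeof(demo_overrides) / sizeof(demo_overrides[0]); i++)
		if (demo_overrides[i].nr == nr)
			return demo_overrides[i].fn;
	return demo_compat_entry;
}

static long demo_dispatch(unsigned int nr)
{
	if (nr >= DEMO_NR_SYSCALLS)
		return -1; /* stands in for -ENOSYS */
	return demo_lookup(nr)();
}

/* Small self-check: defaults go to compat, overrides go to native. */
static void demo_table_check(void)
{
	assert(demo_dispatch(0) == 1);
	assert(demo_dispatch(3) == 2);
	assert(demo_dispatch(99) == -1);
}
```

With this shape, a missed redirect such as sys_clock_nanosleep falls back to the compat handler automatically instead of silently using the native one.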
Arnd Bergmann <arnd@arndb.de> writes:

> On Wednesday 11 November 2015 18:54:00 Andreas Schwab wrote:
>> Yury Norov <ynorov@caviumnetworks.com> writes:
>>
>> > +#define sys_clock_gettime compat_sys_clock_gettime
>> > +#define sys_clock_settime compat_sys_clock_settime
>>
>> You also need to redirect sys_clock_nanosleep.
>
> Note that based on my comment, that table would be turned around, and
> only the syscalls get overridden that do not have the normal
> compat mode behavior (mostly the ones that pass a 64-bit register).

Is it intended that _all_ off_t-like syscalls are implemented by the
64bit variants? Currently, this isn't fully implemented (lseek is
implemented by sys_llseek and mmap by an mmap2 wrapper).

Andreas.
On Thursday 12 November 2015 00:07:41 Andreas Schwab wrote: > Arnd Bergmann <arnd@arndb.de> writes: > > > On Wednesday 11 November 2015 18:54:00 Andreas Schwab wrote: > >> Yury Norov <ynorov@caviumnetworks.com> writes: > >> > >> > +#define sys_clock_gettime compat_sys_clock_gettime > >> > +#define sys_clock_settime compat_sys_clock_settime > >> > >> You also need to redirect sys_clock_nanosleep. > > > > Note that based on my comment, that table would be turned around, and > > only the syscalls get overridden that do not have the normal > > compat mode behavior (mostly the ones that pass a 64-bit register). > > Is it intented that _all_ off_t-like syscalls are implemented by the > 64bit variants? Currently, this isn't fully implemented (lseek is > implemented by sys_llseek and mmap by an mmap2 wrapper). I think either way is fine for the two examples. I think it's clear that we want __NR_llseek as 62 and __NR_mmap2 as 222. Whether those use the compat_sys_llseek/compat_sys_mmap2_wrapper or sys_lseek/sys_mmap entry points is not overly important, we can use whatever is more convenient to glibc: if we can kill off an architecture specific wrapper function in glibc by adding one line to the kernel, that seems worthwhile. It's a bit confusing, because user space off_t matches the kernel's off_t, loff_t, but not the __kernel_off_t from the uapi headers. Arnd
Arnd Bergmann <arnd@arndb.de> writes: > I think either way is fine for the two examples. I think it's clear > that we want __NR_llseek as 62 and __NR_mmap2 as 222. Whether those > use the compat_sys_llseek/compat_sys_mmap2_wrapper or > sys_lseek/sys_mmap entry points is not overly important, we can use > whatever is more convenient to glibc: if we can kill off an > architecture specific wrapper function in glibc by adding one line > to the kernel, that seems worthwhile. Currently most off_t-like syscalls need a new glibc wrapper since the existing ones are either for 32bit off_t+off64_t (with split off64_t syscall arguments) or pure 64bit off_t architectures. Since ilp32 now (mostly) has 32bit off_t, but 64bit off_t-like syscalls neither of them fit. Andreas.
On Thursday 12 November 2015 09:58:43 Andreas Schwab wrote: > Arnd Bergmann <arnd@arndb.de> writes: > > > I think either way is fine for the two examples. I think it's clear > > that we want __NR_llseek as 62 and __NR_mmap2 as 222. Whether those > > use the compat_sys_llseek/compat_sys_mmap2_wrapper or > > sys_lseek/sys_mmap entry points is not overly important, we can use > > whatever is more convenient to glibc: if we can kill off an > > architecture specific wrapper function in glibc by adding one line > > to the kernel, that seems worthwhile. > > Currently most off_t-like syscalls need a new glibc wrapper since the > existing ones are either for 32bit off_t+off64_t (with split off64_t > syscall arguments) or pure 64bit off_t architectures. Since ilp32 now > (mostly) has 32bit off_t, but 64bit off_t-like syscalls neither of them > fit. What do you mean with 32-bit off_t? Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit __kernel_loff_t? That sounds a bit backwards. I would expect that all new architectures that only have __kernel_loff_t based syscalls but not __kernel_off_t based ones only ever use a 64-bit off_t in libc. Arnd
Arnd Bergmann <arnd@arndb.de> writes: > What do you mean with 32-bit off_t? An ABI with 32-bit off_t, ie. all currently implemented 32-bit ABIs. > Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit > __kernel_loff_t? Glibc is bridging the user-space ABI to the kernel ABI. Andreas.
On Thursday 12 November 2015 10:44:55 Andreas Schwab wrote: > Arnd Bergmann <arnd@arndb.de> writes: > > > What do you mean with 32-bit off_t? > > An ABI with 32-bit off_t, ie. all currently implemented 32-bit ABIs. > > > Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit > > __kernel_loff_t? > > Glibc is bridging the user-space ABI to the kernel ABI. Ok, but why? The kernel headers for all recent architectures (arc, c6x, h8300, hexagon, metag, nios2, openrisc, tile and unicore32) deliberately leave out the __kernel_off_t based system calls to simplify the ABI in a way that we never have to support a 32-bit off_t in user space. Are there programs that require using a 32-bit off_t by default on 32-bit architectures but not on 64-bit architectures? Did the previous version of the ilp32 patch set also emulate this the same way? Arnd
Arnd Bergmann <arnd@arndb.de> writes: > On Thursday 12 November 2015 10:44:55 Andreas Schwab wrote: >> Arnd Bergmann <arnd@arndb.de> writes: >> >> > What do you mean with 32-bit off_t? >> >> An ABI with 32-bit off_t, ie. all currently implemented 32-bit ABIs. >> >> > Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit >> > __kernel_loff_t? >> >> Glibc is bridging the user-space ABI to the kernel ABI. > > Ok, but why? That's how the ABI is defined right now. I didn't make that up. Andreas.
On Thursday 12 November 2015 14:47:18 Andreas Schwab wrote: > Arnd Bergmann <arnd@arndb.de> writes: > > > On Thursday 12 November 2015 10:44:55 Andreas Schwab wrote: > >> Arnd Bergmann <arnd@arndb.de> writes: > >> > >> > What do you mean with 32-bit off_t? > >> > >> An ABI with 32-bit off_t, ie. all currently implemented 32-bit ABIs. > >> > >> > Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit > >> > __kernel_loff_t? > >> > >> Glibc is bridging the user-space ABI to the kernel ABI. > > > > Ok, but why? > > That's how the ABI is defined right now. I didn't make that up. Ok, I guess it will remain a mystery then. Should we perhaps define __ARCH_WANT_SYSCALL_OFF_T for the unistd.h file then, so we provide both the off_t and the loff_t based syscalls? That would avoid the extra wrapper in glibc when using a 32-bit off_t if that is the preferred mode for user space. Arnd
On Fri, Nov 13, 2015 at 7:34 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 12 November 2015 14:47:18 Andreas Schwab wrote:
>> Arnd Bergmann <arnd@arndb.de> writes:
>>
>> > On Thursday 12 November 2015 10:44:55 Andreas Schwab wrote:
>> >> Arnd Bergmann <arnd@arndb.de> writes:
>> >>
>> >> > What do you mean with 32-bit off_t?
>> >>
>> >> An ABI with 32-bit off_t, ie. all currently implemented 32-bit ABIs.
>> >>
>> >> > Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit
>> >> > __kernel_loff_t?
>> >>
>> >> Glibc is bridging the user-space ABI to the kernel ABI.
>> >
>> > Ok, but why?
>>
>> That's how the ABI is defined right now. I didn't make that up.
>
> Ok, I guess it will remain a mystery then.

The biggest question here is how much compatibility do we want with
other 32bit ABIs?
Do we want off_t to be 32bit or 64bit?

>
> Should we perhaps define __ARCH_WANT_SYSCALL_OFF_T for the unistd.h
> file then, so we provide both the off_t and the loff_t based syscalls?

I think that is backwards ...

>
> That would avoid the extra wrapper in glibc when using a 32-bit
> off_t if that is the preferred mode for user space.

Other targets like tilegx do not do that and have a pure 32bit mode.
Only score does that.

Thanks,
Andrew

>
> Arnd
On Friday 13 November 2015 07:38:49 Andrew Pinski wrote: > On Fri, Nov 13, 2015 at 7:34 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > On Thursday 12 November 2015 14:47:18 Andreas Schwab wrote: > >> Arnd Bergmann <arnd@arndb.de> writes: > >> > >> > On Thursday 12 November 2015 10:44:55 Andreas Schwab wrote: > >> >> Arnd Bergmann <arnd@arndb.de> writes: > >> >> > >> >> > What do you mean with 32-bit off_t? > >> >> > >> >> An ABI with 32-bit off_t, ie. all currently implemented 32-bit ABIs. > >> >> > >> >> > Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit > >> >> > __kernel_loff_t? > >> >> > >> >> Glibc is bridging the user-space ABI to the kernel ABI. > >> > > >> > Ok, but why? > >> > >> That's how the ABI is defined right now. I didn't make that up. > > > > Ok, I guess it will remain a mystery then. > > The biggest question is here is how much compatibility do we want with > other 32bit ABIs? > Do we want off_t to be 32bit or 64bit? I would much prefer off_t to be defined as __kernel_loff_t unconditionally, with no support for _FILE_OFFSET_BITS == 32. This is at least what I had in mind when I wrote the asm-generic/unistd.h header. We should probably find out what happened for the other glibc ports that were implemented for the architectures using this. It's possible that there was a good reason for supporting _FILE_OFFSET_BITS == 32 at the time, but I can't think of one and maybe it is one that is no longer valid. Do you know what x86/x32 does for off_t? Do they also implement both _FILE_OFFSET_BITS == 32 and _FILE_OFFSET_BITS == 64 on top of the 64-bit __kernel_off_t? > > Should we perhaps define __ARCH_WANT_SYSCALL_OFF_T for the unistd.h > > file then, so we provide both the off_t and the loff_t based syscalls? > > I think that is backwards ... > > > > > That would avoid the extra wrapper in glibc when using a 32-bit > > off_t if that is the preferred mode for user space. 
> > > Other targets like tilegx does not do that and has a pure 32bit mode. > Only score does that. score was unintentional, it was the first port that got done after we introduced the generic headers, and they said at the time that they would change their libc to remove the dependency on the legacy syscalls, but when I tried to remove them later, they had already shipped it with them enabled. After that, I told people to never enable the symbols in an upstream port and only use them for porting their libc internally. We could actually now move all the legacy syscall stuff to arch/score/include/uapi/asm/unistd.h, to prevent anyone else from using it any longer, as glibc works fine without them these days. The __ARCH_WANT_SYSCALL_OFF_T define might be an exception. If the glibc developers want to keep using 32-bit off_t by default on all new architectures, we could include those calls again by default. Arnd
On Friday 13 November 2015 17:10:44 Arnd Bergmann wrote: > On Friday 13 November 2015 07:38:49 Andrew Pinski wrote: > > On Fri, Nov 13, 2015 at 7:34 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > > On Thursday 12 November 2015 14:47:18 Andreas Schwab wrote: > > >> Arnd Bergmann <arnd@arndb.de> writes: > > >> > > >> > On Thursday 12 November 2015 10:44:55 Andreas Schwab wrote: > > >> >> Arnd Bergmann <arnd@arndb.de> writes: > > >> >> > > >> >> > What do you mean with 32-bit off_t? > > >> >> > > >> >> An ABI with 32-bit off_t, ie. all currently implemented 32-bit ABIs. > > >> >> > > >> >> > Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit > > >> >> > __kernel_loff_t? > > >> >> > > >> >> Glibc is bridging the user-space ABI to the kernel ABI. > > >> > > > >> > Ok, but why? > > >> > > >> That's how the ABI is defined right now. I didn't make that up. > > > > > > Ok, I guess it will remain a mystery then. > > > > The biggest question is here is how much compatibility do we want with > > other 32bit ABIs? > > Do we want off_t to be 32bit or 64bit? > > I would much prefer off_t to be defined as __kernel_loff_t unconditionally, > with no support for _FILE_OFFSET_BITS == 32. This is at least what I had > in mind when I wrote the asm-generic/unistd.h header. > > We should probably find out what happened for the other glibc ports that > were implemented for the architectures using this. It's possible that > there was a good reason for supporting _FILE_OFFSET_BITS == 32 at the > time, but I can't think of one and maybe it is one that is no longer > valid. > > Do you know what x86/x32 does for off_t? Do they also implement both > _FILE_OFFSET_BITS == 32 and _FILE_OFFSET_BITS == 64 on top of the > 64-bit __kernel_off_t? 
I just did a little bit of digging through glibc history and found that Chris Metcalf added the files that are now in sysdeps/unix/sysv/linux/generic/wordsize-32/ and that provide the implementation for 32-bit off_t in glibc on top of the 64-bit __kernel_off_t. Chris, do you remember what led to that? Do you think we still need to have 32-bit off_t on all new architectures, or could we move on to making 64-bit off_t the default when adding a port? Arnd
On 11/15/2015 10:18 AM, Arnd Bergmann wrote: > On Friday 13 November 2015 17:10:44 Arnd Bergmann wrote: >> On Friday 13 November 2015 07:38:49 Andrew Pinski wrote: >>> On Fri, Nov 13, 2015 at 7:34 AM, Arnd Bergmann <arnd@arndb.de> wrote: >>>> On Thursday 12 November 2015 14:47:18 Andreas Schwab wrote: >>>>> Arnd Bergmann <arnd@arndb.de> writes: >>>>> >>>>>> On Thursday 12 November 2015 10:44:55 Andreas Schwab wrote: >>>>>>> Arnd Bergmann <arnd@arndb.de> writes: >>>>>>> >>>>>>>> What do you mean with 32-bit off_t? >>>>>>> An ABI with 32-bit off_t, ie. all currently implemented 32-bit ABIs. >>>>>>> >>>>>>>> Do you mean that glibc emulates a 32-bit off_t on top of the 64-bit >>>>>>>> __kernel_loff_t? >>>>>>> Glibc is bridging the user-space ABI to the kernel ABI. >>>>>> Ok, but why? >>>>> That's how the ABI is defined right now. I didn't make that up. >>>> Ok, I guess it will remain a mystery then. >>> The biggest question is here is how much compatibility do we want with >>> other 32bit ABIs? >>> Do we want off_t to be 32bit or 64bit? >> I would much prefer off_t to be defined as __kernel_loff_t unconditionally, >> with no support for _FILE_OFFSET_BITS == 32. This is at least what I had >> in mind when I wrote the asm-generic/unistd.h header. >> >> We should probably find out what happened for the other glibc ports that >> were implemented for the architectures using this. It's possible that >> there was a good reason for supporting _FILE_OFFSET_BITS == 32 at the >> time, but I can't think of one and maybe it is one that is no longer >> valid. >> >> Do you know what x86/x32 does for off_t? Do they also implement both >> _FILE_OFFSET_BITS == 32 and _FILE_OFFSET_BITS == 64 on top of the >> 64-bit __kernel_off_t? 
> I just did a little bit of digging through glibc history and found that > Chris Metcalf added the files that are now in > sysdeps/unix/sysv/linux/generic/wordsize-32/ and that provide the > implementation for 32-bit off_t in glibc on top of the 64-bit > __kernel_off_t. > > Chris, do you remember what led to that? Do you think we still need > to have 32-bit off_t on all new architectures, or could we move > on to making 64-bit off_t the default when adding a port? I think there are two questions here. The first is whether glibc will change the default for _FILE_OFFSET_BITS to be 64. This has been discussed in the past, e.g.: https://sourceware.org/ml/libc-alpha/2014-03/msg00351.html I've added Rich, Paul, Joseph, and Mike to the cc's as they are probably a good subset of libc-alpha to help comment on these issues. My sense is that right now, it wouldn't be possible to add a 32-bit architecture with a non-32-bit default for _FILE_OFFSET_BITS. And, obviously, this is why, when I added the tilegx32 APIs to glibc in 2011, I needed to provide _FILE_OFFSET_BITS=32 support. As to the kernel APIs, certainly tilegx32 only has the stat64 API; I just arranged that the userspace structures are file-offset-bits-agnostic by using ifdefs to either put a 64-bit value or a (32-bit-value, 32-bit-pad) in the structure. See sysdeps/unix/sysv/linux/generic/bits/stat.h for example. While the __field64() macro is kind of nasty, it does provide the 32-bit off_t to those programs that want it without any particular cost elsewhere in the code.
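Editorial sketch of the "64-bit value or (32-bit value, 32-bit pad)" layout described above. This is not glibc's `__field64()` macro itself; the `DEMO_*` macros and `demo_*` structs are hypothetical, only the little-endian ordering (value first, pad second) is shown, and C11 `_Alignas` is assumed to be available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit off_t view: the field is simply the wide type. */
#define DEMO_FIELD64_WIDE(type, type64, name) type64 name

/*
 * 32-bit off_t view (little-endian assumption): the narrow value sits in
 * the low half of the slot and an explicit pad fills the high half.
 * The alignment request keeps the slot where the 64-bit field would be.
 */
#define DEMO_FIELD64_NARROW(type, type64, name) \
	_Alignas(type64) type name; int32_t name##_pad_

struct demo_stat_wide {
	DEMO_FIELD64_WIDE(int32_t, int64_t, st_size);
	uint32_t st_blksize;
};

struct demo_stat_narrow {
	DEMO_FIELD64_NARROW(int32_t, int64_t, st_size);
	uint32_t st_blksize;
};

/* Compile-time check: both views occupy the same number of bytes. */
static_assert(sizeof(struct demo_stat_narrow) == sizeof(struct demo_stat_wide),
	      "narrow and wide views must have the same size");

/* Runtime check that the fields line up at the same offsets. */
static int demo_layouts_match(void)
{
	return offsetof(struct demo_stat_narrow, st_size) ==
		       offsetof(struct demo_stat_wide, st_size) &&
	       offsetof(struct demo_stat_narrow, st_blksize) ==
		       offsetof(struct demo_stat_wide, st_blksize);
}
```

Because both views share one layout, the kernel can always fill in the 64-bit structure and a program built with a 32-bit off_t reads only the low half of each slot.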
On Sun, 15 Nov 2015, Chris Metcalf wrote: > I've added Rich, Paul, Joseph, and Mike to the cc's as they are probably > a good subset of libc-alpha to help comment on these issues. My sense > is that right now, it wouldn't be possible to add a 32-bit architecture > with a non-32-bit default for _FILE_OFFSET_BITS. And, obviously, this > is why, when I added the tilegx32 APIs to glibc in 2011, I needed to > provide _FILE_OFFSET_BITS=32 support. x32 uses 64-bit off_t only. That's not a problem; the problems are tv_nsec not of type long, a bug we should avoid for all new ports (padding on tv_nsec is fine; treating that padding as a significant high part of a 64-bit value on input to glibc / kernel interfaces isn't), and maybe some other types being 64-bit unnecessarily, although as far as I know the suggested issues there <https://sourceware.org/bugzilla/show_bug.cgi?id=16438> are all theoretical. It's true that we don't have a very clear notion of what "wordsize-64" sysdeps directories mean in glibc for cases such as x32. See <https://sourceware.org/bugzilla/show_bug.cgi?id=14116>.
On Monday 16 November 2015 10:16:35 Joseph Myers wrote:
> On Sun, 15 Nov 2015, Chris Metcalf wrote:
>
> > I've added Rich, Paul, Joseph, and Mike to the cc's as they are probably
> > a good subset of libc-alpha to help comment on these issues. My sense
> > is that right now, it wouldn't be possible to add a 32-bit architecture
> > with a non-32-bit default for _FILE_OFFSET_BITS. And, obviously, this
> > is why, when I added the tilegx32 APIs to glibc in 2011, I needed to
> > provide _FILE_OFFSET_BITS=32 support.
>
> x32 uses 64-bit off_t only. That's not a problem; the problems are
> tv_nsec not of type long, a bug we should avoid for all new ports (padding
> on tv_nsec is fine; treating that padding as a significant high part of a
> 64-bit value on input to glibc / kernel interfaces isn't), and maybe some
> other types being 64-bit unnecessarily, although as far as I know the
> suggested issues there
> <https://sourceware.org/bugzilla/show_bug.cgi?id=16438> are all
> theoretical.

Let's not get into the tv_nsec discussion today, that is thankfully
not relevant for arm64 any more at this point. The system call ABI for
arm64/ilp32 is now the same as for any other 32-bit architecture using
the generic ABI, the question we're trying to solve here is only whether it
is ok for new 32-bit glibc ports to only offer a 64-bit off_t as the kernel
currently does (using __kernel_loff_t) or if we still need to support the
_FILE_OFFSET_BITS=32 case.

If I got you right, we can use 64-bit off_t now, so we just need someone
to figure out how to make that the default in glibc for new architectures
while keeping the existing 32-bit architectures unchanged.

	Arnd
On Mon, 16 Nov 2015, Arnd Bergmann wrote: > Let's not get into the tv_nsec discussion today, that is not thankfully > not relevant for arm64 any more at this point. The system call ABI for > arm64/ilp32 is now the same as for any other 32-bit architecture using > the generic ABI, the question we're trying to solve here is only whether it > is ok for new 32-bit glibc ports to only offer a 64-bit off_t as the kernel > currently does (using __kernel_loff_t) or if we still need to support the > _FILE_OFFSET_BITS=32 case. > > If I got you right, we can use 64-bit off_t now, so we just need someone > to figure out how to make that the default in glibc for new architectures > while keeping the existing 32-bit architectures unchanged. It would be an entirely new combination. Presumably such a port would want the "function X is an alias of function Y" aspects of wordsize-64 directories (where that relates to off_t, struct stat etc. as opposed to long and long long), and the "registers are 64-bit so 64-bit operations are efficient" aspects, but not all the "64-bit syscall interface" aspects, so someone would need to review wordsize-64 sysdeps files and figure out what is or is not relevant to this port.
On Monday 16 November 2015 11:12:08 Joseph Myers wrote: > On Mon, 16 Nov 2015, Arnd Bergmann wrote: > > > Let's not get into the tv_nsec discussion today, that is not thankfully > > not relevant for arm64 any more at this point. The system call ABI for > > arm64/ilp32 is now the same as for any other 32-bit architecture using > > the generic ABI, the question we're trying to solve here is only whether it > > is ok for new 32-bit glibc ports to only offer a 64-bit off_t as the kernel > > currently does (using __kernel_loff_t) or if we still need to support the > > _FILE_OFFSET_BITS=32 case. > > > > If I got you right, we can use 64-bit off_t now, so we just need someone > > to figure out how to make that the default in glibc for new architectures > > while keeping the existing 32-bit architectures unchanged. > > It would be an entirely new combination. Presumably such a port would > want the "function X is an alias of function Y" aspects of wordsize-64 > directories (where that relates to off_t, struct stat etc. as opposed to > long and long long), and the "registers are 64-bit so 64-bit operations > are efficient" aspects, but not all the "64-bit syscall interface" > aspects, so someone would need to review wordsize-64 sysdeps files and > figure out what is or is not relevant to this port. There are two separate aspects here: a) leave out the support for all __off_t based syscalls (__ftruncate, __lseek, __lxstat, __pread, __preadv, __pwrite, __pwritev, __truncate, __xstat) as they are no longer needed, and change the handling of _FILE_OFFSET_BITS so that we default to 64 and error out for anything else. This needs to be done for all new 32-bit architectures if you think we should use a 64-bit off_t from now on, it's not arm64 specific. b) For an arbitrary subset of these, introduce optimized versions that are architecture specific and take advantage of the fact that we can pass 64-bit register arguments on arm64-ilp32. 
This is only needed because we diverge from the generic ABI in order to avoid the silliness of splitting up a 64-bit argument and then re-assembling it in the kernel. This should not be needed in the generic ABI and could just live in sysdeps/unix/sysv/linux/aarch64/wordsize-32, just like we have special handling for each one in the arm64 in the kernel code for them. Arnd
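Editorial sketch of the split-and-reassemble step mentioned above, next to the direct pass-through that 64-bit registers allow. It is a user-space illustration under stated assumptions, not code from the patch or from glibc; all `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Generic 32-bit ABI: user space splits a 64-bit offset into two words. */
static void demo_split_offset(int64_t off, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)((uint64_t)off & 0xffffffffu);
	*hi = (uint32_t)((uint64_t)off >> 32);
}

/* The kernel-side wrapper then glues the two halves back together. */
static int64_t demo_join_offset(uint32_t lo, uint32_t hi)
{
	return (int64_t)(((uint64_t)hi << 32) | lo);
}

/* With 64-bit registers, the offset can be handed over unchanged. */
static int64_t demo_pass_offset(int64_t off)
{
	return off;
}

/* Both paths must deliver the same value to the handler. */
static void demo_offset_check(void)
{
	uint32_t lo, hi;
	int64_t off = ((int64_t)5 << 32) + 7;

	demo_split_offset(off, &lo, &hi);
	assert(lo == 7 && hi == 5);
	assert(demo_join_offset(lo, hi) == demo_pass_offset(off));
}
```

The two paths are equivalent in result; the pass-through variant only saves the extra wrapper on each side, which is the optimization point (b) refers to.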
On Mon, 16 Nov 2015, Arnd Bergmann wrote: > There are two separate aspects here: > > a) leave out the support for all __off_t based syscalls (__ftruncate, > __lseek, __lxstat, __pread, __preadv, __pwrite, __pwritev, __truncate, > __xstat) as they are no longer needed, and change the handling of > _FILE_OFFSET_BITS so that we default to 64 and error out for anything > else. > This needs to be done for all new 32-bit architectures if you think we > should use a 64-bit off_t from now on, it's not arm64 specific. It's not a matter of leaving anything out - these would simply use 64-bit off_t (__off_t and __off64_t would be the same type) and the *64 versions would be aliases, exactly the same as on 64-bit architectures. (And _FILE_OFFSET_BITS handling would also be exactly the same as on 64-bit architectures.) I see no reason for the set of off_t-related symbols that exist, or which symbols are aliases of which others, to vary between pure 64-bit systems and ILP32 ABIs (for 32-bit or 64-bit architectures) that simply happen to have had 64-bit off_t from the start.
On Monday 16 November 2015 12:03:09 Joseph Myers wrote: > On Mon, 16 Nov 2015, Arnd Bergmann wrote: > > > There are two separate aspects here: > > > > a) leave out the support for all __off_t based syscalls (__ftruncate, > > __lseek, __lxstat, __pread, __preadv, __pwrite, __pwritev, __truncate, > > __xstat) as they are no longer needed, and change the handling of > > _FILE_OFFSET_BITS so that we default to 64 and error out for anything > > else. > > This needs to be done for all new 32-bit architectures if you think we > > should use a 64-bit off_t from now on, it's not arm64 specific. > > It's not a matter of leaving anything out - these would simply use 64-bit > off_t (__off_t and __off64_t would be the same type) and the *64 versions > would be aliases, exactly the same as on 64-bit architectures. (And > _FILE_OFFSET_BITS handling would also be exactly the same as on 64-bit > architectures.) I see no reason for the set of off_t-related symbols that > exist, or which symbols are aliases of which others, to vary between pure > 64-bit systems and ILP32 ABIs (for 32-bit or 64-bit architectures) that > simply happen to have had 64-bit off_t from the start. Ok, fair enough. So we just change the global __OFF_T_TYPE definition in bits/typesizes.h and override it for all the existing 32-bit ports, correct? Arnd
On Mon, 16 Nov 2015, Arnd Bergmann wrote: > > It's not a matter of leaving anything out - these would simply use 64-bit > > off_t (__off_t and __off64_t would be the same type) and the *64 versions > > would be aliases, exactly the same as on 64-bit architectures. (And > > _FILE_OFFSET_BITS handling would also be exactly the same as on 64-bit > > architectures.) I see no reason for the set of off_t-related symbols that > > exist, or which symbols are aliases of which others, to vary between pure > > 64-bit systems and ILP32 ABIs (for 32-bit or 64-bit architectures) that > > simply happen to have had 64-bit off_t from the start. > > Ok, fair enough. So we just change the global __OFF_T_TYPE definition > in bits/typesizes.h and override it for all the existing 32-bit ports, > correct? Well, it's sysdeps/unix/sysv/linux/generic/bits/typesizes.h that's relevant - so if future generic architectures will use 64-bit off_t, I suppose the existing file could be cloned for existing generic architectures with 32-bit support. And all the types involved in struct stat are affected (e.g. ino_t), not just off_t. And getting the aliases right may involve disentangling the different meanings of wordsize-64 into different sysdeps directories. ("off_t is off64_t" and "stat is stat64" are not the same thing. See MIPS n64.) And the design work needs to be done on libc-alpha, not in a random discussion elsewhere.
On Monday 16 November 2015 12:34:55 Joseph Myers wrote: > On Mon, 16 Nov 2015, Arnd Bergmann wrote: > > > > It's not a matter of leaving anything out - these would simply use 64-bit > > > off_t (__off_t and __off64_t would be the same type) and the *64 versions > > > would be aliases, exactly the same as on 64-bit architectures. (And > > > _FILE_OFFSET_BITS handling would also be exactly the same as on 64-bit > > > architectures.) I see no reason for the set of off_t-related symbols that > > > exist, or which symbols are aliases of which others, to vary between pure > > > 64-bit systems and ILP32 ABIs (for 32-bit or 64-bit architectures) that > > > simply happen to have had 64-bit off_t from the start. > > > > Ok, fair enough. So we just change the global __OFF_T_TYPE definition > > in bits/typesizes.h and override it for all the existing 32-bit ports, > > correct? > > Well, it's sysdeps/unix/sysv/linux/generic/bits/typesizes.h that's > relevant - so if future generic architectures will use 64-bit off_t, I > suppose the existing file could be cloned for existing generic > architectures with 32-bit support. Ok, got it. > And all the types involved in struct stat are affected (e.g. ino_t), > not just off_t. ino_t seems to be the only other type in 'struct stat' that depends on _FILE_OFFSET_BITS in glibc. On the kernel side, we don't care about __kernel_ino_t any more, we just leave that defined as 'unsigned long' while using a plain 'unsigned long long' for 'st_ino' in struct stat64 (and don't use __kernel_ino_t anywhere else either). > And getting the aliases > right may involve disentangling the different meanings of wordsize-64 into > different sysdeps directories. ("off_t is off64_t" and "stat is stat64" > are not the same thing. See MIPS n64.) And the design work needs to be > done on libc-alpha, not in a random discussion elsewhere. Sure. 
For the moment, we have all the information we need for the kernel side at least: we will keep using only 64-bit __kernel_loff_t on the system call side in new architecture ports and let you figure out how to work with that on the glibc side whenever the next 32-bit port arrives, which I assume will be arm64-ilp32. The 'struct stat' discussion will of course come back soon when we get to the 64-bit time_t patches, or when we introduce the extended stat syscall, whichever happens first. Thanks a lot for your help! Arnd
Arnd Bergmann <arnd@arndb.de> writes: > ino_t seems to be the only other type in 'struct stat' that depends > on _FILE_OFFSET_BITS in glibc. There is also blkcnt_t, and then there is fsblkcnt_t, fsfilcnt_t and fsword_t in struct statfs. Andreas.
On Monday 16 November 2015 14:34:50 Andreas Schwab wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
>
> > ino_t seems to be the only other type in 'struct stat' that depends
> > on _FILE_OFFSET_BITS in glibc.
>
> There is also blkcnt_t, and then there is fsblkcnt_t, fsfilcnt_t and
> fsword_t in struct statfs.

Ok, got it. Again these are just internal to glibc, the kernel just
uses fixed width types in

typedef struct {
	int	val[2];
} __kernel_fsid_t;

struct statfs64 {
	__u32 f_type;
	__u32 f_bsize;
	__u64 f_blocks;
	__u64 f_bfree;
	__u64 f_bavail;
	__u64 f_files;
	__u64 f_ffree;
	__kernel_fsid_t f_fsid;
	__u32 f_namelen;
	__u32 f_frsize;
	__u32 f_flags;
	__u32 f_spare[4];
};

so we need to be careful to define them in glibc to match the kernel
types, but the kernel definition doesn't need changes.

	Arnd
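Editorial sketch of what "define them in glibc to match the kernel types" could look like as a compile-time check. The kernel-side struct mirrors the definition quoted above using `<stdint.h>` names; the libc-side typedefs (`demo_fsword_t`, `demo_fsblkcnt_t`, `demo_fsfilcnt_t`) and their widths are assumptions for illustration, not glibc's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int val[2]; } demo_fsid_t;

/* Mirror of the kernel's fixed-width layout quoted in the message above. */
struct demo_kernel_statfs64 {
	uint32_t f_type;
	uint32_t f_bsize;
	uint64_t f_blocks;
	uint64_t f_bfree;
	uint64_t f_bavail;
	uint64_t f_files;
	uint64_t f_ffree;
	demo_fsid_t f_fsid;
	uint32_t f_namelen;
	uint32_t f_frsize;
	uint32_t f_flags;
	uint32_t f_spare[4];
};

/* Hypothetical libc-side typedefs, chosen to match the kernel widths. */
typedef uint32_t demo_fsword_t;
typedef uint64_t demo_fsblkcnt_t;
typedef uint64_t demo_fsfilcnt_t;

struct demo_libc_statfs {
	demo_fsword_t f_type;
	demo_fsword_t f_bsize;
	demo_fsblkcnt_t f_blocks;
	demo_fsblkcnt_t f_bfree;
	demo_fsblkcnt_t f_bavail;
	demo_fsfilcnt_t f_files;
	demo_fsfilcnt_t f_ffree;
	demo_fsid_t f_fsid;
	demo_fsword_t f_namelen;
	demo_fsword_t f_frsize;
	demo_fsword_t f_flags;
	demo_fsword_t f_spare[4];
};

/* Fail the build if a libc typedef drifts away from the kernel width. */
static_assert(offsetof(struct demo_libc_statfs, f_blocks) ==
	      offsetof(struct demo_kernel_statfs64, f_blocks),
	      "f_blocks offset differs");
static_assert(offsetof(struct demo_libc_statfs, f_fsid) ==
	      offsetof(struct demo_kernel_statfs64, f_fsid),
	      "f_fsid offset differs");

static int demo_statfs_layouts_match(void)
{
	return sizeof(struct demo_libc_statfs) ==
	       sizeof(struct demo_kernel_statfs64);
}
```

If one of the typedefs were narrowed to 32 bits, the static assertions would stop the build instead of letting the mismatch surface at run time.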
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 4c2cbbc..696e638 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -13,13 +13,16 @@
  * You should have received a copy of the GNU General Public License
  * along with this program. If not, see <http://www.gnu.org/licenses/>.
  */
+#ifdef CONFIG_ARM64_ILP32
+#define __ARCH_WANT_COMPAT_SYS_PREADV64
+#define __ARCH_WANT_COMPAT_SYS_PWRITEV64
+#endif
 #ifdef CONFIG_AARCH32_EL0
 #define __ARCH_WANT_COMPAT_SYS_GETDENTS64
 #define __ARCH_WANT_COMPAT_STAT64
 #define __ARCH_WANT_SYS_GETHOSTNAME
 #define __ARCH_WANT_SYS_PAUSE
 #define __ARCH_WANT_SYS_GETPGRP
-#define __ARCH_WANT_SYS_LLSEEK
 #define __ARCH_WANT_SYS_NICE
 #define __ARCH_WANT_SYS_SIGPENDING
 #define __ARCH_WANT_SYS_SIGPROCMASK
@@ -39,6 +42,8 @@
 #define __NR_compat_sigreturn		119
 #define __NR_compat_rt_sigreturn	173
 
+#define __ARCH_WANT_SYS_LLSEEK
+
 /*
  * The following SVCs are ARM private.
  */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 35a59af..837d730 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -24,6 +24,7 @@ arm64-obj-$(CONFIG_AARCH32_EL0)	+= sys32.o kuser32.o signal32.o	\
 					   sys_compat.o entry32.o	\
 					   ../../arm/kernel/opcodes.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
+arm64-obj-$(CONFIG_ARM64_ILP32)	+= sys_ilp32.o
 arm64-obj-$(CONFIG_COMPAT)		+= entry32-common.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
 arm64-obj-$(CONFIG_PERF_EVENTS)	+= perf_regs.o perf_callchain.o
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 52be5c8..bcd921a 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -664,9 +664,13 @@ ENDPROC(ret_from_fork)
  */
 	.align	6
 el0_svc:
-	adrp	stbl, sys_call_table		// load syscall table pointer
 	uxtw	scno, w8			// syscall number in w8
 	mov	sc_nr, #__NR_syscalls
+#ifdef CONFIG_ARM64_ILP32
+	ldr	x16, [tsk, #TI_FLAGS]
+	tbnz	x16, #TIF_32BIT_AARCH64, el0_ilp32_svc // We are using ILP32
+#endif
+	adrp	stbl, sys_call_table		// load syscall table pointer
 el0_svc_naked:					// compat entry point
 	stp	x0, scno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
 	enable_dbg_and_irq
@@ -686,6 +690,12 @@ ni_sys:
 	b	ret_fast_syscall
 ENDPROC(el0_svc)
 
+#ifdef CONFIG_ARM64_ILP32
+el0_ilp32_svc:
+	adrp	stbl, sys_call_ilp32_table // load syscall table pointer
+	b	el0_svc_naked
+#endif
+
 	/*
 	 * This is the really slow path. We're going to be doing context
 	 * switches, and waiting for our parent to respond.
diff --git a/arch/arm64/kernel/sys_ilp32.c b/arch/arm64/kernel/sys_ilp32.c
new file mode 100644
index 0000000..6c7d274
--- /dev/null
+++ b/arch/arm64/kernel/sys_ilp32.c
@@ -0,0 +1,221 @@
+/*
+ * AArch64- ILP32 specific system calls implementation
+ *
+ * Copyright (C) 2015 Cavium Inc.
+ * Author: Andrew Pinski <apinski@cavium.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/compiler.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/msg.h>
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/syscalls.h>
+#include <linux/compat.h>
+
+/*
+ * Wrappers to pass the pt_regs argument.
+ */
+asmlinkage long sys_rt_sigreturn_wrapper(void);
+#define sys_rt_sigreturn	sys_rt_sigreturn_wrapper
+#define sys_rt_sigsuspend	compat_sys_rt_sigsuspend
+#define sys_rt_sigaction	compat_sys_rt_sigaction
+#define sys_rt_sigprocmask	compat_sys_rt_sigprocmask
+#define sys_rt_sigpending	compat_sys_rt_sigpending
+#define sys_rt_sigtimedwait	compat_sys_rt_sigtimedwait
+#define sys_rt_sigqueueinfo	compat_sys_rt_sigqueueinfo
+#define sys_rt_sigpending	compat_sys_rt_sigpending
+
+/* Using Compat syscalls where necessary */
+#define sys_ioctl		compat_sys_ioctl
+/* iovec */
+#define sys_readv		compat_sys_readv
+#define sys_writev		compat_sys_writev
+#define sys_preadv		compat_sys_preadv64
+#define sys_pwritev		compat_sys_pwritev64
+#define sys_vmsplice		compat_sys_vmsplice
+/* robust_list_head */
+#define sys_set_robust_list	compat_sys_set_robust_list
+#define sys_get_robust_list	compat_sys_get_robust_list
+
+/* kexec_segment */
+#define sys_kexec_load		compat_sys_kexec_load
+
+/* Ptrace has some structures which are different between ILP32 and LP64 */
+#define sys_ptrace		compat_sys_ptrace
+
+/* struct msghdr */
+#define sys_msgctl		compat_sys_msgctl
+#define sys_recvfrom		compat_sys_recvfrom
+#define sys_recvmmsg		compat_sys_recvmmsg
+#define sys_sendmmsg		compat_sys_sendmmsg
+#define sys_sendmsg		compat_sys_sendmsg
+#define sys_recvmsg		compat_sys_recvmsg
+#define sys_msgsnd		compat_sys_msgsnd
+#define sys_msgrcv		compat_sys_msgrcv
+
+#define sys_setsockopt		compat_sys_setsockopt
+#define sys_getsockopt		compat_sys_getsockopt
+
+/* Array of pointers */
+#define sys_execve		compat_sys_execve
+#define sys_move_pages		compat_sys_move_pages
+
+/* iovec */
+#define sys_process_vm_readv	compat_sys_process_vm_readv
+#define sys_process_vm_writev	compat_sys_process_vm_writev
+
+/* Pointer in struct */
+#define sys_mount		compat_sys_mount
+
+/* NUMA */
+/* unsigned long bitmaps */
+#define sys_get_mempolicy	compat_sys_get_mempolicy
+#define sys_set_mempolicy	compat_sys_set_mempolicy
+#define sys_mbind		compat_sys_mbind
+/* array of pointers */
+/* unsigned long bitmaps */
+#define sys_migrate_pages	compat_sys_migrate_pages
+
+/* Scheduler */
+/* unsigned long bitmaps */
+#define sys_sched_setaffinity	compat_sys_sched_setaffinity
+#define sys_sched_getaffinity	compat_sys_sched_getaffinity
+
+/* iov usage */
+#define sys_keyctl		compat_sys_keyctl
+
+/* aio */
+/* Pointer to Pointer */
+#define sys_io_setup		compat_sys_io_setup
+/* Array of pointers */
+#define sys_io_submit		compat_sys_io_submit
+
+#define sys_nanosleep		compat_sys_nanosleep
+
+#define sys_lseek		sys_llseek
+
+#define sys_setitimer		compat_sys_setitimer
+#define sys_getitimer		compat_sys_getitimer
+
+#define sys_gettimeofday	compat_sys_gettimeofday
+#define sys_settimeofday	compat_sys_settimeofday
+#define sys_adjtimex		compat_sys_adjtimex
+
+#define sys_clock_gettime	compat_sys_clock_gettime
+#define sys_clock_settime	compat_sys_clock_settime
+
+#define sys_timerfd_gettime	compat_sys_timerfd_gettime
+#define sys_timerfd_settime	compat_sys_timerfd_settime
+#define sys_utimensat		compat_sys_utimensat
+
+#define sys_getrlimit		compat_sys_getrlimit
+#define sys_setrlimit		compat_sys_setrlimit
+#define sys_getrusage		compat_sys_getrusage
+
+#define sys_futex		compat_sys_futex
+#define sys_get_robust_list	compat_sys_get_robust_list
+#define sys_set_robust_list	compat_sys_set_robust_list
+
+#define sys_pselect6		compat_sys_pselect6
+#define sys_ppoll		compat_sys_ppoll
+
+asmlinkage long compat_sys_mmap2_wrapper(void);
+#define sys_mmap		compat_sys_mmap2_wrapper
+
+asmlinkage long compat_sys_fstatfs64_wrapper(void);
+#define sys_fstatfs		compat_sys_fstatfs64_wrapper
+asmlinkage long compat_sys_statfs64_wrapper(void);
+#define sys_statfs		compat_sys_statfs64_wrapper
+
+/* We need to make sure the pointer gets copied correctly. */
+asmlinkage long ilp32_sys_mq_notify(mqd_t mqdes, const struct sigevent __user *u_notification)
+{
+	struct sigevent __user *p = NULL;
+	if (u_notification) {
+		struct sigevent n;
+		p = compat_alloc_user_space(sizeof(*p));
+		if (copy_from_user(&n, u_notification, sizeof(*p)))
+			return -EFAULT;
+		if (n.sigev_notify == SIGEV_THREAD)
+			n.sigev_value.sival_ptr = compat_ptr((uintptr_t)n.sigev_value.sival_ptr);
+		if (copy_to_user(p, &n, sizeof(*p)))
+			return -EFAULT;
+	}
+	return sys_mq_notify(mqdes, p);
+}
+
+/* sigevent contains sigval_t which is now 64bit always
+   but need special handling due to padding for SIGEV_THREAD. */
+#define sys_mq_notify		ilp32_sys_mq_notify
+
+/* sigaltstack needs some special handling as the
+   padding for stack_t might not be non-zero. */
+long ilp32_sys_sigaltstack(const stack_t __user *uss_ptr,
+			   stack_t __user *uoss_ptr)
+{
+	stack_t uss, uoss;
+	int ret;
+	mm_segment_t seg;
+
+	if (uss_ptr) {
+		if (!access_ok(VERIFY_READ, uss_ptr, sizeof(*uss_ptr)))
+			return -EFAULT;
+		if (__get_user(uss.ss_sp, &uss_ptr->ss_sp) |
+			__get_user(uss.ss_flags, &uss_ptr->ss_flags) |
+			__get_user(uss.ss_size, &uss_ptr->ss_size))
+			return -EFAULT;
+		/* Zero extend the sp address and the size. */
+		uss.ss_sp = (void *)(uintptr_t)(unsigned int)(uintptr_t)uss.ss_sp;
+		uss.ss_size = (size_t)(unsigned int)uss.ss_size;
+	}
+	seg = get_fs();
+	set_fs(KERNEL_DS);
+	/* Note we need to use uoss as we have changed the segment to the
+	   kernel one so passing an user one around is wrong. */
+	ret = sys_sigaltstack((stack_t __force __user *) (uss_ptr ? &uss : NULL),
+			      (stack_t __force __user *) &uoss);
+	set_fs(seg);
+	if (ret >= 0 && uoss_ptr) {
+		if (!access_ok(VERIFY_WRITE, uoss_ptr, sizeof(stack_t)) ||
+		    __put_user(uoss.ss_sp, &uoss_ptr->ss_sp) ||
+		    __put_user(uoss.ss_flags, &uoss_ptr->ss_flags) ||
+		    __put_user(uoss.ss_size, &uoss_ptr->ss_size))
+			ret = -EFAULT;
+	}
+	return ret;
+}
+
+/* sigaltstack needs some special handling as the padding
+   for stack_t might not be non-zero. */
+#define sys_sigaltstack		ilp32_sys_sigaltstack
+
+
+#include <asm/syscall.h>
+
+#undef __SYSCALL
+#define __SYSCALL(nr, sym)	[nr] = sym,
+
+/*
+ * The sys_call_ilp32_table array must be 4K aligned to be accessible from
+ * kernel/entry.S.
+ */
+void *sys_call_ilp32_table[__NR_syscalls] __aligned(4096) = {
+	[0 ... __NR_syscalls - 1] = sys_ni_syscall,
+#include <asm/unistd.h>
+};
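
For readers unfamiliar with the table-building trick at the end of the patch, the following userspace sketch shows the same pattern in isolation: a range designator fills every slot with a "not implemented" handler, a list macro (standing in for the #include of asm/unistd.h) overrides individual slots, and #define redirections issued before the table is expanded pick the compat variant. Everything prefixed demo_ or DEMO_ is invented for this illustration and does not exist in the kernel; the `[a ... b]` range designator is a GNU C extension, so GCC or Clang is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: one argument, kept 64-bit wide on any host. */
typedef long long (*demo_syscall_fn)(unsigned long long);

#define DEMO_NR_SYSCALLS 4

/* Default entry, playing the role of sys_ni_syscall. */
static long long demo_ni_syscall(unsigned long long arg)
{
	(void)arg;
	return -ENOSYS;
}

/* A "native" handler that is used unchanged. */
static long long demo_native_getpid(unsigned long long arg)
{
	(void)arg;
	return 100;
}

/* A "compat" handler: only the low 32 bits of the argument are meaningful,
 * similar in spirit to the zero extension done for ILP32 pointers. */
static long long demo_compat_ioctl(unsigned long long arg)
{
	return (long long)(arg & 0xffffffffULL);
}

/* Redirections, issued before the table is expanded (as in sys_ilp32.c). */
#define demo_sys_getpid	demo_native_getpid
#define demo_sys_ioctl	demo_compat_ioctl

#define DEMO_SYSCALL(nr, sym)	[nr] = sym,

/* Stand-in for the syscall list that the patch pulls in via #include. */
#define DEMO_SYSCALL_LIST		\
	DEMO_SYSCALL(0, demo_sys_getpid)	\
	DEMO_SYSCALL(2, demo_sys_ioctl)

static const demo_syscall_fn demo_table[DEMO_NR_SYSCALLS] = {
	[0 ... DEMO_NR_SYSCALLS - 1] = demo_ni_syscall,	/* GNU C range designator */
	DEMO_SYSCALL_LIST
};

/* Bounds-checked dispatch through the table. */
long long demo_dispatch(unsigned int nr, unsigned long long arg)
{
	if (nr >= DEMO_NR_SYSCALLS)
		return demo_ni_syscall(arg);
	assert(demo_table[nr] != NULL);
	return demo_table[nr](arg);
}
```

Later initializers override the range default, which is why slot 1 still reports -ENOSYS while slots 0 and 2 reach their handlers. Compilers may warn about the overridden initializers (for example with -Woverride-init); that is expected for this pattern.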