diff mbox series

[1/6] tools/nolibc: add support for waitid()

Message ID 20241221-nolibc-rv32-v1-1-d9ef6dab7c63@weissschuh.net (mailing list archive)
State New
Headers show
Series selftests/nolibc: wire up riscv32 | expand

Commit Message

Thomas Weißschuh Dec. 21, 2024, 2:44 p.m. UTC
waitid() is the modern variant of the family of wait-like syscalls.
Some architectures have dropped support for wait(), wait4() and waitpid()
but all of them support waitid().
It is more flexible and easier to use than the older ones.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
---
 tools/include/nolibc/sys.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Comments

Willy Tarreau Dec. 21, 2024, 4:35 p.m. UTC | #1
Hi Thomas!

On Sat, Dec 21, 2024 at 03:44:28PM +0100, Thomas Weißschuh wrote:
> waitid() is the modern variant of the family of wait-like syscalls.
> Some architectures have dropped support for wait(), wait4() and waitpid()
> but all of them support waitid().
> It is more flexible and easier to use than the older ones.

I'm generally fine with the series, but I'm starting to get concerned
that some simple applications that used to rely on wait() or waitpid()
will not work on this architecture. Just like we did for some early
syscalls that got replaced (e.g. open->openat etc), I think we'll have
to implement a default wrapper relying on waitid() for all these calls
in this case, and maybe as well for lseek->llseek() btw, what do you
think ?

The single fact that you've had to modify some of the nolibc-test code
(which is supposed to be the application here) indicates that we're
progressively going away from what applications need on certain archs.
Ideally an application relying on long-established calls should continue
to work unmodified.

Maybe it will be time for us to run an overall audit of arch-dependent
syscalls we currently have, to make sure that the common ones continue
to work fine there (and waitpid() definitely is as common a syscall as
open() since it's the good old and portable one).

Cheers,
Willy
Thomas Weißschuh Dec. 22, 2024, 11:39 a.m. UTC | #2
Hi Willy!

On 2024-12-21 17:35:40+0100, Willy Tarreau wrote:
> On Sat, Dec 21, 2024 at 03:44:28PM +0100, Thomas Weißschuh wrote:
> > waitid() is the modern variant of the family of wait-like syscalls.
> > Some architectures have dropped support for wait(), wait4() and waitpid()
> > but all of them support waitid().
> > It is more flexible and easier to use than the older ones.
> 
> I'm generally fine with the series, but I'm starting to get concerned
> that some simple applications that used to rely on wait() or waitpid()
> will not work on this architecture. Just like we did for some early
> syscalls that got replaced (e.g. open->openat etc), I think we'll have
> to implement a default wrapper relying on waitid() for all these calls
> in this case, and maybe as well for lseek->llseek() btw, what do you
> think ?

Indeed, it would be nice to have full compatibility. However there are
more syscalls missing than wait() and lseek(). These are just the
missing ones affecting nolibc-test.
Adding wrappers will be more work. This series is only meant to
ensure that the existing limited support does not regress.

We can add compatibility wrappers one after the other on top.
I think Zhangjin implemented and proposed a few before, but a few of
them ended up complicated.

> The single fact that you've had to modify some of the nolibc-test code
> (which is supposed to be the application here) indicates that we're
> progressively going away from what applications need on certain archs.
> Ideally an application relying on long-established calls should continue
> to work unmodified.

Agreed.

> Maybe it will be time for us to run an overall audit of arch-dependent
> syscalls we currently have, to make sure that the common ones continue
> to work fine there (and waitpid() definitely is as common a syscall as
> open() since it's the good old and portable one).

Isn't this what nolibc-test is already doing?
Or do you also want to compare it to non-current kernel versions?

In general the special rv32 syscalls are not really
architecture-dependent, they just dropped the "legacy" ones, especially
all using 32bit timestamps.


Thomas
Willy Tarreau Dec. 23, 2024, 8:13 a.m. UTC | #3
Hi Thomas!

On Sun, Dec 22, 2024 at 12:39:01PM +0100, Thomas Weißschuh wrote:
> > Maybe it will be time for us to run an overall audit of arch-dependent
> > syscalls we currently have, to make sure that the common ones continue
> > to work fine there (and waitpid() definitely is as common a syscall as
> > open() since it's the good old and portable one).
> 
> Isn't this what nolibc-test is already doing?

My concern is that it might be progressively going away from this if
we replace some standard syscalls with new ones that are cross-arch.

> Or do you also want to compare it to non-current kernel versions?

I mean that we progressively replace old posix calls with new cross arch
ones in the system (e.g. open->openat, waitpid->waitid etc) and normally
it's a libc's role to preserve application-level compatibility by
maintaining the mapping between standard ones and specific ones so that
applications relying on standard ones continue to work, and that was one
of the original goals of nolibc.

I have nothing against missing some calls in newly added architectures,
of course, but when I'm seeing for example that we switch some of the
lower layer tests to use a pipe because some call was not present, I
tend to think that maybe we should first define what is the minimal set
of working syscalls that the nolibc-test program requires to be usable
on any arch.

In the current case, we seem to have to arbiter between pipe() and
lseek() support for basic nolibc-test support. But maybe a new arch will
be added for which it will be the opposite choice between the two. We
may very well require both of them to work if needed, or either, at the
risk of delaying support of a specific arch in the future, but that's
fine.

Second we should have a new look at all our supported calls and check if
some of them are present while the legacy calls they're supposed to
replace is missing (which would be perfectly possible). For example if
we had implemented waitpid() much later, it would have been perfectly
possible that we'd only implement waitid() and miss waitpid() that
applications expect.

Honestly it's not a particularly interesting job to do. That's why I'm
mostly saying that we should just keep that in mind to be careful with
new additions.

> In general the special rv32 syscalls are not really
> architecture-dependent, they just dropped the "legacy" ones, especially
> all using 32bit timestamps.

I understand, and when adding a new arch we need to start with something.
I just think that we should consider that for a new arch to switch from
"in progress" to "working", it would require the legacy ones working on
other archs to work on that one as well. My concern is that early boot
tools would only build on certain archs but not all when all of them are
supposed to be in a working state. When it fails everywhere that's fine,
it just means we're missing some calls and the user is welcome to submit
a patch. But when the user only tests on, say, x86 and arm, and someone
relies on that to package kernels and discovers late that it fails on
riscv for example, that's a problem. Note that I'm just making up examples,
and not designating any particular issue.

Maybe it would be convenient to maintain a support matrix for the syscalls
we currently support. It could look something like:

   waitpid()   x86: native
               arm: native
               riscv32: via waitid()
               foobar: not yet

   open()      ...

etc. I could try to work on such a thing if you're interested as well, but
not now as I don't have the time at the moment.

Cheers,
Willy
diff mbox series

Patch

diff --git a/tools/include/nolibc/sys.h b/tools/include/nolibc/sys.h
index 7b82bc3cf107439a3f09f98b99d4d540ffb9ba2a..d4a5c2399a66b200ebf7ab249569cce2285481a5 100644
--- a/tools/include/nolibc/sys.h
+++ b/tools/include/nolibc/sys.h
@@ -23,6 +23,7 @@ 
 #include <linux/prctl.h>
 #include <linux/resource.h>
 #include <linux/utsname.h>
+#include <linux/signal.h>
 
 #include "arch.h"
 #include "errno.h"
@@ -1225,6 +1226,23 @@  pid_t waitpid(pid_t pid, int *status, int options)
 }
 
 
+/*
+ * int waitid(idtype_t idtype, id_t id, siginfo_t *infop, int options);
+ */
+
+static __attribute__((unused))
+int sys_waitid(int which, pid_t pid, siginfo_t *infop, int options, struct rusage *rusage)
+{
+	return my_syscall5(__NR_waitid, which, pid, infop, options, rusage);
+}
+
+static __attribute__((unused))
+int waitid(int which, pid_t pid, siginfo_t *infop, int options)
+{
+	return __sysret(sys_waitid(which, pid, infop, options, NULL));
+}
+
+
 /*
  * ssize_t write(int fd, const void *buf, size_t count);
  */