From patchwork Wed Jul 12 00:46:59 2023
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13309482
From: Jens Axboe
To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe
Subject: [PATCH 1/7] futex: abstract out futex_op_to_flags() helper
Date: Tue, 11 Jul 2023 18:46:59 -0600
Message-Id: <20230712004705.316157-2-axboe@kernel.dk>
In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk>
References: <20230712004705.316157-1-axboe@kernel.dk>
X-Mailing-List: io-uring@vger.kernel.org

Rather than needing to duplicate this for the io_uring hook of futexes,
abstract out a helper.

No functional changes intended in this patch.

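For readers who want to poke at the decoding outside the kernel, here is a
minimal userspace sketch of the helper. It is an illustration only: the
FUTEX_* values are repeated from the UAPI header so the snippet stands alone,
the FLAGS_* values are placeholders rather than the kernel-internal
definitions, and decode_op() is a hypothetical caller that mirrors what
do_futex() does with the result.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Values as in <linux/futex.h>, repeated so the sketch is standalone. */
#define FUTEX_WAIT		0
#define FUTEX_WAKE		1
#define FUTEX_WAIT_BITSET	9
#define FUTEX_WAIT_REQUEUE_PI	11
#define FUTEX_LOCK_PI2		13
#define FUTEX_PRIVATE_FLAG	128
#define FUTEX_CLOCK_REALTIME	256
#define FUTEX_CMD_MASK		(~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME))

/* Placeholder bits for illustration; not the kernel's internal values. */
#define FLAGS_SHARED		0x01
#define FLAGS_CLOCKRT		0x02

/* Same logic as the helper added by this patch. */
static inline bool futex_op_to_flags(int op, int cmd, unsigned int *flags)
{
	if (!(op & FUTEX_PRIVATE_FLAG))
		*flags |= FLAGS_SHARED;

	if (op & FUTEX_CLOCK_REALTIME) {
		*flags |= FLAGS_CLOCKRT;
		if (cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI &&
		    cmd != FUTEX_LOCK_PI2)
			return false;
	}

	return true;
}

/* Hypothetical caller: split op into cmd and flags the way do_futex() does. */
int decode_op(int op, unsigned int *flags)
{
	int cmd = op & FUTEX_CMD_MASK;

	*flags = 0;
	if (!futex_op_to_flags(op, cmd, flags))
		return -ENOSYS;
	return 0;
}
```

A private FUTEX_WAIT decodes to no flags, a plain FUTEX_WAKE is treated as
shared, and FUTEX_CLOCK_REALTIME is rejected for commands outside the three
listed in the helper.
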
Signed-off-by: Jens Axboe
---
 kernel/futex/futex.h    | 15 +++++++++++++++
 kernel/futex/syscalls.c | 11 ++---------
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index b5379c0e6d6d..d2949fca37d1 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -291,4 +291,19 @@ extern int futex_unlock_pi(u32 __user *uaddr, unsigned int flags);

 extern int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *time, int trylock);

+static inline bool futex_op_to_flags(int op, int cmd, unsigned int *flags)
+{
+	if (!(op & FUTEX_PRIVATE_FLAG))
+		*flags |= FLAGS_SHARED;
+
+	if (op & FUTEX_CLOCK_REALTIME) {
+		*flags |= FLAGS_CLOCKRT;
+		if (cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI &&
+		    cmd != FUTEX_LOCK_PI2)
+			return false;
+	}
+
+	return true;
+}
+
 #endif /* _FUTEX_H */
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index a8074079b09e..75ca8c41cc94 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -88,15 +88,8 @@ long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
 	int cmd = op & FUTEX_CMD_MASK;
 	unsigned int flags = 0;

-	if (!(op & FUTEX_PRIVATE_FLAG))
-		flags |= FLAGS_SHARED;
-
-	if (op & FUTEX_CLOCK_REALTIME) {
-		flags |= FLAGS_CLOCKRT;
-		if (cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI &&
-		    cmd != FUTEX_LOCK_PI2)
-			return -ENOSYS;
-	}
+	if (!futex_op_to_flags(op, cmd, &flags))
+		return -ENOSYS;

 	switch (cmd) {
 	case FUTEX_WAIT:

From patchwork Wed Jul 12 00:47:00 2023
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13309483
From: Jens Axboe
To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe
Subject: [PATCH 2/7] futex: factor out the futex wake handling
Date: Tue, 11 Jul 2023 18:47:00 -0600
Message-Id: <20230712004705.316157-3-axboe@kernel.dk>
In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk>
References: <20230712004705.316157-1-axboe@kernel.dk>
X-Mailing-List: io-uring@vger.kernel.org

In preparation for having another waker that isn't futex_wake_mark(),
add a wake handler in futex_q. No extra data is associated with the
handler outside of struct futex_q itself. futex_wake_mark() is defined as
the standard wakeup helper, now set through futex_q_init like other
defaults.

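The shape of this change can be seen in a small standalone sketch: a
per-waiter function pointer that defaults to the standard handler through an
init template, with the wake loop dispatching through the pointer instead of
calling one fixed function. Every name below (struct waiter, default_wake,
custom_wake, waiter_init, wake_all) is made up for illustration and none of it
is kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct waiter;
/* Hypothetical analogue of futex_wake_fn. */
typedef void (wake_fn)(int *wake_count, struct waiter *w);

struct waiter {
	wake_fn *wake;		/* per-waiter handler, like futex_q->wake */
	int woken;		/* set by the default handler */
	int completions;	/* bumped by the custom handler */
};

/* Stand-in for the standard wakeup helper. */
void default_wake(int *wake_count, struct waiter *w)
{
	w->woken = 1;
	(*wake_count)++;
}

/* Stand-in for a second waker, e.g. one that posts a completion instead. */
void custom_wake(int *wake_count, struct waiter *w)
{
	(void)wake_count;
	w->completions++;
}

/* Init template carrying the default handler, in the spirit of futex_q_init. */
const struct waiter waiter_init = { .wake = default_wake };

/* The wake loop no longer cares which handler a waiter uses. */
int wake_all(struct waiter **ws, size_t n)
{
	int wake_count = 0;

	for (size_t i = 0; i < n; i++)
		ws[i]->wake(&wake_count, ws[i]);
	return wake_count;
}
```

Waiters copied from the template keep the default behaviour, and only a
waiter that overrides the pointer takes the alternative path.
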
Signed-off-by: Jens Axboe
---
 kernel/futex/futex.h    | 4 ++++
 kernel/futex/requeue.c  | 3 ++-
 kernel/futex/waitwake.c | 6 +++---
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index d2949fca37d1..8eaf1a5ce967 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -69,6 +69,9 @@ struct futex_pi_state {
 	union futex_key key;
 } __randomize_layout;

+struct futex_q;
+typedef void (futex_wake_fn)(struct wake_q_head *wake_q, struct futex_q *q);
+
 /**
  * struct futex_q - The hashed futex queue entry, one per waiting task
  * @list: priority-sorted list of tasks waiting on this futex
@@ -98,6 +101,7 @@ struct futex_q {

 	struct task_struct *task;
 	spinlock_t *lock_ptr;
+	futex_wake_fn *wake;
 	union futex_key key;
 	struct futex_pi_state *pi_state;
 	struct rt_mutex_waiter *rt_waiter;
diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c
index cba8b1a6a4cc..e892bc6c41d8 100644
--- a/kernel/futex/requeue.c
+++ b/kernel/futex/requeue.c
@@ -58,6 +58,7 @@ enum {

 const struct futex_q futex_q_init = {
 	/* list gets initialized in futex_queue()*/
+	.wake = futex_wake_mark,
 	.key = FUTEX_KEY_INIT,
 	.bitset = FUTEX_BITSET_MATCH_ANY,
 	.requeue_state = ATOMIC_INIT(Q_REQUEUE_PI_NONE),
@@ -591,7 +592,7 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2,
 		/* Plain futexes just wake or requeue and are done */
 		if (!requeue_pi) {
 			if (++task_count <= nr_wake)
-				futex_wake_mark(&wake_q, this);
+				this->wake(&wake_q, this);
 			else
 				requeue_futex(this, hb1, hb2, &key2);
 			continue;
diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c
index ba01b9408203..3471af87cb7d 100644
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -174,7 +174,7 @@ int futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
 			if (!(this->bitset & bitset))
 				continue;

-			futex_wake_mark(&wake_q, this);
+			this->wake(&wake_q, this);
 			if (++ret >= nr_wake)
 				break;
 		}
@@ -289,7 +289,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2,
 				ret = -EINVAL;
 				goto out_unlock;
 			}
-			futex_wake_mark(&wake_q, this);
+			this->wake(&wake_q, this);
 			if (++ret >= nr_wake)
 				break;
 		}
@@ -303,7 +303,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flags, u32 __user *uaddr2,
 					ret = -EINVAL;
 					goto out_unlock;
 				}
-				futex_wake_mark(&wake_q, this);
+				this->wake(&wake_q, this);
 				if (++op_ret >= nr_wake2)
 					break;
 			}

From patchwork Wed Jul 12 00:47:01 2023
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13309484
From: Jens Axboe
To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe
Subject: [PATCH 3/7] io_uring: add support for futex wake and wait
Date: Tue, 11 Jul 2023 18:47:01 -0600
Message-Id: <20230712004705.316157-4-axboe@kernel.dk>
In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk>
References: <20230712004705.316157-1-axboe@kernel.dk>
X-Mailing-List: io-uring@vger.kernel.org

Add support for FUTEX_WAKE/WAIT primitives.

IORING_OP_FUTEX_WAKE is a mix of FUTEX_WAKE and FUTEX_WAKE_BITSET, as
it does support passing in a bitset.

Similarly, IORING_OP_FUTEX_WAIT is a mix of FUTEX_WAIT and
FUTEX_WAIT_BITSET.

FUTEX_WAKE is straightforward, as we can always just do those inline.
FUTEX_WAIT will queue the futex with an appropriate callback, and
that callback will in turn post a CQE when it has triggered.

Cancelations are supported, both from the application point of view
and to cancel pending waits if the ring exits before all events have
occurred.

This is just the barebones wait/wake support. PI and REQUEUE support is
not added at this point; it is unclear if we might look into that later.

Likewise, explicit timeouts are not supported either. Users that need
timeouts are expected to use the usual io_uring mechanism for that,
linked timeouts.

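As a rough guide to how a request is described in this version of the
series, the sketch below mirrors the SQE field usage in io_futex_prep():
addr carries the futex address, len the value, file_index the bitset mask,
and futex_flags may only contain FUTEX_PRIVATE_FLAG and FUTEX_CLOCK_REALTIME.
struct mock_sqe and struct mock_futex_req are simplified stand-ins invented
for this example, not the real io_uring_sqe layout, and the functions are
userspace approximations of the checks, not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define FUTEX_PRIVATE_FLAG	128
#define FUTEX_CLOCK_REALTIME	256
#define FUTEX_CMD_MASK		((uint32_t)~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME))

/* Hypothetical, flattened view of the SQE fields the prep handler reads. */
struct mock_sqe {
	uint64_t addr;		/* futex address */
	uint32_t len;		/* futex value */
	uint32_t file_index;	/* bitset mask */
	uint32_t futex_flags;	/* FUTEX_PRIVATE_FLAG and/or FUTEX_CLOCK_REALTIME */
	uint64_t addr2;		/* must be zero */
	uint64_t addr3;		/* must be zero */
	uint16_t buf_index;	/* must be zero */
};

/* Hypothetical decoded request, loosely following struct io_futex. */
struct mock_futex_req {
	uint64_t uaddr;
	uint32_t val;
	uint32_t mask;
	uint32_t flags;
};

/* Mirrors the validation order of the prep handler in this patch. */
int mock_futex_prep(const struct mock_sqe *sqe, struct mock_futex_req *req)
{
	if (sqe->addr2 || sqe->buf_index || sqe->addr3)
		return -EINVAL;

	req->uaddr = sqe->addr;
	req->val = sqe->len;
	req->mask = sqe->file_index;
	req->flags = sqe->futex_flags;
	/* Anything that looks like a futex command in the flags is rejected. */
	if (req->flags & FUTEX_CMD_MASK)
		return -EINVAL;

	return 0;
}

/* The wait side additionally refuses an empty bitset mask. */
int mock_futex_wait_check(const struct mock_futex_req *req)
{
	return req->mask ? 0 : -EINVAL;
}
```

A wait that should match any waker would pass an all-ones mask in
file_index; a zero mask passes prep but is refused when the wait is issued.
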
Signed-off-by: Jens Axboe --- include/linux/io_uring_types.h | 3 + include/uapi/linux/io_uring.h | 3 + io_uring/Makefile | 4 +- io_uring/cancel.c | 5 + io_uring/cancel.h | 4 + io_uring/futex.c | 232 +++++++++++++++++++++++++++++++++ io_uring/futex.h | 34 +++++ io_uring/io_uring.c | 5 + io_uring/opdef.c | 24 +++- 9 files changed, 312 insertions(+), 2 deletions(-) create mode 100644 io_uring/futex.c create mode 100644 io_uring/futex.h diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index f04ce513fadb..a7f03d8d879f 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -273,6 +273,9 @@ struct io_ring_ctx { struct io_wq_work_list locked_free_list; unsigned int locked_free_nr; + struct hlist_head futex_list; + struct io_alloc_cache futex_cache; + const struct cred *sq_creds; /* cred used for __io_sq_thread() */ struct io_sq_data *sq_data; /* if using sq thread polling */ diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 36f9c73082de..3bd2d765f593 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -65,6 +65,7 @@ struct io_uring_sqe { __u32 xattr_flags; __u32 msg_ring_flags; __u32 uring_cmd_flags; + __u32 futex_flags; }; __u64 user_data; /* data to be passed back at completion time */ /* pack this to avoid bogus arm OABI complaints */ @@ -235,6 +236,8 @@ enum io_uring_op { IORING_OP_URING_CMD, IORING_OP_SEND_ZC, IORING_OP_SENDMSG_ZC, + IORING_OP_FUTEX_WAIT, + IORING_OP_FUTEX_WAKE, /* this goes last, obviously */ IORING_OP_LAST, diff --git a/io_uring/Makefile b/io_uring/Makefile index 8cc8e5387a75..2e4779bc550c 100644 --- a/io_uring/Makefile +++ b/io_uring/Makefile @@ -7,5 +7,7 @@ obj-$(CONFIG_IO_URING) += io_uring.o xattr.o nop.o fs.o splice.o \ openclose.o uring_cmd.o epoll.o \ statx.o net.o msg_ring.o timeout.o \ sqpoll.o fdinfo.o tctx.o poll.o \ - cancel.o kbuf.o rsrc.o rw.o opdef.o notif.o + cancel.o kbuf.o rsrc.o rw.o opdef.o \ + notif.o 
obj-$(CONFIG_IO_WQ) += io-wq.o +obj-$(CONFIG_FUTEX) += futex.o diff --git a/io_uring/cancel.c b/io_uring/cancel.c index 7b23607cf4af..3dba8ccb1cd8 100644 --- a/io_uring/cancel.c +++ b/io_uring/cancel.c @@ -15,6 +15,7 @@ #include "tctx.h" #include "poll.h" #include "timeout.h" +#include "futex.h" #include "cancel.h" struct io_cancel { @@ -119,6 +120,10 @@ int io_try_cancel(struct io_uring_task *tctx, struct io_cancel_data *cd, if (ret != -ENOENT) return ret; + ret = io_futex_cancel(ctx, cd, issue_flags); + if (ret != -ENOENT) + return ret; + spin_lock(&ctx->completion_lock); if (!(cd->flags & IORING_ASYNC_CANCEL_FD)) ret = io_timeout_cancel(ctx, cd); diff --git a/io_uring/cancel.h b/io_uring/cancel.h index fc98622e6166..c0a8e7c520b6 100644 --- a/io_uring/cancel.h +++ b/io_uring/cancel.h @@ -1,4 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 +#ifndef IORING_CANCEL_H +#define IORING_CANCEL_H #include @@ -22,3 +24,5 @@ void init_hash_table(struct io_hash_table *table, unsigned size); int io_sync_cancel(struct io_ring_ctx *ctx, void __user *arg); bool io_cancel_req_match(struct io_kiocb *req, struct io_cancel_data *cd); + +#endif diff --git a/io_uring/futex.c b/io_uring/futex.c new file mode 100644 index 000000000000..ff0f6b394756 --- /dev/null +++ b/io_uring/futex.c @@ -0,0 +1,232 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include + +#include + +#include "../kernel/futex/futex.h" +#include "io_uring.h" +#include "rsrc.h" +#include "futex.h" + +struct io_futex { + struct file *file; + u32 __user *uaddr; + int futex_op; + unsigned int futex_val; + unsigned int futex_flags; + unsigned int futex_mask; +}; + +struct io_futex_data { + union { + struct futex_q q; + struct io_cache_entry cache; + }; + struct io_kiocb *req; +}; + +void io_futex_cache_init(struct io_ring_ctx *ctx) +{ + io_alloc_cache_init(&ctx->futex_cache, IO_NODE_ALLOC_CACHE_MAX, + sizeof(struct io_futex_data)); +} + +static void io_futex_cache_entry_free(struct 
io_cache_entry *entry) +{ + kfree(container_of(entry, struct io_futex_data, cache)); +} + +void io_futex_cache_free(struct io_ring_ctx *ctx) +{ + io_alloc_cache_free(&ctx->futex_cache, io_futex_cache_entry_free); +} + +static void io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts) +{ + struct io_futex_data *ifd = req->async_data; + struct io_ring_ctx *ctx = req->ctx; + + io_tw_lock(ctx, ts); + if (!io_alloc_cache_put(&ctx->futex_cache, &ifd->cache)) + kfree(ifd); + req->async_data = NULL; + hlist_del_init(&req->hash_node); + io_req_task_complete(req, ts); +} + +static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *req) +{ + struct io_futex_data *ifd = req->async_data; + + /* futex wake already done or in progress */ + if (!futex_unqueue(&ifd->q)) + return false; + + hlist_del_init(&req->hash_node); + io_req_set_res(req, -ECANCELED, 0); + req->io_task_work.func = io_futex_complete; + io_req_task_work_add(req); + return true; +} + +int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd, + unsigned int issue_flags) +{ + struct hlist_node *tmp; + struct io_kiocb *req; + int nr = 0; + + if (cd->flags & (IORING_ASYNC_CANCEL_FD|IORING_ASYNC_CANCEL_FD_FIXED)) + return -ENOENT; + + io_ring_submit_lock(ctx, issue_flags); + hlist_for_each_entry_safe(req, tmp, &ctx->futex_list, hash_node) { + if (req->cqe.user_data != cd->data && + !(cd->flags & IORING_ASYNC_CANCEL_ANY)) + continue; + if (__io_futex_cancel(ctx, req)) + nr++; + if (!(cd->flags & IORING_ASYNC_CANCEL_ALL)) + break; + } + io_ring_submit_unlock(ctx, issue_flags); + + if (nr) + return nr; + + return -ENOENT; +} + +bool io_futex_remove_all(struct io_ring_ctx *ctx, struct task_struct *task, + bool cancel_all) +{ + struct hlist_node *tmp; + struct io_kiocb *req; + bool found = false; + + lockdep_assert_held(&ctx->uring_lock); + + hlist_for_each_entry_safe(req, tmp, &ctx->futex_list, hash_node) { + if (!io_match_task_safe(req, task, cancel_all)) + continue; + 
__io_futex_cancel(ctx, req); + found = true; + } + + return found; +} + +int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex); + + if (unlikely(sqe->addr2 || sqe->buf_index || sqe->addr3)) + return -EINVAL; + + iof->futex_op = READ_ONCE(sqe->fd); + iof->uaddr = u64_to_user_ptr(READ_ONCE(sqe->addr)); + iof->futex_val = READ_ONCE(sqe->len); + iof->futex_mask = READ_ONCE(sqe->file_index); + iof->futex_flags = READ_ONCE(sqe->futex_flags); + if (iof->futex_flags & FUTEX_CMD_MASK) + return -EINVAL; + + return 0; +} + +static void io_futex_wake_fn(struct wake_q_head *wake_q, struct futex_q *q) +{ + struct io_futex_data *ifd = container_of(q, struct io_futex_data, q); + struct io_kiocb *req = ifd->req; + + __futex_unqueue(q); + smp_store_release(&q->lock_ptr, NULL); + + io_req_set_res(req, 0, 0); + req->io_task_work.func = io_futex_complete; + io_req_task_work_add(req); +} + +static struct io_futex_data *io_alloc_ifd(struct io_ring_ctx *ctx) +{ + struct io_cache_entry *entry; + + entry = io_alloc_cache_get(&ctx->futex_cache); + if (entry) + return container_of(entry, struct io_futex_data, cache); + + return kmalloc(sizeof(struct io_futex_data), GFP_NOWAIT); +} + +int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex); + struct io_ring_ctx *ctx = req->ctx; + struct io_futex_data *ifd = NULL; + struct futex_hash_bucket *hb; + unsigned int flags = 0; + int ret; + + if (!iof->futex_mask) { + ret = -EINVAL; + goto done; + } + if (!futex_op_to_flags(FUTEX_WAIT, iof->futex_flags, &flags)) { + ret = -ENOSYS; + goto done; + } + + io_ring_submit_lock(ctx, issue_flags); + ifd = io_alloc_ifd(ctx); + if (!ifd) { + ret = -ENOMEM; + goto done_unlock; + } + + req->async_data = ifd; + ifd->q = futex_q_init; + ifd->q.bitset = iof->futex_mask; + ifd->q.wake = io_futex_wake_fn; + ifd->req = req; + + ret = futex_wait_setup(iof->uaddr, 
iof->futex_val, flags, &ifd->q, &hb); + if (!ret) { + hlist_add_head(&req->hash_node, &ctx->futex_list); + io_ring_submit_unlock(ctx, issue_flags); + + futex_queue(&ifd->q, hb); + return IOU_ISSUE_SKIP_COMPLETE; + } + +done_unlock: + io_ring_submit_unlock(ctx, issue_flags); +done: + if (ret < 0) + req_set_fail(req); + io_req_set_res(req, ret, 0); + kfree(ifd); + return IOU_OK; +} + +int io_futex_wake(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex); + unsigned int flags = 0; + int ret; + + if (!futex_op_to_flags(FUTEX_WAKE, iof->futex_flags, &flags)) { + ret = -ENOSYS; + goto done; + } + + ret = futex_wake(iof->uaddr, flags, iof->futex_val, iof->futex_mask); +done: + if (ret < 0) + req_set_fail(req); + io_req_set_res(req, ret, 0); + return IOU_OK; +} diff --git a/io_uring/futex.h b/io_uring/futex.h new file mode 100644 index 000000000000..ddc9e0d73c52 --- /dev/null +++ b/io_uring/futex.h @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "cancel.h" + +int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); +int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags); +int io_futex_wake(struct io_kiocb *req, unsigned int issue_flags); + +#if defined(CONFIG_FUTEX) +int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd, + unsigned int issue_flags); +bool io_futex_remove_all(struct io_ring_ctx *ctx, struct task_struct *task, + bool cancel_all); +void io_futex_cache_init(struct io_ring_ctx *ctx); +void io_futex_cache_free(struct io_ring_ctx *ctx); +#else +static inline int io_futex_cancel(struct io_ring_ctx *ctx, + struct io_cancel_data *cd, + unsigned int issue_flags) +{ + return 0; +} +static inline bool io_futex_remove_all(struct io_ring_ctx *ctx, + struct task_struct *task, bool cancel_all) +{ + return false; +} +static inline void io_futex_cache_init(struct io_ring_ctx *ctx) +{ +} +static inline void io_futex_cache_free(struct io_ring_ctx *ctx) +{ 
+} +#endif diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index e8096d502a7c..67ff148bc394 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -92,6 +92,7 @@ #include "cancel.h" #include "net.h" #include "notif.h" +#include "futex.h" #include "timeout.h" #include "poll.h" @@ -314,6 +315,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) sizeof(struct async_poll)); io_alloc_cache_init(&ctx->netmsg_cache, IO_ALLOC_CACHE_MAX, sizeof(struct io_async_msghdr)); + io_futex_cache_init(ctx); init_completion(&ctx->ref_comp); xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1); mutex_init(&ctx->uring_lock); @@ -333,6 +335,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) INIT_LIST_HEAD(&ctx->tctx_list); ctx->submit_state.free_list.next = NULL; INIT_WQ_LIST(&ctx->locked_free_list); + INIT_HLIST_HEAD(&ctx->futex_list); INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func); INIT_WQ_LIST(&ctx->submit_state.compl_reqs); return ctx; @@ -2842,6 +2845,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) io_eventfd_unregister(ctx); io_alloc_cache_free(&ctx->apoll_cache, io_apoll_cache_free); io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free); + io_futex_cache_free(ctx); io_destroy_buffers(ctx); mutex_unlock(&ctx->uring_lock); if (ctx->sq_creds) @@ -3254,6 +3258,7 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx, ret |= io_cancel_defer_files(ctx, task, cancel_all); mutex_lock(&ctx->uring_lock); ret |= io_poll_remove_all(ctx, task, cancel_all); + ret |= io_futex_remove_all(ctx, task, cancel_all); mutex_unlock(&ctx->uring_lock); ret |= io_kill_timeouts(ctx, task, cancel_all); if (task) diff --git a/io_uring/opdef.c b/io_uring/opdef.c index 3b9c6489b8b6..c9f23c21a031 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -33,6 +33,7 @@ #include "poll.h" #include "cancel.h" #include "rw.h" +#include "futex.h" static int io_no_issue(struct io_kiocb 
*req, unsigned int issue_flags) { @@ -426,11 +427,26 @@ const struct io_issue_def io_issue_defs[] = { .issue = io_sendmsg_zc, #else .prep = io_eopnotsupp_prep, +#endif + }, + [IORING_OP_FUTEX_WAIT] = { +#if defined(CONFIG_FUTEX) + .prep = io_futex_prep, + .issue = io_futex_wait, +#else + .prep = io_eopnotsupp_prep, +#endif + }, + [IORING_OP_FUTEX_WAKE] = { +#if defined(CONFIG_FUTEX) + .prep = io_futex_prep, + .issue = io_futex_wake, +#else + .prep = io_eopnotsupp_prep, #endif }, }; - const struct io_cold_def io_cold_defs[] = { [IORING_OP_NOP] = { .name = "NOP", @@ -648,6 +664,12 @@ const struct io_cold_def io_cold_defs[] = { .fail = io_sendrecv_fail, #endif }, + [IORING_OP_FUTEX_WAIT] = { + .name = "FUTEX_WAIT", + }, + [IORING_OP_FUTEX_WAKE] = { + .name = "FUTEX_WAKE", + }, }; const char *io_uring_get_opcode(u8 opcode) From patchwork Wed Jul 12 00:47:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13309485 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 517D2C001DE for ; Wed, 12 Jul 2023 00:47:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231349AbjGLArY (ORCPT ); Tue, 11 Jul 2023 20:47:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229512AbjGLArV (ORCPT ); Tue, 11 Jul 2023 20:47:21 -0400 Received: from mail-pl1-x635.google.com (mail-pl1-x635.google.com [IPv6:2607:f8b0:4864:20::635]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 41EE41733 for ; Tue, 11 Jul 2023 17:47:18 -0700 (PDT) Received: by mail-pl1-x635.google.com with SMTP id d9443c01a7336-1b8c364ad3bso11362415ad.1 for ; Tue, 11 Jul 2023 17:47:18 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1689122837; x=1691714837; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=USpRQzApdKpSjiyBkjEmkDnIwRqj5vH3splpyWQ1un8=; b=FpZRUoq0aBYl9CAG+w+52gtPelgfB27iLDuiaLxgUVSRo0wfLIQu3VEnoXzkS21Xdq m2jQmSMLiebVrVI9WO6ahU0wvGJn/wLM2FxTrd9cUM9WY4c+wcnNp67maehJN6ewabhh dVK3RGPyvWiLbGnt4Qo07jrG56bi2emgJqA1ILLMMFhyvloYjZbVc+3Tsh0eEojhdfSx Xzby3UClIYLVNsK9oAJUNiNmY2DF9sC0AQdDgP1zbgR+eX1Dy8ibt4t9ZPK+1jt4zUJS dDCtTrGBlpkFiOwIMzYzUn/KTkuI44aU08Q8pOq37fL0TdJ7sMpjmm2L1cMFu309BpEl Dl9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689122837; x=1691714837; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=USpRQzApdKpSjiyBkjEmkDnIwRqj5vH3splpyWQ1un8=; b=FebiGcTGcQ6YSXShdsotOTokRUWLTrDfVYgC73zfBqR/rWoHJTqhzf0/LXRa2krrNz UmBidaDPHQjD1uFrI/fOcdFwVhvE9PeEtpbs4jvm1OqUtGujgqmGm8joH6sHTxGth2Ze 9CuqOUZqANZw28k0pdwdRuBwAjg4YqtbgEXBDH6RLf7svG2tiHWUZHmb8QVRwPZq2pPB 2JgYheRaBoNNQSIMfavT0pvA/xWuAijuNRohBB0kpoS+z6XxKP1ek55teCBXEGmk6TND FSxZdZ7sDrQh4Zq7tV+QbjDxiqYL/UFYEDnNV0SkHcnSsgdVVsD5u1B4hTQ+d0tluwe7 L9hA== X-Gm-Message-State: ABy/qLbViuHy8VQrD8X8t7mAJTmN1UXXci3CLFrURLpuq22/Kica680R xPeWSX703n4xcgvg27eUpdGbx+7eKMNTt9E4/1s= X-Google-Smtp-Source: APBJJlGcQTb4zl90ols3sb/WEm7O0/Lzfmu6C2MX3GYDEZjK7TfuE80XBAJnU+feK9BQZF1fr1oReQ== X-Received: by 2002:a17:903:244e:b0:1b8:b4f6:1327 with SMTP id l14-20020a170903244e00b001b8b4f61327mr21565526pls.6.1689122837255; Tue, 11 Jul 2023 17:47:17 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id s8-20020a170902b18800b001b694140d96sm2543542plr.170.2023.07.11.17.47.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jul 2023 17:47:16 
From: Jens Axboe
To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe
Subject: [PATCH 4/7] futex: add wake_data to struct futex_q
Date: Tue, 11 Jul 2023 18:47:02 -0600
Message-Id: <20230712004705.316157-5-axboe@kernel.dk>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk>
References: <20230712004705.316157-1-axboe@kernel.dk>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

When handling multiple futex_q for waitv, we cannot easily go from a
futex_q to the data related to that request or queue. Add a wake_data
member to struct futex_q that belongs to the assigned wake handler.

Signed-off-by: Jens Axboe
---
 kernel/futex/futex.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index 8eaf1a5ce967..75dec2ec7469 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -102,6 +102,7 @@ struct futex_q {
 	struct task_struct *task;
 	spinlock_t *lock_ptr;
 	futex_wake_fn *wake;
+	void *wake_data;
 	union futex_key key;
 	struct futex_pi_state *pi_state;
 	struct rt_mutex_waiter *rt_waiter;

From patchwork Wed Jul 12 00:47:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13309486
From: Jens Axboe
To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe
Subject: [PATCH 5/7] futex: make futex_parse_waitv() available as a helper
Date: Tue, 11 Jul 2023 18:47:03 -0600
Message-Id: <20230712004705.316157-6-axboe@kernel.dk>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk>
References: <20230712004705.316157-1-axboe@kernel.dk>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

To make futex_parse_waitv() more generically useful, allow the caller to
pass in the wake handler and wake data. Convert the futex_waitv()
syscall, passing in the default handlers.

Since we now provide a way to pass in a wake handler and data, ensure we
use __futex_queue() to avoid having futex_queue() overwrite our wait
data.
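
For illustration only, the handler-plus-data pattern described above can be sketched in plain userspace C. This is not kernel code and not part of the patch: demo_q, demo_wake_fn, demo_req, demo_parse(), demo_wake_all() and demo_run() are hypothetical names standing in for struct futex_q, futex_wake_fn, an owning request, the parse step and the wake path. The only idea carried over is that the parse step stamps every queue entry with the caller's handler and an opaque pointer, so the handler can find its way back to whatever owns the entry.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct futex_q: each entry carries its handler. */
struct demo_q;
typedef void demo_wake_fn(struct demo_q *q);

struct demo_q {
	demo_wake_fn *wake;	/* called when this entry is woken */
	void *wake_data;	/* opaque pointer owned by the handler */
};

/* Hypothetical owner of a vector of entries, e.g. one async request. */
struct demo_req {
	int wakeups;
};

/* Stamp every entry with the caller's handler and data (the parse step). */
static void demo_parse(struct demo_q *qs, size_t nr, demo_wake_fn *wake,
		       void *wake_data)
{
	for (size_t i = 0; i < nr; i++) {
		qs[i].wake = wake;
		qs[i].wake_data = wake_data;
	}
}

/* A handler that uses wake_data to find the request owning the entry. */
static void demo_req_wake(struct demo_q *q)
{
	struct demo_req *req = q->wake_data;

	req->wakeups++;
}

/* Wake each entry through its own handler, whatever that handler is. */
static void demo_wake_all(struct demo_q *qs, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		qs[i].wake(&qs[i]);
}

/* Set up three entries for one request, wake them all, report the count. */
static int demo_run(void)
{
	struct demo_req req = { .wakeups = 0 };
	struct demo_q qs[3];

	demo_parse(qs, 3, demo_req_wake, &req);
	demo_wake_all(qs, 3);
	return req.wakeups;
}
```

In the series itself, the futex_waitv() syscall passes futex_wake_mark with NULL data, while the io_uring side in patch 7/7 passes io_futex_wakev_fn with the request as data.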

Signed-off-by: Jens Axboe
---
 kernel/futex/futex.h    |  5 +++++
 kernel/futex/syscalls.c | 14 ++++++++++----
 kernel/futex/waitwake.c |  3 ++-
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index 75dec2ec7469..ed5a7ccd2e99 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -284,6 +284,11 @@ struct futex_vector {
 	struct futex_q q;
 };

+extern int futex_parse_waitv(struct futex_vector *futexv,
+			     struct futex_waitv __user *uwaitv,
+			     unsigned int nr_futexes, futex_wake_fn *wake,
+			     void *wake_data);
+
 extern int futex_wait_multiple(struct futex_vector *vs, unsigned int count,
 			       struct hrtimer_sleeper *to);

diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index 75ca8c41cc94..8ac70bfb89fc 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -184,12 +184,15 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
  * @futexv: Kernel side list of waiters to be filled
  * @uwaitv: Userspace list to be parsed
  * @nr_futexes: Length of futexv
+ * @wake: Wake to call when futex is woken
+ * @wake_data: Data for the wake handler
  *
  * Return: Error code on failure, 0 on success
  */
-static int futex_parse_waitv(struct futex_vector *futexv,
-			     struct futex_waitv __user *uwaitv,
-			     unsigned int nr_futexes)
+int futex_parse_waitv(struct futex_vector *futexv,
+		      struct futex_waitv __user *uwaitv,
+		      unsigned int nr_futexes, futex_wake_fn *wake,
+		      void *wake_data)
 {
 	struct futex_waitv aux;
 	unsigned int i;
@@ -208,6 +211,8 @@ static int futex_parse_waitv(struct futex_vector *futexv,
 		futexv[i].w.val = aux.val;
 		futexv[i].w.uaddr = aux.uaddr;
 		futexv[i].q = futex_q_init;
+		futexv[i].q.wake = wake;
+		futexv[i].q.wake_data = wake_data;
 	}

 	return 0;
@@ -284,7 +289,8 @@ SYSCALL_DEFINE5(futex_waitv, struct futex_waitv __user *, waiters,
 		goto destroy_timer;
 	}

-	ret = futex_parse_waitv(futexv, waiters, nr_futexes);
+	ret = futex_parse_waitv(futexv, waiters, nr_futexes, futex_wake_mark,
+				NULL);
 	if (!ret)
 		ret = futex_wait_multiple(futexv, nr_futexes, timeout ? &to : NULL);

diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c
index 3471af87cb7d..dfd02ca5ecfa 100644
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -446,7 +446,8 @@ static int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *wo
 			 * next futex. Queue each futex at this moment so hb can
 			 * be unlocked.
 			 */
-			futex_queue(q, hb);
+			__futex_queue(q, hb);
+			spin_unlock(&hb->lock);
 			continue;
 		}

From patchwork Wed Jul 12 00:47:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13309487
From: Jens Axboe
To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe
Subject: [PATCH 6/7] futex: make the vectored futex operations available
Date: Tue, 11 Jul 2023 18:47:04 -0600
Message-Id: <20230712004705.316157-7-axboe@kernel.dk>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk>
References: <20230712004705.316157-1-axboe@kernel.dk>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Rename unqueue_multiple() as futex_unqueue_multiple(), and make both
that and futex_wait_multiple_setup() available for external users. This
is in preparation for wiring up vectored waits in io_uring.

Signed-off-by: Jens Axboe
---
 kernel/futex/futex.h    |  5 +++++
 kernel/futex/waitwake.c | 10 +++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index ed5a7ccd2e99..b06e23c4900e 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -289,6 +289,11 @@ extern int futex_parse_waitv(struct futex_vector *futexv,
 			     unsigned int nr_futexes, futex_wake_fn *wake,
 			     void *wake_data);

+extern int futex_wait_multiple_setup(struct futex_vector *vs, int count,
+				     int *woken);
+
+extern int futex_unqueue_multiple(struct futex_vector *v, int count);
+
 extern int futex_wait_multiple(struct futex_vector *vs, unsigned int count,
 			       struct hrtimer_sleeper *to);

diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c
index dfd02ca5ecfa..b2b762acc997 100644
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -358,7 +358,7 @@ void futex_wait_queue(struct futex_hash_bucket *hb, struct futex_q *q,
 }

 /**
- * unqueue_multiple - Remove various futexes from their hash bucket
+ * futex_unqueue_multiple - Remove various futexes from their hash bucket
  * @v: The list of futexes to unqueue
  * @count: Number of futexes in the list
  *
@@ -368,7 +368,7 @@ void futex_wait_queue(struct futex_hash_bucket *hb, struct futex_q *q,
  * - >=0 - Index of the last futex that was awoken;
  * - -1 - No futex was awoken
  */
-static int unqueue_multiple(struct futex_vector *v, int count)
+int futex_unqueue_multiple(struct futex_vector *v, int count)
 {
 	int ret = -1, i;

@@ -396,7 +396,7 @@ static int unqueue_multiple(struct
futex_vector *v, int count)
  * - 0 - Success
  * - <0 - -EFAULT, -EWOULDBLOCK or -EINVAL
  */
-static int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *woken)
+int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *woken)
 {
 	struct futex_hash_bucket *hb;
 	bool retry = false;
@@ -459,7 +459,7 @@ static int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *wo
 		 * was woken, we don't return error and return this index to
 		 * userspace
 		 */
-		*woken = unqueue_multiple(vs, i);
+		*woken = futex_unqueue_multiple(vs, i);
 		if (*woken >= 0)
 			return 1;

@@ -544,7 +544,7 @@ int futex_wait_multiple(struct futex_vector *vs, unsigned int count,

 		__set_current_state(TASK_RUNNING);

-		ret = unqueue_multiple(vs, count);
+		ret = futex_unqueue_multiple(vs, count);
 		if (ret >= 0)
 			return ret;

From patchwork Wed Jul 12 00:47:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13309488
From: Jens
Axboe
To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe
Subject: [PATCH 7/7] io_uring: add futex waitv
Date: Tue, 11 Jul 2023 18:47:05 -0600
Message-Id: <20230712004705.316157-8-axboe@kernel.dk>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk>
References: <20230712004705.316157-1-axboe@kernel.dk>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

WIP. Needs a bit of splitting, and a few hunks should go further back in
the series (like the wake handler typedef).

Adds IORING_OP_FUTEX_WAITV: pass in an array of struct futex_waitv, and
wait on all of them until one of them triggers.

Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h |   1 +
 io_uring/futex.c              | 165 +++++++++++++++++++++++++++++++---
 io_uring/futex.h              |   2 +
 io_uring/opdef.c              |  11 +++
 4 files changed, 169 insertions(+), 10 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 3bd2d765f593..420f38675769 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -238,6 +238,7 @@ enum io_uring_op {
 	IORING_OP_SENDMSG_ZC,
 	IORING_OP_FUTEX_WAIT,
 	IORING_OP_FUTEX_WAKE,
+	IORING_OP_FUTEX_WAITV,

 	/* this goes last, obviously */
 	IORING_OP_LAST,
diff --git a/io_uring/futex.c b/io_uring/futex.c
index ff0f6b394756..b22120545d31 100644
--- a/io_uring/futex.c
+++ b/io_uring/futex.c
@@ -14,11 +14,16 @@

 struct io_futex {
 	struct file	*file;
-	u32 __user	*uaddr;
+	union {
+		u32 __user			*uaddr;
+		struct futex_waitv __user	*uwaitv;
+	};
 	int		futex_op;
 	unsigned int	futex_val;
 	unsigned int	futex_flags;
 	unsigned int	futex_mask;
+	unsigned int	futex_nr;
+	unsigned long	futexv_owned;
 };

 struct io_futex_data {
@@ -45,6 +50,13 @@ void io_futex_cache_free(struct io_ring_ctx *ctx)
 	io_alloc_cache_free(&ctx->futex_cache, io_futex_cache_entry_free);
 }

+static void __io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts)
+{
+	req->async_data = NULL;
+	hlist_del_init(&req->hash_node);
+	io_req_task_complete(req, ts);
+}
+
 static void io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts)
 {
 	struct io_futex_data *ifd = req->async_data;
@@ -53,22 +65,59 @@ static void io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts)
 	io_tw_lock(ctx, ts);
 	if (!io_alloc_cache_put(&ctx->futex_cache, &ifd->cache))
 		kfree(ifd);
-	req->async_data = NULL;
-	hlist_del_init(&req->hash_node);
-	io_req_task_complete(req, ts);
+	__io_futex_complete(req, ts);
 }

-static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *req)
+static void io_futexv_complete(struct io_kiocb *req, struct io_tw_state *ts)
 {
-	struct io_futex_data *ifd = req->async_data;
+	struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex);
+	struct futex_vector *futexv = req->async_data;
+	struct io_ring_ctx *ctx = req->ctx;
+	int res = 0;

-	/* futex wake already done or in progress */
-	if (!futex_unqueue(&ifd->q))
+	io_tw_lock(ctx, ts);
+
+	res = futex_unqueue_multiple(futexv, iof->futex_nr);
+	if (res != -1)
+		io_req_set_res(req, res, 0);
+
+	kfree(req->async_data);
+	req->flags &= ~REQ_F_ASYNC_DATA;
+	__io_futex_complete(req, ts);
+}
+
+static bool io_futexv_claimed(struct io_futex *iof)
+{
+	return test_bit(0, &iof->futexv_owned);
+}
+
+static bool io_futexv_claim(struct io_futex *iof)
+{
+	if (test_bit(0, &iof->futexv_owned) ||
+	    test_and_set_bit(0, &iof->futexv_owned))
 		return false;
+	return true;
+}
+
+static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *req)
+{
+	/* futex wake already done or in progress */
+	if (req->opcode == IORING_OP_FUTEX_WAIT) {
+		struct io_futex_data *ifd = req->async_data;
+
+		if (!futex_unqueue(&ifd->q))
+			return false;
+		req->io_task_work.func = io_futex_complete;
+	} else {
+		struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex);
+
+		if (!io_futexv_claim(iof))
+			return false;
+		req->io_task_work.func = io_futexv_complete;
+	}
 	hlist_del_init(&req->hash_node);
 	io_req_set_res(req, -ECANCELED, 0);
-	req->io_task_work.func = io_futex_complete;
 	io_req_task_work_add(req);
 	return true;
 }
@@ -124,7 +173,7 @@ int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex);

-	if (unlikely(sqe->addr2 || sqe->buf_index || sqe->addr3))
+	if (unlikely(sqe->buf_index || sqe->addr3))
 		return -EINVAL;

 	iof->futex_op = READ_ONCE(sqe->fd);
@@ -135,6 +184,53 @@ int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	if (iof->futex_flags & FUTEX_CMD_MASK)
 		return -EINVAL;

+	iof->futexv_owned = 0;
+	return 0;
+}
+
+static void io_futex_wakev_fn(struct wake_q_head *wake_q, struct futex_q *q)
+{
+	struct io_kiocb *req = q->wake_data;
+	struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex);
+
+	if (!io_futexv_claim(iof))
+		return;
+
+	__futex_unqueue(q);
+	smp_store_release(&q->lock_ptr, NULL);
+
+	io_req_set_res(req, 0, 0);
+	req->io_task_work.func = io_futexv_complete;
+	io_req_task_work_add(req);
+}
+
+int io_futexv_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex);
+	struct futex_vector *futexv;
+	int ret;
+
+	ret = io_futex_prep(req, sqe);
+	if (ret)
+		return ret;
+
+	iof->futex_nr = READ_ONCE(sqe->off);
+	if (!iof->futex_nr || iof->futex_nr > FUTEX_WAITV_MAX)
+		return -EINVAL;
+
+	futexv = kcalloc(iof->futex_nr, sizeof(*futexv), GFP_KERNEL);
+	if (!futexv)
+		return -ENOMEM;
+
+	ret = futex_parse_waitv(futexv, iof->uwaitv, iof->futex_nr,
+				io_futex_wakev_fn, req);
+	if (ret) {
+		kfree(futexv);
+		return ret;
+	}
+
+	req->flags |= REQ_F_ASYNC_DATA;
+	req->async_data = futexv;
 	return 0;
 }

@@ -162,6 +258,55 @@ static struct io_futex_data *io_alloc_ifd(struct io_ring_ctx *ctx)
 	return kmalloc(sizeof(struct io_futex_data), GFP_NOWAIT);
 }

+int io_futex_waitv(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_futex *iof =
io_kiocb_to_cmd(req, struct io_futex);
+	struct futex_vector *futexv = req->async_data;
+	struct io_ring_ctx *ctx = req->ctx;
+	int ret, woken = -1;
+
+	io_ring_submit_lock(ctx, issue_flags);
+
+	ret = futex_wait_multiple_setup(futexv, iof->futex_nr, &woken);
+
+	/*
+	 * The above call leaves us potentially non-running. This is fine
+	 * for the sync syscall as it'll be blocking unless we already got
+	 * one of the futexes woken, but it obviously won't work for an async
+	 * invocation. Mark us runnable again.
+	 */
+	__set_current_state(TASK_RUNNING);
+
+	/*
+	 * We got woken while setting up, let that side do the completion
+	 */
+	if (io_futexv_claimed(iof)) {
+skip:
+		io_ring_submit_unlock(ctx, issue_flags);
+		return IOU_ISSUE_SKIP_COMPLETE;
+	}
+
+	/*
+	 * 0 return means that we successfully set up the waiters, and that
+	 * nobody triggered a wakeup while we were doing so. < 0 or 1 return
+	 * is either an error or we got a wakeup while setting up.
+	 */
+	if (!ret) {
+		hlist_add_head(&req->hash_node, &ctx->futex_list);
+		goto skip;
+	}
+
+	io_ring_submit_unlock(ctx, issue_flags);
+	if (ret < 0)
+		req_set_fail(req);
+	else if (woken != -1)
+		ret = woken;
+	io_req_set_res(req, ret, 0);
+	kfree(futexv);
+	req->flags &= ~REQ_F_ASYNC_DATA;
+	return IOU_OK;
+}
+
 int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_futex *iof = io_kiocb_to_cmd(req, struct io_futex);
diff --git a/io_uring/futex.h b/io_uring/futex.h
index ddc9e0d73c52..7828e27e4184 100644
--- a/io_uring/futex.h
+++ b/io_uring/futex.h
@@ -3,7 +3,9 @@
 #include "cancel.h"

 int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_futexv_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags);
+int io_futex_waitv(struct io_kiocb *req, unsigned int issue_flags);
 int io_futex_wake(struct io_kiocb *req, unsigned int issue_flags);

 #if defined(CONFIG_FUTEX)
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index c9f23c21a031..2034acfe10d0 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -443,6 +443,14 @@ const struct io_issue_def io_issue_defs[] = {
 		.issue			= io_futex_wake,
 #else
 		.prep			= io_eopnotsupp_prep,
+#endif
+	},
+	[IORING_OP_FUTEX_WAITV] = {
+#if defined(CONFIG_FUTEX)
+		.prep			= io_futexv_prep,
+		.issue			= io_futex_waitv,
+#else
+		.prep			= io_eopnotsupp_prep,
 #endif
 	},
 };
@@ -670,6 +678,9 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_FUTEX_WAKE] = {
 		.name			= "FUTEX_WAKE",
 	},
+	[IORING_OP_FUTEX_WAITV] = {
+		.name			= "FUTEX_WAITV",
+	},
 };

 const char *io_uring_get_opcode(u8 opcode)