From patchwork Fri Oct 25 09:03:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 13850340
Message-Id: <20241025093944.372391936@infradead.org>
User-Agent: quilt/0.65
Date: Fri, 25 Oct 2024 11:03:48 +0200
From: Peter Zijlstra
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, cl@linux.com,
 llong@redhat.com, Christoph Hellwig
Subject: [PATCH 1/6] mm: Add vmalloc_huge_node()
References: <20241025090347.244183920@infradead.org>

To enable node specific hash-tables.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Christoph Hellwig
Reviewed-by: Uladzislau Rezki (Sony)
Reviewed-by: Davidlohr Bueso
---
 include/linux/vmalloc.h |    3 +++
 mm/vmalloc.c            |    7 +++++++
 2 files changed, 10 insertions(+)

--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -177,6 +177,9 @@ void *__vmalloc_node_noprof(unsigned lon
 void *vmalloc_huge_noprof(unsigned long size, gfp_t gfp_mask) __alloc_size(1);
 #define vmalloc_huge(...)	alloc_hooks(vmalloc_huge_noprof(__VA_ARGS__))
 
+void *vmalloc_huge_node_noprof(unsigned long size, gfp_t gfp_mask, int node) __alloc_size(1);
+#define vmalloc_huge_node(...)	alloc_hooks(vmalloc_huge_node_noprof(__VA_ARGS__))
+
 extern void *__vmalloc_array_noprof(size_t n, size_t size, gfp_t flags) __alloc_size(1, 2);
 #define __vmalloc_array(...)	alloc_hooks(__vmalloc_array_noprof(__VA_ARGS__))
 
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3948,6 +3948,13 @@ void *vmalloc_huge_noprof(unsigned long
 }
 EXPORT_SYMBOL_GPL(vmalloc_huge_noprof);
 
+void *vmalloc_huge_node_noprof(unsigned long size, gfp_t gfp_mask, int node)
+{
+	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
+				    gfp_mask, PAGE_KERNEL, VM_ALLOW_HUGE_VMAP,
+				    node, __builtin_return_address(0));
+}
+
 /**
  * vzalloc - allocate virtually contiguous memory with zero fill
  * @size:	allocation size

From patchwork Fri Oct 25 09:03:49 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 13850342
Message-Id: <20241025093944.485691531@infradead.org>
User-Agent: quilt/0.65
Date: Fri, 25 Oct 2024 11:03:49 +0200
From: Peter Zijlstra
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, cl@linux.com,
 llong@redhat.com
Subject: [PATCH 2/6] futex: Implement FUTEX2_NUMA
References: <20241025090347.244183920@infradead.org>

Extend the futex2 interface to be NUMA aware.

When FUTEX2_NUMA is specified for a futex, the user value is extended
to two words (of the same size). The first is the user value we all
know, the second one will be the node to place this futex on.

  struct futex_numa_32 {
	u32 val;
	u32 node;
  };

When node is set to ~0, WAIT will set it to the current node_id such
that WAKE knows where to find it. If userspace corrupts the node value
between WAIT and WAKE, the futex will not be found and no wakeup will
happen.

When FUTEX2_NUMA is not set, the node is simply an extension of the
hash, such that traditional futexes are still interleaved over the
nodes.

This is done to avoid having to have a separate !numa hash-table.

Signed-off-by: Peter Zijlstra (Intel)
---
 include/linux/futex.h      |    3 +
 include/uapi/linux/futex.h |    8 ++
 kernel/futex/core.c        |  128 ++++++++++++++++++++++++++++++++++++---------
 kernel/futex/futex.h       |   17 +++++
 4 files changed, 131 insertions(+), 25 deletions(-)

--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -34,6 +34,7 @@ union futex_key {
 		u64 i_seq;
 		unsigned long pgoff;
 		unsigned int offset;
+		/* unsigned int node; */
 	} shared;
 	struct {
 		union {
@@ -42,11 +43,13 @@ union futex_key {
 		};
 		unsigned long address;
 		unsigned int offset;
+		/* unsigned int node; */
 	} private;
 	struct {
 		u64 ptr;
 		unsigned long word;
 		unsigned int offset;
+		unsigned int node;	/* NOT hashed! */
 	} both;
 };
 
--- a/include/uapi/linux/futex.h
+++ b/include/uapi/linux/futex.h
@@ -74,6 +74,14 @@
 /* do not use */
 #define FUTEX_32	FUTEX2_SIZE_U32 /* historical accident :-( */
 
+
+/*
+ * When FUTEX2_NUMA doubles the futex word, the second word is a node value.
+ * The special value -1 indicates no-node. This is the same value as
+ * NUMA_NO_NODE, except that value is not ABI, this is.
+ */
+#define FUTEX_NO_NODE	(-1)
+
 /*
  * Max numbers of elements in a futex_waitv array
  */
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -36,7 +36,8 @@
 #include
 #include
 #include
-#include
+#include
+#include
 #include
 #include
 
@@ -49,12 +50,14 @@
  * reside in the same cacheline.
  */
 static struct {
-	struct futex_hash_bucket *queues;
 	unsigned long hashsize;
+	unsigned int hashshift;
+	struct futex_hash_bucket *queues[MAX_NUMNODES];
 } __futex_data __read_mostly __aligned(2*sizeof(long));
-#define futex_queues (__futex_data.queues)
-#define futex_hashsize (__futex_data.hashsize)
+#define futex_hashsize	(__futex_data.hashsize)
+#define futex_hashshift	(__futex_data.hashshift)
+#define futex_queues	(__futex_data.queues)
 
 
 /*
  * Fault injections for futexes.
@@ -107,6 +110,26 @@ late_initcall(fail_futex_debugfs);
 
 #endif /* CONFIG_FAIL_FUTEX */
 
+static int futex_get_value(u32 *val, u32 __user *from, unsigned int flags)
+{
+	switch (futex_size(flags)) {
+	case 1: return __get_user(*val, (u8 __user *)from);
+	case 2: return __get_user(*val, (u16 __user *)from);
+	case 4: return __get_user(*val, (u32 __user *)from);
+	default: BUG();
+	}
+}
+
+static int futex_put_value(u32 val, u32 __user *to, unsigned int flags)
+{
+	switch (futex_size(flags)) {
+	case 1: return __put_user(val, (u8 __user *)to);
+	case 2: return __put_user(val, (u16 __user *)to);
+	case 4: return __put_user(val, (u32 __user *)to);
+	default: BUG();
+	}
+}
+
 /**
  * futex_hash - Return the hash bucket in the global hash
  * @key:	Pointer to the futex key for which the hash is calculated
@@ -116,10 +139,29 @@ late_initcall(fail_futex_debugfs);
  */
 struct futex_hash_bucket *futex_hash(union futex_key *key)
 {
-	u32 hash = jhash2((u32 *)key, offsetof(typeof(*key), both.offset) / 4,
+	u32 hash = jhash2((u32 *)key,
+			  offsetof(typeof(*key), both.offset) / sizeof(u32),
 			  key->both.offset);
+	int node = key->both.node;
 
-	return &futex_queues[hash & (futex_hashsize - 1)];
+	if (node == FUTEX_NO_NODE) {
+		/*
+		 * In case of !FLAGS_NUMA, use some unused hash bits to pick a
+		 * node -- this ensures regular futexes are interleaved across
+		 * the nodes and avoids having to allocate multiple
+		 * hash-tables.
+		 *
+		 * NOTE: this isn't perfectly uniform, but it is fast and
+		 * handles sparse node masks.
+		 */
+		node = (hash >> futex_hashshift) % nr_node_ids;
+		if (!node_possible(node)) {
+			node = find_next_bit_wrap(node_possible_map.bits,
+						  nr_node_ids, node);
+		}
+	}
+
+	return &futex_queues[node][hash & (futex_hashsize - 1)];
 }
 
 
@@ -219,7 +261,7 @@ static u64 get_inode_sequence_number(str
  *
  * lock_page() might sleep, the caller should not hold a spinlock.
  */
-int get_futex_key(u32 __user *uaddr, unsigned int flags, union futex_key *key,
+int get_futex_key(void __user *uaddr, unsigned int flags, union futex_key *key,
 		  enum futex_access rw)
 {
 	unsigned long address = (unsigned long)uaddr;
@@ -227,25 +269,49 @@ int get_futex_key(u32 __user *uaddr, uns
 	struct page *page;
 	struct folio *folio;
 	struct address_space *mapping;
-	int err, ro = 0;
+	int node, err, size, ro = 0;
 	bool fshared;
 
 	fshared = flags & FLAGS_SHARED;
+	size = futex_size(flags);
+	if (flags & FLAGS_NUMA)
+		size *= 2;
 
 	/*
 	 * The futex address must be "naturally" aligned.
 	 */
 	key->both.offset = address % PAGE_SIZE;
-	if (unlikely((address % sizeof(u32)) != 0))
+	if (unlikely((address % size) != 0))
 		return -EINVAL;
 	address -= key->both.offset;
 
-	if (unlikely(!access_ok(uaddr, sizeof(u32))))
+	if (unlikely(!access_ok(uaddr, size)))
 		return -EFAULT;
 
 	if (unlikely(should_fail_futex(fshared)))
 		return -EFAULT;
 
+	if (flags & FLAGS_NUMA) {
+		void __user *naddr = uaddr + size / 2;
+
+		if (futex_get_value(&node, naddr, flags))
+			return -EFAULT;
+
+		if (node == FUTEX_NO_NODE) {
+			node = numa_node_id();
+			if (futex_put_value(node, naddr, flags))
+				return -EFAULT;
+
+		} else if (node >= MAX_NUMNODES || !node_possible(node)) {
+			return -EINVAL;
+		}
+
+		key->both.node = node;
+
+	} else {
+		key->both.node = FUTEX_NO_NODE;
+	}
+
 	/*
 	 * PROCESS_PRIVATE futexes are fast.
 	 * As the mm cannot disappear under us and the 'key' only needs
@@ -1148,26 +1214,42 @@ void futex_exit_release(struct task_stru
 
 static int __init futex_init(void)
 {
-	unsigned int futex_shift;
-	unsigned long i;
+	unsigned int order, n;
+	unsigned long size, i;
 
 #ifdef CONFIG_BASE_SMALL
 	futex_hashsize = 16;
 #else
-	futex_hashsize = roundup_pow_of_two(256 * num_possible_cpus());
+	futex_hashsize = 256 * num_possible_cpus();
+	futex_hashsize /= num_possible_nodes();
+	futex_hashsize = roundup_pow_of_two(futex_hashsize);
 #endif
+	futex_hashshift = ilog2(futex_hashsize);
+	size = sizeof(struct futex_hash_bucket) * futex_hashsize;
+	order = get_order(size);
+
+	for_each_node(n) {
+		struct futex_hash_bucket *table;
+
+		if (order > MAX_ORDER)
+			table = vmalloc_huge_node(size, GFP_KERNEL, n);
+		else
+			table = alloc_pages_exact_nid(n, size, GFP_KERNEL);
+
+		BUG_ON(!table);
+
+		for (i = 0; i < futex_hashsize; i++) {
+			atomic_set(&table[i].waiters, 0);
+			spin_lock_init(&table[i].lock);
+			plist_head_init(&table[i].chain);
+		}
 
-	futex_queues = alloc_large_system_hash("futex", sizeof(*futex_queues),
-					       futex_hashsize, 0, 0,
-					       &futex_shift, NULL,
-					       futex_hashsize, futex_hashsize);
-	futex_hashsize = 1UL << futex_shift;
-
-	for (i = 0; i < futex_hashsize; i++) {
-		atomic_set(&futex_queues[i].waiters, 0);
-		plist_head_init(&futex_queues[i].chain);
-		spin_lock_init(&futex_queues[i].lock);
+		futex_queues[n] = table;
 	}
 
+	pr_info("futex hash table, %d nodes, %ld entries (order: %d, %lu bytes)\n",
+		num_possible_nodes(),
+		futex_hashsize, order,
+		sizeof(struct futex_hash_bucket) * futex_hashsize);
 	return 0;
 }
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -52,7 +52,7 @@ static inline unsigned int futex_to_flag
 	return flags;
 }
 
-#define FUTEX2_VALID_MASK (FUTEX2_SIZE_MASK | FUTEX2_PRIVATE)
+#define FUTEX2_VALID_MASK (FUTEX2_SIZE_MASK | FUTEX2_NUMA | FUTEX2_PRIVATE)
 
 /* FUTEX2_ to FLAGS_ */
 static inline unsigned int futex2_to_flags(unsigned int flags2)
@@ -85,6 +85,19 @@ static inline bool futex_flags_valid(uns
 	if ((flags & FLAGS_SIZE_MASK) != FLAGS_SIZE_32)
 		return false;
 
+	/*
+	 * Must be able to represent both FUTEX_NO_NODE and every valid nodeid
+	 * in a futex word.
+	 */
+	if (flags & FLAGS_NUMA) {
+		int bits = 8 * futex_size(flags);
+		u64 max = ~0ULL;
+
+		max >>= 64 - bits;
+		if (nr_node_ids >= max)
+			return false;
+	}
+
 	return true;
 }
 
@@ -193,7 +206,7 @@ enum futex_access {
 	FUTEX_WRITE
 };
 
-extern int get_futex_key(u32 __user *uaddr, unsigned int flags, union futex_key *key,
+extern int get_futex_key(void __user *uaddr, unsigned int flags, union futex_key *key,
 			 enum futex_access rw);
 
 extern struct hrtimer_sleeper *

From patchwork Fri Oct 25 09:03:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 13850343
Message-Id: <20241025093944.598921704@infradead.org>
User-Agent: quilt/0.65
Date: Fri, 25 Oct 2024 11:03:50 +0200
From: Peter Zijlstra
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, cl@linux.com,
 llong@redhat.com
Subject: [PATCH 3/6] futex: Propagate flags into futex_get_value_locked()
References: <20241025090347.244183920@infradead.org>

In order to facilitate variable-sized futexes, propagate the flags into
futex_get_value_locked().

No functional change intended.

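As a rough user-space illustration of the size-dependent load that
carrying the flags makes possible, here is a minimal sketch. It is not
part of this patch: load_sized() and its 'size' parameter are
hypothetical stand-ins for futex_get_value() and futex_size(flags), and
memcpy() stands in for __get_user().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space analogue of a size-aware futex load.
 * 'size' plays the role of futex_size(flags); memcpy() plays the
 * role of __get_user(). Returns 0 on success and -1 for a size
 * that is not 1, 2 or 4 bytes.
 */
static int load_sized(uint32_t *val, const void *from, unsigned int size)
{
	switch (size) {
	case 1: { uint8_t v;  memcpy(&v, from, sizeof(v)); *val = v; return 0; }
	case 2: { uint16_t v; memcpy(&v, from, sizeof(v)); *val = v; return 0; }
	case 4: { uint32_t v; memcpy(&v, from, sizeof(v)); *val = v; return 0; }
	default: return -1;	/* the kernel helper uses BUG() here instead */
	}
}
```

Every caller touched below passes FLAGS_SIZE_32, which corresponds to
the 4-byte case in this sketch, consistent with "no functional change".
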
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
---
 kernel/futex/core.c     |    4 ++--
 kernel/futex/futex.h    |    2 +-
 kernel/futex/pi.c       |    8 ++++----
 kernel/futex/requeue.c  |    4 ++--
 kernel/futex/waitwake.c |    4 ++--
 5 files changed, 11 insertions(+), 11 deletions(-)

--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -528,12 +528,12 @@ int futex_cmpxchg_value_locked(u32 *curv
 	return ret;
 }
 
-int futex_get_value_locked(u32 *dest, u32 __user *from)
+int futex_get_value_locked(u32 *dest, u32 __user *from, unsigned int flags)
 {
 	int ret;
 
 	pagefault_disable();
-	ret = __get_user(*dest, from);
+	ret = futex_get_value(dest, from, flags);
 	pagefault_enable();
 
 	return ret ? -EFAULT : 0;
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -239,7 +239,7 @@ extern void futex_wake_mark(struct wake_
 
 extern int fault_in_user_writeable(u32 __user *uaddr);
 extern int futex_cmpxchg_value_locked(u32 *curval, u32 __user *uaddr, u32 uval, u32 newval);
-extern int futex_get_value_locked(u32 *dest, u32 __user *from);
+extern int futex_get_value_locked(u32 *dest, u32 __user *from, unsigned int flags);
 extern struct futex_q *futex_top_waiter(struct futex_hash_bucket *hb, union futex_key *key);
 
 extern void __futex_unqueue(struct futex_q *q);
--- a/kernel/futex/pi.c
+++ b/kernel/futex/pi.c
@@ -240,7 +240,7 @@ static int attach_to_pi_state(u32 __user
 	 * still is what we expect it to be, otherwise retry the entire
 	 * operation.
 	 */
-	if (futex_get_value_locked(&uval2, uaddr))
+	if (futex_get_value_locked(&uval2, uaddr, FLAGS_SIZE_32))
 		goto out_efault;
 
 	if (uval != uval2)
@@ -359,7 +359,7 @@ static int handle_exit_race(u32 __user *
 	 * The same logic applies to the case where the exiting task is
 	 * already gone.
 	 */
-	if (futex_get_value_locked(&uval2, uaddr))
+	if (futex_get_value_locked(&uval2, uaddr, FLAGS_SIZE_32))
 		return -EFAULT;
 
 	/* If the user space value has changed, try again. */
@@ -527,7 +527,7 @@ int futex_lock_pi_atomic(u32 __user *uad
 	 * Read the user space value first so we can validate a few
 	 * things before proceeding further.
 	 */
-	if (futex_get_value_locked(&uval, uaddr))
+	if (futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32))
 		return -EFAULT;
 
 	if (unlikely(should_fail_futex(true)))
@@ -750,7 +750,7 @@ static int __fixup_pi_state_owner(u32 __
 	if (!pi_state->owner)
 		newtid |= FUTEX_OWNER_DIED;
 
-	err = futex_get_value_locked(&uval, uaddr);
+	err = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
 	if (err)
 		goto handle_err;
 
--- a/kernel/futex/requeue.c
+++ b/kernel/futex/requeue.c
@@ -275,7 +275,7 @@ futex_proxy_trylock_atomic(u32 __user *p
 	u32 curval;
 	int ret;
 
-	if (futex_get_value_locked(&curval, pifutex))
+	if (futex_get_value_locked(&curval, pifutex, FLAGS_SIZE_32))
 		return -EFAULT;
 
 	if (unlikely(should_fail_futex(true)))
@@ -453,7 +453,7 @@ int futex_requeue(u32 __user *uaddr1, un
 		if (likely(cmpval != NULL)) {
 			u32 curval;
 
-			ret = futex_get_value_locked(&curval, uaddr1);
+			ret = futex_get_value_locked(&curval, uaddr1, FLAGS_SIZE_32);
 
 			if (unlikely(ret)) {
 				double_unlock_hb(hb1, hb2);
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -453,7 +453,7 @@ int futex_wait_multiple_setup(struct fut
 		u32 val = vs[i].w.val;
 
 		hb = futex_q_lock(q);
-		ret = futex_get_value_locked(&uval, uaddr);
+		ret = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
 
 		if (!ret && uval == val) {
 			/*
@@ -621,7 +621,7 @@ int futex_wait_setup(u32 __user *uaddr,
 retry_private:
 	*hb = futex_q_lock(q);
 
-	ret = futex_get_value_locked(&uval, uaddr);
+	ret = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
 
 	if (ret) {
 		futex_q_unlock(*hb);

From patchwork Fri Oct 25 09:03:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 13850345
Message-Id: <20241025093944.707639534@infradead.org>
User-Agent: quilt/0.65
Date: Fri, 25 Oct 2024 11:03:51 +0200
From: Peter Zijlstra
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, cl@linux.com,
 llong@redhat.com
Subject: [PATCH 4/6] futex: Enable FUTEX2_{8,16}
References: <20241025090347.244183920@infradead.org>
MIME-Version: 1.0

When futexes are no longer u32 aligned, the lower offset bits are no
longer available to put type info
in. However, since offset is the offset within a page, there are plenty
bits available on the top end.

After that, pass flags into futex_get_value_locked() for WAIT and
disallow FUTEX2_SIZE_U64 instead of mandating FUTEX2_SIZE_U32.

Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
---
 include/linux/futex.h   |   11 ++++++-----
 kernel/futex/core.c     |    9 +++++++++
 kernel/futex/futex.h    |    4 ++--
 kernel/futex/waitwake.c |    5 +++--
 4 files changed, 20 insertions(+), 9 deletions(-)

--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -16,18 +16,19 @@ struct task_struct;
  * The key type depends on whether it's a shared or private mapping.
  * Don't rearrange members without looking at hash_futex().
  *
- * offset is aligned to a multiple of sizeof(u32) (== 4) by definition.
- * We use the two low order bits of offset to tell what is the kind of key :
+ * offset is the position within a page and is in the range [0, PAGE_SIZE).
+ * The high bits of the offset indicate what kind of key this is:
  *  00 : Private process futex (PTHREAD_PROCESS_PRIVATE)
  *       (no reference on an inode or mm)
  *  01 : Shared futex (PTHREAD_PROCESS_SHARED)
  *       mapped on a file (reference on the underlying inode)
  *  10 : Shared futex (PTHREAD_PROCESS_SHARED)
  *       (but private mapping on an mm, and reference taken on it)
-*/
+ */
 
-#define FUT_OFF_INODE    1 /* We set bit 0 if key has a reference on inode */
-#define FUT_OFF_MMSHARED 2 /* We set bit 1 if key has a reference on mm */
+#define FUT_OFF_INODE    (PAGE_SIZE << 0)
+#define FUT_OFF_MMSHARED (PAGE_SIZE << 1)
+#define FUT_OFF_SIZE     (PAGE_SIZE << 2)
 
 union futex_key {
 	struct {
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -313,6 +313,15 @@ int get_futex_key(void __user *uaddr, un
 	}
 
 	/*
+	 * Encode the futex size in the offset. This makes cross-size
+	 * wake-wait fail -- see futex_match().
+	 *
+	 * NOTE that cross-size wake-wait is fundamentally broken wrt
+	 * FLAGS_NUMA.
+	 */
+	key->both.offset |= FUT_OFF_SIZE * (flags & FLAGS_SIZE_MASK);
+
+	/*
 	 * PROCESS_PRIVATE futexes are fast.
 	 * As the mm cannot disappear under us and the 'key' only needs
 	 * virtual address, we dont even have to find the underlying vma.
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -81,8 +81,8 @@ static inline bool futex_flags_valid(uns
 			return false;
 	}
 
-	/* Only 32bit futexes are implemented -- for now */
-	if ((flags & FLAGS_SIZE_MASK) != FLAGS_SIZE_32)
+	/* 64bit futexes aren't implemented -- yet */
+	if ((flags & FLAGS_SIZE_MASK) == FLAGS_SIZE_64)
 		return false;
 
 	/*
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -449,11 +449,12 @@ int futex_wait_multiple_setup(struct fut
 
 	for (i = 0; i < count; i++) {
 		u32 __user *uaddr = (u32 __user *)(unsigned long)vs[i].w.uaddr;
+		unsigned int flags = vs[i].w.flags;
 		struct futex_q *q = &vs[i].q;
 		u32 val = vs[i].w.val;
 
 		hb = futex_q_lock(q);
-		ret = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
+		ret = futex_get_value_locked(&uval, uaddr, flags);
 
 		if (!ret && uval == val) {
 			/*
@@ -621,7 +622,7 @@ int futex_wait_setup(u32 __user *uaddr,
 retry_private:
 	*hb = futex_q_lock(q);
 
-	ret = futex_get_value_locked(&uval, uaddr, FLAGS_SIZE_32);
+	ret = futex_get_value_locked(&uval, uaddr, flags);
 
 	if (ret) {
 		futex_q_unlock(*hb);

From patchwork Fri Oct 25 09:03:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 13850341
Message-Id: <20241025093944.817031866@infradead.org>
User-Agent: quilt/0.65
Date: Fri, 25 Oct 2024 11:03:52 +0200
From: Peter Zijlstra
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, cl@linux.com,
 llong@redhat.com
Subject: [PATCH 5/6] futex,selftests: Extend the futex selftests
References: <20241025090347.244183920@infradead.org>
MIME-Version: 1.0

Extend the wait/requeue selftests to also cover the futex2 syscalls.
Signed-off-by: Peter Zijlstra (Intel)
---
 tools/testing/selftests/futex/functional/futex_requeue.c         |  100 +++++++++-
 tools/testing/selftests/futex/functional/futex_wait.c            |   56 ++++-
 tools/testing/selftests/futex/functional/futex_wait_timeout.c    |   16 +
 tools/testing/selftests/futex/functional/futex_wait_wouldblock.c |   28 ++
 tools/testing/selftests/futex/functional/futex_waitv.c           |   15 -
 tools/testing/selftests/futex/functional/run.sh                  |    6
 tools/testing/selftests/futex/include/futex2test.h               |   52 +++++
 7 files changed, 243 insertions(+), 30 deletions(-)

--- a/tools/testing/selftests/futex/functional/futex_requeue.c
+++ b/tools/testing/selftests/futex/functional/futex_requeue.c
@@ -7,8 +7,10 @@
 
 #include
 #include
+#include
 #include "logging.h"
 #include "futextest.h"
+#include "futex2test.h"
 
 #define TEST_NAME "futex-requeue"
 #define timeout_ns 30000000
@@ -16,24 +18,58 @@
 
 volatile futex_t *f1;
 
+bool futex2 = 0;
+bool mixed = 0;
+
 void usage(char *prog)
 {
 	printf("Usage: %s\n", prog);
 	printf(" -c Use color\n");
+	printf(" -n Use futex2 interface\n");
+	printf(" -x Use mixed size futex\n");
 	printf(" -h Display this help message\n");
 	printf(" -v L Verbosity level: %d=QUIET %d=CRITICAL %d=INFO\n",
 	       VQUIET, VCRITICAL, VINFO);
 }
 
-void *waiterfn(void *arg)
+static void *waiterfn(void *arg)
 {
+	unsigned int flags = 0;
 	struct timespec to;
 
-	to.tv_sec = 0;
-	to.tv_nsec = timeout_ns;
+	if (futex2) {
+		unsigned long mask;
+
+		if (clock_gettime(CLOCK_MONOTONIC, &to)) {
+			printf("clock_gettime() failed errno %d", errno);
+			return NULL;
+		}
+
+		to.tv_nsec += timeout_ns;
+		if (to.tv_nsec >= 1000000000) {
+			to.tv_sec++;
+			to.tv_nsec -= 1000000000;
+		}
+
+		if (mixed) {
+			flags |= FUTEX2_SIZE_U16;
+			mask = (unsigned short)(~0U);
+		} else {
+			flags |= FUTEX2_SIZE_U32;
+			mask = (unsigned int)(~0U);
+		}
+
+		if (futex2_wait(f1, *f1, mask, flags,
+				&to, CLOCK_MONOTONIC))
+			printf("waiter failed errno %d\n", errno);
+	} else {
+
+		to.tv_sec = 0;
+		to.tv_nsec = timeout_ns;
 
-	if (futex_wait(f1, *f1, &to, 0))
-		printf("waiter failed errno %d\n", errno);
+		if (futex_wait(f1, *f1, &to, flags))
+			printf("waiter failed errno %d\n", errno);
+	}
 
 	return NULL;
 }
@@ -48,7 +84,7 @@ int main(int argc, char *argv[])
 
 	f1 = &_f1;
 
-	while ((c = getopt(argc, argv, "cht:v:")) != -1) {
+	while ((c = getopt(argc, argv, "xncht:v:")) != -1) {
 		switch (c) {
 		case 'c':
 			log_color(1);
@@ -59,6 +95,12 @@ int main(int argc, char *argv[])
 		case 'v':
 			log_verbosity(atoi(optarg));
 			break;
+		case 'x':
+			mixed=1;
+			/* fallthrough */
+		case 'n':
+			futex2=1;
+			break;
 		default:
 			usage(basename(argv[0]));
 			exit(1);
@@ -79,7 +121,22 @@ int main(int argc, char *argv[])
 	usleep(WAKE_WAIT_US);
 
 	info("Requeuing 1 futex from f1 to f2\n");
-	res = futex_cmp_requeue(f1, 0, &f2, 0, 1, 0);
+	if (futex2) {
+		struct futex_waitv futexes[2] = {
+			{
+				.val = 0,
+				.uaddr = (unsigned long)f1,
+				.flags = mixed ? FUTEX2_SIZE_U16 : FUTEX2_SIZE_U32,
+			},
+			{
+				.uaddr = (unsigned long)&f2,
+				.flags = FUTEX2_SIZE_U32,
+			},
+		};
+		res = futex2_requeue(futexes, 0, 0, 1);
+	} else {
+		res = futex_cmp_requeue(f1, 0, &f2, 0, 1, 0);
+	}
 	if (res != 1) {
 		ksft_test_result_fail("futex_requeue simple returned: %d %s\n",
 				      res ? errno : res,
@@ -89,7 +146,11 @@ int main(int argc, char *argv[])
 
 
 	info("Waking 1 futex at f2\n");
-	res = futex_wake(&f2, 1, 0);
+	if (futex2) {
+		res = futex2_wake(&f2, ~0U, 1, FUTEX2_SIZE_U32);
+	} else {
+		res = futex_wake(&f2, 1, 0);
+	}
 	if (res != 1) {
 		ksft_test_result_fail("futex_requeue simple returned: %d %s\n",
 				      res ? errno : res,
@@ -112,7 +173,22 @@ int main(int argc, char *argv[])
 	usleep(WAKE_WAIT_US);
 
 	info("Waking 3 futexes at f1 and requeuing 7 futexes from f1 to f2\n");
-	res = futex_cmp_requeue(f1, 0, &f2, 3, 7, 0);
+	if (futex2) {
+		struct futex_waitv futexes[2] = {
+			{
+				.val = 0,
+				.uaddr = (unsigned long)f1,
+				.flags = mixed ? FUTEX2_SIZE_U16 : FUTEX2_SIZE_U32,
+			},
+			{
+				.uaddr = (unsigned long)&f2,
+				.flags = FUTEX2_SIZE_U32,
+			},
+		};
+		res = futex2_requeue(futexes, 0, 3, 7);
+	} else {
+		res = futex_cmp_requeue(f1, 0, &f2, 3, 7, 0);
+	}
 	if (res != 10) {
 		ksft_test_result_fail("futex_requeue many returned: %d %s\n",
 				      res ? errno : res,
@@ -121,7 +197,11 @@ int main(int argc, char *argv[])
 	}
 
 	info("Waking INT_MAX futexes at f2\n");
-	res = futex_wake(&f2, INT_MAX, 0);
+	if (futex2) {
+		res = futex2_wake(&f2, ~0U, INT_MAX, FUTEX2_SIZE_U32);
+	} else {
+		res = futex_wake(&f2, INT_MAX, 0);
+	}
 	if (res != 7) {
 		ksft_test_result_fail("futex_requeue many returned: %d %s\n",
 				      res ? errno : res,
--- a/tools/testing/selftests/futex/functional/futex_wait.c
+++ b/tools/testing/selftests/futex/functional/futex_wait.c
@@ -9,8 +9,10 @@
 #include
 #include
 #include
+#include
 #include "logging.h"
 #include "futextest.h"
+#include "futex2test.h"
 
 #define TEST_NAME "futex-wait"
 #define timeout_ns 30000000
@@ -19,10 +21,13 @@
 
 void *futex;
 
+bool futex2 = 0;
+
 void usage(char *prog)
 {
 	printf("Usage: %s\n", prog);
 	printf(" -c Use color\n");
+	printf(" -n Use futex2 interface\n");
 	printf(" -h Display this help message\n");
 	printf(" -v L Verbosity level: %d=QUIET %d=CRITICAL %d=INFO\n",
 	       VQUIET, VCRITICAL, VINFO);
@@ -30,17 +35,35 @@ void usage(char *prog)
 
 static void *waiterfn(void *arg)
 {
-	struct timespec to;
 	unsigned int flags = 0;
+	struct timespec to;
 
 	if (arg)
 		flags = *((unsigned int *) arg);
 
-	to.tv_sec = 0;
-	to.tv_nsec = timeout_ns;
+	if (futex2) {
+		if (clock_gettime(CLOCK_MONOTONIC, &to)) {
+			printf("clock_gettime() failed errno %d", errno);
+			return NULL;
+		}
 
-	if (futex_wait(futex, 0, &to, flags))
-		printf("waiter failed errno %d\n", errno);
+		to.tv_nsec += timeout_ns;
+		if (to.tv_nsec >= 1000000000) {
+			to.tv_sec++;
+			to.tv_nsec -= 1000000000;
+		}
+
+		if (futex2_wait(futex, 0, ~0U, flags | FUTEX2_SIZE_U32,
+				&to, CLOCK_MONOTONIC))
+			printf("waiter failed errno %d\n", errno);
+	} else {
+
+		to.tv_sec = 0;
+		to.tv_nsec = timeout_ns;
+
+		if (futex_wait(futex, 0, &to, flags))
+			printf("waiter failed errno %d\n", errno);
+	}
 
 	return NULL;
 }
@@ -55,7 +78,7 @@ int main(int argc, char *argv[])
 
 	futex = &f_private;
 
-	while ((c = getopt(argc, argv, "cht:v:")) != -1) {
+	while ((c = getopt(argc, argv, "ncht:v:")) != -1) {
 		switch (c) {
 		case 'c':
 			log_color(1);
@@ -66,6 +89,9 @@ int main(int argc, char *argv[])
 		case 'v':
 			log_verbosity(atoi(optarg));
 			break;
+		case 'n':
+			futex2=1;
+			break;
 		default:
 			usage(basename(argv[0]));
 			exit(1);
@@ -84,7 +110,11 @@ int main(int argc, char *argv[])
 	usleep(WAKE_WAIT_US);
 
 	info("Calling private futex_wake on futex: %p\n", futex);
-	res = futex_wake(futex, 1, FUTEX_PRIVATE_FLAG);
+	if (futex2) {
+		res = futex2_wake(futex, ~0U, 1, FUTEX2_SIZE_U32 | FUTEX2_PRIVATE);
+	} else {
+		res = futex_wake(futex, 1, FUTEX_PRIVATE_FLAG);
+	}
 	if (res != 1) {
 		ksft_test_result_fail("futex_wake private returned: %d %s\n",
 				      errno, strerror(errno));
@@ -112,7 +142,11 @@ int main(int argc, char *argv[])
 	usleep(WAKE_WAIT_US);
 
 	info("Calling shared (page anon) futex_wake on futex: %p\n", futex);
-	res = futex_wake(futex, 1, 0);
+	if (futex2) {
+		res = futex2_wake(futex, ~0U, 1, FUTEX2_SIZE_U32);
+	} else {
+		res = futex_wake(futex, 1, 0);
+	}
 	if (res != 1) {
 		ksft_test_result_fail("futex_wake shared (page anon) returned: %d %s\n",
 				      errno, strerror(errno));
@@ -151,7 +185,11 @@ int main(int argc, char *argv[])
 	usleep(WAKE_WAIT_US);
 
 	info("Calling shared (file backed) futex_wake on futex: %p\n", futex);
-	res = futex_wake(shm, 1, 0);
+	if (futex2) {
+		res = futex2_wake(shm, ~0U, 1, FUTEX2_SIZE_U32);
+	} else {
+		res = futex_wake(shm, 1, 0);
+	}
 	if (res != 1) {
 		ksft_test_result_fail("futex_wake shared (file backed) returned: %d %s\n",
 				      errno, strerror(errno));
--- a/tools/testing/selftests/futex/functional/futex_wait_timeout.c
+++ b/tools/testing/selftests/futex/functional/futex_wait_timeout.c
@@ -103,7 +103,7 @@ int main(int argc, char *argv[])
 	struct futex_waitv waitv = {
 			.uaddr = (uintptr_t)&f1,
 			.val = f1,
-			.flags = FUTEX_32,
+			.flags = FUTEX2_SIZE_U32,
 			.__reserved = 0
 		};
 
@@ -128,7 +128,7 @@ int main(int argc, char *argv[])
 	}
 
 	ksft_print_header();
-	ksft_set_plan(9);
+	ksft_set_plan(11);
 	ksft_print_msg("%s: Block on a futex and wait for timeout\n",
 	       basename(argv[0]));
 	ksft_print_msg("\tArguments: timeout=%ldns\n", timeout_ns);
@@ -201,6 +201,18 @@ int main(int argc, char *argv[])
 	res = futex_waitv(&waitv, 1, 0, &to, CLOCK_REALTIME);
 	test_timeout(res, &ret, "futex_waitv realtime", ETIMEDOUT);
 
+	/* futex2_wait with CLOCK_MONOTONIC */
+	if (futex_get_abs_timeout(CLOCK_MONOTONIC, &to, timeout_ns))
+		return RET_FAIL;
+	res = futex2_wait(&f1, f1, 1, FUTEX2_SIZE_U32, &to, CLOCK_MONOTONIC);
+	test_timeout(res, &ret, "futex2_wait monotonic", ETIMEDOUT);
+
+	/* futex2_wait with CLOCK_REALTIME */
+	if (futex_get_abs_timeout(CLOCK_REALTIME, &to, timeout_ns))
+		return RET_FAIL;
+	res = futex2_wait(&f1, f1, 1, FUTEX2_SIZE_U32, &to, CLOCK_REALTIME);
+	test_timeout(res, &ret, "futex2_wait realtime", ETIMEDOUT);
+
 	ksft_print_cnts();
 	return ret;
 }
--- a/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c
+++ b/tools/testing/selftests/futex/functional/futex_wait_wouldblock.c
@@ -46,7 +46,7 @@ int main(int argc, char *argv[])
 	struct futex_waitv waitv = {
 			.uaddr = (uintptr_t)&f1,
 			.val = f1+1,
-			.flags = FUTEX_32,
+			.flags = FUTEX2_SIZE_U32 | FUTEX2_PRIVATE,
 			.__reserved = 0
 		};
 
@@ -68,7 +68,7 @@ int main(int argc, char *argv[])
 	}
 
 	ksft_print_header();
-	ksft_set_plan(2);
+	ksft_set_plan(3);
 	ksft_print_msg("%s: Test the unexpected futex value in FUTEX_WAIT\n",
 	       basename(argv[0]));
 
@@ -106,6 +106,30 @@ int main(int argc, char *argv[])
 		ksft_test_result_pass("futex_waitv\n");
 	}
 
+	if (clock_gettime(CLOCK_MONOTONIC, &to)) {
+		error("clock_gettime failed\n", errno);
+		return errno;
+	}
+
+	to.tv_nsec += timeout_ns;
+
+	if (to.tv_nsec >= 1000000000) {
+		to.tv_sec++;
+		to.tv_nsec -= 1000000000;
+	}
+
+	info("Calling futex2_wait on f1: %u @ %p with val=%u\n", f1, &f1, f1+1);
+	res = futex2_wait(&f1, f1+1, ~0U, FUTEX2_SIZE_U32 | FUTEX2_PRIVATE,
+			  &to, CLOCK_MONOTONIC);
+	if (!res || errno != EWOULDBLOCK) {
+		ksft_test_result_fail("futex2_wait returned: %d %s\n",
+				      res ? errno : res,
+				      res ? strerror(errno) : "");
+		ret = RET_FAIL;
+	} else {
+		ksft_test_result_pass("futex2_wait\n");
+	}
+
 	ksft_print_cnts();
 	return ret;
 }
--- a/tools/testing/selftests/futex/functional/futex_waitv.c
+++ b/tools/testing/selftests/futex/functional/futex_waitv.c
@@ -88,7 +88,7 @@ int main(int argc, char *argv[])
 
 	for (i = 0; i < NR_FUTEXES; i++) {
 		waitv[i].uaddr = (uintptr_t)&futexes[i];
-		waitv[i].flags = FUTEX_32 | FUTEX_PRIVATE_FLAG;
+		waitv[i].flags = FUTEX2_SIZE_U32 | FUTEX2_PRIVATE;
 		waitv[i].val = 0;
 		waitv[i].__reserved = 0;
 	}
@@ -99,7 +99,8 @@ int main(int argc, char *argv[])
 
 	usleep(WAKE_WAIT_US);
 
-	res = futex_wake(u64_to_ptr(waitv[NR_FUTEXES - 1].uaddr), 1, FUTEX_PRIVATE_FLAG);
+	res = futex2_wake(u64_to_ptr(waitv[NR_FUTEXES - 1].uaddr), ~0U, 1,
+			  FUTEX2_PRIVATE | FUTEX2_SIZE_U32);
 	if (res != 1) {
 		ksft_test_result_fail("futex_wake private returned: %d %s\n",
 				      res ? errno : res,
@@ -122,7 +123,7 @@ int main(int argc, char *argv[])
 		*shared_data = 0;
 
 		waitv[i].uaddr = (uintptr_t)shared_data;
-		waitv[i].flags = FUTEX_32;
+		waitv[i].flags = FUTEX2_SIZE_U32;
 		waitv[i].val = 0;
 		waitv[i].__reserved = 0;
 	}
@@ -145,8 +146,8 @@ int main(int argc, char *argv[])
 	for (i = 0; i < NR_FUTEXES; i++)
 		shmdt(u64_to_ptr(waitv[i].uaddr));
 
-	/* Testing a waiter without FUTEX_32 flag */
-	waitv[0].flags = FUTEX_PRIVATE_FLAG;
+	/* Testing a waiter without FUTEX2_SIZE_U32 flag */
+	waitv[0].flags = FUTEX2_PRIVATE;
 
 	if (clock_gettime(CLOCK_MONOTONIC, &to))
 		error("gettime64 failed\n", errno);
@@ -160,11 +161,11 @@ int main(int argc, char *argv[])
 				      res ? strerror(errno) : "");
 		ret = RET_FAIL;
 	} else {
-		ksft_test_result_pass("futex_waitv without FUTEX_32\n");
+		ksft_test_result_pass("futex_waitv without FUTEX2_SIZE_U32\n");
 	}
 
 	/* Testing a waiter with an unaligned address */
-	waitv[0].flags = FUTEX_PRIVATE_FLAG | FUTEX_32;
+	waitv[0].flags = FUTEX2_PRIVATE | FUTEX2_SIZE_U32;
 	waitv[0].uaddr = 1;
 
 	if (clock_gettime(CLOCK_MONOTONIC, &to))
--- a/tools/testing/selftests/futex/functional/run.sh
+++ b/tools/testing/selftests/futex/functional/run.sh
@@ -76,9 +76,15 @@ echo
 
 echo
 ./futex_wait $COLOR
+echo
+./futex_wait -n $COLOR
 
 echo
 ./futex_requeue $COLOR
+echo
+./futex_requeue -n $COLOR
+echo
+./futex_requeue -x $COLOR
 
 echo
 ./futex_waitv $COLOR
--- a/tools/testing/selftests/futex/include/futex2test.h
+++ b/tools/testing/selftests/futex/include/futex2test.h
@@ -8,6 +8,41 @@
 
 #define u64_to_ptr(x) ((void *)(uintptr_t)(x))
 
+#ifndef __NR_futex_waitv
+#define __NR_futex_waitv 449
+
+struct futex_waitv {
+	__u64 val;
+	__u64 uaddr;
+	__u32 flags;
+	__u32 __reserved;
+};
+#endif
+
+#ifndef __NR_futex_wake
+#define __NR_futex_wake 454
+#define __NR_futex_wait 455
+#define __NR_futex_requeue 456
+#endif
+
+#ifndef FUTEX2_SIZE_U8
+/*
+ * Flags for futex2 syscalls.
+ */
+#define FUTEX2_SIZE_U8		0x00
+#define FUTEX2_SIZE_U16		0x01
+#define FUTEX2_SIZE_U32		0x02
+#define FUTEX2_SIZE_U64		0x03
+#define FUTEX2_NUMA		0x04
+			/*	0x08 */
+			/*	0x10 */
+			/*	0x20 */
+			/*	0x40 */
+#define FUTEX2_PRIVATE		FUTEX_PRIVATE_FLAG
+#endif
+
+#define FUTEX_NO_NODE	(-1)
+
 /**
  * futex_waitv - Wait at multiple futexes, wake on any
  * @waiters: Array of waiters
@@ -20,3 +55,20 @@ static inline int futex_waitv(volatile s
 {
 	return syscall(__NR_futex_waitv, waiters, nr_waiters, flags, timo, clockid);
 }
+
+static inline int futex2_wake(volatile void *uaddr, unsigned long mask, int nr, unsigned int flags)
+{
+	return syscall(__NR_futex_wake, uaddr, mask, nr, flags);
+}
+
+static inline int futex2_wait(volatile void *uaddr, unsigned long val, unsigned long mask,
+			      unsigned int flags, struct timespec *timo, clockid_t clockid)
+{
+	return syscall(__NR_futex_wait, uaddr, val, mask, flags, timo, clockid);
+}
+
+static inline int futex2_requeue(struct futex_waitv *futexes, unsigned int flags,
+				 int nr_wake, int nr_requeue)
+{
+	return syscall(__NR_futex_requeue, futexes, flags, nr_wake, nr_requeue);
+}

From patchwork Fri Oct 25 09:03:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 13850346
Message-Id: <20241025093944.922683354@infradead.org>
User-Agent: quilt/0.65
Date: Fri, 25 Oct 2024 11:03:53 +0200
From: Peter Zijlstra
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, cl@linux.com,
 llong@redhat.com
Subject: [PATCH 6/6] futex,selftests: Extend the futex selftests for NUMA
References: <20241025090347.244183920@infradead.org>
MIME-Version: 1.0
U2FsdGVkX1/VlJN1IPtTHhg/+FdjD+eeMaO9rr3wxZ4VVEfRHxrspA/NoD3V1NPcaOeD698XA9WDmM7bJ8WDstkTZzFyFQuhuAC41VT8nXhoxd3uoM04Z8FjhLjzkFgKa9sJYPgiYXYRry3vKD4ja9nJ3QpZb3/6UuERNuUk7a4dLf88QHajEAFsjk+HcQZjpfRRWsxW8/dvegxq7a6lIDhA7lo0FGoFrUD5dKw0ssSrbWOhgxOVWddpkU7qNDrxhvOf9oGiXgqGZ/sjvwnX+bfnbT3VgMcLZOCAoKKjKpgEOt3gE2smZadoG0T8KBKIJxmq0WJJ45L3mGGOPfX4oE+hk7J0tRJCWdCK3ca/Ubum7Hfb0mGkmyjfD0mxLMERyjQqDnnCAKq+dbZXi4SMZr23XWaryZyXmo46Nbedul2q1XISVFnIvuLLfaHS3KeCaaEYcbWAGf1KHZA/vOEM6q+kNrFQFRR90tK0sc2/mAClfYA2UW5F4bQJvc2+FkBicDfH9gijWu4J1lEJ4hubatmpqfN1bGRgzrscrOz6oe28g2p1rvJzUNG9xz1ccUYVICh0B2rQICK6/MRb+h7zRUn9eXS+c4iJjf4MA+L3H5VaJIeGj9DdwhCmCqk+BdMEVgP1Qw8bCL1QG8okLMvGhjeoiV/sFPIQStdX+XqalMUdSzyJt7KrFrnyI95Cjk9MrIvey7vovEvE4boPjCrEoi9Hz3AOHLTgQkf7w/7qn0bk3w+u7PQiy8Bc3QhugxbdKSHolExbZBNxWC0B3YgwBHBga6vD6pHB+BzTrIWvpSEuFfGyBj48fN9ains2gR4bZZq+LmcollWrHFn2yRz4xnuc4kB8J2cVJhNKz3Yso2sTQR9hPnZEYBqkn057ZzzyGFPx6DsyKBf3i0zUeJeLyOT3RJQwudCNPhqi+rShTYbaQY+BnZGL/RhZpkQrD2gpW+6+wP0IANVhcGL1Bfx Z5ovzkcx i0F6sVnZ5CDwCmgNmVEx8/hnhbkIin9aNKVL1NCLVj5txeEG83VSPeNhPn8wisLP/Yrz6zSI2su4qTdUD6NjANLE2zMcCYdJZhrmWKM5htK8tIVnpyBpc5XQWgb1WZKop7YN+yXJgwmn2P+vR61pzaT+d2XUryll3HUCD/oXFLXK6pSxj80fiNqSqKJ38Kkw9kMkZVl6P7iZLzMayVNYg3BlRfEgmZIyPBKBwYqxAkdVXALMK6xYnyhCUPm8bCnUSR1L+w1HAmvp8sIsSkjEy1Wok/IK0W66zVUQ9d/mdXjw854k= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: XXX Signed-off-by: Peter Zijlstra (Intel) --- tools/testing/selftests/futex/functional/Makefile | 3 tools/testing/selftests/futex/functional/futex_numa.c | 262 ++++++++++++++++++ 2 files changed, 264 insertions(+), 1 deletion(-) --- a/tools/testing/selftests/futex/functional/Makefile +++ b/tools/testing/selftests/futex/functional/Makefile @@ -17,7 +17,8 @@ TEST_GEN_PROGS := \ futex_wait_private_mapped_file \ futex_wait \ futex_requeue \ - futex_waitv + futex_waitv \ + futex_numa TEST_PROGS := run.sh --- 
/dev/null +++ b/tools/testing/selftests/futex/functional/futex_numa.c @@ -0,0 +1,262 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include +#include +#include +#include "logging.h" +#include "futextest.h" +#include "futex2test.h" + +typedef u_int32_t u32; +typedef int32_t s32; +typedef u_int64_t u64; + +static int fflags = (FUTEX2_SIZE_U32 | FUTEX2_PRIVATE); +static int fnode = FUTEX_NO_NODE; + +/* fairly stupid test-and-set lock with a waiter flag */ + +#define N_LOCK 0x0000001 +#define N_WAITERS 0x0001000 + +struct futex_numa_32 { + union { + u64 full; + struct { + u32 val; + u32 node; + }; + }; +}; + +void futex_numa_32_lock(struct futex_numa_32 *lock) +{ + for (;;) { + struct futex_numa_32 new, old = { + .full = __atomic_load_n(&lock->full, __ATOMIC_RELAXED), + }; + + for (;;) { + new = old; + if (old.val == 0) { + /* no waiter, no lock -> first lock, set no-node */ + new.node = fnode; + } + if (old.val & N_LOCK) { + /* contention, set waiter */ + new.val |= N_WAITERS; + } + new.val |= N_LOCK; + + /* nothing changed, ready to block */ + if (old.full == new.full) + break; + + /* + * Use u64 cmpxchg to set the futex value and node in a + * consistent manner. 
+			 */
+			if (__atomic_compare_exchange_n(&lock->full,
+							&old.full, new.full,
+							/* .weak */ false,
+							__ATOMIC_ACQUIRE,
+							__ATOMIC_RELAXED)) {
+
+				/* if we just set N_LOCK, we own it */
+				if (!(old.val & N_LOCK))
+					return;
+
+				/* go block */
+				break;
+			}
+		}
+
+		futex2_wait(lock, new.val, ~0U, fflags, NULL, 0);
+	}
+}
+
+void futex_numa_32_unlock(struct futex_numa_32 *lock)
+{
+	u32 val = __atomic_sub_fetch(&lock->val, N_LOCK, __ATOMIC_RELEASE);
+	assert((s32)val >= 0);
+	if (val & N_WAITERS) {
+		int woken = futex2_wake(lock, ~0U, 1, fflags);
+		assert(val == N_WAITERS);
+		if (!woken) {
+			__atomic_compare_exchange_n(&lock->val, &val, 0U,
+						    false, __ATOMIC_RELAXED,
+						    __ATOMIC_RELAXED);
+		}
+	}
+}
+
+static long nanos = 50000;
+
+struct thread_args {
+	pthread_t tid;
+	volatile int * done;
+	struct futex_numa_32 *lock;
+	int val;
+	int *val1, *val2;
+	int node;
+};
+
+static void *threadfn(void *_arg)
+{
+	struct thread_args *args = _arg;
+	struct timespec ts = {
+		.tv_nsec = nanos,
+	};
+	int node;
+
+	while (!*args->done) {
+
+		futex_numa_32_lock(args->lock);
+		args->val++;
+
+		assert(*args->val1 == *args->val2);
+		(*args->val1)++;
+		nanosleep(&ts, NULL);
+		(*args->val2)++;
+
+		node = args->lock->node;
+		futex_numa_32_unlock(args->lock);
+
+		if (node != args->node) {
+			args->node = node;
+			printf("node: %d\n", node);
+		}
+
+		nanosleep(&ts, NULL);
+	}
+
+	return NULL;
+}
+
+static void *contendfn(void *_arg)
+{
+	struct thread_args *args = _arg;
+
+	while (!*args->done) {
+		/*
+		 * futex2_wait() will take hb-lock, verify *var == val and
+		 * queue/abort. By knowingly setting val 'wrong' this will
+		 * abort and thereby generate hb-lock contention.
+		 */
+		futex2_wait(&args->lock->val, ~0U, ~0U, fflags, NULL, 0);
+		args->val++;
+	}
+
+	return NULL;
+}
+
+static volatile int done = 0;
+static struct futex_numa_32 lock = { .val = 0, };
+static int val1, val2;
+
+int main(int argc, char *argv[])
+{
+	struct thread_args *tas[512], *cas[512];
+	int c, t, threads = 2, contenders = 0;
+	int sleeps = 10;
+	int total = 0;
+
+	while ((c = getopt(argc, argv, "c:t:s:n:N::")) != -1) {
+		switch (c) {
+		case 'c':
+			contenders = atoi(optarg);
+			break;
+		case 't':
+			threads = atoi(optarg);
+			break;
+		case 's':
+			sleeps = atoi(optarg);
+			break;
+		case 'n':
+			nanos = atoi(optarg);
+			break;
+		case 'N':
+			fflags |= FUTEX2_NUMA;
+			if (optarg)
+				fnode = atoi(optarg);
+			break;
+		default:
+			exit(1);
+			break;
+		}
+	}
+
+	for (t = 0; t < contenders; t++) {
+		struct thread_args *args = calloc(1, sizeof(*args));
+		if (!args) {
+			perror("thread_args");
+			exit(-1);
+		}
+
+		args->done = &done;
+		args->lock = &lock;
+		args->val1 = &val1;
+		args->val2 = &val2;
+		args->node = -1;
+
+		if (pthread_create(&args->tid, NULL, contendfn, args)) {
+			perror("pthread_create");
+			exit(-1);
+		}
+
+		cas[t] = args;
+	}
+
+	for (t = 0; t < threads; t++) {
+		struct thread_args *args = calloc(1, sizeof(*args));
+		if (!args) {
+			perror("thread_args");
+			exit(-1);
+		}
+
+		args->done = &done;
+		args->lock = &lock;
+		args->val1 = &val1;
+		args->val2 = &val2;
+		args->node = -1;
+
+		if (pthread_create(&args->tid, NULL, threadfn, args)) {
+			perror("pthread_create");
+			exit(-1);
+		}
+
+		tas[t] = args;
+	}
+
+	sleep(sleeps);
+
+	done = true;
+
+	for (t = 0; t < threads; t++) {
+		struct thread_args *args = tas[t];
+
+		pthread_join(args->tid, NULL);
+		total += args->val;
+//		printf("tval: %d\n", args->val);
+	}
+	printf("total: %d\n", total);
+
+	if (contenders) {
+		total = 0;
+		for (t = 0; t < contenders; t++) {
+			struct thread_args *args = cas[t];
+
+			pthread_join(args->tid, NULL);
+			total += args->val;
+			//			printf("tval: %d\n", args->val);
+		}
+		printf("contenders: %d\n", total);
+	}
+
+	return 0;
+}
+
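
Reader's note, not part of the patch: the inner loop of futex_numa_32_lock() computes the word it will try to install before it issues the u64 cmpxchg. The sketch below models only that computation as a pure function, so the transitions can be checked without threads or syscalls. The names struct lock_word and lock_step() are hypothetical and do not appear in the patch. The node field is modelled as a plain uint32_t, and the N_LOCK / N_WAITERS values are copied from the test source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_LOCK    0x0000001u
#define N_WAITERS 0x0001000u

/* Hypothetical stand-in for struct futex_numa_32: futex value plus node. */
struct lock_word {
	uint32_t val;
	uint32_t node;
};

/*
 * Model of one pass through the inner loop of futex_numa_32_lock():
 * fill *new with the word the locker would try to install.
 * Returns true if a successful cmpxchg would make the caller the owner,
 * false if the caller would go on to block in futex2_wait().
 */
static bool lock_step(struct lock_word old, uint32_t fnode,
		      struct lock_word *new)
{
	*new = old;
	if (old.val == 0)		/* no waiter, no lock: (re)set the node */
		new->node = fnode;
	if (old.val & N_LOCK)		/* contended: record a waiter */
		new->val |= N_WAITERS;
	new->val |= N_LOCK;
	return !(old.val & N_LOCK);
}
```

Under these assumptions, an idle word {0, n} becomes {N_LOCK, fnode} and the caller owns the lock; a held word gains N_WAITERS and the caller blocks; a word that is already N_LOCK | N_WAITERS comes back unchanged, which matches the "nothing changed, ready to block" branch in the test.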