From patchwork Fri Sep 22 20:01:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 13396389
Message-Id: <20230922205449.923636292@infradead.org>
User-Agent: quilt/0.65
Date: Fri, 22 Sep 2023 22:01:22 +0200
From: Peter Zijlstra
To: tglx@linutronix.de, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de, steve.shaw@intel.com,
 marko.makela@mariadb.com, andrei.artemev@intel.com
Subject: [PATCH 16/15] futex,selftests: Extend the futex selftests for NUMA
References: <20230921104505.717750284@noisy.programming.kicks-ass.net>
 <20230921104505.717750284@noisy.programming.kicks-ass.net>
 <20230922200120.011184118@infradead.org>

XXX

Signed-off-by: Peter Zijlstra (Intel)
---
 tools/testing/selftests/futex/functional/Makefile     |    3
 tools/testing/selftests/futex/functional/futex_numa.c |  262 ++++++++++++++++++
 2 files changed, 264 insertions(+), 1 deletion(-)

--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -17,7 +17,8 @@ TEST_GEN_PROGS := \
 	futex_wait_private_mapped_file \
 	futex_wait \
 	futex_requeue \
-	futex_waitv
+	futex_waitv \
+	futex_numa
 
 TEST_PROGS := run.sh
 
--- /dev/null
+++ b/tools/testing/selftests/futex/functional/futex_numa.c
@@ -0,0 +1,262 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "logging.h"
+#include "futextest.h"
+#include "futex2test.h"
+
+typedef u_int32_t u32;
+typedef int32_t s32;
+typedef u_int64_t u64;
+
+static int fflags = (FUTEX2_SIZE_U32 | FUTEX2_PRIVATE);
+static int fnode = FUTEX_NO_NODE;
+
+/* fairly stupid test-and-set lock with a waiter flag */
+
+#define N_LOCK		0x0000001
+#define N_WAITERS	0x0001000
+
+struct futex_numa_32 {
+	union {
+		u64 full;
+		struct {
+			u32 val;
+			u32 node;
+		};
+	};
+};
+
+void futex_numa_32_lock(struct futex_numa_32 *lock)
+{
+	for (;;) {
+		struct futex_numa_32 new, old = {
+			.full = __atomic_load_n(&lock->full, __ATOMIC_RELAXED),
+		};
+
+		for (;;) {
+			new = old;
+			if (old.val == 0) {
+				/* no waiter, no lock -> first lock, set no-node */
+				new.node = fnode;
+			}
+			if (old.val & N_LOCK) {
+				/* contention, set waiter */
+				new.val |= N_WAITERS;
+			}
+			new.val |= N_LOCK;
+
+			/* nothing changed, ready to block */
+			if (old.full == new.full)
+				break;
+
+			/*
+			 * Use u64 cmpxchg to set the futex value and node in a
+			 * consistent manner.
+			 */
+			if (__atomic_compare_exchange_n(&lock->full,
+							&old.full, new.full,
+							/* .weak */ false,
+							__ATOMIC_ACQUIRE,
+							__ATOMIC_RELAXED)) {
+
+				/* if we just set N_LOCK, we own it */
+				if (!(old.val & N_LOCK))
+					return;
+
+				/* go block */
+				break;
+			}
+		}
+
+		futex2_wait(lock, new.val, ~0U, fflags, NULL, 0);
+	}
+}
+
+void futex_numa_32_unlock(struct futex_numa_32 *lock)
+{
+	u32 val = __atomic_sub_fetch(&lock->val, N_LOCK, __ATOMIC_RELEASE);
+	assert((s32)val >= 0);
+	if (val & N_WAITERS) {
+		int woken = futex2_wake(lock, ~0U, 1, fflags);
+		assert(val == N_WAITERS);
+		if (!woken) {
+			__atomic_compare_exchange_n(&lock->val, &val, 0U,
+						    false, __ATOMIC_RELAXED,
+						    __ATOMIC_RELAXED);
+		}
+	}
+}
+
+static long nanos = 50000;
+
+struct thread_args {
+	pthread_t tid;
+	volatile int * done;
+	struct futex_numa_32 *lock;
+	int val;
+	int *val1, *val2;
+	int node;
+};
+
+static void *threadfn(void *_arg)
+{
+	struct thread_args *args = _arg;
+	struct timespec ts = {
+		.tv_nsec = nanos,
+	};
+	int node;
+
+	while (!*args->done) {
+
+		futex_numa_32_lock(args->lock);
+		args->val++;
+
+		assert(*args->val1 == *args->val2);
+		(*args->val1)++;
+		nanosleep(&ts, NULL);
+		(*args->val2)++;
+
+		node = args->lock->node;
+		futex_numa_32_unlock(args->lock);
+
+		if (node != args->node) {
+			args->node = node;
+			printf("node: %d\n", node);
+		}
+
+		nanosleep(&ts, NULL);
+	}
+
+	return NULL;
+}
+
+static void *contendfn(void *_arg)
+{
+	struct thread_args *args = _arg;
+
+	while (!*args->done) {
+		/*
+		 * futex2_wait() will take hb-lock, verify *var == val and
+		 * queue/abort. By knowingly setting val 'wrong' this will
+		 * abort and thereby generate hb-lock contention.
+		 */
+		futex2_wait(&args->lock->val, ~0U, ~0U, fflags, NULL, 0);
+		args->val++;
+	}
+
+	return NULL;
+}
+
+static volatile int done = 0;
+static struct futex_numa_32 lock = { .val = 0, };
+static int val1, val2;
+
+int main(int argc, char *argv[])
+{
+	struct thread_args *tas[512], *cas[512];
+	int c, t, threads = 2, contenders = 0;
+	int sleeps = 10;
+	int total = 0;
+
+	while ((c = getopt(argc, argv, "c:t:s:n:N::")) != -1) {
+		switch (c) {
+		case 'c':
+			contenders = atoi(optarg);
+			break;
+		case 't':
+			threads = atoi(optarg);
+			break;
+		case 's':
+			sleeps = atoi(optarg);
+			break;
+		case 'n':
+			nanos = atoi(optarg);
+			break;
+		case 'N':
+			fflags |= FUTEX2_NUMA;
+			if (optarg)
+				fnode = atoi(optarg);
+			break;
+		default:
+			exit(1);
+			break;
+		}
+	}
+
+	for (t = 0; t < contenders; t++) {
+		struct thread_args *args = calloc(1, sizeof(*args));
+		if (!args) {
+			perror("thread_args");
+			exit(-1);
+		}
+
+		args->done = &done;
+		args->lock = &lock;
+		args->val1 = &val1;
+		args->val2 = &val2;
+		args->node = -1;
+
+		if (pthread_create(&args->tid, NULL, contendfn, args)) {
+			perror("pthread_create");
+			exit(-1);
+		}
+
+		cas[t] = args;
+	}
+
+	for (t = 0; t < threads; t++) {
+		struct thread_args *args = calloc(1, sizeof(*args));
+		if (!args) {
+			perror("thread_args");
+			exit(-1);
+		}
+
+		args->done = &done;
+		args->lock = &lock;
+		args->val1 = &val1;
+		args->val2 = &val2;
+		args->node = -1;
+
+		if (pthread_create(&args->tid, NULL, threadfn, args)) {
+			perror("pthread_create");
+			exit(-1);
+		}
+
+		tas[t] = args;
+	}
+
+	sleep(sleeps);
+
+	done = true;
+
+	for (t = 0; t < threads; t++) {
+		struct thread_args *args = tas[t];
+
+		pthread_join(args->tid, NULL);
+		total += args->val;
+//		printf("tval: %d\n", args->val);
+	}
+	printf("total: %d\n", total);
+
+	if (contenders) {
+		total = 0;
+		for (t = 0; t < contenders; t++) {
+			struct thread_args *args = cas[t];
+
+			pthread_join(args->tid, NULL);
+			total += args->val;
+//			printf("tval: %d\n", args->val);
+		}
+		printf("contenders: %d\n", total);
+	}
+
+	return 0;
+}
+