From patchwork Tue Jan 2 22:00:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Samuel Holland
X-Patchwork-Id: 13509527
From: Samuel Holland
To: Palmer Dabbelt, linux-riscv@lists.infradead.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Alexandre Ghiti, Samuel Holland
Subject: [PATCH v4 00/12] riscv: ASID-related and UP-related TLB flush enhancements
Date: Tue, 2 Jan 2024 14:00:37 -0800
Message-ID: <20240102220134.3229156-1-samuel.holland@sifive.com>
X-Mailer: git-send-email 2.42.0

While reviewing Alexandre Ghiti's "riscv: tlb flush improvements" series[1],
I noticed that most TLB flush functions end up as a call to
local_flush_tlb_all() when SMP is disabled. This series resolves that, and
also optimizes the scenario where SMP is enabled but only one CPU is present
or online.
Along the way, I realized that we should be using single-ASID flushes
wherever possible, so I implemented that as well.

Here are some numbers from D1 (with SMP disabled) which show the performance
impact:

v6.7-rc8:
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Execl Throughput                                 43.0        207.4     48.2
File Copy 1024 bufsize 2000 maxblocks          3960.0      52187.4    131.8
File Copy 256 bufsize 500 maxblocks            1655.0      14872.6     89.9
File Copy 4096 bufsize 8000 maxblocks          5800.0     146597.8    252.8
Pipe Throughput                               12440.0     125318.4    100.7
Pipe-based Context Switching                   4000.0      17804.2     44.5
Process Creation                                126.0        479.2     38.0
Shell Scripts (1 concurrent)                     42.4        564.5    133.1
Shell Scripts (16 concurrent)                     ---         36.8      ---
Shell Scripts (8 concurrent)                      6.0         74.3    123.9
System Call Overhead                          15000.0     182050.7    121.4
                                                                   ========
System Benchmarks Index Score (Partial Only)                           93.2

v6.7-rc8 plus this patch series:
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
Execl Throughput                                 43.0        208.5     48.5
File Copy 1024 bufsize 2000 maxblocks          3960.0      56847.0    143.6
File Copy 256 bufsize 500 maxblocks            1655.0      17728.9    107.1
File Copy 4096 bufsize 8000 maxblocks          5800.0     168016.2    289.7
Pipe Throughput                               12440.0     133376.2    107.2
Pipe-based Context Switching                   4000.0      19736.3     49.3
Process Creation                                126.0        484.5     38.4
Shell Scripts (1 concurrent)                     42.4        564.1    133.0
Shell Scripts (16 concurrent)                     ---         36.6      ---
Shell Scripts (8 concurrent)                      6.0         74.1    123.5
System Call Overhead                          15000.0     210181.8    140.1
                                                                   ========
System Benchmarks Index Score (Partial Only)                          100.1

[1]: https://lore.kernel.org/linux-riscv/20231030133027.19542-1-alexghiti@rivosinc.com/

Changes in v4:
 - Fix a possible race between flush_icache_*() and SMP bringup
 - Refactor riscv_use_ipi_for_rfence() to make later changes cleaner
 - Optimize kernel TLB flushes with only one CPU online
 - Optimize global cache/TLB flushes with only one CPU online
 - Merge the two copies of __flush_tlb_range() and rely on the compiler
   to optimize out the broadcast path (both clang and gcc do this)
 - Merge the two copies of flush_tlb_all() and rely on constant folding
 - Only set tlb_flush_all_threshold when CONFIG_MMU=y.

Changes in v3:
 - Fixed a performance regression caused by executing sfence.vma in a
   loop on implementations affected by SiFive CIP-1200
 - Rebased on v6.7-rc1

Changes in v2:
 - Move the SMP/UP merge earlier in the series to avoid build issues
 - Make a copy of __flush_tlb_range() instead of adding ifdefs inside
 - local_flush_tlb_all() is the only function used on !MMU (smpboot.c)

Samuel Holland (12):
  riscv: Flush the instruction cache during SMP bringup
  riscv: Use IPIs for remote cache/TLB flushes by default
  riscv: mm: Broadcast kernel TLB flushes only when needed
  riscv: Only send remote fences when some other CPU is online
  riscv: mm: Combine the SMP and UP TLB flush code
  riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
  riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
  riscv: mm: Introduce cntx2asid/cntx2version helper macros
  riscv: mm: Use a fixed layout for the MM context ID
  riscv: mm: Make asid_bits a local variable
  riscv: mm: Preserve global TLB entries when switching contexts
  riscv: mm: Always use an ASID to flush mm contexts

 arch/riscv/errata/sifive/errata.c    |  5 ++
 arch/riscv/include/asm/errata_list.h | 12 ++++-
 arch/riscv/include/asm/mmu.h         |  3 ++
 arch/riscv/include/asm/mmu_context.h |  2 -
 arch/riscv/include/asm/sbi.h         |  4 ++
 arch/riscv/include/asm/smp.h         | 15 +-----
 arch/riscv/include/asm/tlbflush.h    | 50 ++++++++----------
 arch/riscv/kernel/sbi-ipi.c          | 11 +++-
 arch/riscv/kernel/smp.c              | 11 +---
 arch/riscv/kernel/smpboot.c          |  7 +--
 arch/riscv/mm/Makefile               |  5 +-
 arch/riscv/mm/cacheflush.c           |  7 +--
 arch/riscv/mm/context.c              | 26 ++++------
 arch/riscv/mm/tlbflush.c             | 76 +++++++++-------------------
 drivers/clocksource/timer-clint.c    |  2 +-
 15 files changed, 102 insertions(+), 134 deletions(-)