From patchwork Thu May 6 23:25:36 2021
From: Peter Xu <peterx@redhat.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, peterx@redhat.com, John Hubbard, Jan Kara,
    Kirill Shutemov, Jason Gunthorpe, Andrew Morton, Kirill Tkhai,
    Michal Hocko, Oleg Nesterov, Jann Horn, Linus Torvalds,
    Matthew Wilcox, Andrea Arcangeli
Subject: [PATCH 2/3] mm: gup: allow FOLL_PIN to scale in SMP
Date: Thu, 6 May 2021 19:25:36 -0400
Message-Id: <20210506232537.165788-3-peterx@redhat.com>
In-Reply-To: <20210506232537.165788-1-peterx@redhat.com>
References: <20210506232537.165788-1-peterx@redhat.com>
From: Andrea Arcangeli

has_pinned cannot be written by each pin-fast or it won't scale in SMP.
This isn't "false sharing" strictly speaking (it's more like "true
non-sharing"), but it creates the same SMP scalability bottleneck as
"false sharing".

To verify the improvement, a new "pin_fast.c" program was added to the
will-it-scale benchmark.

== pin_fast.c - start ==
/* SPDX-License-Identifier: GPL-2.0-or-later */
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <linux/types.h>

/* exercises pin_user_pages_fast, requires a kernel with CONFIG_GUP_TEST=y */
char *testcase_description = "pin_user_pages_fast SMP scalability benchmark";

static int gup_fd;

#define NR_PAGES 1024
#define BUFLEN (getpagesize() * NR_PAGES)
#define GUP_TEST_MAX_PAGES_TO_DUMP 8

#define PIN_FAST_BENCHMARK	_IOWR('g', 2, struct gup_test)

struct gup_test {
	__u64 get_delta_usec;
	__u64 put_delta_usec;
	__u64 addr;
	__u64 size;
	__u32 nr_pages_per_call;
	__u32 flags;
	/*
	 * Each non-zero entry is the number of the page (1-based: first page is
	 * page 1, so that zero entries mean "do nothing") from the .addr base.
	 */
	__u32 which_pages[GUP_TEST_MAX_PAGES_TO_DUMP];
};

void testcase_prepare(unsigned long nr_tasks)
{
	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
	assert(gup_fd >= 0);
}

void testcase(unsigned long long *iterations, unsigned long nr)
{
	char *p = aligned_alloc(getpagesize() * 512, BUFLEN);
	assert(p);
	assert(!madvise(p, BUFLEN, MADV_HUGEPAGE));
	for (int i = 0; i < NR_PAGES; i++)
		p[getpagesize() * i] = 0;

	struct gup_test gup = {
		.size = BUFLEN,
		.addr = (unsigned long)p,
		.nr_pages_per_call = 1,
	};

	while (1) {
		assert(!ioctl(gup_fd, PIN_FAST_BENCHMARK, &gup));
		(*iterations)++;
	}

	free(p);
}

void testcase_cleanup(void)
{
	assert(!close(gup_fd));
}
== pin_fast.c - end ==

The pin_fast will-it-scale benchmark was run with 1 thread per-CPU on this
2 NUMA nodes system:

available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 128 129 130 131 132
133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168
169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186
187 188 189 190 191
node 0 size: 128792 MB
node 0 free: 126741 MB
node 1 cpus: 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123
124 125 126 127 192 193 194 195 196 197 198 199 200 201 202 203 204 205
206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223
224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241
242 243 244 245 246 247 248 249 250 251 252 253 254 255
node 1 size: 128944 MB
node 1 free: 127330 MB
node distances:
node   0   1
  0:  10  32
  1:  32  10

Before this commit (average 25617 +- 0.16%):

tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,25641,0.17,0
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,25652,0.16,0
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,25559,0.16,0

After this commit (average 1194790 +- 0.11%):

tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,1196513,0.19,0
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,1194664,0.19,0
tasks,processes,processes_idle,threads,threads_idle,linear
0,0,100,0,100,0
256,0,0.00,1193194,0.19,0

This commit increases the SMP scalability of pin_user_pages_fast()
executed by different threads of the same process by more than 4000%.

Signed-off-by: Andrea Arcangeli
Signed-off-by: Peter Xu
Reviewed-by: John Hubbard
---
 mm/gup.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 63a079e361a3d..8b513e1723b45 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1292,7 +1292,7 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 		BUG_ON(*locked != 1);
 	}
 
-	if (flags & FOLL_PIN)
+	if (flags & FOLL_PIN && !atomic_read(&mm->has_pinned))
 		atomic_set(&mm->has_pinned, 1);
 
 	/*
@@ -2617,7 +2617,7 @@ static int internal_get_user_pages_fast(unsigned long start,
 				       FOLL_FAST_ONLY)))
 		return -EINVAL;
 
-	if (gup_flags & FOLL_PIN)
+	if (gup_flags & FOLL_PIN && !atomic_read(&current->mm->has_pinned))
 		atomic_set(&current->mm->has_pinned, 1);
 
 	if (!(gup_flags & FOLL_FAST_ONLY))
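
The check-before-set pattern the diff applies can be sketched in user space
with C11 atomics. This is an illustrative stand-alone model, not kernel
code: the `has_pinned` flag and `mark_has_pinned()` helper below are local
to the sketch; only the 0 -> 1 transition semantics mirror mm->has_pinned.
The point is that once the flag is set, every subsequent caller takes the
read-only path, so the cacheline can stay in the shared state on all CPUs
instead of bouncing between them on every call.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Process-wide "has pinned" flag, analogous in spirit to mm->has_pinned. */
static atomic_int has_pinned;

/* Set the flag at most once; returns true if this call performed the write.
 * The relaxed ordering matches the hint-only nature of the flag: it only
 * ever transitions 0 -> 1, so a duplicate store by a racing thread that
 * read 0 concurrently is benign. */
static bool mark_has_pinned(void)
{
	/* Cheap read first: after the first pin, all callers stop here and
	 * never dirty the cacheline again. */
	if (atomic_load_explicit(&has_pinned, memory_order_relaxed))
		return false;

	/* First pin (or a benign race): perform the single write, as the
	 * kernel's atomic_set() does in the patched code. */
	atomic_store_explicit(&has_pinned, 1, memory_order_relaxed);
	return true;
}
```

Without the `atomic_read()` guard, every call would issue a store to the
same cacheline, which is the "true non-sharing" bottleneck the commit
message describes.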