From patchwork Wed Jun 6 18:38:03 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mel Gorman
X-Patchwork-Id: 10450819
Date: Wed, 6 Jun 2018 19:38:03 +0100
From: Mel Gorman
To: Andrew Morton
Cc: Nadav Amit, Dave Hansen, mhocko@kernel.org, vbabka@suse.cz, Aaron Lu, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Hugh Dickins
Subject: [PATCH] mremap: Remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns
Message-ID: <20180606183803.k7qaw2xnbvzshv34@techsingularity.net>
Content-Disposition: inline
User-Agent: NeoMutt/20170912 (1.9.0)

Commit 5d1904204c99 ("mremap: fix race between mremap() and page
cleanning") fixed races between mremap and other operations for both
file-backed and anonymous mappings. The file-backed case was the most
critical, as it allowed data to be changed on a physical page after
page_mkclean returned, which could trigger data loss or data integrity
issues.

A customer reported that the cost of the TLB flushes for anonymous
mappings was excessive, resulting in a 30-50% overall drop in
performance on a microbenchmark since this commit. Unfortunately I
neither have access to the test case nor can I describe what it does,
other than saying that mremap operations dominate heavily.

This patch removes LATENCY_LIMIT so that TLB flushes are handled on a
PMD boundary instead of every 64 pages, reducing the number of TLB
shootdowns by a factor of 8 in the ideal case. LATENCY_LIMIT was almost
certainly used originally to limit PTL hold times, but the latency
savings are likely offset by the cost of IPIs in many cases. This patch
is not reported to completely restore performance but gets it within an
acceptable percentage. The given metric here is simply described as
"higher is better".

Baseline that was known good
002:  Metric:       91.05
004:  Metric:      109.45
008:  Metric:       73.08
016:  Metric:       58.14
032:  Metric:       61.09
064:  Metric:       57.76
128:  Metric:       55.43

Current
001:  Metric:       54.98
002:  Metric:       56.56
004:  Metric:       41.22
008:  Metric:       35.96
016:  Metric:       36.45
032:  Metric:       35.71
064:  Metric:       35.73
128:  Metric:       34.96

With patch
001:  Metric:       61.43
002:  Metric:       81.64
004:  Metric:       67.92
008:  Metric:       51.67
016:  Metric:       50.47
032:  Metric:       52.29
064:  Metric:       50.01
128:  Metric:       49.04

So for low thread counts performance is not restored, but for larger
numbers of threads it is closer to the "known good" baseline.

Using a different mremap-intensive workload that is not representative
of the real workload, there is little difference observed outside of
noise in the headline metrics. However, the TLB shootdowns are reduced
by 11% on average and, at the peak, TLB shootdowns were reduced by 21%.
Interrupts were sampled every second while the workload ran to get
those figures. It's known that the figures will vary as the
non-representative load is non-deterministic.

An alternative patch was posted that should have significantly reduced
the TLB flushes, but unfortunately it does not perform as well as this
version on the customer test case. If revisited, the two patches can
stack on top of each other.
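
To put the three result sets above on a common scale, the following is a
minimal sketch that expresses the patched metric as a fraction of the
known-good baseline and as a gain over the current kernel. The struct,
the function names and the reading of the first column as a thread count
are assumptions made for illustration; the numbers are copied from the
tables above. The 001 row is left out because no baseline value is
listed for it.

```c
#include <stddef.h>

/* One row of the results above; field names are illustrative only. */
struct mremap_sample {
	int threads;      /* first column, assumed to be the thread count */
	double baseline;  /* "Baseline that was known good" */
	double current;   /* "Current" */
	double patched;   /* "With patch" */
};

/* The 001 row is omitted: the baseline table has no value for it. */
static const struct mremap_sample samples[] = {
	{   2,  91.05, 56.56, 81.64 },
	{   4, 109.45, 41.22, 67.92 },
	{   8,  73.08, 35.96, 51.67 },
	{  16,  58.14, 36.45, 50.47 },
	{  32,  61.09, 35.71, 52.29 },
	{  64,  57.76, 35.73, 50.01 },
	{ 128,  55.43, 34.96, 49.04 },
};

/* Return the row for a given thread count, or NULL if it is not listed. */
const struct mremap_sample *find_sample(int threads)
{
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		if (samples[i].threads == threads)
			return &samples[i];
	return NULL;
}

/* Fraction of the known-good metric reached with the patch applied. */
double recovered_fraction(const struct mremap_sample *s)
{
	return s->patched / s->baseline;
}

/* Improvement factor of the patched kernel over the current one. */
double gain_over_current(const struct mremap_sample *s)
{
	return s->patched / s->current;
}
```

Calling recovered_fraction(find_sample(128)) gives the share of the
baseline metric that the patch recovers at the highest listed count, and
gain_over_current() gives the corresponding ratio against the regressed
kernel.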

Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
Acked-by: Michal Hocko
---
 mm/mremap.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 049470aa1e3e..5c2e18505f75 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -191,8 +191,6 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 		drop_rmap_locks(vma);
 }
 
-#define LATENCY_LIMIT	(64 * PAGE_SIZE)
-
 unsigned long move_page_tables(struct vm_area_struct *vma,
 		unsigned long old_addr, struct vm_area_struct *new_vma,
 		unsigned long new_addr, unsigned long len,
@@ -247,8 +245,6 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 		next = (new_addr + PMD_SIZE) & PMD_MASK;
 		if (extent > next - new_addr)
 			extent = next - new_addr;
-		if (extent > LATENCY_LIMIT)
-			extent = LATENCY_LIMIT;
 		move_ptes(vma, old_pmd, old_addr, old_addr + extent, new_vma,
 			  new_pmd, new_addr, need_rmap_locks, &need_flush);
 	}
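
The effect of the removed clamp can be modelled in user space. The sketch
below is not kernel code: it only mirrors the extent arithmetic visible
in the hunk above and counts how many batches would be handed to
move_ptes(), each of which can end in a TLB flush. The SKETCH_* constants
assume 4 KiB pages and 512 PTEs per PMD (as on x86-64), and
count_move_batches() is a hypothetical helper name. Under those
assumptions a PMD covers 512 pages, so capping each batch at 64 pages
yields 512 / 64 = 8 batches per PMD, which is where the "factor of 8 in
the ideal case" comes from.

```c
/* Assumed geometry: 4 KiB pages, 512 PTEs per PMD (x86-64 style). */
#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_PMD_SIZE		(512UL * SKETCH_PAGE_SIZE)
#define SKETCH_PMD_MASK		(~(SKETCH_PMD_SIZE - 1))
#define SKETCH_LATENCY_LIMIT	(64UL * SKETCH_PAGE_SIZE)

/*
 * Count the batches the extent loop would pass to move_ptes().
 * A latency_limit of 0 models the patched code, where no cap is applied.
 */
unsigned long count_move_batches(unsigned long old_addr,
				 unsigned long new_addr,
				 unsigned long len,
				 unsigned long latency_limit)
{
	unsigned long old_end = old_addr + len;
	unsigned long batches = 0;

	while (old_addr < old_end) {
		unsigned long extent, next;

		/* Do not cross the next PMD boundary of the source range. */
		next = (old_addr + SKETCH_PMD_SIZE) & SKETCH_PMD_MASK;
		extent = next - old_addr;
		if (extent > old_end - old_addr)
			extent = old_end - old_addr;

		/* Do not cross the next PMD boundary of the destination. */
		next = (new_addr + SKETCH_PMD_SIZE) & SKETCH_PMD_MASK;
		if (extent > next - new_addr)
			extent = next - new_addr;

		/* The cap that this patch removes. */
		if (latency_limit && extent > latency_limit)
			extent = latency_limit;

		batches++;
		old_addr += extent;
		new_addr += extent;
	}

	return batches;
}
```

For a PMD-aligned source and destination and a length of one PMD,
passing SKETCH_LATENCY_LIMIT models the old behaviour and passing 0
models the patched behaviour. Misaligned ranges split at PMD boundaries
in both cases, so the ratio between the two counts is smaller there.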