From patchwork Thu Jun 18 22:22:24 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 11613055
Date: Thu, 18 Jun 2020 15:22:24 -0700
Message-Id: <20200618222225.102337-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.27.0.111.gc72c7da667-goog
Subject: [RFC PATCH v3 0/1] Add rwsem "contended hook" API and mmap_lock histograms
From: Axel Rasmussen
To: Michel Lespinasse, Peter Zijlstra, David Howells
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Jonathan Adams, David Rientjes, Ying Han, Axel Rasmussen

The overall goal of this patch is to add tracepoints around mmap_lock
acquisition. This will let us collect latency histograms, so we can see
how long we block for in the contended case. Our goal is to collect this
data across all of production at Google, so low overhead is critical.

I'm sending this RFC for feedback on the changes to rwsem.{h,c} and
lockdep.h in particular.

I'll describe reasoning for the down_write case, for brevity.

We want to measure the time lock acquisition takes. Naively, this is:

	u64 start = sched_clock();
	down_write(/* ... */);
	trace(sched_clock() - start);

My measurements show that this adds ~5-6% overhead to building a kernel
on a test machine [1]. This level of overhead is unacceptably high.

In the same measurements, instrumenting only the contended case lowers
the overhead to < 1%.

Naively, we can instrument only the contended case like this:

	if (!down_write_trylock(/* ... */))
		/* Time and call down_write as before. */

However, in the case where `_trylock` succeeds, we have lost the lockdep
annotations (e.g. around ordering) `down_write` would normally include.
(Granted, we don't run with lockdep in production, but debug builds do.)

Assume we need the lower overhead, we aren't okay with losing the
lockdep annotations, and we reject the following alternatives to this
patch:

- Making rwsem.c's __down_write and __down_write_trylock public, so
  mmap_lock.c could construct its own version of LOCK_CONTENDED with
  tracepoint calls.

- Having mmap_lock.c reach into rwsem.c's internals with "extern"
  forward declarations for these functions (and removing "static
  inline").

- Somehow adding the instrumentation directly to rwsem.c (either
  affecting all locks, or polluting it some other way).

The remaining alternative, I think, is what this patch proposes: add API
surface to rwsem.h which allows callers to provide instrumentation
callbacks which are invoked in the contended case.

[1]: For measuring the overhead of the instrumentation, I've been timing
a defconfig kernel build. The numbers above come from a KVM instance
with 4 CPUs + 32G RAM, running 5.8-rc1 with this patch applied and a
histogram trigger configured for the acquire_returned tracepoint.

My test script is simple:

	for (( i=0; i<5; ++i)); do
		make mrproper > /dev/null || exit 1
		make defconfig > /dev/null || exit 1
		sync || exit 1
		echo 3 > /proc/sys/vm/drop_caches || exit 1
		/usr/bin/time make -j5 > /dev/null
	done

The numbers I'm giving above are computed as:
(avg of 5 runs with this hist trigger enabled) / (avg on 5.8-rc1).

Axel Rasmussen (1):
  mmap_lock: add tracepoints around mmap_lock acquisition

 include/linux/lockdep.h          |  47 ++++++
 include/linux/mmap_lock.h        |  27 ++-
 include/linux/rwsem.h            |  12 ++
 include/trace/events/mmap_lock.h |  76 +++++++++
 kernel/locking/rwsem.c           |  64 +++++++
 mm/Kconfig                       |  19 +++
 mm/Makefile                      |   1 +
 mm/mmap_lock.c                   | 281 +++++++++++++++++++++++++++++++
 8 files changed, 526 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/mmap_lock.h
 create mode 100644 mm/mmap_lock.c

---
2.27.0.111.gc72c7da667-goog
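
For readers who want to try the "time only the contended case" idea
from the cover letter outside the kernel, here is a minimal userspace
sketch. It is an analogy only, not the patch's code: it uses POSIX
pthread_rwlock_t in place of rwsem, and now_ns(), record_contended(),
timed_wrlock() and the two counters are hypothetical names made up for
illustration. It has no equivalent of lockdep, so it does not show the
annotation problem the patch is trying to solve.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for a tracepoint plus histogram. */
static uint64_t contended_count;
static uint64_t contended_total_ns;

/* Monotonic clock in nanoseconds (userspace analogue of sched_clock()). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Record one contended acquisition that blocked for 'ns' nanoseconds. */
static void record_contended(uint64_t ns)
{
	contended_count++;
	contended_total_ns += ns;
}

/*
 * Take the write lock, reading the clock only if the trylock fails.
 * Returns 0 on success, or the error from pthread_rwlock_wrlock().
 */
static int timed_wrlock(pthread_rwlock_t *lock)
{
	uint64_t start;
	int ret;

	assert(lock != NULL);

	/* Fast path: uncontended, so no clock reads and no recording. */
	if (pthread_rwlock_trywrlock(lock) == 0)
		return 0;

	/* Slow path: time the blocking acquisition. */
	start = now_ns();
	ret = pthread_rwlock_wrlock(lock);
	if (ret == 0)
		record_contended(now_ns() - start);
	return ret;
}
```

A caller would use timed_wrlock(&lock) wherever it previously called
pthread_rwlock_wrlock(&lock), then release with pthread_rwlock_unlock()
as usual. In the uncontended case the only added cost is the failed
branch after the trylock, which is the property the < 1% figure above
relies on.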