From patchwork Tue Aug 30 21:48:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959931 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B069ECAAD4 for ; Tue, 30 Aug 2022 21:49:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231776AbiH3Vtf (ORCPT ); Tue, 30 Aug 2022 17:49:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55584 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231741AbiH3Vt3 (ORCPT ); Tue, 30 Aug 2022 17:49:29 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE3768E9BD for ; Tue, 30 Aug 2022 14:49:27 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-33f8988daecso180733757b3.12 for ; Tue, 30 Aug 2022 14:49:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=b8cpW+nrR3APd+B9XZ294x5uN/r70qZMashrRw3GaA8=; b=B8JrbzRleZqubdBWO9KwQObWYBUFN1DEZsbcRMc0ufm/mVUktKeZg8X1kQOqUZzy93 ncNLrZU8nPu7KazqAMAyKJcikXtJLfl/reW5WBaXn80mOydYU5Xs8aKPQyLo7ErwCgSR 6vTq/fkGb/bg91DRTJvbM/ZhewoDhk6Fzi5TAqWRXDZK39lE+mcUdSuOiAqs+pSJsWjQ 9uZPxBCGuZMakiTU63KXQhu+wIJ9wLTXQN/A1wLZlo6tR7q0gp+RObUZBUjYJR6H/wFq J6OY9ufTA18je5eH0c2mEH8FL4dZmNwHh7WEfnbxz8VCQJOnaA1LbLkW+QDP3I4yxqae dF5g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=b8cpW+nrR3APd+B9XZ294x5uN/r70qZMashrRw3GaA8=; b=lZgZkzz+ev6dJQF9DyVXaGp9/+hOc20czpU9x8X058FweMTpK7S9XVrIbx14E1+/NI VwO4quv/lPW+Tkzvggy1AKbNSKJ9ynyQLq5m280zgWjIscuJkTkA2o5EJZgfDE588esO +Tlhlh9D9nFyctnV0vPqxrTcs9kMIXss8eF/eNa5YBkCG6hM/uz6PKzrb2Fq1B0zuSsI c9iI35Szagv9AEObNioPm+bQgFhK+12OTVict4kfa+kyrkHnNvDZ/X5X15HWLyLNMMIg GAo1TwDKhBcOne4I1WqkdInGgNI1O3o9y/O+v9LBU0CphScHTvYmVubFzdmup5vp5wvs HJMQ== X-Gm-Message-State: ACgBeo3SauZXbv/C7+oRpFW/tZTTylBum14Oqa1qyasmX3exO1TkhoO6 kSnvoLHbC1cgkv6vTpN80EtQCBMz9gk= X-Google-Smtp-Source: AA6agR57tRpK3lCbg8ALQyLqVk3+SD8j/eJQxhkloYdEQAGMf1mIZU8UNti5rfc/2ao5N+5wkH06z2emEi0= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a0d:e650:0:b0:341:85d:f480 with SMTP id p77-20020a0de650000000b00341085df480mr9713169ywe.161.1661896166926; Tue, 30 Aug 2022 14:49:26 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:50 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-2-surenb@google.com> Subject: [RFC PATCH 01/30] kernel/module: move find_kallsyms_symbol_value declaration From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Allow find_kallsyms_symbol_value to be called by code outside of kernel/module. It will be used for code tagging module support. Signed-off-by: Suren Baghdasaryan --- include/linux/module.h | 1 + kernel/module/internal.h | 1 - 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/module.h b/include/linux/module.h index 518296ea7f73..563d38ad84ed 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -605,6 +605,7 @@ struct module *find_module(const char *name); int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type, char *name, char *module_name, int *exported); +unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name); /* Look for this name: can be of form module:name. */ unsigned long module_kallsyms_lookup_name(const char *name); diff --git a/kernel/module/internal.h b/kernel/module/internal.h index 680d980a4fb2..f1b6c477bd93 100644 --- a/kernel/module/internal.h +++ b/kernel/module/internal.h @@ -246,7 +246,6 @@ static inline void kmemleak_load_module(const struct module *mod, void init_build_id(struct module *mod, const struct load_info *info); void layout_symtab(struct module *mod, struct load_info *info); void add_kallsyms(struct module *mod, const struct load_info *info); -unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name); static inline bool sect_empty(const Elf_Shdr *sect) { From patchwork Tue Aug 30 21:48:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959932 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 152FEECAAD4 for ; Tue, 30 Aug 2022 21:49:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231766AbiH3Vth (ORCPT ); Tue, 30 Aug 2022 17:49:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55676 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231193AbiH3Vte (ORCPT ); Tue, 30 Aug 2022 17:49:34 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 538628E9AA for ; Tue, 30 Aug 2022 14:49:30 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id n16-20020a258d10000000b0068df1e297c0so719110ybl.15 for ; Tue, 30 Aug 2022 14:49:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc; bh=nkzMCqKgjBTn+NwznXGF4FkMKBv5sNXeON2LIxVXB6M=; b=kJlmMVVbY7IXNl91XGcBvO3YBT3quRQF4WCS0SiH4LOLG/e+9p2jaPHVw9E6t43+lu d4/47XlZrYUEtJ7XQ4VgzKUrbfj5xy9BOKDP//VyLPwp6hXR8zYzPt3dQTar5Wyr4GYS iZIbWUJ75o4n7+UgTsXC+An80tm2UyPOQgFq5wJiP9bNIbl98o3r9SGFN0OGKpWzVCwL D90b1H3jXLeXTLKjKOyg3XK2WfiVlhvPG/TS9Z+MZLclTKFSEr9IlFYU0A0pKe+sQn6L rk2/ZWVdc/gD2rOE75MuQ1Xdte4UkYIWIQjcJTwqiH4rcTqyxgiTvaJazxYK/g79dGii T0zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc; bh=nkzMCqKgjBTn+NwznXGF4FkMKBv5sNXeON2LIxVXB6M=; b=0tafe3Wv59y2AOiAxCPV9gMP54P7SJx8LDvBgqb/M56of726MS/TcXUZCTMdXtNVBt kho49cTegdKVZRUagHD9+SZdTvm7mEE/tG/YBsfkPa9poMGYnLomWFbQhKPL+Y0vYR3G kZMOmBWwhCEIXp7sIlLmxzzfALQYodfWDEG9Gd09aQm7uijoOV0rHXi58TLOE/uOHDAg R41MJXKRRpmxtt74XNV0HS3alNrhLQ0F11Yo3gfKlOiDKZVaRe6/gB4NpABVN5cXPv51 C3QN6V60fGTw8e8fY35zDeS4SUfccL+Wesasvkfyq7xjj0++FXyzOtDQxeE1CuHb6c70 w9kg== X-Gm-Message-State: ACgBeo3bivP6FzxXKyZRT6wj3fFvqnuIR+m8f0hYlNJC2usvHFiBMnkE OEfIZ8wIldK57MHlS/8ldqD9YD6WQnM= X-Google-Smtp-Source: AA6agR58VLI71n7hWTuP/n58j09zS0GpQ603/kDNR321OvTW6NJyNZKRSs+4OQb0POQ9FkLHMKokHnLGLkM= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:e90e:0:b0:695:64cf:5d2 with SMTP id n14-20020a25e90e000000b0069564cf05d2mr12975575ybd.541.1661896169455; Tue, 30 Aug 2022 14:49:29 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:51 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-3-surenb@google.com> Subject: [RFC PATCH 02/30] lib/string_helpers: Drop space in string_get_size's output From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, Andy Shevchenko , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , "Michael S. Tsirkin" , Jason Wang , " =?utf-8?q?Noralf_Tr=C3=B8nnes?= " Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet Previously, string_get_size() outputted a space between the number and the units, i.e. 9.88 MiB This changes it to 9.88MiB which allows it to be parsed correctly by the 'sort -h' command. Signed-off-by: Kent Overstreet Cc: Andy Shevchenko Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: "Michael S. Tsirkin" Cc: Jason Wang Cc: "Noralf Trønnes" Cc: Jens Axboe --- lib/string_helpers.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/lib/string_helpers.c b/lib/string_helpers.c index 5ed3beb066e6..3032d1b04ca3 100644 --- a/lib/string_helpers.c +++ b/lib/string_helpers.c @@ -126,8 +126,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units, else unit = units_str[units][i]; - snprintf(buf, len, "%u%s %s", (u32)size, - tmp, unit); + snprintf(buf, len, "%u%s%s", (u32)size, tmp, unit); } EXPORT_SYMBOL(string_get_size); From patchwork Tue Aug 30 21:48:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959933 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 469C2ECAAA1 for ; Tue, 30 Aug 2022 21:50:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231879AbiH3VuB (ORCPT ); Tue, 30 Aug 2022 17:50:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55706 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231794AbiH3Vtg (ORCPT ); Tue, 30 Aug 2022 17:49:36 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 11A218FD6D for ; Tue, 30 Aug 2022 14:49:33 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id p12-20020a259e8c000000b006958480b858so716736ybq.12 for ; Tue, 30 Aug 2022 14:49:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=Ax7BfvJ0osMGjEtA73bq+7eMx40gLvPyn7aoZ5x/9O0=; b=hsapsq4Io1duD+oIcMKHPqjZY+p0l934IuKt6ADX53IxWDy+3W9yhUCc9UOoy72l+K yoi0pXSbPO5msleZdGZIvsV2cB7YSDfNeglHGW6FBm8CZZTN50YXVoQQqGyWu9X2u7pD k570L0pSRabwOW7Ke158CR88Up/9Z6lXEL30XT3S5u/I8ECd096CujeXPJFvz+9r3kgB qoVp6eGu0iY42VmNRszNeFxFF2vJTlL/2lnL0neCfdF+yMpw5RYl0YRjkEJWDOUcF8+F QxPQK5wJ7XQtmCBTueYp+wLoE89WGl4y+n7qUFebGmKhLaH9FtY63f9CwEVHHl+9VlmY tEOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=Ax7BfvJ0osMGjEtA73bq+7eMx40gLvPyn7aoZ5x/9O0=; b=pT/r9x1hnjD1muEU4MxU9fm92q91RxC5e7e3a4arc+NbMotRPl/MDL87Cfi6mpVniv 703UQSIEGQN0PQ7UQUbtani4PIdCjrBQePg+9A+VJzhI1Q9IT5iMzRqNM8dUUNsoJF5O Wa9xt5osV3yqNNHoL/QUJHRAIDNyFviKJvBs66s3lQ6cc4YN8fyLeGo8GAucNA1SIo7I jDqLMGnPdXPx6XAvKVdZQzZ3/e7qRHrQCrORsE49LvmwHDrn1jGX8776dup9ltMG7Nxk o7XJTGqJxz11EZyBxbRPm6XyRS9HHjx+pm11CgysFFPRyBGhZRi9AGwXp/DkTHjQCdXe +vMQ== X-Gm-Message-State: ACgBeo2BjQTtfBvj9u/J4c30vC1QN8m4R1ZAW7Xo6TkZUXnT7D8Np/Bv 8Hgzmi0b+BU0UfIKiYJZiiV0jR0xvTQ= X-Google-Smtp-Source: AA6agR7hndW0to/mdEbktFYjBoW+FsmL/An39pt9gov0fVTaelPsArzlVK5iYkOr3o+674D1aZ8YbEGFrHU= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:3406:0:b0:69c:857b:7fd3 with SMTP id b6-20020a253406000000b0069c857b7fd3mr3099193yba.404.1661896172108; Tue, 30 Aug 2022 14:49:32 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:52 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-4-surenb@google.com> Subject: [RFC PATCH 03/30] Lazy percpu counters From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet This patch adds lib/lazy-percpu-counter.c, which implements counters that start out as atomics, but lazily switch to percpu mode if the update rate crosses some threshold (arbitrarily set at 256 per second). Signed-off-by: Kent Overstreet --- include/linux/lazy-percpu-counter.h | 67 +++++++++++++ lib/Kconfig | 3 + lib/Makefile | 2 + lib/lazy-percpu-counter.c | 141 ++++++++++++++++++++++++++++ 4 files changed, 213 insertions(+) create mode 100644 include/linux/lazy-percpu-counter.h create mode 100644 lib/lazy-percpu-counter.c diff --git a/include/linux/lazy-percpu-counter.h b/include/linux/lazy-percpu-counter.h new file mode 100644 index 000000000000..a22a2b9a9f32 --- /dev/null +++ b/include/linux/lazy-percpu-counter.h @@ -0,0 +1,67 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Lazy percpu counters: + * (C) 2022 Kent Overstreet + * + * Lazy percpu counters start out in atomic mode, then switch to percpu mode if + * the update rate crosses some threshold. + * + * This means we don't have to decide between low memory overhead atomic + * counters and higher performance percpu counters - we can have our cake and + * eat it, too! + * + * Internally we use an atomic64_t, where the low bit indicates whether we're in + * percpu mode, and the high 8 bits are a secondary counter that's incremented + * when the counter is modified - meaning 55 bits of precision are available for + * the counter itself. + * + * lazy_percpu_counter is 16 bytes (on 64 bit machines), raw_lazy_percpu_counter + * is 8 bytes but requires a separate unsigned long to record when the counter + * wraps - because sometimes multiple counters are used together and can share + * the same timestamp. + */ + +#ifndef _LINUX_LAZY_PERCPU_COUNTER_H +#define _LINUX_LAZY_PERCPU_COUNTER_H + +struct raw_lazy_percpu_counter { + atomic64_t v; +}; + +void __lazy_percpu_counter_exit(struct raw_lazy_percpu_counter *c); +void __lazy_percpu_counter_add(struct raw_lazy_percpu_counter *c, + unsigned long *last_wrap, s64 i); +s64 __lazy_percpu_counter_read(struct raw_lazy_percpu_counter *c); + +static inline void __lazy_percpu_counter_sub(struct raw_lazy_percpu_counter *c, + unsigned long *last_wrap, s64 i) +{ + __lazy_percpu_counter_add(c, last_wrap, -i); +} + +struct lazy_percpu_counter { + struct raw_lazy_percpu_counter v; + unsigned long last_wrap; +}; + +static inline void lazy_percpu_counter_exit(struct lazy_percpu_counter *c) +{ + __lazy_percpu_counter_exit(&c->v); +} + +static inline void lazy_percpu_counter_add(struct lazy_percpu_counter *c, s64 i) +{ + __lazy_percpu_counter_add(&c->v, &c->last_wrap, i); +} + +static inline void lazy_percpu_counter_sub(struct lazy_percpu_counter *c, s64 i) +{ + __lazy_percpu_counter_sub(&c->v, &c->last_wrap, i); +} + +static inline s64 lazy_percpu_counter_read(struct lazy_percpu_counter *c) +{ + return __lazy_percpu_counter_read(&c->v); +} + +#endif /* _LINUX_LAZY_PERCPU_COUNTER_H */ diff --git a/lib/Kconfig b/lib/Kconfig index dc1ab2ed1dc6..fc6dbc425728 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -498,6 +498,9 @@ config ASSOCIATIVE_ARRAY for more information. +config LAZY_PERCPU_COUNTER + bool + config HAS_IOMEM bool depends on !NO_IOMEM diff --git a/lib/Makefile b/lib/Makefile index ffabc30a27d4..cc7762748708 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -163,6 +163,8 @@ obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o obj-$(CONFIG_DEBUG_LIST) += list_debug.o obj-$(CONFIG_DEBUG_OBJECTS) += debugobjects.o +obj-$(CONFIG_LAZY_PERCPU_COUNTER) += lazy-percpu-counter.o + obj-$(CONFIG_BITREVERSE) += bitrev.o obj-$(CONFIG_LINEAR_RANGES) += linear_ranges.o obj-$(CONFIG_PACKING) += packing.o diff --git a/lib/lazy-percpu-counter.c b/lib/lazy-percpu-counter.c new file mode 100644 index 000000000000..299ef36137ee --- /dev/null +++ b/lib/lazy-percpu-counter.c @@ -0,0 +1,141 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include +#include +#include +#include + +/* + * We use the high bits of the atomic counter for a secondary counter, which is + * incremented every time the counter is touched. When the secondary counter + * wraps, we check the time the counter last wrapped, and if it was recent + * enough that means the update frequency has crossed our threshold and we + * switch to percpu mode: + */ +#define COUNTER_MOD_BITS 8 +#define COUNTER_MOD_MASK ~(~0ULL >> COUNTER_MOD_BITS) +#define COUNTER_MOD_BITS_START (64 - COUNTER_MOD_BITS) + +/* + * We use the low bit of the counter to indicate whether we're in atomic mode + * (low bit clear), or percpu mode (low bit set, counter is a pointer to actual + * percpu counters: + */ +#define COUNTER_IS_PCPU_BIT 1 + +static inline u64 __percpu *lazy_percpu_counter_is_pcpu(u64 v) +{ + if (!(v & COUNTER_IS_PCPU_BIT)) + return NULL; + + v ^= COUNTER_IS_PCPU_BIT; + return (u64 __percpu *)(unsigned long)v; +} + +static inline s64 lazy_percpu_counter_atomic_val(s64 v) +{ + /* Ensure output is sign extended properly: */ + return (v << COUNTER_MOD_BITS) >> + (COUNTER_MOD_BITS + COUNTER_IS_PCPU_BIT); +} + +static void lazy_percpu_counter_switch_to_pcpu(struct raw_lazy_percpu_counter *c) +{ + u64 __percpu *pcpu_v = alloc_percpu_gfp(u64, GFP_ATOMIC|__GFP_NOWARN); + u64 old, new, v; + + if (!pcpu_v) + return; + + preempt_disable(); + v = atomic64_read(&c->v); + do { + if (lazy_percpu_counter_is_pcpu(v)) { + free_percpu(pcpu_v); + return; + } + + old = v; + new = (unsigned long)pcpu_v | 1; + + *this_cpu_ptr(pcpu_v) = lazy_percpu_counter_atomic_val(v); + } while ((v = atomic64_cmpxchg(&c->v, old, new)) != old); + preempt_enable(); +} + +/** + * __lazy_percpu_counter_exit: Free resources associated with a + * raw_lazy_percpu_counter + * + * @c: counter to exit + */ +void __lazy_percpu_counter_exit(struct raw_lazy_percpu_counter *c) +{ + free_percpu(lazy_percpu_counter_is_pcpu(atomic64_read(&c->v))); +} +EXPORT_SYMBOL_GPL(__lazy_percpu_counter_exit); + +/** + * __lazy_percpu_counter_read: Read current value of a raw_lazy_percpu_counter + * + * @c: counter to read + */ +s64 __lazy_percpu_counter_read(struct raw_lazy_percpu_counter *c) +{ + s64 v = atomic64_read(&c->v); + u64 __percpu *pcpu_v = lazy_percpu_counter_is_pcpu(v); + + if (pcpu_v) { + int cpu; + + v = 0; + for_each_possible_cpu(cpu) + v += *per_cpu_ptr(pcpu_v, cpu); + } else { + v = lazy_percpu_counter_atomic_val(v); + } + + return v; +} +EXPORT_SYMBOL_GPL(__lazy_percpu_counter_read); + +/** + * __lazy_percpu_counter_add: Add a value to a lazy_percpu_counter + * + * @c: counter to modify + * @last_wrap: pointer to a timestamp, updated when mod counter wraps + * @i: value to add + */ +void __lazy_percpu_counter_add(struct raw_lazy_percpu_counter *c, + unsigned long *last_wrap, s64 i) +{ + u64 atomic_i; + u64 old, v = atomic64_read(&c->v); + u64 __percpu *pcpu_v; + + atomic_i = i << COUNTER_IS_PCPU_BIT; + atomic_i &= ~COUNTER_MOD_MASK; + atomic_i |= 1ULL << COUNTER_MOD_BITS_START; + + do { + pcpu_v = lazy_percpu_counter_is_pcpu(v); + if (pcpu_v) { + this_cpu_add(*pcpu_v, i); + return; + } + + old = v; + } while ((v = atomic64_cmpxchg(&c->v, old, old + atomic_i)) != old); + + if (unlikely(!(v & COUNTER_MOD_MASK))) { + unsigned long now = jiffies; + + if (*last_wrap && + unlikely(time_after(*last_wrap + HZ, now))) + lazy_percpu_counter_switch_to_pcpu(c); + else + *last_wrap = now; + } +} +EXPORT_SYMBOL(__lazy_percpu_counter_add); From patchwork Tue Aug 30 21:48:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959934 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA264ECAAD4 for ; Tue, 30 Aug 2022 21:50:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231888AbiH3VuP (ORCPT ); Tue, 30 Aug 2022 17:50:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55796 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231801AbiH3Vti (ORCPT ); Tue, 30 Aug 2022 17:49:38 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5898F90199 for ; Tue, 30 Aug 2022 14:49:35 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id i194-20020a253bcb000000b00676d86fc5d7so708292yba.9 for ; Tue, 30 Aug 2022 14:49:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=Ximx3byfJ7rLtdPEIh1Ra0raz9MzgSTr/ga3ge5UvWM=; b=hlMw9ssXTclldlCaHPrcXJ+C8bd3xApn5Llhx/6gZpRfRlRJOwhqWghtr5RVB1LK2v 8FfTzwl4OCrCjFuadtzRuPxbLUjQJudmMLjVKcB1W8L4nbh23VnBIJLeDBpv1hunXsFS TNZP/OkUwJBjRdLHnTFXsfbD3FCLleefIIXeH80z2Nq7UmGRffrhNJtzi3K0o37plX80 MxuLJZry+KmCa87qIRicDj7F3K2/cxyGqrCNJyg48FflYe+16BIsOuCNrNLVMYU6qqtM jxbTTAqnCt+yOzPfECe6hBLshpNDmGjFaoiWgb1ccH/WG7XlQYWhlthKRprfEZxNn5xs 7PgA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=Ximx3byfJ7rLtdPEIh1Ra0raz9MzgSTr/ga3ge5UvWM=; b=v8MQSxCHA+efW1RWw4728UAX2BVl5V6cMksdsZeeZV8J24tGKKRVIGWYKNv1Mtdz3W tI41BEdzwOpvr4Run8BRim8ikU8rZMyApNYFvlDvF2Sv5Dww6qGCxoqmKCaV6Ve/KEPn qIMQnWBr596ZROrotIAGlCe2U8Tcw8q8wNTwIiPAoOiJrMkg3KThTj2uBXve/JIGgVuL QPMBL6j2p7aKXqBDxzp1CcFSDaGVEmB0xvwKPwxYjSkcs8d9pPsk67gbs+hpGt/Gq7pI 3e+DwiQtim/afkZriuj/QeLfwsAH9xeHT1T0eM06RQ2+CWnqTLx+9yQ8hDtOyEokJkWC ZlfQ== X-Gm-Message-State: ACgBeo0vlhuz7OEWS9ncZmlITLTxwB0jaYo4CoAc7AQ5VWg0Pd8aJJjN TDUK74BxBOBjLoYU9W94dibOE/fWY3U= X-Google-Smtp-Source: AA6agR7neVKTaIOQpaQTJJtMxLoXgZYEqvCUk9V+fUBv2iEO9EJWDyjyjmM2UWOPn+DCBvrBJ11meMcGOpo= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a0d:d98c:0:b0:33d:c482:9714 with SMTP id b134-20020a0dd98c000000b0033dc4829714mr15776751ywe.415.1661896174717; Tue, 30 Aug 2022 14:49:34 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:53 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-5-surenb@google.com> Subject: [RFC PATCH 04/30] scripts/kallysms: Always include __start and __stop symbols From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet These symbols are used to denote section boundaries: by always including them we can unify loading sections from modules with loading built-in sections, which leads to some significant cleanup. Signed-off-by: Kent Overstreet --- scripts/kallsyms.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c index f18e6dfc68c5..3d51639a595d 100644 --- a/scripts/kallsyms.c +++ b/scripts/kallsyms.c @@ -263,6 +263,11 @@ static int symbol_in_range(const struct sym_entry *s, return 0; } +static bool string_starts_with(const char *s, const char *prefix) +{ + return strncmp(s, prefix, strlen(prefix)) == 0; +} + static int symbol_valid(const struct sym_entry *s) { const char *name = sym_name(s); @@ -270,6 +275,14 @@ static int symbol_valid(const struct sym_entry *s) /* if --all-symbols is not specified, then symbols outside the text * and inittext sections are discarded */ if (!all_symbols) { + /* + * Symbols starting with __start and __stop are used to denote + * section boundaries, and should always be included: + */ + if (string_starts_with(name, "__start_") || + string_starts_with(name, "__stop_")) + return 1; + if (symbol_in_range(s, text_ranges, ARRAY_SIZE(text_ranges)) == 0) return 0; From patchwork Tue Aug 30 21:48:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959935 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A00F7ECAAD5 for ; Tue, 30 Aug 2022 21:50:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231876AbiH3Vue (ORCPT ); Tue, 30 Aug 2022 17:50:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231793AbiH3VuA (ORCPT ); Tue, 30 Aug 2022 17:50:00 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DEA61901AD for ; Tue, 30 Aug 2022 14:49:38 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-33d9f6f4656so190937337b3.21 for ; Tue, 30 Aug 2022 14:49:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=kcUwO49ZlU6bl/Tm+t4qMFM3AvAt3rlv3WDq12yXCn8=; b=IB5sn3g8G5XCtC4vvUkMJf3Z0NCybbDF4PfUg7xY8jaPrHiWi7tBKXZi3iw7JzW/hg R1rg3QKwvg2o4P0UJORvK/vnl+yrRARG5QS+tHVqZj7Wlke8B5UL2GccofvWmM5QTLmm X07Mo57+bKLwACVIf4YGvfOrnuhdmQMRu0jBSFOkXWJ/DCc2JiFJS2dKowFDk3y0y1lo ++gDXN1YD0J2obdelJxD0Ca+lRDY4iXl62HepJKu5B9nBoEEZMQwIAzfpufGcQaLvs6J LfNF8Ip/phVmlaMWNU5bNvIxjy7nc6YboQjzsEA9CsIOA+UQlQ9lwqxZiOCG8LXjXtLj HTKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=kcUwO49ZlU6bl/Tm+t4qMFM3AvAt3rlv3WDq12yXCn8=; b=s5L7Zgwgj/lrQ0lehIfZdjo5Au6VPE1nUMva7L4p1X+bdI3MUHynPQQ/i3P4rYXN0Y othJ47R5lUwY6HYj/xE3VqFyQ5hkDC4k6GQe/y+uECv3dpTxIaO6xMyhneq3tPq4Fp9f U38cvkX6sOVrLC3Io+NsG3q2T2uHjVRH4W6Z0qdW546W9cebekt5VGM7uUibZPqCU0Hf zbKSSDJlES8K2/FJBgh66Kr+md/V8Bl7TQhYYqrQLeImD7o9r1AASqVLaaJ/S0r1eKBK LUiFo5U9P/mg6x0C/wlQOariUyRZDKzamjomJmO7++QO2bRKpnaVmfdV69YYl6Ii+Ftd U9TQ== X-Gm-Message-State: ACgBeo1+jSx0JDzWmSPioTPFZiuHnbieoRZe1eDKgqGctHFyW39C5hfV KfhVn5EyxGvzcJhWvw9iD+duIf9DRDM= X-Google-Smtp-Source: AA6agR6vbeCOP++6z+390hwytf4tDzdTpspO7jE9UZOGKy2Zs51apVk0rDEGDjb4QkAG6+ILkunHIACE8Y4= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a5b:18d:0:b0:695:a9d7:44b5 with SMTP id r13-20020a5b018d000000b00695a9d744b5mr13051126ybl.549.1661896177035; Tue, 30 Aug 2022 14:49:37 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:54 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-6-surenb@google.com> Subject: [RFC PATCH 05/30] lib: code tagging framework From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Add basic infrastructure to support code tagging which stores tag common information consisting of the module name, function, file name and line number. Provide functions to register a new code tag type and navigate between code tags. Co-developed-by: Kent Overstreet Signed-off-by: Kent Overstreet Signed-off-by: Suren Baghdasaryan --- include/linux/codetag.h | 71 ++++++++++++++ lib/Kconfig.debug | 4 + lib/Makefile | 1 + lib/codetag.c | 199 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 275 insertions(+) create mode 100644 include/linux/codetag.h create mode 100644 lib/codetag.c diff --git a/include/linux/codetag.h b/include/linux/codetag.h new file mode 100644 index 000000000000..a9d7adecc2a5 --- /dev/null +++ b/include/linux/codetag.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * code tagging framework + */ +#ifndef _LINUX_CODETAG_H +#define _LINUX_CODETAG_H + +#include + +struct codetag_iterator; +struct codetag_type; +struct seq_buf; +struct module; + +/* + * An instance of this structure is created in a special ELF section at every + * code location being tagged. At runtime, the special section is treated as + * an array of these. + */ +struct codetag { + unsigned int flags; /* used in later patches */ + unsigned int lineno; + const char *modname; + const char *function; + const char *filename; +} __aligned(8); + +union codetag_ref { + struct codetag *ct; +}; + +struct codetag_range { + struct codetag *start; + struct codetag *stop; +}; + +struct codetag_module { + struct module *mod; + struct codetag_range range; +}; + +struct codetag_type_desc { + const char *section; + size_t tag_size; +}; + +struct codetag_iterator { + struct codetag_type *cttype; + struct codetag_module *cmod; + unsigned long mod_id; + struct codetag *ct; +}; + +#define CODE_TAG_INIT { \ + .modname = KBUILD_MODNAME, \ + .function = __func__, \ + .filename = __FILE__, \ + .lineno = __LINE__, \ + .flags = 0, \ +} + +void codetag_lock_module_list(struct codetag_type *cttype, bool lock); +struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype); +struct codetag *codetag_next_ct(struct codetag_iterator *iter); + +void codetag_to_text(struct seq_buf *out, struct codetag *ct); + +struct codetag_type * +codetag_register_type(const struct codetag_type_desc *desc); + +#endif /* _LINUX_CODETAG_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index bcbe60d6c80c..22bc1eff7f8f 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -969,6 +969,10 @@ config DEBUG_STACKOVERFLOW If in doubt, say "N". +config CODE_TAGGING + bool + select KALLSYMS + source "lib/Kconfig.kasan" source "lib/Kconfig.kfence" diff --git a/lib/Makefile b/lib/Makefile index cc7762748708..574d7716e640 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -227,6 +227,7 @@ obj-$(CONFIG_OF_RECONFIG_NOTIFIER_ERROR_INJECT) += \ of-reconfig-notifier-error-inject.o obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o +obj-$(CONFIG_CODE_TAGGING) += codetag.o lib-$(CONFIG_GENERIC_BUG) += bug.o obj-$(CONFIG_HAVE_ARCH_TRACEHOOK) += syscall.o diff --git a/lib/codetag.c b/lib/codetag.c new file mode 100644 index 000000000000..7708f8388e55 --- /dev/null +++ b/lib/codetag.c @@ -0,0 +1,199 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include +#include + +struct codetag_type { + struct list_head link; + unsigned int count; + struct idr mod_idr; + struct rw_semaphore mod_lock; /* protects mod_idr */ + struct codetag_type_desc desc; +}; + +static DEFINE_MUTEX(codetag_lock); +static LIST_HEAD(codetag_types); + +void codetag_lock_module_list(struct codetag_type *cttype, bool lock) +{ + if (lock) + down_read(&cttype->mod_lock); + else + up_read(&cttype->mod_lock); +} + +struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype) +{ + struct codetag_iterator iter = { + .cttype = cttype, + .cmod = NULL, + .mod_id = 0, + .ct = NULL, + }; + + return iter; +} + +static inline struct codetag *get_first_module_ct(struct codetag_module *cmod) +{ + return cmod->range.start < cmod->range.stop ? cmod->range.start : NULL; +} + +static inline +struct codetag *get_next_module_ct(struct codetag_iterator *iter) +{ + struct codetag *res = (struct codetag *) + ((char *)iter->ct + iter->cttype->desc.tag_size); + + return res < iter->cmod->range.stop ? res : NULL; +} + +struct codetag *codetag_next_ct(struct codetag_iterator *iter) +{ + struct codetag_type *cttype = iter->cttype; + struct codetag_module *cmod; + struct codetag *ct; + + lockdep_assert_held(&cttype->mod_lock); + + if (unlikely(idr_is_empty(&cttype->mod_idr))) + return NULL; + + ct = NULL; + while (true) { + cmod = idr_find(&cttype->mod_idr, iter->mod_id); + + /* If module was removed move to the next one */ + if (!cmod) + cmod = idr_get_next_ul(&cttype->mod_idr, + &iter->mod_id); + + /* Exit if no more modules */ + if (!cmod) + break; + + if (cmod != iter->cmod) { + iter->cmod = cmod; + ct = get_first_module_ct(cmod); + } else + ct = get_next_module_ct(iter); + + if (ct) + break; + + iter->mod_id++; + } + + iter->ct = ct; + return ct; +} + +void codetag_to_text(struct seq_buf *out, struct codetag *ct) +{ + seq_buf_printf(out, "%s:%u module:%s func:%s", + ct->filename, ct->lineno, + ct->modname, ct->function); +} + +static inline size_t range_size(const struct codetag_type *cttype, + const struct codetag_range *range) +{ + return ((char *)range->stop - (char *)range->start) / + cttype->desc.tag_size; +} + +static void *get_symbol(struct module *mod, const char *prefix, const char *name) +{ + char buf[64]; + int res; + + res = snprintf(buf, sizeof(buf), "%s%s", prefix, name); + if (WARN_ON(res < 1 || res > sizeof(buf))) + return NULL; + + return mod ? + (void *)find_kallsyms_symbol_value(mod, buf) : + (void *)kallsyms_lookup_name(buf); +} + +static struct codetag_range get_section_range(struct module *mod, + const char *section) +{ + return (struct codetag_range) { + get_symbol(mod, "__start_", section), + get_symbol(mod, "__stop_", section), + }; +} + +static int codetag_module_init(struct codetag_type *cttype, struct module *mod) +{ + struct codetag_range range; + struct codetag_module *cmod; + int err; + + range = get_section_range(mod, cttype->desc.section); + if (!range.start || !range.stop) { + pr_warn("Failed to load code tags of type %s from the module %s\n", + cttype->desc.section, + mod ? mod->name : "(built-in)"); + return -EINVAL; + } + + /* Ignore empty ranges */ + if (range.start == range.stop) + return 0; + + BUG_ON(range.start > range.stop); + + cmod = kmalloc(sizeof(*cmod), GFP_KERNEL); + if (unlikely(!cmod)) + return -ENOMEM; + + cmod->mod = mod; + cmod->range = range; + + down_write(&cttype->mod_lock); + err = idr_alloc(&cttype->mod_idr, cmod, 0, 0, GFP_KERNEL); + if (err >= 0) + cttype->count += range_size(cttype, &range); + up_write(&cttype->mod_lock); + + if (err < 0) { + kfree(cmod); + return err; + } + + return 0; +} + +struct codetag_type * +codetag_register_type(const struct codetag_type_desc *desc) +{ + struct codetag_type *cttype; + int err; + + BUG_ON(desc->tag_size <= 0); + + cttype = kzalloc(sizeof(*cttype), GFP_KERNEL); + if (unlikely(!cttype)) + return ERR_PTR(-ENOMEM); + + cttype->desc = *desc; + idr_init(&cttype->mod_idr); + init_rwsem(&cttype->mod_lock); + + err = codetag_module_init(cttype, NULL); + if (unlikely(err)) { + kfree(cttype); + return ERR_PTR(err); + } + + mutex_lock(&codetag_lock); + list_add_tail(&cttype->link, &codetag_types); + mutex_unlock(&codetag_lock); + + return cttype; +} From patchwork Tue Aug 30 21:48:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959937 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EFA48C54EE9 for ; Tue, 30 Aug 2022 21:50:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231834AbiH3Vui (ORCPT ); Tue, 30 Aug 2022 17:50:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58570 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231768AbiH3VuQ (ORCPT ); Tue, 30 Aug 2022 17:50:16 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C75BD90836 for ; Tue, 30 Aug 2022 14:49:40 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-33dbfb6d2a3so188939427b3.11 for ; Tue, 30 Aug 2022 14:49:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=98WyT1z35HcvSKL2J7VN7g/2jl2Mia82SjRaCpg7p/8=; b=ppQwolWM0xAnRwum2KKY24m6a3yzqCKoXnrVhpHLwdzhEX68q/9GWwURRXRAJwAXGE 4hx2+McVJ/jsyFytRFbe+OPGYzeVEhiv30wYy3Dbv2te49LJhVOTWhRl89blVyPEgJ5w k3Kn066cwQfYjgsoqDMdWYBFuCQId3TWaakWpKvAmlWCCQJJyDWDc/PSlpx9kgm9lJqq PzMv/4CnJ/6J/uOAkl+MbChdqUZ3xBMeEP2NgbSdh/KBp25WeMEzsbU/mEvRjqtr/S/B +8o+9cRhHtG5HXvPx6nUDeXaaa6TO2tjt+mMpb6RSrzIfgp9N/6Vm98rkMF4T4b9tOCX x/hg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=98WyT1z35HcvSKL2J7VN7g/2jl2Mia82SjRaCpg7p/8=; b=AM37ddOalhRgHUe5dBtyGsPVWr1y5liT0jX/DLAbef4uJHql/L1PDtxM/VfavcyPmD 47rufxD4is7ge74H7bW1FfeqXiaWzKQ8w6CqtYiLlABJXYrDAVUHf5dF963p5QcZQw+a j577JXlFsD6luCA15yIchlk5Kdhs8EUXfIYsviD7HzJXm4WFCOvIBEzp9LKztIQqbEBl KlvC+n0I4nTQ5G5oX+5FUbNByuSkTMIQh7Id2FtSTcJHZhx9QBcvTksoAy3NqlTmHlqH A/vUiRYrlFITMotDa73+KVwTIp8G/YZzpkUl1fpyz0MpLG1RpV7b/+aYtHkHFWPhyBgr whNQ== X-Gm-Message-State: ACgBeo1WbkFBqvbpMQvNWZ0OEzMpEYV5k6oeT0iWy5wKvI/8wo7fcbnj wVXUYsrYL+wm+ExkjMdeVQCGlONNRkI= X-Google-Smtp-Source: AA6agR6q8iUNu3CLxqQx3h199RKUO7Cut/Mgh7jMzVJUDX7OllJVViu9/IZhn89ZbhwU0FUSjxhVMQRhhhM= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a05:6902:2cb:b0:684:aebe:49ab with SMTP id w11-20020a05690202cb00b00684aebe49abmr13690932ybh.242.1661896179352; Tue, 30 Aug 2022 14:49:39 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:55 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-7-surenb@google.com> Subject: [RFC PATCH 06/30] lib: code tagging module support From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Add support for code tagging from dynamically loaded modules. Signed-off-by: Suren Baghdasaryan Co-developed-by: Kent Overstreet Signed-off-by: Kent Overstreet --- include/linux/codetag.h | 12 ++++++++++ kernel/module/main.c | 4 ++++ lib/codetag.c | 51 ++++++++++++++++++++++++++++++++++++++++- 3 files changed, 66 insertions(+), 1 deletion(-) diff --git a/include/linux/codetag.h b/include/linux/codetag.h index a9d7adecc2a5..386733e89b31 100644 --- a/include/linux/codetag.h +++ b/include/linux/codetag.h @@ -42,6 +42,10 @@ struct codetag_module { struct codetag_type_desc { const char *section; size_t tag_size; + void (*module_load)(struct codetag_type *cttype, + struct codetag_module *cmod); + void (*module_unload)(struct codetag_type *cttype, + struct codetag_module *cmod); }; struct codetag_iterator { @@ -68,4 +72,12 @@ void codetag_to_text(struct seq_buf *out, struct codetag *ct); struct codetag_type * codetag_register_type(const struct codetag_type_desc *desc); +#ifdef CONFIG_CODE_TAGGING +void codetag_load_module(struct module *mod); +void codetag_unload_module(struct module *mod); +#else +static inline void codetag_load_module(struct module *mod) {} +static inline void codetag_unload_module(struct module *mod) {} +#endif + #endif /* _LINUX_CODETAG_H */ diff --git a/kernel/module/main.c b/kernel/module/main.c index a4e4d84b6f4e..d253277492fd 100644 --- a/kernel/module/main.c +++ b/kernel/module/main.c @@ -53,6 +53,7 @@ #include #include #include +#include #include #include "internal.h" @@ -1151,6 +1152,7 @@ static void free_module(struct module *mod) { trace_module_free(mod); + codetag_unload_module(mod); mod_sysfs_teardown(mod); /* @@ -2849,6 +2851,8 @@ static int load_module(struct load_info *info, const char __user *uargs, /* Get rid of temporary copy. */ free_copy(info, flags); + codetag_load_module(mod); + /* Done! */ trace_module_load(mod); diff --git a/lib/codetag.c b/lib/codetag.c index 7708f8388e55..f0a3174f9b71 100644 --- a/lib/codetag.c +++ b/lib/codetag.c @@ -157,8 +157,11 @@ static int codetag_module_init(struct codetag_type *cttype, struct module *mod) down_write(&cttype->mod_lock); err = idr_alloc(&cttype->mod_idr, cmod, 0, 0, GFP_KERNEL); - if (err >= 0) + if (err >= 0) { cttype->count += range_size(cttype, &range); + if (cttype->desc.module_load) + cttype->desc.module_load(cttype, cmod); + } up_write(&cttype->mod_lock); if (err < 0) { @@ -197,3 +200,49 @@ codetag_register_type(const struct codetag_type_desc *desc) return cttype; } + +void codetag_load_module(struct module *mod) +{ + struct codetag_type *cttype; + + if (!mod) + return; + + mutex_lock(&codetag_lock); + list_for_each_entry(cttype, &codetag_types, link) + codetag_module_init(cttype, mod); + mutex_unlock(&codetag_lock); +} + +void codetag_unload_module(struct module *mod) +{ + struct codetag_type *cttype; + + if (!mod) + return; + + mutex_lock(&codetag_lock); + list_for_each_entry(cttype, &codetag_types, link) { + struct codetag_module *found = NULL; + struct codetag_module *cmod; + unsigned long mod_id, tmp; + + down_write(&cttype->mod_lock); + idr_for_each_entry_ul(&cttype->mod_idr, cmod, tmp, mod_id) { + if (cmod->mod && cmod->mod == mod) { + found = cmod; + break; + } + } + if (found) { + if (cttype->desc.module_unload) + cttype->desc.module_unload(cttype, cmod); + + cttype->count -= range_size(cttype, &cmod->range); + idr_remove(&cttype->mod_idr, mod_id); + kfree(cmod); + } + up_write(&cttype->mod_lock); + } + mutex_unlock(&codetag_lock); +} From patchwork Tue Aug 30 21:48:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959936 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60999ECAAA1 for ; Tue, 30 Aug 2022 21:50:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231894AbiH3Vug (ORCPT ); Tue, 30 Aug 2022 17:50:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55724 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231895AbiH3VuQ (ORCPT ); Tue, 30 Aug 2022 17:50:16 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17B3391097 for ; Tue, 30 Aug 2022 14:49:42 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-335420c7bfeso188916837b3.16 for ; Tue, 30 Aug 2022 14:49:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=v/Q5RZvHm0ms6516zJ+eaQEhTJBhgieG8iYB8LZreqU=; b=dSiwJ3rjpD9k+D3ejPNjLVoeg6kQsBAXkS5tHCQ4weDhFLA/IwvZ3Q8/4xxcyoMJ6e isFuqbINTHNxbJqZ7EEbgu9o53Xs56RjzRvuvf4eTeoYNkY9SG5lERFqePgEz5jBEIRg QvId775FjFECHp0OGiyatkPffLAyhQT41MfBQyoy2O41OqYeYbXsV3zL5yP1EhqhzEDG Ak4e/fA+4cROH7t1gmYSTnptvwHELHiW/qlFXek1xTxDu9n3XVI1sWuwSvdDGMPhmsh6 FLJrfKaSpeVUrcOZuw1Sf/dHnHK+0uqWtXMcZsMOZh3YPgQfnsV4y3UN2W3I7st9KiwO 9X+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=v/Q5RZvHm0ms6516zJ+eaQEhTJBhgieG8iYB8LZreqU=; b=f8FpcA9dpb5YJNup/k03d2+rE8p+hq9lD1fmw+2pspg6XpLMe//VgGlrwp2fF4J2Am y6065K4se5l04IbGkkItJ3gzbkXZlt94oVBXbobIY1ZnHgXqUpKYbmyPuNzDccD7sK3r haVIiQFmdZ7hB+1TQ14w5kR5PZ6Q71ja7bf16vwFIKxHVVH55aB6GrK0mJJ9BG6bmMk7 7TQKiiVxlXoyzuFrbso2J9iCJ+wTyvjBO6syF1qAD61pW7729wJotzouT91wxLLC5kt5 KA9tKM08TNK847jzXa6gG/nHWvC6kWX843uqdUFpqUUcjj2OkywO5/gEy5OHoCK5tMVB 4znA== X-Gm-Message-State: ACgBeo2nr/k8qwAkQS1B0x4lQYW4Yamr23EafA8iFb0WWMyNzruXD11P EpNJU9pAZYgxGTQlnWkaooUCI57OCAk= X-Google-Smtp-Source: AA6agR5yxP3HtYw1/nnJdlitQAiJ4+t1NlijduSUdtY4FE0+dYeCh7gN+ZXU56IyxNeDRenvKE9hkSpDyvU= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:2586:0:b0:695:9529:c9a6 with SMTP id l128-20020a252586000000b006959529c9a6mr13158054ybl.591.1661896181976; Tue, 30 Aug 2022 14:49:41 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:56 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-8-surenb@google.com> Subject: [RFC PATCH 07/30] lib: add support for allocation tagging From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Introduce CONFIG_ALLOC_TAGGING which provides definitions to easily instrument allocators. It also registers an "alloc_tags" codetag type with defbugfs interface to output allocation tags information. Signed-off-by: Suren Baghdasaryan Co-developed-by: Kent Overstreet Signed-off-by: Kent Overstreet --- include/asm-generic/codetag.lds.h | 14 +++ include/asm-generic/vmlinux.lds.h | 3 + include/linux/alloc_tag.h | 66 +++++++++++++ lib/Kconfig.debug | 5 + lib/Makefile | 2 + lib/alloc_tag.c | 158 ++++++++++++++++++++++++++++++ scripts/module.lds.S | 7 ++ 7 files changed, 255 insertions(+) create mode 100644 include/asm-generic/codetag.lds.h create mode 100644 include/linux/alloc_tag.h create mode 100644 lib/alloc_tag.c diff --git a/include/asm-generic/codetag.lds.h b/include/asm-generic/codetag.lds.h new file mode 100644 index 000000000000..64f536b80380 --- /dev/null +++ b/include/asm-generic/codetag.lds.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef __ASM_GENERIC_CODETAG_LDS_H +#define __ASM_GENERIC_CODETAG_LDS_H + +#define SECTION_WITH_BOUNDARIES(_name) \ + . = ALIGN(8); \ + __start_##_name = .; \ + KEEP(*(_name)) \ + __stop_##_name = .; + +#define CODETAG_SECTIONS() \ + SECTION_WITH_BOUNDARIES(alloc_tags) + +#endif /* __ASM_GENERIC_CODETAG_LDS_H */ diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 7515a465ec03..c2dc2a59ab2e 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -50,6 +50,8 @@ * [__nosave_begin, __nosave_end] for the nosave data */ +#include + #ifndef LOAD_OFFSET #define LOAD_OFFSET 0 #endif @@ -348,6 +350,7 @@ __start___dyndbg = .; \ KEEP(*(__dyndbg)) \ __stop___dyndbg = .; \ + CODETAG_SECTIONS() \ LIKELY_PROFILE() \ BRANCH_PROFILE() \ TRACE_PRINTKS() \ diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h new file mode 100644 index 000000000000..b3f589afb1c9 --- /dev/null +++ b/include/linux/alloc_tag.h @@ -0,0 +1,66 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * allocation tagging + */ +#ifndef _LINUX_ALLOC_TAG_H +#define _LINUX_ALLOC_TAG_H + +#include +#include +#include +#include + +/* + * An instance of this structure is created in a special ELF section at every + * allocation callsite. At runtime, the special section is treated as + * an array of these. Embedded codetag utilizes codetag framework. + */ +struct alloc_tag { + struct codetag ct; + unsigned long last_wrap; + struct raw_lazy_percpu_counter call_count; + struct raw_lazy_percpu_counter bytes_allocated; +} __aligned(8); + +static inline struct alloc_tag *ct_to_alloc_tag(struct codetag *ct) +{ + return container_of(ct, struct alloc_tag, ct); +} + +#define DEFINE_ALLOC_TAG(_alloc_tag) \ + static struct alloc_tag _alloc_tag __used __aligned(8) \ + __section("alloc_tags") = { .ct = CODE_TAG_INIT } + +#define alloc_tag_counter_read(counter) \ + __lazy_percpu_counter_read(counter) + +static inline void __alloc_tag_sub(union codetag_ref *ref, size_t bytes) +{ + struct alloc_tag *tag = ct_to_alloc_tag(ref->ct); + + __lazy_percpu_counter_add(&tag->call_count, &tag->last_wrap, -1); + __lazy_percpu_counter_add(&tag->bytes_allocated, &tag->last_wrap, -bytes); + ref->ct = NULL; +} + +#define alloc_tag_sub(_ref, _bytes) \ +do { \ + if ((_ref) && (_ref)->ct) \ + __alloc_tag_sub(_ref, _bytes); \ +} while (0) + +static inline void __alloc_tag_add(struct alloc_tag *tag, union codetag_ref *ref, size_t bytes) +{ + ref->ct = &tag->ct; + __lazy_percpu_counter_add(&tag->call_count, &tag->last_wrap, 1); + __lazy_percpu_counter_add(&tag->bytes_allocated, &tag->last_wrap, bytes); +} + +#define alloc_tag_add(_ref, _bytes) \ +do { \ + DEFINE_ALLOC_TAG(_alloc_tag); \ + if (_ref && !WARN_ONCE(_ref->ct, "alloc_tag was not cleared")) \ + __alloc_tag_add(&_alloc_tag, _ref, _bytes); \ +} while (0) + +#endif /* _LINUX_ALLOC_TAG_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 22bc1eff7f8f..795bf6993f8a 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -973,6 +973,11 @@ config CODE_TAGGING bool select KALLSYMS +config ALLOC_TAGGING + bool + select CODE_TAGGING + select LAZY_PERCPU_COUNTER + source "lib/Kconfig.kasan" source "lib/Kconfig.kfence" diff --git a/lib/Makefile b/lib/Makefile index 574d7716e640..dc00533fc5c8 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -228,6 +228,8 @@ obj-$(CONFIG_OF_RECONFIG_NOTIFIER_ERROR_INJECT) += \ obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o obj-$(CONFIG_CODE_TAGGING) += codetag.o +obj-$(CONFIG_ALLOC_TAGGING) += alloc_tag.o + lib-$(CONFIG_GENERIC_BUG) += bug.o obj-$(CONFIG_HAVE_ARCH_TRACEHOOK) += syscall.o diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c new file mode 100644 index 000000000000..082fbde184ef --- /dev/null +++ b/lib/alloc_tag.c @@ -0,0 +1,158 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include +#include +#include + +#ifdef CONFIG_DEBUG_FS + +struct alloc_tag_file_iterator { + struct codetag_iterator ct_iter; + struct seq_buf buf; + char rawbuf[4096]; +}; + +struct user_buf { + char __user *buf; /* destination user buffer */ + size_t size; /* size of requested read */ + ssize_t ret; /* bytes read so far */ +}; + +static int flush_ubuf(struct user_buf *dst, struct seq_buf *src) +{ + if (src->len) { + size_t bytes = min_t(size_t, src->len, dst->size); + int err = copy_to_user(dst->buf, src->buffer, bytes); + + if (err) + return err; + + dst->ret += bytes; + dst->buf += bytes; + dst->size -= bytes; + src->len -= bytes; + memmove(src->buffer, src->buffer + bytes, src->len); + } + + return 0; +} + +static int alloc_tag_file_open(struct inode *inode, struct file *file) +{ + struct codetag_type *cttype = inode->i_private; + struct alloc_tag_file_iterator *iter; + + iter = kzalloc(sizeof(*iter), GFP_KERNEL); + if (!iter) + return -ENOMEM; + + codetag_lock_module_list(cttype, true); + iter->ct_iter = codetag_get_ct_iter(cttype); + codetag_lock_module_list(cttype, false); + seq_buf_init(&iter->buf, iter->rawbuf, sizeof(iter->rawbuf)); + file->private_data = iter; + + return 0; +} + +static int alloc_tag_file_release(struct inode *inode, struct file *file) +{ + struct alloc_tag_file_iterator *iter = file->private_data; + + kfree(iter); + return 0; +} + +static void alloc_tag_to_text(struct seq_buf *out, struct codetag *ct) +{ + struct alloc_tag *tag = ct_to_alloc_tag(ct); + char buf[10]; + + string_get_size(alloc_tag_counter_read(&tag->bytes_allocated), 1, + STRING_UNITS_2, buf, sizeof(buf)); + + seq_buf_printf(out, "%8s %8lld ", buf, alloc_tag_counter_read(&tag->call_count)); + codetag_to_text(out, ct); + seq_buf_putc(out, '\n'); +} + +static ssize_t alloc_tag_file_read(struct file *file, char __user *ubuf, + size_t size, loff_t *ppos) +{ + struct alloc_tag_file_iterator *iter = file->private_data; + struct user_buf buf = { .buf = ubuf, .size = size }; + struct codetag *ct; + int err = 0; + + codetag_lock_module_list(iter->ct_iter.cttype, true); + while (1) { + err = flush_ubuf(&buf, &iter->buf); + if (err || !buf.size) + break; + + ct = codetag_next_ct(&iter->ct_iter); + if (!ct) + break; + + alloc_tag_to_text(&iter->buf, ct); + } + codetag_lock_module_list(iter->ct_iter.cttype, false); + + return err ? : buf.ret; +} + +static const struct file_operations alloc_tag_file_ops = { + .owner = THIS_MODULE, + .open = alloc_tag_file_open, + .release = alloc_tag_file_release, + .read = alloc_tag_file_read, +}; + +static int dbgfs_init(struct codetag_type *cttype) +{ + struct dentry *file; + + file = debugfs_create_file("alloc_tags", 0444, NULL, cttype, + &alloc_tag_file_ops); + + return IS_ERR(file) ? PTR_ERR(file) : 0; +} + +#else /* CONFIG_DEBUG_FS */ + +static int dbgfs_init(struct codetag_type *) { return 0; } + +#endif /* CONFIG_DEBUG_FS */ + +static void alloc_tag_module_unload(struct codetag_type *cttype, struct codetag_module *cmod) +{ + struct codetag_iterator iter = codetag_get_ct_iter(cttype); + struct codetag *ct; + + for (ct = codetag_next_ct(&iter); ct; ct = codetag_next_ct(&iter)) { + struct alloc_tag *tag = ct_to_alloc_tag(ct); + + __lazy_percpu_counter_exit(&tag->call_count); + __lazy_percpu_counter_exit(&tag->bytes_allocated); + } +} + +static int __init alloc_tag_init(void) +{ + struct codetag_type *cttype; + const struct codetag_type_desc desc = { + .section = "alloc_tags", + .tag_size = sizeof(struct alloc_tag), + .module_unload = alloc_tag_module_unload, + }; + + cttype = codetag_register_type(&desc); + if (IS_ERR_OR_NULL(cttype)) + return PTR_ERR(cttype); + + return dbgfs_init(cttype); +} +module_init(alloc_tag_init); diff --git a/scripts/module.lds.S b/scripts/module.lds.S index 3a3aa2354ed8..e73a8781f239 100644 --- a/scripts/module.lds.S +++ b/scripts/module.lds.S @@ -12,6 +12,8 @@ # define SANITIZER_DISCARDS #endif +#include + SECTIONS { /DISCARD/ : { *(.discard) @@ -47,6 +49,7 @@ SECTIONS { .data : { *(.data .data.[0-9a-zA-Z_]*) *(.data..L*) + CODETAG_SECTIONS() } .rodata : { @@ -62,6 +65,10 @@ SECTIONS { *(.text.__cfi_check) *(.text .text.[0-9a-zA-Z_]* .text..L.cfi*) } +#else + .data : { + CODETAG_SECTIONS() + } #endif } From patchwork Tue Aug 30 21:48:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959938 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07ECFECAAD4 for ; Tue, 30 Aug 2022 21:50:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231873AbiH3Vux (ORCPT ); Tue, 30 Aug 2022 17:50:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57388 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231858AbiH3Vud (ORCPT ); Tue, 30 Aug 2022 17:50:33 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 43195389 for ; Tue, 30 Aug 2022 14:49:45 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-33d9f6f4656so190939427b3.21 for ; Tue, 30 Aug 2022 14:49:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=6wmSIFSfHrn7NaG14r3E9vE6XZmoNzeOL4K0xpw9Sss=; b=U5dHm0ISPIOLMTxnmdJQWXA78Rd17S+5/1lv6jv1jaCvgl2EQMbJ9rsmgDNdrDNInX r6k/9wVO8+G1D9fLXaRbQ+fgsHGPVnRmhDau0Jf/al55Nd9kk7GQ20qWjTwu8vglhLE+ 3PiRhO95cWm0bgNaWlaQf68qN6Xl56a4Ea06ODeuAoBkBLBVNj2p2wKR1fVJamPhQROd m/J25psoHq4bGP+9+2UOdK5IJRKeT6PJER8/+rdWAJbYTakEyRla+xNdK6dgdf+KYmis CkIEz6UMG+Lc5WuSZw5HpKbNBdk17ykSRPGJ6mFMzPJalO+24uoHVv3eVDTzjtfYfPny H4fA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=6wmSIFSfHrn7NaG14r3E9vE6XZmoNzeOL4K0xpw9Sss=; b=FRKRJCfnb2MSUuVIsikQ1gJlIUaoRL/N2Gng9MGNsxPX7nHHRrjX1kUASTtbr2Bk2k viVYCBJR8yDita110Ig1d4k3I+53aDWaXleD4+UCDcpiuwdkzBk5p/g3azT9ghdDuBOZ iI3C2gH6LtEfRVK1nLdULITxjGFnvG18xZKoQADE831f9oXkxbdaGk8IbrzNAnYpK50R 36z3+jOD6JqqH5sf7HK75/tAe5/Z+Hjd9i62wQJb7qA4fja5IfJAdq8toy39vwEFx9Gi LWJApeA/ukStZ65FxLxxvhNFYNX2rfuO/fs/FfRLFD+Nj4eV9mP5LHSn9qjCK3CZmgv+ 7rLw== X-Gm-Message-State: ACgBeo0MZy4EnLdgdXzETAsgzoL6OAJj7/b4vfWn2OLTojVdaoBGEoQl Ig1Yutjh2Z20PFmH0ehchFExQmRYmnI= X-Google-Smtp-Source: AA6agR7ZicVg+8FFu5BCSLxIsHS/72reGV/Sx3q1E28zgHvzeDh99UkhDovyJoKjdHzkG5X9TjhbL9HgAEQ= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a81:47c4:0:b0:341:2cab:a63c with SMTP id u187-20020a8147c4000000b003412caba63cmr8994715ywa.58.1661896184744; Tue, 30 Aug 2022 14:49:44 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:57 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-9-surenb@google.com> Subject: [RFC PATCH 08/30] lib: introduce page allocation tagging From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Introduce CONFIG_PAGE_ALLOC_TAGGING which provides helper functions to easily instrument page allocators and adds a page_ext field to store a pointer to the allocation tag associated with the code that allocated the page. Signed-off-by: Suren Baghdasaryan Co-developed-by: Kent Overstreet Signed-off-by: Kent Overstreet --- include/linux/pgalloc_tag.h | 28 ++++++++++++++++++++++++++++ lib/Kconfig.debug | 11 +++++++++++ lib/Makefile | 1 + lib/pgalloc_tag.c | 22 ++++++++++++++++++++++ mm/page_ext.c | 6 ++++++ 5 files changed, 68 insertions(+) create mode 100644 include/linux/pgalloc_tag.h create mode 100644 lib/pgalloc_tag.c diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h new file mode 100644 index 000000000000..f525abfe51d4 --- /dev/null +++ b/include/linux/pgalloc_tag.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * page allocation tagging + */ +#ifndef _LINUX_PGALLOC_TAG_H +#define _LINUX_PGALLOC_TAG_H + +#include +#include + +extern struct page_ext_operations page_alloc_tagging_ops; +struct page_ext *lookup_page_ext(const struct page *page); + +static inline union codetag_ref *get_page_tag_ref(struct page *page) +{ + struct page_ext *page_ext = lookup_page_ext(page); + + return page_ext ? (void *)page_ext + page_alloc_tagging_ops.offset + : NULL; +} + +static inline void pgalloc_tag_dec(struct page *page, unsigned int order) +{ + if (page) + alloc_tag_sub(get_page_tag_ref(page), PAGE_SIZE << order); +} + +#endif /* _LINUX_PGALLOC_TAG_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 795bf6993f8a..6686648843b3 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -978,6 +978,17 @@ config ALLOC_TAGGING select CODE_TAGGING select LAZY_PERCPU_COUNTER +config PAGE_ALLOC_TAGGING + bool "Enable page allocation tagging" + default n + select ALLOC_TAGGING + select PAGE_EXTENSION + help + Instrument page allocators to track allocation source code and + collect statistics on the number of allocations and their total size + initiated at that code location. The mechanism can be used to track + memory leaks with a low performance impact. + source "lib/Kconfig.kasan" source "lib/Kconfig.kfence" diff --git a/lib/Makefile b/lib/Makefile index dc00533fc5c8..99f732156673 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -229,6 +229,7 @@ obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o obj-$(CONFIG_CODE_TAGGING) += codetag.o obj-$(CONFIG_ALLOC_TAGGING) += alloc_tag.o +obj-$(CONFIG_PAGE_ALLOC_TAGGING) += pgalloc_tag.o lib-$(CONFIG_GENERIC_BUG) += bug.o diff --git a/lib/pgalloc_tag.c b/lib/pgalloc_tag.c new file mode 100644 index 000000000000..7d97372ca0df --- /dev/null +++ b/lib/pgalloc_tag.c @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include + +static __init bool need_page_alloc_tagging(void) +{ + return true; +} + +static __init void init_page_alloc_tagging(void) +{ +} + +struct page_ext_operations page_alloc_tagging_ops = { + .size = sizeof(union codetag_ref), + .need = need_page_alloc_tagging, + .init = init_page_alloc_tagging, +}; +EXPORT_SYMBOL(page_alloc_tagging_ops); + diff --git a/mm/page_ext.c b/mm/page_ext.c index 3dc715d7ac29..a22f514ff4da 100644 --- a/mm/page_ext.c +++ b/mm/page_ext.c @@ -9,6 +9,7 @@ #include #include #include +#include /* * struct page extension @@ -76,6 +77,9 @@ static struct page_ext_operations *page_ext_ops[] __initdata = { #if defined(CONFIG_PAGE_IDLE_FLAG) && !defined(CONFIG_64BIT) &page_idle_ops, #endif +#ifdef CONFIG_PAGE_ALLOC_TAGGING + &page_alloc_tagging_ops, +#endif #ifdef CONFIG_PAGE_TABLE_CHECK &page_table_check_ops, #endif @@ -152,6 +156,7 @@ struct page_ext *lookup_page_ext(const struct page *page) MAX_ORDER_NR_PAGES); return get_entry(base, index); } +EXPORT_SYMBOL(lookup_page_ext); static int __init alloc_node_page_ext(int nid) { @@ -221,6 +226,7 @@ struct page_ext *lookup_page_ext(const struct page *page) return NULL; return get_entry(section->page_ext, pfn); } +EXPORT_SYMBOL(lookup_page_ext); static void *__meminit alloc_page_ext(size_t size, int nid) { From patchwork Tue Aug 30 21:48:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959939 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BACE9ECAAD4 for ; Tue, 30 Aug 2022 21:51:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231964AbiH3VvI (ORCPT ); Tue, 30 Aug 2022 17:51:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60160 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231961AbiH3Vue (ORCPT ); Tue, 30 Aug 2022 17:50:34 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C95FFEB1 for ; Tue, 30 Aug 2022 14:49:48 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id v5-20020a2583c5000000b006964324be8cso714413ybm.14 for ; Tue, 30 Aug 2022 14:49:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=p8/y/nLBQm6IGi3iJX5wV5NMUkycPfOY4OR1rTeRfxQ=; b=eh/OQULFmIACthv18cWUl4oX11okytVSai6q8VzR+NAQKjUe//Iyb+nFKGGFut6bnz U9vNve6H/h3V8KtV1wUMGWZN/R3y/KwtIJWlz0MYZTJiIV/PhXncLGaXcxmIWasd49k4 MzAfZ3MmPE/GrPk0xGglkpLQ6vp+JNN34VaaKbWMkI6l0blrObop9acuYIFl58Z8BCzL 5fC4JT76yIwf+mLDK1SPTy1QXCq8FwZe3KmkNaoWIFZ0MNlEC2nHkf9R6kzo++VpnZeN 5Whn6DE0IEl1BMPRxgM5XkxVMj4js15JxMerWLMdHXpHMsiFoGiwHoKTk4A6B15t/TFb q+Gg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=p8/y/nLBQm6IGi3iJX5wV5NMUkycPfOY4OR1rTeRfxQ=; b=o3qQEtIcgBh7X404TBWXh8PfrqKWOjNvdLTTM0yNdnyqR1/hfkFfK721Y+TUekNd/B 1NvT6IOVICcrBWaDxC0rb/k5FTvOXob47oBIpOUF6mjHRSo6doniaUqCbL9OoJHKan9V DVz8f/uNA4n35FFjfEJpqGX/6sfrNjT+vkNweCTpTUZI7BgvAhso5xjp7nEOzycbRI+z 72LD1u+OsecmhApJEBKUPK2M509UCfGCNOS6Ucirxpftb2lE9AqvPyvjGfRbekuOiZgr DgacgVJlewy3BFSagVjZjlwhTjOXRpQ4yg1Gla5ed49uP1pLCVU01CnGxrM+nkoUAOvK ihCw== X-Gm-Message-State: ACgBeo20fLg9wi5TsxtH7GYOMCL+lHsNwNRQMfdU1nrVd03u02TBxZ0i BSBtOrk+nJL5v0UBC8M86jShUUapOzM= X-Google-Smtp-Source: AA6agR5jbCGe1MArd8G7FNCgOj77g3iLznXqPzYSPJ0w7ptPLXk4tjPHYPeR1rNV3a/cmH1L2NtV8LKD3is= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:4d56:0:b0:69c:3d80:bb51 with SMTP id a83-20020a254d56000000b0069c3d80bb51mr6383765ybb.124.1661896187249; Tue, 30 Aug 2022 14:49:47 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:58 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-10-surenb@google.com> Subject: [RFC PATCH 09/30] change alloc_pages name in dma_map_ops to avoid name conflicts From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org After redefining alloc_pages, all uses of that name are being replaced. Change the conflicting names to prevent preprocessor from replacing them when it's not intended. Signed-off-by: Suren Baghdasaryan --- arch/x86/kernel/amd_gart_64.c | 2 +- drivers/iommu/dma-iommu.c | 2 +- drivers/xen/grant-dma-ops.c | 2 +- drivers/xen/swiotlb-xen.c | 2 +- include/linux/dma-map-ops.h | 2 +- kernel/dma/mapping.c | 4 ++-- 6 files changed, 7 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c index 194d54eed537..5e83a387bfef 100644 --- a/arch/x86/kernel/amd_gart_64.c +++ b/arch/x86/kernel/amd_gart_64.c @@ -676,7 +676,7 @@ static const struct dma_map_ops gart_dma_ops = { .get_sgtable = dma_common_get_sgtable, .dma_supported = dma_direct_supported, .get_required_mask = dma_direct_get_required_mask, - .alloc_pages = dma_direct_alloc_pages, + .alloc_pages_op = dma_direct_alloc_pages, .free_pages = dma_direct_free_pages, }; diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 17dd683b2fce..58b4878ef930 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -1547,7 +1547,7 @@ static const struct dma_map_ops iommu_dma_ops = { .flags = DMA_F_PCI_P2PDMA_SUPPORTED, .alloc = iommu_dma_alloc, .free = iommu_dma_free, - .alloc_pages = dma_common_alloc_pages, + .alloc_pages_op = dma_common_alloc_pages, .free_pages = dma_common_free_pages, .alloc_noncontiguous = iommu_dma_alloc_noncontiguous, .free_noncontiguous = iommu_dma_free_noncontiguous, diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c index 8973fc1e9ccc..0e26d066036e 100644 --- a/drivers/xen/grant-dma-ops.c +++ b/drivers/xen/grant-dma-ops.c @@ -262,7 +262,7 @@ static int xen_grant_dma_supported(struct device *dev, u64 mask) static const struct dma_map_ops xen_grant_dma_ops = { .alloc = xen_grant_dma_alloc, .free = xen_grant_dma_free, - .alloc_pages = xen_grant_dma_alloc_pages, + .alloc_pages_op = xen_grant_dma_alloc_pages, .free_pages = xen_grant_dma_free_pages, .mmap = dma_common_mmap, .get_sgtable = dma_common_get_sgtable, diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 67aa74d20162..5ab2616153f0 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -403,6 +403,6 @@ const struct dma_map_ops xen_swiotlb_dma_ops = { .dma_supported = xen_swiotlb_dma_supported, .mmap = dma_common_mmap, .get_sgtable = dma_common_get_sgtable, - .alloc_pages = dma_common_alloc_pages, + .alloc_pages_op = dma_common_alloc_pages, .free_pages = dma_common_free_pages, }; diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h index d678afeb8a13..e8e2d210ba68 100644 --- a/include/linux/dma-map-ops.h +++ b/include/linux/dma-map-ops.h @@ -27,7 +27,7 @@ struct dma_map_ops { unsigned long attrs); void (*free)(struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle, unsigned long attrs); - struct page *(*alloc_pages)(struct device *dev, size_t size, + struct page *(*alloc_pages_op)(struct device *dev, size_t size, dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp); void (*free_pages)(struct device *dev, size_t size, struct page *vaddr, diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c index 49cbf3e33de7..80a2bfeed8d0 100644 --- a/kernel/dma/mapping.c +++ b/kernel/dma/mapping.c @@ -552,9 +552,9 @@ static struct page *__dma_alloc_pages(struct device *dev, size_t size, size = PAGE_ALIGN(size); if (dma_alloc_direct(dev, ops)) return dma_direct_alloc_pages(dev, size, dma_handle, dir, gfp); - if (!ops->alloc_pages) + if (!ops->alloc_pages_op) return NULL; - return ops->alloc_pages(dev, size, dma_handle, dir, gfp); + return ops->alloc_pages_op(dev, size, dma_handle, dir, gfp); } struct page *dma_alloc_pages(struct device *dev, size_t size, From patchwork Tue Aug 30 21:48:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959940 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B776ECAAD5 for ; Tue, 30 Aug 2022 21:51:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232077AbiH3Vv2 (ORCPT ); Tue, 30 Aug 2022 17:51:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231843AbiH3Vus (ORCPT ); Tue, 30 Aug 2022 17:50:48 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 442165FA1 for ; Tue, 30 Aug 2022 14:49:51 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id o144-20020a25d796000000b0069b523a4234so716961ybg.17 for ; Tue, 30 Aug 2022 14:49:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=hT6Mz/zinVpRQBPKe7V7vUNm2EL+nNA0Rd6lje133es=; b=dfpyzvwd4Cc++xCQWq35JB0wJm+WrSSoxMENLy9aDfaWxkxM1OaMmDnm/vqeqmhqwU gZ74ppSfTAgF7SE3SqSoTv2aIM58r+Pz/1W9uwrKdGiDv7ewg2BuxbdnPUGh3CuiOCkA 14w+GGgADDYFbMYHNlMxcXneiC4QjBwI/pjC3GHz1ttLv+1YSdBbrybMdEtkcC5CZ2yG 0v6U9Ssq49CU9MEssaTVF8mmd1yQf+A9weRhrB49GBJc7iyJ1ibIQDHVBUxMDTeBzNZT 89GZsPPyB1nT7HTWg+BylmuFWY1bglVPePrUwBMLPZc7q+nvTNhi0s23dmExUdbXdR5Y Cwww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=hT6Mz/zinVpRQBPKe7V7vUNm2EL+nNA0Rd6lje133es=; b=4aHhTUaxO83Mq4sLZTV+dc2TBLKcAwudgL/t7m9wGwFC1U9lr46im7xB6PUFihuKuo AGAxWt/apBSvzZFGdZ8c7H+bNxKsi/QH+CJ3XeJZ0oikhiAFaK61ISNXdRU3gRQWgwHg hfeWNeTEwmdoFp/ZgGKKBcQFOZDi6fj/L0U5F2MzwoLQ0nJlTpedH+QZWvaOZudjG+wO fcfuc4iZTEQQ+cSSwdLmkSma9+IKlspBho6NgPavQ4OyL9YsyxR9PZstffZhIbVoWT3/ IW5zpmCsVHk5kRRZlhUR2apIGGwfPD19HVjNtuMalSL8rACvLUzqXnpTWmFbWYk9bGjX xT8w== X-Gm-Message-State: ACgBeo0B+JrlN9lMnU1WOhpqw8bnK6JfYzlCmhMNwSzv2OvS/hj9AJKG J5G3KASsQabsLGmbKzO78007cFWI8do= X-Google-Smtp-Source: AA6agR6wiYlgc1y+Ddz6GZjAkBSlMmUIZ6hPk3nzcjnnuhPA77BDamNwQUPuwNWjTzLGCV8DDV21TdlLdak= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:cf0e:0:b0:696:42f1:3889 with SMTP id f14-20020a25cf0e000000b0069642f13889mr13000431ybg.175.1661896189937; Tue, 30 Aug 2022 14:49:49 -0700 (PDT) Date: Tue, 30 Aug 2022 14:48:59 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-11-surenb@google.com> Subject: [RFC PATCH 10/30] mm: enable page allocation tagging for __get_free_pages and alloc_pages From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Redefine alloc_pages, __get_free_pages to record allocations done by these functions. Instrument deallocation hooks to record object freeing. Signed-off-by: Suren Baghdasaryan --- include/linux/gfp.h | 10 +++++++--- include/linux/page_ext.h | 3 ++- include/linux/pgalloc_tag.h | 35 +++++++++++++++++++++++++++++++++++ mm/mempolicy.c | 4 ++-- mm/page_alloc.c | 13 ++++++++++--- 5 files changed, 56 insertions(+), 9 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index f314be58fa77..5cb950a49d40 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -6,6 +6,7 @@ #include #include +#include struct vm_area_struct; @@ -267,12 +268,12 @@ static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask, } #ifdef CONFIG_NUMA -struct page *alloc_pages(gfp_t gfp, unsigned int order); +struct page *_alloc_pages(gfp_t gfp, unsigned int order); struct folio *folio_alloc(gfp_t gfp, unsigned order); struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma, unsigned long addr, bool hugepage); #else -static inline struct page *alloc_pages(gfp_t gfp_mask, unsigned int order) +static inline struct page *_alloc_pages(gfp_t gfp_mask, unsigned int order) { return alloc_pages_node(numa_node_id(), gfp_mask, order); } @@ -283,6 +284,7 @@ static inline struct folio *folio_alloc(gfp_t gfp, unsigned int order) #define vma_alloc_folio(gfp, order, vma, addr, hugepage) \ folio_alloc(gfp, order) #endif +#define alloc_pages(gfp, order) pgtag_alloc_pages(gfp, order) #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0) static inline struct page *alloc_page_vma(gfp_t gfp, struct vm_area_struct *vma, unsigned long addr) @@ -292,7 +294,9 @@ static inline struct page *alloc_page_vma(gfp_t gfp, return &folio->page; } -extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order); +extern unsigned long _get_free_pages(gfp_t gfp_mask, unsigned int order, + struct page **ppage); +#define __get_free_pages(gfp_mask, order) pgtag_get_free_pages(gfp_mask, order) extern unsigned long get_zeroed_page(gfp_t gfp_mask); void *alloc_pages_exact(size_t size, gfp_t gfp_mask) __alloc_size(1); diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h index fabb2e1e087f..b26077110fb3 100644 --- a/include/linux/page_ext.h +++ b/include/linux/page_ext.h @@ -4,7 +4,6 @@ #include #include -#include struct pglist_data; struct page_ext_operations { @@ -14,6 +13,8 @@ struct page_ext_operations { void (*init)(void); }; +#include + #ifdef CONFIG_PAGE_EXTENSION enum page_ext_flags { diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h index f525abfe51d4..154ea7436fec 100644 --- a/include/linux/pgalloc_tag.h +++ b/include/linux/pgalloc_tag.h @@ -5,6 +5,8 @@ #ifndef _LINUX_PGALLOC_TAG_H #define _LINUX_PGALLOC_TAG_H +#ifdef CONFIG_PAGE_ALLOC_TAGGING + #include #include @@ -25,4 +27,37 @@ static inline void pgalloc_tag_dec(struct page *page, unsigned int order) alloc_tag_sub(get_page_tag_ref(page), PAGE_SIZE << order); } +/* + * Redefinitions of the common page allocators/destructors + */ +#define pgtag_alloc_pages(gfp, order) \ +({ \ + struct page *_page = _alloc_pages((gfp), (order)); \ + \ + if (_page) \ + alloc_tag_add(get_page_tag_ref(_page), PAGE_SIZE << (order));\ + _page; \ +}) + +#define pgtag_get_free_pages(gfp_mask, order) \ +({ \ + struct page *_page; \ + unsigned long _res = _get_free_pages((gfp_mask), (order), &_page);\ + \ + if (_res) \ + alloc_tag_add(get_page_tag_ref(_page), PAGE_SIZE << (order));\ + _res; \ +}) + +#else /* CONFIG_PAGE_ALLOC_TAGGING */ + +#define pgtag_alloc_pages(gfp, order) _alloc_pages(gfp, order) + +#define pgtag_get_free_pages(gfp_mask, order) \ + _get_free_pages((gfp_mask), (order), NULL) + +#define pgalloc_tag_dec(__page, __size) do {} while (0) + +#endif /* CONFIG_PAGE_ALLOC_TAGGING */ + #endif /* _LINUX_PGALLOC_TAG_H */ diff --git a/mm/mempolicy.c b/mm/mempolicy.c index b73d3248d976..f7e6d9564a49 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -2249,7 +2249,7 @@ EXPORT_SYMBOL(vma_alloc_folio); * flags are used. * Return: The page on success or NULL if allocation fails. */ -struct page *alloc_pages(gfp_t gfp, unsigned order) +struct page *_alloc_pages(gfp_t gfp, unsigned int order) { struct mempolicy *pol = &default_policy; struct page *page; @@ -2273,7 +2273,7 @@ struct page *alloc_pages(gfp_t gfp, unsigned order) return page; } -EXPORT_SYMBOL(alloc_pages); +EXPORT_SYMBOL(_alloc_pages); struct folio *folio_alloc(gfp_t gfp, unsigned order) { diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e5486d47406e..165daba19e2a 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -763,6 +763,7 @@ static inline bool pcp_allowed_order(unsigned int order) static inline void free_the_page(struct page *page, unsigned int order) { + if (pcp_allowed_order(order)) /* Via pcp? */ free_unref_page(page, order); else @@ -1120,6 +1121,8 @@ static inline void __free_one_page(struct page *page, VM_BUG_ON_PAGE(pfn & ((1 << order) - 1), page); VM_BUG_ON_PAGE(bad_range(zone, page), page); + pgalloc_tag_dec(page, order); + while (order < MAX_ORDER - 1) { if (compaction_capture(capc, page, order, migratetype)) { __mod_zone_freepage_state(zone, -(1 << order), @@ -3440,6 +3443,7 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp, int pindex; bool free_high; + pgalloc_tag_dec(page, order); __count_vm_event(PGFREE); pindex = order_to_pindex(migratetype, order); list_add(&page->pcp_list, &pcp->lists[pindex]); @@ -5557,16 +5561,19 @@ EXPORT_SYMBOL(__folio_alloc); * address cannot represent highmem pages. Use alloc_pages and then kmap if * you need to access high mem. */ -unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order) +unsigned long _get_free_pages(gfp_t gfp_mask, unsigned int order, + struct page **ppage) { struct page *page; - page = alloc_pages(gfp_mask & ~__GFP_HIGHMEM, order); + page = _alloc_pages(gfp_mask & ~__GFP_HIGHMEM, order); + if (ppage) + *ppage = page; if (!page) return 0; return (unsigned long) page_address(page); } -EXPORT_SYMBOL(__get_free_pages); +EXPORT_SYMBOL(_get_free_pages); unsigned long get_zeroed_page(gfp_t gfp_mask) { From patchwork Tue Aug 30 21:49:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959941 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D1D7ECAAD4 for ; Tue, 30 Aug 2022 21:51:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231922AbiH3Vva (ORCPT ); Tue, 30 Aug 2022 17:51:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60924 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232008AbiH3Vut (ORCPT ); Tue, 30 Aug 2022 17:50:49 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85D81CE14 for ; Tue, 30 Aug 2022 14:49:54 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-3328a211611so187658337b3.5 for ; Tue, 30 Aug 2022 14:49:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=QL3vDW0G0EAkws93GOiItwEQTSZQKKs2lme2XWPOm88=; b=q6gqHWNQeXDhAVOtfKS2jwuMe1Ivg0lKJX/fMDL1vUMhAassB9mHwvnpQh1WMVGm+o wuVsa+cCW5oNsObXqY67nrhecsa30nwljncbAEXhpniEWz54/04oCNvdTSDux54cMFAE 4nhyqFnO9NYfs/n3dbSM7vYghvS7v91FyklTtChN8Epl4LhkkowH7ng1Yu5Rlr+D4oBe saaovWlzUEOrepTUgFJ4YjWeFoI8FhkeHo7862NM79a9vsK0GODb/N8fXYZG0KDHFyUH lOmNzvZIwEM4nHvKASmMHhgHoOrOUGgI5MKkn55Hfv++9R4FcwKbaSCkvPthI2sWTbpi N5LQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=QL3vDW0G0EAkws93GOiItwEQTSZQKKs2lme2XWPOm88=; b=25jC9TURZQ+zoDgOFPL9LUrTcmP5kXEMl/eC43zHEQT/N1cx0aRwrAIvQ0zspoT7vn rTKHWic6qiRGpx2acpVlLM1bAa3MPQOkp6klTecGcOp23G4CzsZjX3lY2EeqE8HECd3G XXq6l6u7uf8n1r3bAHby1LXr18MOTJnvrPLd8z1/6mUBVAKsiLruGYK/drl7ODQN20bV eAqEIu25M3UGjG2TeOaH0M6ZgLFWktD2Rp2oZvaPu6LyibQqLH7O0s6GDlCYCSjXlb/u NbnKFIF4I+dFbVMwfTBcqoqv5U7SYlLgXEcdz0u/5C2tQ8oYHXHulL8ISEgBsemOjqZd 73dQ== X-Gm-Message-State: ACgBeo2T+oeTGOl4kEztb53oCPWPMMr3xB3Deq3lXxmYP08WHjdH0gW+ ORzQVmX2mFELAPFZ2oCDChtTJG2pVDs= X-Google-Smtp-Source: AA6agR7DyMkyo2dvlOLM5oLxyv9VlOWNhVJhl4Sbj9IzVf2sqsZYQJSX/JL7zRsUtBJ8DZwLwhom1mMOXNE= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:8402:0:b0:696:42c8:c561 with SMTP id u2-20020a258402000000b0069642c8c561mr13648632ybk.435.1661896192809; Tue, 30 Aug 2022 14:49:52 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:00 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-12-surenb@google.com> Subject: [RFC PATCH 11/30] mm: introduce slabobj_ext to support slab object extensions From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Currently slab pages can store only vectors of obj_cgroup pointers in page->memcg_data. Introduce slabobj_ext structure to allow more data to be stored for each slab object. Wraps obj_cgroup into slabobj_ext to support current functionality while allowing to extend slabobj_ext in the future. Note: ideally the config dependency should be turned the other way around: MEMCG should depend on SLAB_OBJ_EXT and {page|slab|folio}.memcg_data would be renamed to something like {page|slab|folio}.objext_data. However doing this in RFC would introduce considerable churn unrelated to the overall idea, so avoiding this until v1. Signed-off-by: Suren Baghdasaryan --- include/linux/memcontrol.h | 18 ++++-- init/Kconfig | 5 ++ mm/kfence/core.c | 2 +- mm/memcontrol.c | 60 ++++++++++--------- mm/page_owner.c | 2 +- mm/slab.h | 119 +++++++++++++++++++++++++------------ 6 files changed, 131 insertions(+), 75 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 6257867fbf95..315399f77173 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -227,6 +227,14 @@ struct obj_cgroup { }; }; +/* + * Extended information for slab objects stored as an array in page->memcg_data + * if MEMCG_DATA_OBJEXTS is set. + */ +struct slabobj_ext { + struct obj_cgroup *objcg; +} __aligned(8); + /* * The memory controller data structure. The memory controller controls both * page cache and RSS per cgroup. We would eventually like to provide @@ -363,7 +371,7 @@ extern struct mem_cgroup *root_mem_cgroup; enum page_memcg_data_flags { /* page->memcg_data is a pointer to an objcgs vector */ - MEMCG_DATA_OBJCGS = (1UL << 0), + MEMCG_DATA_OBJEXTS = (1UL << 0), /* page has been accounted as a non-slab kernel page */ MEMCG_DATA_KMEM = (1UL << 1), /* the next bit after the last actual flag */ @@ -401,7 +409,7 @@ static inline struct mem_cgroup *__folio_memcg(struct folio *folio) unsigned long memcg_data = folio->memcg_data; VM_BUG_ON_FOLIO(folio_test_slab(folio), folio); - VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJCGS, folio); + VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio); VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_KMEM, folio); return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); @@ -422,7 +430,7 @@ static inline struct obj_cgroup *__folio_objcg(struct folio *folio) unsigned long memcg_data = folio->memcg_data; VM_BUG_ON_FOLIO(folio_test_slab(folio), folio); - VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJCGS, folio); + VM_BUG_ON_FOLIO(memcg_data & MEMCG_DATA_OBJEXTS, folio); VM_BUG_ON_FOLIO(!(memcg_data & MEMCG_DATA_KMEM), folio); return (struct obj_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); @@ -517,7 +525,7 @@ static inline struct mem_cgroup *page_memcg_check(struct page *page) */ unsigned long memcg_data = READ_ONCE(page->memcg_data); - if (memcg_data & MEMCG_DATA_OBJCGS) + if (memcg_data & MEMCG_DATA_OBJEXTS) return NULL; if (memcg_data & MEMCG_DATA_KMEM) { @@ -556,7 +564,7 @@ static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *ob static inline bool folio_memcg_kmem(struct folio *folio) { VM_BUG_ON_PGFLAGS(PageTail(&folio->page), &folio->page); - VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJCGS, folio); + VM_BUG_ON_FOLIO(folio->memcg_data & MEMCG_DATA_OBJEXTS, folio); return folio->memcg_data & MEMCG_DATA_KMEM; } diff --git a/init/Kconfig b/init/Kconfig index 532362fcfe31..82396d7a2717 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -958,6 +958,10 @@ config MEMCG help Provides control over the memory footprint of tasks in a cgroup. +config SLAB_OBJ_EXT + bool + depends on MEMCG + config MEMCG_SWAP bool depends on MEMCG && SWAP @@ -966,6 +970,7 @@ config MEMCG_SWAP config MEMCG_KMEM bool depends on MEMCG && !SLOB + select SLAB_OBJ_EXT default y config BLK_CGROUP diff --git a/mm/kfence/core.c b/mm/kfence/core.c index c252081b11df..c0958e4a32e2 100644 --- a/mm/kfence/core.c +++ b/mm/kfence/core.c @@ -569,7 +569,7 @@ static unsigned long kfence_init_pool(void) __folio_set_slab(slab_folio(slab)); #ifdef CONFIG_MEMCG slab->memcg_data = (unsigned long)&kfence_metadata[i / 2 - 1].objcg | - MEMCG_DATA_OBJCGS; + MEMCG_DATA_OBJEXTS; #endif } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b69979c9ced5..3f407ef2f3f1 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2793,7 +2793,7 @@ static void commit_charge(struct folio *folio, struct mem_cgroup *memcg) folio->memcg_data = (unsigned long)memcg; } -#ifdef CONFIG_MEMCG_KMEM +#ifdef CONFIG_SLAB_OBJ_EXT /* * The allocated objcg pointers array is not accounted directly. * Moreover, it should not come from DMA buffer and is not readily @@ -2801,38 +2801,20 @@ static void commit_charge(struct folio *folio, struct mem_cgroup *memcg) */ #define OBJCGS_CLEAR_MASK (__GFP_DMA | __GFP_RECLAIMABLE | __GFP_ACCOUNT) -/* - * mod_objcg_mlstate() may be called with irq enabled, so - * mod_memcg_lruvec_state() should be used. - */ -static inline void mod_objcg_mlstate(struct obj_cgroup *objcg, - struct pglist_data *pgdat, - enum node_stat_item idx, int nr) -{ - struct mem_cgroup *memcg; - struct lruvec *lruvec; - - rcu_read_lock(); - memcg = obj_cgroup_memcg(objcg); - lruvec = mem_cgroup_lruvec(memcg, pgdat); - mod_memcg_lruvec_state(lruvec, idx, nr); - rcu_read_unlock(); -} - -int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s, - gfp_t gfp, bool new_slab) +int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s, + gfp_t gfp, bool new_slab) { unsigned int objects = objs_per_slab(s, slab); unsigned long memcg_data; void *vec; gfp &= ~OBJCGS_CLEAR_MASK; - vec = kcalloc_node(objects, sizeof(struct obj_cgroup *), gfp, + vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp, slab_nid(slab)); if (!vec) return -ENOMEM; - memcg_data = (unsigned long) vec | MEMCG_DATA_OBJCGS; + memcg_data = (unsigned long) vec | MEMCG_DATA_OBJEXTS; if (new_slab) { /* * If the slab is brand new and nobody can yet access its @@ -2843,7 +2825,7 @@ int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s, } else if (cmpxchg(&slab->memcg_data, 0, memcg_data)) { /* * If the slab is already in use, somebody can allocate and - * assign obj_cgroups in parallel. In this case the existing + * assign slabobj_exts in parallel. In this case the existing * objcg vector should be reused. */ kfree(vec); @@ -2853,6 +2835,26 @@ int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s, kmemleak_not_leak(vec); return 0; } +#endif /* CONFIG_SLAB_OBJ_EXT */ + +#ifdef CONFIG_MEMCG_KMEM +/* + * mod_objcg_mlstate() may be called with irq enabled, so + * mod_memcg_lruvec_state() should be used. + */ +static inline void mod_objcg_mlstate(struct obj_cgroup *objcg, + struct pglist_data *pgdat, + enum node_stat_item idx, int nr) +{ + struct mem_cgroup *memcg; + struct lruvec *lruvec; + + rcu_read_lock(); + memcg = obj_cgroup_memcg(objcg); + lruvec = mem_cgroup_lruvec(memcg, pgdat); + mod_memcg_lruvec_state(lruvec, idx, nr); + rcu_read_unlock(); +} static __always_inline struct mem_cgroup *mem_cgroup_from_obj_folio(struct folio *folio, void *p) @@ -2863,18 +2865,18 @@ struct mem_cgroup *mem_cgroup_from_obj_folio(struct folio *folio, void *p) * slab->memcg_data. */ if (folio_test_slab(folio)) { - struct obj_cgroup **objcgs; + struct slabobj_ext *obj_exts; struct slab *slab; unsigned int off; slab = folio_slab(folio); - objcgs = slab_objcgs(slab); - if (!objcgs) + obj_exts = slab_obj_exts(slab); + if (!obj_exts) return NULL; off = obj_to_index(slab->slab_cache, slab, p); - if (objcgs[off]) - return obj_cgroup_memcg(objcgs[off]); + if (obj_exts[off].objcg) + return obj_cgroup_memcg(obj_exts[off].objcg); return NULL; } diff --git a/mm/page_owner.c b/mm/page_owner.c index e4c6f3f1695b..fd4af1ad34b8 100644 --- a/mm/page_owner.c +++ b/mm/page_owner.c @@ -353,7 +353,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret, if (!memcg_data) goto out_unlock; - if (memcg_data & MEMCG_DATA_OBJCGS) + if (memcg_data & MEMCG_DATA_OBJEXTS) ret += scnprintf(kbuf + ret, count - ret, "Slab cache page\n"); diff --git a/mm/slab.h b/mm/slab.h index 4ec82bec15ec..c767ce3f0fe2 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -422,36 +422,94 @@ static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t fla return false; } +#ifdef CONFIG_SLAB_OBJ_EXT + +static inline bool is_kmem_only_obj_ext(void) +{ #ifdef CONFIG_MEMCG_KMEM + return sizeof(struct slabobj_ext) == sizeof(struct obj_cgroup *); +#else + return false; +#endif +} + /* - * slab_objcgs - get the object cgroups vector associated with a slab + * slab_obj_exts - get the pointer to the slab object extension vector + * associated with a slab. * @slab: a pointer to the slab struct * - * Returns a pointer to the object cgroups vector associated with the slab, + * Returns a pointer to the object extension vector associated with the slab, * or NULL if no such vector has been associated yet. */ -static inline struct obj_cgroup **slab_objcgs(struct slab *slab) +static inline struct slabobj_ext *slab_obj_exts(struct slab *slab) { unsigned long memcg_data = READ_ONCE(slab->memcg_data); - VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJCGS), + VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJEXTS), slab_page(slab)); VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, slab_page(slab)); - return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); + return (struct slabobj_ext *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); } -int memcg_alloc_slab_cgroups(struct slab *slab, struct kmem_cache *s, - gfp_t gfp, bool new_slab); -void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, - enum node_stat_item idx, int nr); +int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s, + gfp_t gfp, bool new_slab); -static inline void memcg_free_slab_cgroups(struct slab *slab) +static inline void free_slab_obj_exts(struct slab *slab) { - kfree(slab_objcgs(slab)); + struct slabobj_ext *obj_exts; + + if (!memcg_kmem_enabled() && is_kmem_only_obj_ext()) + return; + + obj_exts = slab_obj_exts(slab); + kfree(obj_exts); slab->memcg_data = 0; } +static inline void prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, void *p) +{ + struct slab *slab; + + /* If kmem is the only extension then the vector will be created conditionally */ + if (is_kmem_only_obj_ext()) + return; + + slab = virt_to_slab(p); + if (!slab_obj_exts(slab)) + WARN(alloc_slab_obj_exts(slab, s, flags, false), + "%s, %s: Failed to create slab extension vector!\n", + __func__, s->name); +} + +#else /* CONFIG_SLAB_OBJ_EXT */ + +static inline struct slabobj_ext *slab_obj_exts(struct slab *slab) +{ + return NULL; +} + +static inline int alloc_slab_obj_exts(struct slab *slab, + struct kmem_cache *s, gfp_t gfp, + bool new_slab) +{ + return 0; +} + +static inline void free_slab_obj_exts(struct slab *slab) +{ +} + +static inline void prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, void *p) +{ +} + +#endif /* CONFIG_SLAB_OBJ_EXT */ + +#ifdef CONFIG_MEMCG_KMEM +void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, + enum node_stat_item idx, int nr); + static inline size_t obj_full_size(struct kmem_cache *s) { /* @@ -519,16 +577,15 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s, if (likely(p[i])) { slab = virt_to_slab(p[i]); - if (!slab_objcgs(slab) && - memcg_alloc_slab_cgroups(slab, s, flags, - false)) { + if (!slab_obj_exts(slab) && + alloc_slab_obj_exts(slab, s, flags, false)) { obj_cgroup_uncharge(objcg, obj_full_size(s)); continue; } off = obj_to_index(s, slab, p[i]); obj_cgroup_get(objcg); - slab_objcgs(slab)[off] = objcg; + slab_obj_exts(slab)[off].objcg = objcg; mod_objcg_state(objcg, slab_pgdat(slab), cache_vmstat_idx(s), obj_full_size(s)); } else { @@ -541,14 +598,14 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s, static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p, int objects) { - struct obj_cgroup **objcgs; + struct slabobj_ext *obj_exts; int i; if (!memcg_kmem_enabled()) return; - objcgs = slab_objcgs(slab); - if (!objcgs) + obj_exts = slab_obj_exts(slab); + if (!obj_exts) return; for (i = 0; i < objects; i++) { @@ -556,11 +613,11 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, unsigned int off; off = obj_to_index(s, slab, p[i]); - objcg = objcgs[off]; + objcg = obj_exts[off].objcg; if (!objcg) continue; - objcgs[off] = NULL; + obj_exts[off].objcg = NULL; obj_cgroup_uncharge(objcg, obj_full_size(s)); mod_objcg_state(objcg, slab_pgdat(slab), cache_vmstat_idx(s), -obj_full_size(s)); @@ -569,27 +626,11 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, } #else /* CONFIG_MEMCG_KMEM */ -static inline struct obj_cgroup **slab_objcgs(struct slab *slab) -{ - return NULL; -} - static inline struct mem_cgroup *memcg_from_slab_obj(void *ptr) { return NULL; } -static inline int memcg_alloc_slab_cgroups(struct slab *slab, - struct kmem_cache *s, gfp_t gfp, - bool new_slab) -{ - return 0; -} - -static inline void memcg_free_slab_cgroups(struct slab *slab) -{ -} - static inline bool memcg_slab_pre_alloc_hook(struct kmem_cache *s, struct list_lru *lru, struct obj_cgroup **objcgp, @@ -627,7 +668,7 @@ static __always_inline void account_slab(struct slab *slab, int order, struct kmem_cache *s, gfp_t gfp) { if (memcg_kmem_enabled() && (s->flags & SLAB_ACCOUNT)) - memcg_alloc_slab_cgroups(slab, s, gfp, true); + alloc_slab_obj_exts(slab, s, gfp, true); mod_node_page_state(slab_pgdat(slab), cache_vmstat_idx(s), PAGE_SIZE << order); @@ -636,8 +677,7 @@ static __always_inline void account_slab(struct slab *slab, int order, static __always_inline void unaccount_slab(struct slab *slab, int order, struct kmem_cache *s) { - if (memcg_kmem_enabled()) - memcg_free_slab_cgroups(slab); + free_slab_obj_exts(slab); mod_node_page_state(slab_pgdat(slab), cache_vmstat_idx(s), -(PAGE_SIZE << order)); @@ -729,6 +769,7 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, memset(p[i], 0, s->object_size); kmemleak_alloc_recursive(p[i], s->object_size, 1, s->flags, flags); + prepare_slab_obj_exts_hook(s, flags, p[i]); } memcg_slab_post_alloc_hook(s, objcg, flags, size, p); From patchwork Tue Aug 30 21:49:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959943 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 323EAECAAA1 for ; Tue, 30 Aug 2022 21:51:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232112AbiH3Vv4 (ORCPT ); Tue, 30 Aug 2022 17:51:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60256 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231974AbiH3VvJ (ORCPT ); Tue, 30 Aug 2022 17:51:09 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7F8311C0A for ; Tue, 30 Aug 2022 14:49:56 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id k126-20020a253d84000000b0068bb342010dso712735yba.1 for ; Tue, 30 Aug 2022 14:49:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=+dBCH/ZYeXKPl1XZC0y8jtwH0hgKGXE3JjJWq8gRM44=; b=AFaSMy8jXac7TuC+3zPxWUGJEoV6Xwt0zQp9MqDmFFlT9mWJtIdF7dkyLAZ2dlc2YA b9Gf5k0v2Zvts/DmzzZ5pyBGK4PvMyQtpd/jCDUfU7pbPfnKneDUuVfIqoh8xzC2EEXM ggO+9/ET/L3N7vFPV9TWkym8O3GXfgaPKOHQOvjujiuqNd9k0qGNIcoGl/dBIZk4zlFa QTsb0Fx654oBfwlYyCvPgpIbUnjCokKwOPL7A775wrhi+ShDtrR1D6bLCXYg9BBaYtDw bc4QILCqScR/90fRkJQtLt8mPydrXHAQRmui3zER0ZDrlUBSto4jKF/epnVV4Tt3A9sc lVzg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=+dBCH/ZYeXKPl1XZC0y8jtwH0hgKGXE3JjJWq8gRM44=; b=orhsqewMYfC+l2P4QtGdko8WkbBc+GQdDylEA1CMMwpjMkk1IdLcgUj3Aej3xmka48 wrmi1nM1htKkyuN9rnAyiYO57w+wqSPVZTPg/DsKVIpl1Y8VhWJrebXfU+IKSrHur3+T zWh/o//0NBBF0WlHqMlcCs0tvdMU2Uwxi1MnVoGqNZz7Ea7L+xS3/qzARKPbT27muIpC QgSCVE/U/oTRClXy2rZDSvVMNlQBWtPAht+jBPUBTAAh4htUZfVCnUxczcBUhFg/U5Qb VitPgG64P8Ji+PAvFLa856JjLqSaXjPEClpjUBQ5Z/980BtFMr7IhSdacWD/tJauhuPU Ymgw== X-Gm-Message-State: ACgBeo3Pa7cHPSfhcRXWbo0/j9byphJTYuFrpYIfzFPF6bxcgR2yB0xs 6mayqtV+aXeeXvcUEC2OddXw+aUtOjA= X-Google-Smtp-Source: AA6agR5IHyLOVpCYPsGCTl1WAh1PLHzSkOgAmQAkkvI/LDvSxkU9pCXJyMghFPF7B9PD/jn9WBQElFJpekI= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:94b:0:b0:68f:4e05:e8f0 with SMTP id u11-20020a25094b000000b0068f4e05e8f0mr13319593ybm.115.1661896195289; Tue, 30 Aug 2022 14:49:55 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:01 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-13-surenb@google.com> Subject: [RFC PATCH 12/30] mm: introduce __GFP_NO_OBJ_EXT flag to selectively prevent slabobj_ext creation From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Introduce __GFP_NO_OBJ_EXT flag in order to prevent recursive allocations when allocating slabobj_ext on a slab. Signed-off-by: Suren Baghdasaryan --- include/linux/gfp_types.h | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h index d88c46ca82e1..a2cba1d20b86 100644 --- a/include/linux/gfp_types.h +++ b/include/linux/gfp_types.h @@ -55,8 +55,13 @@ typedef unsigned int __bitwise gfp_t; #define ___GFP_SKIP_KASAN_UNPOISON 0 #define ___GFP_SKIP_KASAN_POISON 0 #endif +#ifdef CONFIG_SLAB_OBJ_EXT +#define ___GFP_NO_OBJ_EXT 0x8000000u +#else +#define ___GFP_NO_OBJ_EXT 0 +#endif #ifdef CONFIG_LOCKDEP -#define ___GFP_NOLOCKDEP 0x8000000u +#define ___GFP_NOLOCKDEP 0x10000000u #else #define ___GFP_NOLOCKDEP 0 #endif @@ -101,12 +106,15 @@ typedef unsigned int __bitwise gfp_t; * node with no fallbacks or placement policy enforcements. * * %__GFP_ACCOUNT causes the allocation to be accounted to kmemcg. + * + * %__GFP_NO_OBJ_EXT causes slab allocation to have no object extension. */ #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) #define __GFP_WRITE ((__force gfp_t)___GFP_WRITE) #define __GFP_HARDWALL ((__force gfp_t)___GFP_HARDWALL) #define __GFP_THISNODE ((__force gfp_t)___GFP_THISNODE) #define __GFP_ACCOUNT ((__force gfp_t)___GFP_ACCOUNT) +#define __GFP_NO_OBJ_EXT ((__force gfp_t)___GFP_NO_OBJ_EXT) /** * DOC: Watermark modifiers @@ -256,7 +264,7 @@ typedef unsigned int __bitwise gfp_t; #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP) /* Room for N __GFP_FOO bits */ -#define __GFP_BITS_SHIFT (27 + IS_ENABLED(CONFIG_LOCKDEP)) +#define __GFP_BITS_SHIFT (28 + IS_ENABLED(CONFIG_LOCKDEP)) #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1)) /** From patchwork Tue Aug 30 21:49:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959942 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D159DECAAA1 for ; Tue, 30 Aug 2022 21:51:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232097AbiH3Vvv (ORCPT ); Tue, 30 Aug 2022 17:51:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231741AbiH3VvG (ORCPT ); Tue, 30 Aug 2022 17:51:06 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED08E18B31 for ; Tue, 30 Aug 2022 14:50:00 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-336c3b72da5so187501697b3.6 for ; Tue, 30 Aug 2022 14:50:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=lEzLHp5EUALGBDR080eL/ZTXye/B6ScLofp8WoGZutw=; b=b1R/0Nzef/8svLYP03pWrgs7vVHFpmq4XPsUT6SPRcEERNTFUBKXL3gF3rV3lAxbCu zXGaLab/bgugeQVtB6fQn7BhC15GHiYy8yfVXRL4GYXQsnil0/qsZR4erD7n1daSxUT/ hbKk1xyU9Ykc1njB28pQzjTGt5yf81XdbMvnJj3jJrZZVKL0FDy5ZPsMSVZCq6qj2GNq 5xqR3jFoEpuD7VZT5PIkQzsQQ7XeVpMK6yxsDE98VmdfDUzKzdP+i7iVBsy7CZO0ECYl ds6ZVkTkDhj6OOx/cOiCWsxrgV2NyVoWxYPRsclO00W2nRWfmYoOg0iJH72Y79yCTeat 6jXg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=lEzLHp5EUALGBDR080eL/ZTXye/B6ScLofp8WoGZutw=; b=guOlqXcyEI9Dpwxnfav79KM80vN+ldz2tv/6oP1pw6j8qkqecPgIjVWxVcQ8Vqrhjs jwUauTwHQPVjMzgEU/K7mYd/bgtLkS8m++o5DxVgnD0oCr+cg29lI67/a6pn4Hr4LZxS /x9MbICrFyE6Eb9TxnX8UlXpYox1LFarAjoSNDjzVF3U+p5on4OaqOx8m3ySWlLmEf1f 0vQvrFiothGcwRKZ4t1/Zx7F7PlTXLesrk85pUfM21kvcfEwJuDkLybWRz7AgvPGU38q tHk5V8xFswn++2xqzSBjRr/+AIcTT3QUJE3Xyl7amFGnabT9Ti5WqCe49Dcsmx1v7bLt DVCA== X-Gm-Message-State: ACgBeo3sr4bPpxYV6NQVmxasWpzB7ytW8LonJXVeNwgdZvYRz9I9PUd+ N5wGkQuaCG/UCJTXcqu14GRQ6UbmRws= X-Google-Smtp-Source: AA6agR5aFDFTV93gABXVtI6sO2HuTqIfM7XjbTB1z4Gt/hzYUN7LO58jr4g2RITDfY5OEI/Zy7OiK9gpPj0= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a05:6902:89:b0:695:7ed0:d8cb with SMTP id h9-20020a056902008900b006957ed0d8cbmr13360099ybs.77.1661896198026; Tue, 30 Aug 2022 14:49:58 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:02 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-14-surenb@google.com> Subject: [RFC PATCH 13/30] mm/slab: introduce SLAB_NO_OBJ_EXT to avoid obj_ext creation From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Slab extension objects can't be allocated before slab infrastructure is initialized. Some caches, like kmem_cache and kmem_cache_node, are created before slab infrastructure is initialized. Objects from these caches can't have extension objects. Introduce SLAB_NO_OBJ_EXT slab flag to mark these caches and avoid creating extensions for objects allocated from these slabs. Signed-off-by: Suren Baghdasaryan --- include/linux/slab.h | 7 +++++++ mm/slab.c | 2 +- mm/slub.c | 5 +++-- 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index 0fefdf528e0d..55ae3ea864a4 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -124,6 +124,13 @@ #define SLAB_RECLAIM_ACCOUNT ((slab_flags_t __force)0x00020000U) #define SLAB_TEMPORARY SLAB_RECLAIM_ACCOUNT /* Objects are short-lived */ +#ifdef CONFIG_SLAB_OBJ_EXT +/* Slab created using create_boot_cache */ +#define SLAB_NO_OBJ_EXT ((slab_flags_t __force)0x20000000U) +#else +#define SLAB_NO_OBJ_EXT 0 +#endif + /* * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests. * diff --git a/mm/slab.c b/mm/slab.c index 10e96137b44f..ba97aeef7ec1 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -1233,7 +1233,7 @@ void __init kmem_cache_init(void) create_boot_cache(kmem_cache, "kmem_cache", offsetof(struct kmem_cache, node) + nr_node_ids * sizeof(struct kmem_cache_node *), - SLAB_HWCACHE_ALIGN, 0, 0); + SLAB_HWCACHE_ALIGN | SLAB_NO_OBJ_EXT, 0, 0); list_add(&kmem_cache->list, &slab_caches); slab_state = PARTIAL; diff --git a/mm/slub.c b/mm/slub.c index 862dbd9af4f5..80199d5ac7c9 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -4825,7 +4825,8 @@ void __init kmem_cache_init(void) node_set(node, slab_nodes); create_boot_cache(kmem_cache_node, "kmem_cache_node", - sizeof(struct kmem_cache_node), SLAB_HWCACHE_ALIGN, 0, 0); + sizeof(struct kmem_cache_node), + SLAB_HWCACHE_ALIGN | SLAB_NO_OBJ_EXT, 0, 0); register_hotmemory_notifier(&slab_memory_callback_nb); @@ -4835,7 +4836,7 @@ void __init kmem_cache_init(void) create_boot_cache(kmem_cache, "kmem_cache", offsetof(struct kmem_cache, node) + nr_node_ids * sizeof(struct kmem_cache_node *), - SLAB_HWCACHE_ALIGN, 0, 0); + SLAB_HWCACHE_ALIGN | SLAB_NO_OBJ_EXT, 0, 0); kmem_cache = bootstrap(&boot_kmem_cache); kmem_cache_node = bootstrap(&boot_kmem_cache_node); From patchwork Tue Aug 30 21:49:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959945 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A6E6ECAAD4 for ; Tue, 30 Aug 2022 21:52:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232066AbiH3VwU (ORCPT ); Tue, 30 Aug 2022 17:52:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58672 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231830AbiH3Vv0 (ORCPT ); Tue, 30 Aug 2022 17:51:26 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78AF51E3C3 for ; Tue, 30 Aug 2022 14:50:02 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-33d9f6f4656so190943497b3.21 for ; Tue, 30 Aug 2022 14:50:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=gY+PdoFf3bNnEB0W9Vd0woPoFdt9hhtegjhiPtTj+AE=; b=pP4c3oP+YdesowZgNPs4eE1Xp1hWxAr9Ov2IDGBfglQ7uK9BWLrmzhDJ/utRpkPDJK tLMoERW1+3/smlLhO5G46VfqVywVDMABKHAv9jnhzz9ZshpBvX9ODCL7lsQaaZS0MLOk moKmQwwsOGjNod+M4t/tjA/+LNyP1VXOc5752+/a3Um/WwaPBzNEpe5l2cE7BDo787CF /PWmu6igP2xWBoiVtEHlimVT5AWdWMQxPlHvsqYO1szpGDqcGUPThIrLg/hGXqOCKmhM XRoOlsQx66oUshjUnxZWYY0qi/lYffvprIXZMUGryr7KjlXRCtsueZ1FZIWTKefulhTk stEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=gY+PdoFf3bNnEB0W9Vd0woPoFdt9hhtegjhiPtTj+AE=; b=SLtlFbwS4bRbgyIeFjzn8aAKwjQWfT0cm9hcz9qteWmaGXAQwya0994H532DLfhrM8 Vq4yoVNF79x41fXWc2omYfb1qgg/NPy5y27kVSJB5MsSyotlr0BSbxYRN+HLNKlKnjWr N4o5nzuISEfLe9TFojhtIifUUPbanq2qGuIDmvrMLCuMVnnEopZ4AEo4G7Z/ap+k9AFo V8En5iRyrVf5zlidBhT6MQcYv3Cn0IBx98RKXDBvtcOK6CJ9vNmX4hwT51Kb7RbKnm1J ULfi8vYDDtFPjQzT4rJGEY3mAcjokTptoZ8mZ+lAkLEbtuqH73ub1ifGFcf1fc4Oa8UR zCvA== X-Gm-Message-State: ACgBeo3yGxPCB5g28adpIpFisDYVCE2ZNicTdgmfNrRQWIxqkTUWGhiX c5kBr3K8krYYirVZVC9aUaQKyqw7GTE= X-Google-Smtp-Source: AA6agR7WLdIMrAR0FWfuROzkHWzqEUxzLPyMUzXzjD/46yfJpgch8NK3VM+VyOZeNwYsSnGIQ5TrqCRxJvQ= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:ba91:0:b0:683:ebc2:7114 with SMTP id s17-20020a25ba91000000b00683ebc27114mr13866830ybg.319.1661896200808; Tue, 30 Aug 2022 14:50:00 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:03 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-15-surenb@google.com> Subject: [RFC PATCH 14/30] mm: prevent slabobj_ext allocations for slabobj_ext and kmem_cache objects From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Use __GFP_NO_OBJ_EXT to prevent recursions when allocating slabobj_ext objects. Also prevent slabobj_ext allocations for kmem_cache objects. Signed-off-by: Suren Baghdasaryan --- mm/memcontrol.c | 2 ++ mm/slab.h | 6 ++++++ 2 files changed, 8 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 3f407ef2f3f1..dabb451dc364 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2809,6 +2809,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s, void *vec; gfp &= ~OBJCGS_CLEAR_MASK; + /* Prevent recursive extension vector allocation */ + gfp |= __GFP_NO_OBJ_EXT; vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp, slab_nid(slab)); if (!vec) diff --git a/mm/slab.h b/mm/slab.h index c767ce3f0fe2..d93b22b8bbe2 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -475,6 +475,12 @@ static inline void prepare_slab_obj_exts_hook(struct kmem_cache *s, gfp_t flags, if (is_kmem_only_obj_ext()) return; + if (s->flags & SLAB_NO_OBJ_EXT) + return; + + if (flags & __GFP_NO_OBJ_EXT) + return; + slab = virt_to_slab(p); if (!slab_obj_exts(slab)) WARN(alloc_slab_obj_exts(slab, s, flags, false), From patchwork Tue Aug 30 21:49:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959944 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28ACAECAAD4 for ; Tue, 30 Aug 2022 21:51:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229658AbiH3Vv4 (ORCPT ); Tue, 30 Aug 2022 17:51:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57240 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231979AbiH3VvO (ORCPT ); Tue, 30 Aug 2022 17:51:14 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C465422B02 for ; Tue, 30 Aug 2022 14:50:04 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-33f8988daecso180743287b3.12 for ; Tue, 30 Aug 2022 14:50:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=w9gVcgQRdZitDZxjqrDoJq6Gs/0aEjUQjbVWDU06BvI=; b=Q1PzL2bThBJ4iiSWL0d7IheB5VfLNxWH1gqUohex8OE2QSiGR/fpg9gqy0yixW+Isb pxJFdeTMvRwdBwAcg7flVNtX1j9eJ9C9tS/tDg05AImXSgWHVN2GUlakKwYKuvgZhIEZ 53DiKTHKeM16TZHvnMmCYNdJyP0y2AnTT97Oi3Zv6WKFja6rmmzzL6Teq1ArHTNPI2TC NVjO3wFAVKlA4rp9aDdNwzT8IVNC+AXM7IGd6mywh1LPpaNIM2cA8K+l/DF+pItxSR+n jG8tcEuC2nhcd/7nioVG87S+Xo+Y8ZaGYAgx8HTdNWhJ4JbCh9BURQy523rvt+lhf5xu HCDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=w9gVcgQRdZitDZxjqrDoJq6Gs/0aEjUQjbVWDU06BvI=; b=du9nRbtA35hcqZJOmxy2Oi3rayjaptAjRi1C2Xgjz8B2Q0kJU7mnpbdRehVxeVOzla 1J/F+drUstWPlGlyreQW8Z/xcbx2/fD2ei7e8d5caFixVBoXgntC71aaqlccBdEUuaLi 3xxndD0JkiNQVJTcVvpNAa7qH2lnwRGGPiYV+eRmIQiBSoJxYCstqqGxFJlN2nbbnqwN p1mOkNhIni8aHx+1bi/lx7T940SmxC00gbn+dHitJAds04bvu8u071cj949L31frUpPV n0T/7IJnAK6MDaQ8p6NMRB3F2tz7FfIzwgpy9RL8ICsV0nLam/2AsY6fxLz8OmKwoecs rcJQ== X-Gm-Message-State: ACgBeo024mLa6HfQkQie4C5Q55Tqk91T6XtLMuwZ3FUKTiUKETfuT9Hc TSSxJhL3Se+YOnX2W53TZJ9Zq5+E8KU= X-Google-Smtp-Source: AA6agR4vG82TMLNZbe9b42TzeXUrvZHL0ChO+lv1LMjTJEyr7xCSEiF1DQ2YaYAniXhteJjohj+5Rf5pvb8= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:e6c6:0:b0:695:f4dc:8c4f with SMTP id d189-20020a25e6c6000000b00695f4dc8c4fmr13235724ybh.329.1661896203524; Tue, 30 Aug 2022 14:50:03 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:04 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-16-surenb@google.com> Subject: [RFC PATCH 15/30] lib: introduce slab allocation tagging From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Introduce CONFIG_SLAB_ALLOC_TAGGING which provides helper functions to easily instrument slab allocators and adds a codetag_ref field into slabobj_ext to store a pointer to the allocation tag associated with the code that allocated the slab object. Signed-off-by: Suren Baghdasaryan Co-developed-by: Kent Overstreet Signed-off-by: Kent Overstreet --- include/linux/memcontrol.h | 5 +++++ include/linux/slab.h | 25 +++++++++++++++++++++++++ include/linux/slab_def.h | 2 +- include/linux/slub_def.h | 4 ++-- lib/Kconfig.debug | 11 +++++++++++ mm/slab_common.c | 33 +++++++++++++++++++++++++++++++++ 6 files changed, 77 insertions(+), 3 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 315399f77173..97c0153f0247 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -232,7 +232,12 @@ struct obj_cgroup { * if MEMCG_DATA_OBJEXTS is set. */ struct slabobj_ext { +#ifdef CONFIG_MEMCG_KMEM struct obj_cgroup *objcg; +#endif +#ifdef CONFIG_SLAB_ALLOC_TAGGING + union codetag_ref ref; +#endif } __aligned(8); /* diff --git a/include/linux/slab.h b/include/linux/slab.h index 55ae3ea864a4..5a198aa02a08 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -438,6 +438,31 @@ static __always_inline unsigned int __kmalloc_index(size_t size, #define kmalloc_index(s) __kmalloc_index(s, true) #endif /* !CONFIG_SLOB */ +#ifdef CONFIG_SLAB_ALLOC_TAGGING + +#include + +union codetag_ref *get_slab_tag_ref(const void *objp); + +#define slab_tag_add(_old, _new) \ +do { \ + if (!ZERO_OR_NULL_PTR(_new) && _old != _new) \ + alloc_tag_add(get_slab_tag_ref(_new), __ksize(_new)); \ +} while (0) + +static inline void slab_tag_dec(const void *ptr) +{ + if (!ZERO_OR_NULL_PTR(ptr)) + alloc_tag_sub(get_slab_tag_ref(ptr), __ksize(ptr)); +} + +#else + +#define slab_tag_add(_old, _new) do {} while (0) +static inline void slab_tag_dec(const void *ptr) {} + +#endif + void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __alloc_size(1); void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) __assume_slab_alignment __malloc; void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru, diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h index e24c9aff6fed..25feb5f7dc32 100644 --- a/include/linux/slab_def.h +++ b/include/linux/slab_def.h @@ -106,7 +106,7 @@ static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *sla * reciprocal_divide(offset, cache->reciprocal_buffer_size) */ static inline unsigned int obj_to_index(const struct kmem_cache *cache, - const struct slab *slab, void *obj) + const struct slab *slab, const void *obj) { u32 offset = (obj - slab->s_mem); return reciprocal_divide(offset, cache->reciprocal_buffer_size); diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index f9c68a9dac04..940c146768d4 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -170,14 +170,14 @@ static inline void *nearest_obj(struct kmem_cache *cache, const struct slab *sla /* Determine object index from a given position */ static inline unsigned int __obj_to_index(const struct kmem_cache *cache, - void *addr, void *obj) + void *addr, const void *obj) { return reciprocal_divide(kasan_reset_tag(obj) - addr, cache->reciprocal_size); } static inline unsigned int obj_to_index(const struct kmem_cache *cache, - const struct slab *slab, void *obj) + const struct slab *slab, const void *obj) { if (is_kfence_address(obj)) return 0; diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 6686648843b3..08c97a978906 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -989,6 +989,17 @@ config PAGE_ALLOC_TAGGING initiated at that code location. The mechanism can be used to track memory leaks with a low performance impact. +config SLAB_ALLOC_TAGGING + bool "Enable slab allocation tagging" + default n + select ALLOC_TAGGING + select SLAB_OBJ_EXT + help + Instrument slab allocators to track allocation source code and + collect statistics on the number of allocations and their total size + initiated at that code location. The mechanism can be used to track + memory leaks with a low performance impact. + source "lib/Kconfig.kasan" source "lib/Kconfig.kfence" diff --git a/mm/slab_common.c b/mm/slab_common.c index 17996649cfe3..272eda62ecaa 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -202,6 +202,39 @@ struct kmem_cache *find_mergeable(unsigned int size, unsigned int align, return NULL; } +#ifdef CONFIG_SLAB_ALLOC_TAGGING + +union codetag_ref *get_slab_tag_ref(const void *objp) +{ + struct slabobj_ext *obj_exts; + union codetag_ref *res = NULL; + struct slab *slab; + unsigned int off; + + slab = virt_to_slab(objp); + /* + * We could be given a kmalloc_large() object, skip those. They use + * alloc_pages and can be tracked by page allocation tracking. + */ + if (!slab) + goto out; + + obj_exts = slab_obj_exts(slab); + if (!obj_exts) + goto out; + + if (!slab->slab_cache) + goto out; + + off = obj_to_index(slab->slab_cache, slab, objp); + res = &obj_exts[off].ref; +out: + return res; +} +EXPORT_SYMBOL(get_slab_tag_ref); + +#endif /* CONFIG_SLAB_ALLOC_TAGGING */ + static struct kmem_cache *create_cache(const char *name, unsigned int object_size, unsigned int align, slab_flags_t flags, unsigned int useroffset, From patchwork Tue Aug 30 21:49:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959946 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0087FC3DA6B for ; Tue, 30 Aug 2022 21:52:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232002AbiH3VwX (ORCPT ); Tue, 30 Aug 2022 17:52:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55716 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232008AbiH3Vvc (ORCPT ); Tue, 30 Aug 2022 17:51:32 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2D63A326E2 for ; Tue, 30 Aug 2022 14:50:07 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id l9-20020a252509000000b00695eb4f1422so721684ybl.13 for ; Tue, 30 Aug 2022 14:50:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=DVDWc4n0OUELi9+TR5vEerYUeoJhmwr6oegHq8VoMPY=; b=bM/4HTwD/5NvD2z1e/m/chKOx8M+k/cup4NHKUTyLbSRW7GSEIiXXrcH3cV9SZ1SAI L09ba1CkrtuU8u1GNYrpa4iH8Pf6SJhzN8yZiOz8fjB8X6SKmg6EcT8m4ikT2VrNvIZC fcEJcUtC8cuaOScZm4W400OnCT8oTeDu69+FoT8U0Z8bPUbNNtUMn8N9gtPxsRiO1Jkp yHen1YuPdi7I2iwNEDue9JoiosDrnFDYBeVHrjJCFkXdPVyn9cYkcijvjT6kVZJ46mfx 8WNJGPhwmcInAt8+4wUEyRhB6jGCNZ+WfXvMftBjOymVBGsg95WkR7Uw2/KIXsL46tgh o0Kg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=DVDWc4n0OUELi9+TR5vEerYUeoJhmwr6oegHq8VoMPY=; b=ZWLXL+8OUpTYK9gSLt9PaHNV1h6A2bBwx3QYvQcztt/uQPodp+5PSGQ8ql3WmSRbuD 7T7xJBtwlExoyqbx/oyqn6RMoHVLhCB5q9VbLMbWbjvMX0HkTKZ5dartmZi9c4goEFRX SzsmZ2grY/JkiC3/6Z0kx5RSwnvngUsPh69K1kpzFdxE4hwhodVtpUaEZTDfSFgaXo9y 4Zvzj1M8AmeyZEEU48ewzOQq/Ma0U/M8gnQ4+Vv1Juno3jr+w3UsXJu54/sGjKuR98pt h3mChWiZex+SsT/JAtOXtQgkKySPSjf+2SJon0Fsc1mQ1BAmE48KL1Yd2KeO2hqMi1Cw YwJA== X-Gm-Message-State: ACgBeo2+7+rESi3qSLKhnF6ah8Eyx7jtXB83snd/3OnmVddOLKl4OafQ RvNaQX0IhlolohK+XgIG7w8Q8+fYhws= X-Google-Smtp-Source: AA6agR7T4bKeuJlf9DAoJiFE4v7Q2R/MdDJESlGt7ahUFYj3Wmkn9a6qQs7FkiNluCNw2Wq41YrDjpcXKsY= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:9f85:0:b0:693:614:cb2a with SMTP id u5-20020a259f85000000b006930614cb2amr13240649ybq.143.1661896206375; Tue, 30 Aug 2022 14:50:06 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:05 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-17-surenb@google.com> Subject: [RFC PATCH 16/30] mm: enable slab allocation tagging for kmalloc and friends From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Redefine kmalloc, krealloc, kzalloc, kcalloc, etc. to record allocations and deallocations done by these functions. Signed-off-by: Suren Baghdasaryan Co-developed-by: Kent Overstreet Signed-off-by: Kent Overstreet --- include/linux/slab.h | 103 +++++++++++++++++++++++++------------------ mm/slab.c | 2 + mm/slab_common.c | 16 +++---- mm/slob.c | 2 + mm/slub.c | 2 + 5 files changed, 75 insertions(+), 50 deletions(-) diff --git a/include/linux/slab.h b/include/linux/slab.h index 5a198aa02a08..89273be35743 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -191,7 +191,10 @@ int kmem_cache_shrink(struct kmem_cache *s); /* * Common kmalloc functions provided by all allocators */ -void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2); +void * __must_check _krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2); +#define krealloc(_p, _size, _flags) \ + krealloc_hooks(_p, _krealloc(_p, _size, _flags)) + void kfree(const void *objp); void kfree_sensitive(const void *objp); size_t __ksize(const void *objp); @@ -463,6 +466,15 @@ static inline void slab_tag_dec(const void *ptr) {} #endif +#define krealloc_hooks(_p, _do_alloc) \ +({ \ + void *_res = _do_alloc; \ + slab_tag_add(_p, _res); \ + _res; \ +}) + +#define kmalloc_hooks(_do_alloc) krealloc_hooks(NULL, _do_alloc) + void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __alloc_size(1); void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) __assume_slab_alignment __malloc; void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru, @@ -541,25 +553,31 @@ static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, g } #endif /* CONFIG_TRACING */ -extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment +extern void *_kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __alloc_size(1); +#define kmalloc_order(_size, _flags, _order) \ + kmalloc_hooks(_kmalloc_order(_size, _flags, _order)) #ifdef CONFIG_TRACING -extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) +extern void *_kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __alloc_size(1); #else -static __always_inline __alloc_size(1) void *kmalloc_order_trace(size_t size, gfp_t flags, +static __always_inline __alloc_size(1) void *_kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) { - return kmalloc_order(size, flags, order); + return _kmalloc_order(size, flags, order); } #endif +#define kmalloc_order_trace(_size, _flags, _order) \ + kmalloc_hooks(_kmalloc_order_trace(_size, _flags, _order)) -static __always_inline __alloc_size(1) void *kmalloc_large(size_t size, gfp_t flags) +static __always_inline __alloc_size(1) void *_kmalloc_large(size_t size, gfp_t flags) { unsigned int order = get_order(size); - return kmalloc_order_trace(size, flags, order); + return _kmalloc_order_trace(size, flags, order); } +#define kmalloc_large(_size, _flags) \ + kmalloc_hooks(_kmalloc_large(_size, _flags)) /** * kmalloc - allocate memory @@ -615,14 +633,14 @@ static __always_inline __alloc_size(1) void *kmalloc_large(size_t size, gfp_t fl * Try really hard to succeed the allocation but fail * eventually. */ -static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags) +static __always_inline __alloc_size(1) void *_kmalloc(size_t size, gfp_t flags) { if (__builtin_constant_p(size)) { #ifndef CONFIG_SLOB unsigned int index; #endif if (size > KMALLOC_MAX_CACHE_SIZE) - return kmalloc_large(size, flags); + return _kmalloc_large(size, flags); #ifndef CONFIG_SLOB index = kmalloc_index(size); @@ -636,8 +654,9 @@ static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags) } return __kmalloc(size, flags); } +#define kmalloc(_size, _flags) kmalloc_hooks(_kmalloc(_size, _flags)) -static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node) +static __always_inline __alloc_size(1) void *_kmalloc_node(size_t size, gfp_t flags, int node) { #ifndef CONFIG_SLOB if (__builtin_constant_p(size) && @@ -654,6 +673,8 @@ static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t fla #endif return __kmalloc_node(size, flags, node); } +#define kmalloc_node(_size, _flags, _node) \ + kmalloc_hooks(_kmalloc_node(_size, _flags, _node)) /** * kmalloc_array - allocate memory for an array. @@ -661,16 +682,18 @@ static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t fla * @size: element size. * @flags: the type of memory to allocate (see kmalloc). */ -static inline __alloc_size(1, 2) void *kmalloc_array(size_t n, size_t size, gfp_t flags) +static inline __alloc_size(1, 2) void *_kmalloc_array(size_t n, size_t size, gfp_t flags) { size_t bytes; if (unlikely(check_mul_overflow(n, size, &bytes))) return NULL; if (__builtin_constant_p(n) && __builtin_constant_p(size)) - return kmalloc(bytes, flags); - return __kmalloc(bytes, flags); + return _kmalloc(bytes, flags); + return _kmalloc(bytes, flags); } +#define kmalloc_array(_n, _size, _flags) \ + kmalloc_hooks(_kmalloc_array(_n, _size, _flags)) /** * krealloc_array - reallocate memory for an array. @@ -679,7 +702,7 @@ static inline __alloc_size(1, 2) void *kmalloc_array(size_t n, size_t size, gfp_ * @new_size: new size of a single member of the array * @flags: the type of memory to allocate (see kmalloc) */ -static inline __alloc_size(2, 3) void * __must_check krealloc_array(void *p, +static inline __alloc_size(2, 3) void * __must_check _krealloc_array(void *p, size_t new_n, size_t new_size, gfp_t flags) @@ -689,8 +712,10 @@ static inline __alloc_size(2, 3) void * __must_check krealloc_array(void *p, if (unlikely(check_mul_overflow(new_n, new_size, &bytes))) return NULL; - return krealloc(p, bytes, flags); + return _krealloc(p, bytes, flags); } +#define krealloc_array(_p, _n, _size, _flags) \ + krealloc_hooks(_p, _krealloc_array(_p, _n, _size, _flags)) /** * kcalloc - allocate memory for an array. The memory is set to zero. @@ -698,10 +723,8 @@ static inline __alloc_size(2, 3) void * __must_check krealloc_array(void *p, * @size: element size. * @flags: the type of memory to allocate (see kmalloc). */ -static inline __alloc_size(1, 2) void *kcalloc(size_t n, size_t size, gfp_t flags) -{ - return kmalloc_array(n, size, flags | __GFP_ZERO); -} +#define kcalloc(_n, _size, _flags) \ + kmalloc_array(_n, _size, (_flags)|__GFP_ZERO) /* * kmalloc_track_caller is a special version of kmalloc that records the @@ -712,10 +735,10 @@ static inline __alloc_size(1, 2) void *kcalloc(size_t n, size_t size, gfp_t flag * request comes from. */ extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller); -#define kmalloc_track_caller(size, flags) \ - __kmalloc_track_caller(size, flags, _RET_IP_) +#define kmalloc_track_caller(size, flags) \ + kmalloc_hooks(__kmalloc_track_caller(size, flags, _RET_IP_)) -static inline __alloc_size(1, 2) void *kmalloc_array_node(size_t n, size_t size, gfp_t flags, +static inline __alloc_size(1, 2) void *_kmalloc_array_node(size_t n, size_t size, gfp_t flags, int node) { size_t bytes; @@ -723,26 +746,24 @@ static inline __alloc_size(1, 2) void *kmalloc_array_node(size_t n, size_t size, if (unlikely(check_mul_overflow(n, size, &bytes))) return NULL; if (__builtin_constant_p(n) && __builtin_constant_p(size)) - return kmalloc_node(bytes, flags, node); + return _kmalloc_node(bytes, flags, node); return __kmalloc_node(bytes, flags, node); } +#define kmalloc_array_node(_n, _size, _flags, _node) \ + kmalloc_hooks(_kmalloc_array_node(_n, _size, _flags, _node)) -static inline __alloc_size(1, 2) void *kcalloc_node(size_t n, size_t size, gfp_t flags, int node) -{ - return kmalloc_array_node(n, size, flags | __GFP_ZERO, node); -} - +#define kcalloc_node(_n, _size, _flags, _node) \ + kmalloc_array_node(_n, _size, (_flags)|__GFP_ZERO, _node) #ifdef CONFIG_NUMA extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node, unsigned long caller) __alloc_size(1); -#define kmalloc_node_track_caller(size, flags, node) \ - __kmalloc_node_track_caller(size, flags, node, \ - _RET_IP_) +#define kmalloc_node_track_caller(size, flags, node) \ + kmalloc_hooks(__kmalloc_node_track_caller(size, flags, node, _RET_IP_)) #else /* CONFIG_NUMA */ -#define kmalloc_node_track_caller(size, flags, node) \ +#define kmalloc_node_track_caller(size, flags, node) \ kmalloc_track_caller(size, flags) #endif /* CONFIG_NUMA */ @@ -750,20 +771,16 @@ extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node, /* * Shortcuts */ -static inline void *kmem_cache_zalloc(struct kmem_cache *k, gfp_t flags) -{ - return kmem_cache_alloc(k, flags | __GFP_ZERO); -} +#define kmem_cache_zalloc(_k, _flags) \ + kmem_cache_alloc(_k, (_flags)|__GFP_ZERO) /** * kzalloc - allocate memory. The memory is set to zero. * @size: how many bytes of memory are required. * @flags: the type of memory to allocate (see kmalloc). */ -static inline __alloc_size(1) void *kzalloc(size_t size, gfp_t flags) -{ - return kmalloc(size, flags | __GFP_ZERO); -} +#define kzalloc(_size, _flags) \ + kmalloc(_size, (_flags)|__GFP_ZERO) /** * kzalloc_node - allocate zeroed memory from a particular memory node. @@ -771,10 +788,12 @@ static inline __alloc_size(1) void *kzalloc(size_t size, gfp_t flags) * @flags: the type of memory to allocate (see kmalloc). * @node: memory node from which to allocate */ -static inline __alloc_size(1) void *kzalloc_node(size_t size, gfp_t flags, int node) +static inline __alloc_size(1) void *_kzalloc_node(size_t size, gfp_t flags, int node) { - return kmalloc_node(size, flags | __GFP_ZERO, node); + return _kmalloc_node(size, flags | __GFP_ZERO, node); } +#define kzalloc_node(_size, _flags, _node) \ + kmalloc_hooks(_kzalloc_node(_size, _flags, _node)) extern void *kvmalloc_node(size_t size, gfp_t flags, int node) __alloc_size(1); static inline __alloc_size(1) void *kvmalloc(size_t size, gfp_t flags) diff --git a/mm/slab.c b/mm/slab.c index ba97aeef7ec1..db344de3b260 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -3402,6 +3402,7 @@ static __always_inline void __cache_free(struct kmem_cache *cachep, void *objp, if (is_kfence_address(objp)) { kmemleak_free_recursive(objp, cachep->flags); + slab_tag_dec(objp); __kfence_free(objp); return; } @@ -3433,6 +3434,7 @@ void ___cache_free(struct kmem_cache *cachep, void *objp, check_irq_off(); kmemleak_free_recursive(objp, cachep->flags); + slab_tag_dec(objp); objp = cache_free_debugcheck(cachep, objp, caller); /* diff --git a/mm/slab_common.c b/mm/slab_common.c index 272eda62ecaa..7b6473db5ab4 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -938,7 +938,7 @@ gfp_t kmalloc_fix_flags(gfp_t flags) * directly to the page allocator. We use __GFP_COMP, because we will need to * know the allocation order to free the pages properly in kfree. */ -void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) +void *_kmalloc_order(size_t size, gfp_t flags, unsigned int order) { void *ret = NULL; struct page *page; @@ -958,16 +958,16 @@ void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) kmemleak_alloc(ret, size, 1, flags); return ret; } -EXPORT_SYMBOL(kmalloc_order); +EXPORT_SYMBOL(_kmalloc_order); #ifdef CONFIG_TRACING -void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) +void *_kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) { - void *ret = kmalloc_order(size, flags, order); + void *ret = _kmalloc_order(size, flags, order); trace_kmalloc(_RET_IP_, ret, NULL, size, PAGE_SIZE << order, flags); return ret; } -EXPORT_SYMBOL(kmalloc_order_trace); +EXPORT_SYMBOL(_kmalloc_order_trace); #endif #ifdef CONFIG_SLAB_FREELIST_RANDOM @@ -1187,7 +1187,7 @@ static __always_inline void *__do_krealloc(const void *p, size_t new_size, return (void *)p; } - ret = kmalloc_track_caller(new_size, flags); + ret = __kmalloc_track_caller(new_size, flags, _RET_IP_); if (ret && p) { /* Disable KASAN checks as the object's redzone is accessed. */ kasan_disable_current(); @@ -1211,7 +1211,7 @@ static __always_inline void *__do_krealloc(const void *p, size_t new_size, * * Return: pointer to the allocated memory or %NULL in case of error */ -void *krealloc(const void *p, size_t new_size, gfp_t flags) +void *_krealloc(const void *p, size_t new_size, gfp_t flags) { void *ret; @@ -1226,7 +1226,7 @@ void *krealloc(const void *p, size_t new_size, gfp_t flags) return ret; } -EXPORT_SYMBOL(krealloc); +EXPORT_SYMBOL(_krealloc); /** * kfree_sensitive - Clear sensitive information in memory before freeing diff --git a/mm/slob.c b/mm/slob.c index 2bd4f476c340..23b49f6c9c8f 100644 --- a/mm/slob.c +++ b/mm/slob.c @@ -554,6 +554,7 @@ void kfree(const void *block) if (unlikely(ZERO_OR_NULL_PTR(block))) return; kmemleak_free(block); + slab_tag_dec(block); sp = virt_to_folio(block); if (folio_test_slab(sp)) { @@ -680,6 +681,7 @@ static void kmem_rcu_free(struct rcu_head *head) void kmem_cache_free(struct kmem_cache *c, void *b) { kmemleak_free_recursive(b, c->flags); + slab_tag_dec(b); trace_kmem_cache_free(_RET_IP_, b, c->name); if (unlikely(c->flags & SLAB_TYPESAFE_BY_RCU)) { struct slob_rcu *slob_rcu; diff --git a/mm/slub.c b/mm/slub.c index 80199d5ac7c9..caf752087ad6 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1715,6 +1715,7 @@ static inline void *kmalloc_large_node_hook(void *ptr, size_t size, gfp_t flags) static __always_inline void kfree_hook(void *x) { kmemleak_free(x); + slab_tag_dec(x); kasan_kfree_large(x); } @@ -1722,6 +1723,7 @@ static __always_inline bool slab_free_hook(struct kmem_cache *s, void *x, bool init) { kmemleak_free_recursive(x, s->flags); + slab_tag_dec(x); debug_check_no_locks_freed(x, s->object_size); From patchwork Tue Aug 30 21:49:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959948 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12EC1ECAAD5 for ; Tue, 30 Aug 2022 21:52:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231997AbiH3Vwm (ORCPT ); Tue, 30 Aug 2022 17:52:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55782 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231950AbiH3Vvn (ORCPT ); Tue, 30 Aug 2022 17:51:43 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 453D040541 for ; Tue, 30 Aug 2022 14:50:11 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-33dbfb6d2a3so188947347b3.11 for ; Tue, 30 Aug 2022 14:50:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=PaZc/ErQmIz9CmZWuOwZKmMvFAFDhrfR0u4+oqULyqs=; b=hebs9Ab4jQ13PLVsrCINOWSpgoH5EATUTa9Cks3Eg1nmkxH7qE2oRwCx7AF5BNZXTH AHzIFQ24Or2oJnVcja+AOnCs2XgXBCldmNcsagLkjdyOLpIpceU0bsUf62DWg4fCi92e s30PWdmXxABwFMNU1JgUL6Fp0h2PBQ5hgdc0oQExPZeqJt/0xfqCgMF0cTxAmuBThoZJ frR4uwNwpRrRdJWEew4t7LzoFroWP+hLZBEH0mT+P1Yv4GgQEDG37uiXZMEF+0m6Voke PgoPiZ8qXrLB1DY6fUVv3gddg9c6Rea7O2e0xIyCiy2p7GROj3EzsKBaYmh74I5z0+5s 1/pA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=PaZc/ErQmIz9CmZWuOwZKmMvFAFDhrfR0u4+oqULyqs=; b=nXxxNQmwBOpybixpzU2m/3xDOv7jfPhGHB29uPu81RpubylBqQl9w5v/hxWzkJEVTi bW/Nv9UdoigCdmSc6GJcTt6+x5IwjqVIGiBF7r8Y6EkgDttgqm6odbBF71hE2tmTaqT1 /vXdSW25B/4085VILOeBHCh7r/yJW+hj4cmoN1ODmlF+3wFdK+N4tkZcF8Ow5Zsr30iK JDlC+eybNHgOtPGyhIk4qWZwXJVGVlSQOrJGewyhq/Im4y4AbVl1Epm1wmleT8l2nD/y gtuapOkngnJ4nzhJOJKr+50HxdRV/uS6PKIgFp/mu/EG5XwI5fkLpR9CCEuNJIQSKE3S g1MA== X-Gm-Message-State: ACgBeo10+DbRizgFYbTWWTnPzytRAo0hUerEOh7TUPMqxDzXwNBZJzwv icUTt78ueX5dS052lDezLMf1qEEeTPI= X-Google-Smtp-Source: AA6agR60bKzipuEg+9xXy+Pee21QZvBbTBPOhKhJO0fVzuHKtaTXE6dCUM7Zh4tN5bptU+EiRysnFAQW6po= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:31d5:0:b0:677:28b:1451 with SMTP id x204-20020a2531d5000000b00677028b1451mr13217805ybx.437.1661896208990; Tue, 30 Aug 2022 14:50:08 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:06 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-18-surenb@google.com> Subject: [RFC PATCH 17/30] lib/string.c: strsep_no_empty() From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet This adds a new helper which is like strsep, except that it skips empty tokens. Signed-off-by: Kent Overstreet --- include/linux/string.h | 1 + lib/string.c | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/include/linux/string.h b/include/linux/string.h index 61ec7e4f6311..b950ac9cfa56 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -96,6 +96,7 @@ extern char * strpbrk(const char *,const char *); #ifndef __HAVE_ARCH_STRSEP extern char * strsep(char **,const char *); #endif +extern char *strsep_no_empty(char **, const char *); #ifndef __HAVE_ARCH_STRSPN extern __kernel_size_t strspn(const char *,const char *); #endif diff --git a/lib/string.c b/lib/string.c index 6f334420f687..6939f5b751f2 100644 --- a/lib/string.c +++ b/lib/string.c @@ -596,6 +596,25 @@ char *strsep(char **s, const char *ct) EXPORT_SYMBOL(strsep); #endif +/** + * strsep_no_empt - Split a string into tokens, but don't return empty tokens + * @s: The string to be searched + * @ct: The characters to search for + * + * strsep() updates @s to point after the token, ready for the next call. + */ +char *strsep_no_empty(char **s, const char *ct) +{ + char *ret; + + do { + ret = strsep(s, ct); + } while (ret && !*ret); + + return ret; +} +EXPORT_SYMBOL_GPL(strsep_no_empty); + #ifndef __HAVE_ARCH_MEMSET /** * memset - Fill a region of memory with the given value From patchwork Tue Aug 30 21:49:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959947 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C48DECAAD5 for ; Tue, 30 Aug 2022 21:52:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232140AbiH3VwZ (ORCPT ); Tue, 30 Aug 2022 17:52:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55706 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232021AbiH3Vvh (ORCPT ); Tue, 30 Aug 2022 17:51:37 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA1175A3D6 for ; Tue, 30 Aug 2022 14:50:12 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id v5-20020a2583c5000000b006964324be8cso714998ybm.14 for ; Tue, 30 Aug 2022 14:50:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=ofrC1PqNoYIKnyKYWjHD7inVuacSGBJJLHypr0RookE=; b=VGRf/84iZ1p67fJnvZ/X1xcCS67elVIPk9IGJJ+bUuWMMIF3sRy7RxzaFdrO+WG7dH d/TlbooTEmNgUOWOqFkIyFyAhgj636EU+aMGf3bhfUhI9LPGF0x0v39+rp3COSYRpICF emKVXtjPhIhzNDYojpJwNosafP0nhIDkwMZkM1SszHz/jzoceNfVxm+3TZcljgHgvL8y TVQxs/Koc6RZpeES3hIup/YjX0wdMTaa96AxFv1wg+iYYTwvZ7GGZBg+tBO+knqi5Juw fEBz0idUpafaOWyAi5+6IN+BrsO4tEn9FOM/qV9RCHSE0DEDO6hWPkFyyM8anxAPc5xB lw8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=ofrC1PqNoYIKnyKYWjHD7inVuacSGBJJLHypr0RookE=; b=B5mFxuc/IV2HkC928KIdjjJwvhyA0tCPTl1RmTQ1LK1nFvrD+l8pYuIhHwulH6YBNU FM+q/oFcQIXfg57Vh6QZu3L53/GZC/CWt/yaX0fhcIl3rplildRPfD7stWq7nB5bHSEf Gkm7O3DzPU0JKQIJiTD7joJVoe/VFy2hfyK55kwxSX6KscpGA+FDIYBdbyXiHAPkQUub fcVTrg2cuttccv995TdZPS3yPlf224DNwVRpwj7XVjB6a9O5l9If2+x8rHM/QAZyToZ6 SMOIagB+AkO15Ba8PJbl1oa3lZd6FwSWtEkAEraxODBnug8nbLHiY310fZJnAKmamjzq tSFw== X-Gm-Message-State: ACgBeo0qwl0BgvATV468yQZLGGmuG4JZ6UTbm65PN0JCqAPzhNrs5IYm lQ8oEFwWOT2xqPWJ3TGf+GEnqL8n0EQ= X-Google-Smtp-Source: AA6agR5S9pMWRwrdqx3FNbA74RB4ArEfkNsuNbxmIMNnFeUvwj+w6zosapneZCdX4VxzXX80wAu2gx5OQhs= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a0d:cd43:0:b0:329:febf:8c25 with SMTP id p64-20020a0dcd43000000b00329febf8c25mr15393402ywd.90.1661896211649; Tue, 30 Aug 2022 14:50:11 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:07 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-19-surenb@google.com> Subject: [RFC PATCH 18/30] codetag: add codetag query helper functions From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet Provide codetag_query_parse() to parse codetag queries and codetag_matches_query() to check if the query affects a given codetag. Signed-off-by: Kent Overstreet --- include/linux/codetag.h | 27 ++++++++ lib/codetag.c | 135 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 162 insertions(+) diff --git a/include/linux/codetag.h b/include/linux/codetag.h index 386733e89b31..0c605417ebbe 100644 --- a/include/linux/codetag.h +++ b/include/linux/codetag.h @@ -80,4 +80,31 @@ static inline void codetag_load_module(struct module *mod) {} static inline void codetag_unload_module(struct module *mod) {} #endif +/* Codetag query parsing */ + +struct codetag_query { + const char *filename; + const char *module; + const char *function; + const char *class; + unsigned int first_line, last_line; + unsigned int first_index, last_index; + unsigned int cur_index; + + bool match_line:1; + bool match_index:1; + + unsigned int set_enabled:1; + unsigned int enabled:2; + + unsigned int set_frequency:1; + unsigned int frequency; +}; + +char *codetag_query_parse(struct codetag_query *q, char *buf); +bool codetag_matches_query(struct codetag_query *q, + const struct codetag *ct, + const struct codetag_module *mod, + const char *class); + #endif /* _LINUX_CODETAG_H */ diff --git a/lib/codetag.c b/lib/codetag.c index f0a3174f9b71..288ccfd5cbd0 100644 --- a/lib/codetag.c +++ b/lib/codetag.c @@ -246,3 +246,138 @@ void codetag_unload_module(struct module *mod) } mutex_unlock(&codetag_lock); } + +/* Codetag query parsing */ + +#define CODETAG_QUERY_TOKENS() \ + x(func) \ + x(file) \ + x(line) \ + x(module) \ + x(class) \ + x(index) + +enum tokens { +#define x(name) TOK_##name, + CODETAG_QUERY_TOKENS() +#undef x +}; + +static const char * const token_strs[] = { +#define x(name) #name, + CODETAG_QUERY_TOKENS() +#undef x + NULL +}; + +static int parse_range(char *str, unsigned int *first, unsigned int *last) +{ + char *first_str = str; + char *last_str = strchr(first_str, '-'); + + if (last_str) + *last_str++ = '\0'; + + if (kstrtouint(first_str, 10, first)) + return -EINVAL; + + if (!last_str) + *last = *first; + else if (kstrtouint(last_str, 10, last)) + return -EINVAL; + + return 0; +} + +char *codetag_query_parse(struct codetag_query *q, char *buf) +{ + while (1) { + char *p = buf; + char *str1 = strsep_no_empty(&p, " \t\r\n"); + char *str2 = strsep_no_empty(&p, " \t\r\n"); + int ret, token; + + if (!str1 || !str2) + break; + + token = match_string(token_strs, ARRAY_SIZE(token_strs), str1); + if (token < 0) + break; + + switch (token) { + case TOK_func: + q->function = str2; + break; + case TOK_file: + q->filename = str2; + break; + case TOK_line: + ret = parse_range(str2, &q->first_line, &q->last_line); + if (ret) + return ERR_PTR(ret); + q->match_line = true; + break; + case TOK_module: + q->module = str2; + break; + case TOK_class: + q->class = str2; + break; + case TOK_index: + ret = parse_range(str2, &q->first_index, &q->last_index); + if (ret) + return ERR_PTR(ret); + q->match_index = true; + break; + } + + buf = p; + } + + return buf; +} + +bool codetag_matches_query(struct codetag_query *q, + const struct codetag *ct, + const struct codetag_module *mod, + const char *class) +{ + size_t classlen = q->class ? strlen(q->class) : 0; + + if (q->module && + (!mod->mod || + strcmp(q->module, ct->modname))) + return false; + + if (q->filename && + strcmp(q->filename, ct->filename) && + strcmp(q->filename, kbasename(ct->filename))) + return false; + + if (q->function && + strcmp(q->function, ct->function)) + return false; + + /* match against the line number range */ + if (q->match_line && + (ct->lineno < q->first_line || + ct->lineno > q->last_line)) + return false; + + /* match against the class */ + if (classlen && + (strncmp(q->class, class, classlen) || + (class[classlen] && class[classlen] != ':'))) + return false; + + /* match against the fault index */ + if (q->match_index && + (q->cur_index < q->first_index || + q->cur_index > q->last_index)) { + q->cur_index++; + return false; + } + + q->cur_index++; + return true; +} From patchwork Tue Aug 30 21:49:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959949 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3958ECAAA1 for ; Tue, 30 Aug 2022 21:52:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232154AbiH3Vwo (ORCPT ); Tue, 30 Aug 2022 17:52:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55780 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231458AbiH3Vvx (ORCPT ); Tue, 30 Aug 2022 17:51:53 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 064477D7BA for ; Tue, 30 Aug 2022 14:50:14 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id n18-20020a25d612000000b0069661a1dc48so726107ybg.20 for ; Tue, 30 Aug 2022 14:50:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=vjjGk6pymrjMSyrfbdxQ3/p6O8Kus1ISFMk96O5o39I=; b=OXRgptnZ7i+ICNigJkBWT2cSYFZtaNbBth28h0zoZMO0yy1KlhLJWbOL/agYd17Ge+ H8eygOzvmZZ/skvQrD81gmRZiypc2lcG0QHZKUmUE/ni+JmGucTrUU1yExVGKLt9MJij ftmKNsJcjdyi4dhaD13stFQzNJ1xbGRin4GkFtVHsn7nMBn5wS02jls3bfzzV07WlYsq mi9CGSRXCSnkh+sEnyExpA/okvKqytcPBHSo0PnRH7aCU5Q1xfzVqiAqvHEkIxcqVnHs 2oBNbQXE+ro549f5zm45uHNrO1sona3NzY/FqJ45DQDnvKS22ifnzF7B5XwlNd/5xlfX M9KQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=vjjGk6pymrjMSyrfbdxQ3/p6O8Kus1ISFMk96O5o39I=; b=NUfUcLBfFPwpR1LMM4TzOwlKPFTaSwMYtisDNa7y9OkHchXRRdZGpwQLxr6FJT5CIb kMdK1keMl+dtWRcIhdWi1JxFsS34xpEQh9g0BRRgtfaVBrKODu5HSrLEN6mi2VZQOwMp veK5uzq7hcx/XaR2jQ7Cd7Tp0uop9E63jvcif3TbuBzvBQ+b8PFgoycK3VvG7hwcT8gP /lirJd6+neR4U74IAlI+utXVtPVyNH3wPxj76d2ZMuS7ls6GHDi/eEj/1zBZqMcFcFQm EbFTib+0YRz/yvyfdiMqehAz+QmnJTc8G/eiUm+SNogXSFyAOIUmig6mktAeAlH30Ewn USCQ== X-Gm-Message-State: ACgBeo1Gjp/qQyyWfpRKEvaiUXAGabUYIDTObZ2Cv4mZ+a8j5L2TndC+ pAqDZrkC13dFuXq4DpUD8yTTNf3wulU= X-Google-Smtp-Source: AA6agR4vIoEARtwRV51vchZbtFRCbo5XPnJmQ5jWl1WbEyyresfSAVBl15tPztqHAfgzavikcDXhMgmO3BY= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a81:f47:0:b0:31f:434b:5ee with SMTP id 68-20020a810f47000000b0031f434b05eemr15734874ywp.383.1661896214287; Tue, 30 Aug 2022 14:50:14 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:08 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-20-surenb@google.com> Subject: [RFC PATCH 19/30] move stack capture functionality into a separate function for reuse From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Make save_stack() function part of stackdepot API to be used outside of page_owner. Also rename task_struct's in_page_owner to in_capture_stack flag to better convey the wider use of this flag. Signed-off-by: Suren Baghdasaryan --- include/linux/sched.h | 6 ++-- include/linux/stackdepot.h | 3 ++ lib/stackdepot.c | 68 ++++++++++++++++++++++++++++++++++++++ mm/page_owner.c | 52 ++--------------------------- 4 files changed, 77 insertions(+), 52 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index e7b2f8a5c711..d06cad6c14bd 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -930,9 +930,9 @@ struct task_struct { /* Stalled due to lack of memory */ unsigned in_memstall:1; #endif -#ifdef CONFIG_PAGE_OWNER - /* Used by page_owner=on to detect recursion in page tracking. */ - unsigned in_page_owner:1; +#ifdef CONFIG_STACKDEPOT + /* Used by stack_depot_capture_stack to detect recursion. */ + unsigned in_capture_stack:1; #endif #ifdef CONFIG_EVENTFD /* Recursion prevention for eventfd_signal() */ diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h index bc2797955de9..8dc9fdb2c4dd 100644 --- a/include/linux/stackdepot.h +++ b/include/linux/stackdepot.h @@ -64,4 +64,7 @@ int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size, void stack_depot_print(depot_stack_handle_t stack); +bool stack_depot_capture_init(void); +depot_stack_handle_t stack_depot_capture_stack(gfp_t flags); + #endif diff --git a/lib/stackdepot.c b/lib/stackdepot.c index e73fda23388d..c8615bd6dc25 100644 --- a/lib/stackdepot.c +++ b/lib/stackdepot.c @@ -514,3 +514,71 @@ depot_stack_handle_t stack_depot_save(unsigned long *entries, return __stack_depot_save(entries, nr_entries, alloc_flags, true); } EXPORT_SYMBOL_GPL(stack_depot_save); + +static depot_stack_handle_t recursion_handle; +static depot_stack_handle_t failure_handle; + +static __always_inline depot_stack_handle_t create_custom_stack(void) +{ + unsigned long entries[4]; + unsigned int nr_entries; + + nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0); + return stack_depot_save(entries, nr_entries, GFP_KERNEL); +} + +static noinline void register_recursion_stack(void) +{ + recursion_handle = create_custom_stack(); +} + +static noinline void register_failure_stack(void) +{ + failure_handle = create_custom_stack(); +} + +bool stack_depot_capture_init(void) +{ + static DEFINE_MUTEX(stack_depot_capture_init_mutex); + static bool utility_stacks_ready; + + mutex_lock(&stack_depot_capture_init_mutex); + if (!utility_stacks_ready) { + register_recursion_stack(); + register_failure_stack(); + utility_stacks_ready = true; + } + mutex_unlock(&stack_depot_capture_init_mutex); + + return utility_stacks_ready; +} + +/* TODO: teach stack_depot_capture_stack to use off stack temporal storage */ +#define CAPTURE_STACK_DEPTH (16) + +depot_stack_handle_t stack_depot_capture_stack(gfp_t flags) +{ + unsigned long entries[CAPTURE_STACK_DEPTH]; + depot_stack_handle_t handle; + unsigned int nr_entries; + + /* + * Avoid recursion. + * + * Sometimes page metadata allocation tracking requires more + * memory to be allocated: + * - when new stack trace is saved to stack depot + * - when backtrace itself is calculated (ia64) + */ + if (current->in_capture_stack) + return recursion_handle; + current->in_capture_stack = 1; + + nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 2); + handle = stack_depot_save(entries, nr_entries, flags); + if (!handle) + handle = failure_handle; + + current->in_capture_stack = 0; + return handle; +} diff --git a/mm/page_owner.c b/mm/page_owner.c index fd4af1ad34b8..c3173e34a779 100644 --- a/mm/page_owner.c +++ b/mm/page_owner.c @@ -15,12 +15,6 @@ #include "internal.h" -/* - * TODO: teach PAGE_OWNER_STACK_DEPTH (__dump_page_owner and save_stack) - * to use off stack temporal storage - */ -#define PAGE_OWNER_STACK_DEPTH (16) - struct page_owner { unsigned short order; short last_migrate_reason; @@ -37,8 +31,6 @@ struct page_owner { static bool page_owner_enabled __initdata; DEFINE_STATIC_KEY_FALSE(page_owner_inited); -static depot_stack_handle_t dummy_handle; -static depot_stack_handle_t failure_handle; static depot_stack_handle_t early_handle; static void init_early_allocated_pages(void); @@ -68,16 +60,6 @@ static __always_inline depot_stack_handle_t create_dummy_stack(void) return stack_depot_save(entries, nr_entries, GFP_KERNEL); } -static noinline void register_dummy_stack(void) -{ - dummy_handle = create_dummy_stack(); -} - -static noinline void register_failure_stack(void) -{ - failure_handle = create_dummy_stack(); -} - static noinline void register_early_stack(void) { early_handle = create_dummy_stack(); @@ -88,8 +70,7 @@ static __init void init_page_owner(void) if (!page_owner_enabled) return; - register_dummy_stack(); - register_failure_stack(); + stack_depot_capture_init(); register_early_stack(); static_branch_enable(&page_owner_inited); init_early_allocated_pages(); @@ -106,33 +87,6 @@ static inline struct page_owner *get_page_owner(struct page_ext *page_ext) return (void *)page_ext + page_owner_ops.offset; } -static noinline depot_stack_handle_t save_stack(gfp_t flags) -{ - unsigned long entries[PAGE_OWNER_STACK_DEPTH]; - depot_stack_handle_t handle; - unsigned int nr_entries; - - /* - * Avoid recursion. - * - * Sometimes page metadata allocation tracking requires more - * memory to be allocated: - * - when new stack trace is saved to stack depot - * - when backtrace itself is calculated (ia64) - */ - if (current->in_page_owner) - return dummy_handle; - current->in_page_owner = 1; - - nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 2); - handle = stack_depot_save(entries, nr_entries, flags); - if (!handle) - handle = failure_handle; - - current->in_page_owner = 0; - return handle; -} - void __reset_page_owner(struct page *page, unsigned short order) { int i; @@ -145,7 +99,7 @@ void __reset_page_owner(struct page *page, unsigned short order) if (unlikely(!page_ext)) return; - handle = save_stack(GFP_NOWAIT | __GFP_NOWARN); + handle = stack_depot_capture_stack(GFP_NOWAIT | __GFP_NOWARN); for (i = 0; i < (1 << order); i++) { __clear_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags); page_owner = get_page_owner(page_ext); @@ -189,7 +143,7 @@ noinline void __set_page_owner(struct page *page, unsigned short order, if (unlikely(!page_ext)) return; - handle = save_stack(gfp_mask); + handle = stack_depot_capture_stack(gfp_mask); __set_page_owner_handle(page_ext, handle, order, gfp_mask); } From patchwork Tue Aug 30 21:49:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959972 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6692ECAAD4 for ; Tue, 30 Aug 2022 21:52:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232165AbiH3Vww (ORCPT ); Tue, 30 Aug 2022 17:52:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60322 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231974AbiH3VwF (ORCPT ); Tue, 30 Aug 2022 17:52:05 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8ABCC90199 for ; Tue, 30 Aug 2022 14:50:18 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id d8-20020a25bc48000000b00680651cf051so725233ybk.23 for ; Tue, 30 Aug 2022 14:50:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=k1RBq3yq4U5V/3OJ4fp4kBxfnryZ2+wdIlpTdmkxoTI=; b=VhiQWzTMcrdH+FaQQULhGR8WSlxfBOTth6dlvIbrF9tNsikpCPNZSCsVVNQzwya0+m inYdBoaJMW2SV4VMQbG9aeBgJRps+ez+XWS5k3Rur9K2IIy7JqKGimhHZpLDadEOC8jn mlaKyF5VrtiZ0WWLgSmcUKau4UWKqCVgLA42otcSHORATS4YZtKue2vIEn7FwP4yHe6S ZPXLk6/7AXcHQ83MkQ3/jShGMz5S5LOyBhJVZrkJyLThC9PcUD9ldyCOL0uzdDG1sb5M 06MkPGxPeBa0aM/Evc6sFSslW/GhAWnvZBA8+PZrxcvSTvVlK9z1clRYSNWpN2i8rC4H SSiA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=k1RBq3yq4U5V/3OJ4fp4kBxfnryZ2+wdIlpTdmkxoTI=; b=jOTUvVSHGVnfpXVfVxw2y5lplGifNHx48JiR+jm1tN5jz1/jvE6YrwS1DfEq09BZnT Sq3y/KBDscmBZlmFnj8QtVX2Kca11hgdczPSuEe1qbmHU+e9VzAb+6ui5EGQ16bl1deS ApyLzOrnhm+3MDeIQ0fGoEF5YP0rzqGsDtZZHw98nokC2gakRmertaymvK5HKMHVQ68Z u9nWzp12N5qZzCA9b/LmTL+MBfPgmENBg59rG7h9CmJ6xW1UArZPSQVa4d1zqNp9yUxZ JVj+nkXtSMBodI6QZOAEb/qkjgm+mxQksnbGKYqAKxWztpczKYAs31a/asxVcCNx+T6x pU4w== X-Gm-Message-State: ACgBeo1zl0wA6dXtvvwfcezKjgcyeGF7dlSYV0dylgtgglC1ihZJ1hA2 4SiOPD4ankKiezzpwgj8wSccmdhJeDA= X-Google-Smtp-Source: AA6agR4708nF24x7XI3k0Gx9DyNo/OwdO20Vx8Uo0n79f6qJIllny6st9xxE+KwycAbaAtlsqzU8pt5cOBU= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:ef45:0:b0:696:45b0:7b5d with SMTP id w5-20020a25ef45000000b0069645b07b5dmr12075882ybm.368.1661896216803; Tue, 30 Aug 2022 14:50:16 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:09 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-21-surenb@google.com> Subject: [RFC PATCH 20/30] lib: introduce support for storing code tag context From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Add support for code tag context capture when registering a new code tag type. When context capture for a specific code tag is enabled, codetag_ref will point to a codetag_ctx object which can be attached to an application-specific object storing code invocation context. codetag_ctx has a pointer to its codetag_with_ctx object with embedded codetag object in it. All context objects of the same code tag are placed into codetag_with_ctx.ctx_head linked list. codetag.flag is used to indicate when a context capture for the associated code tag is initialized and enabled. Signed-off-by: Suren Baghdasaryan --- include/linux/codetag.h | 50 +++++++++++++- include/linux/codetag_ctx.h | 48 +++++++++++++ lib/codetag.c | 134 ++++++++++++++++++++++++++++++++++++ 3 files changed, 231 insertions(+), 1 deletion(-) create mode 100644 include/linux/codetag_ctx.h diff --git a/include/linux/codetag.h b/include/linux/codetag.h index 0c605417ebbe..57736ec77b45 100644 --- a/include/linux/codetag.h +++ b/include/linux/codetag.h @@ -5,8 +5,12 @@ #ifndef _LINUX_CODETAG_H #define _LINUX_CODETAG_H +#include +#include #include +struct kref; +struct codetag_ctx; struct codetag_iterator; struct codetag_type; struct seq_buf; @@ -18,15 +22,38 @@ struct module; * an array of these. */ struct codetag { - unsigned int flags; /* used in later patches */ + unsigned int flags; /* has to be the first member shared with codetag_ctx */ unsigned int lineno; const char *modname; const char *function; const char *filename; } __aligned(8); +/* codetag_with_ctx flags */ +#define CTC_FLAG_CTX_PTR (1 << 0) +#define CTC_FLAG_CTX_READY (1 << 1) +#define CTC_FLAG_CTX_ENABLED (1 << 2) + +/* + * Code tag with context capture support. Contains a list to store context for + * each tag hit, a lock protecting the list and a flag to indicate whether + * context capture is enabled for the tag. + */ +struct codetag_with_ctx { + struct codetag ct; + struct list_head ctx_head; + spinlock_t ctx_lock; +} __aligned(8); + +/* + * Tag reference can point to codetag directly or indirectly via codetag_ctx. + * Direct codetag pointer is used when context capture is disabled or not + * supported. When context capture for the tag is used, the reference points + * to the codetag_ctx through which the codetag can be reached. + */ union codetag_ref { struct codetag *ct; + struct codetag_ctx *ctx; }; struct codetag_range { @@ -46,6 +73,7 @@ struct codetag_type_desc { struct codetag_module *cmod); void (*module_unload)(struct codetag_type *cttype, struct codetag_module *cmod); + void (*free_ctx)(struct kref *ref); }; struct codetag_iterator { @@ -53,6 +81,7 @@ struct codetag_iterator { struct codetag_module *cmod; unsigned long mod_id; struct codetag *ct; + struct codetag_ctx *ctx; }; #define CODE_TAG_INIT { \ @@ -63,9 +92,28 @@ struct codetag_iterator { .flags = 0, \ } +static inline bool is_codetag_ctx_ref(union codetag_ref *ref) +{ + return !!(ref->ct->flags & CTC_FLAG_CTX_PTR); +} + +static inline +struct codetag_with_ctx *ct_to_ctc(struct codetag *ct) +{ + return container_of(ct, struct codetag_with_ctx, ct); +} + void codetag_lock_module_list(struct codetag_type *cttype, bool lock); struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype); struct codetag *codetag_next_ct(struct codetag_iterator *iter); +struct codetag_ctx *codetag_next_ctx(struct codetag_iterator *iter); + +bool codetag_enable_ctx(struct codetag_with_ctx *ctc, bool enable); +static inline bool codetag_ctx_enabled(struct codetag_with_ctx *ctc) +{ + return !!(ctc->ct.flags & CTC_FLAG_CTX_ENABLED); +} +bool codetag_has_ctx(struct codetag_with_ctx *ctc); void codetag_to_text(struct seq_buf *out, struct codetag *ct); diff --git a/include/linux/codetag_ctx.h b/include/linux/codetag_ctx.h new file mode 100644 index 000000000000..e741484f0e08 --- /dev/null +++ b/include/linux/codetag_ctx.h @@ -0,0 +1,48 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * code tag context + */ +#ifndef _LINUX_CODETAG_CTX_H +#define _LINUX_CODETAG_CTX_H + +#include +#include + +/* Code tag hit context. */ +struct codetag_ctx { + unsigned int flags; /* has to be the first member shared with codetag */ + struct codetag_with_ctx *ctc; + struct list_head node; + struct kref refcount; +} __aligned(8); + +static inline struct codetag_ctx *kref_to_ctx(struct kref *refcount) +{ + return container_of(refcount, struct codetag_ctx, refcount); +} + +static inline void add_ctx(struct codetag_ctx *ctx, + struct codetag_with_ctx *ctc) +{ + kref_init(&ctx->refcount); + spin_lock(&ctc->ctx_lock); + ctx->flags = CTC_FLAG_CTX_PTR; + ctx->ctc = ctc; + list_add_tail(&ctx->node, &ctc->ctx_head); + spin_unlock(&ctc->ctx_lock); +} + +static inline void rem_ctx(struct codetag_ctx *ctx, + void (*free_ctx)(struct kref *refcount)) +{ + struct codetag_with_ctx *ctc = ctx->ctc; + + spin_lock(&ctc->ctx_lock); + /* ctx might have been removed while we were using it */ + if (!list_empty(&ctx->node)) + list_del_init(&ctx->node); + spin_unlock(&ctc->ctx_lock); + kref_put(&ctx->refcount, free_ctx); +} + +#endif /* _LINUX_CODETAG_CTX_H */ diff --git a/lib/codetag.c b/lib/codetag.c index 288ccfd5cbd0..2762fda5c016 100644 --- a/lib/codetag.c +++ b/lib/codetag.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only #include +#include #include #include #include @@ -91,6 +92,139 @@ struct codetag *codetag_next_ct(struct codetag_iterator *iter) return ct; } +static struct codetag_ctx *next_ctx_from_ct(struct codetag_iterator *iter) +{ + struct codetag_with_ctx *ctc; + struct codetag_ctx *ctx = NULL; + struct codetag *ct = iter->ct; + + while (ct) { + if (!(ct->flags & CTC_FLAG_CTX_READY)) + goto next; + + ctc = ct_to_ctc(ct); + spin_lock(&ctc->ctx_lock); + if (!list_empty(&ctc->ctx_head)) { + ctx = list_first_entry(&ctc->ctx_head, + struct codetag_ctx, node); + kref_get(&ctx->refcount); + } + spin_unlock(&ctc->ctx_lock); + if (ctx) + break; +next: + ct = codetag_next_ct(iter); + } + + iter->ctx = ctx; + return ctx; +} + +struct codetag_ctx *codetag_next_ctx(struct codetag_iterator *iter) +{ + struct codetag_ctx *ctx = iter->ctx; + struct codetag_ctx *found = NULL; + + lockdep_assert_held(&iter->cttype->mod_lock); + + if (!ctx) + return next_ctx_from_ct(iter); + + spin_lock(&ctx->ctc->ctx_lock); + /* + * Do not advance if the object was isolated, restart at the same tag. + */ + if (!list_empty(&ctx->node)) { + if (list_is_last(&ctx->node, &ctx->ctc->ctx_head)) { + /* Finished with this tag, advance to the next */ + codetag_next_ct(iter); + } else { + found = list_next_entry(ctx, node); + kref_get(&found->refcount); + } + } + spin_unlock(&ctx->ctc->ctx_lock); + kref_put(&ctx->refcount, iter->cttype->desc.free_ctx); + + if (!found) + return next_ctx_from_ct(iter); + + iter->ctx = found; + return found; +} + +static struct codetag_type *find_cttype(struct codetag *ct) +{ + struct codetag_module *cmod; + struct codetag_type *cttype; + unsigned long mod_id; + unsigned long tmp; + + mutex_lock(&codetag_lock); + list_for_each_entry(cttype, &codetag_types, link) { + down_read(&cttype->mod_lock); + idr_for_each_entry_ul(&cttype->mod_idr, cmod, tmp, mod_id) { + if (ct >= cmod->range.start && ct < cmod->range.stop) { + up_read(&cttype->mod_lock); + goto found; + } + } + up_read(&cttype->mod_lock); + } + cttype = NULL; +found: + mutex_unlock(&codetag_lock); + + return cttype; +} + +bool codetag_enable_ctx(struct codetag_with_ctx *ctc, bool enable) +{ + struct codetag_type *cttype = find_cttype(&ctc->ct); + + if (!cttype || !cttype->desc.free_ctx) + return false; + + lockdep_assert_held(&cttype->mod_lock); + BUG_ON(!rwsem_is_locked(&cttype->mod_lock)); + + if (codetag_ctx_enabled(ctc) == enable) + return false; + + if (enable) { + /* Initialize context capture fields only once */ + if (!(ctc->ct.flags & CTC_FLAG_CTX_READY)) { + spin_lock_init(&ctc->ctx_lock); + INIT_LIST_HEAD(&ctc->ctx_head); + ctc->ct.flags |= CTC_FLAG_CTX_READY; + } + ctc->ct.flags |= CTC_FLAG_CTX_ENABLED; + } else { + /* + * The list of context objects is intentionally left untouched. + * It can be read back and if context capture is re-enablied it + * will append new objects. + */ + ctc->ct.flags &= ~CTC_FLAG_CTX_ENABLED; + } + + return true; +} + +bool codetag_has_ctx(struct codetag_with_ctx *ctc) +{ + bool no_ctx; + + if (!(ctc->ct.flags & CTC_FLAG_CTX_READY)) + return false; + + spin_lock(&ctc->ctx_lock); + no_ctx = list_empty(&ctc->ctx_head); + spin_unlock(&ctc->ctx_lock); + + return !no_ctx; +} + void codetag_to_text(struct seq_buf *out, struct codetag *ct) { seq_buf_printf(out, "%s:%u module:%s func:%s", From patchwork Tue Aug 30 21:49:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959973 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E1DCECAAD5 for ; Tue, 30 Aug 2022 21:52:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232168AbiH3Vwy (ORCPT ); Tue, 30 Aug 2022 17:52:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58566 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232050AbiH3VwH (ORCPT ); Tue, 30 Aug 2022 17:52:07 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F0BAC90C77 for ; Tue, 30 Aug 2022 14:50:20 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id n18-20020a25d612000000b0069661a1dc48so726204ybg.20 for ; Tue, 30 Aug 2022 14:50:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=g/YTY0u3yadOEciruM5AsrsoCBr2LdU6F8XKpTCCoXE=; b=W9ws8GQM/90cTDmDN4sijcJEomx0StRc0qnBzUb4l/dKXquUVNd+bkQOSXwJxVHeSR QG60SQEVm6VA17XfZBoKGGzRXcSLpOO21vFhHqE3hmk2c9gf7/HtTYj3kp+LhucXD7zS 32XZBjNXwZuxDqFNmQ7tqseycDiUbS6bSZaQn1dUA4lMpujJGMmrMQrbex/Wc04wsxKu B3eerDWEH6lgqkNOuwBI3NzTboD1OpHxugThISWmfH4bZEx8AfmzAF8N3fNasLdDbLD+ eox1A7oZRZT7QuWta0fKi5W+QGVJWyjQVSgZXf5MPYNZ62sspJg2YTeSPJPL6GlufKLs TtEA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=g/YTY0u3yadOEciruM5AsrsoCBr2LdU6F8XKpTCCoXE=; b=lnig5tJwRnb/LMIT+pLdpJiXF3C6+Ql7D/sYHs1Lr5CPZc8QnBnrT+CvD4vGLhKyya krZj+8ecQg0C6rz8bT+Cwern1QsRDxpDAL4nHu3OLGRC95FPyeDfFPIY/Hz9lo1M8plu YRB1xfn78AD+dFQnxXJJC5vjAEBhEAOclHjcrZ4+HjNSPUQdp0P7ZMpTdU4kvO02F6Di GFAbLBIqVDXYIA/bHUMHWweTX2xX02B2kFdlChddjKpmjBtAckY39AydaD7n0yk2Gomf zR4L8ulVIHzzRr+4lRK89DQBPK3Goxj96j74uqVkvZv6R3bW6w+ehq9nfWza70KDO5jc zUVw== X-Gm-Message-State: ACgBeo1FZVw2ck/XMpqvFTJ9KFDCglMGheC75AIQ2095ojGSBgsVlEZt tbDfPLCstOvXue7JL5NfHhWLfqU14I4= X-Google-Smtp-Source: AA6agR7zbyBaHTavSAdE/8JQpYqzjp+0tJKa0FexURSVp+cX649/Bz10TDDbPo8ER0HHDl/6WW+PqHXP88E= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a05:6902:100b:b0:695:bd4e:95d6 with SMTP id w11-20020a056902100b00b00695bd4e95d6mr13705955ybt.595.1661896219160; Tue, 30 Aug 2022 14:50:19 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:10 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-22-surenb@google.com> Subject: [RFC PATCH 21/30] lib: implement context capture support for page and slab allocators From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Implement mechanisms for capturing allocation call context which consists of: - allocation size - pid, tgid and name of the allocating task - allocation timestamp - allocation call stack The patch creates alloc_tags.ctx file which can be written to enable/disable context capture for a specific code tag. Captured context can be obtained by reading alloc_tags.ctx file. Usage example: echo "file include/asm-generic/pgalloc.h line 63 enable" > \ /sys/kernel/debug/alloc_tags.ctx cat alloc_tags.ctx 91.0MiB 212 include/asm-generic/pgalloc.h:63 module:pgtable func:__pte_alloc_one size: 4096 pid: 1551 tgid: 1551 comm: cat ts: 670109646361 call stack: pte_alloc_one+0xfe/0x130 __pte_alloc+0x22/0x90 move_page_tables.part.0+0x994/0xa60 shift_arg_pages+0xa4/0x180 setup_arg_pages+0x286/0x2d0 load_elf_binary+0x4e1/0x18d0 bprm_execve+0x26b/0x660 do_execveat_common.isra.0+0x19d/0x220 __x64_sys_execve+0x2e/0x40 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd size: 4096 pid: 1551 tgid: 1551 comm: cat ts: 670109711801 call stack: pte_alloc_one+0xfe/0x130 __do_fault+0x52/0xc0 __handle_mm_fault+0x7d9/0xdd0 handle_mm_fault+0xc0/0x2b0 do_user_addr_fault+0x1c3/0x660 exc_page_fault+0x62/0x150 asm_exc_page_fault+0x22/0x30 ... echo "file include/asm-generic/pgalloc.h line 63 disable" > \ /sys/kernel/debug/alloc_tags.ctx Note that disabling context capture will not clear already captured context but no new context will be captured. Signed-off-by: Suren Baghdasaryan --- include/linux/alloc_tag.h | 28 ++++- include/linux/codetag.h | 3 +- lib/Kconfig.debug | 1 + lib/alloc_tag.c | 239 +++++++++++++++++++++++++++++++++++++- lib/codetag.c | 20 ++-- 5 files changed, 273 insertions(+), 18 deletions(-) diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h index b3f589afb1c9..66638cbf349a 100644 --- a/include/linux/alloc_tag.h +++ b/include/linux/alloc_tag.h @@ -16,27 +16,41 @@ * an array of these. Embedded codetag utilizes codetag framework. */ struct alloc_tag { - struct codetag ct; + struct codetag_with_ctx ctc; unsigned long last_wrap; struct raw_lazy_percpu_counter call_count; struct raw_lazy_percpu_counter bytes_allocated; } __aligned(8); +static inline struct alloc_tag *ctc_to_alloc_tag(struct codetag_with_ctx *ctc) +{ + return container_of(ctc, struct alloc_tag, ctc); +} + static inline struct alloc_tag *ct_to_alloc_tag(struct codetag *ct) { - return container_of(ct, struct alloc_tag, ct); + return container_of(ct_to_ctc(ct), struct alloc_tag, ctc); } +struct codetag_ctx *alloc_tag_create_ctx(struct alloc_tag *tag, size_t size); +void alloc_tag_free_ctx(struct codetag_ctx *ctx, struct alloc_tag **ptag); +bool alloc_tag_enable_ctx(struct alloc_tag *tag, bool enable); + #define DEFINE_ALLOC_TAG(_alloc_tag) \ static struct alloc_tag _alloc_tag __used __aligned(8) \ - __section("alloc_tags") = { .ct = CODE_TAG_INIT } + __section("alloc_tags") = { .ctc.ct = CODE_TAG_INIT } #define alloc_tag_counter_read(counter) \ __lazy_percpu_counter_read(counter) static inline void __alloc_tag_sub(union codetag_ref *ref, size_t bytes) { - struct alloc_tag *tag = ct_to_alloc_tag(ref->ct); + struct alloc_tag *tag; + + if (is_codetag_ctx_ref(ref)) + alloc_tag_free_ctx(ref->ctx, &tag); + else + tag = ct_to_alloc_tag(ref->ct); __lazy_percpu_counter_add(&tag->call_count, &tag->last_wrap, -1); __lazy_percpu_counter_add(&tag->bytes_allocated, &tag->last_wrap, -bytes); @@ -51,7 +65,11 @@ do { \ static inline void __alloc_tag_add(struct alloc_tag *tag, union codetag_ref *ref, size_t bytes) { - ref->ct = &tag->ct; + if (codetag_ctx_enabled(&tag->ctc)) + ref->ctx = alloc_tag_create_ctx(tag, bytes); + else + ref->ct = &tag->ctc.ct; + __lazy_percpu_counter_add(&tag->call_count, &tag->last_wrap, 1); __lazy_percpu_counter_add(&tag->bytes_allocated, &tag->last_wrap, bytes); } diff --git a/include/linux/codetag.h b/include/linux/codetag.h index 57736ec77b45..a10c5fcbdd20 100644 --- a/include/linux/codetag.h +++ b/include/linux/codetag.h @@ -104,7 +104,8 @@ struct codetag_with_ctx *ct_to_ctc(struct codetag *ct) } void codetag_lock_module_list(struct codetag_type *cttype, bool lock); -struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype); +void codetag_init_iter(struct codetag_iterator *iter, + struct codetag_type *cttype); struct codetag *codetag_next_ct(struct codetag_iterator *iter); struct codetag_ctx *codetag_next_ctx(struct codetag_iterator *iter); diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 08c97a978906..2790848464f1 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -977,6 +977,7 @@ config ALLOC_TAGGING bool select CODE_TAGGING select LAZY_PERCPU_COUNTER + select STACKDEPOT config PAGE_ALLOC_TAGGING bool "Enable page allocation tagging" diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c index 082fbde184ef..50d7bdc2a3c8 100644 --- a/lib/alloc_tag.c +++ b/lib/alloc_tag.c @@ -1,12 +1,75 @@ // SPDX-License-Identifier: GPL-2.0-only #include +#include #include #include #include #include +#include +#include #include +#include #include +#define STACK_BUF_SIZE 1024 + +struct alloc_call_ctx { + struct codetag_ctx ctx; + size_t size; + pid_t pid; + pid_t tgid; + char comm[TASK_COMM_LEN]; + u64 ts_nsec; + depot_stack_handle_t stack_handle; +} __aligned(8); + +static void alloc_tag_ops_free_ctx(struct kref *refcount) +{ + kfree(container_of(kref_to_ctx(refcount), struct alloc_call_ctx, ctx)); +} + +struct codetag_ctx *alloc_tag_create_ctx(struct alloc_tag *tag, size_t size) +{ + struct alloc_call_ctx *ac_ctx; + + /* TODO: use a dedicated kmem_cache */ + ac_ctx = kmalloc(sizeof(struct alloc_call_ctx), GFP_KERNEL); + if (WARN_ON(!ac_ctx)) + return NULL; + + ac_ctx->size = size; + ac_ctx->pid = current->pid; + ac_ctx->tgid = current->tgid; + strscpy(ac_ctx->comm, current->comm, sizeof(ac_ctx->comm)); + ac_ctx->ts_nsec = local_clock(); + ac_ctx->stack_handle = + stack_depot_capture_stack(GFP_NOWAIT | __GFP_NOWARN); + add_ctx(&ac_ctx->ctx, &tag->ctc); + + return &ac_ctx->ctx; +} +EXPORT_SYMBOL_GPL(alloc_tag_create_ctx); + +void alloc_tag_free_ctx(struct codetag_ctx *ctx, struct alloc_tag **ptag) +{ + *ptag = ctc_to_alloc_tag(ctx->ctc); + rem_ctx(ctx, alloc_tag_ops_free_ctx); +} +EXPORT_SYMBOL_GPL(alloc_tag_free_ctx); + +bool alloc_tag_enable_ctx(struct alloc_tag *tag, bool enable) +{ + static bool stack_depot_ready; + + if (enable && !stack_depot_ready) { + stack_depot_init(); + stack_depot_capture_init(); + stack_depot_ready = true; + } + + return codetag_enable_ctx(&tag->ctc, enable); +} + #ifdef CONFIG_DEBUG_FS struct alloc_tag_file_iterator { @@ -50,7 +113,7 @@ static int alloc_tag_file_open(struct inode *inode, struct file *file) return -ENOMEM; codetag_lock_module_list(cttype, true); - iter->ct_iter = codetag_get_ct_iter(cttype); + codetag_init_iter(&iter->ct_iter, cttype); codetag_lock_module_list(cttype, false); seq_buf_init(&iter->buf, iter->rawbuf, sizeof(iter->rawbuf)); file->private_data = iter; @@ -111,14 +174,182 @@ static const struct file_operations alloc_tag_file_ops = { .read = alloc_tag_file_read, }; +static void alloc_tag_ctx_to_text(struct seq_buf *out, struct codetag_ctx *ctx) +{ + struct alloc_call_ctx *ac_ctx; + char *buf; + + ac_ctx = container_of(ctx, struct alloc_call_ctx, ctx); + seq_buf_printf(out, " size: %zu\n", ac_ctx->size); + seq_buf_printf(out, " pid: %d\n", ac_ctx->pid); + seq_buf_printf(out, " tgid: %d\n", ac_ctx->tgid); + seq_buf_printf(out, " comm: %s\n", ac_ctx->comm); + seq_buf_printf(out, " ts: %llu\n", ac_ctx->ts_nsec); + + buf = kmalloc(STACK_BUF_SIZE, GFP_KERNEL); + if (buf) { + int bytes_read = stack_depot_snprint(ac_ctx->stack_handle, buf, + STACK_BUF_SIZE - 1, 8); + buf[bytes_read] = '\0'; + seq_buf_printf(out, " call stack:\n%s\n", buf); + } + kfree(buf); +} + +static ssize_t alloc_tag_ctx_file_read(struct file *file, char __user *ubuf, + size_t size, loff_t *ppos) +{ + struct alloc_tag_file_iterator *iter = file->private_data; + struct codetag_iterator *ct_iter = &iter->ct_iter; + struct user_buf buf = { .buf = ubuf, .size = size }; + struct codetag_ctx *ctx; + struct codetag *prev_ct; + int err = 0; + + codetag_lock_module_list(ct_iter->cttype, true); + while (1) { + err = flush_ubuf(&buf, &iter->buf); + if (err || !buf.size) + break; + + prev_ct = ct_iter->ct; + ctx = codetag_next_ctx(ct_iter); + if (!ctx) + break; + + if (prev_ct != &ctx->ctc->ct) + alloc_tag_to_text(&iter->buf, &ctx->ctc->ct); + alloc_tag_ctx_to_text(&iter->buf, ctx); + } + codetag_lock_module_list(ct_iter->cttype, false); + + return err ? : buf.ret; +} + +#define CTX_CAPTURE_TOKENS() \ + x(disable, 0) \ + x(enable, 0) + +static const char * const ctx_capture_token_strs[] = { +#define x(name, nr_args) #name, + CTX_CAPTURE_TOKENS() +#undef x + NULL +}; + +enum ctx_capture_token { +#define x(name, nr_args) TOK_##name, + CTX_CAPTURE_TOKENS() +#undef x +}; + +static int enable_ctx_capture(struct codetag_type *cttype, + struct codetag_query *query, bool enable) +{ + struct codetag_iterator ct_iter; + struct codetag_with_ctx *ctc; + struct codetag *ct; + unsigned int nfound = 0; + + codetag_lock_module_list(cttype, true); + + codetag_init_iter(&ct_iter, cttype); + while ((ct = codetag_next_ct(&ct_iter))) { + if (!codetag_matches_query(query, ct, ct_iter.cmod, NULL)) + continue; + + ctc = ct_to_ctc(ct); + if (codetag_ctx_enabled(ctc) == enable) + continue; + + if (!alloc_tag_enable_ctx(ctc_to_alloc_tag(ctc), enable)) { + pr_warn("Failed to toggle context capture\n"); + continue; + } + + nfound++; + } + + codetag_lock_module_list(cttype, false); + + return nfound ? 0 : -ENOENT; +} + +static int parse_command(struct codetag_type *cttype, char *buf) +{ + struct codetag_query query = { NULL }; + char *cmd; + int ret; + int tok; + + buf = codetag_query_parse(&query, buf); + if (IS_ERR(buf)) + return PTR_ERR(buf); + + cmd = strsep_no_empty(&buf, " \t\r\n"); + if (!cmd) + return -EINVAL; /* no command */ + + tok = match_string(ctx_capture_token_strs, + ARRAY_SIZE(ctx_capture_token_strs), cmd); + if (tok < 0) + return -EINVAL; /* unknown command */ + + ret = enable_ctx_capture(cttype, &query, tok == TOK_enable); + if (ret < 0) + return ret; + + return 0; +} + +static ssize_t alloc_tag_ctx_file_write(struct file *file, const char __user *ubuf, + size_t len, loff_t *offp) +{ + struct alloc_tag_file_iterator *iter = file->private_data; + char tmpbuf[256]; + + if (len == 0) + return 0; + /* we don't check *offp -- multiple writes() are allowed */ + if (len > sizeof(tmpbuf) - 1) + return -E2BIG; + + if (copy_from_user(tmpbuf, ubuf, len)) + return -EFAULT; + + tmpbuf[len] = '\0'; + parse_command(iter->ct_iter.cttype, tmpbuf); + + *offp += len; + return len; +} + +static const struct file_operations alloc_tag_ctx_file_ops = { + .owner = THIS_MODULE, + .open = alloc_tag_file_open, + .release = alloc_tag_file_release, + .read = alloc_tag_ctx_file_read, + .write = alloc_tag_ctx_file_write, +}; + static int dbgfs_init(struct codetag_type *cttype) { struct dentry *file; + struct dentry *ctx_file; file = debugfs_create_file("alloc_tags", 0444, NULL, cttype, &alloc_tag_file_ops); + if (IS_ERR(file)) + return PTR_ERR(file); + + ctx_file = debugfs_create_file("alloc_tags.ctx", 0666, NULL, cttype, + &alloc_tag_ctx_file_ops); + if (IS_ERR(ctx_file)) { + debugfs_remove(file); + return PTR_ERR(ctx_file); + } - return IS_ERR(file) ? PTR_ERR(file) : 0; + return 0; } #else /* CONFIG_DEBUG_FS */ @@ -129,9 +360,10 @@ static int dbgfs_init(struct codetag_type *) { return 0; } static void alloc_tag_module_unload(struct codetag_type *cttype, struct codetag_module *cmod) { - struct codetag_iterator iter = codetag_get_ct_iter(cttype); + struct codetag_iterator iter; struct codetag *ct; + codetag_init_iter(&iter, cttype); for (ct = codetag_next_ct(&iter); ct; ct = codetag_next_ct(&iter)) { struct alloc_tag *tag = ct_to_alloc_tag(ct); @@ -147,6 +379,7 @@ static int __init alloc_tag_init(void) .section = "alloc_tags", .tag_size = sizeof(struct alloc_tag), .module_unload = alloc_tag_module_unload, + .free_ctx = alloc_tag_ops_free_ctx, }; cttype = codetag_register_type(&desc); diff --git a/lib/codetag.c b/lib/codetag.c index 2762fda5c016..a936d2988c96 100644 --- a/lib/codetag.c +++ b/lib/codetag.c @@ -26,16 +26,14 @@ void codetag_lock_module_list(struct codetag_type *cttype, bool lock) up_read(&cttype->mod_lock); } -struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype) +void codetag_init_iter(struct codetag_iterator *iter, + struct codetag_type *cttype) { - struct codetag_iterator iter = { - .cttype = cttype, - .cmod = NULL, - .mod_id = 0, - .ct = NULL, - }; - - return iter; + iter->cttype = cttype; + iter->cmod = NULL; + iter->mod_id = 0; + iter->ct = NULL; + iter->ctx = NULL; } static inline struct codetag *get_first_module_ct(struct codetag_module *cmod) @@ -127,6 +125,10 @@ struct codetag_ctx *codetag_next_ctx(struct codetag_iterator *iter) lockdep_assert_held(&iter->cttype->mod_lock); + /* Move to the first codetag if search just started */ + if (!iter->ct) + codetag_next_ct(iter); + if (!ctx) return next_ctx_from_ct(iter); From patchwork Tue Aug 30 21:49:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959974 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CEF77ECAAD5 for ; Tue, 30 Aug 2022 21:53:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231920AbiH3Vx1 (ORCPT ); Tue, 30 Aug 2022 17:53:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58778 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232136AbiH3VwX (ORCPT ); Tue, 30 Aug 2022 17:52:23 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DB76391D29 for ; Tue, 30 Aug 2022 14:50:23 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id k13-20020a056902024d00b0066fa7f50b97so710673ybs.6 for ; Tue, 30 Aug 2022 14:50:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=OQ1/DMahF4uWAfLYh5vc64TkQSOW6G8gA0sp/Oew5U4=; b=E6n8RLBDvVdKTt99wS7gonDHgXGyFD48wANdOMf152JDQ5KdmSOWFLIpRW6kDDjotY x+AW587u5EK6tbc6fvifCLn56BbbsozOj5hKUEn6M41P4UEgc9OABsv5SIiQi/8oKY8R SnWivvj44LyiP0dXPWoI7mNyRDo/YzIUrD8I5hoMt+oKP1+hGTBP83QKTjsDABUbkjlp 2WTgmE2oNacaN0/b3glSakcpEMlPWkX8Cq7584LeI9wCjV+07nZN/bVKQwFVm5Mw8GwD nmAvgbFTWQSM9f9o0sF35PvZg/TjRD7dQ6ibTq+wNYKM6OcUMOKofnZUPQGzw4DuByNO 8QdA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=OQ1/DMahF4uWAfLYh5vc64TkQSOW6G8gA0sp/Oew5U4=; b=wVTAoH2Oiy2MKhurIawxegspwfaZSHqJGTTeZlPxNv1iexv9gNjGvMnITB4TOke4z+ TVohNwCgOvRcHpe6OPfHS5GUcisVsqasFzp/Gqf1YEs5nRbrrIUHjpcxWg/uSJoCnysN TvoyRB/6fFSj3mX/IiGhIkU2qjxkUh3gmg/LDXdeZPhURs4nWMs09lKhiJ5kBJgxERXS +KhM0YP+b3y81PFyONQF8UGesSmK2zk/fVww7O/6Y7eI2o7MDhRtl3La3ahEVexZKTt1 BVJU6sGDYlw3g74z/AraNCgYTL3tR+paPt7LmEvL1uPsz7PVv4yszz7N8Q21hV7HphuY R9Bg== X-Gm-Message-State: ACgBeo1+qel/g9sQ3NG604VgHfAYrTb1QPV0iGhKo1+WvAlqURSyT1EY t5hGJ12iaiVLTMilPHIQf8o1Zlcqy/s= X-Google-Smtp-Source: AA6agR6JKAe3T4R70KWhBqCZIGWOR7vlkyzPVmWfqRUYHsRdrPe/NdFlMgKrQ6XnjzPBCyOacXIPh+/j+IY= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:6985:0:b0:695:8355:f894 with SMTP id e127-20020a256985000000b006958355f894mr13667557ybc.648.1661896221989; Tue, 30 Aug 2022 14:50:21 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:11 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-23-surenb@google.com> Subject: [RFC PATCH 22/30] Code tagging based fault injection From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet This adds a new fault injection capability, based on code tagging. To use, simply insert somewhere in your code dynamic_fault("fault_class_name") and check whether it returns true - if so, inject the error. For example if (dynamic_fault("init")) return -EINVAL; There's no need to define faults elsewhere, as with include/linux/fault-injection.h. Faults show up in debugfs, under /sys/kernel/debug/dynamic_faults, and can be selected based on file/module/function/line number/class, and enabled permanently, or in oneshot mode, or with a specified frequency. Signed-off-by: Kent Overstreet --- include/asm-generic/codetag.lds.h | 3 +- include/linux/dynamic_fault.h | 79 +++++++ include/linux/slab.h | 3 +- lib/Kconfig.debug | 6 + lib/Makefile | 2 + lib/dynamic_fault.c | 372 ++++++++++++++++++++++++++++++ 6 files changed, 463 insertions(+), 2 deletions(-) create mode 100644 include/linux/dynamic_fault.h create mode 100644 lib/dynamic_fault.c diff --git a/include/asm-generic/codetag.lds.h b/include/asm-generic/codetag.lds.h index 64f536b80380..16fbf74edc3d 100644 --- a/include/asm-generic/codetag.lds.h +++ b/include/asm-generic/codetag.lds.h @@ -9,6 +9,7 @@ __stop_##_name = .; #define CODETAG_SECTIONS() \ - SECTION_WITH_BOUNDARIES(alloc_tags) + SECTION_WITH_BOUNDARIES(alloc_tags) \ + SECTION_WITH_BOUNDARIES(dynamic_fault_tags) #endif /* __ASM_GENERIC_CODETAG_LDS_H */ diff --git a/include/linux/dynamic_fault.h b/include/linux/dynamic_fault.h new file mode 100644 index 000000000000..526a33209e94 --- /dev/null +++ b/include/linux/dynamic_fault.h @@ -0,0 +1,79 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _LINUX_DYNAMIC_FAULT_H +#define _LINUX_DYNAMIC_FAULT_H + +/* + * Dynamic/code tagging fault injection: + * + * Originally based on the dynamic debug trick of putting types in a special elf + * section, then rewritten using code tagging: + * + * To use, simply insert a call to dynamic_fault("fault_class"), which will + * return true if an error should be injected. + * + * Fault injection sites may be listed and enabled via debugfs, under + * /sys/kernel/debug/dynamic_faults. + */ + +#ifdef CONFIG_CODETAG_FAULT_INJECTION + +#include +#include + +#define DFAULT_STATES() \ + x(disabled) \ + x(enabled) \ + x(oneshot) + +enum dfault_enabled { +#define x(n) DFAULT_##n, + DFAULT_STATES() +#undef x +}; + +union dfault_state { + struct { + unsigned int enabled:2; + unsigned int count:30; + }; + + struct { + unsigned int v; + }; +}; + +struct dfault { + struct codetag tag; + const char *class; + unsigned int frequency; + union dfault_state state; + struct static_key_false enabled; +}; + +bool __dynamic_fault_enabled(struct dfault *df); + +#define dynamic_fault(_class) \ +({ \ + static struct dfault \ + __used \ + __section("dynamic_fault_tags") \ + __aligned(8) df = { \ + .tag = CODE_TAG_INIT, \ + .class = _class, \ + .enabled = STATIC_KEY_FALSE_INIT, \ + }; \ + \ + static_key_false(&df.enabled.key) && \ + __dynamic_fault_enabled(&df); \ +}) + +#else + +#define dynamic_fault(_class) false + +#endif /* CODETAG_FAULT_INJECTION */ + +#define memory_fault() dynamic_fault("memory") + +#endif /* _LINUX_DYNAMIC_FAULT_H */ diff --git a/include/linux/slab.h b/include/linux/slab.h index 89273be35743..4be5a93ed15a 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -17,6 +17,7 @@ #include #include #include +#include /* @@ -468,7 +469,7 @@ static inline void slab_tag_dec(const void *ptr) {} #define krealloc_hooks(_p, _do_alloc) \ ({ \ - void *_res = _do_alloc; \ + void *_res = !memory_fault() ? _do_alloc : NULL; \ slab_tag_add(_p, _res); \ _res; \ }) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 2790848464f1..b7d03afbc808 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1982,6 +1982,12 @@ config FAULT_INJECTION_STACKTRACE_FILTER help Provide stacktrace filter for fault-injection capabilities +config CODETAG_FAULT_INJECTION + bool "Code tagging based fault injection" + select CODE_TAGGING + help + Dynamic fault injection based on code tagging + config ARCH_HAS_KCOV bool help diff --git a/lib/Makefile b/lib/Makefile index 99f732156673..489ea000c528 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -231,6 +231,8 @@ obj-$(CONFIG_CODE_TAGGING) += codetag.o obj-$(CONFIG_ALLOC_TAGGING) += alloc_tag.o obj-$(CONFIG_PAGE_ALLOC_TAGGING) += pgalloc_tag.o +obj-$(CONFIG_CODETAG_FAULT_INJECTION) += dynamic_fault.o + lib-$(CONFIG_GENERIC_BUG) += bug.o obj-$(CONFIG_HAVE_ARCH_TRACEHOOK) += syscall.o diff --git a/lib/dynamic_fault.c b/lib/dynamic_fault.c new file mode 100644 index 000000000000..4c9cd18686be --- /dev/null +++ b/lib/dynamic_fault.c @@ -0,0 +1,372 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include +#include +#include +#include +#include + +static struct codetag_type *cttype; + +bool __dynamic_fault_enabled(struct dfault *df) +{ + union dfault_state old, new; + unsigned int v = df->state.v; + bool ret; + + do { + old.v = new.v = v; + + if (new.enabled == DFAULT_disabled) + return false; + + ret = df->frequency + ? ++new.count >= df->frequency + : true; + if (ret) + new.count = 0; + if (ret && new.enabled == DFAULT_oneshot) + new.enabled = DFAULT_disabled; + } while ((v = cmpxchg(&df->state.v, old.v, new.v)) != old.v); + + if (ret) + pr_debug("returned true for %s:%u", df->tag.filename, df->tag.lineno); + + return ret; +} +EXPORT_SYMBOL(__dynamic_fault_enabled); + +static const char * const dfault_state_strs[] = { +#define x(n) #n, + DFAULT_STATES() +#undef x + NULL +}; + +static void dynamic_fault_to_text(struct seq_buf *out, struct dfault *df) +{ + codetag_to_text(out, &df->tag); + seq_buf_printf(out, "class:%s %s \"", df->class, + dfault_state_strs[df->state.enabled]); +} + +struct dfault_query { + struct codetag_query q; + + bool set_enabled:1; + unsigned int enabled:2; + + bool set_frequency:1; + unsigned int frequency; +}; + +/* + * Search the tables for _dfault's which match the given + * `query' and apply the `flags' and `mask' to them. Tells + * the user which dfault's were changed, or whether none + * were matched. + */ +static int dfault_change(struct dfault_query *query) +{ + struct codetag_iterator ct_iter; + struct codetag *ct; + unsigned int nfound = 0; + + codetag_lock_module_list(cttype, true); + codetag_init_iter(&ct_iter, cttype); + + while ((ct = codetag_next_ct(&ct_iter))) { + struct dfault *df = container_of(ct, struct dfault, tag); + + if (!codetag_matches_query(&query->q, ct, ct_iter.cmod, df->class)) + continue; + + if (query->set_enabled && + query->enabled != df->state.enabled) { + if (query->enabled != DFAULT_disabled) + static_key_slow_inc(&df->enabled.key); + else if (df->state.enabled != DFAULT_disabled) + static_key_slow_dec(&df->enabled.key); + + df->state.enabled = query->enabled; + } + + if (query->set_frequency) + df->frequency = query->frequency; + + pr_debug("changed %s:%d [%s]%s #%d %s", + df->tag.filename, df->tag.lineno, df->tag.modname, + df->tag.function, query->q.cur_index, + dfault_state_strs[df->state.enabled]); + + nfound++; + } + + pr_debug("dfault: %u matches", nfound); + + codetag_lock_module_list(cttype, false); + + return nfound ? 0 : -ENOENT; +} + +#define DFAULT_TOKENS() \ + x(disable, 0) \ + x(enable, 0) \ + x(oneshot, 0) \ + x(frequency, 1) + +enum dfault_token { +#define x(name, nr_args) TOK_##name, + DFAULT_TOKENS() +#undef x +}; + +static const char * const dfault_token_strs[] = { +#define x(name, nr_args) #name, + DFAULT_TOKENS() +#undef x + NULL +}; + +static unsigned int dfault_token_nr_args[] = { +#define x(name, nr_args) nr_args, + DFAULT_TOKENS() +#undef x +}; + +static enum dfault_token str_to_token(const char *word, unsigned int nr_words) +{ + int tok = match_string(dfault_token_strs, ARRAY_SIZE(dfault_token_strs), word); + + if (tok < 0) { + pr_debug("unknown keyword \"%s\"", word); + return tok; + } + + if (nr_words < dfault_token_nr_args[tok]) { + pr_debug("insufficient arguments to \"%s\"", word); + return -EINVAL; + } + + return tok; +} + +static int dfault_parse_command(struct dfault_query *query, + enum dfault_token tok, + char *words[], size_t nr_words) +{ + unsigned int i = 0; + int ret; + + switch (tok) { + case TOK_disable: + query->set_enabled = true; + query->enabled = DFAULT_disabled; + break; + case TOK_enable: + query->set_enabled = true; + query->enabled = DFAULT_enabled; + break; + case TOK_oneshot: + query->set_enabled = true; + query->enabled = DFAULT_oneshot; + break; + case TOK_frequency: + query->set_frequency = 1; + ret = kstrtouint(words[i++], 10, &query->frequency); + if (ret) + return ret; + + if (!query->set_enabled) { + query->set_enabled = 1; + query->enabled = DFAULT_enabled; + } + break; + } + + return i; +} + +static int dynamic_fault_store(char *buf) +{ + struct dfault_query query = { NULL }; +#define MAXWORDS 9 + char *tok, *words[MAXWORDS]; + int ret, nr_words, i = 0; + + buf = codetag_query_parse(&query.q, buf); + if (IS_ERR(buf)) + return PTR_ERR(buf); + + while ((tok = strsep_no_empty(&buf, " \t\r\n"))) { + if (nr_words == ARRAY_SIZE(words)) + return -EINVAL; /* ran out of words[] before bytes */ + words[nr_words++] = tok; + } + + while (i < nr_words) { + const char *tok_str = words[i++]; + enum dfault_token tok = str_to_token(tok_str, nr_words - i); + + if (tok < 0) + return tok; + + ret = dfault_parse_command(&query, tok, words + i, nr_words - i); + if (ret < 0) + return ret; + + i += ret; + BUG_ON(i > nr_words); + } + + pr_debug("q->function=\"%s\" q->filename=\"%s\" " + "q->module=\"%s\" q->line=%u-%u\n q->index=%u-%u", + query.q.function, query.q.filename, query.q.module, + query.q.first_line, query.q.last_line, + query.q.first_index, query.q.last_index); + + ret = dfault_change(&query); + if (ret < 0) + return ret; + + return 0; +} + +struct dfault_iter { + struct codetag_iterator ct_iter; + + struct seq_buf buf; + char rawbuf[4096]; +}; + +static int dfault_open(struct inode *inode, struct file *file) +{ + struct dfault_iter *iter; + + iter = kzalloc(sizeof(*iter), GFP_KERNEL); + if (!iter) + return -ENOMEM; + + codetag_lock_module_list(cttype, true); + codetag_init_iter(&iter->ct_iter, cttype); + codetag_lock_module_list(cttype, false); + + file->private_data = iter; + seq_buf_init(&iter->buf, iter->rawbuf, sizeof(iter->rawbuf)); + return 0; +} + +static int dfault_release(struct inode *inode, struct file *file) +{ + struct dfault_iter *iter = file->private_data; + + kfree(iter); + return 0; +} + +struct user_buf { + char __user *buf; /* destination user buffer */ + size_t size; /* size of requested read */ + ssize_t ret; /* bytes read so far */ +}; + +static int flush_ubuf(struct user_buf *dst, struct seq_buf *src) +{ + if (src->len) { + size_t bytes = min_t(size_t, src->len, dst->size); + int err = copy_to_user(dst->buf, src->buffer, bytes); + + if (err) + return err; + + dst->ret += bytes; + dst->buf += bytes; + dst->size -= bytes; + src->len -= bytes; + memmove(src->buffer, src->buffer + bytes, src->len); + } + + return 0; +} + +static ssize_t dfault_read(struct file *file, char __user *ubuf, + size_t size, loff_t *ppos) +{ + struct dfault_iter *iter = file->private_data; + struct user_buf buf = { .buf = ubuf, .size = size }; + struct codetag *ct; + struct dfault *df; + int err; + + codetag_lock_module_list(iter->ct_iter.cttype, true); + while (1) { + err = flush_ubuf(&buf, &iter->buf); + if (err || !buf.size) + break; + + ct = codetag_next_ct(&iter->ct_iter); + if (!ct) + break; + + df = container_of(ct, struct dfault, tag); + dynamic_fault_to_text(&iter->buf, df); + seq_buf_putc(&iter->buf, '\n'); + } + codetag_lock_module_list(iter->ct_iter.cttype, false); + + return err ?: buf.ret; +} + +/* + * File_ops->write method for /dynamic_fault/conrol. Gathers the + * command text from userspace, parses and executes it. + */ +static ssize_t dfault_write(struct file *file, const char __user *ubuf, + size_t len, loff_t *offp) +{ + char tmpbuf[256]; + + if (len == 0) + return 0; + /* we don't check *offp -- multiple writes() are allowed */ + if (len > sizeof(tmpbuf)-1) + return -E2BIG; + if (copy_from_user(tmpbuf, ubuf, len)) + return -EFAULT; + tmpbuf[len] = '\0'; + pr_debug("read %zu bytes from userspace", len); + + dynamic_fault_store(tmpbuf); + + *offp += len; + return len; +} + +static const struct file_operations dfault_ops = { + .owner = THIS_MODULE, + .open = dfault_open, + .release = dfault_release, + .read = dfault_read, + .write = dfault_write +}; + +static int __init dynamic_fault_init(void) +{ + const struct codetag_type_desc desc = { + .section = "dynamic_fault_tags", + .tag_size = sizeof(struct dfault), + }; + struct dentry *debugfs_file; + + cttype = codetag_register_type(&desc); + if (IS_ERR_OR_NULL(cttype)) + return PTR_ERR(cttype); + + debugfs_file = debugfs_create_file("dynamic_faults", 0666, NULL, NULL, &dfault_ops); + if (IS_ERR(debugfs_file)) + return PTR_ERR(debugfs_file); + + return 0; +} +module_init(dynamic_fault_init); From patchwork Tue Aug 30 21:49:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959975 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 962ABC65C0D for ; Tue, 30 Aug 2022 21:53:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232083AbiH3Vx3 (ORCPT ); Tue, 30 Aug 2022 17:53:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229778AbiH3Vwe (ORCPT ); Tue, 30 Aug 2022 17:52:34 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A500B923F8 for ; Tue, 30 Aug 2022 14:50:26 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id s15-20020a5b044f000000b00680c4eb89f1so711974ybp.7 for ; Tue, 30 Aug 2022 14:50:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=tbrDSnAu2qp291q6FJVI5HAkVfDwcm9jT2uRBxykBxo=; b=kiiww/u+gNRUQ3btvqFql15o+OQdqGRa8iL5JT7J5S/FJUftRrKykzaiX80qCn3gMA N7fu/FadrPTfEuXGUmENdVvQNfb5Lc4hc7j+KvIJFmTYxMub96J2Dq6Qz7qLRmUGLg7U /uSKboJAjqTu07ETlC77WQsJyUV6n0Enhp4LQHtPzmOHIYGgLaodwOejnzgOW/A+KcGj gwCBB1i6MWj4k2p8na3VWnkGj14/Yz4lw5Y6xUMEQ+oozkWQlSuhUKUMfndEmuXu4exw 3VsqAU7Bes2ONZTxN75FIe9AuWEsm6slCzF5WV8jRwyOqmL1j24ghFNHmneopZkYIiwV aIGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=tbrDSnAu2qp291q6FJVI5HAkVfDwcm9jT2uRBxykBxo=; b=e0JO1D0zOWDVpt50wCLke8vX0b9BTRMkAY6VQv85tAe73rBJiXsP77/kQQr/6H7g00 ASVjDe5AMpoB5vl6Yddj+g04Tn+oDrA9XEUFk1eZydMODV1fhuI9pdQBhOBUs7X7Py+9 PTz3Hu0mdgsM1n3UE3ukzWW7df6Pv1axCGgUpeOBgUKREnZJhhwCzckuCCXiwpEwLqpp CyTqs2b0KXeP3piNrojKuBSxiILFNoOV44rRKkr1hiI3TE4nkK7+aYmMBEsjlpc8FsL7 R49QXrY48VDCfk+vPivaKusupan69oVWddFJSvJgagAIkAgeE1IBc1c76nGJSXwVFddS yDLw== X-Gm-Message-State: ACgBeo3rYml1ACawouyEf5ztaZYqZ2D3KwlBPZjOMo9lu9GeQV2XD9uE 3vVtOwwUmp4Tu5JKH/dDP6y83Z+6nf0= X-Google-Smtp-Source: AA6agR68X2SkqWShs2xXg87H6kYkH/5wxhQgvNByd1kyOXdUxqgguxdVIPa42ox8e5jz4qITJ7VUdF/uTQY= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a81:13d7:0:b0:324:7dcb:8d26 with SMTP id 206-20020a8113d7000000b003247dcb8d26mr16712805ywt.452.1661896224365; Tue, 30 Aug 2022 14:50:24 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:12 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-24-surenb@google.com> Subject: [RFC PATCH 23/30] timekeeping: Add a missing include From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet We need ktime.h for ktime_t. Signed-off-by: Kent Overstreet --- include/linux/timekeeping.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h index fe1e467ba046..7c43e98cf211 100644 --- a/include/linux/timekeeping.h +++ b/include/linux/timekeeping.h @@ -4,6 +4,7 @@ #include #include +#include /* Included from linux/ktime.h */ From patchwork Tue Aug 30 21:49:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959978 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A598C65C0D for ; Tue, 30 Aug 2022 21:53:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231964AbiH3Vxr (ORCPT ); Tue, 30 Aug 2022 17:53:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231648AbiH3Vwr (ORCPT ); Tue, 30 Aug 2022 17:52:47 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32A0B94139 for ; Tue, 30 Aug 2022 14:50:28 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-340862314d9so173452597b3.3 for ; Tue, 30 Aug 2022 14:50:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=PJDqdE16VQFkhBtAZQdE7ryguEFDkMx7mMptHwdJxIQ=; b=Ij75nFrUrPvJu2bumKHw5qB/wQ6gVxiA7JiR459fjyWwf8bAM7esU+z26b2Gs2wm+2 GJco7Z59pIbk7lAYzIUjBLlbgij5WieJsWdNVz1juJkVh+sa9UWATRgrEBqBGRWkiIgS e9fE2xkYV7tYVniZBisXLNcLv1T2Qe2o0c8hnbKmgjEZ4CuxPiv3aHRBgJXL1TW2FAU/ xTP0HeMc1J5XCTMzbl6MriSnvFIhWbWVUtAJ/9V96DnOp2vk9Hc7k6iFLgNru+svATI5 qjP0FOpFl9sx7TmiI9c+PBZgc2FqjExNp+v4DIbowFGQu7f8+ZLj1VsmDoNwIg4tHpwc B9Cg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=PJDqdE16VQFkhBtAZQdE7ryguEFDkMx7mMptHwdJxIQ=; b=k2GpLo77wOFAmx2GK+UTV47AZJ3SeSpJT7w40jU4mGQRZVPD+5UKHZFg5nfCAt1kys AcnYzFWC5qLzCxvVUpJkccgCA4ZxCRMRINwVOZGWnfKa5m28W3cbowiKj6SubCuTmSTZ VqjatD8ytUvEm4jBnivYByqJ2vIhtrpdXnbGLra8rG78IZT7EY4z8N5eC0JYNQGa43UI TTwrk1RAiT1q9+NVuUr3O9ZTIQiNlMqCjZlJloAcA1GMmT/cuMByRvR4xXKBaMvF1fzS hcet88mUTXO4WvMlSb6W6UzDDedBA4do/UXm3fpBpH7i5tkJO3pusnmLLz9IqwC8WAZx oRmg== X-Gm-Message-State: ACgBeo20mnCo9F3xunlZNN80V9QQ7k/iDNv6lobMhwvkAwZoh5ftjrVR 8ZUZFyfwa4ewUaOuGIUXGpvUK5fuZmc= X-Google-Smtp-Source: AA6agR6avTM18XP2y4CdHSCV3FE6n/efM5TdzJAiOApCx+OJvVOl1HBVpl97MaszhlrgGlpdRen0jumUZ8U= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a81:54c4:0:b0:329:d0e1:cfcf with SMTP id i187-20020a8154c4000000b00329d0e1cfcfmr15741208ywb.451.1661896227243; Tue, 30 Aug 2022 14:50:27 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:13 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-25-surenb@google.com> Subject: [RFC PATCH 24/30] wait: Clean up waitqueue_entry initialization From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, Ingo Molnar Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet Cleanup for code tagging latency tracking: Add an initializer, WAIT_FUNC_INITIALIZER(), to be used by initializers for structs that include wait_queue_entries. Also, change init_wait(), init_wait_entry etc. to be a wrapper around the new __init_waitqueue_entry(); more de-duplication prep work. Signed-off-by: Kent Overstreet Cc: Ingo Molnar Cc: Peter Zijlstra --- include/linux/sbitmap.h | 6 +---- include/linux/wait.h | 52 +++++++++++++++++++--------------------- include/linux/wait_bit.h | 7 +----- kernel/sched/wait.c | 9 ------- 4 files changed, 27 insertions(+), 47 deletions(-) diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h index 8f5a86e210b9..f696c29d9ab3 100644 --- a/include/linux/sbitmap.h +++ b/include/linux/sbitmap.h @@ -596,11 +596,7 @@ struct sbq_wait { #define DEFINE_SBQ_WAIT(name) \ struct sbq_wait name = { \ .sbq = NULL, \ - .wait = { \ - .private = current, \ - .func = autoremove_wake_function, \ - .entry = LIST_HEAD_INIT((name).wait.entry), \ - } \ + .wait = WAIT_FUNC_INITIALIZER((name).wait, autoremove_wake_function),\ } /* diff --git a/include/linux/wait.h b/include/linux/wait.h index 58cfbf81447c..91ced6a118bc 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -79,21 +79,38 @@ extern void __init_waitqueue_head(struct wait_queue_head *wq_head, const char *n # define DECLARE_WAIT_QUEUE_HEAD_ONSTACK(name) DECLARE_WAIT_QUEUE_HEAD(name) #endif -static inline void init_waitqueue_entry(struct wait_queue_entry *wq_entry, struct task_struct *p) -{ - wq_entry->flags = 0; - wq_entry->private = p; - wq_entry->func = default_wake_function; +#define WAIT_FUNC_INITIALIZER(name, function) { \ + .private = current, \ + .func = function, \ + .entry = LIST_HEAD_INIT((name).entry), \ } +#define DEFINE_WAIT_FUNC(name, function) \ + struct wait_queue_entry name = WAIT_FUNC_INITIALIZER(name, function) + +#define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function) + static inline void -init_waitqueue_func_entry(struct wait_queue_entry *wq_entry, wait_queue_func_t func) +__init_waitqueue_entry(struct wait_queue_entry *wq_entry, unsigned int flags, + void *private, wait_queue_func_t func) { - wq_entry->flags = 0; - wq_entry->private = NULL; + wq_entry->flags = flags; + wq_entry->private = private; wq_entry->func = func; + INIT_LIST_HEAD(&wq_entry->entry); } +#define init_waitqueue_func_entry(_wq_entry, _func) \ + __init_waitqueue_entry(_wq_entry, 0, NULL, _func) + +#define init_waitqueue_entry(_wq_entry, _task) \ + __init_waitqueue_entry(_wq_entry, 0, _task, default_wake_function) + +#define init_wait_entry(_wq_entry, _flags) \ + __init_waitqueue_entry(_wq_entry, _flags, current, autoremove_wake_function) + +#define init_wait(wait) init_wait_entry(wait, 0) + /** * waitqueue_active -- locklessly test for waiters on the queue * @wq_head: the waitqueue to test for waiters @@ -283,8 +300,6 @@ static inline void wake_up_pollfree(struct wait_queue_head *wq_head) (!__builtin_constant_p(state) || \ state == TASK_INTERRUPTIBLE || state == TASK_KILLABLE) \ -extern void init_wait_entry(struct wait_queue_entry *wq_entry, int flags); - /* * The below macro ___wait_event() has an explicit shadow of the __ret * variable when used from the wait_event_*() macros. @@ -1170,23 +1185,6 @@ long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout); int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key); int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key); -#define DEFINE_WAIT_FUNC(name, function) \ - struct wait_queue_entry name = { \ - .private = current, \ - .func = function, \ - .entry = LIST_HEAD_INIT((name).entry), \ - } - -#define DEFINE_WAIT(name) DEFINE_WAIT_FUNC(name, autoremove_wake_function) - -#define init_wait(wait) \ - do { \ - (wait)->private = current; \ - (wait)->func = autoremove_wake_function; \ - INIT_LIST_HEAD(&(wait)->entry); \ - (wait)->flags = 0; \ - } while (0) - typedef int (*task_call_f)(struct task_struct *p, void *arg); extern int task_call_func(struct task_struct *p, task_call_f func, void *arg); diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h index 7725b7579b78..267ca0fe9fd9 100644 --- a/include/linux/wait_bit.h +++ b/include/linux/wait_bit.h @@ -38,12 +38,7 @@ int wake_bit_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync #define DEFINE_WAIT_BIT(name, word, bit) \ struct wait_bit_queue_entry name = { \ .key = __WAIT_BIT_KEY_INITIALIZER(word, bit), \ - .wq_entry = { \ - .private = current, \ - .func = wake_bit_function, \ - .entry = \ - LIST_HEAD_INIT((name).wq_entry.entry), \ - }, \ + .wq_entry = WAIT_FUNC_INITIALIZER((name).wq_entry, wake_bit_function),\ } extern int bit_wait(struct wait_bit_key *key, int mode); diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c index 9860bb9a847c..b9922346077d 100644 --- a/kernel/sched/wait.c +++ b/kernel/sched/wait.c @@ -289,15 +289,6 @@ prepare_to_wait_exclusive(struct wait_queue_head *wq_head, struct wait_queue_ent } EXPORT_SYMBOL(prepare_to_wait_exclusive); -void init_wait_entry(struct wait_queue_entry *wq_entry, int flags) -{ - wq_entry->flags = flags; - wq_entry->private = current; - wq_entry->func = autoremove_wake_function; - INIT_LIST_HEAD(&wq_entry->entry); -} -EXPORT_SYMBOL(init_wait_entry); - long prepare_to_wait_event(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state) { unsigned long flags; From patchwork Tue Aug 30 21:49:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959976 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4444DC0502A for ; Tue, 30 Aug 2022 21:53:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231933AbiH3Vxp (ORCPT ); Tue, 30 Aug 2022 17:53:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55732 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231941AbiH3Vwv (ORCPT ); Tue, 30 Aug 2022 17:52:51 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C517295E4B for ; Tue, 30 Aug 2022 14:50:31 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id x27-20020a25ac9b000000b0069140cfbbd9so714276ybi.8 for ; Tue, 30 Aug 2022 14:50:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=QqkrsGdR9RIQpztjH+HKE2iaQK76d4alLlYGvP5mtuk=; b=EmELHff//JPLhVllZqIpmS0Yi/n4CYn1PpFo5D7aAgY/mGxsMtS6XD/+ioRS2qZY8C no10ngIVmFAT46GJYUJjGDNqdY92DoHLQ9QbhAJVsBRmHbrR9QbwqsDn7oDd7mF57S+J MK/ad/Fah7kdPasA545qk8VUbJu+zxO3x46X50P0T+NbpE7jyYjWPsZQ5vgUXGysvUe/ W1W5uPUmm7KhCwcWdijNNG0C+OJlqBBAZwgyEyHf6FDcQet7W8l2Nexpz36zNi+sAkKv CMElZPZokU2RO03oq7FRoEcd/C8olIjRx4EsriXjGCc7C59ySMMLL8R3CCw5o8VWV3mF SxkQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=QqkrsGdR9RIQpztjH+HKE2iaQK76d4alLlYGvP5mtuk=; b=12lBXjolFivAGO0mdePaWgxLk1SSrfkbaFzFOjOU6qj+sqHtwx7gtrbXd9KJF565kg MOsBGDLj9Y4zZZBuzkaN2bO8v7cANWI2qC2d3HxiU+s1oHKujxgTZxfrtou1/OG1tZru ekzXC0R7YA1n20QVkcb1E2z81xHSDd6hxmPHPEOQM0PMx+uaftPZXQQf0UBsIdVBxhlo Pxn2c4tgKJH04tiSOtwnBheTxn15wtN9E1lQ0MEXPzg5OWon/rcfLSCL83fekHkSvw7G oKLCtbb5I+LRZebSYGJ8FdH/VSlvUBr4KTWEBNfTkrjBlanFk1yLKB8RGzz5Xec8SsPA 703Q== X-Gm-Message-State: ACgBeo3QYK9fN8O/82WPIq0GLPvQG1b8N56itFKsUGQ+2PWvp0WRfZfY Q7tr+tpnsmxQ1PzM50TfvQ9xlnbMzy4= X-Google-Smtp-Source: AA6agR5ND+Bk6d/UBSONHhQ297Nqdm//CwhSmMh2PEZtOl62aMkc4xfTQyNu/uBDVWhdtd26Z8NBy2Q9NdY= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:4246:0:b0:699:186f:76ca with SMTP id p67-20020a254246000000b00699186f76camr13282039yba.272.1661896229901; Tue, 30 Aug 2022 14:50:29 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:14 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-26-surenb@google.com> Subject: [RFC PATCH 25/30] lib/time_stats: New library for statistics on events From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet This adds a small new library for tracking statistics on events that have a duration, i.e. a start and end time. - number of events - rate/frequency - average duration - max duration - duration quantiles This code comes from bcachefs, and originally bcache: the next patch will be converting bcache to use this version, and a subsequent patch will be using code_tagging to instrument all wait_event() calls in the kernel. Signed-off-by: Kent Overstreet --- include/linux/time_stats.h | 44 +++++++ lib/Kconfig | 3 + lib/Makefile | 1 + lib/time_stats.c | 236 +++++++++++++++++++++++++++++++++++++ 4 files changed, 284 insertions(+) create mode 100644 include/linux/time_stats.h create mode 100644 lib/time_stats.c diff --git a/include/linux/time_stats.h b/include/linux/time_stats.h new file mode 100644 index 000000000000..7ae929e6f836 --- /dev/null +++ b/include/linux/time_stats.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef _LINUX_TIMESTATS_H +#define _LINUX_TIMESTATS_H + +#include +#include + +#define NR_QUANTILES 15 + +struct quantiles { + struct quantile_entry { + u64 m; + u64 step; + } entries[NR_QUANTILES]; +}; + +struct time_stat_buffer { + unsigned int nr; + struct time_stat_buffer_entry { + u64 start; + u64 end; + } entries[32]; +}; + +struct time_stats { + spinlock_t lock; + u64 count; + /* all fields are in nanoseconds */ + u64 average_duration; + u64 average_frequency; + u64 max_duration; + u64 last_event; + struct quantiles quantiles; + + struct time_stat_buffer __percpu *buffer; +}; + +struct seq_buf; +void time_stats_update(struct time_stats *stats, u64 start); +void time_stats_to_text(struct seq_buf *out, struct time_stats *stats); +void time_stats_exit(struct time_stats *stats); + +#endif /* _LINUX_TIMESTATS_H */ diff --git a/lib/Kconfig b/lib/Kconfig index fc6dbc425728..884fd9f2f06d 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -744,3 +744,6 @@ config ASN1_ENCODER config POLYNOMIAL tristate + +config TIME_STATS + bool diff --git a/lib/Makefile b/lib/Makefile index 489ea000c528..e54392011f5e 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -232,6 +232,7 @@ obj-$(CONFIG_ALLOC_TAGGING) += alloc_tag.o obj-$(CONFIG_PAGE_ALLOC_TAGGING) += pgalloc_tag.o obj-$(CONFIG_CODETAG_FAULT_INJECTION) += dynamic_fault.o +obj-$(CONFIG_TIME_STATS) += time_stats.o lib-$(CONFIG_GENERIC_BUG) += bug.o diff --git a/lib/time_stats.c b/lib/time_stats.c new file mode 100644 index 000000000000..30362364fdd2 --- /dev/null +++ b/lib/time_stats.c @@ -0,0 +1,236 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static inline unsigned int eytzinger1_child(unsigned int i, unsigned int child) +{ + return (i << 1) + child; +} + +static inline unsigned int eytzinger1_right_child(unsigned int i) +{ + return eytzinger1_child(i, 1); +} + +static inline unsigned int eytzinger1_next(unsigned int i, unsigned int size) +{ + if (eytzinger1_right_child(i) <= size) { + i = eytzinger1_right_child(i); + + i <<= __fls(size + 1) - __fls(i); + i >>= i > size; + } else { + i >>= ffz(i) + 1; + } + + return i; +} + +static inline unsigned int eytzinger0_child(unsigned int i, unsigned int child) +{ + return (i << 1) + 1 + child; +} + +static inline unsigned int eytzinger0_first(unsigned int size) +{ + return rounddown_pow_of_two(size) - 1; +} + +static inline unsigned int eytzinger0_next(unsigned int i, unsigned int size) +{ + return eytzinger1_next(i + 1, size) - 1; +} + +#define eytzinger0_for_each(_i, _size) \ + for ((_i) = eytzinger0_first((_size)); \ + (_i) != -1; \ + (_i) = eytzinger0_next((_i), (_size))) + +#define ewma_add(ewma, val, weight) \ +({ \ + typeof(ewma) _ewma = (ewma); \ + typeof(weight) _weight = (weight); \ + \ + (((_ewma << _weight) - _ewma) + (val)) >> _weight; \ +}) + +static void quantiles_update(struct quantiles *q, u64 v) +{ + unsigned int i = 0; + + while (i < ARRAY_SIZE(q->entries)) { + struct quantile_entry *e = q->entries + i; + + if (unlikely(!e->step)) { + e->m = v; + e->step = max_t(unsigned int, v / 2, 1024); + } else if (e->m > v) { + e->m = e->m >= e->step + ? e->m - e->step + : 0; + } else if (e->m < v) { + e->m = e->m + e->step > e->m + ? e->m + e->step + : U32_MAX; + } + + if ((e->m > v ? e->m - v : v - e->m) < e->step) + e->step = max_t(unsigned int, e->step / 2, 1); + + if (v >= e->m) + break; + + i = eytzinger0_child(i, v > e->m); + } +} + +static void time_stats_update_one(struct time_stats *stats, + u64 start, u64 end) +{ + u64 duration, freq; + + duration = time_after64(end, start) + ? end - start : 0; + freq = time_after64(end, stats->last_event) + ? end - stats->last_event : 0; + + stats->count++; + + stats->average_duration = stats->average_duration + ? ewma_add(stats->average_duration, duration, 6) + : duration; + + stats->average_frequency = stats->average_frequency + ? ewma_add(stats->average_frequency, freq, 6) + : freq; + + stats->max_duration = max(stats->max_duration, duration); + + stats->last_event = end; + + quantiles_update(&stats->quantiles, duration); +} + +void time_stats_update(struct time_stats *stats, u64 start) +{ + u64 end = ktime_get_ns(); + unsigned long flags; + + if (!stats->buffer) { + spin_lock_irqsave(&stats->lock, flags); + time_stats_update_one(stats, start, end); + + if (stats->average_frequency < 32 && + stats->count > 1024) + stats->buffer = + alloc_percpu_gfp(struct time_stat_buffer, + GFP_ATOMIC); + spin_unlock_irqrestore(&stats->lock, flags); + } else { + struct time_stat_buffer_entry *i; + struct time_stat_buffer *b; + + preempt_disable(); + b = this_cpu_ptr(stats->buffer); + + BUG_ON(b->nr >= ARRAY_SIZE(b->entries)); + b->entries[b->nr++] = (struct time_stat_buffer_entry) { + .start = start, + .end = end + }; + + if (b->nr == ARRAY_SIZE(b->entries)) { + spin_lock_irqsave(&stats->lock, flags); + for (i = b->entries; + i < b->entries + ARRAY_SIZE(b->entries); + i++) + time_stats_update_one(stats, i->start, i->end); + spin_unlock_irqrestore(&stats->lock, flags); + + b->nr = 0; + } + + preempt_enable(); + } +} +EXPORT_SYMBOL(time_stats_update); + +static const struct time_unit { + const char *name; + u32 nsecs; +} time_units[] = { + { "ns", 1 }, + { "us", NSEC_PER_USEC }, + { "ms", NSEC_PER_MSEC }, + { "sec", NSEC_PER_SEC }, +}; + +static const struct time_unit *pick_time_units(u64 ns) +{ + const struct time_unit *u; + + for (u = time_units; + u + 1 < time_units + ARRAY_SIZE(time_units) && + ns >= u[1].nsecs << 1; + u++) + ; + + return u; +} + +static void pr_time_units(struct seq_buf *out, u64 ns) +{ + const struct time_unit *u = pick_time_units(ns); + + seq_buf_printf(out, "%llu %s", div_u64(ns, u->nsecs), u->name); +} + +void time_stats_to_text(struct seq_buf *out, struct time_stats *stats) +{ + const struct time_unit *u; + u64 freq = READ_ONCE(stats->average_frequency); + u64 q, last_q = 0; + int i; + + seq_buf_printf(out, "count: %llu\n", stats->count); + seq_buf_printf(out, "rate: %llu/sec\n", + freq ? div64_u64(NSEC_PER_SEC, freq) : 0); + seq_buf_printf(out, "frequency: "); + pr_time_units(out, freq); + seq_buf_putc(out, '\n'); + + seq_buf_printf(out, "avg duration: "); + pr_time_units(out, stats->average_duration); + seq_buf_putc(out, '\n'); + + seq_buf_printf(out, "max duration: "); + pr_time_units(out, stats->max_duration); + seq_buf_putc(out, '\n'); + + i = eytzinger0_first(NR_QUANTILES); + u = pick_time_units(stats->quantiles.entries[i].m); + seq_buf_printf(out, "quantiles (%s): ", u->name); + eytzinger0_for_each(i, NR_QUANTILES) { + q = max(stats->quantiles.entries[i].m, last_q); + seq_buf_printf(out, "%llu ", div_u64(q, u->nsecs)); + last_q = q; + } + + seq_buf_putc(out, '\n'); +} +EXPORT_SYMBOL_GPL(time_stats_to_text); + +void time_stats_exit(struct time_stats *stats) +{ + free_percpu(stats->buffer); + stats->buffer = NULL; +} +EXPORT_SYMBOL_GPL(time_stats_exit); From patchwork Tue Aug 30 21:49:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959977 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28395C64991 for ; Tue, 30 Aug 2022 21:53:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231959AbiH3Vxq (ORCPT ); Tue, 30 Aug 2022 17:53:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232153AbiH3VxI (ORCPT ); Tue, 30 Aug 2022 17:53:08 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1274590C54 for ; Tue, 30 Aug 2022 14:50:34 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id i194-20020a253bcb000000b00676d86fc5d7so709737yba.9 for ; Tue, 30 Aug 2022 14:50:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=q8clZUHCVXNR9yzPtPwGAp/8c18i/nqvvNYb9V5EY+0=; b=nwraOOA8rbsDfrvp5gBz8rYLe0NDJnSuxfKXWTc3i5fQmEeQNcPETe8KaY9Mt9/+1r ov+Vf6KkAdui0Rmw1+5xSKYNrSBv2kWObfOZUmmiPaKz9kQgQUwPGyNfu/DYb4bPK42K 36D96qfHlnHXTqqxaW28I0UFsAs+LSQcoibyPA3rdTyOgd8YqbSgEUdft4zOlN2gaNbw dVh3tFK6UzGRc+MjeMOQD3ATSE//iPHaMa+xktBjpyqEQSekSHvIyA0BP3Rq7kUAJkb0 tRcC/6q0Z5Rvm1vf/73MAHgozRrLOYJlaS49+Ebjg2YexCVtD/Vs4zwcF2pG9WHANRAT ZYUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=q8clZUHCVXNR9yzPtPwGAp/8c18i/nqvvNYb9V5EY+0=; b=FX6Pf1RGf4iOT/+k4eMQ1GTNDnw9V68Hs37D/Exw01GuorIci3EXe/2fHwPLC74Qi8 M+2iZcKEeVCPQLZXc0/heyVWuXD1XpKye0+6jP8BXFSjMtpE22SJeOVz8AGxLVsctsY+ LDETQHdqRf4xVZafE6zfnFu6pmGtyXWJaDrKQyA/R2vZ5wj1FYJ8VeNnb40wtlrJ8+2N 9j+i9kS9KCiZTThRtKg7taKVG07LdbkxPUnqR6M8cirLhjjOXHkSevKx1l/SCahFd0qA tXEXC4f+h1P44GFnAUbSxr/X9toX1i4wrez3WWkS4ocsSTmrHPQ2yV6m6gp2Cl3eNFu2 UOtA== X-Gm-Message-State: ACgBeo1PdKx+FZmmQzg4TvfJ9FWgndAW0Ux+9V4Mu5oo/MmEU57rF8wy nO9lE54/iG3u0fsg4zURBycGvLG635k= X-Google-Smtp-Source: AA6agR6QQvnvy8mDyUDzzIm8p3bCu8QCjthbfoOD2oTtJsYiGZpiHSGB1TzI0H74pXNHM3rxRxKb4sF+hq0= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a5b:2ce:0:b0:67a:6054:9eb0 with SMTP id h14-20020a5b02ce000000b0067a60549eb0mr13092972ybp.15.1661896232608; Tue, 30 Aug 2022 14:50:32 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:15 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-27-surenb@google.com> Subject: [RFC PATCH 26/30] bcache: Convert to lib/time_stats From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, Coly Li Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet This patch converts bcache to the new generic time_stats code lib/time_stats.c. The new code is from bcachefs, and has some changes from the version in bcache: - we now use ktime_get_ns(), not local_clock(). When the code was originally written multi processor systems that lacked synchronized TSCs were still common, and so local_clock() was much cheaper than sched_clock() (though not necessarily fully accurate, due to TSC drift). ktime_get_ns() should be cheap enough on all common hardware now, and more standard/correct. - time_stats are now exported in a single file in sysfs, which means we can improve the statistics we keep track of without changing all users. This also means we don't have to manually specify which units (ms, us, ns) a given time_stats should be printed in; that's handled dynamically. - There's a lazily-allocated percpu buffer, which now needs to be freed with time_stats_exit(). Signed-off-by: Kent Overstreet Cc: Coly Li --- drivers/md/bcache/Kconfig | 1 + drivers/md/bcache/bcache.h | 1 + drivers/md/bcache/bset.c | 8 +++--- drivers/md/bcache/bset.h | 1 + drivers/md/bcache/btree.c | 12 ++++---- drivers/md/bcache/super.c | 3 ++ drivers/md/bcache/sysfs.c | 43 ++++++++++++++++++++-------- drivers/md/bcache/util.c | 30 -------------------- drivers/md/bcache/util.h | 57 -------------------------------------- 9 files changed, 47 insertions(+), 109 deletions(-) diff --git a/drivers/md/bcache/Kconfig b/drivers/md/bcache/Kconfig index 529c9d04e9a4..8d165052e508 100644 --- a/drivers/md/bcache/Kconfig +++ b/drivers/md/bcache/Kconfig @@ -4,6 +4,7 @@ config BCACHE tristate "Block device as cache" select BLOCK_HOLDER_DEPRECATED if SYSFS select CRC64 + select TIME_STATS help Allows a block device to be used as cache for other devices; uses a btree for indexing and the layout is optimized for SSDs. diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index 2acda9cea0f9..5100010a3897 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -185,6 +185,7 @@ #include #include #include +#include #include #include #include diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c index 94d38e8a59b3..727e9b7aead4 100644 --- a/drivers/md/bcache/bset.c +++ b/drivers/md/bcache/bset.c @@ -1251,7 +1251,7 @@ static void __btree_sort(struct btree_keys *b, struct btree_iter *iter, order = state->page_order; } - start_time = local_clock(); + start_time = ktime_get_ns(); btree_mergesort(b, out, iter, fixup, false); b->nsets = start; @@ -1286,7 +1286,7 @@ static void __btree_sort(struct btree_keys *b, struct btree_iter *iter, bch_bset_build_written_tree(b); if (!start) - bch_time_stats_update(&state->time, start_time); + time_stats_update(&state->time, start_time); } void bch_btree_sort_partial(struct btree_keys *b, unsigned int start, @@ -1322,14 +1322,14 @@ void bch_btree_sort_and_fix_extents(struct btree_keys *b, void bch_btree_sort_into(struct btree_keys *b, struct btree_keys *new, struct bset_sort_state *state) { - uint64_t start_time = local_clock(); + uint64_t start_time = ktime_get_ns(); struct btree_iter iter; bch_btree_iter_init(b, &iter, NULL); btree_mergesort(b, new->set->data, &iter, false, true); - bch_time_stats_update(&state->time, start_time); + time_stats_update(&state->time, start_time); new->set->size = 0; // XXX: why? } diff --git a/drivers/md/bcache/bset.h b/drivers/md/bcache/bset.h index d795c84246b0..13e524ad7783 100644 --- a/drivers/md/bcache/bset.h +++ b/drivers/md/bcache/bset.h @@ -3,6 +3,7 @@ #define _BCACHE_BSET_H #include +#include #include #include "bcache_ondisk.h" diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 147c493a989a..abf543bc7551 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -242,7 +242,7 @@ static void btree_node_read_endio(struct bio *bio) static void bch_btree_node_read(struct btree *b) { - uint64_t start_time = local_clock(); + uint64_t start_time = ktime_get_ns(); struct closure cl; struct bio *bio; @@ -270,7 +270,7 @@ static void bch_btree_node_read(struct btree *b) goto err; bch_btree_node_read_done(b); - bch_time_stats_update(&b->c->btree_read_time, start_time); + time_stats_update(&b->c->btree_read_time, start_time); return; err: @@ -1789,7 +1789,7 @@ static void bch_btree_gc(struct cache_set *c) struct gc_stat stats; struct closure writes; struct btree_op op; - uint64_t start_time = local_clock(); + uint64_t start_time = ktime_get_ns(); trace_bcache_gc_start(c); @@ -1815,7 +1815,7 @@ static void bch_btree_gc(struct cache_set *c) bch_btree_gc_finish(c); wake_up_allocators(c); - bch_time_stats_update(&c->btree_gc_time, start_time); + time_stats_update(&c->btree_gc_time, start_time); stats.key_bytes *= sizeof(uint64_t); stats.data <<= 9; @@ -2191,7 +2191,7 @@ static int btree_split(struct btree *b, struct btree_op *op, { bool split; struct btree *n1, *n2 = NULL, *n3 = NULL; - uint64_t start_time = local_clock(); + uint64_t start_time = ktime_get_ns(); struct closure cl; struct keylist parent_keys; @@ -2297,7 +2297,7 @@ static int btree_split(struct btree *b, struct btree_op *op, btree_node_free(b); rw_unlock(true, n1); - bch_time_stats_update(&b->c->btree_split_time, start_time); + time_stats_update(&b->c->btree_split_time, start_time); return 0; err_free2: diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index ba3909bb6bea..26c8fa93b55d 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1691,6 +1691,9 @@ static void cache_set_free(struct closure *cl) kobject_put(&ca->kobj); } + time_stats_exit(&c->btree_gc_time); + time_stats_exit(&c->btree_split_time); + time_stats_exit(&c->sort.time); if (c->moving_gc_wq) destroy_workqueue(c->moving_gc_wq); diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c index c6f677059214..01eec5877cd7 100644 --- a/drivers/md/bcache/sysfs.c +++ b/drivers/md/bcache/sysfs.c @@ -16,6 +16,7 @@ #include #include #include +#include extern bool bcache_is_reboot; @@ -79,10 +80,10 @@ read_attribute(active_journal_entries); read_attribute(backing_dev_name); read_attribute(backing_dev_uuid); -sysfs_time_stats_attribute(btree_gc, sec, ms); -sysfs_time_stats_attribute(btree_split, sec, us); -sysfs_time_stats_attribute(btree_sort, ms, us); -sysfs_time_stats_attribute(btree_read, ms, us); +read_attribute(btree_gc_time); +read_attribute(btree_split_time); +read_attribute(btree_sort_time); +read_attribute(btree_read_time); read_attribute(btree_nodes); read_attribute(btree_used_percent); @@ -731,6 +732,9 @@ static unsigned int bch_average_key_size(struct cache_set *c) SHOW(__bch_cache_set) { struct cache_set *c = container_of(kobj, struct cache_set, kobj); + struct seq_buf s; + + seq_buf_init(&s, buf, PAGE_SIZE); sysfs_print(synchronous, CACHE_SYNC(&c->cache->sb)); sysfs_print(journal_delay_ms, c->journal_delay_ms); @@ -743,10 +747,25 @@ SHOW(__bch_cache_set) sysfs_print(btree_cache_max_chain, bch_cache_max_chain(c)); sysfs_print(cache_available_percent, 100 - c->gc_stats.in_use); - sysfs_print_time_stats(&c->btree_gc_time, btree_gc, sec, ms); - sysfs_print_time_stats(&c->btree_split_time, btree_split, sec, us); - sysfs_print_time_stats(&c->sort.time, btree_sort, ms, us); - sysfs_print_time_stats(&c->btree_read_time, btree_read, ms, us); + if (attr == &sysfs_btree_gc_time) { + time_stats_to_text(&s, &c->btree_gc_time); + return s.len; + } + + if (attr == &sysfs_btree_split_time) { + time_stats_to_text(&s, &c->btree_split_time); + return s.len; + } + + if (attr == &sysfs_btree_sort_time) { + time_stats_to_text(&s, &c->sort.time); + return s.len; + } + + if (attr == &sysfs_btree_read_time) { + time_stats_to_text(&s, &c->btree_read_time); + return s.len; + } sysfs_print(btree_used_percent, bch_btree_used(c)); sysfs_print(btree_nodes, c->gc_stats.nodes); @@ -988,10 +1007,10 @@ KTYPE(bch_cache_set); static struct attribute *bch_cache_set_internal_attrs[] = { &sysfs_active_journal_entries, - sysfs_time_stats_attribute_list(btree_gc, sec, ms) - sysfs_time_stats_attribute_list(btree_split, sec, us) - sysfs_time_stats_attribute_list(btree_sort, ms, us) - sysfs_time_stats_attribute_list(btree_read, ms, us) + &sysfs_btree_gc_time, + &sysfs_btree_split_time, + &sysfs_btree_sort_time, + &sysfs_btree_read_time, &sysfs_btree_nodes, &sysfs_btree_used_percent, diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c index ae380bc3992e..95282bf0f9a7 100644 --- a/drivers/md/bcache/util.c +++ b/drivers/md/bcache/util.c @@ -160,36 +160,6 @@ int bch_parse_uuid(const char *s, char *uuid) return i; } -void bch_time_stats_update(struct time_stats *stats, uint64_t start_time) -{ - uint64_t now, duration, last; - - spin_lock(&stats->lock); - - now = local_clock(); - duration = time_after64(now, start_time) - ? now - start_time : 0; - last = time_after64(now, stats->last) - ? now - stats->last : 0; - - stats->max_duration = max(stats->max_duration, duration); - - if (stats->last) { - ewma_add(stats->average_duration, duration, 8, 8); - - if (stats->average_frequency) - ewma_add(stats->average_frequency, last, 8, 8); - else - stats->average_frequency = last << 8; - } else { - stats->average_duration = duration << 8; - } - - stats->last = now ?: 1; - - spin_unlock(&stats->lock); -} - /** * bch_next_delay() - update ratelimiting statistics and calculate next delay * @d: the struct bch_ratelimit to update diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h index 6f3cb7c92130..1e1bdbae9593 100644 --- a/drivers/md/bcache/util.h +++ b/drivers/md/bcache/util.h @@ -345,68 +345,11 @@ ssize_t bch_hprint(char *buf, int64_t v); bool bch_is_zero(const char *p, size_t n); int bch_parse_uuid(const char *s, char *uuid); -struct time_stats { - spinlock_t lock; - /* - * all fields are in nanoseconds, averages are ewmas stored left shifted - * by 8 - */ - uint64_t max_duration; - uint64_t average_duration; - uint64_t average_frequency; - uint64_t last; -}; - -void bch_time_stats_update(struct time_stats *stats, uint64_t time); - static inline unsigned int local_clock_us(void) { return local_clock() >> 10; } -#define NSEC_PER_ns 1L -#define NSEC_PER_us NSEC_PER_USEC -#define NSEC_PER_ms NSEC_PER_MSEC -#define NSEC_PER_sec NSEC_PER_SEC - -#define __print_time_stat(stats, name, stat, units) \ - sysfs_print(name ## _ ## stat ## _ ## units, \ - div_u64((stats)->stat >> 8, NSEC_PER_ ## units)) - -#define sysfs_print_time_stats(stats, name, \ - frequency_units, \ - duration_units) \ -do { \ - __print_time_stat(stats, name, \ - average_frequency, frequency_units); \ - __print_time_stat(stats, name, \ - average_duration, duration_units); \ - sysfs_print(name ## _ ##max_duration ## _ ## duration_units, \ - div_u64((stats)->max_duration, \ - NSEC_PER_ ## duration_units)); \ - \ - sysfs_print(name ## _last_ ## frequency_units, (stats)->last \ - ? div_s64(local_clock() - (stats)->last, \ - NSEC_PER_ ## frequency_units) \ - : -1LL); \ -} while (0) - -#define sysfs_time_stats_attribute(name, \ - frequency_units, \ - duration_units) \ -read_attribute(name ## _average_frequency_ ## frequency_units); \ -read_attribute(name ## _average_duration_ ## duration_units); \ -read_attribute(name ## _max_duration_ ## duration_units); \ -read_attribute(name ## _last_ ## frequency_units) - -#define sysfs_time_stats_attribute_list(name, \ - frequency_units, \ - duration_units) \ -&sysfs_ ## name ## _average_frequency_ ## frequency_units, \ -&sysfs_ ## name ## _average_duration_ ## duration_units, \ -&sysfs_ ## name ## _max_duration_ ## duration_units, \ -&sysfs_ ## name ## _last_ ## frequency_units, - #define ewma_add(ewma, val, weight, factor) \ ({ \ (ewma) *= (weight) - 1; \ From patchwork Tue Aug 30 21:49:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959979 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E869EC0502A for ; Tue, 30 Aug 2022 21:53:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232080AbiH3Vxu (ORCPT ); Tue, 30 Aug 2022 17:53:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57338 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231799AbiH3VxY (ORCPT ); Tue, 30 Aug 2022 17:53:24 -0400 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B17C7910BF for ; Tue, 30 Aug 2022 14:50:36 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id c18-20020a170902d49200b001750f4fb8beso2414439plg.15 for ; Tue, 30 Aug 2022 14:50:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=kZY1PfnIfhpzKe2fg2QNhEi4K+n6/AZeM4PXPDl+jSs=; b=dJIIXMoaDfo4LS1xQQsLOYvOSR0hfxqszIVBrvS11cTWnTWgSYoHleGkgC0i1RcvEW g0FnB2GyEmZdNHr6qvVjCkD5o8oUCDsQN6hIW7/R52HXnnZeYixGTmPmUjkl/HVYK9gL Qo4/LFsvJ+0cR9rmUutD4WN2/z4CAitg0y36m7N9M2fZq2hfBbMNxbfvmd6aQgPqHqkG 6a9S9xFeeaCoHjkHXbbF/J0YPQDSPpwSprLRWHG9c2lJ3QzYjp2lWoTEqHtd8FJ9BudO moRv5nbs15u8zEsMhSNn0aeOd2zru1Uacoknhz067J3h2b0D6a3WPhHKe4+8QHPv/dBd t9OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=kZY1PfnIfhpzKe2fg2QNhEi4K+n6/AZeM4PXPDl+jSs=; b=7Ir2nvvho195PnJOCD3a9etNDgSyRilgeoK6mHNdH9S7rz1z8/FDnDtPx8EGi4KQlD tBVusV0cbITt6XXQy/1OGtNaDe7FZ06DAqU6ewsO4xs8Lp2PaVVDXUu2BZ3BIUP5Q8d4 hpIqVu1KmaWJfD8iD4v+lXAUyCYK2nKiqn6EtIIQdFzUigGSQ0JqtwCGDB3/pDXei+So 5RWLu3qLmCtV8PywQyk6F4TGO2AFjZMkwTi7PRvCyymvhevJaVcZvCaFoh2D6FLICgVH MO5IEZfMSpuEtr+wmysAIPE30X6DLCtU+aGiZEYGRAluFkw0gpS9EUlxD9BSYDY3j3gA tapg== X-Gm-Message-State: ACgBeo3BOj2yw36w8UJc/sYC70yAFFHylc3dqbSdq1HrQefSjwTJ4efK AyVYPqARHocf/voyrNTHGRTzyjfZf/k= X-Google-Smtp-Source: AA6agR6gjE9iFWchZ4YdJrumvkghNHAZU6MHZC3qdslJX2rUX6yUkswTTbpzQbuhO2vfdv4e8k6IHGZMB7I= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a63:5c4a:0:b0:41d:bd7d:6084 with SMTP id n10-20020a635c4a000000b0041dbd7d6084mr19548657pgm.411.1661896235530; Tue, 30 Aug 2022 14:50:35 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:16 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-28-surenb@google.com> Subject: [RFC PATCH 27/30] Code tagging based latency tracking From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet This adds the ability to easily instrument code for measuring latency. To use, add the following to calls to your code, at the start and end of the event you wish to measure: code_tag_time_stats_start(start_time); code_tag_time_stats_finish(start_time); Stastistics will then show up in debugfs under /sys/kernel/debug/time_stats, listed by file and line number. Stastics measured include weighted averages of frequency, duration, max duration, as well as quantiles. This patch also instruments all calls to init_wait and finish_wait, which includes all calls to wait_event. Example debugfs output: fs/xfs/xfs_trans_ail.c:746 module:xfs func:xfs_ail_push_all_sync count: 17 rate: 0/sec frequency: 2 sec avg duration: 10 us max duration: 232 us quantiles (ns): 128 128 128 128 128 128 128 128 128 128 128 128 128 128 128 lib/sbitmap.c:813 module:sbitmap func:sbitmap_finish_wait count: 3 rate: 0/sec frequency: 4 sec avg duration: 4 sec max duration: 4 sec quantiles (ns): 0 4288669120 4288669120 5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 5360836048 net/core/datagram.c:122 module:datagram func:__skb_wait_for_more_packets count: 10 rate: 1/sec frequency: 859 ms avg duration: 472 ms max duration: 30 sec quantiles (ns): 0 12279 12279 15669 15669 15669 15669 17217 17217 17217 17217 17217 17217 17217 17217 Signed-off-by: Kent Overstreet --- include/asm-generic/codetag.lds.h | 3 +- include/linux/codetag_time_stats.h | 54 +++++++++++ include/linux/io_uring_types.h | 2 +- include/linux/wait.h | 22 ++++- kernel/sched/wait.c | 6 +- lib/Kconfig.debug | 8 ++ lib/Makefile | 1 + lib/codetag_time_stats.c | 143 +++++++++++++++++++++++++++++ 8 files changed, 233 insertions(+), 6 deletions(-) create mode 100644 include/linux/codetag_time_stats.h create mode 100644 lib/codetag_time_stats.c diff --git a/include/asm-generic/codetag.lds.h b/include/asm-generic/codetag.lds.h index 16fbf74edc3d..d799f4aced82 100644 --- a/include/asm-generic/codetag.lds.h +++ b/include/asm-generic/codetag.lds.h @@ -10,6 +10,7 @@ #define CODETAG_SECTIONS() \ SECTION_WITH_BOUNDARIES(alloc_tags) \ - SECTION_WITH_BOUNDARIES(dynamic_fault_tags) + SECTION_WITH_BOUNDARIES(dynamic_fault_tags) \ + SECTION_WITH_BOUNDARIES(time_stats_tags) #endif /* __ASM_GENERIC_CODETAG_LDS_H */ diff --git a/include/linux/codetag_time_stats.h b/include/linux/codetag_time_stats.h new file mode 100644 index 000000000000..7e44c7ee9e9b --- /dev/null +++ b/include/linux/codetag_time_stats.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_CODETAG_TIMESTATS_H +#define _LINUX_CODETAG_TIMESTATS_H + +/* + * Code tagging based latency tracking: + * (C) 2022 Kent Overstreet + * + * This allows you to easily instrument code to track latency, and have the + * results show up in debugfs. To use, add the following two calls to your code + * at the beginning and end of the event you wish to instrument: + * + * code_tag_time_stats_start(start_time); + * code_tag_time_stats_finish(start_time); + * + * Statistics will then show up in debugfs under /sys/kernel/debug/time_stats, + * listed by file and line number. + */ + +#ifdef CONFIG_CODETAG_TIME_STATS + +#include +#include +#include + +struct codetag_time_stats { + struct codetag tag; + struct time_stats stats; +}; + +#define codetag_time_stats_start(_start_time) u64 _start_time = ktime_get_ns() + +#define codetag_time_stats_finish(_start_time) \ +do { \ + static struct codetag_time_stats \ + __used \ + __section("time_stats_tags") \ + __aligned(8) s = { \ + .tag = CODE_TAG_INIT, \ + .stats.lock = __SPIN_LOCK_UNLOCKED(_lock) \ + }; \ + \ + WARN_ONCE(!(_start_time), "codetag_time_stats_start() not called");\ + time_stats_update(&s.stats, _start_time); \ +} while (0) + +#else + +#define codetag_time_stats_finish(_start_time) do {} while (0) +#define codetag_time_stats_start(_start_time) do {} while (0) + +#endif /* CODETAG_CODETAG_TIME_STATS */ + +#endif diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 677a25d44d7f..3bcef85eacd8 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -488,7 +488,7 @@ struct io_cqe { struct io_cmd_data { struct file *file; /* each command gets 56 bytes of data */ - __u8 data[56]; + __u8 data[64]; }; static inline void io_kiocb_cmd_sz_check(size_t cmd_sz) diff --git a/include/linux/wait.h b/include/linux/wait.h index 91ced6a118bc..bab11b7ef19a 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -4,6 +4,7 @@ /* * Linux wait queue related types and methods */ +#include #include #include #include @@ -32,6 +33,9 @@ struct wait_queue_entry { void *private; wait_queue_func_t func; struct list_head entry; +#ifdef CONFIG_CODETAG_TIME_STATS + u64 start_time; +#endif }; struct wait_queue_head { @@ -79,10 +83,17 @@ extern void __init_waitqueue_head(struct wait_queue_head *wq_head, const char *n # define DECLARE_WAIT_QUEUE_HEAD_ONSTACK(name) DECLARE_WAIT_QUEUE_HEAD(name) #endif +#ifdef CONFIG_CODETAG_TIME_STATS +#define WAIT_QUEUE_ENTRY_START_TIME_INITIALIZER .start_time = ktime_get_ns(), +#else +#define WAIT_QUEUE_ENTRY_START_TIME_INITIALIZER +#endif + #define WAIT_FUNC_INITIALIZER(name, function) { \ .private = current, \ .func = function, \ .entry = LIST_HEAD_INIT((name).entry), \ + WAIT_QUEUE_ENTRY_START_TIME_INITIALIZER \ } #define DEFINE_WAIT_FUNC(name, function) \ @@ -98,6 +109,9 @@ __init_waitqueue_entry(struct wait_queue_entry *wq_entry, unsigned int flags, wq_entry->private = private; wq_entry->func = func; INIT_LIST_HEAD(&wq_entry->entry); +#ifdef CONFIG_CODETAG_TIME_STATS + wq_entry->start_time = ktime_get_ns(); +#endif } #define init_waitqueue_func_entry(_wq_entry, _func) \ @@ -1180,11 +1194,17 @@ do { \ void prepare_to_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state); bool prepare_to_wait_exclusive(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state); long prepare_to_wait_event(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry, int state); -void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry); +void __finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry); long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout); int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key); int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key); +#define finish_wait(_wq_head, _wq_entry) \ +do { \ + codetag_time_stats_finish((_wq_entry)->start_time); \ + __finish_wait(_wq_head, _wq_entry); \ +} while (0) + typedef int (*task_call_f)(struct task_struct *p, void *arg); extern int task_call_func(struct task_struct *p, task_call_f func, void *arg); diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c index b9922346077d..e88de3f0c3ad 100644 --- a/kernel/sched/wait.c +++ b/kernel/sched/wait.c @@ -367,7 +367,7 @@ int do_wait_intr_irq(wait_queue_head_t *wq, wait_queue_entry_t *wait) EXPORT_SYMBOL(do_wait_intr_irq); /** - * finish_wait - clean up after waiting in a queue + * __finish_wait - clean up after waiting in a queue * @wq_head: waitqueue waited on * @wq_entry: wait descriptor * @@ -375,7 +375,7 @@ EXPORT_SYMBOL(do_wait_intr_irq); * the wait descriptor from the given waitqueue if still * queued. */ -void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry) +void __finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry) { unsigned long flags; @@ -399,7 +399,7 @@ void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_en spin_unlock_irqrestore(&wq_head->lock, flags); } } -EXPORT_SYMBOL(finish_wait); +EXPORT_SYMBOL(__finish_wait); int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key) { diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index b7d03afbc808..b0f86643b8f0 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1728,6 +1728,14 @@ config LATENCYTOP Enable this option if you want to use the LatencyTOP tool to find out which userspace is blocking on what kernel operations. +config CODETAG_TIME_STATS + bool "Code tagging based latency measuring" + depends on DEBUG_FS + select TIME_STATS + select CODE_TAGGING + help + Enabling this option makes latency statistics available in debugfs + source "kernel/trace/Kconfig" config PROVIDE_OHCI1394_DMA_INIT diff --git a/lib/Makefile b/lib/Makefile index e54392011f5e..d4067973805b 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -233,6 +233,7 @@ obj-$(CONFIG_PAGE_ALLOC_TAGGING) += pgalloc_tag.o obj-$(CONFIG_CODETAG_FAULT_INJECTION) += dynamic_fault.o obj-$(CONFIG_TIME_STATS) += time_stats.o +obj-$(CONFIG_CODETAG_TIME_STATS) += codetag_time_stats.o lib-$(CONFIG_GENERIC_BUG) += bug.o diff --git a/lib/codetag_time_stats.c b/lib/codetag_time_stats.c new file mode 100644 index 000000000000..b0e9a08308a2 --- /dev/null +++ b/lib/codetag_time_stats.c @@ -0,0 +1,143 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include +#include +#include +#include +#include + +static struct codetag_type *cttype; + +struct user_buf { + char __user *buf; /* destination user buffer */ + size_t size; /* size of requested read */ + ssize_t ret; /* bytes read so far */ +}; + +static int flush_ubuf(struct user_buf *dst, struct seq_buf *src) +{ + if (src->len) { + size_t bytes = min_t(size_t, src->len, dst->size); + int err = copy_to_user(dst->buf, src->buffer, bytes); + + if (err) + return err; + + dst->ret += bytes; + dst->buf += bytes; + dst->size -= bytes; + src->len -= bytes; + memmove(src->buffer, src->buffer + bytes, src->len); + } + + return 0; +} + +struct time_stats_iter { + struct codetag_iterator ct_iter; + struct seq_buf buf; + char rawbuf[4096]; + bool first; +}; + +static int time_stats_open(struct inode *inode, struct file *file) +{ + struct time_stats_iter *iter; + + pr_debug("called"); + + iter = kzalloc(sizeof(*iter), GFP_KERNEL); + if (!iter) + return -ENOMEM; + + codetag_lock_module_list(cttype, true); + codetag_init_iter(&iter->ct_iter, cttype); + codetag_lock_module_list(cttype, false); + + file->private_data = iter; + seq_buf_init(&iter->buf, iter->rawbuf, sizeof(iter->rawbuf)); + iter->first = true; + return 0; +} + +static int time_stats_release(struct inode *inode, struct file *file) +{ + struct time_stats_iter *i = file->private_data; + + kfree(i); + return 0; +} + +static ssize_t time_stats_read(struct file *file, char __user *ubuf, + size_t size, loff_t *ppos) +{ + struct time_stats_iter *iter = file->private_data; + struct user_buf buf = { .buf = ubuf, .size = size }; + struct codetag_time_stats *s; + struct codetag *ct; + int err; + + codetag_lock_module_list(iter->ct_iter.cttype, true); + while (1) { + err = flush_ubuf(&buf, &iter->buf); + if (err || !buf.size) + break; + + ct = codetag_next_ct(&iter->ct_iter); + if (!ct) + break; + + s = container_of(ct, struct codetag_time_stats, tag); + if (s->stats.count) { + if (!iter->first) { + seq_buf_putc(&iter->buf, '\n'); + iter->first = true; + } + + codetag_to_text(&iter->buf, &s->tag); + seq_buf_putc(&iter->buf, '\n'); + time_stats_to_text(&iter->buf, &s->stats); + } + } + codetag_lock_module_list(iter->ct_iter.cttype, false); + + return err ?: buf.ret; +} + +static const struct file_operations time_stats_ops = { + .owner = THIS_MODULE, + .open = time_stats_open, + .release = time_stats_release, + .read = time_stats_read, +}; + +static void time_stats_module_unload(struct codetag_type *cttype, struct codetag_module *mod) +{ + struct codetag_time_stats *i, *start = (void *) mod->range.start; + struct codetag_time_stats *end = (void *) mod->range.stop; + + for (i = start; i != end; i++) + time_stats_exit(&i->stats); +} + +static int __init codetag_time_stats_init(void) +{ + const struct codetag_type_desc desc = { + .section = "time_stats_tags", + .tag_size = sizeof(struct codetag_time_stats), + .module_unload = time_stats_module_unload, + }; + struct dentry *debugfs_file; + + cttype = codetag_register_type(&desc); + if (IS_ERR_OR_NULL(cttype)) + return PTR_ERR(cttype); + + debugfs_file = debugfs_create_file("time_stats", 0666, NULL, NULL, &time_stats_ops); + if (IS_ERR(debugfs_file)) + return PTR_ERR(debugfs_file); + + return 0; +} +module_init(codetag_time_stats_init); From patchwork Tue Aug 30 21:49:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959980 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43572ECAAD5 for ; Tue, 30 Aug 2022 21:54:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232010AbiH3Vys (ORCPT ); Tue, 30 Aug 2022 17:54:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232203AbiH3Vx3 (ORCPT ); Tue, 30 Aug 2022 17:53:29 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1070F8FD66 for ; Tue, 30 Aug 2022 14:50:39 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id x27-20020a25ac9b000000b0069140cfbbd9so714524ybi.8 for ; Tue, 30 Aug 2022 14:50:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=ZStBKMvqZf5cf9FAaMePhJpKdJzTFBwr6u3tFotg8hA=; b=VTVsraZqevmRCaObBRFMsjzBHvekcfce8MtBSaZZ5Crg1seH0CtSm9c29RKCC59TBN 19FBNPHC8dpOcblrHciXVuHQoofaiZfBwdvimHRJsLndFImNdH+eu+YtfXaacByCuPex 4skGzTc23jNuehJZ1slH7Ll/8rph1Z+dZIGkIrmDgscRY7RIgNnfOz1AT6ADLF2JebGP ImyuFSTfbcS7qk97+Srdow1KJqFKbaNXoPDO2ySg/qcfR6YEVvxMJcvJ5RrVZoonHGUz PLJ9fn0z6uZxXsmBY3YesylZ0YvtVFfBxK+5t+T1HyPEAGObb07elO3lsln5Hg+z0Hea kK4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=ZStBKMvqZf5cf9FAaMePhJpKdJzTFBwr6u3tFotg8hA=; b=J9pgdyxB0G+dxMcdLtjM6sQdzBjh+OR+4TcPGZ3bHueM6q5UasN4fn8G/e2L7wSqfg ES1XFcSL5nUE07ZPqaaO6F4TxXOhlLDZJSnTMn207W7jENb5yH1TxcqjnB68PuJb4R+g YvYuSBBEvu/Lw/VCL+P/oSijw8v4/5bheXBJt35m3/VWTgdSCJyZDzedYzkqCrk5bhIv o/4aLTty3Fvk8B1Mr6dm0tbwTSb3NJJ72dzrDKsnPYhrRc2OI2daRjs8vR4Hn57uWKs1 wPwqELt0/nfE35qn0r7KNQWkP4hO8Am1ZBvyOWe24RTdeaxW0eFJ2wT/APwhKjrEbJmy BdDw== X-Gm-Message-State: ACgBeo3voJS1jCocLZJF1bfyW1xNhdPwqbHxBJ1tVRAcVYXJ/r2TjNVW N9mOp+DX2e5qXpckQ6pJCXBVmk/Ouxk= X-Google-Smtp-Source: AA6agR5ZJaVBuvkDlBerd8g/D0cXVNCnXViOohqfD6wf3W0FPvTvaQusAJxcQ2WbQWopoioqQ9vnkekcbVo= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a05:6902:15cf:b0:67c:1ee7:149 with SMTP id l15-20020a05690215cf00b0067c1ee70149mr13333139ybu.594.1661896238275; Tue, 30 Aug 2022 14:50:38 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:17 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-29-surenb@google.com> Subject: [RFC PATCH 28/30] Improved symbolic error names From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet This patch adds per-error-site error codes, with error strings that include their file and line number. To use, change code that returns an error, e.g. return -ENOMEM; to return -ERR(ENOMEM); Then, errname() will return a string that includes the file and line number of the ERR() call, for example printk("Got error %s!\n", errname(err)); will result in Got error ENOMEM at foo.c:1234 To convert back to the original error code (before returning it to outside code that does not understand dynamic error codes), use return error_class(err); To test if an error is of some type, replace if (err == -ENOMEM) with if (error_matches(err, ENOMEM)) Implementation notes: Error codes are allocated dynamically on module load and deallocated on module unload. On memory allocation failure (i.e. the data structures for indexing error strings and error parents), ERR() will fall back to returning the error code that it was passed. MAX_ERRNO has been raised from 4096 to 1 million, which should be sufficient given the number of lines of code and the fraction that throw errors in the kernel codebase. This has implications for ERR_PTR(), since the range of the address space reserved for errors is unavailable for other purposes. Since ERR_PTR() ptrs are at the top of the address space there should not be any major difficulties. Signed-off-by: Kent Overstreet --- include/asm-generic/codetag.lds.h | 3 +- include/linux/err.h | 2 +- include/linux/errname.h | 50 +++++++++++++++ lib/errname.c | 103 ++++++++++++++++++++++++++++++ 4 files changed, 156 insertions(+), 2 deletions(-) diff --git a/include/asm-generic/codetag.lds.h b/include/asm-generic/codetag.lds.h index d799f4aced82..b087cf1874a9 100644 --- a/include/asm-generic/codetag.lds.h +++ b/include/asm-generic/codetag.lds.h @@ -11,6 +11,7 @@ #define CODETAG_SECTIONS() \ SECTION_WITH_BOUNDARIES(alloc_tags) \ SECTION_WITH_BOUNDARIES(dynamic_fault_tags) \ - SECTION_WITH_BOUNDARIES(time_stats_tags) + SECTION_WITH_BOUNDARIES(time_stats_tags) \ + SECTION_WITH_BOUNDARIES(error_code_tags) #endif /* __ASM_GENERIC_CODETAG_LDS_H */ diff --git a/include/linux/err.h b/include/linux/err.h index a139c64aef2a..1d8d6c46ab9c 100644 --- a/include/linux/err.h +++ b/include/linux/err.h @@ -15,7 +15,7 @@ * This should be a per-architecture thing, to allow different * error and pointer decisions. */ -#define MAX_ERRNO 4095 +#define MAX_ERRNO ((1 << 20) - 1) #ifndef __ASSEMBLY__ diff --git a/include/linux/errname.h b/include/linux/errname.h index e8576ad90cb7..dd39fe7120bb 100644 --- a/include/linux/errname.h +++ b/include/linux/errname.h @@ -5,12 +5,62 @@ #include #ifdef CONFIG_SYMBOLIC_ERRNAME + const char *errname(int err); + +#include + +struct codetag_error_code { + const char *str; + int err; +}; + +/** + * ERR - return an error code that records the error site + * + * E.g., instead of + * return -ENOMEM; + * Use + * return -ERR(ENOMEM); + * + * Then, when a caller prints out the error with errname(), the error string + * will include the file and line number. + */ +#define ERR(_err) \ +({ \ + static struct codetag_error_code \ + __used \ + __section("error_code_tags") \ + __aligned(8) e = { \ + .str = #_err " at " __FILE__ ":" __stringify(__LINE__),\ + .err = _err, \ + }; \ + \ + e.err; \ +}) + +int error_class(int err); +bool error_matches(int err, int class); + #else + +static inline int error_class(int err) +{ + return err; +} + +static inline bool error_matches(int err, int class) +{ + return err == class; +} + +#define ERR(_err) _err + static inline const char *errname(int err) { return NULL; } + #endif #endif /* _LINUX_ERRNAME_H */ diff --git a/lib/errname.c b/lib/errname.c index 05cbf731545f..2db8f5301ba0 100644 --- a/lib/errname.c +++ b/lib/errname.c @@ -1,9 +1,20 @@ // SPDX-License-Identifier: GPL-2.0 #include +#include #include #include +#include #include #include +#include +#include + +#define DYNAMIC_ERRCODE_START 4096 + +static DEFINE_IDR(dynamic_error_strings); +static DEFINE_XARRAY(error_classes); + +static struct codetag_type *cttype; /* * Ensure these tables do not accidentally become gigantic if some @@ -200,6 +211,9 @@ static const char *names_512[] = { static const char *__errname(unsigned err) { + if (err >= DYNAMIC_ERRCODE_START) + return idr_find(&dynamic_error_strings, err); + if (err < ARRAY_SIZE(names_0)) return names_0[err]; if (err >= 512 && err - 512 < ARRAY_SIZE(names_512)) @@ -222,3 +236,92 @@ const char *errname(int err) return err > 0 ? name + 1 : name; } + +/** + * error_class - return standard/parent error (of a dynamic error code) + * + * When using dynamic error codes returned by ERR(), error_class() will return + * the original errorcode that was passed to ERR(). + */ +int error_class(int err) +{ + int class = abs(err); + + if (class > DYNAMIC_ERRCODE_START) + class = (unsigned long) xa_load(&error_classes, + class - DYNAMIC_ERRCODE_START); + if (err < 0) + class = -class; + return class; +} +EXPORT_SYMBOL(error_class); + +/** + * error_matches - test if error is of some type + * + * When using dynamic error codes, instead of checking for errors with e.g. + * if (err == -ENOMEM) + * Instead use + * if (error_matches(err, ENOMEM)) + */ +bool error_matches(int err, int class) +{ + err = abs(err); + class = abs(class); + + BUG_ON(err >= MAX_ERRNO); + BUG_ON(class >= MAX_ERRNO); + + if (err != class) + err = error_class(err); + + return err == class; +} +EXPORT_SYMBOL(error_matches); + +static void errcode_module_load(struct codetag_type *cttype, struct codetag_module *mod) +{ + struct codetag_error_code *i, *start = (void *) mod->range.start; + struct codetag_error_code *end = (void *) mod->range.stop; + + for (i = start; i != end; i++) { + int err = idr_alloc(&dynamic_error_strings, + (char *) i->str, + DYNAMIC_ERRCODE_START, + MAX_ERRNO, + GFP_KERNEL); + if (err < 0) + continue; + + xa_store(&error_classes, + err - DYNAMIC_ERRCODE_START, + (void *)(unsigned long) abs(i->err), + GFP_KERNEL); + + i->err = i->err < 0 ? -err : err; + } +} + +static void errcode_module_unload(struct codetag_type *cttype, struct codetag_module *mod) +{ + struct codetag_error_code *i, *start = (void *) mod->range.start; + struct codetag_error_code *end = (void *) mod->range.stop; + + for (i = start; i != end; i++) + idr_remove(&dynamic_error_strings, abs(i->err)); +} + +static int __init errname_init(void) +{ + const struct codetag_type_desc desc = { + .section = "error_code_tags", + .tag_size = sizeof(struct codetag_error_code), + .module_load = errcode_module_load, + .module_unload = errcode_module_unload, + }; + + cttype = codetag_register_type(&desc); + + return PTR_ERR_OR_ZERO(cttype); +} +module_init(errname_init); From patchwork Tue Aug 30 21:49:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959982 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 756D9ECAAA1 for ; Tue, 30 Aug 2022 21:55:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232083AbiH3Vz0 (ORCPT ); Tue, 30 Aug 2022 17:55:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60412 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232271AbiH3Vxl (ORCPT ); Tue, 30 Aug 2022 17:53:41 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C293C985B4 for ; Tue, 30 Aug 2022 14:50:44 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id d8-20020a25bc48000000b00680651cf051so725876ybk.23 for ; Tue, 30 Aug 2022 14:50:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=I4E0zbgDI9VG+bn61bv+haOxSwGhD/tyP2DJbq3PPno=; b=lMDbTLhz/ohye/t587Ty2qYkwt84HzoX6h4AilPBmG9LeLK/WaDZ7EvrZIWUeDFZwN baQ3A+GzmoJpCC2Qp6Nin3hCbTA3XMAwcRsC3ADvEnF19vQrwy6RoUEvm7OappAdU5l3 60OMhVhHNpayFnTLjtsN7/Lu9JsBb/KhrGHPzY9WG93dKLmDa68Or3m8sTRshadBMxRo gvNxqguLj4zV89o7/OmZtheKpzYwtex2+d9trxVpf+SrcS8bxpKkphWAXzDnCiw+XQbW tpWgaNqbZfcBTDlXT2S+rQcVcH5VK0aSOJyORG8U2BkSE5BmUcnQE78u9yJeVf5iO1RV 3uQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=I4E0zbgDI9VG+bn61bv+haOxSwGhD/tyP2DJbq3PPno=; b=ApR5K3FParpSIZWRO1DPZ4mwB3y+aQ6cqN8FjbHGDO0tUpnrtrxbNGKNtQnrGQxqm/ l9gv7LjLJGYYXATkGZJJGNOX8ncfxM0gl0WSs4cSbg/GUxBSLeTVIvNvVmKVVs8DAoWL IOHlHPPoVvbP/Fn9TXLaXcRzcXYXYC1xE60IcDwGM/err+WoD7DFm2x5HhEAmGRi92cW wzBTkFT5MjyQL1tFnw4w05u3C9AH/9mIZL6q8YqBcOw9HjLrajHsQTUhjBw3aCCCGodw je9cHVBfie0GqeWHT3ctk1bXodujHJ4ZL2u+FYPRld3aPSYpwfgaBsgsqLttlLyO6SS6 Ncrw== X-Gm-Message-State: ACgBeo0YwmCn+QVwqo1eqwIQVK3o8giLTiXRmz2oM8kxj6d6uIgcyUYE jki7IMsKzMCuOhj2rvVp59LR5IFXpF4= X-Google-Smtp-Source: AA6agR7dI1wO7X0I+X6vkeMAF4jNOS1nU0nYZhacVn04sWRsl5inyAJXnu3ndUKBzLRifBhinWG6G+5CS9s= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:4f0b:0:b0:69c:2b2c:f6e5 with SMTP id d11-20020a254f0b000000b0069c2b2cf6e5mr7392973ybb.298.1661896240890; Tue, 30 Aug 2022 14:50:40 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:18 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-30-surenb@google.com> Subject: [RFC PATCH 29/30] dyndbg: Convert to code tagging From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet This converts dynamic debug to the new code tagging framework, which provides an interface for iterating over objects in a particular elf section. It also converts the debugfs interface from seq_file to the style used by other code tagging users, which also makes the code a bit smaller and simpler. It doesn't yet convert struct _ddebug to use struct codetag; another cleanup could convert it to that, and to codetag_query_parse(). Signed-off-by: Kent Overstreet Cc: Jason Baron Cc: Luis Chamberlain --- include/asm-generic/codetag.lds.h | 5 +- include/asm-generic/vmlinux.lds.h | 5 - include/linux/dynamic_debug.h | 11 +- kernel/module/internal.h | 2 - kernel/module/main.c | 23 -- lib/dynamic_debug.c | 452 ++++++++++-------------------- 6 files changed, 158 insertions(+), 340 deletions(-) diff --git a/include/asm-generic/codetag.lds.h b/include/asm-generic/codetag.lds.h index b087cf1874a9..b7e351f80e9e 100644 --- a/include/asm-generic/codetag.lds.h +++ b/include/asm-generic/codetag.lds.h @@ -8,10 +8,11 @@ KEEP(*(_name)) \ __stop_##_name = .; -#define CODETAG_SECTIONS() \ +#define CODETAG_SECTIONS() \ SECTION_WITH_BOUNDARIES(alloc_tags) \ SECTION_WITH_BOUNDARIES(dynamic_fault_tags) \ SECTION_WITH_BOUNDARIES(time_stats_tags) \ - SECTION_WITH_BOUNDARIES(error_code_tags) + SECTION_WITH_BOUNDARIES(error_code_tags) \ + SECTION_WITH_BOUNDARIES(dyndbg) #endif /* __ASM_GENERIC_CODETAG_LDS_H */ diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index c2dc2a59ab2e..d3fb914d157f 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -345,11 +345,6 @@ __end_once = .; \ STRUCT_ALIGN(); \ *(__tracepoints) \ - /* implement dynamic printk debug */ \ - . = ALIGN(8); \ - __start___dyndbg = .; \ - KEEP(*(__dyndbg)) \ - __stop___dyndbg = .; \ CODETAG_SECTIONS() \ LIKELY_PROFILE() \ BRANCH_PROFILE() \ diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index dce631e678dd..6a57009dd29e 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -58,9 +58,6 @@ struct _ddebug { /* exported for module authors to exercise >control */ int dynamic_debug_exec_queries(const char *query, const char *modname); -int ddebug_add_module(struct _ddebug *tab, unsigned int n, - const char *modname); -extern int ddebug_remove_module(const char *mod_name); extern __printf(2, 3) void __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...); @@ -89,7 +86,7 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor, #define DEFINE_DYNAMIC_DEBUG_METADATA(name, fmt) \ static struct _ddebug __aligned(8) \ - __section("__dyndbg") name = { \ + __section("dyndbg") name = { \ .modname = KBUILD_MODNAME, \ .function = __func__, \ .filename = __FILE__, \ @@ -187,12 +184,6 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor, #include #include -static inline int ddebug_add_module(struct _ddebug *tab, unsigned int n, - const char *modname) -{ - return 0; -} - static inline int ddebug_remove_module(const char *mod) { return 0; diff --git a/kernel/module/internal.h b/kernel/module/internal.h index f1b6c477bd93..f867c57ab74f 100644 --- a/kernel/module/internal.h +++ b/kernel/module/internal.h @@ -62,8 +62,6 @@ struct load_info { Elf_Shdr *sechdrs; char *secstrings, *strtab; unsigned long symoffs, stroffs, init_typeoffs, core_typeoffs; - struct _ddebug *debug; - unsigned int num_debug; bool sig_ok; #ifdef CONFIG_KALLSYMS unsigned long mod_kallsyms_init_off; diff --git a/kernel/module/main.c b/kernel/module/main.c index d253277492fd..28e3b337841b 100644 --- a/kernel/module/main.c +++ b/kernel/module/main.c @@ -1163,9 +1163,6 @@ static void free_module(struct module *mod) mod->state = MODULE_STATE_UNFORMED; mutex_unlock(&module_mutex); - /* Remove dynamic debug info */ - ddebug_remove_module(mod->name); - /* Arch-specific cleanup. */ module_arch_cleanup(mod); @@ -1600,19 +1597,6 @@ static void free_modinfo(struct module *mod) } } -static void dynamic_debug_setup(struct module *mod, struct _ddebug *debug, unsigned int num) -{ - if (!debug) - return; - ddebug_add_module(debug, num, mod->name); -} - -static void dynamic_debug_remove(struct module *mod, struct _ddebug *debug) -{ - if (debug) - ddebug_remove_module(mod->name); -} - void * __weak module_alloc(unsigned long size) { return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END, @@ -2113,9 +2097,6 @@ static int find_module_sections(struct module *mod, struct load_info *info) if (section_addr(info, "__obsparm")) pr_warn("%s: Ignoring obsolete parameters\n", mod->name); - info->debug = section_objs(info, "__dyndbg", - sizeof(*info->debug), &info->num_debug); - return 0; } @@ -2808,9 +2789,6 @@ static int load_module(struct load_info *info, const char __user *uargs, goto free_arch_cleanup; } - init_build_id(mod, info); - dynamic_debug_setup(mod, info->debug, info->num_debug); - /* Ftrace init must be called in the MODULE_STATE_UNFORMED state */ ftrace_module_init(mod); @@ -2875,7 +2853,6 @@ static int load_module(struct load_info *info, const char __user *uargs, ddebug_cleanup: ftrace_release_mod(mod); - dynamic_debug_remove(mod, info->debug); synchronize_rcu(); kfree(mod->args); free_arch_cleanup: diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c index dd7f56af9aed..e9079825fb3b 100644 --- a/lib/dynamic_debug.c +++ b/lib/dynamic_debug.c @@ -13,6 +13,7 @@ #define pr_fmt(fmt) "dyndbg: " fmt +#include #include #include #include @@ -36,19 +37,37 @@ #include #include #include +#include #include -extern struct _ddebug __start___dyndbg[]; -extern struct _ddebug __stop___dyndbg[]; +static struct codetag_type *cttype; -struct ddebug_table { - struct list_head link; - const char *mod_name; - unsigned int num_ddebugs; - struct _ddebug *ddebugs; +struct user_buf { + char __user *buf; /* destination user buffer */ + size_t size; /* size of requested read */ + ssize_t ret; /* bytes read so far */ }; +static int flush_ubuf(struct user_buf *dst, struct seq_buf *src) +{ + if (src->len) { + size_t bytes = min_t(size_t, src->len, dst->size); + int err = copy_to_user(dst->buf, src->buffer, bytes); + + if (err) + return err; + + dst->ret += bytes; + dst->buf += bytes; + dst->size -= bytes; + src->len -= bytes; + memmove(src->buffer, src->buffer + bytes, src->len); + } + + return 0; +} + struct ddebug_query { const char *filename; const char *module; @@ -58,8 +77,9 @@ struct ddebug_query { }; struct ddebug_iter { - struct ddebug_table *table; - unsigned int idx; + struct codetag_iterator ct_iter; + struct seq_buf buf; + char rawbuf[4096]; }; struct flag_settings { @@ -67,8 +87,6 @@ struct flag_settings { unsigned int mask; }; -static DEFINE_MUTEX(ddebug_lock); -static LIST_HEAD(ddebug_tables); static int verbose; module_param(verbose, int, 0644); MODULE_PARM_DESC(verbose, " dynamic_debug/control processing " @@ -152,78 +170,76 @@ static void vpr_info_dq(const struct ddebug_query *query, const char *msg) static int ddebug_change(const struct ddebug_query *query, struct flag_settings *modifiers) { - int i; - struct ddebug_table *dt; + struct codetag_iterator ct_iter; + struct codetag *ct; unsigned int newflags; unsigned int nfound = 0; struct flagsbuf fbuf; - /* search for matching ddebugs */ - mutex_lock(&ddebug_lock); - list_for_each_entry(dt, &ddebug_tables, link) { + codetag_lock_module_list(cttype, true); + codetag_init_iter(&ct_iter, cttype); + + while ((ct = codetag_next_ct(&ct_iter))) { + struct _ddebug *dp = (void *) ct; /* match against the module name */ if (query->module && - !match_wildcard(query->module, dt->mod_name)) + !match_wildcard(query->module, dp->modname)) continue; - for (i = 0; i < dt->num_ddebugs; i++) { - struct _ddebug *dp = &dt->ddebugs[i]; - - /* match against the source filename */ - if (query->filename && - !match_wildcard(query->filename, dp->filename) && - !match_wildcard(query->filename, - kbasename(dp->filename)) && - !match_wildcard(query->filename, - trim_prefix(dp->filename))) - continue; + /* match against the source filename */ + if (query->filename && + !match_wildcard(query->filename, dp->filename) && + !match_wildcard(query->filename, + kbasename(dp->filename)) && + !match_wildcard(query->filename, + trim_prefix(dp->filename))) + continue; - /* match against the function */ - if (query->function && - !match_wildcard(query->function, dp->function)) - continue; + /* match against the function */ + if (query->function && + !match_wildcard(query->function, dp->function)) + continue; - /* match against the format */ - if (query->format) { - if (*query->format == '^') { - char *p; - /* anchored search. match must be at beginning */ - p = strstr(dp->format, query->format+1); - if (p != dp->format) - continue; - } else if (!strstr(dp->format, query->format)) + /* match against the format */ + if (query->format) { + if (*query->format == '^') { + char *p; + /* anchored search. match must be at beginning */ + p = strstr(dp->format, query->format+1); + if (p != dp->format) continue; - } - - /* match against the line number range */ - if (query->first_lineno && - dp->lineno < query->first_lineno) - continue; - if (query->last_lineno && - dp->lineno > query->last_lineno) + } else if (!strstr(dp->format, query->format)) continue; + } + + /* match against the line number range */ + if (query->first_lineno && + dp->lineno < query->first_lineno) + continue; + if (query->last_lineno && + dp->lineno > query->last_lineno) + continue; - nfound++; + nfound++; - newflags = (dp->flags & modifiers->mask) | modifiers->flags; - if (newflags == dp->flags) - continue; + newflags = (dp->flags & modifiers->mask) | modifiers->flags; + if (newflags == dp->flags) + continue; #ifdef CONFIG_JUMP_LABEL - if (dp->flags & _DPRINTK_FLAGS_PRINT) { - if (!(modifiers->flags & _DPRINTK_FLAGS_PRINT)) - static_branch_disable(&dp->key.dd_key_true); - } else if (modifiers->flags & _DPRINTK_FLAGS_PRINT) - static_branch_enable(&dp->key.dd_key_true); + if (dp->flags & _DPRINTK_FLAGS_PRINT) { + if (!(modifiers->flags & _DPRINTK_FLAGS_PRINT)) + static_branch_disable(&dp->key.dd_key_true); + } else if (modifiers->flags & _DPRINTK_FLAGS_PRINT) + static_branch_enable(&dp->key.dd_key_true); #endif - dp->flags = newflags; - v4pr_info("changed %s:%d [%s]%s =%s\n", - trim_prefix(dp->filename), dp->lineno, - dt->mod_name, dp->function, - ddebug_describe_flags(dp->flags, &fbuf)); - } + dp->flags = newflags; + v4pr_info("changed %s:%d [%s]%s =%s\n", + trim_prefix(dp->filename), dp->lineno, + dp->modname, dp->function, + ddebug_describe_flags(dp->flags, &fbuf)); } - mutex_unlock(&ddebug_lock); + codetag_lock_module_list(cttype, false); if (!nfound && verbose) pr_info("no matches for query\n"); @@ -794,187 +810,96 @@ static ssize_t ddebug_proc_write(struct file *file, const char __user *ubuf, return len; } -/* - * Set the iterator to point to the first _ddebug object - * and return a pointer to that first object. Returns - * NULL if there are no _ddebugs at all. - */ -static struct _ddebug *ddebug_iter_first(struct ddebug_iter *iter) -{ - if (list_empty(&ddebug_tables)) { - iter->table = NULL; - iter->idx = 0; - return NULL; - } - iter->table = list_entry(ddebug_tables.next, - struct ddebug_table, link); - iter->idx = 0; - return &iter->table->ddebugs[iter->idx]; -} - -/* - * Advance the iterator to point to the next _ddebug - * object from the one the iterator currently points at, - * and returns a pointer to the new _ddebug. Returns - * NULL if the iterator has seen all the _ddebugs. - */ -static struct _ddebug *ddebug_iter_next(struct ddebug_iter *iter) -{ - if (iter->table == NULL) - return NULL; - if (++iter->idx == iter->table->num_ddebugs) { - /* iterate to next table */ - iter->idx = 0; - if (list_is_last(&iter->table->link, &ddebug_tables)) { - iter->table = NULL; - return NULL; - } - iter->table = list_entry(iter->table->link.next, - struct ddebug_table, link); - } - return &iter->table->ddebugs[iter->idx]; -} - -/* - * Seq_ops start method. Called at the start of every - * read() call from userspace. Takes the ddebug_lock and - * seeks the seq_file's iterator to the given position. - */ -static void *ddebug_proc_start(struct seq_file *m, loff_t *pos) -{ - struct ddebug_iter *iter = m->private; - struct _ddebug *dp; - int n = *pos; - - mutex_lock(&ddebug_lock); - - if (!n) - return SEQ_START_TOKEN; - if (n < 0) - return NULL; - dp = ddebug_iter_first(iter); - while (dp != NULL && --n > 0) - dp = ddebug_iter_next(iter); - return dp; -} - -/* - * Seq_ops next method. Called several times within a read() - * call from userspace, with ddebug_lock held. Walks to the - * next _ddebug object with a special case for the header line. - */ -static void *ddebug_proc_next(struct seq_file *m, void *p, loff_t *pos) -{ - struct ddebug_iter *iter = m->private; - struct _ddebug *dp; - - if (p == SEQ_START_TOKEN) - dp = ddebug_iter_first(iter); - else - dp = ddebug_iter_next(iter); - ++*pos; - return dp; -} - /* * Seq_ops show method. Called several times within a read() * call from userspace, with ddebug_lock held. Formats the * current _ddebug as a single human-readable line, with a * special case for the header line. */ -static int ddebug_proc_show(struct seq_file *m, void *p) +static void ddebug_to_text(struct seq_buf *out, struct _ddebug *dp) { - struct ddebug_iter *iter = m->private; - struct _ddebug *dp = p; struct flagsbuf flags; + char *buf; + size_t len; - if (p == SEQ_START_TOKEN) { - seq_puts(m, - "# filename:lineno [module]function flags format\n"); - return 0; - } - - seq_printf(m, "%s:%u [%s]%s =%s \"", + seq_buf_printf(out, "%s:%u [%s]%s =%s \"", trim_prefix(dp->filename), dp->lineno, - iter->table->mod_name, dp->function, + dp->modname, dp->function, ddebug_describe_flags(dp->flags, &flags)); - seq_escape(m, dp->format, "\t\r\n\""); - seq_puts(m, "\"\n"); - return 0; -} + len = seq_buf_get_buf(out, &buf); + len = string_escape_mem(dp->format, strlen(dp->format), + buf, len, ESCAPE_OCTAL, "\t\r\n\""); + seq_buf_commit(out, len); -/* - * Seq_ops stop method. Called at the end of each read() - * call from userspace. Drops ddebug_lock. - */ -static void ddebug_proc_stop(struct seq_file *m, void *p) -{ - mutex_unlock(&ddebug_lock); + seq_buf_puts(out, "\"\n"); } -static const struct seq_operations ddebug_proc_seqops = { - .start = ddebug_proc_start, - .next = ddebug_proc_next, - .show = ddebug_proc_show, - .stop = ddebug_proc_stop -}; - static int ddebug_proc_open(struct inode *inode, struct file *file) { - return seq_open_private(file, &ddebug_proc_seqops, - sizeof(struct ddebug_iter)); + struct ddebug_iter *iter; + + iter = kzalloc(sizeof(*iter), GFP_KERNEL); + if (!iter) + return -ENOMEM; + + codetag_lock_module_list(cttype, true); + codetag_init_iter(&iter->ct_iter, cttype); + codetag_lock_module_list(cttype, false); + seq_buf_init(&iter->buf, iter->rawbuf, sizeof(iter->rawbuf)); + file->private_data = iter; + + return 0; } -static const struct file_operations ddebug_proc_fops = { - .owner = THIS_MODULE, - .open = ddebug_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = seq_release_private, - .write = ddebug_proc_write -}; +static int ddebug_proc_release(struct inode *inode, struct file *file) +{ + struct ddebug_iter *iter = file->private_data; -static const struct proc_ops proc_fops = { - .proc_open = ddebug_proc_open, - .proc_read = seq_read, - .proc_lseek = seq_lseek, - .proc_release = seq_release_private, - .proc_write = ddebug_proc_write -}; + kfree(iter); + return 0; +} -/* - * Allocate a new ddebug_table for the given module - * and add it to the global list. - */ -int ddebug_add_module(struct _ddebug *tab, unsigned int n, - const char *name) +static ssize_t ddebug_proc_read(struct file *file, char __user *ubuf, + size_t size, loff_t *ppos) { - struct ddebug_table *dt; + struct ddebug_iter *iter = file->private_data; + struct user_buf buf = { .buf = ubuf, .size = size }; + struct codetag *ct; + int err = 0; - dt = kzalloc(sizeof(*dt), GFP_KERNEL); - if (dt == NULL) { - pr_err("error adding module: %s\n", name); - return -ENOMEM; - } - /* - * For built-in modules, name lives in .rodata and is - * immortal. For loaded modules, name points at the name[] - * member of struct module, which lives at least as long as - * this struct ddebug_table. - */ - dt->mod_name = name; - dt->num_ddebugs = n; - dt->ddebugs = tab; + codetag_lock_module_list(iter->ct_iter.cttype, true); + while (1) { + err = flush_ubuf(&buf, &iter->buf); + if (err || !buf.size) + break; + + ct = codetag_next_ct(&iter->ct_iter); + if (!ct) + break; - mutex_lock(&ddebug_lock); - list_add(&dt->link, &ddebug_tables); - mutex_unlock(&ddebug_lock); + ddebug_to_text(&iter->buf, (void *) ct); + } + codetag_lock_module_list(iter->ct_iter.cttype, false); - vpr_info("%3u debug prints in module %s\n", n, dt->mod_name); - return 0; + return err ? : buf.ret; } +static const struct file_operations ddebug_proc_fops = { + .owner = THIS_MODULE, + .open = ddebug_proc_open, + .read = ddebug_proc_read, + .release = ddebug_proc_release, + .write = ddebug_proc_write, +}; + +static const struct proc_ops proc_fops = { + .proc_open = ddebug_proc_open, + .proc_read = ddebug_proc_read, + .proc_release = ddebug_proc_release, + .proc_write = ddebug_proc_write, +}; + /* helper for ddebug_dyndbg_(boot|module)_param_cb */ static int ddebug_dyndbg_param_cb(char *param, char *val, const char *modname, int on_err) @@ -1015,47 +940,6 @@ int ddebug_dyndbg_module_param_cb(char *param, char *val, const char *module) return ddebug_dyndbg_param_cb(param, val, module, -ENOENT); } -static void ddebug_table_free(struct ddebug_table *dt) -{ - list_del_init(&dt->link); - kfree(dt); -} - -/* - * Called in response to a module being unloaded. Removes - * any ddebug_table's which point at the module. - */ -int ddebug_remove_module(const char *mod_name) -{ - struct ddebug_table *dt, *nextdt; - int ret = -ENOENT; - - mutex_lock(&ddebug_lock); - list_for_each_entry_safe(dt, nextdt, &ddebug_tables, link) { - if (dt->mod_name == mod_name) { - ddebug_table_free(dt); - ret = 0; - break; - } - } - mutex_unlock(&ddebug_lock); - if (!ret) - v2pr_info("removed module \"%s\"\n", mod_name); - return ret; -} - -static void ddebug_remove_all_tables(void) -{ - mutex_lock(&ddebug_lock); - while (!list_empty(&ddebug_tables)) { - struct ddebug_table *dt = list_entry(ddebug_tables.next, - struct ddebug_table, - link); - ddebug_table_free(dt); - } - mutex_unlock(&ddebug_lock); -} - static __initdata int ddebug_init_success; static int __init dynamic_debug_init_control(void) @@ -1083,45 +967,19 @@ static int __init dynamic_debug_init_control(void) static int __init dynamic_debug_init(void) { - struct _ddebug *iter, *iter_start; - const char *modname = NULL; + const struct codetag_type_desc desc = { + .section = "dyndbg", + .tag_size = sizeof(struct _ddebug), + }; char *cmdline; - int ret = 0; - int n = 0, entries = 0, modct = 0; + int ret; - if (&__start___dyndbg == &__stop___dyndbg) { - if (IS_ENABLED(CONFIG_DYNAMIC_DEBUG)) { - pr_warn("_ddebug table is empty in a CONFIG_DYNAMIC_DEBUG build\n"); - return 1; - } - pr_info("Ignore empty _ddebug table in a CONFIG_DYNAMIC_DEBUG_CORE build\n"); - ddebug_init_success = 1; - return 0; - } - iter = __start___dyndbg; - modname = iter->modname; - iter_start = iter; - for (; iter < __stop___dyndbg; iter++) { - entries++; - if (strcmp(modname, iter->modname)) { - modct++; - ret = ddebug_add_module(iter_start, n, modname); - if (ret) - goto out_err; - n = 0; - modname = iter->modname; - iter_start = iter; - } - n++; - } - ret = ddebug_add_module(iter_start, n, modname); + cttype = codetag_register_type(&desc); + ret = PTR_ERR_OR_ZERO(cttype); if (ret) - goto out_err; + return ret; ddebug_init_success = 1; - vpr_info("%d prdebugs in %d modules, %d KiB in ddebug tables, %d kiB in __dyndbg section\n", - entries, modct, (int)((modct * sizeof(struct ddebug_table)) >> 10), - (int)((entries * sizeof(struct _ddebug)) >> 10)); /* now that ddebug tables are loaded, process all boot args * again to find and activate queries given in dyndbg params. @@ -1132,14 +990,12 @@ static int __init dynamic_debug_init(void) * slightly noisy if verbose, but harmless. */ cmdline = kstrdup(saved_command_line, GFP_KERNEL); + if (!cmdline) + return -ENOMEM; parse_args("dyndbg params", cmdline, NULL, 0, 0, 0, NULL, &ddebug_dyndbg_boot_param_cb); kfree(cmdline); return 0; - -out_err: - ddebug_remove_all_tables(); - return 0; } /* Allow early initialization for boot messages via boot param */ early_initcall(dynamic_debug_init); From patchwork Tue Aug 30 21:49:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Suren Baghdasaryan X-Patchwork-Id: 12959981 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94BDEECAAA1 for ; Tue, 30 Aug 2022 21:55:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232233AbiH3VzM (ORCPT ); Tue, 30 Aug 2022 17:55:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232243AbiH3Vxi (ORCPT ); Tue, 30 Aug 2022 17:53:38 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C28F6985AD for ; Tue, 30 Aug 2022 14:50:45 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-33dbe61eed8so188505257b3.1 for ; Tue, 30 Aug 2022 14:50:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc; bh=q3DJ62FdAkc75G5f5TC1WQpGOVt3l/MJJCNdx5cTnOk=; b=C8JONxxXVlTpftHaAGAIlpLkK3JbLire6Nmp7rtyqKkA8STqUN7GtKHqgck9OnhnK6 JLmVEdOBtO4UgVT+rqGSVVb8el9/E2EzKFM8T34Szjzm3zJ0p5kFz3mCYcxw5G4J1WvF QMb4GTQ/yzzPPSsO8n/bi291Hq8qngpZB5y0K7P/GbpAFTCQEKZCNPr9wH/IKiRw2v40 589a1BB4hctRlz5v1afevUX177R+LYQq/8EEHO7YXqeqRF8HSNYDOdN7xFassnCgRext TnhJSTiUhrw0wYWDFlFWpRGy66tj1fz8fdmGodc46WuVd0A4xLZ7WpBC/QBvDzjQGrdA SEig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc; bh=q3DJ62FdAkc75G5f5TC1WQpGOVt3l/MJJCNdx5cTnOk=; b=ZMJPUV+3QIlZnBbH7sKbYmSfLYhm2829g4UbOWWuIAsD3OIgTp3rt04YiQtjNGsAlW jxYX0/jv1Z/YB8AnqXvXTp2eaBE48xzBX0sI0kk3T5wcrh8lVmdQSHnPiEUs+4ygP96Z 3/r3SFvKrDY5LgRPBfOVajyXCQGNE/nUZucXOhmEF53ISm832nxYL7rjRcZ0ciKSjrwh I8sVPw7p8TfAoAxxTeRxuGTQoC4ydYS0oTQ4HTRnKBEkgxixILgl6t9xFwGlQ9rAJ56i DcR2ecZ63jjMh6UEVRhDna9n9y+QiXCstq3rFvsfV2lwMwSNobBzInfkXwkNlQwUXgcF B2gg== X-Gm-Message-State: ACgBeo3M1tt9rJNzsWWhJm11MoEiMDRD+C+UoemyfR5K8iyiw95BuQuq RNLy9pjYjR0EKgyUCE+uRgdGqSSRK0E= X-Google-Smtp-Source: AA6agR6mRIYh/Ugh76Z9OGxiRjbQrZ+9wpxKxcSIOnClnh/ne0QVaaMxuz4y8WxW9UVOyJ1glmO9CKBYbzs= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:a005:55b3:6c26:b3e4]) (user=surenb job=sendgmr) by 2002:a25:8a85:0:b0:671:715e:a1b0 with SMTP id h5-20020a258a85000000b00671715ea1b0mr12680068ybl.98.1661896243785; Tue, 30 Aug 2022 14:50:43 -0700 (PDT) Date: Tue, 30 Aug 2022 14:49:19 -0700 In-Reply-To: <20220830214919.53220-1-surenb@google.com> Mime-Version: 1.0 References: <20220830214919.53220-1-surenb@google.com> X-Mailer: git-send-email 2.37.2.672.g94769d06f0-goog Message-ID: <20220830214919.53220-31-surenb@google.com> Subject: [RFC PATCH 30/30] MAINTAINERS: Add entries for code tagging & related From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com, ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org, nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com, vschneid@redhat.com, cl@linux.com, penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com, elver@google.com, dvyukov@google.com, shakeelb@google.com, songmuchun@bytedance.com, arnd@arndb.de, jbaron@akamai.com, rientjes@google.com, minchan@google.com, kaleshsingh@google.com, surenb@google.com, kernel-team@android.com, linux-mm@kvack.org, iommu@lists.linux.dev, kasan-dev@googlegroups.com, io-uring@vger.kernel.org, linux-arch@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Kent Overstreet The new code & libraries added are being maintained - mark them as such. Signed-off-by: Kent Overstreet --- MAINTAINERS | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 589517372408..902c96744bcb 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5111,6 +5111,19 @@ S: Supported F: Documentation/process/code-of-conduct-interpretation.rst F: Documentation/process/code-of-conduct.rst +CODE TAGGING +M: Suren Baghdasaryan +M: Kent Overstreet +S: Maintained +F: lib/codetag.c +F: include/linux/codetag.h + +CODE TAGGING TIME STATS +M: Kent Overstreet +S: Maintained +F: lib/codetag_time_stats.c +F: include/linux/codetag_time_stats.h + COMEDI DRIVERS M: Ian Abbott M: H Hartley Sweeten @@ -11405,6 +11418,12 @@ M: John Hawley S: Maintained F: tools/testing/ktest +LAZY PERCPU COUNTERS +M: Kent Overstreet +S: Maintained +F: lib/lazy-percpu-counter.c +F: include/linux/lazy-percpu-counter.h + L3MDEV M: David Ahern L: netdev@vger.kernel.org @@ -13124,6 +13143,15 @@ F: include/linux/memblock.h F: mm/memblock.c F: tools/testing/memblock/ +MEMORY ALLOCATION TRACKING +M: Suren Baghdasaryan +M: Kent Overstreet +S: Maintained +F: lib/alloc_tag.c +F: lib/pgalloc_tag.c +F: include/linux/alloc_tag.h +F: include/linux/codetag_ctx.h + MEMORY CONTROLLER DRIVERS M: Krzysztof Kozlowski L: linux-kernel@vger.kernel.org @@ -20421,6 +20449,12 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/luca/wl12xx.git F: drivers/net/wireless/ti/ F: include/linux/wl12xx.h +TIME STATS +M: Kent Overstreet +S: Maintained +F: lib/time_stats.c +F: include/linux/time_stats.h + TIMEKEEPING, CLOCKSOURCE CORE, NTP, ALARMTIMER M: John Stultz M: Thomas Gleixner