From patchwork Fri Sep 29 18:00:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 13404689
From: Roman Gushchin
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
 Dennis Zhou, Andrew Morton, David Rientjes, Vlastimil Babka,
 Roman Gushchin
Subject: [PATCH v1 0/5] mm: improve performance of accounted kernel memory
 allocations
Date: Fri, 29 Sep 2023 11:00:50 -0700
Message-ID: <20230929180056.1122002-1-roman.gushchin@linux.dev>

This patchset improves the performance of accounted kernel memory
allocations by ~30% as measured by a micro-benchmark [1]. The benchmark
is very straightforward: 1M kmalloc() allocations of 64 bytes each.

Below are the results with kernel memory accounting disabled, in the
original state, and with this patchset applied.

|             | Kmem disabled | Original | Patched |  Delta |
|-------------+---------------+----------+---------+--------|
| User cgroup |         29764 |    84548 |   59078 | -30.0% |
| Root cgroup |         29742 |    48342 |   31501 | -34.8% |

All values are in microseconds (the minimum over 100 runs of the
benchmark in [1]); lower is better. Delta compares Patched to Original.

As we can see, the patchset removes the majority of the overhead when
there is no actual accounting (a task belongs to the root memory cgroup)
and almost halves the accounting overhead otherwise.

The main idea is to get rid of unnecessary memcg to objcg conversions
and switch to a scope-based protection of objcgs, which eliminates
extra operations with objcg reference counters under an RCU read lock.
More details are provided in the individual commit descriptions.

v1:
	- made the objcg update fully lockless
	- fixed !CONFIG_MMU build issues
rfc:
	https://lwn.net/Articles/945722/

---
[1]:

static int memory_alloc_test(struct seq_file *m, void *v)
{
	unsigned long i, j;
	void **ptrs;
	ktime_t start, end;
	s64 delta, min_delta = LLONG_MAX;

	/* Array holding the 1M object pointers; not part of the timed region. */
	ptrs = kvmalloc(sizeof(void *) * 1000000, GFP_KERNEL);
	if (!ptrs)
		return -ENOMEM;

	/* Repeat the run 100 times and report the fastest one. */
	for (j = 0; j < 100; j++) {
		start = ktime_get();
		/* Timed region: 1M accounted 64-byte allocations. */
		for (i = 0; i < 1000000; i++)
			ptrs[i] = kmalloc(64, GFP_KERNEL_ACCOUNT);
		end = ktime_get();

		delta = ktime_us_delta(end, start);
		if (delta < min_delta)
			min_delta = delta;

		/* Frees are not timed; kfree(NULL) is a no-op. */
		for (i = 0; i < 1000000; i++)
			kfree(ptrs[i]);
	}

	kvfree(ptrs);
	seq_printf(m, "%lld us\n", min_delta);

	return 0;
}

--
Signed-off-by: Roman Gushchin (Cruise)

Roman Gushchin (5):
  mm: kmem: optimize get_obj_cgroup_from_current()
  mm: kmem: add direct objcg pointer to task_struct
  mm: kmem: make memcg keep a reference to the original objcg
  mm: kmem: scoped objcg protection
  percpu: scoped objcg protection

 include/linux/memcontrol.h |  24 ++++-
 include/linux/sched.h      |   4 +
 mm/memcontrol.c            | 184 ++++++++++++++++++++++++++++++++-----
 mm/percpu.c                |   8 +-
 mm/slab.h                  |  10 +-
 5 files changed, 192 insertions(+), 38 deletions(-)
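
As a cross-check of the "removes the majority" and "almost halves"
statements in the cover letter, the accounting overhead can be derived
from the results table. The userspace sketch below is only an
illustration and not part of the series: it assumes that the
"Kmem disabled" column is the right baseline for the same row, and the
names bench_row, accounting_overhead(), overhead_reduction_pct() and
print_overheads() are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the results table; all times are in microseconds. */
struct bench_row {
	const char *name;
	long kmem_disabled;	/* assumed baseline: accounting disabled */
	long original;
	long patched;
};

/* Extra time attributed to accounting, relative to the baseline. */
static long accounting_overhead(long measured, long baseline)
{
	assert(measured >= baseline);
	return measured - baseline;
}

/* Share of the original accounting overhead that is gone, in percent. */
static double overhead_reduction_pct(const struct bench_row *r)
{
	long before = accounting_overhead(r->original, r->kmem_disabled);
	long after = accounting_overhead(r->patched, r->kmem_disabled);

	assert(before > 0);
	return 100.0 * (double)(before - after) / (double)before;
}

/* Values copied from the results table in the cover letter. */
static const struct bench_row rows[] = {
	{ "User cgroup", 29764, 84548, 59078 },
	{ "Root cgroup", 29742, 48342, 31501 },
};

/* Print the overhead before and after the series for every row. */
static void print_overheads(void)
{
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
		const struct bench_row *r = &rows[i];

		printf("%s: %ld us -> %ld us (%.1f%% less overhead)\n",
		       r->name,
		       accounting_overhead(r->original, r->kmem_disabled),
		       accounting_overhead(r->patched, r->kmem_disabled),
		       overhead_reduction_pct(r));
	}
}
```

Calling print_overheads() from a main() shows that, under the baseline
assumption above, the user cgroup overhead drops from 54784 us to
29314 us (about 46%) and the root cgroup overhead from 18600 us to
1759 us (about 91%).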