From: Roman Gushchin
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
    Dennis Zhou, Andrew Morton, David Rientjes, Vlastimil Babka,
    Naresh Kamboju, Roman Gushchin
Subject: [PATCH v2 0/5] mm: improve performance of accounted kernel memory allocations
Date: Mon, 9 Oct 2023 17:09:24 -0700
Message-ID: <20231010000929.450702-1-roman.gushchin@linux.dev>
X-Patchwork-Id: 13414694

This patchset improves the performance of accounted kernel memory
allocations by ~30% as measured by a micro-benchmark [1]. The benchmark
is very straightforward: 1M 64-byte kmalloc() allocations.

Below are the results (in microseconds, best of 100 runs) with kernel
memory accounting disabled, with the original code, and with this
patchset applied.

|             | Kmem disabled | Original | Patched | Delta  |
|-------------+---------------+----------+---------+--------|
| User cgroup | 29764         | 84548    | 59078   | -30.0% |
| Root cgroup | 29742         | 48342    | 31501   | -34.8% |

As we can see, the patchset removes the majority of the overhead when
there is no actual accounting (a task belongs to the root memory cgroup)
and almost halves the accounting overhead otherwise.

The main idea is to get rid of unnecessary memcg to objcg conversions
and switch to a scope-based protection of objcgs, which eliminates extra
operations on objcg reference counters under an RCU read lock. More
details are provided in the individual commit descriptions.

v2:
  - fixed a bug discovered by Naresh Kamboju
  - code changes asked for by Johannes (added comments, open-coded bit ops)
  - merged in a couple of small fixes

v1:
  - made the objcg update fully lockless
  - fixed !CONFIG_MMU build issues

rfc:
  https://lwn.net/Articles/945722/

---

[1]:

static int memory_alloc_test(struct seq_file *m, void *v)
{
	unsigned long i, j;
	void **ptrs;
	ktime_t start, end;
	s64 delta, min_delta = LLONG_MAX;

	/* Array holding the pointers of one batch of 1M allocations. */
	ptrs = kvmalloc(sizeof(void *) * 1000000, GFP_KERNEL);
	if (!ptrs)
		return -ENOMEM;

	for (j = 0; j < 100; j++) {
		/* Only the allocation loop is timed. */
		start = ktime_get();
		for (i = 0; i < 1000000; i++)
			ptrs[i] = kmalloc(64, GFP_KERNEL_ACCOUNT);
		end = ktime_get();

		/* Keep the best (smallest) time over all runs. */
		delta = ktime_us_delta(end, start);
		if (delta < min_delta)
			min_delta = delta;

		/* kfree(NULL) is a no-op, so failed allocations are harmless here. */
		for (i = 0; i < 1000000; i++)
			kfree(ptrs[i]);
	}

	kvfree(ptrs);
	seq_printf(m, "%lld us\n", min_delta);

	return 0;
}

--
Signed-off-by: Roman Gushchin (Cruise)

Roman Gushchin (5):
  mm: kmem: optimize get_obj_cgroup_from_current()
  mm: kmem: add direct objcg pointer to task_struct
  mm: kmem: make memcg keep a reference to the original objcg
  mm: kmem: scoped objcg protection
  percpu: scoped objcg protection

 include/linux/memcontrol.h |  14 ++-
 include/linux/sched.h      |   4 +
 mm/memcontrol.c            | 204 ++++++++++++++++++++++++++++++++-----
 mm/percpu.c                |   8 +-
 mm/slab.h                  |  10 +-
 5 files changed, 202 insertions(+), 38 deletions(-)
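
The overhead claims in the cover letter (most of the overhead removed
for the root cgroup, almost half otherwise) can be checked against the
table with a little arithmetic. The following is a minimal userspace
sketch, not part of the patchset. It assumes that "accounting overhead"
means the time above the "Kmem disabled" baseline. The names bench_row,
overhead_us(), overhead_removed_pct() and print_overhead_summary() are
hypothetical helpers made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One row of the results table; times are in microseconds. */
struct bench_row {
	const char *name;
	long kmem_disabled;	/* baseline: kernel memory accounting off */
	long original;		/* before the patchset */
	long patched;		/* with the patchset applied */
};

/* Values copied from the table in the cover letter. */
const struct bench_row rows[] = {
	{ "User cgroup", 29764, 84548, 59078 },
	{ "Root cgroup", 29742, 48342, 31501 },
};

/* Assumed definition of overhead: time above the kmem-disabled baseline. */
long overhead_us(long measured, long baseline)
{
	return measured - baseline;
}

/* Percentage of the original accounting overhead removed by the patchset. */
double overhead_removed_pct(const struct bench_row *r)
{
	long before = overhead_us(r->original, r->kmem_disabled);
	long after = overhead_us(r->patched, r->kmem_disabled);

	assert(before > 0);	/* guard the division below */
	return 100.0 * (double)(before - after) / (double)before;
}

/* Print one summary line per table row. */
void print_overhead_summary(void)
{
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%s: overhead %ld us -> %ld us (%.1f%% removed)\n",
		       rows[i].name,
		       overhead_us(rows[i].original, rows[i].kmem_disabled),
		       overhead_us(rows[i].patched, rows[i].kmem_disabled),
		       overhead_removed_pct(&rows[i]));
}
```

Calling print_overhead_summary() from a main() function shows that,
under this definition, a bit less than half of the overhead goes away
for the user cgroup row and roughly nine tenths for the root cgroup row,
which matches the wording above.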