From patchwork Thu Jun 15 03:49:05 2023
X-Patchwork-Submitter: Zhongkun He
X-Patchwork-Id: 13280678
From: Zhongkun He <hezhongkun.hzk@bytedance.com>
To: minchan@kernel.org, senozhatsky@chromium.org, mhocko@suse.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Zhongkun He <hezhongkun.hzk@bytedance.com>
Subject: [RFC PATCH 3/3] zsmalloc: charge the zspage's object to obj_cgroup
Date: Thu, 15 Jun 2023 11:49:05 +0800
Message-Id: <20230615034905.1362034-1-hezhongkun.hzk@bytedance.com>

The zspage's objects are not charged to any memory cgroup. As a result,
a task's memory has no effective limit once the zram swap device is
large enough, even when the task itself is constrained by a memory
cgroup. So the zspage's objects should be charged to an obj_cgroup,
just like slub's objects are.
Signed-off-by: Zhongkun He <hezhongkun.hzk@bytedance.com>
---
 mm/zsmalloc.c | 196 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 191 insertions(+), 5 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 02f7f414aade..a5d1c484dd8f 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -62,6 +62,7 @@
 #include
 #include
 #include
+#include

 #define ZSPAGE_MAGIC	0x58
@@ -311,6 +312,7 @@ static bool ZsHugePage(struct zspage *zspage)
 static void migrate_lock_init(struct zspage *zspage);
 static void migrate_read_lock(struct zspage *zspage);
 static void migrate_read_unlock(struct zspage *zspage);
+static void zs_free_page_obj_cgroups(struct zspage *zspage);

 #ifdef CONFIG_COMPACTION
 static void migrate_write_lock(struct zspage *zspage);
@@ -973,6 +975,7 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
 	free_handles(pool, class, zspage);

 	next = page = get_first_page(zspage);
+	zs_free_page_obj_cgroups(zspage);
 	do {
 		VM_BUG_ON_PAGE(!PageLocked(page), page);
 		next = get_next_page(page);
@@ -1420,6 +1423,170 @@ size_t zs_huge_class_size(struct zs_pool *pool)
 }
 EXPORT_SYMBOL_GPL(zs_huge_class_size);

+
+#ifdef CONFIG_MEMCG_KMEM
+
+static inline struct obj_cgroup **page_objcgs_check(struct page *page)
+{
+	unsigned long memcg_data = READ_ONCE(page->memcg_data);
+
+	if (!memcg_data || !(memcg_data & MEMCG_DATA_OBJCGS))
+		return NULL;
+
+	VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page);
+
+	return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
+}
+
+
+static void zs_free_page_obj_cgroups(struct zspage *zspage)
+{
+	struct obj_cgroup **objcgs;
+	struct page *page = zspage->first_page;
+
+	objcgs = page_objcgs_check(page);
+	if (!objcgs)
+		return;
+
+	kfree(objcgs);
+	page->memcg_data = 0;
+}
+
+static inline size_t zs_obj_full_size(struct size_class *class)
+{
+	return class->size + sizeof(struct obj_cgroup *);
+}
+
+static inline void zs_objcg_exchange(struct page *s_page, unsigned int s_index,
+				     struct page *d_page, unsigned int d_index)
+{
+	struct page *sf_page, *df_page;
+	struct obj_cgroup **s_objcgs, **d_objcgs;
+
+	sf_page = get_first_page(get_zspage(s_page));
+	df_page = get_first_page(get_zspage(d_page));
+
+	s_objcgs = page_objcgs_check(sf_page);
+	d_objcgs = page_objcgs_check(df_page);
+
+	if (!s_objcgs || !d_objcgs)
+		return;
+
+	d_objcgs[d_index] = s_objcgs[s_index];
+	s_objcgs[s_index] = NULL;
+}
+
+void zs_alloc_obj_cgroups(struct zspage *zspage, struct size_class *class)
+{
+	struct page *page = zspage->first_page;
+	int objects = class->objs_per_zspage;
+	unsigned long memcg_data = 0;
+	void *vec;
+
+	vec = kcalloc_node(objects, sizeof(struct obj_cgroup *),
+			   GFP_NOWAIT|__GFP_NOWARN|
+			   __GFP_ZERO|__GFP_MOVABLE,
+			   page_to_nid(page));
+
+	if (vec)
+		memcg_data = (unsigned long)vec | MEMCG_DATA_OBJCGS;
+	page->memcg_data = memcg_data;
+}
+
+static inline bool zs_obj_pre_alloc_hook(struct size_class *class,
+					 struct obj_cgroup **objcgp, gfp_t flags)
+{
+	struct obj_cgroup *objcg;
+
+	*objcgp = NULL;
+
+	objcg = get_obj_cgroup_from_current();
+	if (!objcg)
+		return true;
+
+	if (obj_cgroup_charge(objcg, flags, zs_obj_full_size(class))) {
+		obj_cgroup_put(objcg);
+		return false;
+	}
+
+	*objcgp = objcg;
+	return true;
+}
+
+static inline void zs_obj_post_alloc_hook(struct size_class *class,
+					  struct zspage *zspage,
+					  unsigned int idx,
+					  struct obj_cgroup *objcg)
+{
+	struct obj_cgroup **objcgs = NULL;
+
+	if (!objcg)
+		return;
+
+	if (zspage)
+		objcgs = page_objcgs_check(zspage->first_page);
+
+	if (!objcgs) {
+		obj_cgroup_uncharge(objcg, zs_obj_full_size(class));
+		obj_cgroup_put(objcg);
+		return;
+	}
+
+	objcgs[idx] = objcg;
+}
+
+static inline void zs_obj_free_hook(int class_size, struct zspage *zspage,
+				    unsigned int idx)
+{
+	struct obj_cgroup **objcgs, *objcg;
+
+	objcgs = page_objcgs_check(zspage->first_page);
+
+	if (WARN_ON(!objcgs))
+		return;
+
+	objcg = objcgs[idx];
+	if (!objcg)
+		return;
+
+	objcgs[idx] = NULL;
+	obj_cgroup_uncharge(objcg, class_size + sizeof(struct obj_cgroup *));
+	obj_cgroup_put(objcg);
+}
+#else
+
+static void zs_free_page_obj_cgroups(struct zspage *zspage)
+{
+}
+
+static inline void zs_objcg_exchange(struct page *s_page, unsigned int s_index,
+				     struct page *d_page, unsigned int d_index)
+{
+}
+
+int zs_alloc_obj_cgroups(struct zspage *zspage, struct size_class *class)
+{
+	return 0;
+}
+
+static inline bool zs_obj_pre_alloc_hook(struct size_class *class,
+					 struct obj_cgroup **objcgp, gfp_t flags)
+{
+	return true;
+}
+
+static inline void zs_obj_post_alloc_hook(struct size_class *class,
+					  struct page *page,
+					  unsigned int idx,
+					  struct obj_cgroup *objcg)
+{
+}
+static inline void zs_obj_free_hook(int class_size, struct zspage *zspage,
+				    unsigned int idx)
+{
+}
+#endif
+
 static unsigned long obj_malloc(struct zs_pool *pool,
 				struct zspage *zspage, unsigned long handle)
 {
@@ -1475,7 +1642,8 @@ static unsigned long obj_malloc(struct zs_pool *pool,
  */
 unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 {
-	unsigned long handle, obj;
+	unsigned long handle, obj, index;
+	struct obj_cgroup *objcg;
 	struct size_class *class;
 	int newfg;
 	struct zspage *zspage;
@@ -1491,17 +1659,23 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 	size += ZS_HANDLE_SIZE;
 	class = pool->size_class[get_size_class_index(size)];

+	if (!zs_obj_pre_alloc_hook(class, &objcg, gfp)) {
+		cache_free_handle(pool, handle);
+		return (unsigned long)ERR_PTR(-ENOMEM);
+	}
+
 	/* pool->lock effectively protects the zpage migration */
 	spin_lock(&pool->lock);
 	zspage = find_get_zspage(class);
 	if (likely(zspage)) {
+		index = get_freeobj(zspage);
 		obj = obj_malloc(pool, zspage, handle);
 		/* Now move the zspage to another fullness group, if required */
 		fix_fullness_group(class, zspage);
 		record_obj(handle, obj);
 		class_stat_inc(class, ZS_OBJS_INUSE, 1);

-		goto out;
+		goto out_unlock;
 	}

 	spin_unlock(&pool->lock);
@@ -1509,9 +1683,12 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 	zspage = alloc_zspage(pool, class, gfp);
 	if (!zspage) {
 		cache_free_handle(pool, handle);
-		return (unsigned long)ERR_PTR(-ENOMEM);
+		handle = (unsigned long)ERR_PTR(-ENOMEM);
+		goto out;
 	}

+	zs_alloc_obj_cgroups(zspage, class);
+	index = get_freeobj(zspage);
 	spin_lock(&pool->lock);
 	obj = obj_malloc(pool, zspage, handle);
 	newfg = get_fullness_group(class, zspage);
@@ -1524,7 +1701,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 	/* We completely set up zspage so mark them as movable */
 	SetZsPageMovable(pool, zspage);
-out:
+out_unlock:
 #ifdef CONFIG_ZPOOL
 	/* Add/move zspage to beginning of LRU */
 	if (!list_empty(&zspage->lru))
@@ -1533,6 +1710,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 #endif

 	spin_unlock(&pool->lock);
+out:
+	zs_obj_post_alloc_hook(class, zspage, index, objcg);

 	return handle;
 }
@@ -1573,7 +1752,7 @@ static void obj_free(int class_size, unsigned long obj, unsigned long *handle)
 		f_page->index = 0;
 		set_freeobj(zspage, f_objidx);
 	}
-
+	zs_obj_free_hook(class_size, zspage, f_objidx);
 	kunmap_atomic(vaddr);
 	mod_zspage_inuse(zspage, -1);
 }
@@ -1640,6 +1819,7 @@ static void zs_object_copy(struct size_class *class, unsigned long dst,
 	obj_to_location(src, &s_page, &s_objidx);
 	obj_to_location(dst, &d_page, &d_objidx);
+	zs_objcg_exchange(s_page, s_objidx, d_page, d_objidx);

 	s_off = (class->size * s_objidx) & ~PAGE_MASK;
 	d_off = (class->size * d_objidx) & ~PAGE_MASK;
@@ -1960,6 +2140,12 @@ static void replace_sub_page(struct size_class *class, struct zspage *zspage,
 	int idx = 0;

 	page = get_first_page(zspage);
+#ifdef CONFIG_MEMCG_KMEM
+	if (page == oldpage) {
+		newpage->memcg_data = page->memcg_data;
+		page->memcg_data = 0;
+	}
+#endif
 	do {
 		if (page == oldpage)
 			pages[idx] = newpage;