From patchwork Sat Mar 4 03:48:34 2023
X-Patchwork-Submitter: Sergey Senozhatsky
X-Patchwork-Id: 13159781
From: Sergey Senozhatsky
To: Minchan Kim, Andrew Morton
Cc: Yosry Ahmed, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Sergey Senozhatsky
Subject: [PATCHv4 3/4] zsmalloc: rework compaction algorithm
Date: Sat, 4 Mar 2023 12:48:34 +0900
Message-Id: <20230304034835.2082479-4-senozhatsky@chromium.org>
In-Reply-To: <20230304034835.2082479-1-senozhatsky@chromium.org>
References: <20230304034835.2082479-1-senozhatsky@chromium.org>
MIME-Version: 1.0

The zsmalloc compaction algorithm has the potential to waste some CPU
cycles, particularly when compacting pages within the same fullness
group. This is due to the way it selects the head page of the fullness
list for source and destination pages, and how it reinserts those pages
during each iteration.
The algorithm may first use a page as a migration destination and then
as a migration source, leading to an unnecessary back-and-forth movement
of objects. Consider the following fullness list:

PageA PageB PageC PageD PageE

During the first iteration, the compaction algorithm will select PageA
as the source and PageB as the destination. All of PageA's objects will
be moved to PageB, and then PageA will be released while PageB is
reinserted into the fullness list.

PageB PageC PageD PageE

During the next iteration, the compaction algorithm will again select
the head of the list as the source and destination, meaning that PageB
will now serve as the source and PageC as the destination. This will
result in the objects being moved away from PageB, the same objects
that were just moved to PageB in the previous iteration.

To prevent this avalanche effect, the compaction algorithm should not
reinsert the destination page between iterations. By doing so, the most
optimal page will continue to be used and its usage ratio will increase,
reducing internal fragmentation. The destination page should only be
reinserted into the fullness list if:
- It becomes full
- No source page is available

TEST
====

It's very challenging to reliably test this series. I ended up
developing my own synthetic test that has 100% reproducibility. The test
generates significant fragmentation (for each size class) and then
performs compaction for each class individually and tracks the number of
memcpy() calls in zs_object_copy(), so that we can compare the amount of
work compaction does on a per-class basis.

Total amount of work (zram mm_stat objs_moved)
----------------------------------------------

Old fullness grouping, old compaction algorithm:
323977 memcpy() in zs_object_copy().

Old fullness grouping, new compaction algorithm:
262944 memcpy() in zs_object_copy().

New fullness grouping, new compaction algorithm:
213978 memcpy() in zs_object_copy().
Per-class compaction memcpy() comparison (T-test)
-------------------------------------------------

x Old fullness grouping, old compaction algorithm
+ Old fullness grouping, new compaction algorithm

    N           Min           Max        Median           Avg        Stddev
x 140           349          3513          2461     2314.1214     806.03271
+ 140           289          2778          2006     1878.1714     641.02073
Difference at 95.0% confidence
	-435.95 +/- 170.595
	-18.8387% +/- 7.37193%
	(Student's t, pooled s = 728.216)

x Old fullness grouping, old compaction algorithm
+ New fullness grouping, new compaction algorithm

    N           Min           Max        Median           Avg        Stddev
x 140           349          3513          2461     2314.1214     806.03271
+ 140           226          2279          1644     1528.4143     524.85268
Difference at 95.0% confidence
	-785.707 +/- 159.331
	-33.9527% +/- 6.88516%
	(Student's t, pooled s = 680.132)

Signed-off-by: Sergey Senozhatsky
---
 mm/zsmalloc.c | 78 ++++++++++++++++++++++++---------------------------
 1 file changed, 36 insertions(+), 42 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index cc59336a966a..a61540afbb28 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1782,15 +1782,14 @@ struct zs_compact_control {
 	int obj_idx;
 };
 
-static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
-				struct zs_compact_control *cc)
+static void migrate_zspage(struct zs_pool *pool, struct size_class *class,
+			   struct zs_compact_control *cc)
 {
 	unsigned long used_obj, free_obj;
 	unsigned long handle;
 	struct page *s_page = cc->s_page;
 	struct page *d_page = cc->d_page;
 	int obj_idx = cc->obj_idx;
-	int ret = 0;
 
 	while (1) {
 		handle = find_alloced_obj(class, s_page, &obj_idx);
@@ -1803,10 +1802,8 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 		}
 
 		/* Stop if there is no more space */
-		if (zspage_full(class, get_zspage(d_page))) {
-			ret = -ENOMEM;
+		if (zspage_full(class, get_zspage(d_page)))
 			break;
-		}
 
 		used_obj = handle_to_obj(handle);
 		free_obj = obj_malloc(pool, get_zspage(d_page), handle);
@@ -1819,8 +1816,6 @@ static int migrate_zspage(struct zs_pool *pool, struct size_class *class,
 	/* Remember last position in this iteration */
 	cc->s_page = s_page;
 	cc->obj_idx = obj_idx;
-
-	return ret;
 }
 
 static struct zspage *isolate_src_zspage(struct size_class *class)
@@ -2216,7 +2211,7 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 				  struct size_class *class)
 {
 	struct zs_compact_control cc;
-	struct zspage *src_zspage;
+	struct zspage *src_zspage = NULL;
 	struct zspage *dst_zspage = NULL;
 	unsigned long pages_freed = 0;
 
@@ -2225,50 +2220,45 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 	 * as well as zpage allocation/free
 	 */
 	spin_lock(&pool->lock);
-	while ((src_zspage = isolate_src_zspage(class))) {
-		/* protect someone accessing the zspage(i.e., zs_map_object) */
-		migrate_write_lock(src_zspage);
+	while (zs_can_compact(class)) {
+		int fg;
 
-		if (!zs_can_compact(class))
+		if (!dst_zspage) {
+			dst_zspage = isolate_dst_zspage(class);
+			if (!dst_zspage)
+				break;
+			migrate_write_lock(dst_zspage);
+			cc.d_page = get_first_page(dst_zspage);
+		}
+
+		src_zspage = isolate_src_zspage(class);
+		if (!src_zspage)
 			break;
 
+		migrate_write_lock_nested(src_zspage);
+
 		cc.obj_idx = 0;
 		cc.s_page = get_first_page(src_zspage);
+		migrate_zspage(pool, class, &cc);
+		fg = putback_zspage(class, src_zspage);
+		migrate_write_unlock(src_zspage);
 
-		while ((dst_zspage = isolate_dst_zspage(class))) {
-			migrate_write_lock_nested(dst_zspage);
-
-			cc.d_page = get_first_page(dst_zspage);
-			/*
-			 * If there is no more space in dst_page, resched
-			 * and see if anyone had allocated another zspage.
-			 */
-			if (!migrate_zspage(pool, class, &cc))
-				break;
+		if (fg == ZS_INUSE_RATIO_0) {
+			free_zspage(pool, class, src_zspage);
+			pages_freed += class->pages_per_zspage;
+			src_zspage = NULL;
+		}
 
+		if (get_fullness_group(class, dst_zspage) == ZS_INUSE_RATIO_100
+		    || spin_is_contended(&pool->lock)) {
 			putback_zspage(class, dst_zspage);
 			migrate_write_unlock(dst_zspage);
 			dst_zspage = NULL;
 
-			if (spin_is_contended(&pool->lock))
-				break;
-		}
-
-		/* Stop if we couldn't find slot */
-		if (dst_zspage == NULL)
-			break;
-
-		putback_zspage(class, dst_zspage);
-		migrate_write_unlock(dst_zspage);
-
-		if (putback_zspage(class, src_zspage) == ZS_INUSE_RATIO_0) {
-			migrate_write_unlock(src_zspage);
-			free_zspage(pool, class, src_zspage);
-			pages_freed += class->pages_per_zspage;
-		} else
-			migrate_write_unlock(src_zspage);
-
-		spin_unlock(&pool->lock);
-		cond_resched();
-		spin_lock(&pool->lock);
+			spin_unlock(&pool->lock);
+			cond_resched();
+			spin_lock(&pool->lock);
+		}
 	}
 
 	if (src_zspage) {
@@ -2276,6 +2266,10 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		migrate_write_unlock(src_zspage);
 	}
 
+	if (dst_zspage) {
+		putback_zspage(class, dst_zspage);
+		migrate_write_unlock(dst_zspage);
+	}
 	spin_unlock(&pool->lock);
 
 	return pages_freed;