From patchwork Thu Sep 29 03:00:27 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: xu xin
X-Patchwork-Id: 12993457
From: xu.xin.sc@gmail.com
X-Google-Original-From: xu.xin16@zte.com.cn
To: akpm@linux-foundation.org, david@redhat.com, imbrenda@linux.vnet.ibm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, xu xin, Xiaokai Ran, Yang Yang, Jiang Xuexin
Subject: [PATCH 2/3] ksm: add the accounting of zero pages merged by use_zero_pages
Date: Thu, 29 Sep 2022 03:00:27 +0000
Message-Id: <20220929030027.281387-1-xu.xin16@zte.com.cn>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220929025206.280970-1-xu.xin16@zte.com.cn>
References: <20220929025206.280970-1-xu.xin16@zte.com.cn>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: xu xin

Before enabling use_zero_pages by setting /sys/kernel/mm/ksm/use_zero_pages to 1, using
pages_sharing of KSM to indicate how many pages KSM saves is basically accurate. But once use_zero_pages is enabled, it is no longer accurate: empty (zeroed) pages that are merged with the kernel zero page are not counted in pages_sharing or pages_shared, because the rmap_items of these KSM zero pages are never appended to the stable tree of KSM. As a result, KSM's accounting of merged pages is incomplete and not transparent when use_zero_pages is enabled.

There are two ways to fix it. One is to count KSM zero pages in pages_sharing, but that breaks the definition of pages_sharing (the number of pages sharing KSM stable nodes). So we choose the second way: add a new interface "zero_pages_sharing" under /sys/kernel/mm/ksm/ to show the count.

To implement that, we introduce a new flag, SPECIAL_ZERO_FLAG, to mark these special zero pages (merged with the kernel zero page) for accounting, because they belong to neither the existing STABLE_FLAG nor UNSTABLE_FLAG.
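The flag-based accounting described above can be sketched in user space as follows. This is only an illustrative model, not the kernel implementation: `rmap_item_model`, `PAGE_MASK_MODEL`, `mark_zero_page()`, `clear_zero_page()` and `zero_pages_sharing_model` are hypothetical names, and a 4 KiB page size is assumed. Only the flag value 0x400 and the idea of clearing flags with the page mask come from this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model only; 4 KiB pages are an assumption. */
#define PAGE_SHIFT_MODEL	12UL
#define PAGE_MASK_MODEL		(~((1UL << PAGE_SHIFT_MODEL) - 1))
#define SPECIAL_ZERO_FLAG	0x400UL	/* flag value used by the patch */

/* Hypothetical stand-in for struct ksm_rmap_item: the address is
 * page aligned, so its low bits are free to carry flags. */
struct rmap_item_model {
	unsigned long address;
};

/* Hypothetical stand-in for ksm_zero_pages_sharing. */
static unsigned long zero_pages_sharing_model;

/* Called after a page was merged with the zero page: set the flag
 * and count the page once, even if it is scanned again later. */
static void mark_zero_page(struct rmap_item_model *item)
{
	assert(item != NULL);
	if (item->address & SPECIAL_ZERO_FLAG)
		return;
	item->address |= SPECIAL_ZERO_FLAG;
	zero_pages_sharing_model++;
}

/* Called when the page is no longer a zero page or the item is
 * dropped: masking with the page mask clears the flag bits. */
static void clear_zero_page(struct rmap_item_model *item)
{
	assert(item != NULL);
	if (!(item->address & SPECIAL_ZERO_FLAG))
		return;
	item->address &= PAGE_MASK_MODEL;
	zero_pages_sharing_model--;
}
```

Keeping the flag in the low bits of the address means no extra field is needed per rmap_item, and the guard in each helper keeps the counter balanced when the same item is visited repeatedly.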

Fixes: e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
Co-developed-by: Xiaokai Ran
Signed-off-by: Xiaokai Ran
Co-developed-by: Yang Yang
Signed-off-by: Yang Yang
Co-developed-by: Jiang Xuexin
Signed-off-by: Jiang Xuexin
Signed-off-by: xu xin
---
 mm/ksm.c | 98 +++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 79 insertions(+), 19 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 5b68482d2b3b..88153d2b497f 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -213,6 +213,7 @@ struct ksm_rmap_item {
 #define SEQNR_MASK	0x0ff	/* low bits of unstable tree seqnr */
 #define UNSTABLE_FLAG	0x100	/* is a node of the unstable tree */
 #define STABLE_FLAG	0x200	/* is listed from the stable tree */
+#define SPECIAL_ZERO_FLAG 0x400 /* specially treated zero page */
 
 /* The stable and unstable tree heads */
 static struct rb_root one_stable_tree[1] = { RB_ROOT };
@@ -274,6 +275,9 @@ static unsigned int zero_checksum __read_mostly;
 /* Whether to merge empty (zeroed) pages with actual zero pages */
 static bool ksm_use_zero_pages __read_mostly;
 
+/* The number of empty (zeroed) pages merged but not in the stable tree */
+static unsigned long ksm_zero_pages_sharing;
+
 #ifdef CONFIG_NUMA
 /* Zeroed when merging across nodes is not allowed */
 static unsigned int ksm_merge_across_nodes = 1;
@@ -796,6 +800,10 @@ static void remove_trailing_rmap_items(struct ksm_rmap_item **rmap_list)
 		struct ksm_rmap_item *rmap_item = *rmap_list;
 		*rmap_list = rmap_item->rmap_list;
 		remove_rmap_item_from_tree(rmap_item);
+		if (rmap_item->address & SPECIAL_ZERO_FLAG) {
+			rmap_item->address &= PAGE_MASK;
+			ksm_zero_pages_sharing--;
+		}
 		free_rmap_item(rmap_item);
 	}
 }
@@ -2017,6 +2025,39 @@ static void stable_tree_append(struct ksm_rmap_item *rmap_item,
 	rmap_item->mm->ksm_merging_pages++;
 }
 
+static int try_to_merge_with_kernel_zero_page(struct mm_struct *mm,
+					      struct ksm_rmap_item *rmap_item,
+					      struct page *page)
+{
+	int err = 0;
+
+	if (!(rmap_item->address & SPECIAL_ZERO_FLAG)) {
+		struct vm_area_struct *vma;
+
+		mmap_read_lock(mm);
+		vma = find_mergeable_vma(mm, rmap_item->address);
+		if (vma) {
+			err = try_to_merge_one_page(vma, page,
+					ZERO_PAGE(rmap_item->address));
+		} else {
+			/* If the vma is out of date, we do not need to continue. */
+			err = 0;
+		}
+		mmap_read_unlock(mm);
+		/*
+		 * In case of failure, the page was not really empty, so we
+		 * need to continue. Otherwise we're done.
+		 */
+		if (!err) {
+			rmap_item->address |= SPECIAL_ZERO_FLAG;
+			ksm_zero_pages_sharing++;
+		}
+
+	}
+
+	return err;
+}
+
 /*
  * cmp_and_merge_page - first see if page can be merged into the stable tree;
  * if not, compare checksum to previous and if it's the same, see if page can
@@ -2101,29 +2142,22 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
 	 * Same checksum as an empty page. We attempt to merge it with the
 	 * appropriate zero page if the user enabled this via sysfs.
 	 */
-	if (ksm_use_zero_pages && (checksum == zero_checksum)) {
-		struct vm_area_struct *vma;
-
-		mmap_read_lock(mm);
-		vma = find_mergeable_vma(mm, rmap_item->address);
-		if (vma) {
-			err = try_to_merge_one_page(vma, page,
-					ZERO_PAGE(rmap_item->address));
-		} else {
+	if (ksm_use_zero_pages) {
+		if (checksum == zero_checksum) {
+			/* On success, just return. Otherwise, continue. */
+			if (!try_to_merge_with_kernel_zero_page(mm, rmap_item, page))
+				return;
+		} else if (rmap_item->address & SPECIAL_ZERO_FLAG) {
 			/*
-			 * If the vma is out of date, we do not need to
-			 * continue.
+			 * The page is no longer a zero page (it was modified),
+			 * but its rmap_item still carries the zero-page flag, so
+			 * reset the flag and update the corresponding count.
 			 */
-			err = 0;
+			rmap_item->address &= PAGE_MASK;
+			ksm_zero_pages_sharing--;
 		}
-		mmap_read_unlock(mm);
-		/*
-		 * In case of failure, the page was not really empty, so we
-		 * need to continue. Otherwise we're done.
-		 */
-		if (!err)
-			return;
 	}
+
 	tree_rmap_item =
 		unstable_tree_search_insert(rmap_item, page, &tree_page);
 	if (tree_rmap_item) {
@@ -2336,6 +2370,24 @@ static struct ksm_rmap_item *scan_get_next_rmap_item(struct page **page)
 				mmap_read_unlock(mm);
 				return rmap_item;
 			}
+			/*
+			 * KSM zero pages are not anonymous, but we still want
+			 * to count them, so we must also try to return the
+			 * rmap_items of kernel zero pages that replaced their
+			 * original anonymous empty pages because of
+			 * use_zero_pages.
+			 */
+			if (is_zero_pfn(page_to_pfn(*page))) {
+				rmap_item = try_to_get_old_rmap_item(
+						ksm_scan.address,
+						ksm_scan.rmap_list);
+				if (rmap_item->address & SPECIAL_ZERO_FLAG) {
+					ksm_scan.rmap_list = &rmap_item->rmap_list;
+					ksm_scan.address += PAGE_SIZE;
+					mmap_read_unlock(mm);
+					return rmap_item;
+				}
+			}
 next_page:
 			put_page(*page);
 			ksm_scan.address += PAGE_SIZE;
@@ -3115,6 +3167,13 @@ static ssize_t pages_volatile_show(struct kobject *kobj,
 }
 KSM_ATTR_RO(pages_volatile);
 
+static ssize_t zero_pages_sharing_show(struct kobject *kobj,
+				       struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%lu\n", ksm_zero_pages_sharing);
+}
+KSM_ATTR_RO(zero_pages_sharing);
+
 static ssize_t stable_node_dups_show(struct kobject *kobj,
 				     struct kobj_attribute *attr, char *buf)
 {
@@ -3175,6 +3234,7 @@ static struct attribute *ksm_attrs[] = {
 	&merge_across_nodes_attr.attr,
 #endif
 	&max_page_sharing_attr.attr,
+	&zero_pages_sharing_attr.attr,
 	&stable_node_chains_attr.attr,
 	&stable_node_dups_attr.attr,
 	&stable_node_chains_prune_millisecs_attr.attr,
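
For reference, a user-space consumer of the new counter might look like the sketch below. It is not part of the patch: `read_counter()` and `ksm_total_sharing()` are hypothetical helpers, and the sketch assumes that zero_pages_sharing appears next to pages_sharing in the given directory (normally /sys/kernel/mm/ksm) only on kernels carrying this change, so a missing file is treated as zero.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one unsigned decimal counter from
 * "<dir>/<name>". Returns 0 on success, -1 on any failure. */
static int read_counter(const char *dir, const char *name,
			unsigned long *out)
{
	char path[256];
	FILE *f;
	int n, ok;

	assert(dir && name && out);
	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%lu", out) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Hypothetical helper: pages_sharing plus zero_pages_sharing.
 * The second file is assumed to exist only with this patch
 * applied, so failing to read it leaves its value at 0. */
static int ksm_total_sharing(const char *dir, unsigned long *total)
{
	unsigned long sharing = 0, zero = 0;

	if (read_counter(dir, "pages_sharing", &sharing))
		return -1;
	(void)read_counter(dir, "zero_pages_sharing", &zero);
	*total = sharing + zero;
	return 0;
}
```

Calling `ksm_total_sharing("/sys/kernel/mm/ksm", &total)` would then give the combined number of merged pages. The two files are read separately, so the sum is only approximate while ksmd is running.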