From patchwork Fri Jun 7 12:23:55 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13689817
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, David Hildenbrand, Andrew Morton, Jonathan Corbet
Subject: [PATCH v1 4/6] fs/proc/task_mmu: account non-present entries as "maybe shared, but no idea how often"
Date: Fri, 7 Jun 2024 14:23:55 +0200
Message-ID: <20240607122357.115423-5-david@redhat.com>
In-Reply-To: <20240607122357.115423-1-david@redhat.com>
References: <20240607122357.115423-1-david@redhat.com>

We currently rely on mapcount information for pages referenced by
non-present entries to calculate the USS (shared vs. private) and the PSS.

However, relying on mapcounts for non-present entries doesn't make any
sense. We have to treat such entries as "maybe shared, but no idea how
often", implying that they will *not* get accounted towards the USS, and
will get fully accounted to the PSS (no idea how often shared).

There is one exception: device exclusive entries essentially behave like
present entries (e.g., mapcount incremented).

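For illustration only, the accounting rule described above can be modeled in
user space roughly as follows. This is a minimal sketch, not the kernel code:
struct sketch_stats, sketch_account() and SKETCH_PSS_SHIFT are hypothetical
names, the shift value is an assumption, and the refcount == 1 shortcut taken
by smaps_account() is folded into the mapcount check for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed-point shift for the PSS; modeled on PSS_SHIFT, value assumed. */
#define SKETCH_PSS_SHIFT 12

/* Hypothetical, simplified stand-in for struct mem_size_stats. */
struct sketch_stats {
	uint64_t private_bytes;	/* contributes to the USS */
	uint64_t shared_bytes;	/* does not contribute to the USS */
	uint64_t pss;		/* fixed point, shifted by SKETCH_PSS_SHIFT */
};

/*
 * Account one mapping of "size" bytes. "mapcount" is only consulted for
 * present entries; for non-present entries it is deliberately ignored.
 */
static void sketch_account(struct sketch_stats *s, uint64_t size,
			   bool present, unsigned int mapcount)
{
	uint64_t pss = size << SKETCH_PSS_SHIFT;

	if (!present) {
		/* Maybe shared, no idea how often: full PSS, nothing to USS. */
		s->shared_bytes += size;
		s->pss += pss;
		return;
	}

	/* A present entry implies at least one mapping. */
	assert(mapcount >= 1);
	if (mapcount >= 2) {
		/* Shared: split the PSS evenly between all mappers. */
		s->shared_bytes += size;
		s->pss += pss / mapcount;
	} else {
		/* Mapped exactly once: private, full PSS. */
		s->private_bytes += size;
		s->pss += pss;
	}
}
```

With this model, a non-present 4 KiB entry adds 4 KiB to the shared total and
the full 4 KiB (shifted) to the PSS, whatever mapcount value is passed in.
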
In smaps_pmd_entry(), use is_pfn_swap_entry() instead of
is_migration_entry(), which should not make a real difference but makes the
code look more similar to the PTE variant.

While at it, adjust the comments in smaps_account().

Signed-off-by: David Hildenbrand
---
 fs/proc/task_mmu.c | 53 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index f427176ce2c34..67d9b406c7586 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -442,7 +442,7 @@ static void smaps_page_accumulate(struct mem_size_stats *mss,
 
 static void smaps_account(struct mem_size_stats *mss, struct page *page,
 		bool compound, bool young, bool dirty, bool locked,
-		bool migration)
+		bool present)
 {
 	struct folio *folio = page_folio(page);
 	int i, nr = compound ? compound_nr(page) : 1;
@@ -471,22 +471,27 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 	 * Then accumulate quantities that may depend on sharing, or that may
 	 * differ page-by-page.
 	 *
-	 * refcount == 1 guarantees the page is mapped exactly once.
-	 * If any subpage of the compound page mapped with PTE it would elevate
-	 * the refcount.
+	 * refcount == 1 for present entries guarantees that the folio is mapped
+	 * exactly once. For large folios this implies that exactly one
+	 * PTE/PMD/... maps (a part of) this folio.
 	 *
-	 * The page_mapcount() is called to get a snapshot of the mapcount.
-	 * Without holding the page lock this snapshot can be slightly wrong as
-	 * we cannot always read the mapcount atomically. It is not safe to
-	 * call page_mapcount() even with PTL held if the page is not mapped,
-	 * especially for migration entries. Treat regular migration entries
-	 * as mapcount == 1.
+	 * Treat all non-present entries (where relying on the mapcount and
+	 * refcount doesn't make sense) as "maybe shared, but not sure how
+	 * often". We treat device private entries as being fake-present.
+	 *
+	 * Note that it would not be safe to read the mapcount especially for
+	 * pages referenced by migration entries, even with the PTL held.
 	 */
-	if ((folio_ref_count(folio) == 1) || migration) {
+	if (folio_ref_count(folio) == 1 || !present) {
 		smaps_page_accumulate(mss, folio, size, size << PSS_SHIFT,
-			dirty, locked, true);
+			dirty, locked, present);
 		return;
 	}
+	/*
+	 * The page_mapcount() is called to get a snapshot of the mapcount.
+	 * Without holding the folio lock this snapshot can be slightly wrong as
+	 * we cannot always read the mapcount atomically.
+	 */
 	for (i = 0; i < nr; i++, page++) {
 		int mapcount = page_mapcount(page);
 		unsigned long pss = PAGE_SIZE << PSS_SHIFT;
@@ -531,13 +536,14 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 	struct vm_area_struct *vma = walk->vma;
 	bool locked = !!(vma->vm_flags & VM_LOCKED);
 	struct page *page = NULL;
-	bool migration = false, young = false, dirty = false;
+	bool present = false, young = false, dirty = false;
 	pte_t ptent = ptep_get(pte);
 
 	if (pte_present(ptent)) {
 		page = vm_normal_page(vma, addr, ptent);
 		young = pte_young(ptent);
 		dirty = pte_dirty(ptent);
+		present = true;
 	} else if (is_swap_pte(ptent)) {
 		swp_entry_t swpent = pte_to_swp_entry(ptent);
 
@@ -555,8 +561,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 				mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT;
 			}
 		} else if (is_pfn_swap_entry(swpent)) {
-			if (is_migration_entry(swpent))
-				migration = true;
+			if (is_device_private_entry(swpent))
+				present = true;
 			page = pfn_swap_entry_to_page(swpent);
 		}
 	} else {
@@ -567,7 +573,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 	if (!page)
 		return;
 
-	smaps_account(mss, page, false, young, dirty, locked, migration);
+	smaps_account(mss, page, false, young, dirty, locked, present);
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -578,18 +584,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 	struct vm_area_struct *vma = walk->vma;
 	bool locked = !!(vma->vm_flags & VM_LOCKED);
 	struct page *page = NULL;
+	bool present = false;
 	struct folio *folio;
-	bool migration = false;
 
 	if (pmd_present(*pmd)) {
 		page = vm_normal_page_pmd(vma, addr, *pmd);
+		present = true;
 	} else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) {
 		swp_entry_t entry = pmd_to_swp_entry(*pmd);
 
-		if (is_migration_entry(entry)) {
-			migration = true;
+		if (is_pfn_swap_entry(entry))
 			page = pfn_swap_entry_to_page(entry);
-		}
 	}
 	if (IS_ERR_OR_NULL(page))
 		return;
@@ -604,7 +609,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 		mss->file_thp += HPAGE_PMD_SIZE;
 
 	smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd),
-		      locked, migration);
+		      locked, present);
 }
 #else
 static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
@@ -732,17 +737,21 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 	struct vm_area_struct *vma = walk->vma;
 	pte_t ptent = huge_ptep_get(pte);
 	struct folio *folio = NULL;
+	bool present = false;
 
 	if (pte_present(ptent)) {
 		folio = page_folio(pte_page(ptent));
+		present = true;
 	} else if (is_swap_pte(ptent)) {
 		swp_entry_t swpent = pte_to_swp_entry(ptent);
 
 		if (is_pfn_swap_entry(swpent))
 			folio = pfn_swap_entry_folio(swpent);
 	}
+
 	if (folio) {
-		if (folio_likely_mapped_shared(folio) ||
+		/* We treat non-present entries as "maybe shared". */
+		if (!present || folio_likely_mapped_shared(folio) ||
 		    hugetlb_pmd_shared(pte))
 			mss->shared_hugetlb += huge_page_size(hstate_vma(vma));
 		else