From patchwork Wed Aug 23 13:13:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13362350 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D5255EE49A3 for ; Wed, 23 Aug 2023 13:16:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=RM4rxrtgJY2BI52Z56UZaldCrb99Ja2lZVc0VWcYftE=; b=JW3p2S5rTDeIVp rs8ColmmdqaypyHELPx3XN4sorQtg/TSh5cSFd33oPeNxwKcJdD2q18Bbccz61/YwHIzjcFEegooe Mn0Z8aUXSbvTF73cXfYDyz9u2uCs6ybvGNt9esDiIfYRh5TSnrqy03dAsx7EVlnlEQ2w1N93QukG2 g6/hXjZc1P4CDo9WByW0QeVUm1yYiH/xR2jpPSFyPij0kdkmsbf/QOODJI/Ew1T8A1xynFN5faTaB bs2kVUnGYkbkFWbRRIKjaMzef4+3F8ONTXHaGqy12k9PJCIWjkZNVNht3Ubi+G59rSTj7O60mAez8 pB9VY5viMWafDXLwIsqw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qYnie-000dN6-2y; Wed, 23 Aug 2023 13:16:20 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qYniU-000dFB-20 for linux-arm-kernel@lists.infradead.org; Wed, 23 Aug 2023 13:16:12 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 472231688; Wed, 23 Aug 2023 06:16:50 -0700 (PDT) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 99D093F740; Wed, 23 Aug 2023 06:16:03 -0700 (PDT) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC 19/37] mm: page_alloc: Manage metadata storage on page allocation Date: Wed, 23 Aug 2023 14:13:32 +0100 Message-Id: <20230823131350.114942-20-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230823131350.114942-1-alexandru.elisei@arm.com> References: <20230823131350.114942-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230823_061610_758301_B4F33654 X-CRM114-Status: GOOD ( 20.22 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org When a page is allocated with metadata, the associated metadata storage cannot be used for data because the page metadata will overwrite the metadata storage contents. Reserve metadata storage when the associated page is allocated with metadata enabled. If metadata storage cannot be reserved, because, for example, of a short term pin, then the page with metadata enabled which triggered the reservation will be put back at the tail of the free list and the page allocator will repeat the process for a new page. If the page allocator exhausts all allocation paths, then it must mean that the system is out of memory and this is treated like any other OOM situation. When a metadata-enabled page is freed, then also free the associated metadata storage, so it can be used to data allocations. For the direct reclaim slowpath, no special handling for metadata pages has been added - metadata pages are still considered for reclaim even if they cannot be used to satisfy the allocation request. This behaviour has been preserved to increase the chance that the metadata storage is free when the associated page is allocated with metadata enabled. Signed-off-by: Alexandru Elisei --- arch/arm64/include/asm/memory_metadata.h | 14 ++++++++ include/asm-generic/memory_metadata.h | 11 ++++++ include/linux/vm_event_item.h | 5 +++ mm/page_alloc.c | 43 ++++++++++++++++++++++++ mm/vmstat.c | 5 +++ 5 files changed, 78 insertions(+) diff --git a/arch/arm64/include/asm/memory_metadata.h b/arch/arm64/include/asm/memory_metadata.h index 3287b2776af1..1b18e3217dd0 100644 --- a/arch/arm64/include/asm/memory_metadata.h +++ b/arch/arm64/include/asm/memory_metadata.h @@ -20,12 +20,26 @@ static inline bool alloc_can_use_metadata_pages(gfp_t gfp_mask) return !(gfp_mask & __GFP_TAGGED); } +static inline bool alloc_requires_metadata(gfp_t gfp_mask) +{ + return gfp_mask & __GFP_TAGGED; +} + #define page_has_metadata(page) page_mte_tagged(page) static inline bool folio_has_metadata(struct folio *folio) { return page_has_metadata(&folio->page); } + +static inline int reserve_metadata_storage(struct page *page, int order, gfp_t gfp_mask) +{ + return 0; +} + +static inline void free_metadata_storage(struct page *page, int order) +{ +} #endif /* CONFIG_MEMORY_METADATA */ #endif /* __ASM_MEMORY_METADATA_H */ diff --git a/include/asm-generic/memory_metadata.h b/include/asm-generic/memory_metadata.h index 8f4e2fba222f..111d6edc0997 100644 --- a/include/asm-generic/memory_metadata.h +++ b/include/asm-generic/memory_metadata.h @@ -16,6 +16,17 @@ static inline bool alloc_can_use_metadata_pages(gfp_t gfp_mask) { return false; } +static inline bool alloc_requires_metadata(gfp_t gfp_mask) +{ + return false; +} +static inline int reserve_metadata_storage(struct page *page, int order, gfp_t gfp_mask) +{ + return 0; +} +static inline void free_metadata_storage(struct page *page, int order) +{ +} static inline bool page_has_metadata(struct page *page) { return false; diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 8abfa1240040..3163b85d2bc6 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -86,6 +86,11 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, #ifdef CONFIG_CMA CMA_ALLOC_SUCCESS, CMA_ALLOC_FAIL, +#endif +#ifdef CONFIG_MEMORY_METADATA + METADATA_RESERVE_SUCCESS, + METADATA_RESERVE_FAIL, + METADATA_RESERVE_FREE, #endif UNEVICTABLE_PGCULLED, /* culled to noreclaim list */ UNEVICTABLE_PGSCANNED, /* scanned for reclaimability */ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 011645d07ce9..911d3c362848 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1111,6 +1111,9 @@ static __always_inline bool free_pages_prepare(struct page *page, trace_mm_page_free(page, order); kmsan_free_page(page, order); + if (metadata_storage_enabled() && page_has_metadata(page)) + free_metadata_storage(page, order); + if (unlikely(PageHWPoison(page)) && !order) { /* * Do not let hwpoison pages hit pcplists/buddy @@ -3143,6 +3146,24 @@ static inline unsigned int gfp_to_alloc_flags_fast(gfp_t gfp_mask, return alloc_flags; } +#ifdef CONFIG_MEMORY_METADATA +static void return_page_to_buddy(struct page *page, int order) +{ + struct zone *zone = page_zone(page); + unsigned long pfn = page_to_pfn(page); + unsigned long flags; + int migratetype = get_pfnblock_migratetype(page, pfn); + + spin_lock_irqsave(&zone->lock, flags); + __free_one_page(page, pfn, zone, order, migratetype, FPI_TO_TAIL); + spin_unlock_irqrestore(&zone->lock, flags); +} +#else +static void return_page_to_buddy(struct page *page, int order) +{ +} +#endif + /* * get_page_from_freelist goes through the zonelist trying to allocate * a page. @@ -3156,6 +3177,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags, struct pglist_data *last_pgdat = NULL; bool last_pgdat_dirty_ok = false; bool no_fallback; + int ret; retry: /* @@ -3270,6 +3292,15 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags, page = rmqueue(ac->preferred_zoneref->zone, zone, order, gfp_mask, alloc_flags, ac->migratetype); if (page) { + if (metadata_storage_enabled() && alloc_requires_metadata(gfp_mask)) { + ret = reserve_metadata_storage(page, order, gfp_mask); + if (ret != 0) { + return_page_to_buddy(page, order); + page = NULL; + goto no_page; + } + } + prep_new_page(page, order, gfp_mask, alloc_flags); /* @@ -3285,7 +3316,10 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags, if (try_to_accept_memory(zone, order)) goto try_this_zone; } + } +no_page: + if (!page) { #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT /* Try again if zone has deferred pages */ if (deferred_pages_enabled()) { @@ -3475,6 +3509,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order, struct page *page = NULL; unsigned long pflags; unsigned int noreclaim_flag; + int ret; if (!order) return NULL; @@ -3498,6 +3533,14 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order, */ count_vm_event(COMPACTSTALL); + if (metadata_storage_enabled() && page && alloc_requires_metadata(gfp_mask)) { + ret = reserve_metadata_storage(page, order, gfp_mask); + if (ret != 0) { + return_page_to_buddy(page, order); + page = NULL; + } + } + /* Prep a captured page if available */ if (page) prep_new_page(page, order, gfp_mask, alloc_flags); diff --git a/mm/vmstat.c b/mm/vmstat.c index 07caa284a724..807b514718d2 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1338,6 +1338,11 @@ const char * const vmstat_text[] = { #ifdef CONFIG_CMA "cma_alloc_success", "cma_alloc_fail", +#endif +#ifdef CONFIG_MEMORY_METADATA + "metadata_reserve_success", + "metadata_reserve_fail", + "metadata_reserve_free", #endif "unevictable_pgs_culled", "unevictable_pgs_scanned",