From patchwork Tue Feb 13 09:37:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13554782
From: "Pankaj Raghav (Samsung)"
To: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
 kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
 p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
 willy@infradead.org, linux-mm@kvack.org, david@fromorbit.com
Subject: [RFC v2 01/14] fs: Allow fine-grained control of folio sizes
Date: Tue, 13 Feb 2024 10:37:00 +0100
Message-ID: <20240213093713.1753368-2-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-1-kernel@pankajraghav.com>
References: <20240213093713.1753368-1-kernel@pankajraghav.com>

From: "Matthew Wilcox (Oracle)"

Some filesystems want to be able to limit the maximum size of folios,
and some want to be able to ensure that folios are at least a certain
size.  Add mapping_set_folio_orders() to allow this level of control.
The max folio order parameter is ignored and it is always set to
MAX_PAGECACHE_ORDER.

Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Pankaj Raghav
Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
---
 include/linux/pagemap.h | 92 ++++++++++++++++++++++++++++++++---------
 1 file changed, 73 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 2df35e65557d..5618f762187b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -202,13 +202,18 @@ enum mapping_flags {
 	AS_EXITING	= 4, 	/* final truncate in progress */
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
-	AS_LARGE_FOLIO_SUPPORT = 6,
-	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
-	AS_STABLE_WRITES,	/* must wait for writeback before modifying
+	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
+	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
 				   folio contents */
-	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
+	AS_FOLIO_ORDER_MIN = 8,
+	AS_FOLIO_ORDER_MAX = 13, /* Bit 8-17 are used for FOLIO_ORDER */
+	AS_UNMOVABLE = 18,		/* The mapping cannot be moved, ever */
 };
 
+#define AS_FOLIO_ORDER_MIN_MASK 0x00001f00
+#define AS_FOLIO_ORDER_MAX_MASK 0x0003e000
+#define AS_FOLIO_ORDER_MASK (AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK)
+
 /**
  * mapping_set_error - record a writeback error in the address_space
  * @mapping: the mapping in which an error should be set
@@ -344,6 +349,53 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
 	m->gfp_mask = mask;
 }
 
+/*
+ * There are some parts of the kernel which assume that PMD entries
+ * are exactly HPAGE_PMD_ORDER.  Those should be fixed, but until then,
+ * limit the maximum allocation order to PMD size.  I'm not aware of any
+ * assumptions about maximum order if THP are disabled, but 8 seems like
+ * a good order (that's 1MB if you're using 4kB pages)
+ */
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define MAX_PAGECACHE_ORDER	HPAGE_PMD_ORDER
+#else
+#define MAX_PAGECACHE_ORDER	8
+#endif
+
+/*
+ * mapping_set_folio_orders() - Set the range of folio sizes supported.
+ * @mapping: The file.
+ * @min: Minimum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
+ * @max: Maximum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
+ *
+ * The filesystem should call this function in its inode constructor to
+ * indicate which sizes of folio the VFS can use to cache the contents
+ * of the file.  This should only be used if the filesystem needs special
+ * handling of folio sizes (ie there is something the core cannot know).
+ * Do not tune it based on, eg, i_size.
+ *
+ * Context: This should not be called while the inode is active as it
+ * is non-atomic.
+ */
+static inline void mapping_set_folio_orders(struct address_space *mapping,
+					    unsigned int min, unsigned int max)
+{
+	if (min == 1)
+		min = 2;
+	if (max < min)
+		max = min;
+	if (max > MAX_PAGECACHE_ORDER)
+		max = MAX_PAGECACHE_ORDER;
+
+	/*
+	 * XXX: max is ignored as only minimum folio order is supported
+	 * currently.
+	 */
+	mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) |
+			 (min << AS_FOLIO_ORDER_MIN) |
+			 (MAX_PAGECACHE_ORDER << AS_FOLIO_ORDER_MAX);
+}
+
 /**
  * mapping_set_large_folios() - Indicate the file supports large folios.
  * @mapping: The file.
@@ -357,7 +409,22 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 static inline void mapping_set_large_folios(struct address_space *mapping)
 {
-	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	mapping_set_folio_orders(mapping, 0, MAX_PAGECACHE_ORDER);
+}
+
+static inline unsigned int mapping_max_folio_order(struct address_space *mapping)
+{
+	return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
+}
+
+static inline unsigned int mapping_min_folio_order(struct address_space *mapping)
+{
+	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
+}
+
+static inline unsigned int mapping_min_folio_nrpages(struct address_space *mapping)
+{
+	return 1U << mapping_min_folio_order(mapping);
 }
 
 /*
@@ -367,7 +434,7 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
 static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
 	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+		(mapping_max_folio_order(mapping) > 0);
 }
 
 static inline int filemap_nr_thps(struct address_space *mapping)
@@ -528,19 +595,6 @@ static inline void *detach_page_private(struct page *page)
 	return folio_detach_private(page_folio(page));
 }
 
-/*
- * There are some parts of the kernel which assume that PMD entries
- * are exactly HPAGE_PMD_ORDER.  Those should be fixed, but until then,
- * limit the maximum allocation order to PMD size.  I'm not aware of any
- * assumptions about maximum order if THP are disabled, but 8 seems like
- * a good order (that's 1MB if you're using 4kB pages)
- */
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define MAX_PAGECACHE_ORDER	HPAGE_PMD_ORDER
-#else
-#define MAX_PAGECACHE_ORDER	8
-#endif
-
 #ifdef CONFIG_NUMA
 struct folio *filemap_alloc_folio(gfp_t gfp, unsigned int order);
 #else
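
For readers who want to check the bit layout without building a kernel, the
packing done by mapping_set_folio_orders() and the three accessors can be
modelled in userspace.  The following is only a sketch under stated
assumptions: struct demo_mapping and the demo_*() names are hypothetical
stand-ins for struct address_space and the helpers in the patch, and
MAX_PAGECACHE_ORDER is fixed at 8, the value the patch uses when
CONFIG_TRANSPARENT_HUGEPAGE is disabled.  The shifts and masks are copied
from the diff.

```c
#include <assert.h>

/* Bit positions and masks copied from the patch. */
#define AS_FOLIO_ORDER_MIN	8
#define AS_FOLIO_ORDER_MAX	13
#define AS_FOLIO_ORDER_MIN_MASK	0x00001f00UL
#define AS_FOLIO_ORDER_MAX_MASK	0x0003e000UL
#define AS_FOLIO_ORDER_MASK (AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK)

/* Assumption: the !CONFIG_TRANSPARENT_HUGEPAGE value from the patch. */
#define MAX_PAGECACHE_ORDER	8

/* Each field is five bits wide, so each mask is 0x1f at its shift. */
static_assert(AS_FOLIO_ORDER_MIN_MASK == (0x1fUL << AS_FOLIO_ORDER_MIN),
	      "min mask must match min shift");
static_assert(AS_FOLIO_ORDER_MAX_MASK == (0x1fUL << AS_FOLIO_ORDER_MAX),
	      "max mask must match max shift");

/* Hypothetical stand-in for struct address_space; only flags is modelled. */
struct demo_mapping {
	unsigned long flags;
};

/* Mirrors mapping_set_folio_orders() from the patch. */
static inline void demo_set_folio_orders(struct demo_mapping *m,
					 unsigned int min, unsigned int max)
{
	if (min == 1)
		min = 2;
	if (max < min)
		max = min;
	if (max > MAX_PAGECACHE_ORDER)
		max = MAX_PAGECACHE_ORDER;

	/* As in the patch, max is clamped and then ignored. */
	(void)max;

	m->flags = (m->flags & ~AS_FOLIO_ORDER_MASK) |
		   ((unsigned long)min << AS_FOLIO_ORDER_MIN) |
		   ((unsigned long)MAX_PAGECACHE_ORDER << AS_FOLIO_ORDER_MAX);
}

static inline unsigned int demo_max_folio_order(const struct demo_mapping *m)
{
	return (m->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
}

static inline unsigned int demo_min_folio_order(const struct demo_mapping *m)
{
	return (m->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
}

static inline unsigned int demo_min_folio_nrpages(const struct demo_mapping *m)
{
	return 1U << demo_min_folio_order(m);
}
```

In this model, calling demo_set_folio_orders(&m, 1, 0) stores a minimum
order of 2 (four pages) and a maximum order of MAX_PAGECACHE_ORDER, and it
leaves flag bits outside the 8-17 range untouched.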