From patchwork Wed May 29 13:44:59 2024
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org,
	brauner@kernel.org, willy@infradead.org, djwong@kernel.org
Cc: linux-kernel@vger.kernel.org, hare@suse.de, john.g.garry@oracle.com,
	gost.dev@samsung.com, yang@os.amperecomputing.com, p.raghav@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	mcgrof@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 01/11] readahead: rework loop in page_cache_ra_unbounded()
Date: Wed, 29 May 2024 15:44:59 +0200
Message-Id: <20240529134509.120826-2-kernel@pankajraghav.com>
In-Reply-To: <20240529134509.120826-1-kernel@pankajraghav.com>
References: <20240529134509.120826-1-kernel@pankajraghav.com>

From: Hannes Reinecke

Rework the loop in page_cache_ra_unbounded() to advance by the number
of pages in a folio instead of just one page at a time.

Note that the index is incremented by 1 if filemap_add_folio() fails
because the size of the folio we are trying to add is 1 (order 0).

Signed-off-by: Hannes Reinecke
Co-developed-by: Pankaj Raghav
Acked-by: Darrick J. Wong
Signed-off-by: Pankaj Raghav
---
 mm/readahead.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index c1b23989d9ca..75e934a1fd78 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -208,7 +208,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
-	unsigned long i;
+	unsigned long i = 0;
 
 	/*
 	 * Partway through the readahead operation, we will have added
@@ -226,7 +226,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
-	for (i = 0; i < nr_to_read; i++) {
+	while (i < nr_to_read) {
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);
 		int ret;
 
@@ -240,8 +240,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			ractl->_index += folio_nr_pages(folio);
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 
@@ -256,13 +256,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 				break;
 			read_pages(ractl);
 			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
 		ractl->_workingset |= folio_test_workingset(folio);
-		ractl->_nr_pages++;
+		ractl->_nr_pages += folio_nr_pages(folio);
+		i += folio_nr_pages(folio);
 	}
 
 	/*
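To see why the new cursor arithmetic stays consistent when a large folio is skipped, here is a minimal userspace model of the loop (my own sketch, not kernel code: the struct is a simplified stand-in for struct readahead_control, and it assumes read_pages() leaves _index just past the consumed batch):

#include <assert.h>
#include <stdio.h>

/* simplified stand-in for struct readahead_control */
struct ractl_model {
        unsigned long _index;           /* first page of the batch being built */
        unsigned long _nr_pages;        /* pages accumulated in the batch */
};

int main(void)
{
        struct ractl_model ractl = { ._index = 100, ._nr_pages = 0 };
        unsigned long index = 100;      /* readahead start, fixed for the loop */
        unsigned long nr_to_read = 16;
        unsigned long i = 0;

        while (i < nr_to_read) {
                /* pretend an order-2 folio (4 pages) is already cached at index + 4 */
                unsigned long nr = (i == 4) ? 4 : 1;

                if (i == 4) {
                        /* read_pages(): the batch built so far is consumed */
                        ractl._index += ractl._nr_pages;
                        ractl._nr_pages = 0;
                        /* skip the whole cached folio; this was "_index++" before the patch */
                        ractl._index += nr;
                        i = ractl._index + ractl._nr_pages - index;
                        continue;
                }
                ractl._nr_pages += nr;  /* folio added to the batch */
                i += nr;                /* advance by folio size, not by one page */
        }

        /* the loop cursor and the batch bookkeeping agree at every step */
        assert(i == ractl._index + ractl._nr_pages - index);
        printf("i=%lu _index=%lu _nr_pages=%lu\n", i, ractl._index, ractl._nr_pages);
        return 0;
}

Running the model ends with i=16, _index=108 and _nr_pages=8: the batch restarted cleanly after the cached folio, which is exactly what dropping the "- 1" from the cursor recomputation buys once _index can move by more than one page.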
From patchwork Wed May 29 13:45:00 2024
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v6 02/11] fs: Allow fine-grained control of folio sizes
Date: Wed, 29 May 2024 15:45:00 +0200
Message-Id: <20240529134509.120826-3-kernel@pankajraghav.com>
From: "Matthew Wilcox (Oracle)"

We need filesystems to be able to communicate acceptable folio sizes
to the pagecache for a variety of uses (e.g. large block sizes).
Support a range of folio sizes between order-0 and order-31.

Signed-off-by: Matthew Wilcox (Oracle)
Co-developed-by: Pankaj Raghav
Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
---
 include/linux/pagemap.h | 86 ++++++++++++++++++++++++++++++++++-------
 mm/filemap.c            |  6 +--
 mm/readahead.c          |  4 +-
 3 files changed, 77 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 8f09ed4a4451..228275e7049f 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -204,14 +204,21 @@ enum mapping_flags {
 	AS_EXITING	= 4, 	/* final truncate in progress */
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
-	AS_LARGE_FOLIO_SUPPORT = 6,
-	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
-	AS_STABLE_WRITES,	/* must wait for writeback before modifying
+	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
+	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
 				   folio contents */
-	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
-	AS_INACCESSIBLE,	/* Do not attempt direct R/W access to the mapping */
+	AS_UNMOVABLE = 8,	/* The mapping cannot be moved, ever */
+	AS_INACCESSIBLE = 9,	/* Do not attempt direct R/W access to the mapping */
+	/* Bits 16-25 are used for FOLIO_ORDER */
+	AS_FOLIO_ORDER_BITS = 5,
+	AS_FOLIO_ORDER_MIN = 16,
+	AS_FOLIO_ORDER_MAX = AS_FOLIO_ORDER_MIN + AS_FOLIO_ORDER_BITS,
 };
 
+#define AS_FOLIO_ORDER_MASK	((1u << AS_FOLIO_ORDER_BITS) - 1)
+#define AS_FOLIO_ORDER_MIN_MASK	(AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MIN)
+#define AS_FOLIO_ORDER_MAX_MASK	(AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MAX)
+
 /**
  * mapping_set_error - record a writeback error in the address_space
  * @mapping: the mapping in which an error should be set
@@ -360,9 +367,49 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
 #define MAX_PAGECACHE_ORDER	8
 #endif
 
+/*
+ * mapping_set_folio_order_range() - Set the orders supported by a file.
+ * @mapping: The address space of the file.
+ * @min: Minimum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
+ * @max: Maximum folio order (between @min-MAX_PAGECACHE_ORDER inclusive).
+ *
+ * The filesystem should call this function in its inode constructor to
+ * indicate which base size (min) and maximum size (max) of folio the VFS
+ * can use to cache the contents of the file.  This should only be used
+ * if the filesystem needs special handling of folio sizes (ie there is
+ * something the core cannot know).
+ * Do not tune it based on, eg, i_size.
+ *
+ * Context: This should not be called while the inode is active as it
+ * is non-atomic.
+ */
+static inline void mapping_set_folio_order_range(struct address_space *mapping,
+						 unsigned int min,
+						 unsigned int max)
+{
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return;
+
+	if (min > MAX_PAGECACHE_ORDER)
+		min = MAX_PAGECACHE_ORDER;
+	if (max > MAX_PAGECACHE_ORDER)
+		max = MAX_PAGECACHE_ORDER;
+	if (max < min)
+		max = min;
+
+	mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) |
+		(min << AS_FOLIO_ORDER_MIN) | (max << AS_FOLIO_ORDER_MAX);
+}
+
+static inline void mapping_set_folio_min_order(struct address_space *mapping,
+					       unsigned int min)
+{
+	mapping_set_folio_order_range(mapping, min, MAX_PAGECACHE_ORDER);
+}
+
 /**
  * mapping_set_large_folios() - Indicate the file supports large folios.
- * @mapping: The file.
+ * @mapping: The address space of the file.
  *
  * The filesystem should call this function in its inode constructor to
  * indicate that the VFS can use large folios to cache the contents of
@@ -373,7 +420,23 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 static inline void mapping_set_large_folios(struct address_space *mapping)
 {
-	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	mapping_set_folio_order_range(mapping, 0, MAX_PAGECACHE_ORDER);
+}
+
+static inline
+unsigned int mapping_max_folio_order(const struct address_space *mapping)
+{
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return 0;
+	return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
+}
+
+static inline
+unsigned int mapping_min_folio_order(const struct address_space *mapping)
+{
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return 0;
+	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
 }
 
 /*
@@ -382,16 +445,13 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
  */
 static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
-	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	return mapping_max_folio_order(mapping) > 0;
 }
 
 /* Return the maximum folio size for this pagecache mapping, in bytes. */
-static inline size_t mapping_max_folio_size(struct address_space *mapping)
+static inline size_t mapping_max_folio_size(const struct address_space *mapping)
 {
-	if (mapping_large_folio_support(mapping))
-		return PAGE_SIZE << MAX_PAGECACHE_ORDER;
-	return PAGE_SIZE;
+	return PAGE_SIZE << mapping_max_folio_order(mapping);
 }
 
 static inline int filemap_nr_thps(struct address_space *mapping)
diff --git a/mm/filemap.c b/mm/filemap.c
index ba06237b942d..308714a44a0f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1933,10 +1933,8 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		if (WARN_ON_ONCE(!(fgp_flags & (FGP_LOCK | FGP_FOR_MMAP))))
 			fgp_flags |= FGP_LOCK;
 
-		if (!mapping_large_folio_support(mapping))
-			order = 0;
-		if (order > MAX_PAGECACHE_ORDER)
-			order = MAX_PAGECACHE_ORDER;
+		if (order > mapping_max_folio_order(mapping))
+			order = mapping_max_folio_order(mapping);
 		/* If we're not aligned, allocate a smaller folio */
 		if (index & ((1UL << order) - 1))
 			order = __ffs(index);
diff --git a/mm/readahead.c b/mm/readahead.c
index 75e934a1fd78..da34b28da02c 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -504,9 +504,9 @@ void page_cache_ra_order(struct readahead_control *ractl,
 
 	limit = min(limit, index + ra->size - 1);
 
-	if (new_order < MAX_PAGECACHE_ORDER) {
+	if (new_order < mapping_max_folio_order(mapping)) {
 		new_order += 2;
-		new_order = min_t(unsigned int, MAX_PAGECACHE_ORDER, new_order);
+		new_order = min(mapping_max_folio_order(mapping), new_order);
 		new_order = min_t(unsigned int, new_order, ilog2(ra->size));
 	}
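The min and max orders live in bits 16-25 of mapping->flags. A standalone userspace sketch of that encoding follows (the macro values are copied from the hunk above; the set helper is my own model and explicitly clears both shifted ranges before re-encoding, rather than copying the kernel helper verbatim):

#include <assert.h>
#include <stdio.h>

#define MAX_PAGECACHE_ORDER     8
#define AS_FOLIO_ORDER_BITS     5
#define AS_FOLIO_ORDER_MIN      16
#define AS_FOLIO_ORDER_MAX      (AS_FOLIO_ORDER_MIN + AS_FOLIO_ORDER_BITS)
#define AS_FOLIO_ORDER_MASK     ((1u << AS_FOLIO_ORDER_BITS) - 1)
#define AS_FOLIO_ORDER_MIN_MASK (AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MIN)
#define AS_FOLIO_ORDER_MAX_MASK (AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MAX)

/* models mapping_set_folio_order_range(): clamp, then pack both orders */
static unsigned long set_order_range(unsigned long flags,
                                     unsigned int min, unsigned int max)
{
        if (min > MAX_PAGECACHE_ORDER)
                min = MAX_PAGECACHE_ORDER;
        if (max > MAX_PAGECACHE_ORDER)
                max = MAX_PAGECACHE_ORDER;
        if (max < min)
                max = min;

        flags &= ~(unsigned long)(AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK);
        return flags | ((unsigned long)min << AS_FOLIO_ORDER_MIN) |
                       ((unsigned long)max << AS_FOLIO_ORDER_MAX);
}

int main(void)
{
        unsigned long flags = 0;

        /* an order-31 request clamps to MAX_PAGECACHE_ORDER */
        flags = set_order_range(flags, 2, 31);

        assert(((flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN) == 2);
        assert(((flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX) == 8);
        printf("flags=%#lx -> min order 2, max order 8\n", flags);
        return 0;
}

A filesystem with, say, a 16 KiB block size on 4 KiB pages would then call mapping_set_folio_min_order(mapping, 2) from its inode setup, and mapping_min_folio_order()/mapping_max_folio_order() recover the bounds everywhere the page cache allocates.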
From patchwork Wed May 29 13:45:01 2024
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v6 03/11] filemap: allocate mapping_min_order folios in the
 page cache
Date: Wed, 29 May 2024 15:45:01 +0200
Message-Id: <20240529134509.120826-4-kernel@pankajraghav.com>
From: Pankaj Raghav

filemap_create_folio() and do_read_cache_folio() were always allocating
folios of order 0. __filemap_get_folio() tried to allocate higher-order
folios when fgp_flags had a higher-order hint set, but fell back to an
order-0 folio if the higher-order memory allocation failed.

Supporting mapping_min_order implies that we guarantee each folio in the
page cache has at least an order of mapping_min_order. When adding new
folios to the page cache we must also ensure the index used is aligned
to the mapping_min_order, as the page cache requires the index to be
aligned to the order of the folio.

Signed-off-by: Pankaj Raghav
Co-developed-by: Luis Chamberlain
Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
Reviewed-by: Darrick J. Wong
---
 include/linux/pagemap.h | 20 ++++++++++++++++++++
 mm/filemap.c            | 24 +++++++++++++++++-------
 2 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 228275e7049f..899b8d751768 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -439,6 +439,26 @@ unsigned int mapping_min_folio_order(const struct address_space *mapping)
 	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
 }
 
+static inline unsigned long mapping_min_folio_nrpages(struct address_space *mapping)
+{
+	return 1UL << mapping_min_folio_order(mapping);
+}
+
+/**
+ * mapping_align_start_index() - Align starting index based on the min
+ * folio order of the page cache.
+ * @mapping: The address_space.
+ *
+ * Ensure the index used is aligned to the minimum folio order when adding
+ * new folios to the page cache by rounding down to the nearest minimum
+ * folio number of pages.
+ */
+static inline pgoff_t mapping_align_start_index(struct address_space *mapping,
+						pgoff_t index)
+{
+	return round_down(index, mapping_min_folio_nrpages(mapping));
+}
+
 /*
  * Large folio support currently depends on THP.  These dependencies are
  * being worked on but are not yet fixed.
diff --git a/mm/filemap.c b/mm/filemap.c
index 308714a44a0f..0914ef2e8256 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -859,6 +859,8 @@ noinline int __filemap_add_folio(struct address_space *mapping,
 
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio);
+	VM_BUG_ON_FOLIO(folio_order(folio) < mapping_min_folio_order(mapping),
+			folio);
 	mapping_set_update(&xas, mapping);
 
 	VM_BUG_ON_FOLIO(index & (folio_nr_pages(folio) - 1), folio);
@@ -1919,8 +1921,10 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		folio_wait_stable(folio);
 no_page:
 	if (!folio && (fgp_flags & FGP_CREAT)) {
-		unsigned order = FGF_GET_ORDER(fgp_flags);
+		unsigned int min_order = mapping_min_folio_order(mapping);
+		unsigned int order = max(min_order, FGF_GET_ORDER(fgp_flags));
 		int err;
+		index = mapping_align_start_index(mapping, index);
 
 		if ((fgp_flags & FGP_WRITE) && mapping_can_writeback(mapping))
 			gfp |= __GFP_WRITE;
@@ -1958,7 +1962,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 				break;
 			folio_put(folio);
 			folio = NULL;
-		} while (order-- > 0);
+		} while (order-- > min_order);
 
 		if (err == -EEXIST)
 			goto repeat;
@@ -2447,13 +2451,16 @@ static int filemap_update_page(struct kiocb *iocb,
 }
 
 static int filemap_create_folio(struct file *file,
-		struct address_space *mapping, pgoff_t index,
+		struct address_space *mapping, loff_t pos,
 		struct folio_batch *fbatch)
 {
 	struct folio *folio;
 	int error;
+	unsigned int min_order = mapping_min_folio_order(mapping);
+	pgoff_t index;
 
-	folio = filemap_alloc_folio(mapping_gfp_mask(mapping), 0);
+	folio = filemap_alloc_folio(mapping_gfp_mask(mapping),
+				    min_order);
 	if (!folio)
 		return -ENOMEM;
 
@@ -2471,6 +2478,8 @@ static int filemap_create_folio(struct file *file,
 	 * well to keep locking rules simple.
 	 */
 	filemap_invalidate_lock_shared(mapping);
+	/* index in PAGE units but aligned to min_order number of pages */
+	index = (pos >> (PAGE_SHIFT + min_order)) << min_order;
 	error = filemap_add_folio(mapping, folio, index,
 			mapping_gfp_constraint(mapping, GFP_KERNEL));
 	if (error == -EEXIST)
@@ -2531,8 +2540,7 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 	if (!folio_batch_count(fbatch)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
 			return -EAGAIN;
-		err = filemap_create_folio(filp, mapping,
-				iocb->ki_pos >> PAGE_SHIFT, fbatch);
+		err = filemap_create_folio(filp, mapping, iocb->ki_pos, fbatch);
 		if (err == AOP_TRUNCATED_PAGE)
 			goto retry;
 		return err;
@@ -3748,9 +3756,11 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
 repeat:
 	folio = filemap_get_folio(mapping, index);
 	if (IS_ERR(folio)) {
-		folio = filemap_alloc_folio(gfp, 0);
+		folio = filemap_alloc_folio(gfp,
+					    mapping_min_folio_order(mapping));
 		if (!folio)
 			return ERR_PTR(-ENOMEM);
+		index = mapping_align_start_index(mapping, index);
 		err = filemap_add_folio(mapping, folio, index, gfp);
 		if (unlikely(err)) {
 			folio_put(folio);
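The alignment rule is plain power-of-two arithmetic. Here is a userspace sketch of the two index computations added above (assumptions: 4 KiB pages, i.e. PAGE_SHIFT = 12, and a hypothetical min_order of 2):

#include <assert.h>
#include <stdio.h>

#define PAGE_SHIFT 12   /* assumed: 4 KiB pages */

/* the kernel's round_down() for power-of-two multiples */
static unsigned long round_down_p2(unsigned long x, unsigned long mult)
{
        return x & ~(mult - 1);
}

int main(void)
{
        unsigned int min_order = 2;                     /* hypothetical: min folio = 4 pages */
        unsigned long min_nrpages = 1UL << min_order;
        unsigned long long pos = 27ULL * 4096 + 100;    /* a byte offset inside page 27 */

        /* filemap_create_folio(): index in PAGE units, aligned to min_order pages */
        unsigned long index = (unsigned long)(pos >> (PAGE_SHIFT + min_order)) << min_order;

        /* same result as aligning the raw page index downwards */
        assert(index == round_down_p2((unsigned long)(pos >> PAGE_SHIFT), min_nrpages));
        assert(index % min_nrpages == 0);

        printf("pos in page 27 -> folio index %lu (folio covers pages 24-27)\n", index);
        return 0;
}

So a read that lands in page 27 of a min-order-2 file allocates the folio at index 24, which is why filemap_create_folio() now takes the byte position rather than a pre-computed page index.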
From patchwork Wed May 29 13:45:02 2024
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v6 04/11] readahead: allocate folios with mapping_min_order
 in readahead
Date: Wed, 29 May 2024 15:45:02 +0200
Message-Id: <20240529134509.120826-5-kernel@pankajraghav.com>
From: Pankaj Raghav

page_cache_ra_unbounded() was allocating single pages (order-0 folios)
if no folio was found at an index. Allocate mapping_min_order folios,
as we need to guarantee the minimum order if it is set.

When read_pages() is triggered and a page is already present, check for
truncation and move ractl->_index by mapping_min_nrpages if that folio
was truncated. This is done to keep the alignment requirement while
adding a folio to the page cache.

page_cache_ra_order() tries to allocate folios to the page cache with a
higher order if the index aligns with that order. Modify it so that the
order does not go below the mapping_min_order requirement of the page
cache. This function will do the right thing even if the new_order
passed is less than the mapping_min_order.

When adding new folios to the page cache we must also ensure the index
used is aligned to the mapping_min_order, as the page cache requires the
index to be aligned to the order of the folio.

readahead_expand() is called from readahead aops to extend the range of
the readahead, so this function can assume ractl->_index to be aligned
to min_order.

Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
---
 mm/readahead.c | 85 +++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 14 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index da34b28da02c..389cd802da63 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -206,9 +206,10 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 		unsigned long nr_to_read, unsigned long lookahead_size)
 {
 	struct address_space *mapping = ractl->mapping;
-	unsigned long index = readahead_index(ractl);
+	unsigned long ra_folio_index, index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
-	unsigned long i = 0;
+	unsigned long mark, i = 0;
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
 
 	/*
 	 * Partway through the readahead operation, we will have added
@@ -223,6 +224,22 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	unsigned int nofs = memalloc_nofs_save();
 
 	filemap_invalidate_lock_shared(mapping);
+	index = mapping_align_start_index(mapping, index);
+
+	/*
+	 * As iterator `i` is aligned to min_nrpages, round_up the
+	 * difference between nr_to_read and lookahead_size to mark the
+	 * index that only has lookahead or "async_region" to set the
+	 * readahead flag.
+	 */
+	ra_folio_index = round_up(readahead_index(ractl) + nr_to_read - lookahead_size,
+				  min_nrpages);
+	mark = ra_folio_index - index;
+	if (index != readahead_index(ractl)) {
+		nr_to_read += readahead_index(ractl) - index;
+		ractl->_index = index;
+	}
+
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
@@ -230,7 +247,9 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);
 		int ret;
 
+
 		if (folio && !xa_is_value(folio)) {
+			long nr_pages = folio_nr_pages(folio);
 			/*
 			 * Page already present?  Kick off the current batch
 			 * of contiguous pages before continuing with the
@@ -240,12 +259,24 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index += folio_nr_pages(folio);
+
+			/*
+			 * Move the ractl->_index by at least min_pages
+			 * if the folio got truncated to respect the
+			 * alignment constraint in the page cache.
+			 *
+			 */
+			if (mapping != folio->mapping)
+				nr_pages = min_nrpages;
+
+			VM_BUG_ON_FOLIO(nr_pages < min_nrpages, folio);
+			ractl->_index += nr_pages;
 			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 
-		folio = filemap_alloc_folio(gfp_mask, 0);
+		folio = filemap_alloc_folio(gfp_mask,
+					    mapping_min_folio_order(mapping));
 		if (!folio)
 			break;
@@ -255,11 +286,11 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			if (ret == -ENOMEM)
 				break;
 			read_pages(ractl);
-			ractl->_index++;
+			ractl->_index += min_nrpages;
 			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
-		if (i == nr_to_read - lookahead_size)
+		if (i == mark)
 			folio_set_readahead(folio);
 		ractl->_workingset |= folio_test_workingset(folio);
 		ractl->_nr_pages += folio_nr_pages(folio);
@@ -493,13 +524,19 @@ void page_cache_ra_order(struct readahead_control *ractl,
 {
 	struct address_space *mapping = ractl->mapping;
 	pgoff_t index = readahead_index(ractl);
+	unsigned int min_order = mapping_min_folio_order(mapping);
 	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
 	pgoff_t mark = index + ra->size - ra->async_size;
 	unsigned int nofs;
 	int err = 0;
 	gfp_t gfp = readahead_gfp_mask(mapping);
+	unsigned int min_ra_size = max(4, mapping_min_folio_nrpages(mapping));
 
-	if (!mapping_large_folio_support(mapping) || ra->size < 4)
+	/*
+	 * Fallback when size < min_nrpages as each folio should be
+	 * at least min_nrpages anyway.
+	 */
+	if (!mapping_large_folio_support(mapping) || ra->size < min_ra_size)
 		goto fallback;
 
 	limit = min(limit, index + ra->size - 1);
@@ -508,11 +545,20 @@ void page_cache_ra_order(struct readahead_control *ractl,
 		new_order += 2;
 		new_order = min(mapping_max_folio_order(mapping), new_order);
 		new_order = min_t(unsigned int, new_order, ilog2(ra->size));
+		new_order = max(new_order, min_order);
 	}
 
 	/* See comment in page_cache_ra_unbounded() */
 	nofs = memalloc_nofs_save();
 	filemap_invalidate_lock_shared(mapping);
+	/*
+	 * If the new_order is greater than min_order and index is
+	 * already aligned to new_order, then this will be noop as index
+	 * aligned to new_order should also be aligned to min_order.
+	 */
+	ractl->_index = mapping_align_start_index(mapping, index);
+	index = readahead_index(ractl);
+
 	while (index <= limit) {
 		unsigned int order = new_order;
 
@@ -520,7 +566,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
 		if (index & ((1UL << order) - 1))
 			order = __ffs(index);
 		/* Don't allocate pages past EOF */
-		while (index + (1UL << order) - 1 > limit)
+		while (order > min_order && index + (1UL << order) - 1 > limit)
 			order--;
 		err = ra_alloc_folio(ractl, index, mark, order, gfp);
 		if (err)
@@ -784,8 +830,15 @@ void readahead_expand(struct readahead_control *ractl,
 	struct file_ra_state *ra = ractl->ra;
 	pgoff_t new_index, new_nr_pages;
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
+	unsigned long min_nrpages = mapping_min_folio_nrpages(mapping);
+	unsigned int min_order = mapping_min_folio_order(mapping);
 
 	new_index = new_start / PAGE_SIZE;
+	/*
+	 * Readahead code should have aligned the ractl->_index to
+	 * min_nrpages before calling readahead aops.
+	 */
+	VM_BUG_ON(!IS_ALIGNED(ractl->_index, min_nrpages));
 
 	/* Expand the leading edge downwards */
 	while (ractl->_index > new_index) {
@@ -795,9 +848,11 @@ void readahead_expand(struct readahead_control *ractl,
 		if (folio && !xa_is_value(folio))
 			return; /* Folio apparently present */
 
-		folio = filemap_alloc_folio(gfp_mask, 0);
+		folio = filemap_alloc_folio(gfp_mask, min_order);
 		if (!folio)
 			return;
+
+		index = mapping_align_start_index(mapping, index);
 		if (filemap_add_folio(mapping, folio, index, gfp_mask) < 0) {
 			folio_put(folio);
 			return;
@@ -807,7 +862,7 @@ void readahead_expand(struct readahead_control *ractl,
 			ractl->_workingset = true;
 			psi_memstall_enter(&ractl->_pflags);
 		}
-		ractl->_nr_pages++;
+		ractl->_nr_pages += min_nrpages;
 		ractl->_index = folio->index;
 	}
 
@@ -822,9 +877,11 @@ void readahead_expand(struct readahead_control *ractl,
 		if (folio && !xa_is_value(folio))
 			return; /* Folio apparently present */
 
-		folio = filemap_alloc_folio(gfp_mask, 0);
+		folio = filemap_alloc_folio(gfp_mask, min_order);
 		if (!folio)
 			return;
+
+		index = mapping_align_start_index(mapping, index);
 		if (filemap_add_folio(mapping, folio, index, gfp_mask) < 0) {
 			folio_put(folio);
 			return;
@@ -834,10 +891,10 @@ void readahead_expand(struct readahead_control *ractl,
 			ractl->_workingset = true;
 			psi_memstall_enter(&ractl->_pflags);
 		}
-		ractl->_nr_pages++;
+		ractl->_nr_pages += min_nrpages;
 
 		if (ra) {
-			ra->size++;
-			ra->async_size++;
+			ra->size += min_nrpages;
+			ra->async_size += min_nrpages;
 		}
 	}
 }
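Worked numbers help for the mark computation above. The following userspace sketch (hypothetical values, my own model of the arithmetic, not kernel code) shows a request starting at page 30 on a min-order-2 mapping: the start aligns down to 28, the request grows by the two pages gained, and the readahead mark lands on the first folio-aligned index inside the lookahead window:

#include <assert.h>
#include <stdio.h>

static unsigned long round_up_p2(unsigned long x, unsigned long mult)
{
        return (x + mult - 1) & ~(mult - 1);    /* mult must be a power of two */
}

static unsigned long round_down_p2(unsigned long x, unsigned long mult)
{
        return x & ~(mult - 1);
}

int main(void)
{
        unsigned long min_nrpages = 4;          /* hypothetical min folio order 2 */
        unsigned long orig_index = 30;          /* requested start, not folio aligned */
        unsigned long nr_to_read = 16;
        unsigned long lookahead_size = 8;

        /* mapping_align_start_index(): align the start downwards */
        unsigned long index = round_down_p2(orig_index, min_nrpages);          /* 28 */

        /* first index of the lookahead ("async") region, folio aligned */
        unsigned long ra_folio_index =
                round_up_p2(orig_index + nr_to_read - lookahead_size, min_nrpages); /* 40 */
        unsigned long mark = ra_folio_index - index;                           /* 12 */

        /* grow the request to cover the pages gained by aligning down */
        if (index != orig_index)
                nr_to_read += orig_index - index;                              /* 18 */

        assert(index == 28 && ra_folio_index == 40 && mark == 12 && nr_to_read == 18);
        /* the folio reached when the loop cursor i == mark starts at page 40 */
        assert((index + mark) % min_nrpages == 0);
        printf("index=%lu nr_to_read=%lu mark=%lu\n", index, nr_to_read, mark);
        return 0;
}

The readahead flag therefore always lands on a folio boundary, which is what the `i == mark` test in the rewritten loop relies on.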
From patchwork Wed May 29 13:45:03 2024
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v6 05/11] mm: split a folio in minimum folio order chunks
Date: Wed, 29 May 2024 15:45:03 +0200
Message-Id: <20240529134509.120826-6-kernel@pankajraghav.com>
From: Luis Chamberlain

split_folio() and split_folio_to_list() assume order 0; to support min
order we must expand these to check the folio mapping order and use
that.

Set new_order to be at least the minimum folio order if it is set in
split_huge_page_to_list() so that we can maintain the minimum folio
order requirement in the page cache.

Update the debugfs write files used for testing to ensure the order is
respected as well. We simply enforce the min order when a file mapping
is used.
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
---
 include/linux/huge_mm.h | 14 ++++++++----
 mm/huge_memory.c        | 50 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 87682498a5af..6a8e527b78a2 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -88,6 +88,8 @@ extern struct kobj_attribute shmem_enabled_attr;
 #define thp_vma_allowable_order(vma, vm_flags, tva_flags, order) \
 	(!!thp_vma_allowable_orders(vma, vm_flags, tva_flags, BIT(order)))
 
+#define split_folio(f) split_folio_to_list(f, NULL)
+
 #ifdef CONFIG_PGTABLE_HAS_HUGE_LEAVES
 #define HPAGE_PMD_SHIFT PMD_SHIFT
 #define HPAGE_PUD_SHIFT PUD_SHIFT
@@ -307,9 +309,10 @@ unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long add
 bool can_split_folio(struct folio *folio, int *pextra_pins);
 int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		unsigned int new_order);
+int split_folio_to_list(struct folio *folio, struct list_head *list);
 static inline int split_huge_page(struct page *page)
 {
-	return split_huge_page_to_list_to_order(page, NULL, 0);
+	return split_folio(page_folio(page));
 }
 void deferred_split_folio(struct folio *folio);
 
@@ -474,6 +477,12 @@ static inline int split_huge_page(struct page *page)
 {
 	return 0;
 }
+
+static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
+{
+	return 0;
+}
+
 static inline void deferred_split_folio(struct folio *folio) {}
 #define split_huge_pmd(__vma, __pmd, __address)	\
 	do { } while (0)
@@ -578,7 +587,4 @@ static inline int split_folio_to_order(struct folio *folio, int new_order)
 	return split_folio_to_list_to_order(folio, NULL, new_order);
 }
 
-#define split_folio_to_list(f, l) split_folio_to_list_to_order(f, l, 0)
-#define split_folio(f) split_folio_to_order(f, 0)
-
 #endif /* _LINUX_HUGE_MM_H */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index cf9ead052d2a..e4e0b3431dc6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3068,6 +3068,9 @@ bool can_split_folio(struct folio *folio, int *pextra_pins)
  * released, or if some unexpected race happened (e.g., anon VMA disappeared,
  * truncation).
  *
+ * Callers should ensure that the order respects the address space mapping
+ * min-order if one is set.
+ *
  * Returns -EINVAL when trying to split to an order that is incompatible
  * with the folio. Splitting to order 0 is compatible with all folios.
 */
@@ -3143,6 +3146,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		mapping = NULL;
 		anon_vma_lock_write(anon_vma);
 	} else {
+		unsigned int min_order;
 		gfp_t gfp;
 
 		mapping = folio->mapping;
@@ -3153,6 +3157,14 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 			goto out;
 		}
 
+		min_order = mapping_min_folio_order(folio->mapping);
+		if (new_order < min_order) {
+			VM_WARN_ONCE(1, "Cannot split mapped folio below min-order: %u",
+				     min_order);
+			ret = -EINVAL;
+			goto out;
+		}
+
 		gfp = current_gfp_context(mapping_gfp_mask(mapping) &
 							GFP_RECLAIM_MASK);
@@ -3264,6 +3276,21 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 	return ret;
 }
 
+int split_folio_to_list(struct folio *folio, struct list_head *list)
+{
+	unsigned int min_order = 0;
+
+	if (!folio_test_anon(folio)) {
+		if (!folio->mapping) {
+			count_vm_event(THP_SPLIT_PAGE_FAILED);
+			return -EBUSY;
+		}
+		min_order = mapping_min_folio_order(folio->mapping);
+	}
+
+	return split_huge_page_to_list_to_order(&folio->page, list, min_order);
+}
+
 void __folio_undo_large_rmappable(struct folio *folio)
 {
 	struct deferred_split *ds_queue;
@@ -3493,6 +3520,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 		struct vm_area_struct *vma = vma_lookup(mm, addr);
 		struct page *page;
 		struct folio *folio;
+		unsigned int target_order = new_order;
 
 		if (!vma)
 			break;
@@ -3529,7 +3557,7 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 		if (!folio_trylock(folio))
 			goto next;
 
-		if (!split_folio_to_order(folio, new_order))
+		if (!split_folio_to_order(folio, target_order))
 			split++;
 
 		folio_unlock(folio);
@@ -3572,14 +3600,19 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
 	for (index = off_start; index < off_end; index += nr_pages) {
 		struct folio *folio = filemap_get_folio(mapping, index);
+		unsigned int min_order, target_order = new_order;
 
 		nr_pages = 1;
 		if (IS_ERR(folio))
 			continue;
 
-		if (!folio_test_large(folio))
+		if (!folio->mapping || !folio_test_large(folio))
 			goto next;
 
+		min_order = mapping_min_folio_order(mapping);
+		if (new_order < min_order)
+			target_order = min_order;
+
 		total++;
 		nr_pages = folio_nr_pages(folio);
@@ -3589,7 +3622,18 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
 		if (!folio_trylock(folio))
 			goto next;
 
-		if (!split_folio_to_order(folio, new_order))
+		if (!folio_test_anon(folio)) {
+			unsigned int min_order;
+
+			if (!folio->mapping)
+				goto next;
+
+			min_order = mapping_min_folio_order(folio->mapping);
+			if (new_order < min_order)
+				target_order = min_order;
+		}
+
+		if (!split_folio_to_order(folio, target_order))
			split++;
 
 		folio_unlock(folio);
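To make the new contract concrete, here is a minimal sketch of a
hypothetical caller (try_split_file_folio() is illustrative and not
part of this series; the folio is assumed locked and file-backed). It
clamps a requested split order the same way the debugfs helpers above
do:

	/* Illustrative only: clamp a requested split order to the
	 * mapping's minimum folio order before asking the core to split. */
	static int try_split_file_folio(struct folio *folio, unsigned int order)
	{
		unsigned int min_order = mapping_min_folio_order(folio->mapping);

		/* The core now rejects anything below min_order with -EINVAL. */
		if (order < min_order)
			order = min_order;

		return split_huge_page_to_list_to_order(&folio->page, NULL, order);
	}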
From patchwork Wed May 29 13:45:04 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13678909
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org,
	brauner@kernel.org, willy@infradead.org, djwong@kernel.org
Cc: linux-kernel@vger.kernel.org, hare@suse.de, john.g.garry@oracle.com,
	gost.dev@samsung.com, yang@os.amperecomputing.com, p.raghav@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	mcgrof@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 06/11] filemap: cap PTE range to be created to allowed
	zero fill in folio_map_range()
Date: Wed, 29 May 2024 15:45:04 +0200
Message-Id: <20240529134509.120826-7-kernel@pankajraghav.com>
In-Reply-To: <20240529134509.120826-1-kernel@pankajraghav.com>
References: <20240529134509.120826-1-kernel@pankajraghav.com>

From: Pankaj Raghav

Usually the page cache does not extend beyond the size of the inode,
so no PTEs are created for folios that extend beyond that size. With
LBS support, however, the page cache may extend beyond the size of the
inode, because we need to guarantee folios of the minimum order. Cap
the PTE range created for the page cache at the maximum file end
allowed for zero fill, which is aligned to PAGE_SIZE. An fstests test
has been created to trigger this edge case [0].
[0] https://lore.kernel.org/fstests/20240415081054.1782715-1-mcgrof@kernel.org/

Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
---
 mm/filemap.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 0914ef2e8256..e398fa7b2ef6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3610,7 +3610,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	struct vm_area_struct *vma = vmf->vma;
 	struct file *file = vma->vm_file;
 	struct address_space *mapping = file->f_mapping;
-	pgoff_t last_pgoff = start_pgoff;
+	pgoff_t file_end, last_pgoff = start_pgoff;
 	unsigned long addr;
 	XA_STATE(xas, &mapping->i_pages, start_pgoff);
 	struct folio *folio;
@@ -3636,6 +3636,10 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		goto out;
 	}
 
+	file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
+	if (end_pgoff > file_end)
+		end_pgoff = file_end;
+
 	folio_type = mm_counter_file(folio);
 	do {
 		unsigned long end;
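A worked example of the clamp, with assumed sizes (a 16k block size
filesystem on a 4k page system; the numbers are illustrative, not from
the patch):

	/*
	 * Assumed: 16k blocks, 4k pages, i_size = 10000 bytes. The page
	 * cache holds one 16k folio covering page indices 0-3, but only
	 * pages that overlap i_size may get PTEs:
	 *
	 *	file_end = DIV_ROUND_UP(10000, PAGE_SIZE) - 1 = 2
	 *
	 * so an end_pgoff of 3 is clamped down to 2, and no PTE is ever
	 * created for the purely post-EOF page at index 3.
	 */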
From patchwork Wed May 29 13:45:05 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13678910
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org,
	brauner@kernel.org, willy@infradead.org, djwong@kernel.org
Cc: linux-kernel@vger.kernel.org, hare@suse.de, john.g.garry@oracle.com,
	gost.dev@samsung.com, yang@os.amperecomputing.com, p.raghav@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	mcgrof@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 07/11] iomap: fix iomap_dio_zero() for fs bs > system
	page size
Date: Wed, 29 May 2024 15:45:05 +0200
Message-Id: <20240529134509.120826-8-kernel@pankajraghav.com>
In-Reply-To: <20240529134509.120826-1-kernel@pankajraghav.com>
References: <20240529134509.120826-1-kernel@pankajraghav.com>

From: Pankaj Raghav

iomap_dio_zero() will pad an fs block with zeroes if the direct IO
size is smaller than the fs block size. It carries an implicit
assumption that the fs block size is smaller than the page size, which
is true for most filesystems at the moment. If the block size is
larger than the page size, this sends the contents of the pages next
to the zero page (as len > PAGE_SIZE) to the underlying block device,
causing FS corruption.

iomap is generic infrastructure and it should not make any assumptions
about the fs block size or the page size of the system.

Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
---
After discussing this a bit at LSFMM, it was clear that using a
PMD-sized zero folio might not be a good idea [0]: especially on
platforms with a 64k base page size, the huge zero folio can be as
large as 512M just for zeroing small block sizes in the direct IO
path. The idea of using iomap_init to allocate a 64k zero buffer was
suggested by Dave Chinner, as it gives a decent tradeoff between
memory usage and efficiency. This is a good enough solution for now,
as moving beyond a 64k block size in XFS might take a while. We can
work on a more generic solution in the future to offer differently
sized zero folios beyond 64k.

[0] https://lore.kernel.org/linux-fsdevel/ZkdcAsENj2mBHh91@casper.infradead.org/
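The memory tradeoff is easy to see with the two base page sizes
mentioned above (values are illustrative arithmetic, not from the
patch):

	/*
	 * ZERO_FSB_SIZE is 64k regardless of the base page size:
	 *
	 *	4k pages:  ZERO_FSB_ORDER = get_order(65536) = 4 -> 16 pages
	 *	64k pages: ZERO_FSB_ORDER = get_order(65536) = 0 ->  1 page
	 *
	 * versus a PMD-sized huge zero folio, which would cost 2M on 4k
	 * page systems and up to 512M on 64k page systems.
	 */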
 fs/internal.h          | 8 ++++++++
 fs/iomap/buffered-io.c | 5 +++++
 fs/iomap/direct-io.c   | 9 +++++++--
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 84f371193f74..18eedbb82c50 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -35,6 +35,14 @@ static inline void bdev_cache_init(void)
 int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
 		get_block_t *get_block, const struct iomap *iomap);
 
+/*
+ * iomap/buffered-io.c
+ */
+
+#define ZERO_FSB_SIZE (65536)
+#define ZERO_FSB_ORDER (get_order(ZERO_FSB_SIZE))
+extern struct page *zero_fs_block;
+
 /*
  * char_dev.c
  */

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c5802a459334..2c0149c827cd 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -42,6 +42,7 @@ struct iomap_folio_state {
 };
 
 static struct bio_set iomap_ioend_bioset;
+struct page *zero_fs_block;
 
 static inline bool ifs_is_fully_uptodate(struct folio *folio,
 		struct iomap_folio_state *ifs)
@@ -1998,6 +1999,10 @@ EXPORT_SYMBOL_GPL(iomap_writepages);
 
 static int __init iomap_init(void)
 {
+	zero_fs_block = alloc_pages(GFP_KERNEL | __GFP_ZERO, ZERO_FSB_ORDER);
+	if (!zero_fs_block)
+		return -ENOMEM;
+
 	return bioset_init(&iomap_ioend_bioset, 4 * (PAGE_SIZE / SECTOR_SIZE),
 			   offsetof(struct iomap_ioend, io_bio),
 			   BIOSET_NEED_BVECS);

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index f3b43d223a46..50c2bca8a347 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -236,17 +236,22 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 		loff_t pos, unsigned len)
 {
 	struct inode *inode = file_inode(dio->iocb->ki_filp);
-	struct page *page = ZERO_PAGE(0);
 	struct bio *bio;
 
+	/*
+	 * Max block size supported is 64k
+	 */
+	WARN_ON_ONCE(len > ZERO_FSB_SIZE);
+
 	bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
 	fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
 				  GFP_KERNEL);
+
 	bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	__bio_add_page(bio, page, len, 0);
+	__bio_add_page(bio, zero_fs_block, len, 0);
 	iomap_dio_submit_bio(iter, dio, bio, pos);
 }
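For reference, the corruption mode this closes, spelled out with
assumed sizes (16k block size on 4k pages; illustrative only):

	/*
	 * Before the patch, zeroing the tail of a sub-block direct write
	 * needed len == 16384, while ZERO_PAGE(0) is only PAGE_SIZE (4k):
	 *
	 *	__bio_add_page(bio, ZERO_PAGE(0), 16384, 0);
	 *
	 * The bio would carry 12k of whatever memory follows the zero
	 * page straight to the block device. With zero_fs_block, any len
	 * up to ZERO_FSB_SIZE is backed by actually-zeroed memory, and
	 * the WARN_ON_ONCE() documents the 64k ceiling.
	 */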
From patchwork Wed May 29 13:45:06 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13678911
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org,
	brauner@kernel.org, willy@infradead.org, djwong@kernel.org
Cc: linux-kernel@vger.kernel.org, hare@suse.de, john.g.garry@oracle.com,
	gost.dev@samsung.com, yang@os.amperecomputing.com, p.raghav@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	mcgrof@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Dave Chinner, Pankaj Raghav
Subject: [PATCH v6 08/11] xfs: use kvmalloc for xattr buffers
Date: Wed, 29 May 2024 15:45:06 +0200
Message-Id: <20240529134509.120826-9-kernel@pankajraghav.com>
In-Reply-To: <20240529134509.120826-1-kernel@pankajraghav.com>
References: <20240529134509.120826-1-kernel@pankajraghav.com>

From: Dave Chinner

Pankaj Raghav reported that when the filesystem block size is larger
than the page size, the xattr code can use kmalloc() for high order
allocations. This triggers a useless warning in the allocator, as it
is a __GFP_NOFAIL allocation here:

static inline
struct page *rmqueue(struct zone *preferred_zone,
			struct zone *zone, unsigned int order,
			gfp_t gfp_flags, unsigned int alloc_flags,
			int migratetype)
{
	struct page *page;

	/*
	 * We most definitely don't want callers attempting to
	 * allocate greater than order-1 page units with __GFP_NOFAIL.
	 */
>>>>	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
...

Fix this by changing all these call sites to use kvmalloc(), which
will strip the NOFAIL from the kmalloc attempt and, if that fails, do
a __GFP_NOFAIL vmalloc(). This is not an issue that production systems
will see, as filesystems with block size > page size cannot currently
be mounted by the kernel; Pankaj is developing this functionality
right now.

Reported-by: Pankaj Raghav
Fixes: f078d4ea8276 ("xfs: convert kmem_alloc() to kmalloc()")
Signed-off-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Reviewed-by: Pankaj Raghav
---
 fs/xfs/libxfs/xfs_attr_leaf.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index b9e98950eb3d..09f4cb061a6e 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -1138,10 +1138,7 @@ xfs_attr3_leaf_to_shortform(
 
 	trace_xfs_attr_leaf_to_sf(args);
 
-	tmpbuffer = kmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
-	if (!tmpbuffer)
-		return -ENOMEM;
-
+	tmpbuffer = kvmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
 	memcpy(tmpbuffer, bp->b_addr, args->geo->blksize);
 
 	leaf = (xfs_attr_leafblock_t *)tmpbuffer;
@@ -1205,7 +1202,7 @@ xfs_attr3_leaf_to_shortform(
 	error = 0;
 
 out:
-	kfree(tmpbuffer);
+	kvfree(tmpbuffer);
 	return error;
 }
 
@@ -1613,7 +1610,7 @@ xfs_attr3_leaf_compact(
 
 	trace_xfs_attr_leaf_compact(args);
 
-	tmpbuffer = kmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
+	tmpbuffer = kvmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
 	memcpy(tmpbuffer, bp->b_addr, args->geo->blksize);
 	memset(bp->b_addr, 0, args->geo->blksize);
 	leaf_src = (xfs_attr_leafblock_t *)tmpbuffer;
@@ -1651,7 +1648,7 @@ xfs_attr3_leaf_compact(
 	 */
 	xfs_trans_log_buf(trans, bp, 0, args->geo->blksize - 1);
 
-	kfree(tmpbuffer);
+	kvfree(tmpbuffer);
 }
 
@@ -2330,7 +2327,7 @@ xfs_attr3_leaf_unbalance(
 		struct xfs_attr_leafblock *tmp_leaf;
 		struct xfs_attr3_icleaf_hdr tmphdr;
 
-		tmp_leaf = kzalloc(state->args->geo->blksize,
+		tmp_leaf = kvzalloc(state->args->geo->blksize,
 				GFP_KERNEL | __GFP_NOFAIL);
 
 		/*
@@ -2371,7 +2368,7 @@ xfs_attr3_leaf_unbalance(
 		}
 		memcpy(save_leaf, tmp_leaf, state->args->geo->blksize);
 		savehdr = tmphdr; /* struct copy */
-		kfree(tmp_leaf);
+		kvfree(tmp_leaf);
 	}
 
 	xfs_attr3_leaf_hdr_to_disk(state->args->geo, save_leaf, &savehdr);
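The resulting allocation pattern, shown standalone (a sketch; the
blksize value is assumed, the calls match the patch above):

	/*
	 * For a 64k fs block on a 4k page system this is an order-4
	 * request. kvmalloc() first tries kmalloc() without NOFAIL
	 * semantics, so the rmqueue() warning cannot trigger; on failure
	 * it falls back to a __GFP_NOFAIL vmalloc(), preserving the
	 * no-fail guarantee the callers rely on.
	 */
	tmpbuffer = kvmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
	memcpy(tmpbuffer, bp->b_addr, args->geo->blksize);
	/* ... */
	kvfree(tmpbuffer);	/* correct for kmalloc and vmalloc memory */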
From patchwork Wed May 29 13:45:07 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13678912
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org,
	brauner@kernel.org, willy@infradead.org, djwong@kernel.org
Cc: linux-kernel@vger.kernel.org, hare@suse.de, john.g.garry@oracle.com,
	gost.dev@samsung.com, yang@os.amperecomputing.com, p.raghav@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	mcgrof@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 09/11] xfs: expose block size in stat
Date: Wed, 29 May 2024 15:45:07 +0200
Message-Id: <20240529134509.120826-10-kernel@pankajraghav.com>
In-Reply-To: <20240529134509.120826-1-kernel@pankajraghav.com>
References: <20240529134509.120826-1-kernel@pankajraghav.com>

From: Pankaj Raghav

For block sizes larger than the page size, the unit of efficient IO is
the block size, not the page size. Leaving stat() to report PAGE_SIZE
as the block size causes test programs like fsx to issue illegal
ranges for operations that require block size alignment (e.g.
fallocate() insert range). Hence update the preferred IO size to
reflect the block size in this case.

This change is based on a patch originally from Dave Chinner. [1]

[1] https://lwn.net/ml/linux-fsdevel/20181107063127.3902-16-david@fromorbit.com/

Reviewed-by: Darrick J. Wong
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 fs/xfs/xfs_iops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index ff222827e550..a7883303dee8 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -560,7 +560,7 @@ xfs_stat_blksize(
 		return 1U << mp->m_allocsize_log;
 	}
 
-	return PAGE_SIZE;
+	return max_t(uint32_t, PAGE_SIZE, mp->m_sb.sb_blocksize);
 }
 
 STATIC int
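From userspace the change is visible in st_blksize; a minimal check
(the mount point and file are hypothetical, for illustration only):

	#include <stdio.h>
	#include <sys/stat.h>

	int main(void)
	{
		struct stat st;

		/* On a 16k block size XFS mount this now prints 16384,
		 * not the 4096 page size, so fsx-style tools align their
		 * operations to the fs block. */
		if (stat("/mnt/xfs16k/file", &st) == 0)
			printf("st_blksize = %ld\n", (long)st.st_blksize);
		return 0;
	}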
From patchwork Wed May 29 13:45:08 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13678913
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org,
	brauner@kernel.org, willy@infradead.org, djwong@kernel.org
Cc: linux-kernel@vger.kernel.org, hare@suse.de, john.g.garry@oracle.com,
	gost.dev@samsung.com, yang@os.amperecomputing.com, p.raghav@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	mcgrof@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 10/11] xfs: make the calculation generic in
	xfs_sb_validate_fsb_count()
Date: Wed, 29 May 2024 15:45:08 +0200
Message-Id: <20240529134509.120826-11-kernel@pankajraghav.com>
In-Reply-To: <20240529134509.120826-1-kernel@pankajraghav.com>
References: <20240529134509.120826-1-kernel@pankajraghav.com>

From: Pankaj Raghav

Instead of assuming that PAGE_SHIFT is always larger than the
blocklog, make the calculation generic
so that the page cache count can be calculated correctly for LBS.

Reviewed-by: Darrick J. Wong
Signed-off-by: Pankaj Raghav
---
 fs/xfs/xfs_mount.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 09eef1721ef4..46cb0384143b 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -132,11 +132,19 @@ xfs_sb_validate_fsb_count(
 	xfs_sb_t	*sbp,
 	uint64_t	nblocks)
 {
+	uint64_t	max_index;
+	uint64_t	max_bytes;
+
 	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
 
+	if (check_shl_overflow(nblocks, sbp->sb_blocklog, &max_bytes))
+		return -EFBIG;
+
 	/* Limited by ULONG_MAX of page cache index */
-	if (nblocks >> (PAGE_SHIFT - sbp->sb_blocklog) > ULONG_MAX)
+	max_index = max_bytes >> PAGE_SHIFT;
+
+	if (max_index > ULONG_MAX)
 		return -EFBIG;
+
 	return 0;
 }
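Why the old expression cannot be kept for LBS, worked through with
assumed values (64k blocks on a 4k page system):

	/*
	 * sb_blocklog = 16, PAGE_SHIFT = 12:
	 *
	 *	nblocks >> (PAGE_SHIFT - sbp->sb_blocklog)  // shift by -4: UB
	 *
	 * The generic form goes through bytes instead, with an explicit
	 * overflow check:
	 *
	 *	check_shl_overflow(nblocks, 16, &max_bytes); // nblocks << 16
	 *	max_index = max_bytes >> PAGE_SHIFT;         // bytes -> pages
	 */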
From patchwork Wed May 29 13:45:09 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13678914
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, chandan.babu@oracle.com, akpm@linux-foundation.org,
	brauner@kernel.org, willy@infradead.org, djwong@kernel.org
Cc: linux-kernel@vger.kernel.org, hare@suse.de, john.g.garry@oracle.com,
	gost.dev@samsung.com, yang@os.amperecomputing.com, p.raghav@samsung.com,
	cl@os.amperecomputing.com, linux-xfs@vger.kernel.org, hch@lst.de,
	mcgrof@kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v6 11/11] xfs: enable block size larger than page size support
Date: Wed, 29 May 2024 15:45:09 +0200
Message-Id: <20240529134509.120826-12-kernel@pankajraghav.com>
In-Reply-To: <20240529134509.120826-1-kernel@pankajraghav.com>
References: <20240529134509.120826-1-kernel@pankajraghav.com>

From: Pankaj Raghav

The page cache now has the ability to have a minimum order when
allocating a folio, which is a prerequisite for supporting block
sizes larger than the page size.

Reviewed-by: Darrick J. Wong
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 fs/xfs/libxfs/xfs_ialloc.c |  5 +++++
 fs/xfs/libxfs/xfs_shared.h |  3 +++
 fs/xfs/xfs_icache.c        |  6 ++++--
 fs/xfs/xfs_mount.c         |  1 -
 fs/xfs/xfs_super.c         | 18 ++++++++++--------
 5 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 14c81f227c5b..1e76431d75a4 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -3019,6 +3019,11 @@ xfs_ialloc_setup_geometry(
 		igeo->ialloc_align = mp->m_dalign;
 	else
 		igeo->ialloc_align = 0;
+
+	if (mp->m_sb.sb_blocksize > PAGE_SIZE)
+		igeo->min_folio_order = mp->m_sb.sb_blocklog - PAGE_SHIFT;
+	else
+		igeo->min_folio_order = 0;
 }
 
 /* Compute the location of the root directory inode that is laid out by mkfs. */

diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index 34f104ed372c..e67a1c7cc0b0 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -231,6 +231,9 @@ struct xfs_ino_geometry {
 	/* precomputed value for di_flags2 */
 	uint64_t new_diflags2;
 
+	/* minimum folio order of a page cache allocation */
+	unsigned int min_folio_order;
+
 };

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 0953163a2d84..5ed3dc9e7d90 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -89,7 +89,8 @@ xfs_inode_alloc(
 	/* VFS doesn't initialise i_mode or i_state! */
 	VFS_I(ip)->i_mode = 0;
 	VFS_I(ip)->i_state = 0;
-	mapping_set_large_folios(VFS_I(ip)->i_mapping);
+	mapping_set_folio_min_order(VFS_I(ip)->i_mapping,
+				    M_IGEO(mp)->min_folio_order);
 
 	XFS_STATS_INC(mp, vn_active);
 	ASSERT(atomic_read(&ip->i_pincount) == 0);
@@ -324,7 +325,8 @@ xfs_reinit_inode(
 	inode->i_rdev = dev;
 	inode->i_uid = uid;
 	inode->i_gid = gid;
-	mapping_set_large_folios(inode->i_mapping);
+	mapping_set_folio_min_order(inode->i_mapping,
+				    M_IGEO(mp)->min_folio_order);
 
 	return error;
 }

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 46cb0384143b..a99454208807 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -135,7 +135,6 @@ xfs_sb_validate_fsb_count(
 	uint64_t	max_index;
 	uint64_t	max_bytes;
 
-	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
 
 	if (check_shl_overflow(nblocks, sbp->sb_blocklog, &max_bytes))

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 27e9f749c4c7..b8a93a8f35ca 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1638,16 +1638,18 @@ xfs_fs_fill_super(
 		goto out_free_sb;
 	}
 
-	/*
-	 * Until this is fixed only page-sized or smaller data blocks work.
-	 */
 	if (mp->m_sb.sb_blocksize > PAGE_SIZE) {
-		xfs_warn(mp,
-		"File system with blocksize %d bytes. "
-		"Only pagesize (%ld) or less will currently work.",
+		if (!xfs_has_crc(mp)) {
+			xfs_warn(mp,
+"V4 Filesystem with blocksize %d bytes. Only pagesize (%ld) or less is supported.",
 				mp->m_sb.sb_blocksize, PAGE_SIZE);
-		error = -ENOSYS;
-		goto out_free_sb;
+			error = -ENOSYS;
+			goto out_free_sb;
+		}
+
+		xfs_warn(mp,
+"EXPERIMENTAL: V5 Filesystem with Large Block Size (%d bytes) enabled.",
+			mp->m_sb.sb_blocksize);
 	}
 
 	/* Ensure this filesystem fits in the page cache limits */
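To close, the geometry computed above, worked through for an assumed
64k block size filesystem on a 4k page system (illustrative
arithmetic):

	/*
	 * sb_blocklog = 16, PAGE_SHIFT = 12:
	 *
	 *	min_folio_order = 16 - 12 = 4
	 *
	 * so mapping_set_folio_min_order() guarantees every folio in this
	 * inode's page cache spans at least 2^4 pages = 64k, i.e. one fs
	 * block. On a 64k base page system the order is 0 and behaviour
	 * is unchanged.
	 */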