From patchwork Fri Feb 21 22:38:16 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13986396 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA69D1FF612; Fri, 21 Feb 2025 22:38:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177511; cv=none; b=kABLXl0pFKC3s9TZi7dNZ7dNQ8pn4ohjLLDti7G8Jl1sV5bOw3bZsxr65t+3XA/qcvU7upJroB7eSw1EJ03ySv46qSRMPv4FRj83WOFE0/Ph/odtfxp1Df41H877N28B0jMmbo70JXTYkGNc2amxb5ACbinCDuUprVJogFkq9uc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177511; c=relaxed/simple; bh=SUSEWwwPXKlY38GdAKi9a07ZmDy4DSycjiy6IDA8QGw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KsDd2CLxODm/8pg8EHGG3dTt6SZpdz7inyeRMXbGZhstTF+b8/TcYwsGnsUWGHeU+j5Niamyxh4SW62LmIv5izoWWtWYjOBJtk0aLMQLjq/STvNqUG6DEEwijp30tX31IM4fq93v76W3uPj+aSFeFO3cos72R2UF8ciBH3L49Wg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=QjrfoaCI; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="QjrfoaCI" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=9SL9dFZnxTYWHIK0ELgxGzaHF/BgeRzk+6m5RMpyXOQ=; b=QjrfoaCIfvDWXPGcUb8l9WbJQ1 ILfiLiecFwoGMbekJ5E0Q7wwhRpMhxjYb5SKNf2xsEiUlNCS7n9rJ4tOlO07A3uRPG7Vz7geVOVdD sWwDNoOweGQt0ze4/euWidilzry8UydBfqH5T2mvoRkCIz2VF9w5HUbvQoZ3ZB9/Uj40SlikgLFbX lkc8xCuyU6zLVLVjuArqQ+JpTriOEsDSpL4SWQLMX/p9ONXK0qYirstmus9XW5ZKfDl2PvGkdB7m4 l1I+Wzs0gY4t8f0ujfCjIcGUz7s0mBQAN+5zV5BAgWH81wGS7jKZzjClWHNEJ2UWran++CHIpmSfC ZlW1ekFw==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tlbf7-000000073D0-36QQ; Fri, 21 Feb 2025 22:38:25 +0000 From: Luis Chamberlain To: brauner@kernel.org, akpm@linux-foundation.org, hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v3 1/8] fs/buffer: simplify block_read_full_folio() with bh_offset() Date: Fri, 21 Feb 2025 14:38:16 -0800 Message-ID: <20250221223823.1680616-2-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250221223823.1680616-1-mcgrof@kernel.org> References: <20250221223823.1680616-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain When we read over all buffers in a folio we currently use the buffer index on the folio and blocksize to get the offset. Simplify this with bh_offset(). This simplifies the loop while making no functional changes. Suggested-by: Matthew Wilcox Reviewed-by: Hannes Reinecke Signed-off-by: Luis Chamberlain --- fs/buffer.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index cc8452f60251..b99560e8a142 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -2381,7 +2381,6 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block) lblock = div_u64(limit + blocksize - 1, blocksize); bh = head; nr = 0; - i = 0; do { if (buffer_uptodate(bh)) @@ -2398,7 +2397,7 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block) page_error = true; } if (!buffer_mapped(bh)) { - folio_zero_range(folio, i * blocksize, + folio_zero_range(folio, bh_offset(bh), blocksize); if (!err) set_buffer_uptodate(bh); @@ -2412,7 +2411,7 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block) continue; } arr[nr++] = bh; - } while (i++, iblock++, (bh = bh->b_this_page) != head); + } while (iblock++, (bh = bh->b_this_page) != head); if (fully_mapped) folio_set_mappedtodisk(folio); From patchwork Fri Feb 21 22:38:17 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13986397 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA704204698; Fri, 21 Feb 2025 22:38:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177511; cv=none; b=KsO4eXQpxFBr2Hj1v3+9q/zeNX2kZIPg8LQhtVIiypXZikkpG3KYXX9oPHuNGXdJbWPysOdAZtHtlTIKLo6LMnuAO8/CmzNPM0q+P2FJwdCYFsXN8Ks5ZELNVEnQYmF89xCEz1rBIHY6szad/k/DV9eCgxorOKgrCqpMDXqnJrU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177511; c=relaxed/simple; bh=t6TYAnZVnQOFpQ10O87bLj7qiZUW1QA9U/iuFzK5rIY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OPCUUTg1wX3E0JsZyWbt80/v0RTPn/KBZTgEeP2SsTKZwpzRtboRWsn8BFFydfwGtj1zrZ5fFNQlvpDi0ZUGi85JR5Z+vSzUsn75MyhSVZrPu5hKaJytjtUPSxLnum14NUHne60/DxA1EkJliyo3gwlvbyUhofl8gnUUuGIu6Ak= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=T3tx0wBJ; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="T3tx0wBJ" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=fpTh1KbPClQQAibvuLbxjJCGXlYIrXiRedMs+nLQf5A=; b=T3tx0wBJhUa4LW3rSL9hlLLjCm /nfyI88jHLYZ/es2eDKg6NggJ3wwhw7XRGc9qVH/+wHe3uGIMtnKEORNzbYCn/KloX86/WmPANPs7 7CeCqTCrLZgF0FO242DBCTVmLUCHi3DOrJhys6RXzmVCZmf+rEuujTDpTM5m58KVPJnnpbo6LE/4a Qe3Y7/gGuc3cdgPUzM5kfvJjrBoYZp7pdh+sLmLUqr99v01fYrGwUDVyGYkDmPhIBjj0UVmnNqksP k0wOoQC3TfvK+cpGqKrB7zkGYtnNc3C6jcBC20z2UscC1nM0zc6MAjDRzH/oiBZ034AeNwaC5coGX CDikMjWw==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tlbf7-000000073D2-3GTV; Fri, 21 Feb 2025 22:38:25 +0000 From: Luis Chamberlain To: brauner@kernel.org, akpm@linux-foundation.org, hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v3 2/8] fs/buffer: remove batching from async read Date: Fri, 21 Feb 2025 14:38:17 -0800 Message-ID: <20250221223823.1680616-3-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250221223823.1680616-1-mcgrof@kernel.org> References: <20250221223823.1680616-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Matthew Wilcox block_read_full_folio() currently puts all !uptodate buffers into an array allocated on the stack, then iterates over it twice, first locking the buffers and then submitting them for read. We want to remove this array because it occupies too much stack space on configurations with a larger PAGE_SIZE (eg 512 bytes with 8 byte pointers and a 64KiB PAGE_SIZE). We cannot simply submit buffer heads as we find them as the completion handler needs to be able to tell when all reads are finished, so it can end the folio read. So we keep one buffer in reserve (using the 'prev' variable) until the end of the function. Reviewed-by: Hannes Reinecke Signed-off-by: Matthew Wilcox (Oracle) Signed-off-by: Luis Chamberlain --- fs/buffer.c | 51 +++++++++++++++++++++------------------------------ 1 file changed, 21 insertions(+), 30 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index b99560e8a142..167fa3e33566 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -2361,9 +2361,8 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block) { struct inode *inode = folio->mapping->host; sector_t iblock, lblock; - struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE]; + struct buffer_head *bh, *head, *prev = NULL; size_t blocksize; - int nr, i; int fully_mapped = 1; bool page_error = false; loff_t limit = i_size_read(inode); @@ -2380,7 +2379,6 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block) iblock = div_u64(folio_pos(folio), blocksize); lblock = div_u64(limit + blocksize - 1, blocksize); bh = head; - nr = 0; do { if (buffer_uptodate(bh)) @@ -2410,40 +2408,33 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block) if (buffer_uptodate(bh)) continue; } - arr[nr++] = bh; + + lock_buffer(bh); + if (buffer_uptodate(bh)) { + unlock_buffer(bh); + continue; + } + + mark_buffer_async_read(bh); + if (prev) + submit_bh(REQ_OP_READ, prev); + prev = bh; } while (iblock++, (bh = bh->b_this_page) != head); if (fully_mapped) folio_set_mappedtodisk(folio); - if (!nr) { - /* - * All buffers are uptodate or get_block() returned an - * error when trying to map them - we can finish the read. - */ - folio_end_read(folio, !page_error); - return 0; - } - - /* Stage two: lock the buffers */ - for (i = 0; i < nr; i++) { - bh = arr[i]; - lock_buffer(bh); - mark_buffer_async_read(bh); - } - /* - * Stage 3: start the IO. Check for uptodateness - * inside the buffer lock in case another process reading - * the underlying blockdev brought it uptodate (the sct fix). + * All buffers are uptodate or get_block() returned an error + * when trying to map them - we must finish the read because + * end_buffer_async_read() will never be called on any buffer + * in this folio. */ - for (i = 0; i < nr; i++) { - bh = arr[i]; - if (buffer_uptodate(bh)) - end_buffer_async_read(bh, 1); - else - submit_bh(REQ_OP_READ, bh); - } + if (prev) + submit_bh(REQ_OP_READ, prev); + else + folio_end_read(folio, !page_error); + return 0; } EXPORT_SYMBOL(block_read_full_folio); From patchwork Fri Feb 21 22:38:18 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13986404 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 85B7D2397BF; Fri, 21 Feb 2025 22:38:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177513; cv=none; b=DKlUf7bywhspU7b/fybLWqQ6F0WQjMq/QT0HxZbQYeTmBUduIuW1oevXHFRTR35aaBjqgg2cKy1j1mTbVtAVxXLDl6YnSkeRreMYxSLNqi2ACOhtaIfidfTpvtfOm3hpNqAtySMb3x+jXVZP6MGdb80NVTZ4n47/GT8vlHrrEVw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177513; c=relaxed/simple; bh=hh+kjJOX3Lo3l4ILL+fbseEofYE8t0mReKLbQTa2XbM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CL2T6s2NEFXu9NIJae8Q1ypxPD9Ed+Jx3LQC+NUjoutmKNDGBNSPl9T1XTofIJpgMikWR8hyLHeFy0ov/RiVHFHczrIUg+s9KrCimm9JIpuvjPotDGEa9snNUXD8p3TBOTX1XjOTXqokLamGkXODZZ6nMQNCO4MyTRl/L0XfUoo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=j0PYVvwX; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="j0PYVvwX" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=GoCon0Fq6X63chDqWaV9ZGQ+ZBO2iIy6otYzhff4lOE=; b=j0PYVvwXI7eCRUoi3gZOV2na2H Bj98iRzd1YMF4Lz6r9LwSxt53op4Q5kXHZRxGptgTBnpTPcgVrjtVTzNrCNH9l6rWodcKn1uzPvD9 5Ugsk0CiQqBqoK3Z1WXfrcTsqQ4lWh+ibde8Kyn/b37D0hn79Jyk5ikSCLn85SAI3yk2Siu3h8Rcn WBQJFYEGyP08H9hJUa6NJvNezR+bLm0zKeLxDXBLYKEHGofvghjAj8odarDyzARiJBcNYIk34HXt5 p8plBqyKTGkpOZtRcaOYmBc4TH/6BHg024qY5gApNVtEJBdUVAa+0WQsf/4QcJHNgJe3/pdPjNLkg Y+5JFOAA==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tlbf7-000000073D5-3Plg; Fri, 21 Feb 2025 22:38:25 +0000 From: Luis Chamberlain To: brauner@kernel.org, akpm@linux-foundation.org, hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org, Hannes Reinecke Subject: [PATCH v3 3/8] fs/mpage: avoid negative shift for large blocksize Date: Fri, 21 Feb 2025 14:38:18 -0800 Message-ID: <20250221223823.1680616-4-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250221223823.1680616-1-mcgrof@kernel.org> References: <20250221223823.1680616-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Hannes Reinecke For large blocksizes the number of block bits is larger than PAGE_SHIFT, so calculate the sector number from the byte offset instead. This is required to enable large folios with buffer-heads. Reviewed-by: Matthew Wilcox (Oracle) Signed-off-by: Luis Chamberlain Signed-off-by: Hannes Reinecke --- fs/mpage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/mpage.c b/fs/mpage.c index 82aecf372743..a3c82206977f 100644 --- a/fs/mpage.c +++ b/fs/mpage.c @@ -181,7 +181,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) if (folio_buffers(folio)) goto confused; - block_in_file = (sector_t)folio->index << (PAGE_SHIFT - blkbits); + block_in_file = folio_pos(folio) >> blkbits; last_block = block_in_file + args->nr_pages * blocks_per_page; last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits; if (last_block > last_block_in_file) @@ -527,7 +527,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, * The page has no buffers: map it to disk */ BUG_ON(!folio_test_uptodate(folio)); - block_in_file = (sector_t)folio->index << (PAGE_SHIFT - blkbits); + block_in_file = folio_pos(folio) >> blkbits; /* * Whole page beyond EOF? Skip allocating blocks to avoid leaking * space. From patchwork Fri Feb 21 22:38:19 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13986400 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 593EB205500; Fri, 21 Feb 2025 22:38:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; cv=none; b=J5Ok3HPY7zZc5dn8yZ6vwWyPhlVZG6C/YPd3ISwRbeVJRKwq6RRedfCY9ZhAveHD7ZVTvQv1gFn6I6yGMAISBxF3yJXmzxVb+hwWqqdreQVgVfrePDbmbzDKLIRBu0MPnNHKzjO9d7CSzAo/SgR9MovKRTGEs19rDq3B9s41sns= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; c=relaxed/simple; bh=ma96Qs2kim9HgDwJCKBpXdNPaDUm+CmD2jgiVM/hR6A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=i6pFHmnR/CHJcnqbktsARLPe9ESR5xoLrng/bl0XLOA8IDOTl0tPepe9wqXZvBGSllJdylk4kBhtn9bq+3uH3Y8IU4fcIPzske5NRww+xrOGIjz9eBrMkdinrIAXLquD7SRvTtI2XNmtqEM0qNl2x+raa1enqIQtRi6yizk2CKE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=SSoOiVej; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="SSoOiVej" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=+ubD7J9wYkED4fOf4+EQ9LXTiGFTdc0o5ColS8uVAP8=; b=SSoOiVejG4f2uQMzI6veQFB34H bOsLDfXVdlrvZ15lpIx/WnblMH4RvdUHujStmPHPCDwC+n6glopmf6jiSp+VpdwPcx6o/c27cfoSw whXgslKmM9kDiDS0UmEat01qMZv/NiaTOywGu5NyZe/IZisj95t4MxS+UpKCvTdEZ8YxHboIwNZkC XNRVsX2G4n8xTy/7U1jmQvs0R+bGPehAuDdwXJieeVO5LFJsoWERuxBF8BC1nm6R8AbK6BjgUgsZr Vfdsd4A8Xu5L20YpC1ZLmYiLB0EeKZUh8G4wvcBW+0XTxKbBasaOrpCuIyxFXhe32r9GiUodrYFkK UP3OFF0A==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tlbf7-000000073D7-3Yz7; Fri, 21 Feb 2025 22:38:25 +0000 From: Luis Chamberlain To: brauner@kernel.org, akpm@linux-foundation.org, hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v3 4/8] fs/mpage: use blocks_per_folio instead of blocks_per_page Date: Fri, 21 Feb 2025 14:38:19 -0800 Message-ID: <20250221223823.1680616-5-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250221223823.1680616-1-mcgrof@kernel.org> References: <20250221223823.1680616-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain Convert mpage to folios and adjust accounting for the number of blocks within a folio instead of a single page. This also adjusts the number of pages we should process to be the size of the folio to ensure we always read a full folio. Note that the page cache code already ensures do_mpage_readpage() will work with folios respecting the address space min order, this ensures that so long as folio_size() is used for our requirements mpage will also now be able to process block sizes larger than the page size. Originally-by: Hannes Reinecke Signed-off-by: Luis Chamberlain --- fs/mpage.c | 42 +++++++++++++++++++++--------------------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/fs/mpage.c b/fs/mpage.c index a3c82206977f..9c8cf4015238 100644 --- a/fs/mpage.c +++ b/fs/mpage.c @@ -107,7 +107,7 @@ static void map_buffer_to_folio(struct folio *folio, struct buffer_head *bh, * don't make any buffers if there is only one buffer on * the folio and the folio just needs to be set up to date */ - if (inode->i_blkbits == PAGE_SHIFT && + if (inode->i_blkbits == folio_shift(folio) && buffer_uptodate(bh)) { folio_mark_uptodate(folio); return; @@ -153,7 +153,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) struct folio *folio = args->folio; struct inode *inode = folio->mapping->host; const unsigned blkbits = inode->i_blkbits; - const unsigned blocks_per_page = PAGE_SIZE >> blkbits; + const unsigned blocks_per_folio = folio_size(folio) >> blkbits; const unsigned blocksize = 1 << blkbits; struct buffer_head *map_bh = &args->map_bh; sector_t block_in_file; @@ -161,7 +161,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) sector_t last_block_in_file; sector_t first_block; unsigned page_block; - unsigned first_hole = blocks_per_page; + unsigned first_hole = blocks_per_folio; struct block_device *bdev = NULL; int length; int fully_mapped = 1; @@ -182,7 +182,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) goto confused; block_in_file = folio_pos(folio) >> blkbits; - last_block = block_in_file + args->nr_pages * blocks_per_page; + last_block = block_in_file + ((args->nr_pages * PAGE_SIZE) >> blkbits); last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits; if (last_block > last_block_in_file) last_block = last_block_in_file; @@ -204,7 +204,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) clear_buffer_mapped(map_bh); break; } - if (page_block == blocks_per_page) + if (page_block == blocks_per_folio) break; page_block++; block_in_file++; @@ -216,7 +216,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) * Then do more get_blocks calls until we are done with this folio. */ map_bh->b_folio = folio; - while (page_block < blocks_per_page) { + while (page_block < blocks_per_folio) { map_bh->b_state = 0; map_bh->b_size = 0; @@ -229,7 +229,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) if (!buffer_mapped(map_bh)) { fully_mapped = 0; - if (first_hole == blocks_per_page) + if (first_hole == blocks_per_folio) first_hole = page_block; page_block++; block_in_file++; @@ -247,7 +247,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) goto confused; } - if (first_hole != blocks_per_page) + if (first_hole != blocks_per_folio) goto confused; /* hole -> non-hole */ /* Contiguous blocks? */ @@ -260,7 +260,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) if (relative_block == nblocks) { clear_buffer_mapped(map_bh); break; - } else if (page_block == blocks_per_page) + } else if (page_block == blocks_per_folio) break; page_block++; block_in_file++; @@ -268,8 +268,8 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) bdev = map_bh->b_bdev; } - if (first_hole != blocks_per_page) { - folio_zero_segment(folio, first_hole << blkbits, PAGE_SIZE); + if (first_hole != blocks_per_folio) { + folio_zero_segment(folio, first_hole << blkbits, folio_size(folio)); if (first_hole == 0) { folio_mark_uptodate(folio); folio_unlock(folio); @@ -303,10 +303,10 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) relative_block = block_in_file - args->first_logical_block; nblocks = map_bh->b_size >> blkbits; if ((buffer_boundary(map_bh) && relative_block == nblocks) || - (first_hole != blocks_per_page)) + (first_hole != blocks_per_folio)) args->bio = mpage_bio_submit_read(args->bio); else - args->last_block_in_bio = first_block + blocks_per_page - 1; + args->last_block_in_bio = first_block + blocks_per_folio - 1; out: return args->bio; @@ -385,7 +385,7 @@ int mpage_read_folio(struct folio *folio, get_block_t get_block) { struct mpage_readpage_args args = { .folio = folio, - .nr_pages = 1, + .nr_pages = folio_nr_pages(folio), .get_block = get_block, }; @@ -456,12 +456,12 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, struct address_space *mapping = folio->mapping; struct inode *inode = mapping->host; const unsigned blkbits = inode->i_blkbits; - const unsigned blocks_per_page = PAGE_SIZE >> blkbits; + const unsigned blocks_per_folio = folio_size(folio) >> blkbits; sector_t last_block; sector_t block_in_file; sector_t first_block; unsigned page_block; - unsigned first_unmapped = blocks_per_page; + unsigned first_unmapped = blocks_per_folio; struct block_device *bdev = NULL; int boundary = 0; sector_t boundary_block = 0; @@ -486,12 +486,12 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, */ if (buffer_dirty(bh)) goto confused; - if (first_unmapped == blocks_per_page) + if (first_unmapped == blocks_per_folio) first_unmapped = page_block; continue; } - if (first_unmapped != blocks_per_page) + if (first_unmapped != blocks_per_folio) goto confused; /* hole -> non-hole */ if (!buffer_dirty(bh) || !buffer_uptodate(bh)) @@ -536,7 +536,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, goto page_is_mapped; last_block = (i_size - 1) >> blkbits; map_bh.b_folio = folio; - for (page_block = 0; page_block < blocks_per_page; ) { + for (page_block = 0; page_block < blocks_per_folio; ) { map_bh.b_state = 0; map_bh.b_size = 1 << blkbits; @@ -618,14 +618,14 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, BUG_ON(folio_test_writeback(folio)); folio_start_writeback(folio); folio_unlock(folio); - if (boundary || (first_unmapped != blocks_per_page)) { + if (boundary || (first_unmapped != blocks_per_folio)) { bio = mpage_bio_submit_write(bio); if (boundary_block) { write_boundary_block(boundary_bdev, boundary_block, 1 << blkbits); } } else { - mpd->last_block_in_bio = first_block + blocks_per_page - 1; + mpd->last_block_in_bio = first_block + blocks_per_folio - 1; } goto out; From patchwork Fri Feb 21 22:38:20 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13986402 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E38F320766D; Fri, 21 Feb 2025 22:38:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; cv=none; b=M3CTHvDMNBoIZH0Zaj9xzfAOHR10IggwEj4Z41bZGqmXKLmm+cav+/irxPj74UqbXLAhFEuGao3VOzFQFDx9AlPIoL4I8VaZsxOFYAaDkdSRUFgf3ysAP5wz8IRPSQ8yEeyWaPXZht58o16TjFSgYmBNuH5FkwBFRrVPFwEq6KM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; c=relaxed/simple; bh=wWfr49DKRm/B+SXkMQ63cMJsOmEaf6hhkzN7xVPhKwA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=iHsfkV+3TyOQStoZ4OTcpgmMeS0gicrbPn0jZTabH+eiIlLKjSLPNLoIMiX3NsUV9EbxikvGoEqodJ8QeLfs+JdQLrrvYA2GdFmdPxGM/HhOvxmBW6PvjgV3xW0SidI+2S5VgvHdaK+ZKQFFZvHtiQHXTbS2c4AVd0vv9yyoN/U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=fNM7nLrE; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="fNM7nLrE" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=rJT3XHCnmeFYD6/zws/j5/iie6UCiOo1LUIbVleTx0A=; b=fNM7nLrEP3BEk0dY3WSxjlab7o 6lNb3AIH3BFvSoLugZ4WNdP7IG8Cg8W2ch0jq0AhMywPJKPaiogqFwSJZ0ECdy04X3UTKESBvU3PK JLx5CfzV5CaKjs818jt1jtN+H1FfwqHJX8EgMqHeh9vUT2+WlVUy1uNkgTfJ1gfYl0+3CcDLUxUUM qurRxxCwJtcyZ6hdL9364M+WTQuluLk41RQbAQkWKGiYdDig5Ixpr43QFsEpKnObdxnu1YrnaG8US RCV/7DPIvZHWOrh2rN57C9HA5SQCON7LK59aMOl8vXC4bqPmhEug6c9iaai5HcqdDKE0ssxa0/RZY ckh02Tog==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tlbf7-000000073DB-3iIF; Fri, 21 Feb 2025 22:38:25 +0000 From: Luis Chamberlain To: brauner@kernel.org, akpm@linux-foundation.org, hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v3 5/8] fs/buffer fs/mpage: remove large folio restriction Date: Fri, 21 Feb 2025 14:38:20 -0800 Message-ID: <20250221223823.1680616-6-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250221223823.1680616-1-mcgrof@kernel.org> References: <20250221223823.1680616-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain Now that buffer-heads has been converted over to support large folios we can remove the built-in VM_BUG_ON_FOLIO() checks which prevents their use. Reviewed-by: Hannes Reinecke Reviewed-by: Matthew Wilcox (Oracle) Signed-off-by: Luis Chamberlain --- fs/buffer.c | 2 -- fs/mpage.c | 3 --- 2 files changed, 5 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 167fa3e33566..194eacbefc95 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -2371,8 +2371,6 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block) if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode)) limit = inode->i_sb->s_maxbytes; - VM_BUG_ON_FOLIO(folio_test_large(folio), folio); - head = folio_create_buffers(folio, inode, 0); blocksize = head->b_size; diff --git a/fs/mpage.c b/fs/mpage.c index 9c8cf4015238..ad7844de87c3 100644 --- a/fs/mpage.c +++ b/fs/mpage.c @@ -170,9 +170,6 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) unsigned relative_block; gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL); - /* MAX_BUF_PER_PAGE, for example */ - VM_BUG_ON_FOLIO(folio_test_large(folio), folio); - if (args->is_readahead) { opf |= REQ_RAHEAD; gfp |= __GFP_NORETRY | __GFP_NOWARN; From patchwork Fri Feb 21 22:38:21 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13986398 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56D752045BC; Fri, 21 Feb 2025 22:38:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; cv=none; b=fVPBBdoXiV5xKiP7irXQrQAKUK3M7BYDnTbce725evYCATEkV0e9988RKXdj3I9yR0VbTdYtZbHnCYyR+COn8VJHMkFQjRyWYnOhDhMVRRr6/UPawgqOCeuA1qBaBa0LqezaXTaZQ6fujolXYEN4KREoAMtsg5KkKk0Q1cpiI8w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; c=relaxed/simple; bh=xlVW+nmNUU3Mx7kjCHAf4g3c3X46e1nkPyeVOwrtLYU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LZX6XcA+JiMa3sq12HYmgzzBfx+DWNAOtyKUTDJJik7IZLEYpwKw+u0B8/+d19cizwVCDPUk/u8jDssI1/fZt3B+v0w6pJ58kUe+kkUEQvrbKSJMkZ5jkBfvudenyJMnWOjzuDTspPSKU9btDAGu4bk6vb9YwmkQu7yBO4A7KIk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=XDnfMo1V; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="XDnfMo1V" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=3/8moX70+aSKaT13TlyhV/XaFEGoqVjVhuDSY9DkRzU=; b=XDnfMo1Vwdj4jcr8Jvs3oJ0PuK s24IOa7rDwYGgQAUm0OWzKX5z9w431aQJmyM6li85UjqomVbjd1UDPxOLb3Q3vMkcBcuY+tjeqIp5 y/bPWO+YhAC2JyoWReaELre7hSPE3h7sHi0ob8piVq4oyfWNdonXcTLygztsa6Vmbiida6D19SJJU IwYVaxi6yPulxaGEpq7RVAlyotRmFlbt7LSWQ0Ls7FRpICGq4NDKIaqeKvuY48/PVFA6eywoIRFMI LARMsRH1stIV3oiPSLhqH7+ETYc4NRQORSuf1bnRP3EGwEgt71j+zSlHDJPCJtNGAjjF5BQih7HFK XcE6Tzdg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tlbf7-000000073DD-3rzt; Fri, 21 Feb 2025 22:38:25 +0000 From: Luis Chamberlain To: brauner@kernel.org, akpm@linux-foundation.org, hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v3 6/8] block/bdev: enable large folio support for large logical block sizes Date: Fri, 21 Feb 2025 14:38:21 -0800 Message-ID: <20250221223823.1680616-7-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250221223823.1680616-1-mcgrof@kernel.org> References: <20250221223823.1680616-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain From: Hannes Reinecke Call mapping_set_folio_min_order() when modifying the logical block size to ensure folios are allocated with the correct size. Reviewed-by: Luis Chamberlain Reviewed-by: Matthew Wilcox (Oracle) Signed-off-by: Hannes Reinecke --- block/bdev.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/bdev.c b/block/bdev.c index 9d73a8fbf7f9..8aadf1f23cb4 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -148,6 +148,8 @@ static void set_init_blocksize(struct block_device *bdev) bsize <<= 1; } BD_INODE(bdev)->i_blkbits = blksize_bits(bsize); + mapping_set_folio_min_order(BD_INODE(bdev)->i_mapping, + get_order(bsize)); } int set_blocksize(struct file *file, int size) @@ -169,6 +171,7 @@ int set_blocksize(struct file *file, int size) if (inode->i_blkbits != blksize_bits(size)) { sync_blockdev(bdev); inode->i_blkbits = blksize_bits(size); + mapping_set_folio_min_order(inode->i_mapping, get_order(size)); kill_bdev(bdev); } return 0; From patchwork Fri Feb 21 22:38:22 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13986403 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1658220966B; Fri, 21 Feb 2025 22:38:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; cv=none; b=DLvxcMij9fRn/8exeNQ2ozpMoy439i9VPdW+cO7wKzrwYl/AcKZn8Kawy+1V5OLvuTyiUu8dPzVZynLJxxyIx71GK+OQaEvHASC9q+uDa5XuEHofhgdeiujYMt1/3XIQbXyzi1O2X4gnnwl7JTS5SMVr0EE8iy13M6ZovTvunZ4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; c=relaxed/simple; bh=S/We1UmV8DPFMcyNwpf63ueboaDW9xHq67ntzhpX6Qg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=P6/hikXKvo4mcfdQEqZRVQBdK1oNgy/Aj5mQHFeN+F8tOQIz97HWOl8yPQFxo36Jhdbeuq9zLvgWH8gOhkYIrN+fx5kyW8I7pn7hSNncyyhYRucpNRwSGGERvUNj9cff3mO5MkSsqKQ6zpNBDKJR2YQl6prmBsQirGsAVL9nFko= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=KovaJ9el; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="KovaJ9el" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=uqPIF+VvQBHBH50rSFfIRgR8/dA90Fp8sC9NPmrqGVY=; b=KovaJ9elkLijktp/Abx8dRypXr DRIMF2gDcno3sOYSd7AGjBhj7LPwqbloLT/XVcG9RyJjaqI0uBUiA0CHb97ewv5bYFhn04oQmPtqJ l9yUhtcJXseNVC7p6LDh8imdU0/Htn7uzKBiNWzZfcKCSixHqG1iZ3KmQJOEdWnvNH6jIf2Q5sPaV xv9i4Tb8GOLEZEeiRfuTie8IDbiq+Ce++8Kzl6AYKumyoPHbtCeAirWEII6l6/u5NLm/2BmmwCx2A sB8eSjBO5fTJeTXRUXGEzwTD1o1+uGCxPwQy+zwwJC1zFciCUVgZqlyDFUmwxhnQLEzs7RNExo3Cw kHeAkTkQ==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tlbf7-000000073DF-40fU; Fri, 21 Feb 2025 22:38:25 +0000 From: Luis Chamberlain To: brauner@kernel.org, akpm@linux-foundation.org, hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v3 7/8] block/bdev: lift block size restrictions to 64k Date: Fri, 21 Feb 2025 14:38:22 -0800 Message-ID: <20250221223823.1680616-8-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250221223823.1680616-1-mcgrof@kernel.org> References: <20250221223823.1680616-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain We now can support blocksizes larger than PAGE_SIZE, so in theory we should be able to lift the restriction up to the max supported page cache order. However bound ourselves to what we can currently validate and test. Through blktests and fstest we can validate up to 64k today. Reviewed-by: Hannes Reinecke Reviewed-by: Matthew Wilcox (Oracle) Signed-off-by: Luis Chamberlain --- block/bdev.c | 3 +-- include/linux/blkdev.h | 8 +++++++- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/block/bdev.c b/block/bdev.c index 8aadf1f23cb4..22806ce11e1d 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -183,8 +183,7 @@ int sb_set_blocksize(struct super_block *sb, int size) { if (set_blocksize(sb->s_bdev_file, size)) return 0; - /* If we get here, we know size is power of two - * and it's value is between 512 and PAGE_SIZE */ + /* If we get here, we know size is validated */ sb->s_blocksize = size; sb->s_blocksize_bits = blksize_bits(size); return sb->s_blocksize; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 248416ecd01c..a97428e8bbbe 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -267,10 +267,16 @@ static inline dev_t disk_devt(struct gendisk *disk) return MKDEV(disk->major, disk->first_minor); } +/* + * We should strive for 1 << (PAGE_SHIFT + MAX_PAGECACHE_ORDER) + * however we constrain this to what we can validate and test. + */ +#define BLK_MAX_BLOCK_SIZE SZ_64K + /* blk_validate_limits() validates bsize, so drivers don't usually need to */ static inline int blk_validate_block_size(unsigned long bsize) { - if (bsize < 512 || bsize > PAGE_SIZE || !is_power_of_2(bsize)) + if (bsize < 512 || bsize > BLK_MAX_BLOCK_SIZE || !is_power_of_2(bsize)) return -EINVAL; return 0; From patchwork Fri Feb 21 22:38:23 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13986399 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56DDE2054EC; Fri, 21 Feb 2025 22:38:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; cv=none; b=vGBKeR9Aa4HIeu9Zd8eMPNXtXKGlYxESaKhoWbaEpZG1OONCjiSh0Y82m1tUg9qx8hdx9VNbgst/O2jzucHVI96/MBiqpGbovVF8UxtQwjvWr0eIea3urhEh65oRpSb3tycV0Hp4El/LST2Q+J7+qtMamHE9L3VfLkro1xkQ3lo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740177512; c=relaxed/simple; bh=cpFeSl7V1siK2jIrJWc8/1O6vqPGiI+x+UOItHwJfNM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Q7JeKQAjLg3SAFJ9iZUz5PLDT6Bins2GYXh5LyA1EZFmKoT4wGv/cUB34CFb+0lu9JVM7uYykghzBqzydcLBqCV9PPCrdd+m26yUoce3cAHFVyr+UWivd06RT35Yaackk7U4smIwiIeLUS0CnV6SeCaZmoVVnLTR7JkYLYMcURc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=qfI+6CVA; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="qfI+6CVA" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=dPYsaEMRasFiFhMNTtOpkluc57HIzwvU4rUiMYmi9nk=; b=qfI+6CVA3tipgW2HoFB+Sph6kC ojVbYPteH2HN9YNIRQNgabxVOxbyu3vzNyWvcv3QbydCSqjtYEdETy6GhjYsXGmXOCgLRADt5Riul gId1AxDN9dF2oJR5t3/LL0ik7MU+eLrERaAhdwNDmE+OwEgiGDfgIepYOT7kwkzzJ7oOkWQt5vtyx f9kws4SbOJoxp26AbIBPJq6lE0zge4Mc3mCg891tFx45dI0gbb4dw2do/e3cbtflZ1JWK7hmP3Fxk oeIUsxXFbB/ahkHzLQcFsLTsiMgkf978tsMNCly6+41cEHR4AljXIfdUIbUuP0CE6LZ+QsyN1B5cn gBsm3zOQ==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tlbf7-000000073DH-49mH; Fri, 21 Feb 2025 22:38:25 +0000 From: Luis Chamberlain To: brauner@kernel.org, akpm@linux-foundation.org, hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v3 8/8] bdev: use bdev_io_min() for statx block size Date: Fri, 21 Feb 2025 14:38:23 -0800 Message-ID: <20250221223823.1680616-9-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250221223823.1680616-1-mcgrof@kernel.org> References: <20250221223823.1680616-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain You can use lsblk to query for a block device block device block size: lsblk -o MIN-IO /dev/nvme0n1 MIN-IO 4096 The min-io is the minimum IO the block device prefers for optimal performance. In turn we map this to the block device block size. The current block size exposed even for block devices with an LBA format of 16k is 4k. Likewise devices which support 4k LBA format but have a larger Indirection Unit of 16k have an exposed block size of 4k. This incurs read-modify-writes on direct IO against devices with a min-io larger than the page size. To fix this, use the block device min io, which is the minimal optimal IO the device prefers. With this we now get: lsblk -o MIN-IO /dev/nvme0n1 MIN-IO 16384 And so userspace gets the appropriate information it needs for optimal performance. This is verified with blkalgn against mkfs against a device with LBA format of 4k but an NPWG of 16k (min io size) mkfs.xfs -f -b size=16k /dev/nvme3n1 blkalgn -d nvme3n1 --ops Write Block size : count distribution 0 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 0 | | 128 -> 255 : 0 | | 256 -> 511 : 0 | | 512 -> 1023 : 0 | | 1024 -> 2047 : 0 | | 2048 -> 4095 : 0 | | 4096 -> 8191 : 0 | | 8192 -> 16383 : 0 | | 16384 -> 32767 : 66 |****************************************| 32768 -> 65535 : 0 | | 65536 -> 131071 : 0 | | 131072 -> 262143 : 2 |* | Block size: 14 - 66 Block size: 17 - 2 Algn size : count distribution 0 -> 1 : 0 | | 2 -> 3 : 0 | | 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 0 | | 128 -> 255 : 0 | | 256 -> 511 : 0 | | 512 -> 1023 : 0 | | 1024 -> 2047 : 0 | | 2048 -> 4095 : 0 | | 4096 -> 8191 : 0 | | 8192 -> 16383 : 0 | | 16384 -> 32767 : 66 |****************************************| 32768 -> 65535 : 0 | | 65536 -> 131071 : 0 | | 131072 -> 262143 : 2 |* | Algn size: 14 - 66 Algn size: 17 - 2 Reviewed-by: Hannes Reinecke Signed-off-by: Luis Chamberlain --- block/bdev.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/bdev.c b/block/bdev.c index 22806ce11e1d..3bd948e6438d 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -1276,9 +1276,6 @@ void bdev_statx(struct path *path, struct kstat *stat, struct inode *backing_inode; struct block_device *bdev; - if (!(request_mask & (STATX_DIOALIGN | STATX_WRITE_ATOMIC))) - return; - backing_inode = d_backing_inode(path->dentry); /* @@ -1305,6 +1302,8 @@ void bdev_statx(struct path *path, struct kstat *stat, queue_atomic_write_unit_max_bytes(bd_queue)); } + stat->blksize = bdev_io_min(bdev); + blkdev_put_no_open(bdev); }