From patchwork Thu Aug 24 06:33:36 2023
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13363607
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 1/3] btrfs: warn on tree blocks which are not nodesize aligned
Date: Thu, 24 Aug 2023 14:33:36 +0800
Message-ID: <09481f8720302e0c4aaee7e460c142f632c72fe8.1692858397.git.wqu@suse.com>
X-Mailer: git-send-email 2.41.0
List-ID: <linux-btrfs.vger.kernel.org>

A long time ago, we had metadata chunks which started at a sector
boundary but were not aligned to the nodesize boundary. As a result,
some older filesystems can have tree blocks that are aligned only to
sectorsize, but not to nodesize.

Later btrfs check gained the ability to detect and warn about such tree
blocks, and the kernel fixed its chunk allocation behavior, so nowadays
those tree blocks should be pretty rare.

But in the future, if we want to migrate metadata to folios, we can not
have such tree blocks, as filemap_add_folio() requires the page index to
be aligned to the number of pages in the folio. (In other words, such
unaligned tree blocks can lead to a VM_BUG_ON().)

So this patch adds an extra warning for those unaligned tree blocks, as
a preparation for the future folio migration.
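The condition the new warning fires on can be sketched outside the kernel as follows. This is a minimal userspace sketch: `IS_ALIGNED` is re-implemented here (in the kernel it lives in `include/linux/align.h`), and `eb_start_is_unaligned()` is a hypothetical helper used only for illustration, not something the patch adds:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace re-implementation of the kernel's IS_ALIGNED() macro
 * (the alignment must be a power of two). */
#define IS_ALIGNED(x, a) ((((uint64_t)(x)) & ((uint64_t)(a) - 1)) == 0)

/*
 * Hypothetical helper (not part of the patch) capturing the legacy case
 * the new btrfs_warn() reports: a tree block start that is aligned to
 * sectorsize but not to nodesize.
 */
int eb_start_is_unaligned(uint64_t start, uint32_t sectorsize, uint32_t nodesize)
{
	return IS_ALIGNED(start, sectorsize) && !IS_ALIGNED(start, nodesize);
}
```

With the default 4K sectorsize and 16K nodesize, a tree block starting at 36864 (9 * 4096) is exactly the case the warning reports, while 32768 (2 * 16384) passes silently.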
Signed-off-by: Qu Wenruo
Reviewed-by: Anand Jain
---
 fs/btrfs/extent_io.c | 8 ++++++++
 fs/btrfs/fs.h        | 7 +++++++
 2 files changed, 15 insertions(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ac3fca5a5e41..f13211975e0b 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3462,6 +3462,14 @@ static int check_eb_alignment(struct btrfs_fs_info *fs_info, u64 start)
 			  start, fs_info->nodesize);
 		return -EINVAL;
 	}
+	if (!IS_ALIGNED(start, fs_info->nodesize) &&
+	    !test_and_set_bit(BTRFS_FS_UNALIGNED_TREE_BLOCK,
+			      &fs_info->flags)) {
+		btrfs_warn(fs_info,
+	"tree block not nodesize aligned, start %llu nodesize %u",
+			   start, fs_info->nodesize);
+		btrfs_warn(fs_info, "this can be solved by a full metadata balance");
+	}
 	return 0;
 }

diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index a523d64d5491..4dc16d74437c 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -139,6 +139,13 @@ enum {
 	 */
 	BTRFS_FS_FEATURE_CHANGED,

+	/*
+	 * Indicate if we have tree block which is only aligned to sectorsize,
+	 * but not to nodesize.
+	 * This should be rare nowadays.
+	 */
+	BTRFS_FS_UNALIGNED_TREE_BLOCK,
+
 #if BITS_PER_LONG == 32
 	/* Indicate if we have error/warn message printed on 32bit systems */
 	BTRFS_FS_32BIT_ERROR,

From patchwork Thu Aug 24 06:33:37 2023
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13363608
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 2/3] btrfs: map uncontinuous extent buffer pages into virtual address space
Date: Thu, 24 Aug 2023 14:33:37 +0800
X-Mailer: git-send-email 2.41.0
List-ID: <linux-btrfs.vger.kernel.org>

Currently btrfs implements its extent buffer read/write using various
helpers that do cross-page handling for the pages array.

However, other filesystems like XFS map the pages into the kernel
virtual address space, which greatly simplifies the access.

This patch learns from XFS and maps the pages into the virtual address
space, if and only if the pages are not physically contiguous.
(Note that a single page counts as physically contiguous.)

For now we only do the mapping, but do not yet utilize the mapped
address.
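The `p != pages[i - 1] + 1` test this patch repeats in the three allocation paths relies on physically consecutive pages having adjacent `struct page` objects in the kernel's memory map. A minimal userspace sketch of that check; `pages_are_contig()` and `pages_contig_demo()` are illustrative names, not kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; only pointer adjacency matters here. */
struct page { char pad[64]; };

/*
 * Sketch of the contiguity test performed while attaching pages: physically
 * consecutive pages have adjacent struct page objects, so comparing
 * pages[i] against pages[i - 1] + 1 detects physical contiguity.
 * A single page trivially counts as contiguous.
 */
int pages_are_contig(struct page **pages, int num_pages)
{
	for (int i = 1; i < num_pages; i++) {
		if (pages[i] != pages[i - 1] + 1)
			return 0;
	}
	return 1;
}

/* Self-check: one contiguous and one scrambled page array. */
int pages_contig_demo(void)
{
	struct page arr[3];
	struct page *contig[3] = { &arr[0], &arr[1], &arr[2] };
	struct page *sparse[3] = { &arr[0], &arr[2], &arr[1] };

	return pages_are_contig(contig, 3) && !pages_are_contig(sparse, 3);
}
```

Only when this test fails does the patch fall back to `vm_map_ram()`, retrying once after `vm_unmap_aliases()` under `memalloc_nofs_save()` protection.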
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/extent_io.h |  7 +++++
 2 files changed, 77 insertions(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f13211975e0b..9f9a3ab82f04 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include "misc.h"
 #include "extent_io.h"
 #include "extent-io-tree.h"
@@ -3153,6 +3154,8 @@ static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
 	ASSERT(!extent_buffer_under_io(eb));

 	num_pages = num_extent_pages(eb);
+	if (eb->vaddr)
+		vm_unmap_ram(eb->vaddr, num_pages);
 	for (i = 0; i < num_pages; i++) {
 		struct page *page = eb->pages[i];

@@ -3202,6 +3205,7 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
 {
 	int i;
 	struct extent_buffer *new;
+	bool pages_contig = true;
 	int num_pages = num_extent_pages(src);
 	int ret;

@@ -3226,6 +3230,9 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
 		int ret;
 		struct page *p = new->pages[i];

+		if (i && p != new->pages[i - 1] + 1)
+			pages_contig = false;
+
 		ret = attach_extent_buffer_page(new, p, NULL);
 		if (ret < 0) {
 			btrfs_release_extent_buffer(new);
@@ -3233,6 +3240,23 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
 		}
 		WARN_ON(PageDirty(p));
 	}
+	if (!pages_contig) {
+		unsigned int nofs_flag;
+		int retried = 0;
+
+		nofs_flag = memalloc_nofs_save();
+		do {
+			new->vaddr = vm_map_ram(new->pages, num_pages, -1);
+			if (new->vaddr)
+				break;
+			vm_unmap_aliases();
+		} while ((retried++) <= 1);
+		memalloc_nofs_restore(nofs_flag);
+		if (!new->vaddr) {
+			btrfs_release_extent_buffer(new);
+			return NULL;
+		}
+	}
 	copy_extent_buffer_full(new, src);
 	set_extent_buffer_uptodate(new);

@@ -3243,6 +3267,7 @@ struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
 						  u64 start, unsigned long len)
 {
 	struct extent_buffer *eb;
+	bool pages_contig = true;
 	int num_pages;
 	int i;
 	int ret;

@@ -3259,11 +3284,29 @@ struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
 	for (i = 0; i < num_pages; i++) {
 		struct page *p = eb->pages[i];

+		if (i && p != eb->pages[i - 1] + 1)
+			pages_contig = false;
+
 		ret = attach_extent_buffer_page(eb, p, NULL);
 		if (ret < 0)
 			goto err;
 	}
+	if (!pages_contig) {
+		unsigned int nofs_flag;
+		int retried = 0;
+
+		nofs_flag = memalloc_nofs_save();
+		do {
+			eb->vaddr = vm_map_ram(eb->pages, num_pages, -1);
+			if (eb->vaddr)
+				break;
+			vm_unmap_aliases();
+		} while ((retried++) <= 1);
+		memalloc_nofs_restore(nofs_flag);
+		if (!eb->vaddr)
+			goto err;
+	}
 	set_extent_buffer_uptodate(eb);
 	btrfs_set_header_nritems(eb, 0);
 	set_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
@@ -3486,6 +3529,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 	struct address_space *mapping = fs_info->btree_inode->i_mapping;
 	struct btrfs_subpage *prealloc = NULL;
 	u64 lockdep_owner = owner_root;
+	bool pages_contig = true;
 	int uptodate = 1;
 	int ret;

@@ -3558,6 +3602,10 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 		/* Should not fail, as we have preallocated the memory */
 		ret = attach_extent_buffer_page(eb, p, prealloc);
 		ASSERT(!ret);
+
+		if (i && p != eb->pages[i - 1] + 1)
+			pages_contig = false;
+
 		/*
 		 * To inform we have extra eb under allocation, so that
 		 * detach_extent_buffer_page() won't release the page private
@@ -3583,6 +3631,28 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 		 * we could crash.
 		 */
 	}
+
+	/*
+	 * If pages are not continuous, here we map it into a continuous virtual
+	 * range to make later access easier.
+	 */
+	if (!pages_contig) {
+		unsigned int nofs_flag;
+		int retried = 0;
+
+		nofs_flag = memalloc_nofs_save();
+		do {
+			eb->vaddr = vm_map_ram(eb->pages, num_pages, -1);
+			if (eb->vaddr)
+				break;
+			vm_unmap_aliases();
+		} while ((retried++) <= 1);
+		memalloc_nofs_restore(nofs_flag);
+		if (!eb->vaddr) {
+			exists = ERR_PTR(-ENOMEM);
+			goto free_eb;
+		}
+	}
 	if (uptodate)
 		set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
again:
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 68368ba99321..930a2dc38157 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -87,6 +87,13 @@ struct extent_buffer {

 	struct rw_semaphore lock;

+	/*
+	 * For virtually mapped address.
+	 *
+	 * NULL if the pages are physically continuous.
+	 */
+	void *vaddr;
+
 	struct page *pages[INLINE_EXTENT_BUFFER_PAGES];
 #ifdef CONFIG_BTRFS_DEBUG
 	struct list_head leak_list;

From patchwork Thu Aug 24 06:33:38 2023
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13363609
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 3/3] btrfs: utilize the physically/virtually continuous extent buffer memory
Date: Thu, 24 Aug 2023 14:33:38 +0800
Message-ID: <8bc15bfdaa2805d1d1b660b8b2e07a55aa02027d.1692858397.git.wqu@suse.com>
X-Mailer: git-send-email 2.41.0
List-ID: <linux-btrfs.vger.kernel.org>

Since the extent buffer pages are now either physically or virtually
contiguous, let's benefit from the new property. This involves the
following changes:

- Extent buffer accessors
  Now the read/write/memcpy/memmove_extent_buffer() functions are just
  wrappers around memcpy()/memmove(). The cross-page handling is done
  by the hardware MMU.

- Extent buffer bitmap accessors

- csum_tree_block()
  We can go directly to crypto_shash_digest(), as we no longer need to
  handle page boundaries.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/disk-io.c   |  18 +--
 fs/btrfs/extent_io.c | 282 +++++-------------------------------------
 fs/btrfs/extent_io.h |  10 ++
 3 files changed, 47 insertions(+), 263 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 0a96ea8c1d3a..03a423f687b8 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -75,24 +75,14 @@ static void btrfs_free_csum_hash(struct btrfs_fs_info *fs_info)
 static void csum_tree_block(struct extent_buffer *buf, u8 *result)
 {
 	struct btrfs_fs_info *fs_info = buf->fs_info;
-	const int num_pages = num_extent_pages(buf);
-	const int first_page_part = min_t(u32, PAGE_SIZE, fs_info->nodesize);
 	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
-	char *kaddr;
-	int i;
+	void *eb_addr = btrfs_get_eb_addr(buf);

+	memset(result, 0, BTRFS_CSUM_SIZE);
 	shash->tfm = fs_info->csum_shash;
 	crypto_shash_init(shash);
-	kaddr = page_address(buf->pages[0]) + offset_in_page(buf->start);
-	crypto_shash_update(shash, kaddr + BTRFS_CSUM_SIZE,
-			    first_page_part - BTRFS_CSUM_SIZE);
-
-	for (i = 1; i < num_pages && INLINE_EXTENT_BUFFER_PAGES > 1; i++) {
-		kaddr = page_address(buf->pages[i]);
-		crypto_shash_update(shash, kaddr, PAGE_SIZE);
-	}
-	memset(result, 0, BTRFS_CSUM_SIZE);
-	crypto_shash_final(shash, result);
+	crypto_shash_digest(shash, eb_addr + BTRFS_CSUM_SIZE,
+			    buf->len - BTRFS_CSUM_SIZE, result);
 }

 /*
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 9f9a3ab82f04..70e22b9ccd28 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4073,100 +4073,39 @@ static inline int check_eb_range(const struct extent_buffer *eb,
 void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
 			unsigned long start, unsigned long len)
 {
-	size_t cur;
-	size_t offset;
-	struct page *page;
-	char *kaddr;
-	char *dst = (char *)dstv;
-	unsigned long i = get_eb_page_index(start);
+	void *eb_addr = btrfs_get_eb_addr(eb);

 	if (check_eb_range(eb, start, len))
 		return;

-	offset = get_eb_offset_in_page(eb, start);
-
-	while (len > 0) {
-		page = eb->pages[i];
-
-		cur = min(len, (PAGE_SIZE - offset));
-		kaddr = page_address(page);
-		memcpy(dst, kaddr + offset, cur);
-
-		dst += cur;
-		len -= cur;
-		offset = 0;
-		i++;
-	}
+	memcpy(dstv, eb_addr + start, len);
 }

 int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
 				       void __user *dstv,
 				       unsigned long start, unsigned long len)
 {
-	size_t cur;
-	size_t offset;
-	struct page *page;
-	char *kaddr;
-	char __user *dst = (char __user *)dstv;
-	unsigned long i = get_eb_page_index(start);
-	int ret = 0;
+	void *eb_addr = btrfs_get_eb_addr(eb);
+	int ret;

 	WARN_ON(start > eb->len);
 	WARN_ON(start + len > eb->start + eb->len);

-	offset = get_eb_offset_in_page(eb, start);
-
-	while (len > 0) {
-		page = eb->pages[i];
-
-		cur = min(len, (PAGE_SIZE - offset));
-		kaddr = page_address(page);
-		if (copy_to_user_nofault(dst, kaddr + offset, cur)) {
-			ret = -EFAULT;
-			break;
-		}
-
-		dst += cur;
-		len -= cur;
-		offset = 0;
-		i++;
-	}
-
-	return ret;
+	ret = copy_to_user_nofault(dstv, eb_addr + start, len);
+	if (ret)
+		return -EFAULT;
+	return 0;
 }

 int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
 			 unsigned long start, unsigned long len)
 {
-	size_t cur;
-	size_t offset;
-	struct page *page;
-	char *kaddr;
-	char *ptr = (char *)ptrv;
-	unsigned long i = get_eb_page_index(start);
-	int ret = 0;
+	void *eb_addr = btrfs_get_eb_addr(eb);

 	if (check_eb_range(eb, start, len))
 		return -EINVAL;

-	offset = get_eb_offset_in_page(eb, start);
-
-	while (len > 0) {
-		page = eb->pages[i];
-
-		cur = min(len, (PAGE_SIZE - offset));
-
-		kaddr = page_address(page);
-		ret = memcmp(ptr, kaddr + offset, cur);
-		if (ret)
-			break;
-
-		ptr += cur;
-		len -= cur;
-		offset = 0;
-		i++;
-	}
-	return ret;
+	return memcmp(ptrv, eb_addr + start, len);
 }

 /*
@@ -4200,67 +4139,20 @@ static void assert_eb_page_uptodate(const struct extent_buffer *eb,
 	}
 }

-static void __write_extent_buffer(const struct extent_buffer *eb,
-				  const void *srcv, unsigned long start,
-				  unsigned long len, bool use_memmove)
-{
-	size_t cur;
-	size_t offset;
-	struct page *page;
-	char *kaddr;
-	char *src = (char *)srcv;
-	unsigned long i = get_eb_page_index(start);
-	/* For unmapped (dummy) ebs, no need to check their uptodate status. */
-	const bool check_uptodate = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
-
-	WARN_ON(test_bit(EXTENT_BUFFER_NO_CHECK, &eb->bflags));
-
-	if (check_eb_range(eb, start, len))
-		return;
-
-	offset = get_eb_offset_in_page(eb, start);
-
-	while (len > 0) {
-		page = eb->pages[i];
-		if (check_uptodate)
-			assert_eb_page_uptodate(eb, page);
-
-		cur = min(len, PAGE_SIZE - offset);
-		kaddr = page_address(page);
-		if (use_memmove)
-			memmove(kaddr + offset, src, cur);
-		else
-			memcpy(kaddr + offset, src, cur);
-
-		src += cur;
-		len -= cur;
-		offset = 0;
-		i++;
-	}
-}
-
 void write_extent_buffer(const struct extent_buffer *eb, const void *srcv,
 			 unsigned long start, unsigned long len)
 {
-	return __write_extent_buffer(eb, srcv, start, len, false);
+	void *eb_addr = btrfs_get_eb_addr(eb);
+
+	memcpy(eb_addr + start, srcv, len);
 }

 static void memset_extent_buffer(const struct extent_buffer *eb, int c,
 				 unsigned long start, unsigned long len)
 {
-	unsigned long cur = start;
+	void *eb_addr = btrfs_get_eb_addr(eb);

-	while (cur < start + len) {
-		unsigned long index = get_eb_page_index(cur);
-		unsigned int offset = get_eb_offset_in_page(eb, cur);
-		unsigned int cur_len = min(start + len - cur, PAGE_SIZE - offset);
-		struct page *page = eb->pages[index];
-
-		assert_eb_page_uptodate(eb, page);
-		memset(page_address(page) + offset, c, cur_len);
-
-		cur += cur_len;
-	}
+	memset(eb_addr + start, c, len);
 }

 void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start,
@@ -4274,20 +4166,12 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start,
 void copy_extent_buffer_full(const struct extent_buffer *dst,
 			     const struct extent_buffer *src)
 {
-	unsigned long cur = 0;
+	void *dst_addr = btrfs_get_eb_addr(dst);
+	void *src_addr = btrfs_get_eb_addr(src);

 	ASSERT(dst->len == src->len);

-	while (cur < src->len) {
-		unsigned long index = get_eb_page_index(cur);
-		unsigned long offset = get_eb_offset_in_page(src, cur);
-		unsigned long cur_len = min(src->len, PAGE_SIZE - offset);
-		void *addr = page_address(src->pages[index]) + offset;
-
-		write_extent_buffer(dst, addr, cur, cur_len);
-
-		cur += cur_len;
-	}
+	memcpy(dst_addr, src_addr, dst->len);
 }

 void copy_extent_buffer(const struct extent_buffer *dst,
@@ -4296,11 +4180,8 @@ void copy_extent_buffer(const struct extent_buffer *dst,
 			unsigned long len)
 {
 	u64 dst_len = dst->len;
-	size_t cur;
-	size_t offset;
-	struct page *page;
-	char *kaddr;
-	unsigned long i = get_eb_page_index(dst_offset);
+	void *dst_addr = btrfs_get_eb_addr(dst);
+	void *src_addr = btrfs_get_eb_addr(src);

 	if (check_eb_range(dst, dst_offset, len) ||
 	    check_eb_range(src, src_offset, len))
@@ -4308,54 +4189,7 @@ void copy_extent_buffer(const struct extent_buffer *dst,

 	WARN_ON(src->len != dst_len);

-	offset = get_eb_offset_in_page(dst, dst_offset);
-
-	while (len > 0) {
-		page = dst->pages[i];
-		assert_eb_page_uptodate(dst, page);
-
-		cur = min(len, (unsigned long)(PAGE_SIZE - offset));
-
-		kaddr = page_address(page);
-		read_extent_buffer(src, kaddr + offset, src_offset, cur);
-
-		src_offset += cur;
-		len -= cur;
-		offset = 0;
-		i++;
-	}
-}
-
-/*
- * eb_bitmap_offset() - calculate the page and offset of the byte containing the
- * given bit number
- * @eb: the extent buffer
- * @start: offset of the bitmap item in the extent buffer
- * @nr: bit number
- * @page_index: return index of the page in the extent buffer that contains the
- * given bit number
- * @page_offset: return offset into the page given by page_index
- *
- * This helper hides the ugliness of finding the byte in an extent buffer which
- * contains a given bit.
- */
-static inline void eb_bitmap_offset(const struct extent_buffer *eb,
-				    unsigned long start, unsigned long nr,
-				    unsigned long *page_index,
-				    size_t *page_offset)
-{
-	size_t byte_offset = BIT_BYTE(nr);
-	size_t offset;
-
-	/*
-	 * The byte we want is the offset of the extent buffer + the offset of
-	 * the bitmap item in the extent buffer + the offset of the byte in the
-	 * bitmap item.
-	 */
-	offset = start + offset_in_page(eb->start) + byte_offset;
-
-	*page_index = offset >> PAGE_SHIFT;
-	*page_offset = offset_in_page(offset);
+	memcpy(dst_addr + dst_offset, src_addr + src_offset, len);
 }

 /*
@@ -4368,25 +4202,18 @@ static inline void eb_bitmap_offset(const struct extent_buffer *eb,
 int extent_buffer_test_bit(const struct extent_buffer *eb, unsigned long start,
 			   unsigned long nr)
 {
-	u8 *kaddr;
-	struct page *page;
-	unsigned long i;
-	size_t offset;
+	const u8 *kaddr = btrfs_get_eb_addr(eb);
+	const unsigned long first_byte = start + BIT_BYTE(nr);

-	eb_bitmap_offset(eb, start, nr, &i, &offset);
-	page = eb->pages[i];
-	assert_eb_page_uptodate(eb, page);
-	kaddr = page_address(page);
-	return 1U & (kaddr[offset] >> (nr & (BITS_PER_BYTE - 1)));
+	assert_eb_page_uptodate(eb, eb->pages[first_byte >> PAGE_SHIFT]);
+	return 1U & (kaddr[first_byte] >> (nr & (BITS_PER_BYTE - 1)));
 }

 static u8 *extent_buffer_get_byte(const struct extent_buffer *eb, unsigned long bytenr)
 {
-	unsigned long index = get_eb_page_index(bytenr);
-
 	if (check_eb_range(eb, bytenr, 1))
 		return NULL;
-	return page_address(eb->pages[index]) + get_eb_offset_in_page(eb, bytenr);
+	return btrfs_get_eb_addr(eb) + bytenr;
 }

 /*
@@ -4471,72 +4298,29 @@ void memcpy_extent_buffer(const struct extent_buffer *dst,
 			  unsigned long dst_offset, unsigned long src_offset,
 			  unsigned long len)
 {
-	unsigned long cur_off = 0;
+	void *eb_addr = btrfs_get_eb_addr(dst);

 	if (check_eb_range(dst, dst_offset, len) ||
 	    check_eb_range(dst, src_offset, len))
 		return;

-	while (cur_off < len) {
-		unsigned long cur_src = cur_off + src_offset;
-		unsigned long pg_index = get_eb_page_index(cur_src);
-		unsigned long pg_off = get_eb_offset_in_page(dst, cur_src);
-		unsigned long cur_len = min(src_offset + len - cur_src,
-					    PAGE_SIZE - pg_off);
-		void *src_addr = page_address(dst->pages[pg_index]) + pg_off;
-		const bool use_memmove = areas_overlap(src_offset + cur_off,
-						       dst_offset + cur_off, cur_len);
-
-		__write_extent_buffer(dst, src_addr, dst_offset + cur_off, cur_len,
-				      use_memmove);
-		cur_off += cur_len;
-	}
+	if (areas_overlap(dst_offset, src_offset, len))
+		memmove(eb_addr + dst_offset, eb_addr + src_offset, len);
+	else
+		memcpy(eb_addr + dst_offset, eb_addr + src_offset, len);
 }

 void memmove_extent_buffer(const struct extent_buffer *dst,
 			   unsigned long dst_offset, unsigned long src_offset,
 			   unsigned long len)
 {
-	unsigned long dst_end = dst_offset + len - 1;
-	unsigned long src_end = src_offset + len - 1;
+	void *eb_addr = btrfs_get_eb_addr(dst);

 	if (check_eb_range(dst, dst_offset, len) ||
 	    check_eb_range(dst, src_offset, len))
 		return;

-	if (dst_offset < src_offset) {
-		memcpy_extent_buffer(dst, dst_offset, src_offset, len);
-		return;
-	}
-
-	while (len > 0) {
-		unsigned long src_i;
-		size_t cur;
-		size_t dst_off_in_page;
-		size_t src_off_in_page;
-		void *src_addr;
-		bool use_memmove;
-
-		src_i = get_eb_page_index(src_end);
-
-		dst_off_in_page = get_eb_offset_in_page(dst, dst_end);
-		src_off_in_page = get_eb_offset_in_page(dst, src_end);
-
-		cur = min_t(unsigned long, len, src_off_in_page + 1);
-		cur = min(cur, dst_off_in_page + 1);
-
-		src_addr = page_address(dst->pages[src_i]) + src_off_in_page -
-			   cur + 1;
-		use_memmove = areas_overlap(src_end - cur + 1, dst_end - cur + 1,
-					    cur);
-
-		__write_extent_buffer(dst, src_addr, dst_end - cur + 1, cur,
-				      use_memmove);
-
-		dst_end -= cur;
-		src_end -= cur;
-		len -= cur;
-	}
+	memmove(eb_addr + dst_offset, eb_addr + src_offset, len);
 }

 #define GANG_LOOKUP_SIZE 16
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 930a2dc38157..bfa14457f461 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -140,6 +140,16 @@ static inline unsigned long get_eb_page_index(unsigned long offset)
 	return offset >> PAGE_SHIFT;
 }

+static inline void *btrfs_get_eb_addr(const struct extent_buffer *eb)
+{
+	/* For fallback vmapped extent buffer. */
+	if (eb->vaddr)
+		return eb->vaddr;
+
+	/* For physically continuous pages and subpage cases. */
+	return page_address(eb->pages[0]) + offset_in_page(eb->start);
+}
+
 /*
  * Structure to record how many bytes and which ranges are set/cleared
  */
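The simplification the series builds toward can be seen from btrfs_get_eb_addr() above: pick one linear base address and index into it, instead of looping over pages. A userspace sketch of that pattern, with hypothetical names (`struct eb_sketch`, `eb_addr()`, `read_eb()`, `eb_addr_demo()`) standing in for the kernel structures:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of the access pattern btrfs_get_eb_addr() enables: once the extent
 * buffer has a single linear base address (page_address() of the first page
 * when the pages are physically contiguous, or eb->vaddr for the
 * vm_map_ram() fallback), every accessor collapses to a plain
 * memcpy()/memcmp() at base + offset, with page crossings left to the MMU.
 */
struct eb_sketch {
	uint8_t *vaddr;	/* non-NULL only for the virtually mapped fallback */
	uint8_t *page0;	/* stand-in for page_address(pages[0]) + offset_in_page(start) */
};

uint8_t *eb_addr(const struct eb_sketch *eb)
{
	return eb->vaddr ? eb->vaddr : eb->page0;
}

void read_eb(const struct eb_sketch *eb, void *dst, size_t start, size_t len)
{
	memcpy(dst, eb_addr(eb) + start, len);	/* no per-page loop needed */
}

/* Self-check: reading through either base yields the backing bytes. */
int eb_addr_demo(void)
{
	uint8_t backing[16] = "0123456789abcdef";
	struct eb_sketch direct = { .vaddr = NULL, .page0 = backing };
	struct eb_sketch mapped = { .vaddr = backing, .page0 = NULL };
	uint8_t out[4];

	read_eb(&direct, out, 10, 4);
	if (memcmp(out, "abcd", 4) != 0)
		return 0;
	read_eb(&mapped, out, 0, 4);
	return memcmp(out, "0123", 4) == 0;
}
```

This is why patch 3 can delete the cross-page loops wholesale: both the page_address() case and the vm_map_ram() case hand back an address that is linear over the whole tree block.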