From patchwork Tue Nov 26 06:29:23 2024
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13885499
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Cc: stable@vger.kernel.org
Subject: [PATCH 1/2] btrfs: handle btrfs_run_delalloc_range() errors correctly
Date: Tue, 26 Nov 2024 16:59:23 +1030
Message-ID: <3d7d7a1151b3f3cc64dbf3a06d46dd08c5c86a8f.1732596971.git.wqu@suse.com>
X-Mailer: git-send-email 2.47.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

[BUG]
There are several crashes and hangs during fstests runs with a
sector size < page size setup.

It turns out that most of those hangs happen after a
btrfs_run_delalloc_range() failure (caused by -ENOSPC).
The most common one is generic/750.

The symptoms are all related to ordered extent finishing, where we
double account the target ordered extent.

[CAUSE]
Inside writepage_delalloc(), if we hit an error from
btrfs_run_delalloc_range(), we still need to unlock all the locked
ranges, but that's the only error handling.

If we have the following page layout with a 64K page size and 4K sector
size:

   0    4K         32K  40K          60K   64K
   |////|          |////|            |/////|

Where |//| is the dirtied blocks inside the folio.

Then we hit the following sequence:

- Enter writepage_delalloc() for folio 0

- btrfs_run_delalloc_range() returned 0 for [0, 4K)
  And created a regular COW ordered extent for range [0, 4K)

- btrfs_run_delalloc_range() returned 0 for [32K, 40K)
  And created an async extent for range [32K, 40K).
  This means the error handling will be done in another thread, and we
  should not touch the range anymore.

- btrfs_run_delalloc_range() failed with -ENOSPC for range [60K, 64K)
  In theory we should not fail since we should have reserved enough
  space at buffered write time, but let's ignore that rabbit hole and
  focus on the error handling.

- Error handling in extent_writepage()
  Now we go to the done: tag, calling btrfs_mark_ordered_io_finished()
  for the whole folio range.

  This will find ranges [0, 4K) and [32K, 40K) to clean up.
  Range [0, 4K) should be cleaned up, but range [32K, 40K) is handled
  asynchronously, and the OE may have already been submitted.

  This will lead to double accounting for range [32K, 40K) and crash
  the kernel.

Unfortunately this bad error handling has been there since the very
beginning of sector size < page size support.

[FIX]
Instead of relying on the btrfs_mark_ordered_io_finished() call to
clean up the whole folio range, record the end of the last successfully
run delalloc range, and combine it with bio_ctrl->submit_bitmap to
properly clean up any newly created ordered extents.

Since we have cleaned up the ordered extents in the range, we should no
longer rely on the btrfs_mark_ordered_io_finished() call inside
extent_writepage().

This way, we avoid double accounting during error handling.

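
For illustration only, the bitmap arithmetic of the fix can be sketched
in userspace C. This is a minimal sketch and not kernel code: the
function name sectors_to_finish() and the two constants are
hypothetical stand-ins for fs_info->sectorsize_bits and
fs_info->sectors_per_page, and it assumes that the bits of an async
range have already been cleared from the submit bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fs_info fields (4K sectors, 64K page). */
#define SECTORSIZE_BITS   12u
#define SECTORS_PER_PAGE  16u

/*
 * Return the sectors the error path would finish manually: sectors still
 * set in submit_bitmap (assumed to exclude async ranges) that start below
 * last_finished, the exclusive end of the last successfully run range.
 */
static uint32_t sectors_to_finish(uint64_t page_start, uint64_t last_finished,
				  uint32_t submit_bitmap)
{
	uint64_t nr = (last_finished - page_start) >> SECTORSIZE_BITS;
	uint32_t mask;

	/* Never look past the end of the page. */
	if (nr > SECTORS_PER_PAGE)
		nr = SECTORS_PER_PAGE;

	/* nr is at most 16 here, so the shift cannot overflow. */
	mask = (1u << nr) - 1;
	return submit_bitmap & mask;
}
```

With the layout from [CAUSE], the failing range starts at 60K, so
last_finished stays at 60K. Assuming the async range [32K, 40K) is no
longer in the submit bitmap, sectors_to_finish(0, 60 * 1024, 0x8001)
selects only sector 0, that is range [0, 4K).
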
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 45 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 438974d4def4..0132c2b84d99 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1145,11 +1145,13 @@ static bool find_next_delalloc_bitmap(struct folio *folio,
  * helper for extent_writepage(), doing all of the delayed allocation setup.
  *
  * This returns 1 if btrfs_run_delalloc_range function did all the work required
- * to write the page (copy into inline extent).  In this case the IO has
- * been started and the page is already unlocked.
+ * to write the page (copy into inline extent or compression).  In this case
+ * the IO has been started and we should no longer touch the page (may have
+ * already been unlocked).
  *
  * This returns 0 if all went well (page still locked)
- * This returns < 0 if there were errors (page still locked)
+ * This returns < 0 if there were errors (page still locked), in this case any
+ * newly created delalloc range will be marked as error and finished.
  */
 static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 						 struct folio *folio,
@@ -1167,6 +1169,12 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 	 * last delalloc end.
 	 */
 	u64 last_delalloc_end = 0;
+	/*
+	 * Save the last successfully ran delalloc range end (exclusive).
+	 * This is for error handling to avoid ranges with ordered extent created
+	 * but no IO will be submitted due to error.
+	 */
+	u64 last_finished = page_start;
 	u64 delalloc_start = page_start;
 	u64 delalloc_end = page_end;
 	u64 delalloc_to_write = 0;
@@ -1235,11 +1243,19 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 			found_len = last_delalloc_end + 1 - found_start;
 
 		if (ret >= 0) {
+			/*
+			 * Some delalloc range may be created by previous folios.
+			 * Thus we still need to clean those range up during error
+			 * handling.
+			 */
+			last_finished = found_start;
 			/* No errors hit so far, run the current delalloc range. */
 			ret = btrfs_run_delalloc_range(inode, folio,
 						       found_start,
 						       found_start + found_len - 1,
 						       wbc);
+			if (ret >= 0)
+				last_finished = found_start + found_len;
 		} else {
 			/*
 			 * We've hit an error during previous delalloc range,
@@ -1274,8 +1290,21 @@ static noinline_for_stack int writepage_delalloc(struct btrfs_inode *inode,
 
 		delalloc_start = found_start + found_len;
 	}
-	if (ret < 0)
+	/*
+	 * It's possible we have some ordered extents created before we hit
+	 * an error, cleanup non-async successfully created delalloc ranges.
+	 */
+	if (unlikely(ret < 0)) {
+		unsigned int bitmap_size = min(
+			(last_finished - page_start) >> fs_info->sectorsize_bits,
+			fs_info->sectors_per_page);
+
+		for_each_set_bit(bit, &bio_ctrl->submit_bitmap, bitmap_size)
+			btrfs_mark_ordered_io_finished(inode, folio,
+				page_start + (bit << fs_info->sectorsize_bits),
+				fs_info->sectorsize, false);
 		return ret;
+	}
 out:
 	if (last_delalloc_end)
 		delalloc_end = last_delalloc_end;
@@ -1509,13 +1538,13 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
 
 	bio_ctrl->wbc->nr_to_write--;
 
-done:
-	if (ret) {
+	if (ret)
 		btrfs_mark_ordered_io_finished(BTRFS_I(inode), folio,
 					       page_start, PAGE_SIZE, !ret);
-		mapping_set_error(folio->mapping, ret);
-	}
 
+done:
+	if (ret < 0)
+		mapping_set_error(folio->mapping, ret);
 	/*
 	 * Only unlock ranges that are submitted. As there can be some async
 	 * submitted ranges inside the folio.

From patchwork Tue Nov 26 06:29:24 2024
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13885500
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Cc: stable@vger.kernel.org
Subject: [PATCH 2/2] btrfs: handle submit_one_sector() error inside extent_writepage_io()
Date: Tue, 26 Nov 2024 16:59:24 +1030
Message-ID: <465546590ba23fbbe82998dd2b4273d71bebd07c.1732596971.git.wqu@suse.com>
X-Mailer: git-send-email 2.47.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

[BUG]
If submit_one_sector() fails inside extent_writepage_io() for sector
size < page size cases (e.g. 4K sector size and 64K page size), then
we can hit a double ordered extent accounting error.

This should be very rare, as submit_one_sector() only fails when we
fail to grab the extent map, and such an extent map should already
exist in memory and be pinned.

[CAUSE]
For example we have the following folio layout:

    0  4K           32K    48K      60K 64K
    |//|            |//////|        |///|

Where |///| is the dirty range we need to write back. The 3 different
dirty ranges are submitted for regular COW.

Now we hit the following sequence:

- submit_one_sector() returned 0 for [0, 4K)
- submit_one_sector() returned 0 for [32K, 48K)
- submit_one_sector() returned error for [60K, 64K)

- btrfs_mark_ordered_io_finished() called for the whole folio
  This will mark the following ranges as finished:
  * [0, 4K)
  * [32K, 48K)
    Both ranges have their IO already submitted, so this cleanup will
    lead to double accounting.

  * [60K, 64K)
    That's the correct cleanup.

Unfortunately the behavior dates back to the old days when there was no
subpage support.

[FIX]
Instead of calling btrfs_mark_ordered_io_finished() unconditionally in
extent_writepage(), which can touch ranges we should not touch, move
the error handling inside extent_writepage_io(), so that we can clean
up exactly the sectors that ought to be submitted but failed.

This provides a much more accurate cleanup, avoiding the double
accounting.

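
For illustration only, the per-sector error handling described in [FIX]
can be modelled in userspace C. This is a minimal sketch and not kernel
code: struct walk_result and walk_dirty_sectors() are hypothetical
names, and a failing submit_one_sector() call is modelled by a bit set
in the fail mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: 64K page with 4K sectors. */
#define SECTORS_PER_PAGE 16u

/* Outcome of walking the dirty sectors of one folio. */
struct walk_result {
	uint32_t submitted; /* sectors whose IO was queued */
	uint32_t finished;  /* sectors finished manually after a failure */
};

/*
 * Walk every dirty sector. A sector whose submission fails is finished on
 * the spot and the walk continues, so sectors with IO already queued are
 * never touched by the cleanup.
 */
static struct walk_result walk_dirty_sectors(uint32_t dirty, uint32_t fail)
{
	struct walk_result res = { 0, 0 };
	unsigned int bit;

	for (bit = 0; bit < SECTORS_PER_PAGE; bit++) {
		uint32_t m = 1u << bit;

		if (!(dirty & m))
			continue;
		if (fail & m) {
			/* No bio will finish this sector, do it manually. */
			res.finished |= m;
			continue;
		}
		res.submitted |= m;
	}
	return res;
}
```

With the layout from [CAUSE], the dirty mask is 0x8f01 (sector 0,
sectors 8 to 11 and sector 15). If only sector 15 fails, the sketch
finishes exactly [60K, 64K) and leaves the two submitted ranges alone,
which is the behavior the fix aims for.
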
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0132c2b84d99..a3d4f698fd25 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1420,6 +1420,7 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	unsigned long range_bitmap = 0;
 	bool submitted_io = false;
+	bool error = false;
 	const u64 folio_start = folio_pos(folio);
 	u64 cur;
 	int bit;
@@ -1462,11 +1463,21 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
 			break;
 		}
 		ret = submit_one_sector(inode, folio, cur, bio_ctrl, i_size);
-		if (ret < 0)
-			goto out;
+		if (unlikely(ret < 0)) {
+			submit_one_bio(bio_ctrl);
+			/*
+			 * Failed to grab the extent map which should be very rare.
+			 * Since there is no bio submitted to finish the ordered
+			 * extent, we have to manually finish this sector.
+			 */
+			btrfs_mark_ordered_io_finished(inode, folio, cur,
+					fs_info->sectorsize, false);
+			error = true;
+			continue;
+		}
 		submitted_io = true;
 	}
-out:
+
 	/*
 	 * If we didn't submitted any sector (>= i_size), folio dirty get
 	 * cleared but PAGECACHE_TAG_DIRTY is not cleared (only cleared
@@ -1474,8 +1485,11 @@ static noinline_for_stack int extent_writepage_io(struct btrfs_inode *inode,
 	 *
 	 * Here we set writeback and clear for the range. If the full folio
 	 * is no longer dirty then we clear the PAGECACHE_TAG_DIRTY tag.
+	 *
+	 * If we hit any error, the corresponding sector will still be dirty
+	 * thus no need to clear PAGECACHE_TAG_DIRTY.
 	 */
-	if (!submitted_io) {
+	if (!submitted_io && !error) {
 		btrfs_folio_set_writeback(fs_info, folio, start, len);
 		btrfs_folio_clear_writeback(fs_info, folio, start, len);
 	}
@@ -1495,7 +1509,6 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
 {
 	struct inode *inode = folio->mapping->host;
 	struct btrfs_fs_info *fs_info = inode_to_fs_info(inode);
-	const u64 page_start = folio_pos(folio);
 	int ret;
 	size_t pg_offset;
 	loff_t i_size = i_size_read(inode);
@@ -1538,10 +1551,6 @@ static int extent_writepage(struct folio *folio, struct btrfs_bio_ctrl *bio_ctrl
 
 	bio_ctrl->wbc->nr_to_write--;
 
-	if (ret)
-		btrfs_mark_ordered_io_finished(BTRFS_I(inode), folio,
-					       page_start, PAGE_SIZE, !ret);
-
 done:
 	if (ret < 0)
 		mapping_set_error(folio->mapping, ret);
@@ -2322,11 +2331,8 @@ void extent_write_locked_range(struct inode *inode, const struct folio *locked_f
 		if (ret == 1)
 			goto next_page;
 
-		if (ret) {
-			btrfs_mark_ordered_io_finished(BTRFS_I(inode), folio,
-						       cur, cur_len, !ret);
+		if (ret)
 			mapping_set_error(mapping, ret);
-		}
 		btrfs_folio_end_lock(fs_info, folio, cur, cur_len);
 		if (ret < 0)
 			found_error = true;