From patchwork Tue Nov 15 01:30:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 13043084 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 123D8C43217 for ; Tue, 15 Nov 2022 01:30:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 522358E0006; Mon, 14 Nov 2022 20:30:52 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4D1558E0002; Mon, 14 Nov 2022 20:30:52 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2FC4A8E0006; Mon, 14 Nov 2022 20:30:52 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 206A88E0002 for ; Mon, 14 Nov 2022 20:30:52 -0500 (EST) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id AE718C0345 for ; Tue, 15 Nov 2022 01:30:51 +0000 (UTC) X-FDA: 80133947502.20.5A724BA Received: from mail-pg1-f177.google.com (mail-pg1-f177.google.com [209.85.215.177]) by imf25.hostedemail.com (Postfix) with ESMTP id 402C2A000D for ; Tue, 15 Nov 2022 01:30:51 +0000 (UTC) Received: by mail-pg1-f177.google.com with SMTP id 6so11875623pgm.6 for ; Mon, 14 Nov 2022 17:30:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=jgZL4Q6/hiN7SK0U+fxjQpeeWJPO0+rsmSJULzaGQ0k=; b=UUqPu/qLFSsKkqibdIyWpzTl7kJ0392Arsqxv5eqI2K6kcKDlIPK26OhDQtPxdgLk6 NmBy3VVOPc9exNJwbPMYA/O1J5+2B3/QWN36xbCtHZFo1XS7tAziLiwIwM1I4ZSIMSfz ya5v3sLA5e4ngF5cW55rzN6yyHLIdJP385ok87P7w4GraXHJfZkCQd0rd9WUT3aIz21m kZBez5immZ36uVk6aLtli7nW9CxMyY3g627gqnkAHVEJgAzsGAheJkioxNrm6Xx9p+yi 0q0KykfQMpzuWtFoq8of5+MbNVzqwkjdjnt0CdGcSJOfNkZKtS97NnGEg+ruJCBe2siW uJGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jgZL4Q6/hiN7SK0U+fxjQpeeWJPO0+rsmSJULzaGQ0k=; b=CN1qKX6giPplSDMTYXjNYjZHb+g4Onns/alC/xVexc7Ui/VKc+Aha8u6YE+BbR9mqU MVynTQS8nJPOfLI6nX1OHpDXDPrLCy6OpYqFbWbxFit5HpkBwqOk9yeih0iz40jKHODU K4N4S+pw5KpABYU7IMcat2x1ZmNgOf3oWgPDjwelWxVstNk828mQvmwK1eh0wNKyqtmw t7d6tOKLjs8WAskx2k9z41FR9S8Fl0VWkBntw5MDE9hd+Yfyw5N22BYOzZsVEk5IWyRo /6WzZQPfYatGu8sIpCbYHZE5ygu9L81SonPxtS7lQ1VbnmkfNEmp1gDn2Fr6/hk4NijZ Vn2w== X-Gm-Message-State: ANoB5pm4cyhPPdT9qaZ0d8HD3tIrHl662WqtZ0CEW5Jt9VJT7A2ktc70 zFjp6p03RWi1O5arxWd1Oh4ZtQ== X-Google-Smtp-Source: AA0mqf53q+Ed2oyN1jbck8d7rOoRUBqjIi3l7i6bP6FZ3exURtjovulj7jav8V30tqIY2/ycTy12og== X-Received: by 2002:aa7:814e:0:b0:572:6e9b:9fa2 with SMTP id d14-20020aa7814e000000b005726e9b9fa2mr1008065pfn.8.1668475850202; Mon, 14 Nov 2022 17:30:50 -0800 (PST) Received: from dread.disaster.area (pa49-181-106-210.pa.nsw.optusnet.com.au. [49.181.106.210]) by smtp.gmail.com with ESMTPSA id d17-20020a170902ced100b001868ed86a95sm8290720plg.174.2022.11.14.17.30.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 14 Nov 2022 17:30:48 -0800 (PST) Received: from discord.disaster.area ([192.168.253.110]) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1oukmj-00EKGJ-Hn; Tue, 15 Nov 2022 12:30:45 +1100 Received: from dave by discord.disaster.area with local (Exim 4.96) (envelope-from ) id 1oukmj-001Vpr-1d; Tue, 15 Nov 2022 12:30:45 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 9/9] xfs: drop write error injection is unfixable, remove it Date: Tue, 15 Nov 2022 12:30:43 +1100 Message-Id: <20221115013043.360610-10-david@fromorbit.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221115013043.360610-1-david@fromorbit.com> References: <20221115013043.360610-1-david@fromorbit.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1668475851; a=rsa-sha256; cv=none; b=EGMiutRgfKfr3CLjO/fHztUIVXGbhECc7Hof2Qmihim6XhPGVXPXK0opdZ30tUyKvL/M49 21f7pbiLHUVG0PXZyIMOU0xPqKBbMKOz8eoQdI5voOf/J4/0M6ETjSWaCm7fLeSXUmKJ5J CGIPzk2oF9zGzIxd29VIp/pu4BUxVjs= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=fromorbit-com.20210112.gappssmtp.com header.s=20210112 header.b="UUqPu/qL"; spf=none (imf25.hostedemail.com: domain of david@fromorbit.com has no SPF policy when checking 209.85.215.177) smtp.mailfrom=david@fromorbit.com; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1668475851; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=jgZL4Q6/hiN7SK0U+fxjQpeeWJPO0+rsmSJULzaGQ0k=; b=AJC7+Q4IVAt/2yZVZOXo0+jiluDp2i4aoJiONNB7T0SNBFcgaIWa+xgiWf4NBPciY9KFbz 1ZrPQ/vf7FntuGxDvoxS+TeLlMFeHu0lTqK/WzJBptf0rhT996qz/XtGRe5iYS/06vkNhj jjSwqstIXWSZKFHCvD5ZnddMkycGNRk= X-Stat-Signature: 5homj9ux1frcd66oy1nw7yssn5n89nit Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=fromorbit-com.20210112.gappssmtp.com header.s=20210112 header.b="UUqPu/qL"; spf=none (imf25.hostedemail.com: domain of david@fromorbit.com has no SPF policy when checking 209.85.215.177) smtp.mailfrom=david@fromorbit.com; dmarc=none X-Rspamd-Server: rspam10 X-Rspam-User: X-Rspamd-Queue-Id: 402C2A000D X-HE-Tag: 1668475851-732015 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Dave Chinner With the changes to scan the page cache for dirty data to avoid data corruptions from partial write cleanup racing with other page cache operations, the drop writes error injection no longer works the same way it used to and causes xfs/196 to fail. This is because xfs/196 writes to the file and populates the page cache before it turns on the error injection and starts failing -overwrites-. The result is that the original drop-writes code failed writes only -after- overwriting the data in the cache, followed by invalidates the cached data, then punching out the delalloc extent from under that data. On the surface, this looks fine. The problem is that page cache invalidation *doesn't guarantee that it removes anything from the page cache* and it doesn't change the dirty state of the folio. When block size == page size and we do page aligned IO (as xfs/196 does) everything happens to align perfectly and page cache invalidation removes the single page folios that span the written data. Hence the followup delalloc punch pass does not find cached data over that range and it can punch the extent out. IOWs, xfs/196 "works" for block size == page size with the new code. I say "works", because it actually only works for the case where IO is page aligned, and no data was read from disk before writes occur. Because the moment we actually read data first, the readahead code allocates multipage folios and suddenly the invalidate code goes back to zeroing subfolio ranges without changing dirty state. Hence, with multipage folios in play, block size == page size is functionally identical to block size < page size behaviour, and drop-writes is manifestly broken w.r.t to this case. Invalidation of a subfolio range doesn't result in the folio being removed from the cache, just the range gets zeroed. Hence after we've sequentially walked over a folio that we've dirtied (via write data) and then invalidated, we end up with a dirty folio full of zeroed data. And because the new code skips punching ranges that have dirty folios covering them, we end up leaving the delalloc range intact after failing all the writes. Hence failed writes now end up writing zeroes to disk in the cases where invalidation zeroes folios rather than removing them from cache. This is a fundamental change of behaviour that is needed to avoid the data corruption vectors that exist in the old write fail path, and it renders the drop-writes injection non-functional and unworkable as it stands. As it is, I think the error injection is also now unnecessary, as partial writes that need delalloc extent are going to be a lot more common with stale iomap detection in place. Hence this patch removes the drop-writes error injection completely. xfs/196 can remain for testing kernels that don't have this data corruption fix, but those that do will report: xfs/196 3s ... [not run] XFS error injection drop_writes unknown on this kernel. Signed-off-by: Dave Chinner --- fs/xfs/libxfs/xfs_errortag.h | 12 +++++------- fs/xfs/xfs_error.c | 27 ++++++++++++++++++++------- fs/xfs/xfs_iomap.c | 9 --------- 3 files changed, 25 insertions(+), 23 deletions(-) diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h index 5362908164b0..580ccbd5aadc 100644 --- a/fs/xfs/libxfs/xfs_errortag.h +++ b/fs/xfs/libxfs/xfs_errortag.h @@ -40,13 +40,12 @@ #define XFS_ERRTAG_REFCOUNT_FINISH_ONE 25 #define XFS_ERRTAG_BMAP_FINISH_ONE 26 #define XFS_ERRTAG_AG_RESV_CRITICAL 27 + /* - * DEBUG mode instrumentation to test and/or trigger delayed allocation - * block killing in the event of failed writes. When enabled, all - * buffered writes are silenty dropped and handled as if they failed. - * All delalloc blocks in the range of the write (including pre-existing - * delalloc blocks!) are tossed as part of the write failure error - * handling sequence. + * Drop-writes support removed because write error handling cannot trash + * pre-existing delalloc extents in any useful way anymore. We retain the + * definition so that we can reject it as an invalid value in + * xfs_errortag_valid(). */ #define XFS_ERRTAG_DROP_WRITES 28 #define XFS_ERRTAG_LOG_BAD_CRC 29 @@ -95,7 +94,6 @@ #define XFS_RANDOM_REFCOUNT_FINISH_ONE 1 #define XFS_RANDOM_BMAP_FINISH_ONE 1 #define XFS_RANDOM_AG_RESV_CRITICAL 4 -#define XFS_RANDOM_DROP_WRITES 1 #define XFS_RANDOM_LOG_BAD_CRC 1 #define XFS_RANDOM_LOG_ITEM_PIN 1 #define XFS_RANDOM_BUF_LRU_REF 2 diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c index c6b2aabd6f18..dea3c0649d2f 100644 --- a/fs/xfs/xfs_error.c +++ b/fs/xfs/xfs_error.c @@ -46,7 +46,7 @@ static unsigned int xfs_errortag_random_default[] = { XFS_RANDOM_REFCOUNT_FINISH_ONE, XFS_RANDOM_BMAP_FINISH_ONE, XFS_RANDOM_AG_RESV_CRITICAL, - XFS_RANDOM_DROP_WRITES, + 0, /* XFS_RANDOM_DROP_WRITES has been removed */ XFS_RANDOM_LOG_BAD_CRC, XFS_RANDOM_LOG_ITEM_PIN, XFS_RANDOM_BUF_LRU_REF, @@ -162,7 +162,6 @@ XFS_ERRORTAG_ATTR_RW(refcount_continue_update, XFS_ERRTAG_REFCOUNT_CONTINUE_UPDA XFS_ERRORTAG_ATTR_RW(refcount_finish_one, XFS_ERRTAG_REFCOUNT_FINISH_ONE); XFS_ERRORTAG_ATTR_RW(bmap_finish_one, XFS_ERRTAG_BMAP_FINISH_ONE); XFS_ERRORTAG_ATTR_RW(ag_resv_critical, XFS_ERRTAG_AG_RESV_CRITICAL); -XFS_ERRORTAG_ATTR_RW(drop_writes, XFS_ERRTAG_DROP_WRITES); XFS_ERRORTAG_ATTR_RW(log_bad_crc, XFS_ERRTAG_LOG_BAD_CRC); XFS_ERRORTAG_ATTR_RW(log_item_pin, XFS_ERRTAG_LOG_ITEM_PIN); XFS_ERRORTAG_ATTR_RW(buf_lru_ref, XFS_ERRTAG_BUF_LRU_REF); @@ -206,7 +205,6 @@ static struct attribute *xfs_errortag_attrs[] = { XFS_ERRORTAG_ATTR_LIST(refcount_finish_one), XFS_ERRORTAG_ATTR_LIST(bmap_finish_one), XFS_ERRORTAG_ATTR_LIST(ag_resv_critical), - XFS_ERRORTAG_ATTR_LIST(drop_writes), XFS_ERRORTAG_ATTR_LIST(log_bad_crc), XFS_ERRORTAG_ATTR_LIST(log_item_pin), XFS_ERRORTAG_ATTR_LIST(buf_lru_ref), @@ -256,6 +254,19 @@ xfs_errortag_del( kmem_free(mp->m_errortag); } +static bool +xfs_errortag_valid( + unsigned int error_tag) +{ + if (error_tag >= XFS_ERRTAG_MAX) + return false; + + /* Error out removed injection types */ + if (error_tag == XFS_ERRTAG_DROP_WRITES) + return false; + return true; +} + bool xfs_errortag_test( struct xfs_mount *mp, @@ -277,7 +288,9 @@ xfs_errortag_test( if (!mp->m_errortag) return false; - ASSERT(error_tag < XFS_ERRTAG_MAX); + if (!xfs_errortag_valid(error_tag)) + return false; + randfactor = mp->m_errortag[error_tag]; if (!randfactor || prandom_u32_max(randfactor)) return false; @@ -293,7 +306,7 @@ xfs_errortag_get( struct xfs_mount *mp, unsigned int error_tag) { - if (error_tag >= XFS_ERRTAG_MAX) + if (!xfs_errortag_valid(error_tag)) return -EINVAL; return mp->m_errortag[error_tag]; @@ -305,7 +318,7 @@ xfs_errortag_set( unsigned int error_tag, unsigned int tag_value) { - if (error_tag >= XFS_ERRTAG_MAX) + if (!xfs_errortag_valid(error_tag)) return -EINVAL; mp->m_errortag[error_tag] = tag_value; @@ -319,7 +332,7 @@ xfs_errortag_add( { BUILD_BUG_ON(ARRAY_SIZE(xfs_errortag_random_default) != XFS_ERRTAG_MAX); - if (error_tag >= XFS_ERRTAG_MAX) + if (!xfs_errortag_valid(error_tag)) return -EINVAL; return xfs_errortag_set(mp, error_tag, diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index fa578c913323..bf1661410272 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1335,15 +1335,6 @@ xfs_buffered_write_iomap_end( if (iomap->type != IOMAP_DELALLOC) return 0; - /* - * Behave as if the write failed if drop writes is enabled. Set the NEW - * flag to force delalloc cleanup. - */ - if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_DROP_WRITES)) { - iomap->flags |= IOMAP_F_NEW; - written = 0; - } - /* If we didn't reserve the blocks, we're not allowed to punch them. */ if (!(iomap->flags & IOMAP_F_NEW)) return 0;