From patchwork Fri Nov 8 12:42:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 13868124
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 713511E884B
 for ; Fri, 8 Nov 2024 12:41:16 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1731069678; cv=none;
 b=qz6SczQawx5n2u2h3DCyJJ5aDavJUocz1y7vPk0ig8/WvyLdjsZMkFxeWkMUjdLsJAK3W0Rye+7GBgrMBKpaalVHgOQcxu3rePOZVz4J0g9YAduOoYP1Rd+93RBUaBfYMZ+EYj5pEfj6FvF2ttpp5GvC598dHsuinUWxyBBjp3o=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1731069678; c=relaxed/simple;
 bh=JyK04c7bAWyOwLJ7OpQ+2fEinm1tQGvmaDTDdmMoCOQ=;
 h=From:To:Subject:Date:Message-ID:In-Reply-To:References:
 MIME-Version:Content-Type;
 b=dOPD68pFWz2LXquJM4Yh15TAU/lLvJ3t4YESj/Ek/1tvJh8ClSpXIHln6xjpadNjfYFTkCnxPLDiQXjhdcXHeaDZvK0T9EepgMRSRoqAT+Ex7yMZsxXsRbHbOqbd1RgzThqHYIwUNL0b04bEDn4OQs1NziwGagbCAHUvrBRlgQM=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com;
 spf=pass smtp.mailfrom=redhat.com;
 dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com
 header.b=GlF/kLE3; arc=none smtp.client-ip=170.10.133.124
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
 dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com
 header.b="GlF/kLE3"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1731069675;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=F2JK3RJR9W88PivNQjsAvbcErJwExntyQMVOS/w4R6A=;
 b=GlF/kLE3DlqNqch2kwZUCTMuGRTB/OKewOFNabHZpmIoRGK4g6g/Wru3G3AoJNHEAdS1wP
 gz3k8fOhIE153fOsz5a6TMIYxbMhDTDHckgBCYTvMSiqQytYrFVKRR+nbO134gnyVzS2o2
 cyoQ9lQykDVppDfRhfe8gISQjcRUswM=
Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63])
 by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384)
 id us-mta-288-oYcUdHVHP2y5lrfz-StV2w-1; Fri, 08 Nov 2024 07:41:13 -0500
X-MC-Unique: oYcUdHVHP2y5lrfz-StV2w-1
X-Mimecast-MFC-AGG-ID: oYcUdHVHP2y5lrfz-StV2w
Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits)
 server-digest SHA256)
 (No client certificate requested)
 by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix)
 with ESMTPS id 1B571195608B
 for ; Fri, 8 Nov 2024 12:41:13 +0000 (UTC)
Received: from bfoster.redhat.com (unknown [10.22.64.111])
 by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix)
 with ESMTP id 9D91E1955F3E
 for ; Fri, 8 Nov 2024 12:41:12 +0000 (UTC)
From: Brian Foster
To: linux-fsdevel@vger.kernel.org
Subject: [PATCH v3 1/4] iomap: reset per-iter state on non-error iter advances
Date: Fri, 8 Nov 2024 07:42:43 -0500
Message-ID: <20241108124246.198489-2-bfoster@redhat.com>
In-Reply-To: <20241108124246.198489-1-bfoster@redhat.com>
References: <20241108124246.198489-1-bfoster@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on
 10.30.177.40

iomap_iter_advance() zeroes the processed and mapping fields on every
non-error iteration except for the last expected iteration (i.e.
return 0 expected to terminate the iteration loop). This appears to
be circumstantial as nothing currently relies on these fields after
the final iteration.

Therefore to better facilitate iomap_iter reuse in subsequent
patches, update iomap_iter_advance() to always reset per-iteration
state on successful completion.

Signed-off-by: Brian Foster
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 fs/iomap/iter.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c
index 79a0614eaab7..3790918646af 100644
--- a/fs/iomap/iter.c
+++ b/fs/iomap/iter.c
@@ -22,26 +22,25 @@
 static inline int iomap_iter_advance(struct iomap_iter *iter)
 {
 	bool stale = iter->iomap.flags & IOMAP_F_STALE;
+	int ret = 1;
 
 	/* handle the previous iteration (if any) */
 	if (iter->iomap.length) {
 		if (iter->processed < 0)
 			return iter->processed;
-		if (!iter->processed && !stale)
-			return 0;
 		if (WARN_ON_ONCE(iter->processed > iomap_length(iter)))
 			return -EIO;
 		iter->pos += iter->processed;
 		iter->len -= iter->processed;
-		if (!iter->len)
-			return 0;
+		if (!iter->len || (!iter->processed && !stale))
+			ret = 0;
 	}
 
-	/* clear the state for the next iteration */
+	/* clear the per iteration state */
 	iter->processed = 0;
 	memset(&iter->iomap, 0, sizeof(iter->iomap));
 	memset(&iter->srcmap, 0, sizeof(iter->srcmap));
-	return 1;
+	return ret;
 }
 
 static inline void iomap_iter_done(struct iomap_iter *iter)

From patchwork Fri Nov 8 12:42:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 13868125
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by
 smtp.subspace.kernel.org (Postfix) with ESMTPS id 0F2CF1E9062
 for ; Fri, 8 Nov 2024 12:41:16 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1731069678; cv=none;
 b=A0sAvqfDZsahTs8rvhA3Dg/M0jawxLYU2ebVVG3m03yX8s6Sep+UJhj48oHu5tWnZPYQ9VoGb2COfv744+W1HkeYsYAZl99IiLZtyItA/7iQ5l0q0LqekKx/RJHLbzhaPV2e7kkIXAOBIhI2gb9Yq3i2nA+JZhJ5EWfDYCiPV88=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1731069678; c=relaxed/simple;
 bh=MsqC22c04jXtwtqRCrwRMUrOe0DXV36idPQO4W8RiVU=;
 h=From:To:Subject:Date:Message-ID:In-Reply-To:References:
 MIME-Version:Content-Type;
 b=Sz6quC/w/WepxK60QhZP6vAw2BB1y0POHEmglEXwNVQnF/lzlZQgJW1bd3JjvWNfyTQsau/SfvrEVaDFmt+UAn9JjSagmiCu3UPF8Pu5gdAClM0eVb1CZrlpxbNkRGOD4lTUg1H8Ax4Ja5RRnnAasqwlZd8O9a/EUUuwT1EDnvg=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com;
 spf=pass smtp.mailfrom=redhat.com;
 dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com
 header.b=G9CbFHiG; arc=none smtp.client-ip=170.10.133.124
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
 dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com
 header.b="G9CbFHiG"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1731069676;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=usUcH3wz/HjPZAnLZ2ldoPzaSqZUJUZ/Tb70MMky47Q=;
 b=G9CbFHiG8GoX7CY4fAseKjo8vzKYdrqQ4ogea0OBIsIwDCjuENITqwrpj6lWxzsiSVnQGM
 S4Qyst23tXswzw6SNcPW832W/xvrXwBssDYgIZHAcclJKqhxFDg7AgP+lvmP0UV6fI70dE
 qqXBe3C1cqfTW4Yl0cg3+4OAdLRvEw4=
Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63])
 by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384)
 id us-mta-324-sYvy6H2oO6SYGnN7RSQJ9g-1; Fri, 08 Nov 2024 07:41:14 -0500
X-MC-Unique: sYvy6H2oO6SYGnN7RSQJ9g-1
X-Mimecast-MFC-AGG-ID: sYvy6H2oO6SYGnN7RSQJ9g
Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits)
 server-digest SHA256)
 (No client certificate requested)
 by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix)
 with ESMTPS id D679F19560AF
 for ; Fri, 8 Nov 2024 12:41:13 +0000 (UTC)
Received: from bfoster.redhat.com (unknown [10.22.64.111])
 by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix)
 with ESMTP id 5EAF51956054
 for ; Fri, 8 Nov 2024 12:41:13 +0000 (UTC)
From: Brian Foster
To: linux-fsdevel@vger.kernel.org
Subject: [PATCH v3 2/4] iomap: lift zeroed mapping handling into iomap_zero_range()
Date: Fri, 8 Nov 2024 07:42:44 -0500
Message-ID: <20241108124246.198489-3-bfoster@redhat.com>
In-Reply-To: <20241108124246.198489-1-bfoster@redhat.com>
References: <20241108124246.198489-1-bfoster@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40

In preparation for special handling of subranges, lift the zeroed
mapping logic from the iterator into the caller. Since this puts the
pagecache dirty check and flushing in the same place, streamline the
comments a bit as well.

Signed-off-by: Brian Foster
Reviewed-by: Darrick J. Wong
---
 fs/iomap/buffered-io.c | 64 +++++++++++++++---------------------------
 1 file changed, 22 insertions(+), 42 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index ef0b68bccbb6..a78b5b9b3df3 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1350,40 +1350,12 @@ static inline int iomap_zero_iter_flush_and_stale(struct iomap_iter *i)
 	return filemap_write_and_wait_range(mapping, i->pos, end);
 }
 
-static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero,
-		bool *range_dirty)
+static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 {
-	const struct iomap *srcmap = iomap_iter_srcmap(iter);
 	loff_t pos = iter->pos;
 	loff_t length = iomap_length(iter);
 	loff_t written = 0;
 
-	/*
-	 * We must zero subranges of unwritten mappings that might be dirty in
-	 * pagecache from previous writes. We only know whether the entire range
-	 * was clean or not, however, and dirty folios may have been written
-	 * back or reclaimed at any point after mapping lookup.
-	 *
-	 * The easiest way to deal with this is to flush pagecache to trigger
-	 * any pending unwritten conversions and then grab the updated extents
-	 * from the fs. The flush may change the current mapping, so mark it
-	 * stale for the iterator to remap it for the next pass to handle
-	 * properly.
-	 *
-	 * Note that holes are treated the same as unwritten because zero range
-	 * is (ab)used for partial folio zeroing in some cases. Hole backed
-	 * post-eof ranges can be dirtied via mapped write and the flush
-	 * triggers writeback time post-eof zeroing.
-	 */
-	if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN) {
-		if (*range_dirty) {
-			*range_dirty = false;
-			return iomap_zero_iter_flush_and_stale(iter);
-		}
-		/* range is clean and already zeroed, nothing to do */
-		return length;
-	}
-
 	do {
 		struct folio *folio;
 		int status;
@@ -1433,24 +1405,32 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
 	bool range_dirty;
 
 	/*
-	 * Zero range wants to skip pre-zeroed (i.e. unwritten) mappings, but
-	 * pagecache must be flushed to ensure stale data from previous
-	 * buffered writes is not exposed. A flush is only required for certain
-	 * types of mappings, but checking pagecache after mapping lookup is
-	 * racy with writeback and reclaim.
+	 * Zero range can skip mappings that are zero on disk so long as
+	 * pagecache is clean. If pagecache was dirty prior to zero range, the
+	 * mapping converts on writeback completion and so must be zeroed.
 	 *
-	 * Therefore, check the entire range first and pass along whether any
-	 * part of it is dirty. If so and an underlying mapping warrants it,
-	 * flush the cache at that point. This trades off the occasional false
-	 * positive (and spurious flush, if the dirty data and mapping don't
-	 * happen to overlap) for simplicity in handling a relatively uncommon
-	 * situation.
+	 * The simplest way to deal with this across a range is to flush
+	 * pagecache and process the updated mappings. To avoid an unconditional
+	 * flush, check pagecache state and only flush if dirty and the fs
+	 * returns a mapping that might convert on writeback.
 	 */
 	range_dirty = filemap_range_needs_writeback(inode->i_mapping,
 					pos, pos + len - 1);
+	while ((ret = iomap_iter(&iter, ops)) > 0) {
+		const struct iomap *s = iomap_iter_srcmap(&iter);
+
+		if (s->type == IOMAP_HOLE || s->type == IOMAP_UNWRITTEN) {
+			loff_t p = iomap_length(&iter);
+			if (range_dirty) {
+				range_dirty = false;
+				p = iomap_zero_iter_flush_and_stale(&iter);
+			}
+			iter.processed = p;
+			continue;
+		}
 
-	while ((ret = iomap_iter(&iter, ops)) > 0)
-		iter.processed = iomap_zero_iter(&iter, did_zero, &range_dirty);
+		iter.processed = iomap_zero_iter(&iter, did_zero);
+	}
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iomap_zero_range);

From patchwork Fri Nov 8 12:42:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 13868126
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id CDD441E9078
 for ; Fri, 8 Nov 2024 12:41:17 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1731069679; cv=none;
 b=Fhyct5Hlh4SWVEUzs1oUTN4D1sr4L6ICY6gtlxpGUhni2hQBfkUwyuxnAquhBXSPaIaelob3fe97uzu30Ar7qAPgxb2pxQWdYRlAWBowpdPFEFIP9NJndloai1sBlbgGlz1O0nu1QfeqiOXDLlcxQG7BiHDjxvS1rSKgCkUxD2c=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1731069679; c=relaxed/simple;
 bh=IBR/F7lQZCKJRDpari86IncDFoz0s0yDN6QG/Q/vfbs=;
 h=From:To:Subject:Date:Message-ID:In-Reply-To:References:
 MIME-Version:Content-Type;
 b=OTQqsApEzUPTP3L53Q0iDIll/Jpi796NV8FiuX51S5T58db5AWXHHpem9ec+JLL7AqtckY6iZHCyY3cuOeyEnwEt3drxSNpGd5SSly5VvKYoYczW6Um7Ty3bamP0dWtWF7FVP6gjO5L1iR3iUGy8ujPIOZu4u3Lzsl+bx+W+Muo=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none
 dis=none) header.from=redhat.com;
 spf=pass smtp.mailfrom=redhat.com;
 dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com
 header.b=LqyadNvu; arc=none smtp.client-ip=170.10.133.124
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
 dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com
 header.b="LqyadNvu"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1731069676;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=aWnPGAhYPeZR2u3ZeDz6tmm1g+TWVBBMmghBGbo0Gmc=;
 b=LqyadNvu9RKHHQPHXpRolKPPAcrCQXv2Hkb7xBjYhuP60poWwSyF8zliIc6Qbrly2bVn0o
 Z3lKg0IkKa7ZO6cAxWirMydzVTeKt2nWajTGJ6+Tfcfg11RJoSS9a1DwGYwdbWOn383wkJ
 uEqTtgbkBxj81ejjuTjysvQe0ghWRTY=
Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63])
 by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384)
 id us-mta-256-1gGciMJLO5ma8w0Q6vIqLQ-1; Fri, 08 Nov 2024 07:41:15 -0500
X-MC-Unique: 1gGciMJLO5ma8w0Q6vIqLQ-1
X-Mimecast-MFC-AGG-ID: 1gGciMJLO5ma8w0Q6vIqLQ
Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits)
 server-digest SHA256)
 (No client certificate requested)
 by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix)
 with ESMTPS id 9BDFD1955EE7
 for ; Fri, 8 Nov 2024 12:41:14 +0000 (UTC)
Received: from bfoster.redhat.com (unknown [10.22.64.111])
 by
 mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix)
 with ESMTP id 255BF1956054
 for ; Fri, 8 Nov 2024 12:41:14 +0000 (UTC)
From: Brian Foster
To: linux-fsdevel@vger.kernel.org
Subject: [PATCH v3 3/4] iomap: elide flush from partial eof zero range
Date: Fri, 8 Nov 2024 07:42:45 -0500
Message-ID: <20241108124246.198489-4-bfoster@redhat.com>
In-Reply-To: <20241108124246.198489-1-bfoster@redhat.com>
References: <20241108124246.198489-1-bfoster@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40

iomap zero range flushes pagecache in certain situations to
determine which parts of the range might require zeroing if dirty
data is present in pagecache. The kernel robot recently reported a
regression associated with this flushing in the following stress-ng
workload on XFS:

stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --metamix 64

This workload involves repeated small, strided, extending writes. On
XFS, this produces a pattern of post-eof speculative preallocation,
conversion of preallocation from delalloc to unwritten, dirtying
pagecache over newly unwritten blocks, and then rinse and repeat
from the new EOF. This leads to repetitive flushing of the EOF folio
via the zero range call XFS uses for writes that start beyond
current EOF.

To mitigate this problem, special case EOF block zeroing to prefer
zeroing the folio over a flush when the EOF folio is already dirty.
To do this, split out and open code handling of an unaligned start
offset. This brings most of the performance back by avoiding flushes
on zero range calls via write and truncate extension operations. The
flush doesn't occur in these situations because the entire range is
post-eof and therefore the folio that overlaps EOF is the only one
in the range.

Signed-off-by: Brian Foster
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index a78b5b9b3df3..7f40234a301e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1401,6 +1401,10 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
 		.len		= len,
 		.flags		= IOMAP_ZERO,
 	};
+	struct address_space *mapping = inode->i_mapping;
+	unsigned int blocksize = i_blocksize(inode);
+	unsigned int off = pos & (blocksize - 1);
+	loff_t plen = min_t(loff_t, len, blocksize - off);
 	int ret;
 	bool range_dirty;
 
@@ -1410,12 +1414,28 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
 	 * mapping converts on writeback completion and so must be zeroed.
 	 *
 	 * The simplest way to deal with this across a range is to flush
-	 * pagecache and process the updated mappings. To avoid an unconditional
-	 * flush, check pagecache state and only flush if dirty and the fs
-	 * returns a mapping that might convert on writeback.
+	 * pagecache and process the updated mappings. To avoid excessive
+	 * flushing on partial eof zeroing, special case it to zero the
+	 * unaligned start portion if already dirty in pagecache.
+	 */
+	if (off &&
+	    filemap_range_needs_writeback(mapping, pos, pos + plen - 1)) {
+		iter.len = plen;
+		while ((ret = iomap_iter(&iter, ops)) > 0)
+			iter.processed = iomap_zero_iter(&iter, did_zero);
+
+		iter.len = len - (iter.pos - pos);
+		if (ret || !iter.len)
+			return ret;
+	}
+
+	/*
+	 * To avoid an unconditional flush, check pagecache state and only flush
+	 * if dirty and the fs returns a mapping that might convert on
+	 * writeback.
 	 */
 	range_dirty = filemap_range_needs_writeback(inode->i_mapping,
-					pos, pos + len - 1);
+					iter.pos, iter.pos + iter.len - 1);
 	while ((ret = iomap_iter(&iter, ops)) > 0) {
 		const struct iomap *s = iomap_iter_srcmap(&iter);
 

From patchwork Fri Nov 8 12:42:46 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 13868127
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE5681E882A
 for ; Fri, 8 Nov 2024 12:41:18 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1731069680; cv=none;
 b=Nvk7yKs81s56fsLvfyQgaUQ1KNu9G4EQqRJA2R5lz81VnCwlYu7Ob1BEtS1A4YAtKW5QJTcuFIqn70+ai8j3lZxsiEo8ur4PsKdnraRuBlH49xqG9YIllV48XWwr2AlGym2r3H4hhBrbhmNFIwIdtGyfUy4tpCwy5lOGojlSUvM=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1731069680; c=relaxed/simple;
 bh=CZJi/OyUjBJ3Qns3n/1YgPbQWTEBPth83uDK/Wns008=;
 h=From:To:Subject:Date:Message-ID:In-Reply-To:References:
 MIME-Version:Content-Type;
 b=lMjwZN1e19rPqJkP5FS2IUA0OOe2EU84CYs4oZfv4rd5j1wmUf6z2DskBWNkeuRyyKiWEAK6iZrd3+tO7X39OXDkNVXv+ZH5JYD0qXRedYpgVY7rBhLeok3OG1uQFePbTzkM0zwy+A6TO2Ndg5+gYj01rGEcIJ9UlhZTffRC93c=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com;
 spf=pass smtp.mailfrom=redhat.com;
 dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com
 header.b=X5n42CSi; arc=none smtp.client-ip=170.10.133.124
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=redhat.com
Authentication-Results:
 smtp.subspace.kernel.org;
 dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com
 header.b="X5n42CSi"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1731069677;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=hdLf/MaIVZKlGy8e6e0n3R3u77Bw4zfNKRce7SSBUDw=;
 b=X5n42CSinbMfNj0PnWnebmnv/R0EP2Xtx4TDQsnXjaOlqnxsN6l9WLKxM9ll10ZEA0k/vZ
 D7lzVu6D6oiIa79DDu0F6O/bjBd5UwxydHStvd3EBvSBI9o41X6b6UahtQlE+tbl4GkmBE
 pGKIcHyfvabVdgVM1DZcfO7xYcxtPlU=
Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63])
 by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384)
 id us-mta-384-0UGv2EowMryeisKsJyHQTw-1; Fri, 08 Nov 2024 07:41:16 -0500
X-MC-Unique: 0UGv2EowMryeisKsJyHQTw-1
X-Mimecast-MFC-AGG-ID: 0UGv2EowMryeisKsJyHQTw
Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits)
 server-digest SHA256)
 (No client certificate requested)
 by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix)
 with ESMTPS id 6292419560B0
 for ; Fri, 8 Nov 2024 12:41:15 +0000 (UTC)
Received: from bfoster.redhat.com (unknown [10.22.64.111])
 by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix)
 with ESMTP id E024F1956054
 for ; Fri, 8 Nov 2024 12:41:14 +0000 (UTC)
From: Brian Foster
To: linux-fsdevel@vger.kernel.org
Subject: [PATCH v3 4/4] iomap: warn on zero range of a post-eof folio
Date: Fri, 8 Nov 2024 07:42:46 -0500
Message-ID: <20241108124246.198489-5-bfoster@redhat.com>
In-Reply-To: <20241108124246.198489-1-bfoster@redhat.com>
References: <20241108124246.198489-1-bfoster@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40

iomap_zero_range() uses buffered writes for manual zeroing, no
longer updates i_size for such writes, but is still explicitly
called for post-eof ranges. The historical use case for this is
zeroing post-eof speculative preallocation on extending writes from
XFS. However, XFS also recently changed to convert all post-eof
delalloc mappings to unwritten in the iomap_begin() handler, which
means it now never expects manual zeroing of post-eof mappings. In
other words, all post-eof mappings should be reported as holes or
unwritten.

This is a subtle dependency that can be hard to detect if violated
because associated codepaths are likely to update i_size after folio
locks are dropped, but before writeback happens to occur. For
example, if XFS reverts back to some form of manual zeroing of
post-eof blocks on write extension, writeback of those zeroed folios
will now race with the presumed i_size update from the subsequent
buffered write.

Since iomap_zero_range() can't correctly zero post-eof mappings
beyond EOF without updating i_size, warn if this ever occurs. This
serves as minimal indication that if this use case is reintroduced
by a filesystem, iomap_zero_range() might need to reconsider i_size
updates for write extending use cases.

Signed-off-by: Brian Foster
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7f40234a301e..e18830e4809b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1354,6 +1354,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 {
 	loff_t pos = iter->pos;
 	loff_t length = iomap_length(iter);
+	loff_t isize = iter->inode->i_size;
 	loff_t written = 0;
 
 	do {
@@ -1369,6 +1370,8 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		if (iter->iomap.flags & IOMAP_F_STALE)
 			break;
 
+		/* warn about zeroing folios beyond eof that won't write back */
+		WARN_ON_ONCE(folio_pos(folio) > isize);
 		offset = offset_in_folio(folio, pos);
 		if (bytes > folio_size(folio) - offset)
 			bytes = folio_size(folio) - offset;
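
For readers following the last hunk, the condition passed to WARN_ON_ONCE() can be modeled in userspace. This is only a rough sketch, not kernel code: model_loff_t, model_folio_pos() and model_zero_folio_post_eof() are hypothetical names invented for this illustration, and the model assumes a fixed power-of-two folio size, which the real pagecache does not guarantee. It shows that the check fires only when the folio starts strictly beyond i_size, so the folio that overlaps EOF (the case patch 3/4 zeroes directly) does not trip it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's loff_t. */
typedef int64_t model_loff_t;

/*
 * Hypothetical helper: byte offset at which the folio covering pos starts.
 * Assumes folio_size is a power of two (an assumption of this model only).
 */
static model_loff_t model_folio_pos(model_loff_t pos, model_loff_t folio_size)
{
	assert(folio_size > 0 && (folio_size & (folio_size - 1)) == 0);
	return pos & ~(folio_size - 1);
}

/*
 * Mirrors the expression in the added WARN_ON_ONCE(): true when the folio
 * being zeroed starts strictly beyond i_size. A folio that overlaps EOF,
 * or that starts exactly at i_size, does not satisfy the condition.
 */
static bool model_zero_folio_post_eof(model_loff_t pos,
				      model_loff_t folio_size,
				      model_loff_t isize)
{
	return model_folio_pos(pos, folio_size) > isize;
}
```

With 4096-byte folios and i_size of 4500, zeroing at offset 5000 touches the folio starting at 4096, which overlaps EOF and stays quiet, whereas zeroing at offset 8200 touches the folio starting at 8192 and would be flagged.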