From patchwork Thu Jan 18 22:19:39 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523242
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 1/3] xfs: unmapped buffer item size straddling mismatch
Date: Fri, 19 Jan 2024 09:19:39 +1100
Message-ID: <20240118222216.4131379-2-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>
From: Dave Chinner <david@fromorbit.com>

We never log large contiguous regions of unmapped buffers, so this bug
is never triggered by the current code. However, the slowpath for
formatting buffer straddling regions is broken. That is, the size and
shape of the log vector calculated across a straddle does not match how
the formatting code formats a straddle. This results in a log vector
with an uninitialised iovec, and this causes a crash when
xlog_write_full() goes to copy the iovec into the journal.

Whilst touching this code, don't bother checking mapped or single folio
buffers for discontiguous regions because they don't have them. This
significantly reduces the overhead of this check when logging large
buffers, as calling xfs_buf_offset() is not free and it occurs a *lot*
in those cases.

Fixes: 929f8b0deb83 ("xfs: optimise xfs_buf_item_size/format for contiguous regions")
Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_buf_item.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)
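To make the size/format invariant concrete, here is a minimal standalone
model of the sizing walk (an illustration only, not the kernel code: CHUNK,
PAGE and straddles_page() are stand-ins for XFS_BLF_CHUNK, PAGE_SIZE and
xfs_buf_item_straddle()). The point it demonstrates is that the sizing pass
must open and close iovec runs at exactly the same places the format pass
will, including splitting a run that straddles a discontiguity; otherwise
the format pass emits more iovecs than were counted and one is left
uninitialised.

#include <stdbool.h>
#include <stdio.h>

#define CHUNK		128	/* stand-in for XFS_BLF_CHUNK */
#define PAGE		4096	/* stand-in for PAGE_SIZE */
#define NCHUNKS		64

/* An unmapped buffer cannot copy across a page boundary in one iovec. */
static bool straddles_page(int first_chunk, int nchunks)
{
	long start = (long)first_chunk * CHUNK;
	long end = start + (long)nchunks * CHUNK - 1;

	return (start / PAGE) != (end / PAGE);
}

/* Count iovecs and bytes the same way the format step will emit them. */
static void size_segment(const bool *dirty, int *nvecs, int *nbytes)
{
	int first = -1, nbits = 0;

	for (int bit = 0; bit <= NCHUNKS; bit++) {
		bool set = bit < NCHUNKS && dirty[bit];

		if (set && first >= 0 && bit == first + nbits &&
		    !straddles_page(first, nbits + 1)) {
			nbits++;			/* extend the current run */
			continue;
		}
		if (first >= 0) {			/* close the previous run */
			(*nvecs)++;
			*nbytes += nbits * CHUNK;
		}
		first = set ? bit : -1;			/* maybe start a new run */
		nbits = set ? 1 : 0;
	}
}

int main(void)
{
	bool dirty[NCHUNKS] = { 0 };
	int nvecs = 0, nbytes = 0;

	for (int i = 28; i < 36; i++)	/* a dirty run that crosses a page */
		dirty[i] = true;

	size_segment(dirty, &nvecs, &nbytes);
	printf("nvecs=%d nbytes=%d\n", nvecs, nbytes);
	return 0;
}

Compiled with any C99 compiler, this prints nvecs=2 nbytes=1024 for an
eight-chunk run that crosses a page boundary: two vectors, with every dirty
chunk's bytes counted exactly once.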
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index 43031842341a..83a81cb52d8e 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -56,6 +56,10 @@ xfs_buf_log_format_size( (blfp->blf_map_size * sizeof(blfp->blf_data_map[0])); } +/* + * We only have to worry about discontiguous buffer range straddling on unmapped + * buffers. Everything else will have a contiguous data region we can copy from. + */ static inline bool xfs_buf_item_straddle( struct xfs_buf *bp, @@ -65,6 +69,9 @@ xfs_buf_item_straddle( { void *first, *last; + if (bp->b_page_count == 1 || !(bp->b_flags & XBF_UNMAPPED)) + return false; + first = xfs_buf_offset(bp, offset + (first_bit << XFS_BLF_SHIFT)); last = xfs_buf_offset(bp, offset + ((first_bit + nbits) << XFS_BLF_SHIFT)); @@ -132,11 +139,13 @@ xfs_buf_item_size_segment( return; slow_scan: - /* Count the first bit we jumped out of the above loop from */ - (*nvecs)++; - *nbytes += XFS_BLF_CHUNK; + ASSERT(bp->b_addr == NULL); last_bit = first_bit; + nbits = 1; while (last_bit != -1) { + + *nbytes += XFS_BLF_CHUNK; + /* * This takes the bit number to start looking from and * returns the next set bit from there. It returns -1 @@ -151,6 +160,8 @@ xfs_buf_item_size_segment( * else keep scanning the current set of bits. */ if (next_bit == -1) { + if (first_bit != last_bit) + (*nvecs)++; break; } else if (next_bit != last_bit + 1 || xfs_buf_item_straddle(bp, offset, first_bit, nbits)) { @@ -162,7 +173,6 @@ xfs_buf_item_size_segment( last_bit++; nbits++; } - *nbytes += XFS_BLF_CHUNK; } }

From patchwork Thu Jan 18 22:19:40 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523245
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 2/3] xfs: use folios in the buffer cache
Date: Fri, 19 Jan 2024 09:19:40 +1100
Message-ID: <20240118222216.4131379-3-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>

From: Dave Chinner <david@fromorbit.com>

Convert the use of struct pages to struct folio everywhere. This is
just a direct API conversion; no actual logic or code changes should
result.

Note: this conversion currently assumes only single page folios are
allocated. Because some of the MM interfaces we use take pointers to
arrays of struct pages, it relies on the address of a single page folio
and its struct page being the same, e.g. for alloc_pages_bulk_array(),
vm_map_ram(), etc.
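That order-0 equivalence is easiest to see in a short kernel-style sketch
(illustrative only; the helper below is not part of this patch): a single
page folio overlays its struct page, so an array of such folio pointers can
be handed to page-array interfaces like alloc_pages_bulk_array() and
vm_map_ram() with a simple cast.

#include <linux/mm.h>
#include <linux/mm_types.h>

/*
 * Illustration of the order-0 equivalence this conversion relies on: a
 * single page folio overlays its struct page, so folio_page(folio, 0)
 * returns the same address as the folio itself. That is what makes casts
 * like (struct page **)bp->b_folios safe while the buffer cache only
 * allocates single page folios.
 */
static inline struct page *
example_folio_to_page_order0(struct folio *folio)
{
	/* Only valid while the buffer cache allocates order-0 folios. */
	VM_BUG_ON_FOLIO(folio_order(folio) != 0, folio);
	return folio_page(folio, 0);
}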
Signed-off-by: Dave Chinner --- fs/xfs/xfs_buf.c | 127 +++++++++++++++++++++--------------------- fs/xfs/xfs_buf.h | 14 ++--- fs/xfs/xfs_buf_item.c | 2 +- fs/xfs/xfs_linux.h | 8 +++ 4 files changed, 80 insertions(+), 71 deletions(-) diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index 08f2fbc04db5..15907e92d0d3 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -60,25 +60,25 @@ xfs_buf_submit( return __xfs_buf_submit(bp, !(bp->b_flags & XBF_ASYNC)); } +/* + * Return true if the buffer is vmapped. + * + * b_addr is null if the buffer is not mapped, but the code is clever enough to + * know it doesn't have to map a single folio, so the check has to be both for + * b_addr and bp->b_folio_count > 1. + */ static inline int xfs_buf_is_vmapped( struct xfs_buf *bp) { - /* - * Return true if the buffer is vmapped. - * - * b_addr is null if the buffer is not mapped, but the code is clever - * enough to know it doesn't have to map a single page, so the check has - * to be both for b_addr and bp->b_page_count > 1. - */ - return bp->b_addr && bp->b_page_count > 1; + return bp->b_addr && bp->b_folio_count > 1; } static inline int xfs_buf_vmap_len( struct xfs_buf *bp) { - return (bp->b_page_count * PAGE_SIZE); + return (bp->b_folio_count * PAGE_SIZE); } /* @@ -197,7 +197,7 @@ xfs_buf_get_maps( } /* - * Frees b_pages if it was allocated. + * Frees b_maps if it was allocated. */ static void xfs_buf_free_maps( @@ -273,26 +273,26 @@ _xfs_buf_alloc( } static void -xfs_buf_free_pages( +xfs_buf_free_folios( struct xfs_buf *bp) { uint i; - ASSERT(bp->b_flags & _XBF_PAGES); + ASSERT(bp->b_flags & _XBF_FOLIOS); if (xfs_buf_is_vmapped(bp)) - vm_unmap_ram(bp->b_addr, bp->b_page_count); + vm_unmap_ram(bp->b_addr, bp->b_folio_count); - for (i = 0; i < bp->b_page_count; i++) { - if (bp->b_pages[i]) - __free_page(bp->b_pages[i]); + for (i = 0; i < bp->b_folio_count; i++) { + if (bp->b_folios[i]) + __folio_put(bp->b_folios[i]); } - mm_account_reclaimed_pages(bp->b_page_count); + mm_account_reclaimed_pages(bp->b_folio_count); - if (bp->b_pages != bp->b_page_array) - kfree(bp->b_pages); - bp->b_pages = NULL; - bp->b_flags &= ~_XBF_PAGES; + if (bp->b_folios != bp->b_folio_array) + kfree(bp->b_folios); + bp->b_folios = NULL; + bp->b_flags &= ~_XBF_FOLIOS; } static void @@ -313,8 +313,8 @@ xfs_buf_free( ASSERT(list_empty(&bp->b_lru)); - if (bp->b_flags & _XBF_PAGES) - xfs_buf_free_pages(bp); + if (bp->b_flags & _XBF_FOLIOS) + xfs_buf_free_folios(bp); else if (bp->b_flags & _XBF_KMEM) kfree(bp->b_addr); @@ -345,15 +345,15 @@ xfs_buf_alloc_kmem( return -ENOMEM; } bp->b_offset = offset_in_page(bp->b_addr); - bp->b_pages = bp->b_page_array; - bp->b_pages[0] = kmem_to_page(bp->b_addr); - bp->b_page_count = 1; + bp->b_folios = bp->b_folio_array; + bp->b_folios[0] = kmem_to_folio(bp->b_addr); + bp->b_folio_count = 1; bp->b_flags |= _XBF_KMEM; return 0; } static int -xfs_buf_alloc_pages( +xfs_buf_alloc_folios( struct xfs_buf *bp, xfs_buf_flags_t flags) { @@ -364,16 +364,16 @@ xfs_buf_alloc_pages( gfp_mask |= __GFP_NORETRY; /* Make sure that we have a page list */ - bp->b_page_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); - if (bp->b_page_count <= XB_PAGES) { - bp->b_pages = bp->b_page_array; + bp->b_folio_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); + if (bp->b_folio_count <= XB_FOLIOS) { + bp->b_folios = bp->b_folio_array; } else { - bp->b_pages = kzalloc(sizeof(struct page *) * bp->b_page_count, + bp->b_folios = kzalloc(sizeof(struct folio *) * bp->b_folio_count, gfp_mask); - if (!bp->b_pages) + if 
(!bp->b_folios) return -ENOMEM; } - bp->b_flags |= _XBF_PAGES; + bp->b_flags |= _XBF_FOLIOS; /* Assure zeroed buffer for non-read cases. */ if (!(flags & XBF_READ)) @@ -387,9 +387,9 @@ xfs_buf_alloc_pages( for (;;) { long last = filled; - filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count, - bp->b_pages); - if (filled == bp->b_page_count) { + filled = alloc_pages_bulk_array(gfp_mask, bp->b_folio_count, + (struct page **)bp->b_folios); + if (filled == bp->b_folio_count) { XFS_STATS_INC(bp->b_mount, xb_page_found); break; } @@ -398,7 +398,7 @@ xfs_buf_alloc_pages( continue; if (flags & XBF_READ_AHEAD) { - xfs_buf_free_pages(bp); + xfs_buf_free_folios(bp); return -ENOMEM; } @@ -412,14 +412,14 @@ xfs_buf_alloc_pages( * Map buffer into kernel address-space if necessary. */ STATIC int -_xfs_buf_map_pages( +_xfs_buf_map_folios( struct xfs_buf *bp, xfs_buf_flags_t flags) { - ASSERT(bp->b_flags & _XBF_PAGES); - if (bp->b_page_count == 1) { + ASSERT(bp->b_flags & _XBF_FOLIOS); + if (bp->b_folio_count == 1) { /* A single page buffer is always mappable */ - bp->b_addr = page_address(bp->b_pages[0]); + bp->b_addr = folio_address(bp->b_folios[0]); } else if (flags & XBF_UNMAPPED) { bp->b_addr = NULL; } else { @@ -443,8 +443,8 @@ _xfs_buf_map_pages( */ nofs_flag = memalloc_nofs_save(); do { - bp->b_addr = vm_map_ram(bp->b_pages, bp->b_page_count, - -1); + bp->b_addr = vm_map_ram((struct page **)bp->b_folios, + bp->b_folio_count, -1); if (bp->b_addr) break; vm_unmap_aliases(); @@ -571,7 +571,7 @@ xfs_buf_find_lock( return -ENOENT; } ASSERT((bp->b_flags & _XBF_DELWRI_Q) == 0); - bp->b_flags &= _XBF_KMEM | _XBF_PAGES; + bp->b_flags &= _XBF_KMEM | _XBF_FOLIOS; bp->b_ops = NULL; } return 0; @@ -629,14 +629,15 @@ xfs_buf_find_insert( goto out_drop_pag; /* - * For buffers that fit entirely within a single page, first attempt to - * allocate the memory from the heap to minimise memory usage. If we - * can't get heap memory for these small buffers, we fall back to using - * the page allocator. + * For buffers that fit entirely within a single page folio, first + * attempt to allocate the memory from the heap to minimise memory + * usage. If we can't get heap memory for these small buffers, we fall + * back to using the page allocator. */ + if (BBTOB(new_bp->b_length) >= PAGE_SIZE || xfs_buf_alloc_kmem(new_bp, flags) < 0) { - error = xfs_buf_alloc_pages(new_bp, flags); + error = xfs_buf_alloc_folios(new_bp, flags); if (error) goto out_free_buf; } @@ -728,11 +729,11 @@ xfs_buf_get_map( /* We do not hold a perag reference anymore. 
*/ if (!bp->b_addr) { - error = _xfs_buf_map_pages(bp, flags); + error = _xfs_buf_map_folios(bp, flags); if (unlikely(error)) { xfs_warn_ratelimited(btp->bt_mount, - "%s: failed to map %u pages", __func__, - bp->b_page_count); + "%s: failed to map %u folios", __func__, + bp->b_folio_count); xfs_buf_relse(bp); return error; } @@ -963,14 +964,14 @@ xfs_buf_get_uncached( if (error) return error; - error = xfs_buf_alloc_pages(bp, flags); + error = xfs_buf_alloc_folios(bp, flags); if (error) goto fail_free_buf; - error = _xfs_buf_map_pages(bp, 0); + error = _xfs_buf_map_folios(bp, 0); if (unlikely(error)) { xfs_warn(target->bt_mount, - "%s: failed to map pages", __func__); + "%s: failed to map folios", __func__); goto fail_free_buf; } @@ -1465,7 +1466,7 @@ xfs_buf_ioapply_map( blk_opf_t op) { int page_index; - unsigned int total_nr_pages = bp->b_page_count; + unsigned int total_nr_pages = bp->b_folio_count; int nr_pages; struct bio *bio; sector_t sector = bp->b_maps[map].bm_bn; @@ -1503,7 +1504,7 @@ xfs_buf_ioapply_map( if (nbytes > size) nbytes = size; - rbytes = bio_add_page(bio, bp->b_pages[page_index], nbytes, + rbytes = bio_add_folio(bio, bp->b_folios[page_index], nbytes, offset); if (rbytes < nbytes) break; @@ -1716,13 +1717,13 @@ xfs_buf_offset( struct xfs_buf *bp, size_t offset) { - struct page *page; + struct folio *folio; if (bp->b_addr) return bp->b_addr + offset; - page = bp->b_pages[offset >> PAGE_SHIFT]; - return page_address(page) + (offset & (PAGE_SIZE-1)); + folio = bp->b_folios[offset >> PAGE_SHIFT]; + return folio_address(folio) + (offset & (PAGE_SIZE-1)); } void @@ -1735,18 +1736,18 @@ xfs_buf_zero( bend = boff + bsize; while (boff < bend) { - struct page *page; + struct folio *folio; int page_index, page_offset, csize; page_index = (boff + bp->b_offset) >> PAGE_SHIFT; page_offset = (boff + bp->b_offset) & ~PAGE_MASK; - page = bp->b_pages[page_index]; + folio = bp->b_folios[page_index]; csize = min_t(size_t, PAGE_SIZE - page_offset, BBTOB(bp->b_length) - boff); ASSERT((csize + page_offset) <= PAGE_SIZE); - memset(page_address(page) + page_offset, 0, csize); + memset(folio_address(folio) + page_offset, 0, csize); boff += csize; } diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h index b470de08a46c..1e7298ff3fa5 100644 --- a/fs/xfs/xfs_buf.h +++ b/fs/xfs/xfs_buf.h @@ -29,7 +29,7 @@ struct xfs_buf; #define XBF_READ_AHEAD (1u << 2) /* asynchronous read-ahead */ #define XBF_NO_IOACCT (1u << 3) /* bypass I/O accounting (non-LRU bufs) */ #define XBF_ASYNC (1u << 4) /* initiator will not wait for completion */ -#define XBF_DONE (1u << 5) /* all pages in the buffer uptodate */ +#define XBF_DONE (1u << 5) /* all folios in the buffer uptodate */ #define XBF_STALE (1u << 6) /* buffer has been staled, do not find it */ #define XBF_WRITE_FAIL (1u << 7) /* async writes have failed on this buffer */ @@ -39,7 +39,7 @@ struct xfs_buf; #define _XBF_LOGRECOVERY (1u << 18)/* log recovery buffer */ /* flags used only internally */ -#define _XBF_PAGES (1u << 20)/* backed by refcounted pages */ +#define _XBF_FOLIOS (1u << 20)/* backed by refcounted folios */ #define _XBF_KMEM (1u << 21)/* backed by heap memory */ #define _XBF_DELWRI_Q (1u << 22)/* buffer on a delwri queue */ @@ -68,7 +68,7 @@ typedef unsigned int xfs_buf_flags_t; { _XBF_INODES, "INODES" }, \ { _XBF_DQUOTS, "DQUOTS" }, \ { _XBF_LOGRECOVERY, "LOG_RECOVERY" }, \ - { _XBF_PAGES, "PAGES" }, \ + { _XBF_FOLIOS, "FOLIOS" }, \ { _XBF_KMEM, "KMEM" }, \ { _XBF_DELWRI_Q, "DELWRI_Q" }, \ /* The following interface flags should never be set */ \ 
@@ -116,7 +116,7 @@ typedef struct xfs_buftarg { struct ratelimit_state bt_ioerror_rl; } xfs_buftarg_t; -#define XB_PAGES 2 +#define XB_FOLIOS 2 struct xfs_buf_map { xfs_daddr_t bm_bn; /* block number for I/O */ @@ -180,14 +180,14 @@ struct xfs_buf { struct xfs_buf_log_item *b_log_item; struct list_head b_li_list; /* Log items list head */ struct xfs_trans *b_transp; - struct page **b_pages; /* array of page pointers */ - struct page *b_page_array[XB_PAGES]; /* inline pages */ + struct folio **b_folios; /* array of folio pointers */ + struct folio *b_folio_array[XB_FOLIOS]; /* inline folios */ struct xfs_buf_map *b_maps; /* compound buffer map */ struct xfs_buf_map __b_map; /* inline compound buffer map */ int b_map_count; atomic_t b_pin_count; /* pin count */ atomic_t b_io_remaining; /* #outstanding I/O requests */ - unsigned int b_page_count; /* size of page array */ + unsigned int b_folio_count; /* size of folio array */ unsigned int b_offset; /* page offset of b_addr, only for _XBF_KMEM buffers */ int b_error; /* error code on I/O */ diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index 83a81cb52d8e..d1407cee48d9 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -69,7 +69,7 @@ xfs_buf_item_straddle( { void *first, *last; - if (bp->b_page_count == 1 || !(bp->b_flags & XBF_UNMAPPED)) + if (bp->b_folio_count == 1 || !(bp->b_flags & XBF_UNMAPPED)) return false; first = xfs_buf_offset(bp, offset + (first_bit << XFS_BLF_SHIFT)); diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h index caccb7f76690..804389b8e802 100644 --- a/fs/xfs/xfs_linux.h +++ b/fs/xfs/xfs_linux.h @@ -279,4 +279,12 @@ kmem_to_page(void *addr) return virt_to_page(addr); } +static inline struct folio * +kmem_to_folio(void *addr) +{ + if (is_vmalloc_addr(addr)) + return page_folio(vmalloc_to_page(addr)); + return virt_to_folio(addr); +} + #endif /* __XFS_LINUX__ */

From patchwork Thu Jan 18 22:19:41 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523244
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 3/3] xfs: convert buffer cache to use high order folios
Date: Fri, 19 Jan 2024 09:19:41 +1100
Message-ID: <20240118222216.4131379-4-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>

From: Dave Chinner <david@fromorbit.com>

Now that we have the buffer cache using the folio API, we can extend
the use of folios to allocate high order folios for multi-page buffers
rather than an array of single pages that are then vmapped into a
contiguous range.

This creates two types of buffers: single folio buffers that can have
arbitrary order, and multi-folio buffers made up of many single page
folios that get vmapped. The latter is essentially the existing code,
so there are no logic changes to handle this case.

There are a few places where we iterate the folios on a buffer. These
need to be converted to handle the high order folio case.
Luckily, this only occurs when bp->b_folio_count == 1, and the code for
handling this case is just a simple application of the folio API to the
operations that need to be performed.

The code that allocates buffers will optimistically attempt a high
order folio allocation as a fast path. If this high order allocation
fails, then we fall back to the existing multi-folio allocation code.
This now forms the slow allocation path, and hopefully will be largely
unused in normal conditions.

This should improve performance of large buffer operations (e.g. large
directory block sizes) as we should now mostly avoid the expense of
vmapping large buffers (and the vmap lock contention that can occur) as
well as avoid the runtime pressure that frequently accessing kernel
vmapped pages puts on the TLBs.

Signed-off-by: Dave Chinner
---
 fs/xfs/xfs_buf.c | 150 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 119 insertions(+), 31 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index 15907e92d0d3..df363f17ea1a 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -74,6 +74,10 @@ xfs_buf_is_vmapped( return bp->b_addr && bp->b_folio_count > 1; } +/* + * See comment above xfs_buf_alloc_folios() about the constraints placed on + * allocating vmapped buffers. + */ static inline int xfs_buf_vmap_len( struct xfs_buf *bp) @@ -344,14 +348,72 @@ xfs_buf_alloc_kmem( bp->b_addr = NULL; return -ENOMEM; } - bp->b_offset = offset_in_page(bp->b_addr); bp->b_folios = bp->b_folio_array; bp->b_folios[0] = kmem_to_folio(bp->b_addr); + bp->b_offset = offset_in_folio(bp->b_folios[0], bp->b_addr); bp->b_folio_count = 1; bp->b_flags |= _XBF_KMEM; return 0; } +/* + * Allocating a high order folio makes the assumption that buffers are a + * power-of-2 size so that ilog2() returns the exact order needed to fit + * the contents of the buffer. Buffer lengths are mostly a power of two, + * so this is not an unreasonable approach to take by default. + * + * The exception here are user xattr data buffers, which can be arbitrarily + * sized up to 64kB plus structure metadata. In that case, round up the order. + */ +static bool +xfs_buf_alloc_folio( + struct xfs_buf *bp, + gfp_t gfp_mask) +{ + int length = BBTOB(bp->b_length); + int order; + + order = ilog2(length); + if ((1 << order) < length) + order = ilog2(length - 1) + 1; + + if (order <= PAGE_SHIFT) + order = 0; + else + order -= PAGE_SHIFT; + + bp->b_folio_array[0] = folio_alloc(gfp_mask, order); + if (!bp->b_folio_array[0]) + return false; + + bp->b_folios = bp->b_folio_array; + bp->b_folio_count = 1; + bp->b_flags |= _XBF_FOLIOS; + return true; +} + +/* + * When we allocate folios for a buffer, we end up with one of two types of + * buffer. + * + * The first type is a single folio buffer - this may be a high order + * folio or just a single page sized folio, but either way they get treated the + * same way by the rest of the code - the buffer memory spans a single + * contiguous memory region that we don't have to map and unmap to access the + * data directly. + * + * The second type of buffer is the multi-folio buffer. These are *always* made + * up of single page folios so that they can be fed to vmap_ram() to return a + * contiguous memory region we can access the data through, or mark it as + * XBF_UNMAPPED and access the data directly through individual folio_address() + * calls.
+ * + * We don't use high order folios for this second type of buffer (yet) because + * having variable size folios makes offset-to-folio indexing and iteration of + * the data range more complex than if they are fixed size. This case should now + * be the slow path, though, so unless we regularly fail to allocate high order + * folios, there should be little need to optimise this path. + */ static int xfs_buf_alloc_folios( struct xfs_buf *bp, @@ -363,7 +425,15 @@ xfs_buf_alloc_folios( if (flags & XBF_READ_AHEAD) gfp_mask |= __GFP_NORETRY; - /* Make sure that we have a page list */ + /* Assure zeroed buffer for non-read cases. */ + if (!(flags & XBF_READ)) + gfp_mask |= __GFP_ZERO; + + /* Optimistically attempt a single high order folio allocation. */ + if (xfs_buf_alloc_folio(bp, gfp_mask)) + return 0; + + /* Fall back to allocating an array of single page folios. */ bp->b_folio_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); if (bp->b_folio_count <= XB_FOLIOS) { bp->b_folios = bp->b_folio_array; @@ -375,9 +445,6 @@ xfs_buf_alloc_folios( } bp->b_flags |= _XBF_FOLIOS; - /* Assure zeroed buffer for non-read cases. */ - if (!(flags & XBF_READ)) - gfp_mask |= __GFP_ZERO; /* * Bulk filling of pages can take multiple calls. Not filling the entire @@ -418,7 +485,7 @@ _xfs_buf_map_folios( { ASSERT(bp->b_flags & _XBF_FOLIOS); if (bp->b_folio_count == 1) { - /* A single page buffer is always mappable */ + /* A single folio buffer is always mappable */ bp->b_addr = folio_address(bp->b_folios[0]); } else if (flags & XBF_UNMAPPED) { bp->b_addr = NULL; @@ -1465,20 +1532,28 @@ xfs_buf_ioapply_map( int *count, blk_opf_t op) { - int page_index; - unsigned int total_nr_pages = bp->b_folio_count; - int nr_pages; + int folio_index; + unsigned int total_nr_folios = bp->b_folio_count; + int nr_folios; struct bio *bio; sector_t sector = bp->b_maps[map].bm_bn; int size; int offset; - /* skip the pages in the buffer before the start offset */ - page_index = 0; + /* + * If the start offset if larger than a single page, we need to be + * careful. We might have a high order folio, in which case the indexing + * is from the start of the buffer. However, if we have more than one + * folio single page folio in the buffer, we need to skip the folios in + * the buffer before the start offset. 
+ */ + folio_index = 0; offset = *buf_offset; - while (offset >= PAGE_SIZE) { - page_index++; - offset -= PAGE_SIZE; + if (bp->b_folio_count > 1) { + while (offset >= PAGE_SIZE) { + folio_index++; + offset -= PAGE_SIZE; + } } /* @@ -1491,28 +1566,28 @@ xfs_buf_ioapply_map( next_chunk: atomic_inc(&bp->b_io_remaining); - nr_pages = bio_max_segs(total_nr_pages); + nr_folios = bio_max_segs(total_nr_folios); - bio = bio_alloc(bp->b_target->bt_bdev, nr_pages, op, GFP_NOIO); + bio = bio_alloc(bp->b_target->bt_bdev, nr_folios, op, GFP_NOIO); bio->bi_iter.bi_sector = sector; bio->bi_end_io = xfs_buf_bio_end_io; bio->bi_private = bp; - for (; size && nr_pages; nr_pages--, page_index++) { - int rbytes, nbytes = PAGE_SIZE - offset; + for (; size && nr_folios; nr_folios--, folio_index++) { + struct folio *folio = bp->b_folios[folio_index]; + int nbytes = folio_size(folio) - offset; if (nbytes > size) nbytes = size; - rbytes = bio_add_folio(bio, bp->b_folios[page_index], nbytes, - offset); - if (rbytes < nbytes) + if (!bio_add_folio(bio, folio, nbytes, + offset_in_folio(folio, offset))) break; offset = 0; sector += BTOBB(nbytes); size -= nbytes; - total_nr_pages--; + total_nr_folios--; } if (likely(bio->bi_iter.bi_size)) { @@ -1722,6 +1797,13 @@ xfs_buf_offset( if (bp->b_addr) return bp->b_addr + offset; + /* Single folio buffers may use large folios. */ + if (bp->b_folio_count == 1) { + folio = bp->b_folios[0]; + return folio_address(folio) + offset_in_folio(folio, offset); + } + + /* Multi-folio buffers always use PAGE_SIZE folios */ folio = bp->b_folios[offset >> PAGE_SHIFT]; return folio_address(folio) + (offset & (PAGE_SIZE-1)); } @@ -1737,18 +1819,24 @@ xfs_buf_zero( bend = boff + bsize; while (boff < bend) { struct folio *folio; - int page_index, page_offset, csize; + int folio_index, folio_offset, csize; - page_index = (boff + bp->b_offset) >> PAGE_SHIFT; - page_offset = (boff + bp->b_offset) & ~PAGE_MASK; - folio = bp->b_folios[page_index]; - csize = min_t(size_t, PAGE_SIZE - page_offset, + /* Single folio buffers may use large folios. */ + if (bp->b_folio_count == 1) { + folio = bp->b_folios[0]; + folio_offset = offset_in_folio(folio, + bp->b_offset + boff); + } else { + folio_index = (boff + bp->b_offset) >> PAGE_SHIFT; + folio_offset = (boff + bp->b_offset) & ~PAGE_MASK; + folio = bp->b_folios[folio_index]; + } + + csize = min_t(size_t, folio_size(folio) - folio_offset, BBTOB(bp->b_length) - boff); + ASSERT((csize + folio_offset) <= folio_size(folio)); - ASSERT((csize + page_offset) <= PAGE_SIZE); - - memset(folio_address(folio) + page_offset, 0, csize); - + memset(folio_address(folio) + folio_offset, 0, csize); boff += csize; } }
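As a footnote to the order calculation in xfs_buf_alloc_folio() above, here
is a small userspace model of that rounding (an illustration assuming 4kB
pages, not the kernel code): power-of-2 buffer lengths map directly to a
folio order, while a non-power-of-2 length, such as a remote xattr buffer a
little over 64kB, is rounded up to the next order.

#include <stdio.h>

#define PAGE_SHIFT	12	/* assumes 4kB pages for illustration */

static int ilog2_long(unsigned long v)
{
	int l = -1;

	while (v) {
		v >>= 1;
		l++;
	}
	return l;
}

/* Round a buffer length up to the folio order that can hold it. */
static int buf_len_to_order(unsigned long length)
{
	int order = ilog2_long(length);

	if ((1UL << order) < length)	/* non-power-of-2, e.g. xattr buffers */
		order = ilog2_long(length - 1) + 1;

	if (order <= PAGE_SHIFT)
		return 0;
	return order - PAGE_SHIFT;
}

int main(void)
{
	unsigned long lengths[] = { 4096, 8192, 65536, 68096 };

	for (int i = 0; i < 4; i++) {
		int order = buf_len_to_order(lengths[i]);

		printf("%lu bytes -> order %d (%lu bytes allocated)\n",
		       lengths[i], order, 1UL << (order + PAGE_SHIFT));
	}
	return 0;
}

Running this prints orders 0, 1, 4 and 5 for 4096, 8192, 65536 and 68096
byte buffers respectively; the last case allocates a 128kB folio for a
buffer a little over 64kB, matching the rounding described in the comment
above xfs_buf_alloc_folio().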