From patchwork Thu Jan 18 22:19:39 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523242
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 1/3] xfs: unmapped buffer item size straddling mismatch
Date: Fri, 19 Jan 2024 09:19:39 +1100
Message-ID: <20240118222216.4131379-2-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>
From: Dave Chinner <david@fromorbit.com>

We never log large contiguous regions of unmapped buffers, so this bug
is never triggered by the current code. However, the slowpath for
formatting buffer straddling regions is broken. That is, the size and
shape of the log vector calculated across a straddle does not match how
the formatting code formats a straddle. This results in a log vector
with an uninitialised iovec, and this causes a crash when
xlog_write_full() goes to copy the iovec into the journal.

Whilst touching this code, don't bother checking mapped or single folio
buffers for discontiguous regions because they don't have them. This
significantly reduces the overhead of this check when logging large
buffers, as calling xfs_buf_offset() is not free and it occurs a *lot*
in those cases.

Fixes: 929f8b0deb83 ("xfs: optimise xfs_buf_item_size/format for contiguous regions")
Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_buf_item.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)
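To make the size/format invariant concrete, here is a minimal standalone
model of the sizing walk (an illustration only, not the kernel code: CHUNK,
PAGE and straddles_page() are stand-ins for XFS_BLF_CHUNK, PAGE_SIZE and
xfs_buf_item_straddle()). The point it demonstrates is that the sizing pass
must open and close iovec runs at exactly the same places the format pass
will, including splitting a run that straddles a discontiguity; otherwise
the format pass emits more iovecs than were counted and one is left
uninitialised.

#include <stdbool.h>
#include <stdio.h>

#define CHUNK		128	/* stand-in for XFS_BLF_CHUNK */
#define PAGE		4096	/* stand-in for PAGE_SIZE */
#define NCHUNKS		64

/* An unmapped buffer cannot copy across a page boundary in one iovec. */
static bool straddles_page(int first_chunk, int nchunks)
{
	long start = (long)first_chunk * CHUNK;
	long end = start + (long)nchunks * CHUNK - 1;

	return (start / PAGE) != (end / PAGE);
}

/* Count iovecs and bytes the same way the format step will emit them. */
static void size_segment(const bool *dirty, int *nvecs, int *nbytes)
{
	int first = -1, nbits = 0;

	for (int bit = 0; bit <= NCHUNKS; bit++) {
		bool set = bit < NCHUNKS && dirty[bit];

		if (set && first >= 0 && bit == first + nbits &&
		    !straddles_page(first, nbits + 1)) {
			nbits++;			/* extend the current run */
			continue;
		}
		if (first >= 0) {			/* close the previous run */
			(*nvecs)++;
			*nbytes += nbits * CHUNK;
		}
		first = set ? bit : -1;			/* maybe start a new run */
		nbits = set ? 1 : 0;
	}
}

int main(void)
{
	bool dirty[NCHUNKS] = { 0 };
	int nvecs = 0, nbytes = 0;

	for (int i = 28; i < 36; i++)	/* a dirty run that crosses a page */
		dirty[i] = true;

	size_segment(dirty, &nvecs, &nbytes);
	printf("nvecs=%d nbytes=%d\n", nvecs, nbytes);
	return 0;
}

Compiled with any C99 compiler, this prints nvecs=2 nbytes=1024 for an
eight-chunk run that crosses a page boundary: two vectors, with every dirty
chunk's bytes counted exactly once.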
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index 43031842341a..83a81cb52d8e 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -56,6 +56,10 @@ xfs_buf_log_format_size( (blfp->blf_map_size * sizeof(blfp->blf_data_map[0])); } +/* + * We only have to worry about discontiguous buffer range straddling on unmapped + * buffers. Everything else will have a contiguous data region we can copy from. + */ static inline bool xfs_buf_item_straddle( struct xfs_buf *bp, @@ -65,6 +69,9 @@ xfs_buf_item_straddle( { void *first, *last; + if (bp->b_page_count == 1 || !(bp->b_flags & XBF_UNMAPPED)) + return false; + first = xfs_buf_offset(bp, offset + (first_bit << XFS_BLF_SHIFT)); last = xfs_buf_offset(bp, offset + ((first_bit + nbits) << XFS_BLF_SHIFT)); @@ -132,11 +139,13 @@ xfs_buf_item_size_segment( return; slow_scan: - /* Count the first bit we jumped out of the above loop from */ - (*nvecs)++; - *nbytes += XFS_BLF_CHUNK; + ASSERT(bp->b_addr == NULL); last_bit = first_bit; + nbits = 1; while (last_bit != -1) { + + *nbytes += XFS_BLF_CHUNK; + /* * This takes the bit number to start looking from and * returns the next set bit from there. It returns -1 @@ -151,6 +160,8 @@ xfs_buf_item_size_segment( * else keep scanning the current set of bits. */ if (next_bit == -1) { + if (first_bit != last_bit) + (*nvecs)++; break; } else if (next_bit != last_bit + 1 || xfs_buf_item_straddle(bp, offset, first_bit, nbits)) { @@ -162,7 +173,6 @@ xfs_buf_item_size_segment( last_bit++; nbits++; } - *nbytes += XFS_BLF_CHUNK; } }

From patchwork Thu Jan 18 22:19:40 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523245
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 2/3] xfs: use folios in the buffer cache
Date: Fri, 19 Jan 2024 09:19:40 +1100
Message-ID: <20240118222216.4131379-3-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>

From: Dave Chinner <david@fromorbit.com>

Convert the use of struct pages to struct folio everywhere. This is
just a direct API conversion; no actual logic or code changes should
result.

Note: this conversion currently assumes only single page folios are
allocated. Because some of the MM interfaces we use take pointers to
arrays of struct pages, it relies on the address of a single page folio
and its struct page being the same, e.g. for alloc_pages_bulk_array(),
vm_map_ram(), etc.
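That order-0 equivalence is easiest to see in a short kernel-style sketch
(illustrative only; the helper below is not part of this patch): a single
page folio overlays its struct page, so an array of such folio pointers can
be handed to page-array interfaces like alloc_pages_bulk_array() and
vm_map_ram() with a simple cast.

#include <linux/mm.h>
#include <linux/mm_types.h>

/*
 * Illustration of the order-0 equivalence this conversion relies on: a
 * single page folio overlays its struct page, so folio_page(folio, 0)
 * returns the same address as the folio itself. That is what makes casts
 * like (struct page **)bp->b_folios safe while the buffer cache only
 * allocates single page folios.
 */
static inline struct page *
example_folio_to_page_order0(struct folio *folio)
{
	/* Only valid while the buffer cache allocates order-0 folios. */
	VM_BUG_ON_FOLIO(folio_order(folio) != 0, folio);
	return folio_page(folio, 0);
}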
Signed-off-by: Dave Chinner --- fs/xfs/xfs_buf.c | 127 +++++++++++++++++++++--------------------- fs/xfs/xfs_buf.h | 14 ++--- fs/xfs/xfs_buf_item.c | 2 +- fs/xfs/xfs_linux.h | 8 +++ 4 files changed, 80 insertions(+), 71 deletions(-) diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index 08f2fbc04db5..15907e92d0d3 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -60,25 +60,25 @@ xfs_buf_submit( return __xfs_buf_submit(bp, !(bp->b_flags & XBF_ASYNC)); } +/* + * Return true if the buffer is vmapped. + * + * b_addr is null if the buffer is not mapped, but the code is clever enough to + * know it doesn't have to map a single folio, so the check has to be both for + * b_addr and bp->b_folio_count > 1. + */ static inline int xfs_buf_is_vmapped( struct xfs_buf *bp) { - /* - * Return true if the buffer is vmapped. - * - * b_addr is null if the buffer is not mapped, but the code is clever - * enough to know it doesn't have to map a single page, so the check has - * to be both for b_addr and bp->b_page_count > 1. - */ - return bp->b_addr && bp->b_page_count > 1; + return bp->b_addr && bp->b_folio_count > 1; } static inline int xfs_buf_vmap_len( struct xfs_buf *bp) { - return (bp->b_page_count * PAGE_SIZE); + return (bp->b_folio_count * PAGE_SIZE); } /* @@ -197,7 +197,7 @@ xfs_buf_get_maps( } /* - * Frees b_pages if it was allocated. + * Frees b_maps if it was allocated. */ static void xfs_buf_free_maps( @@ -273,26 +273,26 @@ _xfs_buf_alloc( } static void -xfs_buf_free_pages( +xfs_buf_free_folios( struct xfs_buf *bp) { uint i; - ASSERT(bp->b_flags & _XBF_PAGES); + ASSERT(bp->b_flags & _XBF_FOLIOS); if (xfs_buf_is_vmapped(bp)) - vm_unmap_ram(bp->b_addr, bp->b_page_count); + vm_unmap_ram(bp->b_addr, bp->b_folio_count); - for (i = 0; i < bp->b_page_count; i++) { - if (bp->b_pages[i]) - __free_page(bp->b_pages[i]); + for (i = 0; i < bp->b_folio_count; i++) { + if (bp->b_folios[i]) + __folio_put(bp->b_folios[i]); } - mm_account_reclaimed_pages(bp->b_page_count); + mm_account_reclaimed_pages(bp->b_folio_count); - if (bp->b_pages != bp->b_page_array) - kfree(bp->b_pages); - bp->b_pages = NULL; - bp->b_flags &= ~_XBF_PAGES; + if (bp->b_folios != bp->b_folio_array) + kfree(bp->b_folios); + bp->b_folios = NULL; + bp->b_flags &= ~_XBF_FOLIOS; } static void @@ -313,8 +313,8 @@ xfs_buf_free( ASSERT(list_empty(&bp->b_lru)); - if (bp->b_flags & _XBF_PAGES) - xfs_buf_free_pages(bp); + if (bp->b_flags & _XBF_FOLIOS) + xfs_buf_free_folios(bp); else if (bp->b_flags & _XBF_KMEM) kfree(bp->b_addr); @@ -345,15 +345,15 @@ xfs_buf_alloc_kmem( return -ENOMEM; } bp->b_offset = offset_in_page(bp->b_addr); - bp->b_pages = bp->b_page_array; - bp->b_pages[0] = kmem_to_page(bp->b_addr); - bp->b_page_count = 1; + bp->b_folios = bp->b_folio_array; + bp->b_folios[0] = kmem_to_folio(bp->b_addr); + bp->b_folio_count = 1; bp->b_flags |= _XBF_KMEM; return 0; } static int -xfs_buf_alloc_pages( +xfs_buf_alloc_folios( struct xfs_buf *bp, xfs_buf_flags_t flags) { @@ -364,16 +364,16 @@ xfs_buf_alloc_pages( gfp_mask |= __GFP_NORETRY; /* Make sure that we have a page list */ - bp->b_page_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); - if (bp->b_page_count <= XB_PAGES) { - bp->b_pages = bp->b_page_array; + bp->b_folio_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); + if (bp->b_folio_count <= XB_FOLIOS) { + bp->b_folios = bp->b_folio_array; } else { - bp->b_pages = kzalloc(sizeof(struct page *) * bp->b_page_count, + bp->b_folios = kzalloc(sizeof(struct folio *) * bp->b_folio_count, gfp_mask); - if (!bp->b_pages) + if 
(!bp->b_folios) return -ENOMEM; } - bp->b_flags |= _XBF_PAGES; + bp->b_flags |= _XBF_FOLIOS; /* Assure zeroed buffer for non-read cases. */ if (!(flags & XBF_READ)) @@ -387,9 +387,9 @@ xfs_buf_alloc_pages( for (;;) { long last = filled; - filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count, - bp->b_pages); - if (filled == bp->b_page_count) { + filled = alloc_pages_bulk_array(gfp_mask, bp->b_folio_count, + (struct page **)bp->b_folios); + if (filled == bp->b_folio_count) { XFS_STATS_INC(bp->b_mount, xb_page_found); break; } @@ -398,7 +398,7 @@ xfs_buf_alloc_pages( continue; if (flags & XBF_READ_AHEAD) { - xfs_buf_free_pages(bp); + xfs_buf_free_folios(bp); return -ENOMEM; } @@ -412,14 +412,14 @@ xfs_buf_alloc_pages( * Map buffer into kernel address-space if necessary. */ STATIC int -_xfs_buf_map_pages( +_xfs_buf_map_folios( struct xfs_buf *bp, xfs_buf_flags_t flags) { - ASSERT(bp->b_flags & _XBF_PAGES); - if (bp->b_page_count == 1) { + ASSERT(bp->b_flags & _XBF_FOLIOS); + if (bp->b_folio_count == 1) { /* A single page buffer is always mappable */ - bp->b_addr = page_address(bp->b_pages[0]); + bp->b_addr = folio_address(bp->b_folios[0]); } else if (flags & XBF_UNMAPPED) { bp->b_addr = NULL; } else { @@ -443,8 +443,8 @@ _xfs_buf_map_pages( */ nofs_flag = memalloc_nofs_save(); do { - bp->b_addr = vm_map_ram(bp->b_pages, bp->b_page_count, - -1); + bp->b_addr = vm_map_ram((struct page **)bp->b_folios, + bp->b_folio_count, -1); if (bp->b_addr) break; vm_unmap_aliases(); @@ -571,7 +571,7 @@ xfs_buf_find_lock( return -ENOENT; } ASSERT((bp->b_flags & _XBF_DELWRI_Q) == 0); - bp->b_flags &= _XBF_KMEM | _XBF_PAGES; + bp->b_flags &= _XBF_KMEM | _XBF_FOLIOS; bp->b_ops = NULL; } return 0; @@ -629,14 +629,15 @@ xfs_buf_find_insert( goto out_drop_pag; /* - * For buffers that fit entirely within a single page, first attempt to - * allocate the memory from the heap to minimise memory usage. If we - * can't get heap memory for these small buffers, we fall back to using - * the page allocator. + * For buffers that fit entirely within a single page folio, first + * attempt to allocate the memory from the heap to minimise memory + * usage. If we can't get heap memory for these small buffers, we fall + * back to using the page allocator. */ + if (BBTOB(new_bp->b_length) >= PAGE_SIZE || xfs_buf_alloc_kmem(new_bp, flags) < 0) { - error = xfs_buf_alloc_pages(new_bp, flags); + error = xfs_buf_alloc_folios(new_bp, flags); if (error) goto out_free_buf; } @@ -728,11 +729,11 @@ xfs_buf_get_map( /* We do not hold a perag reference anymore. 
*/ if (!bp->b_addr) { - error = _xfs_buf_map_pages(bp, flags); + error = _xfs_buf_map_folios(bp, flags); if (unlikely(error)) { xfs_warn_ratelimited(btp->bt_mount, - "%s: failed to map %u pages", __func__, - bp->b_page_count); + "%s: failed to map %u folios", __func__, + bp->b_folio_count); xfs_buf_relse(bp); return error; } @@ -963,14 +964,14 @@ xfs_buf_get_uncached( if (error) return error; - error = xfs_buf_alloc_pages(bp, flags); + error = xfs_buf_alloc_folios(bp, flags); if (error) goto fail_free_buf; - error = _xfs_buf_map_pages(bp, 0); + error = _xfs_buf_map_folios(bp, 0); if (unlikely(error)) { xfs_warn(target->bt_mount, - "%s: failed to map pages", __func__); + "%s: failed to map folios", __func__); goto fail_free_buf; } @@ -1465,7 +1466,7 @@ xfs_buf_ioapply_map( blk_opf_t op) { int page_index; - unsigned int total_nr_pages = bp->b_page_count; + unsigned int total_nr_pages = bp->b_folio_count; int nr_pages; struct bio *bio; sector_t sector = bp->b_maps[map].bm_bn; @@ -1503,7 +1504,7 @@ xfs_buf_ioapply_map( if (nbytes > size) nbytes = size; - rbytes = bio_add_page(bio, bp->b_pages[page_index], nbytes, + rbytes = bio_add_folio(bio, bp->b_folios[page_index], nbytes, offset); if (rbytes < nbytes) break; @@ -1716,13 +1717,13 @@ xfs_buf_offset( struct xfs_buf *bp, size_t offset) { - struct page *page; + struct folio *folio; if (bp->b_addr) return bp->b_addr + offset; - page = bp->b_pages[offset >> PAGE_SHIFT]; - return page_address(page) + (offset & (PAGE_SIZE-1)); + folio = bp->b_folios[offset >> PAGE_SHIFT]; + return folio_address(folio) + (offset & (PAGE_SIZE-1)); } void @@ -1735,18 +1736,18 @@ xfs_buf_zero( bend = boff + bsize; while (boff < bend) { - struct page *page; + struct folio *folio; int page_index, page_offset, csize; page_index = (boff + bp->b_offset) >> PAGE_SHIFT; page_offset = (boff + bp->b_offset) & ~PAGE_MASK; - page = bp->b_pages[page_index]; + folio = bp->b_folios[page_index]; csize = min_t(size_t, PAGE_SIZE - page_offset, BBTOB(bp->b_length) - boff); ASSERT((csize + page_offset) <= PAGE_SIZE); - memset(page_address(page) + page_offset, 0, csize); + memset(folio_address(folio) + page_offset, 0, csize); boff += csize; } diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h index b470de08a46c..1e7298ff3fa5 100644 --- a/fs/xfs/xfs_buf.h +++ b/fs/xfs/xfs_buf.h @@ -29,7 +29,7 @@ struct xfs_buf; #define XBF_READ_AHEAD (1u << 2) /* asynchronous read-ahead */ #define XBF_NO_IOACCT (1u << 3) /* bypass I/O accounting (non-LRU bufs) */ #define XBF_ASYNC (1u << 4) /* initiator will not wait for completion */ -#define XBF_DONE (1u << 5) /* all pages in the buffer uptodate */ +#define XBF_DONE (1u << 5) /* all folios in the buffer uptodate */ #define XBF_STALE (1u << 6) /* buffer has been staled, do not find it */ #define XBF_WRITE_FAIL (1u << 7) /* async writes have failed on this buffer */ @@ -39,7 +39,7 @@ struct xfs_buf; #define _XBF_LOGRECOVERY (1u << 18)/* log recovery buffer */ /* flags used only internally */ -#define _XBF_PAGES (1u << 20)/* backed by refcounted pages */ +#define _XBF_FOLIOS (1u << 20)/* backed by refcounted folios */ #define _XBF_KMEM (1u << 21)/* backed by heap memory */ #define _XBF_DELWRI_Q (1u << 22)/* buffer on a delwri queue */ @@ -68,7 +68,7 @@ typedef unsigned int xfs_buf_flags_t; { _XBF_INODES, "INODES" }, \ { _XBF_DQUOTS, "DQUOTS" }, \ { _XBF_LOGRECOVERY, "LOG_RECOVERY" }, \ - { _XBF_PAGES, "PAGES" }, \ + { _XBF_FOLIOS, "FOLIOS" }, \ { _XBF_KMEM, "KMEM" }, \ { _XBF_DELWRI_Q, "DELWRI_Q" }, \ /* The following interface flags should never be set */ \ 
@@ -116,7 +116,7 @@ typedef struct xfs_buftarg { struct ratelimit_state bt_ioerror_rl; } xfs_buftarg_t; -#define XB_PAGES 2 +#define XB_FOLIOS 2 struct xfs_buf_map { xfs_daddr_t bm_bn; /* block number for I/O */ @@ -180,14 +180,14 @@ struct xfs_buf { struct xfs_buf_log_item *b_log_item; struct list_head b_li_list; /* Log items list head */ struct xfs_trans *b_transp; - struct page **b_pages; /* array of page pointers */ - struct page *b_page_array[XB_PAGES]; /* inline pages */ + struct folio **b_folios; /* array of folio pointers */ + struct folio *b_folio_array[XB_FOLIOS]; /* inline folios */ struct xfs_buf_map *b_maps; /* compound buffer map */ struct xfs_buf_map __b_map; /* inline compound buffer map */ int b_map_count; atomic_t b_pin_count; /* pin count */ atomic_t b_io_remaining; /* #outstanding I/O requests */ - unsigned int b_page_count; /* size of page array */ + unsigned int b_folio_count; /* size of folio array */ unsigned int b_offset; /* page offset of b_addr, only for _XBF_KMEM buffers */ int b_error; /* error code on I/O */ diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index 83a81cb52d8e..d1407cee48d9 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -69,7 +69,7 @@ xfs_buf_item_straddle( { void *first, *last; - if (bp->b_page_count == 1 || !(bp->b_flags & XBF_UNMAPPED)) + if (bp->b_folio_count == 1 || !(bp->b_flags & XBF_UNMAPPED)) return false; first = xfs_buf_offset(bp, offset + (first_bit << XFS_BLF_SHIFT)); diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h index caccb7f76690..804389b8e802 100644 --- a/fs/xfs/xfs_linux.h +++ b/fs/xfs/xfs_linux.h @@ -279,4 +279,12 @@ kmem_to_page(void *addr) return virt_to_page(addr); } +static inline struct folio * +kmem_to_folio(void *addr) +{ + if (is_vmalloc_addr(addr)) + return page_folio(vmalloc_to_page(addr)); + return virt_to_folio(addr); +} + #endif /* __XFS_LINUX__ */

From patchwork Thu Jan 18 22:19:41 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523244
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 3/3] xfs: convert buffer cache to use high order folios
Date: Fri, 19 Jan 2024 09:19:41 +1100
Message-ID: <20240118222216.4131379-4-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>

From: Dave Chinner <david@fromorbit.com>

Now that we have the buffer cache using the folio API, we can extend
the use of folios to allocate high order folios for multi-page buffers
rather than an array of single pages that are then vmapped into a
contiguous range.

This creates two types of buffers: single folio buffers that can have
arbitrary order, and multi-folio buffers made up of many single page
folios that get vmapped. The latter is essentially the existing code,
so there are no logic changes to handle this case.

There are a few places where we iterate the folios on a buffer. These
need to be converted to handle the high order folio case.
Luckily, this only occurs when bp->b_folio_count == 1, and the code for
handling this case is just a simple application of the folio API to the
operations that need to be performed.

The code that allocates buffers will optimistically attempt a high
order folio allocation as a fast path. If this high order allocation
fails, then we fall back to the existing multi-folio allocation code.
This now forms the slow allocation path, and hopefully will be largely
unused in normal conditions.

This should improve performance of large buffer operations (e.g. large
directory block sizes) as we should now mostly avoid the expense of
vmapping large buffers (and the vmap lock contention that can occur) as
well as avoid the runtime pressure that frequently accessing kernel
vmapped pages puts on the TLBs.

Signed-off-by: Dave Chinner
---
 fs/xfs/xfs_buf.c | 150 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 119 insertions(+), 31 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index 15907e92d0d3..df363f17ea1a 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -74,6 +74,10 @@ xfs_buf_is_vmapped( return bp->b_addr && bp->b_folio_count > 1; } +/* + * See comment above xfs_buf_alloc_folios() about the constraints placed on + * allocating vmapped buffers. + */ static inline int xfs_buf_vmap_len( struct xfs_buf *bp) @@ -344,14 +348,72 @@ xfs_buf_alloc_kmem( bp->b_addr = NULL; return -ENOMEM; } - bp->b_offset = offset_in_page(bp->b_addr); bp->b_folios = bp->b_folio_array; bp->b_folios[0] = kmem_to_folio(bp->b_addr); + bp->b_offset = offset_in_folio(bp->b_folios[0], bp->b_addr); bp->b_folio_count = 1; bp->b_flags |= _XBF_KMEM; return 0; } +/* + * Allocating a high order folio makes the assumption that buffers are a + * power-of-2 size so that ilog2() returns the exact order needed to fit + * the contents of the buffer. Buffer lengths are mostly a power of two, + * so this is not an unreasonable approach to take by default. + * + * The exception here are user xattr data buffers, which can be arbitrarily + * sized up to 64kB plus structure metadata. In that case, round up the order. + */ +static bool +xfs_buf_alloc_folio( + struct xfs_buf *bp, + gfp_t gfp_mask) +{ + int length = BBTOB(bp->b_length); + int order; + + order = ilog2(length); + if ((1 << order) < length) + order = ilog2(length - 1) + 1; + + if (order <= PAGE_SHIFT) + order = 0; + else + order -= PAGE_SHIFT; + + bp->b_folio_array[0] = folio_alloc(gfp_mask, order); + if (!bp->b_folio_array[0]) + return false; + + bp->b_folios = bp->b_folio_array; + bp->b_folio_count = 1; + bp->b_flags |= _XBF_FOLIOS; + return true; +} + +/* + * When we allocate folios for a buffer, we end up with one of two types of + * buffer. + * + * The first type is a single folio buffer - this may be a high order + * folio or just a single page sized folio, but either way they get treated the + * same way by the rest of the code - the buffer memory spans a single + * contiguous memory region that we don't have to map and unmap to access the + * data directly. + * + * The second type of buffer is the multi-folio buffer. These are *always* made + * up of single page folios so that they can be fed to vmap_ram() to return a + * contiguous memory region we can access the data through, or mark it as + * XBF_UNMAPPED and access the data directly through individual folio_address() + * calls.
+ * + * We don't use high order folios for this second type of buffer (yet) because + * having variable size folios makes offset-to-folio indexing and iteration of + * the data range more complex than if they are fixed size. This case should now + * be the slow path, though, so unless we regularly fail to allocate high order + * folios, there should be little need to optimise this path. + */ static int xfs_buf_alloc_folios( struct xfs_buf *bp, @@ -363,7 +425,15 @@ xfs_buf_alloc_folios( if (flags & XBF_READ_AHEAD) gfp_mask |= __GFP_NORETRY; - /* Make sure that we have a page list */ + /* Assure zeroed buffer for non-read cases. */ + if (!(flags & XBF_READ)) + gfp_mask |= __GFP_ZERO; + + /* Optimistically attempt a single high order folio allocation. */ + if (xfs_buf_alloc_folio(bp, gfp_mask)) + return 0; + + /* Fall back to allocating an array of single page folios. */ bp->b_folio_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); if (bp->b_folio_count <= XB_FOLIOS) { bp->b_folios = bp->b_folio_array; @@ -375,9 +445,6 @@ xfs_buf_alloc_folios( } bp->b_flags |= _XBF_FOLIOS; - /* Assure zeroed buffer for non-read cases. */ - if (!(flags & XBF_READ)) - gfp_mask |= __GFP_ZERO; /* * Bulk filling of pages can take multiple calls. Not filling the entire @@ -418,7 +485,7 @@ _xfs_buf_map_folios( { ASSERT(bp->b_flags & _XBF_FOLIOS); if (bp->b_folio_count == 1) { - /* A single page buffer is always mappable */ + /* A single folio buffer is always mappable */ bp->b_addr = folio_address(bp->b_folios[0]); } else if (flags & XBF_UNMAPPED) { bp->b_addr = NULL; @@ -1465,20 +1532,28 @@ xfs_buf_ioapply_map( int *count, blk_opf_t op) { - int page_index; - unsigned int total_nr_pages = bp->b_folio_count; - int nr_pages; + int folio_index; + unsigned int total_nr_folios = bp->b_folio_count; + int nr_folios; struct bio *bio; sector_t sector = bp->b_maps[map].bm_bn; int size; int offset; - /* skip the pages in the buffer before the start offset */ - page_index = 0; + /* + * If the start offset if larger than a single page, we need to be + * careful. We might have a high order folio, in which case the indexing + * is from the start of the buffer. However, if we have more than one + * folio single page folio in the buffer, we need to skip the folios in + * the buffer before the start offset. 
+ */ + folio_index = 0; offset = *buf_offset; - while (offset >= PAGE_SIZE) { - page_index++; - offset -= PAGE_SIZE; + if (bp->b_folio_count > 1) { + while (offset >= PAGE_SIZE) { + folio_index++; + offset -= PAGE_SIZE; + } } /* @@ -1491,28 +1566,28 @@ xfs_buf_ioapply_map( next_chunk: atomic_inc(&bp->b_io_remaining); - nr_pages = bio_max_segs(total_nr_pages); + nr_folios = bio_max_segs(total_nr_folios); - bio = bio_alloc(bp->b_target->bt_bdev, nr_pages, op, GFP_NOIO); + bio = bio_alloc(bp->b_target->bt_bdev, nr_folios, op, GFP_NOIO); bio->bi_iter.bi_sector = sector; bio->bi_end_io = xfs_buf_bio_end_io; bio->bi_private = bp; - for (; size && nr_pages; nr_pages--, page_index++) { - int rbytes, nbytes = PAGE_SIZE - offset; + for (; size && nr_folios; nr_folios--, folio_index++) { + struct folio *folio = bp->b_folios[folio_index]; + int nbytes = folio_size(folio) - offset; if (nbytes > size) nbytes = size; - rbytes = bio_add_folio(bio, bp->b_folios[page_index], nbytes, - offset); - if (rbytes < nbytes) + if (!bio_add_folio(bio, folio, nbytes, + offset_in_folio(folio, offset))) break; offset = 0; sector += BTOBB(nbytes); size -= nbytes; - total_nr_pages--; + total_nr_folios--; } if (likely(bio->bi_iter.bi_size)) { @@ -1722,6 +1797,13 @@ xfs_buf_offset( if (bp->b_addr) return bp->b_addr + offset; + /* Single folio buffers may use large folios. */ + if (bp->b_folio_count == 1) { + folio = bp->b_folios[0]; + return folio_address(folio) + offset_in_folio(folio, offset); + } + + /* Multi-folio buffers always use PAGE_SIZE folios */ folio = bp->b_folios[offset >> PAGE_SHIFT]; return folio_address(folio) + (offset & (PAGE_SIZE-1)); } @@ -1737,18 +1819,24 @@ xfs_buf_zero( bend = boff + bsize; while (boff < bend) { struct folio *folio; - int page_index, page_offset, csize; + int folio_index, folio_offset, csize; - page_index = (boff + bp->b_offset) >> PAGE_SHIFT; - page_offset = (boff + bp->b_offset) & ~PAGE_MASK; - folio = bp->b_folios[page_index]; - csize = min_t(size_t, PAGE_SIZE - page_offset, + /* Single folio buffers may use large folios. */ + if (bp->b_folio_count == 1) { + folio = bp->b_folios[0]; + folio_offset = offset_in_folio(folio, + bp->b_offset + boff); + } else { + folio_index = (boff + bp->b_offset) >> PAGE_SHIFT; + folio_offset = (boff + bp->b_offset) & ~PAGE_MASK; + folio = bp->b_folios[folio_index]; + } + + csize = min_t(size_t, folio_size(folio) - folio_offset, BBTOB(bp->b_length) - boff); + ASSERT((csize + folio_offset) <= folio_size(folio)); - ASSERT((csize + page_offset) <= PAGE_SIZE); - - memset(folio_address(folio) + page_offset, 0, csize); - + memset(folio_address(folio) + folio_offset, 0, csize); boff += csize; } }
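As a footnote to the order calculation in xfs_buf_alloc_folio() above, here
is a small userspace model of that rounding (an illustration assuming 4kB
pages, not the kernel code): power-of-2 buffer lengths map directly to a
folio order, while a non-power-of-2 length, such as a remote xattr buffer a
little over 64kB, is rounded up to the next order.

#include <stdio.h>

#define PAGE_SHIFT	12	/* assumes 4kB pages for illustration */

static int ilog2_long(unsigned long v)
{
	int l = -1;

	while (v) {
		v >>= 1;
		l++;
	}
	return l;
}

/* Round a buffer length up to the folio order that can hold it. */
static int buf_len_to_order(unsigned long length)
{
	int order = ilog2_long(length);

	if ((1UL << order) < length)	/* non-power-of-2, e.g. xattr buffers */
		order = ilog2_long(length - 1) + 1;

	if (order <= PAGE_SHIFT)
		return 0;
	return order - PAGE_SHIFT;
}

int main(void)
{
	unsigned long lengths[] = { 4096, 8192, 65536, 68096 };

	for (int i = 0; i < 4; i++) {
		int order = buf_len_to_order(lengths[i]);

		printf("%lu bytes -> order %d (%lu bytes allocated)\n",
		       lengths[i], order, 1UL << (order + PAGE_SHIFT));
	}
	return 0;
}

Running this prints orders 0, 1, 4 and 5 for 4096, 8192, 65536 and 68096
byte buffers respectively; the last case allocates a 128kB folio for a
buffer a little over 64kB, matching the rounding described in the comment
above xfs_buf_alloc_folio().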