From patchwork Sun Oct 8 18:20:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 13412775 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0517AE95A67 for ; Sun, 8 Oct 2023 18:27:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233002AbjJHS1K (ORCPT ); Sun, 8 Oct 2023 14:27:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56654 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229945AbjJHS1J (ORCPT ); Sun, 8 Oct 2023 14:27:09 -0400 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8F10CA3 for ; Sun, 8 Oct 2023 11:27:08 -0700 (PDT) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AB062C433C7; Sun, 8 Oct 2023 18:27:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1696789628; bh=APcOMhpicMZPBbYa1bKVVkvniVJI4v0PiTNBfKOmqls=; h=From:To:Cc:Subject:Date:From; b=oVsaP67mlOU5RVNsTptGUpOXV+T+hjTnjxb0/rlBpwgxFu1/w8a0kFYLFrOcJYU5H FRIFvEgQvNd+yZ0M/Uq6b42hm54vg9V3ma97YwK31FyO9b2YYmYq8Pg9GX7kUeB8vO ObWUREGXL0+DEmwcTtzeaiUF2GpKBrCdvc0JDiIfJdPtY+xMPaWYW3ebNQO10LHDmB oPqIdU2rwxafJpm3+iK2f3oi5xS1etFqDoO7n2IHsWvOZnsi50AuidiMgc4YAqyvZ6 sFvfma6K3jRHxXKrbDjlX5CwNbY0p2DadphKHdl9D9hKe/Tp9LdRNnnoc4oEAa5YoZ UcLnImAL+Gr6A== From: trondmy@kernel.org To: Anna Schumaker Cc: linux-nfs@vger.kernel.org Subject: [PATCH] pNFS: Fix a hang in nfs4_evict_inode() Date: Sun, 8 Oct 2023 14:20:19 -0400 Message-ID: <20231008182019.12842-1-trondmy@kernel.org> X-Mailer: git-send-email 2.41.0 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org From: Trond Myklebust We are not allowed to call pnfs_mark_matching_lsegs_return() without also holding a reference to the layout header, since doing so could lead to the reference count going to zero when we call pnfs_layout_remove_lseg(). This again can lead to a hang when we get to nfs4_evict_inode() and are unable to clear the layout pointer. pnfs_layout_return_unused_byserver() is guilty of this behaviour, and has been seen to trigger the refcount warning prior to a hang. Fixes: b6d49ecd1081 ("NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust --- fs/nfs/pnfs.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c index 63904a372b2f..21a365357629 100644 --- a/fs/nfs/pnfs.c +++ b/fs/nfs/pnfs.c @@ -2638,31 +2638,44 @@ pnfs_should_return_unused_layout(struct pnfs_layout_hdr *lo, return mode == 0; } -static int -pnfs_layout_return_unused_byserver(struct nfs_server *server, void *data) +static int pnfs_layout_return_unused_byserver(struct nfs_server *server, + void *data) { const struct pnfs_layout_range *range = data; + const struct cred *cred; struct pnfs_layout_hdr *lo; struct inode *inode; + nfs4_stateid stateid; + enum pnfs_iomode iomode; + restart: rcu_read_lock(); list_for_each_entry_rcu(lo, &server->layouts, plh_layouts) { - if (!pnfs_layout_can_be_returned(lo) || + inode = lo->plh_inode; + if (!inode || !pnfs_layout_can_be_returned(lo) || test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags)) continue; - inode = lo->plh_inode; spin_lock(&inode->i_lock); - if (!pnfs_should_return_unused_layout(lo, range)) { + if (!lo->plh_inode || + !pnfs_should_return_unused_layout(lo, range)) { spin_unlock(&inode->i_lock); continue; } + pnfs_get_layout_hdr(lo); + pnfs_set_plh_return_info(lo, range->iomode, 0); + if (pnfs_mark_matching_lsegs_return(lo, &lo->plh_return_segs, + range, 0) != 0 || + !pnfs_prepare_layoutreturn(lo, &stateid, &cred, &iomode)) { + spin_unlock(&inode->i_lock); + rcu_read_unlock(); + pnfs_put_layout_hdr(lo); + cond_resched(); + goto restart; + } spin_unlock(&inode->i_lock); - inode = pnfs_grab_inode_layout_hdr(lo); - if (!inode) - continue; rcu_read_unlock(); - pnfs_mark_layout_for_return(inode, range); - iput(inode); + pnfs_send_layoutreturn(lo, &stateid, &cred, iomode, false); + pnfs_put_layout_hdr(lo); cond_resched(); goto restart; }