From patchwork Thu Aug 24 23:50:09 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 9921113 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 72EE260353 for ; Thu, 24 Aug 2017 23:50:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 63A7C20090 for ; Thu, 24 Aug 2017 23:50:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5880220144; Thu, 24 Aug 2017 23:50:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DF6161FFEB for ; Thu, 24 Aug 2017 23:50:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753880AbdHXXuT (ORCPT ); Thu, 24 Aug 2017 19:50:19 -0400 Received: from mail-pg0-f41.google.com ([74.125.83.41]:38891 "EHLO mail-pg0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753428AbdHXXuS (ORCPT ); Thu, 24 Aug 2017 19:50:18 -0400 Received: by mail-pg0-f41.google.com with SMTP id b8so5219243pgn.5 for ; Thu, 24 Aug 2017 16:50:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id; bh=uZtgid8MrLs0kkmedbkARwHO2uDwjACLyuz234lrNJ8=; b=I9kYQ4yt/dv1DhUvfOl2xenQbExOOKHMfj2da8TTxGST9ZcAkJjybqJ/XXUY24krt6 BlCc9rhqkQnWIgnpCnxUILKdry3HYIg0psw2lCfQF1x8awm/W+lsk5E6cMHYV6XwzG5+ 0hWUeVCqrCnYMmdcu5pVvxRPHmwI9DnQG7fZ3TuGj4XD7u5TaZM7sonAZZIutMxc7u4D GnJ1QWCt51jVifscKPYKMqWxWDhx8+q9COjt6Z1IDvVgAVV2VlBgx7erkHrfCTKIV/Di /IKga+wbsTK8tTbk5Bm5ekHUqsqL4j7/f6Mm8EYPVskSAD+6urHgHwCA2k//e+MZ6LL6 d3Og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=uZtgid8MrLs0kkmedbkARwHO2uDwjACLyuz234lrNJ8=; b=eKUNRTCDumAEBz3s12VVSlEsaXE9GLSwlJwD0TonlBrGBIemEYa9eXlYEvSH5to5gS UMZRSSWvb4uFAyIPtka83FAEX6LuMn1orTSjAjIjHQann/nMimHt9NVSN4vbV9QgBjAl oiwFfPOGgNMTjv5bxeaYkeK6XAov627WPOWNK/ILGcHZpouQIBjMLabGvonrKeY+1j5Q gd0adXVSoLaIEQoFkTHoSmIgt6O53ZLxnugtuSPbFDR3DkSUgEgqnxoJNAkneZM7AwJd rA52FJPMumPs+HlsyhQg6oolpka+icUSo2iDmtfXmdx4uTNTNh7ufg84LIZ8pYbzrRyX VB2w== X-Gm-Message-State: AHYfb5g+IFkSD8UyjQIK4bz5OjKRN+toOaLOZ9UIbTCiji4FM1kdLtn2 HUKHraL7Po2qmKgVA4t57w== X-Received: by 10.84.168.131 with SMTP id f3mr8887647plb.350.1503618617598; Thu, 24 Aug 2017 16:50:17 -0700 (PDT) Received: from vader.thefacebook.com ([2620:10d:c090:200::7:72f7]) by smtp.gmail.com with ESMTPSA id b65sm8574015pfa.53.2017.08.24.16.50.16 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 24 Aug 2017 16:50:17 -0700 (PDT) From: Omar Sandoval To: linux-xfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH] xfs: check for race with xfs_reclaim_inode() in xfs_ifree_cluster() Date: Thu, 24 Aug 2017 16:50:09 -0700 Message-Id: <93321e051fb5f80c4727844748cca04604a8757a.1503618336.git.osandov@fb.com> X-Mailer: git-send-email 2.14.1 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Omar Sandoval After xfs_ifree_cluster() finds an inode in the radix tree and verifies that the inode number is what it expected, xfs_reclaim_inode() can swoop in and free it. xfs_ifree_cluster() will then happily continue working on the freed inode. Most importantly, it will mark the inode stale, which will probably be overwritten when the inode slab object is reallocated, but if it has already been reallocated then we can end up with an inode spuriously marked stale. In 8a17d7ddedb4 ("xfs: mark reclaimed inodes invalid earlier") we added a second check to xfs_iflush_cluster() to detect this race, but the similar RCU lookup in xfs_ifree_cluster() needs the same treatment. Signed-off-by: Omar Sandoval Reviewed-by: Brian Foster --- Based on v4.13-rc6. Let me know if my reasoning here is completely off, the XFS inode machinery makes my head spin. fs/xfs/xfs_icache.c | 10 +++++----- fs/xfs/xfs_inode.c | 23 ++++++++++++++++++----- 2 files changed, 23 insertions(+), 10 deletions(-) diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index 0a9e6985a0d0..34227115a5d6 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -1124,11 +1124,11 @@ xfs_reclaim_inode( * Because we use RCU freeing we need to ensure the inode always appears * to be reclaimed with an invalid inode number when in the free state. * We do this as early as possible under the ILOCK so that - * xfs_iflush_cluster() can be guaranteed to detect races with us here. - * By doing this, we guarantee that once xfs_iflush_cluster has locked - * XFS_ILOCK that it will see either a valid, flushable inode that will - * serialise correctly, or it will see a clean (and invalid) inode that - * it can skip. + * xfs_iflush_cluster() and xfs_ifree_cluster() can be guaranteed to + * detect races with us here. By doing this, we guarantee that once + * xfs_iflush_cluster() or xfs_ifree_cluster() has locked XFS_ILOCK that + * it will see either a valid inode that will serialise correctly, or it + * will see an invalid inode that it can skip. */ spin_lock(&ip->i_flags_lock); ip->i_flags = XFS_IRECLAIM; diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index ff48f0096810..97045e8dfed5 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2359,11 +2359,24 @@ xfs_ifree_cluster( * already marked stale. If we can't lock it, back off * and retry. */ - if (ip != free_ip && - !xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) { - rcu_read_unlock(); - delay(1); - goto retry; + if (ip != free_ip) { + if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL)) { + rcu_read_unlock(); + delay(1); + goto retry; + } + + /* + * Check the inode number again in case we're + * racing with freeing in xfs_reclaim_inode(). + * See the comments in that function for more + * information as to why the initial check is + * not sufficient. + */ + if (ip->i_ino != inum + i) { + xfs_iunlock(ip, XFS_ILOCK_EXCL); + continue; + } } rcu_read_unlock();