From patchwork Fri Apr 6 06:59:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 10325725 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 765A06038F for ; Fri, 6 Apr 2018 06:59:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5851629493 for ; Fri, 6 Apr 2018 06:59:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4CE6A29497; Fri, 6 Apr 2018 06:59:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A5A6529493 for ; Fri, 6 Apr 2018 06:59:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750815AbeDFG7p (ORCPT ); Fri, 6 Apr 2018 02:59:45 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:34066 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750807AbeDFG7o (ORCPT ); Fri, 6 Apr 2018 02:59:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=In-Reply-To:Content-Type:MIME-Version :References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=vYeY9nQbBCUQZEEUwufEJ9V7zx3HV7mXvx4FhNc5QWQ=; b=MQxcPurHPhKhGjs7Sk5T7YqDQ NetFZbQSv9bD6km8sRrTX+Nnu3kAaVnuyZKOP5RITcrya6q5Q4qJRLuhHdioby9sePPodGGycqhTa nfpRgDoVpCbfGnmKaI8o1xL1CVnEqP6Nbiv+WwyzNujlJ2MGG8U6dgjtVos9YC5K5qqp0bcPLkxAZ EMEz0ix8xNrPqh4h6oe9ydG7HiO1rMtLYzL1vq+9GECih8lqwhTmU4JiDZx25CU4/csjzB92NpcdZ XU2lxjkBNt8qArDvcKqXXyML/pllHoGbGkwTibkThneCsdjqXX66TxOwJyx/i5DFPYSCjkIesCZXa ddg2ZTyCw==; Received: from hch by bombadil.infradead.org with local (Exim 4.90_1 #2 (Red Hat Linux)) id 1f4LLc-0004Fo-Fz; Fri, 06 Apr 2018 06:59:44 +0000 Date: Thu, 5 Apr 2018 23:59:44 -0700 From: Christoph Hellwig To: Brian Foster Cc: Christoph Hellwig , linux-xfs@vger.kernel.org Subject: Re: [PATCH 2/2] xfs: replace filestream item xfs_inode reference with xfs_mount Message-ID: <20180406065944.GA12740@infradead.org> References: <20180405121147.60897-1-bfoster@redhat.com> <20180405121147.60897-2-bfoster@redhat.com> <20180405171842.GA4849@infradead.org> <20180405181300.GB417@bfoster.bfoster> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20180405181300.GB417@bfoster.bfoster> User-Agent: Mutt/1.9.2 (2017-12-15) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Thu, Apr 05, 2018 at 02:13:01PM -0400, Brian Foster wrote: > On Thu, Apr 05, 2018 at 10:18:42AM -0700, Christoph Hellwig wrote: > > On Thu, Apr 05, 2018 at 08:11:47AM -0400, Brian Foster wrote: > > > The obvious solution is to grab an inode reference for > > > xfs_fstrm_item. The filestream mechanism only actually uses the > > > inode pointer as a means to access the xfs_mount, however. Rather > > > than add unnecessary complexity, simplify the implementation to > > > store an xfs_mount pointer instead of the inode. This also requires > > > updates to the tracepoint class to provide the associated data via > > > parameters rather than the inode and a minor hack to peek at the MRU > > > key to establish the inode number at free time. > > > > How about not replacing it all and just providing the context from > > the mru cache? Something like the untested patch below: > > > > I actually considered something like this briefly but it initially > looked more involved than what you've come up with here. I think that I > just didn't consider storing the mp directly in the xfs_mru_cache so it > would be available to the reaper context. With that, this looks nicer to > me. I'll give it a whirl.. thanks! FYI, here is a version with a proper changelog: Reviewed-by: Brian Foster --- From ef8248a6cf39f2fbf69b2590c4cc66c9a2ba57fd Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Thu, 5 Apr 2018 19:17:51 +0200 Subject: xfs: remove filestream item xfs_inode reference The filestreams allocator stores an xfs_fstrm_item structure in the MRU to cache inode number to agno mappings for a particular length of time. Each xfs_fstrm_item contains the internal MRU structure, an inode pointer and agno value. The inode pointer stored in the xfs_fstrm_item is not referenced, however, which means the inode itself can be removed and reclaimed before the MRU item is freed. If this occurs, xfs_fstrm_free_func() can access freed or unrelated memory through xfs_fstrm_item->ip and crash. The obvious solution is to grab an inode reference for xfs_fstrm_item. The filestream mechanism only actually uses the inode pointer as a means to access the xfs_mount, however. Rather than add unnecessary complexity, simplify the implementation to store an xfs_mount pointer in struct xfs_mru_cache, and pass it to the free callback. This also requires updates to the tracepoint class to provide the associated data via parameters rather than the inode and a minor hack to peek at the MRU key to establish the inode number at free time. Based on debugging work and an earlier patch from Brian Foster, who also wrote most of this changelog. Reported-by: Brian Foster Signed-off-by: Christoph Hellwig --- fs/xfs/xfs_filestream.c | 19 +++++++++---------- fs/xfs/xfs_mru_cache.c | 8 +++++--- fs/xfs/xfs_mru_cache.h | 8 ++++---- fs/xfs/xfs_trace.h | 14 +++++++------- 4 files changed, 25 insertions(+), 24 deletions(-) diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c index 043ca3808ea2..5131a6e25fc9 100644 --- a/fs/xfs/xfs_filestream.c +++ b/fs/xfs/xfs_filestream.c @@ -34,7 +34,6 @@ struct xfs_fstrm_item { struct xfs_mru_cache_elem mru; - struct xfs_inode *ip; xfs_agnumber_t ag; /* AG in use for this directory */ }; @@ -122,14 +121,15 @@ xfs_filestream_put_ag( static void xfs_fstrm_free_func( + void *data, struct xfs_mru_cache_elem *mru) { + struct xfs_mount *mp = data; struct xfs_fstrm_item *item = container_of(mru, struct xfs_fstrm_item, mru); - xfs_filestream_put_ag(item->ip->i_mount, item->ag); - - trace_xfs_filestream_free(item->ip, item->ag); + xfs_filestream_put_ag(mp, item->ag); + trace_xfs_filestream_free(mp, mru->key, item->ag); kmem_free(item); } @@ -165,7 +165,7 @@ xfs_filestream_pick_ag( trylock = XFS_ALLOC_FLAG_TRYLOCK; for (nscan = 0; 1; nscan++) { - trace_xfs_filestream_scan(ip, ag); + trace_xfs_filestream_scan(mp, ip->i_ino, ag); pag = xfs_perag_get(mp, ag); @@ -265,7 +265,6 @@ xfs_filestream_pick_ag( goto out_put_ag; item->ag = *agp; - item->ip = ip; err = xfs_mru_cache_insert(mp->m_filestream, ip->i_ino, &item->mru); if (err) { @@ -333,7 +332,7 @@ xfs_filestream_lookup_ag( ag = container_of(mru, struct xfs_fstrm_item, mru)->ag; xfs_mru_cache_done(mp->m_filestream); - trace_xfs_filestream_lookup(ip, ag); + trace_xfs_filestream_lookup(mp, ip->i_ino, ag); goto out; } @@ -399,7 +398,7 @@ xfs_filestream_new_ag( * Only free the item here so we skip over the old AG earlier. */ if (mru) - xfs_fstrm_free_func(mru); + xfs_fstrm_free_func(mp, mru); IRELE(pip); exit: @@ -426,8 +425,8 @@ xfs_filestream_mount( * timer tunable to within about 10 percent. This requires at least 10 * groups. */ - return xfs_mru_cache_create(&mp->m_filestream, xfs_fstrm_centisecs * 10, - 10, xfs_fstrm_free_func); + return xfs_mru_cache_create(&mp->m_filestream, mp, + xfs_fstrm_centisecs * 10, 10, xfs_fstrm_free_func); } void diff --git a/fs/xfs/xfs_mru_cache.c b/fs/xfs/xfs_mru_cache.c index f8a674d7f092..70eea7ae2876 100644 --- a/fs/xfs/xfs_mru_cache.c +++ b/fs/xfs/xfs_mru_cache.c @@ -112,6 +112,7 @@ struct xfs_mru_cache { xfs_mru_cache_free_func_t free_func; /* Function pointer for freeing. */ struct delayed_work work; /* Workqueue data for reaping. */ unsigned int queued; /* work has been queued */ + void *data; }; static struct workqueue_struct *xfs_mru_reap_wq; @@ -259,7 +260,7 @@ _xfs_mru_cache_clear_reap_list( list_for_each_entry_safe(elem, next, &tmp, list_node) { list_del_init(&elem->list_node); - mru->free_func(elem); + mru->free_func(mru->data, elem); } spin_lock(&mru->lock); @@ -326,6 +327,7 @@ xfs_mru_cache_uninit(void) int xfs_mru_cache_create( struct xfs_mru_cache **mrup, + void *data, unsigned int lifetime_ms, unsigned int grp_count, xfs_mru_cache_free_func_t free_func) @@ -369,7 +371,7 @@ xfs_mru_cache_create( mru->grp_time = grp_time; mru->free_func = free_func; - + mru->data = data; *mrup = mru; exit: @@ -492,7 +494,7 @@ xfs_mru_cache_delete( elem = xfs_mru_cache_remove(mru, key); if (elem) - mru->free_func(elem); + mru->free_func(mru->data, elem); } /* diff --git a/fs/xfs/xfs_mru_cache.h b/fs/xfs/xfs_mru_cache.h index fb5245ba5ff7..b3f3fbdfcc47 100644 --- a/fs/xfs/xfs_mru_cache.h +++ b/fs/xfs/xfs_mru_cache.h @@ -26,13 +26,13 @@ struct xfs_mru_cache_elem { }; /* Function pointer type for callback to free a client's data pointer. */ -typedef void (*xfs_mru_cache_free_func_t)(struct xfs_mru_cache_elem *elem); +typedef void (*xfs_mru_cache_free_func_t)(void *, struct xfs_mru_cache_elem *); int xfs_mru_cache_init(void); void xfs_mru_cache_uninit(void); -int xfs_mru_cache_create(struct xfs_mru_cache **mrup, unsigned int lifetime_ms, - unsigned int grp_count, - xfs_mru_cache_free_func_t free_func); +int xfs_mru_cache_create(struct xfs_mru_cache **mrup, void *data, + unsigned int lifetime_ms, unsigned int grp_count, + xfs_mru_cache_free_func_t free_func); void xfs_mru_cache_destroy(struct xfs_mru_cache *mru); int xfs_mru_cache_insert(struct xfs_mru_cache *mru, unsigned long key, struct xfs_mru_cache_elem *elem); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 945de08af7ba..19f4e30265c2 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -506,8 +506,8 @@ DEFINE_BUF_ITEM_EVENT(xfs_trans_bhold_release); DEFINE_BUF_ITEM_EVENT(xfs_trans_binval); DECLARE_EVENT_CLASS(xfs_filestream_class, - TP_PROTO(struct xfs_inode *ip, xfs_agnumber_t agno), - TP_ARGS(ip, agno), + TP_PROTO(struct xfs_mount *mp, xfs_ino_t ino, xfs_agnumber_t agno), + TP_ARGS(mp, ino, agno), TP_STRUCT__entry( __field(dev_t, dev) __field(xfs_ino_t, ino) @@ -515,10 +515,10 @@ DECLARE_EVENT_CLASS(xfs_filestream_class, __field(int, streams) ), TP_fast_assign( - __entry->dev = VFS_I(ip)->i_sb->s_dev; - __entry->ino = ip->i_ino; + __entry->dev = mp->m_super->s_dev; + __entry->ino = ino; __entry->agno = agno; - __entry->streams = xfs_filestream_peek_ag(ip->i_mount, agno); + __entry->streams = xfs_filestream_peek_ag(mp, agno); ), TP_printk("dev %d:%d ino 0x%llx agno %u streams %d", MAJOR(__entry->dev), MINOR(__entry->dev), @@ -528,8 +528,8 @@ DECLARE_EVENT_CLASS(xfs_filestream_class, ) #define DEFINE_FILESTREAM_EVENT(name) \ DEFINE_EVENT(xfs_filestream_class, name, \ - TP_PROTO(struct xfs_inode *ip, xfs_agnumber_t agno), \ - TP_ARGS(ip, agno)) + TP_PROTO(struct xfs_mount *mp, xfs_ino_t ino, xfs_agnumber_t agno), \ + TP_ARGS(mp, ino, agno)) DEFINE_FILESTREAM_EVENT(xfs_filestream_free); DEFINE_FILESTREAM_EVENT(xfs_filestream_lookup); DEFINE_FILESTREAM_EVENT(xfs_filestream_scan);