From patchwork Tue Nov 17 20:16:14 2015
Subject: [PATCH 4/8] mm, dax: truncate dax mappings at bdev or fs shutdown
From: Dan Williams
To: linux-nvdimm@lists.01.org
Date: Tue, 17 Nov 2015 12:16:14 -0800
Message-ID: <20151117201614.15053.62376.stgit@dwillia2-desk3.jf.intel.com>
In-Reply-To: <20151117201551.15053.32709.stgit@dwillia2-desk3.jf.intel.com>
References:
 <20151117201551.15053.32709.stgit@dwillia2-desk3.jf.intel.com>
User-Agent: StGit/0.17.1-9-g687f
MIME-Version: 1.0
Cc: Dave Chinner, stable@vger.kernel.org, linux-block@vger.kernel.org,
 Jan Kara, linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org

Currently dax mappings survive block_device shutdown.  While page cache
pages are permitted to be read/written after the block_device is torn
down, this is not acceptable in the dax case as all media access must
end when the device is disabled.  The pfn backing a dax mapping is
permitted to be invalidated after bdev shutdown, and this is indeed the
case with brd.

When a dax-capable block_device driver calls del_gendisk() in its
shutdown path, or a filesystem evicts an inode, it needs to ensure that
all the pfns that had been mapped via bdev_direct_access() are unmapped.
This is different from the page-cache-backed case, where
truncate_inode_pages() is sufficient to end I/O to pages mapped to a
dying inode.

Since dax bypasses the page cache we need to unmap in addition to
truncating pages.  Also, since dax mappings are not accounted in the
mapping radix tree, we unconditionally truncate all inodes with the
S_DAX flag.  Likely when we add support for dynamic dax enable/disable
control we'll have the infrastructure to detect whether an inode is
unmapped and can skip the truncate.
Cc: Jan Kara
Cc: Dave Chinner
Cc: Matthew Wilcox
Cc: Ross Zwisler
Signed-off-by: Dan Williams
---
 fs/inode.c    |   27 +++++++++++++++++++++++++++
 mm/truncate.c |   13 +++++++++++--
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 1be5f9003eb3..1029e033e991 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -579,6 +579,18 @@ static void dispose_list(struct list_head *head)
 	}
 }
 
+static void truncate_list(struct list_head *head)
+{
+	struct inode *inode, *_i;
+
+	list_for_each_entry_safe(inode, _i, head, i_lru) {
+		list_del_init(&inode->i_lru);
+		truncate_pagecache(inode, 0);
+		iput(inode);
+		cond_resched();
+	}
+}
+
 /**
  * evict_inodes - evict all evictable inodes for a superblock
  * @sb: superblock to operate on
@@ -642,6 +654,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	int busy = 0;
 	struct inode *inode, *next;
 	LIST_HEAD(dispose);
+	LIST_HEAD(truncate);
 
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
@@ -655,6 +668,19 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 			busy = 1;
 			continue;
 		}
+		if (IS_DAX(inode) && atomic_read(&inode->i_count)) {
+			/*
+			 * dax mappings can't live past this invalidation event
+			 * as there is no page cache present to allow the data
+			 * to remain accessible.
+			 */
+			__iget(inode);
+			inode_lru_list_del(inode);
+			spin_unlock(&inode->i_lock);
+			list_add(&inode->i_lru, &truncate);
+			busy = 1;
+			continue;
+		}
 		if (atomic_read(&inode->i_count)) {
 			spin_unlock(&inode->i_lock);
 			busy = 1;
@@ -669,6 +695,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	spin_unlock(&sb->s_inode_list_lock);
 
 	dispose_list(&dispose);
+	truncate_list(&truncate);
 
 	return busy;
 }
diff --git a/mm/truncate.c b/mm/truncate.c
index 76e35ad97102..ff1fb3b0980e 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -402,6 +402,7 @@ EXPORT_SYMBOL(truncate_inode_pages);
  */
 void truncate_inode_pages_final(struct address_space *mapping)
 {
+	struct inode *inode = mapping->host;
 	unsigned long nrshadows;
 	unsigned long nrpages;
 
@@ -423,7 +424,7 @@ void truncate_inode_pages_final(struct address_space *mapping)
 	smp_rmb();
 	nrshadows = mapping->nrshadows;
 
-	if (nrpages || nrshadows) {
+	if (nrpages || nrshadows || IS_DAX(inode)) {
 		/*
 		 * As truncation uses a lockless tree lookup, cycle
 		 * the tree lock to make sure any ongoing tree
@@ -433,7 +434,15 @@ void truncate_inode_pages_final(struct address_space *mapping)
 		spin_lock_irq(&mapping->tree_lock);
 		spin_unlock_irq(&mapping->tree_lock);
 
-		truncate_inode_pages(mapping, 0);
+		/*
+		 * In the case of DAX we also need to unmap the inode
+		 * since the pfn backing the mapping may be invalidated
+		 * after this returns
+		 */
+		if (IS_DAX(inode))
+			truncate_pagecache(inode, 0);
+		else
+			truncate_inode_pages(mapping, 0);
 	}
 }
 EXPORT_SYMBOL(truncate_inode_pages_final);