From patchwork Tue Sep 29 14:23:47 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mel Gorman
X-Patchwork-Id: 7286541
Date: Tue, 29 Sep 2015 15:23:47 +0100
From: Mel Gorman
To: Andrew Morton
Cc: Jerome Marchand, Trond Myklebust, Anna Schumaker, Hugh Dickins,
 Linux NFS Mailing List, Linux Kernel Mailing List, Linux-MM
Subject: [PATCH] mm: swap: Use swap_lock to prevent parallel swapon
 activations instead of i_mutex
Message-ID: <20150929142347.GK3068@techsingularity.net>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-nfs-owner@vger.kernel.org
X-Mailing-List: linux-nfs@vger.kernel.org

Jerome Marchand reported a lockdep warning as follows:

[ 6819.501009] =================================
[ 6819.501009] [ INFO: inconsistent lock state ]
[ 6819.501009] 4.2.0-rc1-shmacct-babka-v2-next-20150709+ #255 Not tainted
[ 6819.501009] ---------------------------------
[ 6819.501009] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 6819.501009] kswapd0/38 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 6819.501009] (&sb->s_type->i_mutex_key#17){+.+.?.}, at: [] nfs_file_direct_write+0x85/0x3f0 [nfs]
[ 6819.501009] {RECLAIM_FS-ON-W} state was registered at:
[ 6819.501009] [] mark_held_locks+0x71/0x90
[ 6819.501009] [] lockdep_trace_alloc+0x75/0xe0
[ 6819.501009] [] kmem_cache_alloc_node_trace+0x39/0x440
[ 6819.501009] [] __get_vm_area_node+0x7f/0x160
[ 6819.501009] [] __vmalloc_node_range+0x72/0x2c0
[ 6819.501009] [] vzalloc+0x54/0x60
[ 6819.501009] [] SyS_swapon+0x628/0xfc0
[ 6819.501009] [] entry_SYSCALL_64_fastpath+0x12/0x76

This is because NFS has acquired i_mutex since commit a9ab5e840669
("nfs: page cache invalidation for dio") in order to invalidate the page
cache before direct I/O. Filesystems may safely acquire i_mutex during
direct writes but NFS is unique in its treatment of swap files.
Ordinarily swap files are supported by the core VM looking up the
physical block for a given offset in advance. There is no physical block
for NFS and the direct write paths are used after calling
mapping->swap_activate.

The lockdep warning is triggered by swapon(), which is not in reclaim
context, acquiring the i_mutex to ensure a swapfile is not activated
twice.

swapon does not need the i_mutex for this purpose. There is a
requirement that fallocate not be used on swapfiles but this is
protected by the inode flag S_SWAPFILE and has nothing to do with
i_mutex. In fact, the current protection does nothing for block devices.
This patch expands the role of swap_lock to protect against parallel
activations of block devices and swapfiles and removes the use of
i_mutex. This both improves the protection for swapon and avoids the
lockdep warning.

Reported-and-tested-by: Jerome Marchand
Signed-off-by: Mel Gorman
---
 mm/swapfile.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 58877312cf6b..e55a69fd24e4 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1970,9 +1970,9 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 		set_blocksize(bdev, old_block_size);
 		blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
 	} else {
-		mutex_lock(&inode->i_mutex);
+		spin_lock(&swap_lock);
 		inode->i_flags &= ~S_SWAPFILE;
-		mutex_unlock(&inode->i_mutex);
+		spin_unlock(&swap_lock);
 	}
 	filp_close(swap_file, NULL);
 
@@ -2197,7 +2197,6 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode)
 		p->flags |= SWP_BLKDEV;
 	} else if (S_ISREG(inode->i_mode)) {
 		p->bdev = inode->i_sb->s_bdev;
-		mutex_lock(&inode->i_mutex);
 		if (IS_SWAPFILE(inode))
 			return -EBUSY;
 	} else
@@ -2426,12 +2425,15 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		goto bad_swap;
 	}
 
+	/* prevent parallel swapons */
+	spin_lock(&swap_lock);
 	p->swap_file = swap_file;
 	mapping = swap_file->f_mapping;
 	inode = mapping->host;
 
 	/* If S_ISREG(inode->i_mode) will do mutex_lock(&inode->i_mutex); */
 	error = claim_swapfile(p, inode);
+	spin_unlock(&swap_lock);
 	if (unlikely(error))
 		goto bad_swap;
 
@@ -2574,10 +2576,8 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 	vfree(swap_map);
 	vfree(cluster_info);
 	if (swap_file) {
-		if (inode && S_ISREG(inode->i_mode)) {
-			mutex_unlock(&inode->i_mutex);
+		if (inode && S_ISREG(inode->i_mode))
 			inode = NULL;
-		}
 		filp_close(swap_file, NULL);
 	}
 out:
@@ -2587,8 +2587,6 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 	}
 	if (name)
 		putname(name);
-	if (inode && S_ISREG(inode->i_mode))
-		mutex_unlock(&inode->i_mutex);
 	return error;
 }
 
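
For readers less familiar with the pattern the changelog describes, the idea of guarding against double activation with one global lock rather than a per-inode mutex can be sketched in user space. This is only an illustration and not kernel code: struct fake_inode, fake_swap_lock, fake_claim() and fake_release() are hypothetical names, a pthread mutex stands in for the swap_lock spinlock, and the sketch assumes the "already a swapfile" mark is set in the same critical section as the check, which is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; only the flag matters here. */
struct fake_inode {
	bool is_swapfile;	/* plays the role of S_SWAPFILE */
};

/* One global lock in the role of swap_lock (a mutex here, not a spinlock). */
pthread_mutex_t fake_swap_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Test and mark within one critical section, so that of two callers
 * racing on the same inode only one can succeed.
 * Returns 0 on success and -EBUSY if the inode is already claimed.
 */
int fake_claim(struct fake_inode *inode)
{
	int error = 0;

	assert(inode != NULL);
	pthread_mutex_lock(&fake_swap_lock);
	if (inode->is_swapfile)
		error = -EBUSY;
	else
		inode->is_swapfile = true;
	pthread_mutex_unlock(&fake_swap_lock);
	return error;
}

/* Clear the mark under the same lock, mirroring the swapoff hunk. */
void fake_release(struct fake_inode *inode)
{
	assert(inode != NULL);
	pthread_mutex_lock(&fake_swap_lock);
	inode->is_swapfile = false;
	pthread_mutex_unlock(&fake_swap_lock);
}
```

In this sketch a second fake_claim() on the same object fails with -EBUSY until fake_release() has run. Because the lock is global rather than per-inode, the same check would serialize claims on any kind of object, which is the property the changelog relies on for block devices.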