From patchwork Tue Jan  3 21:36:05 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ross Zwisler
X-Patchwork-Id: 9495731
From: Ross Zwisler
To: Xiong Zhou, stable@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] dax: fix deadlock with DAX 4k holes
Date: Tue, 3 Jan 2017 14:36:05 -0700
Message-Id: <1483479365-13607-1-git-send-email-ross.zwisler@linux.intel.com>
In-Reply-To: <20161027112230.wsumgs62fqdxt3sc@xzhoul.usersys.redhat.com>
References: <20161027112230.wsumgs62fqdxt3sc@xzhoul.usersys.redhat.com>
Cc: Jan Kara, Andrew Morton, linux-mm@kvack.org, linux-nvdimm@lists.01.org,
 Dave Chinner, Christoph Hellwig, Dave Hansen

Currently in DAX if we have three read faults on the same hole address we
can end up with the following:

Thread 0                Thread 1                Thread 2
--------                --------                --------
dax_iomap_fault
 grab_mapping_entry
  lock_slot

                        dax_iomap_fault
                         grab_mapping_entry
                          get_unlocked_mapping_entry

                                                dax_iomap_fault
                                                 grab_mapping_entry
                                                  get_unlocked_mapping_entry
  dax_load_hole
   find_or_create_page
   ...
    page_cache_tree_insert
     dax_wake_mapping_entry_waiter
     __radix_tree_replace

                        get_page
                        lock_page
                        ...
                        put_locked_mapping_entry
                        unlock_page
                        put_page

The crux of the problem is that once we insert a 4k zero page, all locking
from then on is done in terms of that 4k zero page and any additional
threads sleeping on the empty DAX entry will never be woken.  Fix this by
waking all sleepers when we replace the DAX radix tree entry with a 4k zero
page.  This will allow all sleeping threads to successfully transition from
locking based on the DAX empty entry to locking on the 4k zero page.

With the test case reported by Xiong this happens very regularly in my test
setup, with some runs resulting in 9+ threads in this deadlocked state.
With this fix I've been able to run that same test dozens of times in a
loop without issue.

Signed-off-by: Ross Zwisler
Reported-by: Xiong Zhou
Fixes: commit ac401cc78242 ("dax: New fault locking")
Cc: Jan Kara
Cc: stable@vger.kernel.org # 4.7+
Reviewed-by: Jan Kara
---

This issue exists as far back as v4.7, and I was easily able to reproduce
it with v4.7 using the same test.

Unfortunately this patch won't apply cleanly to the stable trees, but the
change is very simple and should be easy to replicate by hand.  Please ping
me if you'd like patches that apply cleanly to the v4.9 and v4.8.15 trees.

---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index d0e4d10..b772a33 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -138,7 +138,7 @@ static int page_cache_tree_insert(struct address_space *mapping,
 				dax_radix_locked_entry(0, RADIX_DAX_EMPTY));
 			/* Wakeup waiters for exceptional entry lock */
 			dax_wake_mapping_entry_waiter(mapping, page->index, p,
-						      false);
+						      true);
 		}
 	}
 	__radix_tree_replace(&mapping->page_tree, node, slot, page,
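
To illustrate why the one-word change matters, here is a minimal userspace
sketch of the wake-one versus wake-all behaviour described in the commit
message.  It is a toy model only, not kernel code: struct toy_waitqueue,
toy_sleep_on(), toy_wake() and toy_stranded_after_replace() are hypothetical
names invented for this example, and the queue is a plain array rather than
a real kernel wait queue.  The one assumption it takes from the description
is that nothing wakes the old queue again once the empty DAX entry has been
replaced by the 4k zero page.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 16

/* Hypothetical stand-in for the sleepers waiting on one empty DAX entry. */
struct toy_waitqueue {
	int sleepers[TOY_MAX_WAITERS];	/* thread ids in arrival order */
	size_t nr_sleepers;
};

/* Record that thread 'tid' went to sleep on the entry. */
static bool toy_sleep_on(struct toy_waitqueue *wq, int tid)
{
	if (wq->nr_sleepers == TOY_MAX_WAITERS)
		return false;
	wq->sleepers[wq->nr_sleepers++] = tid;
	return true;
}

/* Wake the oldest sleeper, or every sleeper if wake_all is set. */
static size_t toy_wake(struct toy_waitqueue *wq, bool wake_all)
{
	size_t woken, i;

	if (wq->nr_sleepers == 0)
		return 0;

	woken = wake_all ? wq->nr_sleepers : 1;

	/* Shift the sleepers that were not woken to the front of the queue. */
	for (i = woken; i < wq->nr_sleepers; i++)
		wq->sleepers[i - woken] = wq->sleepers[i];
	wq->nr_sleepers -= woken;

	return woken;
}

/*
 * Model the hole fault: nr_waiters threads sleep on the empty entry, then
 * the lock holder replaces the entry with a page and issues a single wake
 * call.  Assumption: after the replacement all locking uses the page, so
 * this queue is never woken again and whoever remains is stuck.
 */
static size_t toy_stranded_after_replace(size_t nr_waiters, bool wake_all)
{
	struct toy_waitqueue wq = { .nr_sleepers = 0 };
	size_t i;

	for (i = 0; i < nr_waiters; i++)
		toy_sleep_on(&wq, (int)i + 1);

	toy_wake(&wq, wake_all);

	return wq.nr_sleepers;
}
```

In this model, toy_stranded_after_replace(2, false) leaves one thread
stranded, matching Thread 2 in the scenario above, while
toy_stranded_after_replace(2, true) leaves none.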