From patchwork Mon Apr 19 21:36:34 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Goyal
X-Patchwork-Id: 12212745
From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, dan.j.williams@intel.com, jack@suse.cz,
 willy@infradead.org
Cc: virtio-fs@redhat.com, slp@redhat.com, miklos@szeredi.hu,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, vgoyal@redhat.com
Subject: [PATCH v3 1/3] dax: Add an enum for specifying dax wakeup mode
Date: Mon, 19 Apr 2021 17:36:34 -0400
Message-Id: <20210419213636.1514816-2-vgoyal@redhat.com>
In-Reply-To: <20210419213636.1514816-1-vgoyal@redhat.com>
References: <20210419213636.1514816-1-vgoyal@redhat.com>

Dan mentioned that he is not very fond of passing around a boolean
true/false to specify whether only the next waiter or all waiters should
be woken up. He instead prefers that we introduce an enum and make the
choice explicit at the callsite itself. Easier to read code.

This patch should not introduce any change of behavior.

Suggested-by: Dan Williams
Signed-off-by: Vivek Goyal
Reviewed-by: Greg Kurz
Reviewed-by: Jan Kara
---
 fs/dax.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index b3d27fdc6775..00978d0838b1 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -144,6 +144,16 @@ struct wait_exceptional_entry_queue {
 	struct exceptional_entry_key key;
 };
 
+/**
+ * enum dax_entry_wake_mode: waitqueue wakeup toggle
+ * @WAKE_NEXT: entry was not mutated
+ * @WAKE_ALL: entry was invalidated, or resized
+ */
+enum dax_entry_wake_mode {
+	WAKE_NEXT,
+	WAKE_ALL,
+};
+
 static wait_queue_head_t *dax_entry_waitqueue(struct xa_state *xas,
 		void *entry, struct exceptional_entry_key *key)
 {
@@ -182,7 +192,8 @@ static int wake_exceptional_entry_func(wait_queue_entry_t *wait,
  * The important information it's conveying is whether the entry at
  * this index used to be a PMD entry.
  */
-static void dax_wake_entry(struct xa_state *xas, void *entry, bool wake_all)
+static void dax_wake_entry(struct xa_state *xas, void *entry,
+			   enum dax_entry_wake_mode mode)
 {
 	struct exceptional_entry_key key;
 	wait_queue_head_t *wq;
@@ -196,7 +207,7 @@ static void dax_wake_entry(struct xa_state *xas, void *entry, bool wake_all)
 	 * must be in the waitqueue and the following check will see them.
 	 */
 	if (waitqueue_active(wq))
-		__wake_up(wq, TASK_NORMAL, wake_all ? 0 : 1, &key);
+		__wake_up(wq, TASK_NORMAL, mode == WAKE_ALL ? 0 : 1, &key);
 }
 
 /*
@@ -268,7 +279,7 @@ static void put_unlocked_entry(struct xa_state *xas, void *entry)
 {
 	/* If we were the only waiter woken, wake the next one */
 	if (entry && !dax_is_conflict(entry))
-		dax_wake_entry(xas, entry, false);
+		dax_wake_entry(xas, entry, WAKE_NEXT);
 }
 
 /*
@@ -286,7 +297,7 @@ static void dax_unlock_entry(struct xa_state *xas, void *entry)
 	old = xas_store(xas, entry);
 	xas_unlock_irq(xas);
 	BUG_ON(!dax_is_locked(old));
-	dax_wake_entry(xas, entry, false);
+	dax_wake_entry(xas, entry, WAKE_NEXT);
 }
 
 /*
@@ -524,7 +535,7 @@ static void *grab_mapping_entry(struct xa_state *xas,
 
 		dax_disassociate_entry(entry, mapping, false);
 		xas_store(xas, NULL);	/* undo the PMD join */
-		dax_wake_entry(xas, entry, true);
+		dax_wake_entry(xas, entry, WAKE_ALL);
 		mapping->nrexceptional--;
 		entry = NULL;
 		xas_set(xas, index);
@@ -937,7 +948,7 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
 	xas_lock_irq(xas);
 	xas_store(xas, entry);
 	xas_clear_mark(xas, PAGECACHE_TAG_DIRTY);
-	dax_wake_entry(xas, entry, false);
+	dax_wake_entry(xas, entry, WAKE_NEXT);
 
 	trace_dax_writeback_one(mapping->host, index, count);
 	return ret;

From patchwork Mon Apr 19 21:36:35 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Goyal
X-Patchwork-Id: 12212747
From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, dan.j.williams@intel.com, jack@suse.cz,
 willy@infradead.org
Cc: virtio-fs@redhat.com, slp@redhat.com, miklos@szeredi.hu,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, vgoyal@redhat.com
Subject: [PATCH v3 2/3] dax: Add a wakeup mode parameter to put_unlocked_entry()
Date: Mon, 19 Apr 2021 17:36:35 -0400
Message-Id: <20210419213636.1514816-3-vgoyal@redhat.com>
In-Reply-To: <20210419213636.1514816-1-vgoyal@redhat.com>
References: <20210419213636.1514816-1-vgoyal@redhat.com>

As of now put_unlocked_entry() always wakes up the next waiter. In the
next patch we want to wake up all waiters at one callsite. Hence, add a
wakeup mode parameter to the function.

This patch does not introduce any change of behavior.

Suggested-by: Dan Williams
Signed-off-by: Vivek Goyal
Reviewed-by: Greg Kurz
Reviewed-by: Jan Kara
---
 fs/dax.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 00978d0838b1..f19d76a6a493 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -275,11 +275,12 @@ static void wait_entry_unlocked(struct xa_state *xas, void *entry)
 	finish_wait(wq, &ewait.wait);
 }
 
-static void put_unlocked_entry(struct xa_state *xas, void *entry)
+static void put_unlocked_entry(struct xa_state *xas, void *entry,
+			       enum dax_entry_wake_mode mode)
 {
 	/* If we were the only waiter woken, wake the next one */
 	if (entry && !dax_is_conflict(entry))
-		dax_wake_entry(xas, entry, WAKE_NEXT);
+		dax_wake_entry(xas, entry, mode);
 }
 
 /*
@@ -633,7 +634,7 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping,
 		entry = get_unlocked_entry(&xas, 0);
 		if (entry)
 			page = dax_busy_page(entry);
-		put_unlocked_entry(&xas, entry);
+		put_unlocked_entry(&xas, entry, WAKE_NEXT);
 		if (page)
 			break;
 		if (++scanned % XA_CHECK_SCHED)
@@ -675,7 +676,7 @@ static int __dax_invalidate_entry(struct address_space *mapping,
 	mapping->nrexceptional--;
 	ret = 1;
 out:
-	put_unlocked_entry(&xas, entry);
+	put_unlocked_entry(&xas, entry, WAKE_NEXT);
 	xas_unlock_irq(&xas);
 	return ret;
 }
@@ -954,7 +955,7 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
 	return ret;
 
  put_unlocked:
-	put_unlocked_entry(xas, entry);
+	put_unlocked_entry(xas, entry, WAKE_NEXT);
 	return ret;
 }
 
@@ -1695,7 +1696,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order)
 	/* Did we race with someone splitting entry or so? */
 	if (!entry || dax_is_conflict(entry) ||
 	    (order == 0 && !dax_is_pte_entry(entry))) {
-		put_unlocked_entry(&xas, entry);
+		put_unlocked_entry(&xas, entry, WAKE_NEXT);
 		xas_unlock_irq(&xas);
 		trace_dax_insert_pfn_mkwrite_no_entry(mapping->host, vmf,
 						      VM_FAULT_NOPAGE);

From patchwork Mon Apr 19 21:36:36 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Goyal
X-Patchwork-Id: 12212751
From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, dan.j.williams@intel.com, jack@suse.cz,
 willy@infradead.org
Cc: virtio-fs@redhat.com, slp@redhat.com, miklos@szeredi.hu,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, vgoyal@redhat.com
Subject: [PATCH v3 3/3] dax: Wake up all waiters after invalidating dax entry
Date: Mon, 19 Apr 2021 17:36:36 -0400
Message-Id: <20210419213636.1514816-4-vgoyal@redhat.com>
In-Reply-To: <20210419213636.1514816-1-vgoyal@redhat.com>
References: <20210419213636.1514816-1-vgoyal@redhat.com>

I am seeing missed wakeups which ultimately lead to a deadlock when I am
using virtiofs with DAX enabled and running "make -j". I had to mount
virtiofs as rootfs and also reduce the dax window size to 256M to
reproduce the problem consistently.

So here is the problem. put_unlocked_entry() wakes up waiters only if
entry is not null as well as !dax_is_conflict(entry). But if I call
multiple instances of invalidate_inode_pages2() in parallel, then I can
run into a situation where there are waiters on this index but nobody
will wake them.

invalidate_inode_pages2()
  invalidate_inode_pages2_range()
    invalidate_exceptional_entry2()
      dax_invalidate_mapping_entry_sync()
        __dax_invalidate_entry() {
                xas_lock_irq(&xas);
                entry = get_unlocked_entry(&xas, 0);
                ...
                ...
                dax_disassociate_entry(entry, mapping, trunc);
                xas_store(&xas, NULL);
                ...
                ...
                put_unlocked_entry(&xas, entry);
                xas_unlock_irq(&xas);
        }

Say a fault is in progress and it has locked the entry at offset "0x1c".
Now say three instances of invalidate_inode_pages2() are in progress
(A, B, C) and they all try to invalidate the entry at offset "0x1c".
Given the dax entry is locked, all three instances A, B, C will wait in
the wait queue.

When the dax fault finishes, say A is woken up. It will store a NULL
entry at index "0x1c" and wake up B. When B comes along it will find
"entry=0" at page offset 0x1c and it will call
put_unlocked_entry(&xas, 0). And this means put_unlocked_entry() will
not wake up the next waiter, given the current code. And that means C
continues to wait and is never woken up.

This patch fixes the issue by waking up all waiters when a dax entry
has been invalidated. This seems to fix the deadlock I am facing and I
can make forward progress.

Reported-by: Sergio Lopez
Fixes: ac401cc78242 ("dax: New fault locking")
Suggested-by: Dan Williams
Signed-off-by: Vivek Goyal
Reviewed-by: Jan Kara
---
 fs/dax.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index f19d76a6a493..cc497519be83 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -676,7 +676,7 @@ static int __dax_invalidate_entry(struct address_space *mapping,
 	mapping->nrexceptional--;
 	ret = 1;
 out:
-	put_unlocked_entry(&xas, entry, WAKE_NEXT);
+	put_unlocked_entry(&xas, entry, WAKE_ALL);
 	xas_unlock_irq(&xas);
 	return ret;
 }
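
For readers following the A/B/C scenario in the 3/3 commit message, the
missed wakeup can be modelled in a few lines of user-space C. This is
only a rough sketch under stated assumptions, not code from fs/dax.c:
every sketch_* name is hypothetical, the dax entry is reduced to a
boolean, waiters run one at a time in queue order, and
"put_unlocked_entry() wakes nobody for a NULL entry" is modelled by the
saw_entry check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All sketch_* names are hypothetical; this is a model, not fs/dax.c. */
enum sketch_wake_mode { SKETCH_WAKE_NEXT, SKETCH_WAKE_ALL };
enum sketch_state { SKETCH_SLEEPING, SKETCH_RUNNABLE, SKETCH_DONE };

#define SKETCH_MAX_WAITERS 16

/* Make the first sleeping waiter runnable, or every sleeping waiter. */
static void sketch_wake(enum sketch_state *w, size_t n,
			enum sketch_wake_mode mode)
{
	for (size_t i = 0; i < n; i++) {
		if (w[i] != SKETCH_SLEEPING)
			continue;
		w[i] = SKETCH_RUNNABLE;
		if (mode == SKETCH_WAKE_NEXT)
			break;
	}
}

/*
 * n invalidators sleep on one locked entry. invalidate_mode is the wake
 * mode used by the invalidator that actually removes the entry.
 * Returns how many waiters are still asleep once nothing is runnable.
 */
static size_t sketch_stranded_waiters(size_t n,
				      enum sketch_wake_mode invalidate_mode)
{
	enum sketch_state w[SKETCH_MAX_WAITERS];
	bool entry_present = true;
	bool progress = true;
	size_t stranded = 0;

	assert(n <= SKETCH_MAX_WAITERS);
	for (size_t i = 0; i < n; i++)
		w[i] = SKETCH_SLEEPING;

	/* The fault finishes and unlocks the entry: one waiter is woken. */
	sketch_wake(w, n, SKETCH_WAKE_NEXT);

	while (progress) {
		progress = false;
		for (size_t i = 0; i < n; i++) {
			bool saw_entry;

			if (w[i] != SKETCH_RUNNABLE)
				continue;
			saw_entry = entry_present;
			if (saw_entry)
				entry_present = false;	/* invalidate it */
			w[i] = SKETCH_DONE;
			/* A waiter that saw a NULL entry wakes nobody. */
			if (saw_entry)
				sketch_wake(w, n, invalidate_mode);
			progress = true;
		}
	}

	for (size_t i = 0; i < n; i++)
		if (w[i] == SKETCH_SLEEPING)
			stranded++;
	return stranded;
}
```

With three waiters, sketch_stranded_waiters(3, SKETCH_WAKE_NEXT) leaves
one waiter asleep (C in the description), while
sketch_stranded_waiters(3, SKETCH_WAKE_ALL) leaves none, which is the
behavior change patch 3/3 is after.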