From patchwork Mon Jan 24 03:48:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12721508 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47D2AC433F5 for ; Mon, 24 Jan 2022 03:54:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235933AbiAXDyO (ORCPT ); Sun, 23 Jan 2022 22:54:14 -0500 Received: from smtp-out1.suse.de ([195.135.220.28]:56994 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241230AbiAXDyN (ORCPT ); Sun, 23 Jan 2022 22:54:13 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id C88BC21995; Mon, 24 Jan 2022 03:54:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1642996452; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=f9JPZARu0omLin5bRQVSgWDU6wAXTb1AVpCm2PGq0QA=; b=oWLVbYedFQP9v6Y+Vt2dCmYJ7BswiwzErr2bNqohkHzF29J5NtnzRR3hInA0/7Pbsg+a4p dkVN24TPtV/Zna3iFWM09kMIkBk2VhKvphP+cHwNm6UvHMXPUMMbSiNEVjA8OCcDOWDEtr bYJ4pdL9OTrQ+svPLtAoHw7TN/KouOY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1642996452; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=f9JPZARu0omLin5bRQVSgWDU6wAXTb1AVpCm2PGq0QA=; b=d/H/0ZHUojlxTItoes3CwtM1O1/+IIfnm9ut5aMrLo6Gn1mp2AaqBsiqk+9II7iICgBBaQ 0SDOZBWG7iH2FMBw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id D7F8513305; Mon, 24 Jan 2022 03:54:09 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id qc9UJeEi7mHfRQAAMHmgww (envelope-from ); Mon, 24 Jan 2022 03:54:09 +0000 Subject: [PATCH 17/23] SUNRPC/xprt: async tasks mustn't block waiting for memory From: NeilBrown To: Trond Myklebust , Anna Schumaker , Chuck Lever , Andrew Morton , Mel Gorman , Christoph Hellwig , David Howells Cc: linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Date: Mon, 24 Jan 2022 14:48:32 +1100 Message-ID: <164299611283.26253.16655442929244733353.stgit@noble.brown> In-Reply-To: <164299573337.26253.7538614611220034049.stgit@noble.brown> References: <164299573337.26253.7538614611220034049.stgit@noble.brown> User-Agent: StGit/0.23 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org When memory is short, new worker threads cannot be created and we depend on the minimum one rpciod thread to be able to handle everything. So it must not block waiting for memory. xprt_dynamic_alloc_slot can block indefinitely. This can tie up all workqueue threads and NFS can deadlock. So when called from a workqueue, set __GFP_NORETRY. The rdma alloc_slot already does not block. However it sets the error to -EAGAIN suggesting this will trigger a sleep. It does not. As we can see in call_reserveresult(), only -ENOMEM causes a sleep. -EAGAIN causes immediate retry. Signed-off-by: NeilBrown --- net/sunrpc/xprt.c | 5 ++++- net/sunrpc/xprtrdma/transport.c | 2 +- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index a02de2bddb28..47d207e416ab 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -1687,12 +1687,15 @@ static bool xprt_throttle_congested(struct rpc_xprt *xprt, struct rpc_task *task static struct rpc_rqst *xprt_dynamic_alloc_slot(struct rpc_xprt *xprt) { struct rpc_rqst *req = ERR_PTR(-EAGAIN); + gfp_t gfp_mask = GFP_NOFS; if (xprt->num_reqs >= xprt->max_reqs) goto out; ++xprt->num_reqs; spin_unlock(&xprt->reserve_lock); - req = kzalloc(sizeof(struct rpc_rqst), GFP_NOFS); + if (current->flags & PF_WQ_WORKER) + gfp_mask |= __GFP_NORETRY | __GFP_NOWARN; + req = kzalloc(sizeof(struct rpc_rqst), gfp_mask); spin_lock(&xprt->reserve_lock); if (req != NULL) goto out; diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c index a52277115500..32df23796747 100644 --- a/net/sunrpc/xprtrdma/transport.c +++ b/net/sunrpc/xprtrdma/transport.c @@ -521,7 +521,7 @@ xprt_rdma_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task) return; out_sleep: - task->tk_status = -EAGAIN; + task->tk_status = -ENOMEM; xprt_add_backlog(xprt, task); }