From patchwork Fri Sep 14 21:23:23 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 1460321
From: Chuck Lever
Subject: [PATCH 02/10] NFS: Slow down state manager after an unhandled error
To: trond.myklebust@netapp.com
Cc: linux-nfs@vger.kernel.org
Date: Fri, 14 Sep 2012 17:23:23 -0400
Message-ID: <20120914212323.1635.78058.stgit@degas.1015granger.net>
In-Reply-To: <20120914211053.1635.74063.stgit@degas.1015granger.net>
References: <20120914211053.1635.74063.stgit@degas.1015granger.net>
User-Agent: StGIT/0.14.3
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

If the state manager thread cannot fully recover from some
situation, it wakes up its waiters, who kick off a new state manager
thread. Quite often the fresh invocation of the state manager is no
more successful than the last. This results in a livelock as the
client dumps thousands of NFS requests a second on the network in a
vain attempt to recover. Not very friendly.

To mitigate this situation, add a one-second delay in the state
manager after an unhandled error, so that the client sends just a
few requests every second in this case.

Signed-off-by: Chuck Lever
---
 fs/nfs/nfs4state.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 55148de..d6e0d63 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2008,6 +2008,7 @@ out_error:
 	pr_warn_ratelimited("NFS: state manager%s%s failed on NFSv4 server %s"
 			" with error %d\n", section_sep, section,
 			clp->cl_hostname, -status);
+	ssleep(1);
 	nfs4_end_drain_session(clp);
 	nfs4_clear_state_manager_bit(clp);
 }
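
For readers who want to see the effect of the delay outside the kernel, here is a rough userspace sketch of the same idea: pause after every failed recovery pass so the retry rate stays bounded. It is not the kernel code. The names recovery_sim, run_pass, delay_ms and recover_with_backoff are made up for illustration, and nanosleep() stands in for the kernel's ssleep().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the state manager: fails a fixed number of times. */
struct recovery_sim {
	int failures_left;	/* passes that will still fail */
	int attempts;		/* passes run so far */
};

/* One simulated recovery pass; returns true once recovery succeeds. */
static bool run_pass(struct recovery_sim *sim)
{
	sim->attempts++;
	if (sim->failures_left > 0) {
		sim->failures_left--;
		return false;
	}
	return true;
}

/* Userspace substitute for the kernel's sleep helpers. */
static void delay_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/*
 * Retry until a pass succeeds or max_attempts is reached, pausing after
 * each failed pass. Returns the total number of passes run so far.
 */
static int recover_with_backoff(struct recovery_sim *sim, int max_attempts,
				long pause_ms)
{
	assert(max_attempts > 0 && pause_ms >= 0);

	for (int i = 0; i < max_attempts; i++) {
		if (run_pass(sim))
			break;
		delay_ms(pause_ms);	/* the throttle: without it the loop spins */
	}
	return sim->attempts;
}
```

With pause_ms set to 1000 the loop makes at most about one attempt per second, however quickly each pass fails. With the pause removed, it retries as fast as run_pass() can return, which is the behavior the patch is trying to avoid.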