From patchwork Thu Jul 25 02:44:05 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 11057851
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C4CBE13B1
	for ; Thu, 25 Jul 2019 02:44:38 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B37D7287C2
	for ; Thu, 25 Jul 2019 02:44:38 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id A7F59288AA; Thu, 25 Jul 2019 02:44:38 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1
Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194])
	(using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 480FB287C2
	for ; Thu, 25 Jul 2019 02:44:38 +0000 (UTC)
Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C319A4C446B;
	Wed, 24 Jul 2019 19:44:32 -0700 (PDT)
X-Original-To: lustre-devel@lists.lustre.org
Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 18CAA4C3F10
	for ; Wed, 24 Jul 2019 19:44:15 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 7FEA11005393;
	Wed, 24 Jul 2019 22:44:11 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 739469C2; Wed, 24 Jul 2019 22:44:11 -0400 (EDT)
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown, Shaun Tancheff
Date: Wed, 24 Jul 2019 22:44:05 -0400
Message-Id: <1564022647-17351-7-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1564022647-17351-1-git-send-email-jsimmons@infradead.org>
References: <1564022647-17351-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 6/8] lustre: fld: fld client lookup should retry
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: wang di, Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"
X-Virus-Scanned: ClamAV using ClamSMTP

From: wang di

If an FLD client lookup fails because the remote target has been shut
down (or deactivated), the client should retry the lookup on another
target; otherwise the failure propagates to the application. The FLD
client should also stop retrying once the import has been deactivated.
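The retry policy described above can be illustrated with a small
user-space sketch. It is not the code from this patch: it assumes a plain
array of targets instead of the lcf_targets list, takes no lock, and every
name in it (sketch_target, sketch_query, sketch_lookup) is hypothetical.
It only shows the idea of moving to the next target on -ESHUTDOWN, wrapping
around, and stopping once the walk returns to the starting target.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ESHUTDOWN
#define ESHUTDOWN 108	/* Linux value; fallback for platforms lacking it */
#endif

/* Hypothetical stand-in for struct lu_fld_target. */
struct sketch_target {
	int idx;	/* target index, reported on success */
	int is_down;	/* non-zero: queries fail with -ESHUTDOWN */
};

/* Hypothetical stand-in for fld_client_rpc(): query a single target. */
static int sketch_query(const struct sketch_target *t, int *mds)
{
	if (t->is_down)
		return -ESHUTDOWN;
	*mds = t->idx;
	return 0;
}

/*
 * Start at targets[first]. On -ESHUTDOWN advance to the next target,
 * wrapping around, and give up once we are back at the starting one.
 * Any other result (success or a different error) ends the walk.
 */
static int sketch_lookup(const struct sketch_target *targets, size_t count,
			 size_t first, int *mds)
{
	size_t cur = first;
	int rc;

	assert(count > 0 && first < count);
	do {
		rc = sketch_query(&targets[cur], mds);
		if (rc != -ESHUTDOWN)
			break;
		cur = (cur + 1) % count;
	} while (cur != first);

	return rc;
}
```

With three targets of which only the last is up, a lookup starting at
index 0 succeeds on the third attempt; if every target is down, the walk
ends after one full pass and returns -ESHUTDOWN to the caller.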

WC-bug-id: https://jira.whamcloud.com/browse/LU-6419
Lustre-commit: 3ededde903c92f8485cae0dc9f958f194ff0b140
Signed-off-by: wang di
Reviewed-on: http://review.whamcloud.com/14313
Reviewed-by: Lai Siyao
Reviewed-by: Fan Yong
Reviewed-by: Alex Zhuravlev
Reviewed-by: Oleg Drokin
---
 fs/lustre/fld/fld_request.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/lustre/fld/fld_request.c b/fs/lustre/fld/fld_request.c
index 60e7105..dfd4ae9 100644
--- a/fs/lustre/fld/fld_request.c
+++ b/fs/lustre/fld/fld_request.c
@@ -367,7 +367,7 @@ int fld_client_rpc(struct obd_export *exp,
 	rc = ptlrpc_queue_wait(req);
 	obd_put_request_slot(&exp->exp_obd->u.cli);
 	if (rc != 0) {
-		if (imp->imp_state != LUSTRE_IMP_CLOSED) {
+		if (imp->imp_state != LUSTRE_IMP_CLOSED && !imp->imp_deactive) {
 			/* Since LWP is not replayable, so it will keep
 			 * trying unless umount happens, otherwise it would
 			 * cause unecessary failure of the application.
@@ -404,6 +404,7 @@ int fld_client_lookup(struct lu_client_fld *fld, u64 seq, u32 *mds,
 {
 	struct lu_seq_range res = { 0 };
 	struct lu_fld_target *target;
+	struct lu_fld_target *origin;
 	int rc;

 	rc = fld_cache_lookup(fld->lcf_cache, seq, &res);
@@ -415,7 +416,8 @@ int fld_client_lookup(struct lu_client_fld *fld, u64 seq, u32 *mds,
 	/* Can not find it in the cache */
 	target = fld_client_get_target(fld, seq);
 	LASSERT(target);
-
+	origin = target;
+again:
 	CDEBUG(D_INFO,
 	       "%s: Lookup fld entry (seq: %#llx) on target %s (idx %llu)\n",
 	       fld->lcf_name, seq, fld_target_name(target), target->ft_idx);
@@ -424,6 +426,23 @@ int fld_client_lookup(struct lu_client_fld *fld, u64 seq, u32 *mds,
 	fld_range_set_type(&res, flags);
 	rc = fld_client_rpc(target->ft_exp, &res, FLD_QUERY, NULL);

+	if (rc == -ESHUTDOWN) {
+		/* If fld lookup failed because the target has been shutdown,
+		 * then try next target in the list, until trying all targets
+		 * or fld lookup succeeds
+		 */
+		spin_lock(&fld->lcf_lock);
+		if (target->ft_chain.next == fld->lcf_targets.prev)
+			target = list_entry(fld->lcf_targets.next,
+					    struct lu_fld_target, ft_chain);
+		else
+			target = list_entry(target->ft_chain.next,
+					    struct lu_fld_target,
+					    ft_chain);
+		spin_unlock(&fld->lcf_lock);
+		if (target != origin)
+			goto again;
+	}
 	if (rc == 0) {
 		*mds = res.lsr_index;