From patchwork Thu Feb 27 21:13:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 11410217
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ED95917E0
	for ; Thu, 27 Feb 2020 21:32:52 +0000 (UTC)
Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id D66FC246A2
	for ; Thu, 27 Feb 2020 21:32:52 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D66FC246A2
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org
Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9F007349BAC;
	Thu, 27 Feb 2020 13:27:52 -0800 (PST)
X-Original-To: lustre-devel@lists.lustre.org
Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com
Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39])
	by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4944D21FC19
	for ; Thu, 27 Feb 2020 13:20:01 -0800 (PST)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 42E1C8A85;
	Thu, 27 Feb 2020 16:18:17 -0500 (EST)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 414FB47C; Thu, 27 Feb 2020 16:18:17 -0500 (EST)
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Date: Thu, 27 Feb 2020 16:13:23 -0500
Message-Id: <1582838290-17243-336-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
References: <1582838290-17243-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 335/622] lnet: cache ni status
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Amir Shehata, Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Amir Shehata

When processing the data in the PUSH or the REPLY make sure to cache
the ns_status. This is the status of the peer_ni as reported by the
peer itself.

WC-bug-id: https://jira.whamcloud.com/browse/LU-11300
Lustre-commit: 398f4071dc17 ("LU-11300 lnet: cache ni status")
Signed-off-by: Amir Shehata
Reviewed-on: https://review.whamcloud.com/33450
Reviewed-by: Chris Horn
Reviewed-by: Olaf Weber
Signed-off-by: James Simmons
---
 include/linux/lnet/lib-types.h |  2 ++
 net/lnet/lnet/peer.c           | 42 +++++++++++++++++++++++++++++++-----------
 2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index 31fe22a..a551005 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -585,6 +585,8 @@ struct lnet_peer_ni {
 	int			lpni_cpt;
 	/* state flags -- protected by lpni_lock */
 	unsigned int		lpni_state;
+	/* status of the peer NI as reported by the peer */
+	u32			lpni_ns_status;
 	/* sequence number used to round robin over peer nis within a net */
 	u32			lpni_seq;
 	/* sequence number used to round robin over gateways */
diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c
index cb70bc7..cba3da2 100644
--- a/net/lnet/lnet/peer.c
+++ b/net/lnet/lnet/peer.c
@@ -128,8 +128,10 @@

 	spin_lock_init(&lpni->lpni_lock);

-	lpni->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! */
-	lpni->lpni_last_alive = ktime_get_seconds(); /* assumes alive */
+	if (lnet_peers_start_down())
+		lpni->lpni_ns_status = LNET_NI_STATUS_DOWN;
+	else
+		lpni->lpni_ns_status = LNET_NI_STATUS_UP;
 	lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL;
 	lpni->lpni_nid = nid;
 	lpni->lpni_cpt = cpt;
@@ -2410,7 +2412,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 {
 	struct lnet_peer_ni *lpni;
 	lnet_nid_t *curnis = NULL;
-	lnet_nid_t *addnis = NULL;
+	struct lnet_ni_status *addnis = NULL;
 	lnet_nid_t *delnis = NULL;
 	unsigned int flags;
 	int ncurnis;
@@ -2426,9 +2428,9 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 		flags |= LNET_PEER_MULTI_RAIL;

 	nnis = max_t(int, lp->lp_nnis, pbuf->pb_info.pi_nnis);
-	curnis = kmalloc_array(nnis, sizeof(lnet_nid_t), GFP_NOFS);
-	addnis = kmalloc_array(nnis, sizeof(lnet_nid_t), GFP_NOFS);
-	delnis = kmalloc_array(nnis, sizeof(lnet_nid_t), GFP_NOFS);
+	curnis = kmalloc_array(nnis, sizeof(*curnis), GFP_NOFS);
+	addnis = kmalloc_array(nnis, sizeof(*addnis), GFP_NOFS);
+	delnis = kmalloc_array(nnis, sizeof(*delnis), GFP_NOFS);
 	if (!curnis || !addnis || !delnis) {
 		rc = -ENOMEM;
 		goto out;
@@ -2451,7 +2453,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 			if (pbuf->pb_info.pi_ni[i].ns_nid == curnis[j])
 				break;
 		if (j == ncurnis)
-			addnis[naddnis++] = pbuf->pb_info.pi_ni[i].ns_nid;
+			addnis[naddnis++] = pbuf->pb_info.pi_ni[i];
 	}
 	/*
 	 * Check for NIDs in curnis[] not present in pbuf.
@@ -2463,23 +2465,41 @@ static int lnet_peer_merge_data(struct lnet_peer *lp,
 	for (i = 0; i < ncurnis; i++) {
 		if (LNET_NETTYP(LNET_NIDNET(curnis[i])) == LOLND)
 			continue;
-		for (j = 1; j < pbuf->pb_info.pi_nnis; j++)
-			if (curnis[i] == pbuf->pb_info.pi_ni[j].ns_nid)
+		for (j = 1; j < pbuf->pb_info.pi_nnis; j++) {
+			if (curnis[i] == pbuf->pb_info.pi_ni[j].ns_nid) {
+				/* update the information we cache for the
+				 * peer with the latest information we
+				 * received
+				 */
+				lpni = lnet_find_peer_ni_locked(curnis[i]);
+				if (lpni) {
+					lpni->lpni_ns_status =
+						pbuf->pb_info.pi_ni[j].ns_status;
+					lnet_peer_ni_decref_locked(lpni);
+				}
 				break;
+			}
+		}
 		if (j == pbuf->pb_info.pi_nnis)
 			delnis[ndelnis++] = curnis[i];
 	}

 	for (i = 0; i < naddnis; i++) {
-		rc = lnet_peer_add_nid(lp, addnis[i], flags);
+		rc = lnet_peer_add_nid(lp, addnis[i].ns_nid, flags);
 		if (rc) {
 			CERROR("Error adding NID %s to peer %s: %d\n",
-			       libcfs_nid2str(addnis[i]),
+			       libcfs_nid2str(addnis[i].ns_nid),
 			       libcfs_nid2str(lp->lp_primary_nid), rc);
 			if (rc == -ENOMEM)
 				goto out;
 		}
+		lpni = lnet_find_peer_ni_locked(addnis[i].ns_nid);
+		if (lpni) {
+			lpni->lpni_ns_status = addnis[i].ns_status;
+			lnet_peer_ni_decref_locked(lpni);
+		}
 	}
+
 	for (i = 0; i < ndelnis; i++) {
 		rc = lnet_peer_del_nid(lp, delnis[i], flags);
 		if (rc) {