From patchwork Tue Sep 25 01:07:14 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613153
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep
2018 11:07:14 +1000
Message-ID: <153783763488.32103.14222362329822520874.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
User-Agent: StGit/0.17.1-dirty
Subject: [lustre-devel] [PATCH 01/34] lnet: replace all lp_ fields with lpni_
Cc: Lustre Development List

sed -i 's/\blp_/lpni_/g' \
    `git grep -l '\blp_' drivers/staging/lustre/lnet | grep '\.[ch]$'`

followed by some long-line cleanups.

This is part of
Commit: 58091af960fe ("LU-7734 lnet: Multi-Rail peer split")
from upstream lustre, where it is marked:

  Signed-off-by: Amir Shehata
  WC-bug-id: https://jira.whamcloud.com/browse/LU-7734
  Reviewed-on: http://review.whamcloud.com/18293
  Reviewed-by: Olaf Weber
  Reviewed-by: Doug Oucharek

Signed-off-by: NeilBrown
Reviewed-by: James Simmons
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |  24 +-
 .../staging/lustre/include/linux/lnet/lib-types.h  |  63 +++---
 drivers/staging/lustre/lnet/lnet/lib-move.c        | 146 +++++++------
 drivers/staging/lustre/lnet/lnet/peer.c            | 125 ++++++-----
 drivers/staging/lustre/lnet/lnet/router.c          | 218 ++++++++++----------
 drivers/staging/lustre/lnet/lnet/router_proc.c     |  52 ++---
 6 files changed, 316 insertions(+), 312 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 5ee770cd7a5f..9b54a3d72290 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -79,10 +79,10 @@ extern struct lnet the_lnet; /* THE network */
 static inline int
 lnet_is_route_alive(struct
lnet_route *route) { /* gateway is down */ - if (!route->lr_gateway->lp_alive) + if (!route->lr_gateway->lpni_alive) return 0; /* no NI status, assume it's alive */ - if ((route->lr_gateway->lp_ping_feats & + if ((route->lr_gateway->lpni_ping_feats & LNET_PING_FEAT_NI_STATUS) == 0) return 1; /* has NI status, check # down NIs */ @@ -313,8 +313,8 @@ lnet_handle2me(struct lnet_handle_me *handle) static inline void lnet_peer_addref_locked(struct lnet_peer *lp) { - LASSERT(lp->lp_refcount > 0); - lp->lp_refcount++; + LASSERT(lp->lpni_refcount > 0); + lp->lpni_refcount++; } void lnet_destroy_peer_locked(struct lnet_peer *lp); @@ -322,16 +322,16 @@ void lnet_destroy_peer_locked(struct lnet_peer *lp); static inline void lnet_peer_decref_locked(struct lnet_peer *lp) { - LASSERT(lp->lp_refcount > 0); - lp->lp_refcount--; - if (!lp->lp_refcount) + LASSERT(lp->lpni_refcount > 0); + lp->lpni_refcount--; + if (!lp->lpni_refcount) lnet_destroy_peer_locked(lp); } static inline int lnet_isrouter(struct lnet_peer *lp) { - return lp->lp_rtr_refcount ? 1 : 0; + return lp->lpni_rtr_refcount ? 
1 : 0; } static inline void @@ -652,10 +652,10 @@ int lnet_get_peer_info(__u32 peer_index, __u64 *nid, static inline void lnet_peer_set_alive(struct lnet_peer *lp) { - lp->lp_last_query = ktime_get_seconds(); - lp->lp_last_alive = lp->lp_last_query; - if (!lp->lp_alive) - lnet_notify_locked(lp, 0, 1, lp->lp_last_alive); + lp->lpni_last_query = ktime_get_seconds(); + lp->lpni_last_alive = lp->lpni_last_query; + if (!lp->lpni_alive) + lnet_notify_locked(lp, 0, 1, lp->lpni_last_alive); } #endif diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 8bc72f25a897..59a1a2620675 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -385,61 +385,61 @@ struct lnet_rc_data { struct lnet_peer { /* chain on peer hash */ - struct list_head lp_hashlist; + struct list_head lpni_hashlist; /* messages blocking for tx credits */ - struct list_head lp_txq; + struct list_head lpni_txq; /* messages blocking for router credits */ - struct list_head lp_rtrq; + struct list_head lpni_rtrq; /* chain on router list */ - struct list_head lp_rtr_list; + struct list_head lpni_rtr_list; /* # tx credits available */ - int lp_txcredits; + int lpni_txcredits; /* low water mark */ - int lp_mintxcredits; + int lpni_mintxcredits; /* # router credits */ - int lp_rtrcredits; + int lpni_rtrcredits; /* low water mark */ - int lp_minrtrcredits; + int lpni_minrtrcredits; /* alive/dead? */ - unsigned int lp_alive:1; + unsigned int lpni_alive:1; /* notification outstanding? */ - unsigned int lp_notify:1; + unsigned int lpni_notify:1; /* outstanding notification for LND? 
*/ - unsigned int lp_notifylnd:1; + unsigned int lpni_notifylnd:1; /* some thread is handling notification */ - unsigned int lp_notifying:1; + unsigned int lpni_notifying:1; /* SEND event outstanding from ping */ - unsigned int lp_ping_notsent; + unsigned int lpni_ping_notsent; /* # times router went dead<->alive */ - int lp_alive_count; + int lpni_alive_count; /* bytes queued for sending */ - long lp_txqnob; + long lpni_txqnob; /* time of last aliveness news */ - time64_t lp_timestamp; + time64_t lpni_timestamp; /* time of last ping attempt */ - time64_t lp_ping_timestamp; + time64_t lpni_ping_timestamp; /* != 0 if ping reply expected */ - time64_t lp_ping_deadline; + time64_t lpni_ping_deadline; /* when I was last alive */ - time64_t lp_last_alive; - /* when lp_ni was queried last time */ - time64_t lp_last_query; + time64_t lpni_last_alive; + /* when lpni_ni was queried last time */ + time64_t lpni_last_query; /* network peer is on */ - struct lnet_net *lp_net; + struct lnet_net *lpni_net; /* peer's NID */ - lnet_nid_t lp_nid; + lnet_nid_t lpni_nid; /* # refs */ - int lp_refcount; + int lpni_refcount; /* CPT this peer attached on */ - int lp_cpt; + int lpni_cpt; /* # refs from lnet_route::lr_gateway */ - int lp_rtr_refcount; + int lpni_rtr_refcount; /* returned RC ping features */ - unsigned int lp_ping_feats; + unsigned int lpni_ping_feats; /* routers on this peer */ - struct list_head lp_routes; + struct list_head lpni_routes; /* router checker state */ - struct lnet_rc_data *lp_rcd; + struct lnet_rc_data *lpni_rcd; }; /* peer hash size */ @@ -464,8 +464,9 @@ struct lnet_peer_table { * peer aliveness is enabled only on routers for peers in a network where the * lnet_ni::ni_peertimeout has been set to a positive value */ -#define lnet_peer_aliveness_enabled(lp) (the_lnet.ln_routing && \ - (lp)->lp_net->net_tunables.lct_peer_timeout > 0) +#define lnet_peer_aliveness_enabled(lp) \ + (the_lnet.ln_routing && \ + (lp)->lpni_net->net_tunables.lct_peer_timeout > 0)
struct lnet_route { /* chain on net */ diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index b75ebc236f3a..5879a109d46a 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -481,7 +481,7 @@ lnet_ni_eager_recv(struct lnet_ni *ni, struct lnet_msg *msg) &msg->msg_private); if (rc) { CERROR("recv from %s / send to %s aborted: eager_recv failed %d\n", - libcfs_nid2str(msg->msg_rxpeer->lp_nid), + libcfs_nid2str(msg->msg_rxpeer->lpni_nid), libcfs_id2str(msg->msg_target), rc); LASSERT(rc < 0); /* required by my callers */ } @@ -498,14 +498,14 @@ lnet_ni_query_locked(struct lnet_ni *ni, struct lnet_peer *lp) LASSERT(lnet_peer_aliveness_enabled(lp)); LASSERT(ni->ni_net->net_lnd->lnd_query); - lnet_net_unlock(lp->lp_cpt); - ni->ni_net->net_lnd->lnd_query(ni, lp->lp_nid, &last_alive); - lnet_net_lock(lp->lp_cpt); + lnet_net_unlock(lp->lpni_cpt); + ni->ni_net->net_lnd->lnd_query(ni, lp->lpni_nid, &last_alive); + lnet_net_lock(lp->lpni_cpt); - lp->lp_last_query = ktime_get_seconds(); + lp->lpni_last_query = ktime_get_seconds(); if (last_alive) /* NI has updated timestamp */ - lp->lp_last_alive = last_alive; + lp->lpni_last_alive = last_alive; } /* NB: always called with lnet_net_lock held */ @@ -520,21 +520,21 @@ lnet_peer_is_alive(struct lnet_peer *lp, unsigned long now) /* Trust lnet_notify() if it has more recent aliveness news, but * ignore the initial assumed death (see lnet_peers_start_down()). 
*/ - if (!lp->lp_alive && lp->lp_alive_count > 0 && - lp->lp_timestamp >= lp->lp_last_alive) + if (!lp->lpni_alive && lp->lpni_alive_count > 0 && + lp->lpni_timestamp >= lp->lpni_last_alive) return 0; - deadline = lp->lp_last_alive + - lp->lp_net->net_tunables.lct_peer_timeout; + deadline = lp->lpni_last_alive + + lp->lpni_net->net_tunables.lct_peer_timeout; alive = deadline > now; - /* Update obsolete lp_alive except for routers assumed to be dead + /* Update obsolete lpni_alive except for routers assumed to be dead * initially, because router checker would update aliveness in this - * case, and moreover lp_last_alive at peer creation is assumed. + * case, and moreover lpni_last_alive at peer creation is assumed. */ - if (alive && !lp->lp_alive && - !(lnet_isrouter(lp) && !lp->lp_alive_count)) - lnet_notify_locked(lp, 0, 1, lp->lp_last_alive); + if (alive && !lp->lpni_alive && + !(lnet_isrouter(lp) && !lp->lpni_alive_count)) + lnet_notify_locked(lp, 0, 1, lp->lpni_last_alive); return alive; } @@ -558,19 +558,19 @@ lnet_peer_alive_locked(struct lnet_ni *ni, struct lnet_peer *lp) * Peer appears dead, but we should avoid frequent NI queries (at * most once per lnet_queryinterval seconds). 
*/ - if (lp->lp_last_query) { + if (lp->lpni_last_query) { static const int lnet_queryinterval = 1; time64_t next_query; - next_query = lp->lp_last_query + lnet_queryinterval; + next_query = lp->lpni_last_query + lnet_queryinterval; if (now < next_query) { - if (lp->lp_alive) + if (lp->lpni_alive) CWARN("Unexpected aliveness of peer %s: %lld < %lld (%d/%d)\n", - libcfs_nid2str(lp->lp_nid), + libcfs_nid2str(lp->lpni_nid), now, next_query, lnet_queryinterval, - lp->lp_net->net_tunables.lct_peer_timeout); + lp->lpni_net->net_tunables.lct_peer_timeout); return 0; } } @@ -581,7 +581,7 @@ lnet_peer_alive_locked(struct lnet_ni *ni, struct lnet_peer *lp) if (lnet_peer_is_alive(lp, now)) return 1; - lnet_notify_locked(lp, 0, 0, lp->lp_last_alive); + lnet_notify_locked(lp, 0, 0, lp->lpni_last_alive); return 0; } @@ -639,19 +639,19 @@ lnet_post_send_locked(struct lnet_msg *msg, int do_send) } if (!msg->msg_peertxcredit) { - LASSERT((lp->lp_txcredits < 0) == - !list_empty(&lp->lp_txq)); + LASSERT((lp->lpni_txcredits < 0) == + !list_empty(&lp->lpni_txq)); msg->msg_peertxcredit = 1; - lp->lp_txqnob += msg->msg_len + sizeof(struct lnet_hdr); - lp->lp_txcredits--; + lp->lpni_txqnob += msg->msg_len + sizeof(struct lnet_hdr); + lp->lpni_txcredits--; - if (lp->lp_txcredits < lp->lp_mintxcredits) - lp->lp_mintxcredits = lp->lp_txcredits; + if (lp->lpni_txcredits < lp->lpni_mintxcredits) + lp->lpni_mintxcredits = lp->lpni_txcredits; - if (lp->lp_txcredits < 0) { + if (lp->lpni_txcredits < 0) { msg->msg_tx_delayed = 1; - list_add_tail(&msg->msg_list, &lp->lp_txq); + list_add_tail(&msg->msg_list, &lp->lpni_txq); return LNET_CREDIT_WAIT; } } @@ -725,19 +725,19 @@ lnet_post_routed_recv_locked(struct lnet_msg *msg, int do_recv) LASSERT(!do_recv || msg->msg_rx_delayed); if (!msg->msg_peerrtrcredit) { - LASSERT((lp->lp_rtrcredits < 0) == - !list_empty(&lp->lp_rtrq)); + LASSERT((lp->lpni_rtrcredits < 0) == + !list_empty(&lp->lpni_rtrq)); msg->msg_peerrtrcredit = 1; - lp->lp_rtrcredits--; - if 
(lp->lp_rtrcredits < lp->lp_minrtrcredits) - lp->lp_minrtrcredits = lp->lp_rtrcredits; + lp->lpni_rtrcredits--; + if (lp->lpni_rtrcredits < lp->lpni_minrtrcredits) + lp->lpni_minrtrcredits = lp->lpni_rtrcredits; - if (lp->lp_rtrcredits < 0) { + if (lp->lpni_rtrcredits < 0) { /* must have checked eager_recv before here */ LASSERT(msg->msg_rx_ready_delay); msg->msg_rx_delayed = 1; - list_add_tail(&msg->msg_list, &lp->lp_rtrq); + list_add_tail(&msg->msg_list, &lp->lpni_rtrq); return LNET_CREDIT_WAIT; } } @@ -811,15 +811,15 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) /* give back peer txcredits */ msg->msg_peertxcredit = 0; - LASSERT((txpeer->lp_txcredits < 0) == - !list_empty(&txpeer->lp_txq)); + LASSERT((txpeer->lpni_txcredits < 0) == + !list_empty(&txpeer->lpni_txq)); - txpeer->lp_txqnob -= msg->msg_len + sizeof(struct lnet_hdr); - LASSERT(txpeer->lp_txqnob >= 0); + txpeer->lpni_txqnob -= msg->msg_len + sizeof(struct lnet_hdr); + LASSERT(txpeer->lpni_txqnob >= 0); - txpeer->lp_txcredits++; - if (txpeer->lp_txcredits <= 0) { - msg2 = list_entry(txpeer->lp_txq.next, + txpeer->lpni_txcredits++; + if (txpeer->lpni_txcredits <= 0) { + msg2 = list_entry(txpeer->lpni_txq.next, struct lnet_msg, msg_list); list_del(&msg2->msg_list); @@ -939,19 +939,19 @@ lnet_return_rx_credits_locked(struct lnet_msg *msg) /* give back peer router credits */ msg->msg_peerrtrcredit = 0; - LASSERT((rxpeer->lp_rtrcredits < 0) == - !list_empty(&rxpeer->lp_rtrq)); + LASSERT((rxpeer->lpni_rtrcredits < 0) == + !list_empty(&rxpeer->lpni_rtrq)); - rxpeer->lp_rtrcredits++; + rxpeer->lpni_rtrcredits++; /* * drop all messages which are queued to be routed on that * peer. 
*/ if (!the_lnet.ln_routing) { - lnet_drop_routed_msgs_locked(&rxpeer->lp_rtrq, + lnet_drop_routed_msgs_locked(&rxpeer->lpni_rtrq, msg->msg_rx_cpt); - } else if (rxpeer->lp_rtrcredits <= 0) { - msg2 = list_entry(rxpeer->lp_rtrq.next, + } else if (rxpeer->lpni_rtrcredits <= 0) { + msg2 = list_entry(rxpeer->lpni_rtrq.next, struct lnet_msg, msg_list); list_del(&msg2->msg_list); @@ -988,16 +988,16 @@ lnet_compare_routes(struct lnet_route *r1, struct lnet_route *r2) if (r1_hops > r2_hops) return -ERANGE; - if (p1->lp_txqnob < p2->lp_txqnob) + if (p1->lpni_txqnob < p2->lpni_txqnob) return 1; - if (p1->lp_txqnob > p2->lp_txqnob) + if (p1->lpni_txqnob > p2->lpni_txqnob) return -ERANGE; - if (p1->lp_txcredits > p2->lp_txcredits) + if (p1->lpni_txcredits > p2->lpni_txcredits) return 1; - if (p1->lp_txcredits < p2->lp_txcredits) + if (p1->lpni_txcredits < p2->lpni_txcredits) return -ERANGE; if (r1->lr_seq - r2->lr_seq <= 0) @@ -1014,7 +1014,7 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, struct lnet_route *route; struct lnet_route *best_route; struct lnet_route *last_route; - struct lnet_peer *lp_best; + struct lnet_peer *lpni_best; struct lnet_peer *lp; int rc; @@ -1026,7 +1026,7 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, if (!rnet) return NULL; - lp_best = NULL; + lpni_best = NULL; best_route = NULL; last_route = NULL; list_for_each_entry(route, &rnet->lrn_routes, lr_list) { @@ -1035,16 +1035,16 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, if (!lnet_is_route_alive(route)) continue; - if (net && lp->lp_net != net) + if (net && lp->lpni_net != net) continue; - if (lp->lp_nid == rtr_nid) /* it's pre-determined router */ + if (lp->lpni_nid == rtr_nid) /* it's pre-determined router */ return lp; - if (!lp_best) { + if (!lpni_best) { best_route = route; last_route = route; - lp_best = lp; + lpni_best = lp; continue; } @@ -1057,7 +1057,7 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, continue; 
best_route = route; - lp_best = lp; + lpni_best = lp; } /* @@ -1067,7 +1067,7 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, */ if (best_route) best_route->lr_seq = last_route->lr_seq + 1; - return lp_best; + return lpni_best; } int @@ -1156,7 +1156,7 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) /* ENOMEM or shutting down */ return rc; } - LASSERT(lp->lp_net == src_ni->ni_net); + LASSERT(lp->lpni_net == src_ni->ni_net); } else { /* sending to a remote network */ lp = lnet_find_route_locked(src_ni ? src_ni->ni_net : NULL, @@ -1176,27 +1176,27 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) * pre-determined router, this can happen if router table * was changed when we release the lock */ - if (rtr_nid != lp->lp_nid) { - cpt2 = lp->lp_cpt; + if (rtr_nid != lp->lpni_nid) { + cpt2 = lp->lpni_cpt; if (cpt2 != cpt) { lnet_net_unlock(cpt); - rtr_nid = lp->lp_nid; + rtr_nid = lp->lpni_nid; cpt = cpt2; goto again; } } CDEBUG(D_NET, "Best route to %s via %s for %s %d\n", - libcfs_nid2str(dst_nid), libcfs_nid2str(lp->lp_nid), + libcfs_nid2str(dst_nid), libcfs_nid2str(lp->lpni_nid), lnet_msgtyp2str(msg->msg_type), msg->msg_len); if (!src_ni) { - src_ni = lnet_get_next_ni_locked(lp->lp_net, NULL); + src_ni = lnet_get_next_ni_locked(lp->lpni_net, NULL); LASSERT(src_ni); src_nid = src_ni->ni_nid; } else { - LASSERT(src_ni->ni_net == lp->lp_net); + LASSERT(src_ni->ni_net == lp->lpni_net); } lnet_peer_addref_locked(lp); @@ -1210,7 +1210,7 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) } msg->msg_target_is_router = 1; - msg->msg_target.nid = lp->lp_nid; + msg->msg_target.nid = lp->lpni_nid; msg->msg_target.pid = LNET_PID_LUSTRE; } @@ -1289,7 +1289,7 @@ lnet_parse_put(struct lnet_ni *ni, struct lnet_msg *msg) info.mi_rlength = hdr->payload_length; info.mi_roffset = hdr->msg.put.offset; info.mi_mbits = hdr->msg.put.match_bits; - info.mi_cpt = msg->msg_rxpeer->lp_cpt; + info.mi_cpt = 
msg->msg_rxpeer->lpni_cpt; msg->msg_rx_ready_delay = !ni->ni_net->net_lnd->lnd_eager_recv; ready_delay = msg->msg_rx_ready_delay; @@ -1520,7 +1520,7 @@ lnet_parse_forward_locked(struct lnet_ni *ni, struct lnet_msg *msg) if (!the_lnet.ln_routing) return -ECANCELED; - if (msg->msg_rxpeer->lp_rtrcredits <= 0 || + if (msg->msg_rxpeer->lpni_rtrcredits <= 0 || lnet_msg2bufpool(msg)->rbp_credits <= 0) { if (!ni->ni_net->net_lnd->lnd_eager_recv) { msg->msg_rx_ready_delay = 1; @@ -1909,7 +1909,7 @@ lnet_drop_delayed_msg_list(struct list_head *head, char *reason) * until that's done */ lnet_drop_message(msg->msg_rxni, - msg->msg_rxpeer->lp_cpt, + msg->msg_rxpeer->lpni_cpt, msg->msg_private, msg->msg_len); /* * NB: message will not generate event because w/o attached MD, @@ -2376,7 +2376,7 @@ LNetDist(lnet_nid_t dstnid, lnet_nid_t *srcnidp, __u32 *orderp) hops = shortest_hops; if (srcnidp) { ni = lnet_get_next_ni_locked( - shortest->lr_gateway->lp_net, + shortest->lr_gateway->lpni_net, NULL); *srcnidp = ni->ni_nid; } diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 42bc35010f64..619d016b1d89 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -111,10 +111,10 @@ lnet_peer_table_cleanup_locked(struct lnet_ni *ni, for (i = 0; i < LNET_PEER_HASH_SIZE; i++) { list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i], - lp_hashlist) { - if (ni && ni->ni_net != lp->lp_net) + lpni_hashlist) { + if (ni && ni->ni_net != lp->lpni_net) continue; - list_del_init(&lp->lp_hashlist); + list_del_init(&lp->lpni_hashlist); /* Lose hash table's ref */ ptable->pt_zombies++; lnet_peer_decref_locked(lp); @@ -148,22 +148,22 @@ lnet_peer_table_del_rtrs_locked(struct lnet_ni *ni, { struct lnet_peer *lp; struct lnet_peer *tmp; - lnet_nid_t lp_nid; + lnet_nid_t lpni_nid; int i; for (i = 0; i < LNET_PEER_HASH_SIZE; i++) { list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i], - lp_hashlist) { - if 
(ni->ni_net != lp->lp_net) + lpni_hashlist) { + if (ni->ni_net != lp->lpni_net) continue; - if (!lp->lp_rtr_refcount) + if (!lp->lpni_rtr_refcount) continue; - lp_nid = lp->lp_nid; + lpni_nid = lp->lpni_nid; lnet_net_unlock(cpt_locked); - lnet_del_route(LNET_NIDNET(LNET_NID_ANY), lp_nid); + lnet_del_route(LNET_NIDNET(LNET_NID_ANY), lpni_nid); lnet_net_lock(cpt_locked); } } @@ -209,8 +209,8 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) } while (!list_empty(&deathrow)) { - lp = list_entry(deathrow.next, struct lnet_peer, lp_hashlist); - list_del(&lp->lp_hashlist); + lp = list_entry(deathrow.next, struct lnet_peer, lpni_hashlist); + list_del(&lp->lpni_hashlist); kfree(lp); } } @@ -220,19 +220,19 @@ lnet_destroy_peer_locked(struct lnet_peer *lp) { struct lnet_peer_table *ptable; - LASSERT(!lp->lp_refcount); - LASSERT(!lp->lp_rtr_refcount); - LASSERT(list_empty(&lp->lp_txq)); - LASSERT(list_empty(&lp->lp_hashlist)); - LASSERT(!lp->lp_txqnob); + LASSERT(!lp->lpni_refcount); + LASSERT(!lp->lpni_rtr_refcount); + LASSERT(list_empty(&lp->lpni_txq)); + LASSERT(list_empty(&lp->lpni_hashlist)); + LASSERT(!lp->lpni_txqnob); - ptable = the_lnet.ln_peer_tables[lp->lp_cpt]; + ptable = the_lnet.ln_peer_tables[lp->lpni_cpt]; LASSERT(ptable->pt_number > 0); ptable->pt_number--; - lp->lp_net = NULL; + lp->lpni_net = NULL; - list_add(&lp->lp_hashlist, &ptable->pt_deathrow); + list_add(&lp->lpni_hashlist, &ptable->pt_deathrow); LASSERT(ptable->pt_zombies > 0); ptable->pt_zombies--; } @@ -246,8 +246,8 @@ lnet_find_peer_locked(struct lnet_peer_table *ptable, lnet_nid_t nid) LASSERT(!the_lnet.ln_shutdown); peers = &ptable->pt_hash[lnet_nid2peerhash(nid)]; - list_for_each_entry(lp, peers, lp_hashlist) { - if (lp->lp_nid == nid) { + list_for_each_entry(lp, peers, lpni_hashlist) { + if (lp->lpni_nid == nid) { lnet_peer_addref_locked(lp); return lp; } @@ -281,8 +281,8 @@ lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) if (!list_empty(&ptable->pt_deathrow)) { lp = 
list_entry(ptable->pt_deathrow.next, - struct lnet_peer, lp_hashlist); - list_del(&lp->lp_hashlist); + struct lnet_peer, lpni_hashlist); + list_del(&lp->lpni_hashlist); } /* @@ -303,24 +303,24 @@ lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) goto out; } - INIT_LIST_HEAD(&lp->lp_txq); - INIT_LIST_HEAD(&lp->lp_rtrq); - INIT_LIST_HEAD(&lp->lp_routes); - - lp->lp_notify = 0; - lp->lp_notifylnd = 0; - lp->lp_notifying = 0; - lp->lp_alive_count = 0; - lp->lp_timestamp = 0; - lp->lp_alive = !lnet_peers_start_down(); /* 1 bit!! */ - lp->lp_last_alive = ktime_get_seconds(); /* assumes alive */ - lp->lp_last_query = 0; /* haven't asked NI yet */ - lp->lp_ping_timestamp = 0; - lp->lp_ping_feats = LNET_PING_FEAT_INVAL; - lp->lp_nid = nid; - lp->lp_cpt = cpt2; - lp->lp_refcount = 2; /* 1 for caller; 1 for hash */ - lp->lp_rtr_refcount = 0; + INIT_LIST_HEAD(&lp->lpni_txq); + INIT_LIST_HEAD(&lp->lpni_rtrq); + INIT_LIST_HEAD(&lp->lpni_routes); + + lp->lpni_notify = 0; + lp->lpni_notifylnd = 0; + lp->lpni_notifying = 0; + lp->lpni_alive_count = 0; + lp->lpni_timestamp = 0; + lp->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! 
*/ + lp->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ + lp->lpni_last_query = 0; /* haven't asked NI yet */ + lp->lpni_ping_timestamp = 0; + lp->lpni_ping_feats = LNET_PING_FEAT_INVAL; + lp->lpni_nid = nid; + lp->lpni_cpt = cpt2; + lp->lpni_refcount = 2; /* 1 for caller; 1 for hash */ + lp->lpni_rtr_refcount = 0; lnet_net_lock(cpt); @@ -335,13 +335,14 @@ lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) goto out; } - lp->lp_net = lnet_get_net_locked(LNET_NIDNET(lp->lp_nid)); - lp->lp_txcredits = - lp->lp_mintxcredits = lp->lp_net->net_tunables.lct_peer_tx_credits; - lp->lp_rtrcredits = - lp->lp_minrtrcredits = lnet_peer_buffer_credits(lp->lp_net); + lp->lpni_net = lnet_get_net_locked(LNET_NIDNET(lp->lpni_nid)); + lp->lpni_txcredits = + lp->lpni_mintxcredits = + lp->lpni_net->net_tunables.lct_peer_tx_credits; + lp->lpni_rtrcredits = + lp->lpni_minrtrcredits = lnet_peer_buffer_credits(lp->lpni_net); - list_add_tail(&lp->lp_hashlist, + list_add_tail(&lp->lpni_hashlist, &ptable->pt_hash[lnet_nid2peerhash(nid)]); ptable->pt_version++; *lpp = lp; @@ -349,7 +350,7 @@ lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) return 0; out: if (lp) - list_add(&lp->lp_hashlist, &ptable->pt_deathrow); + list_add(&lp->lpni_hashlist, &ptable->pt_deathrow); ptable->pt_number--; return rc; } @@ -373,13 +374,13 @@ lnet_debug_peer(lnet_nid_t nid) } if (lnet_isrouter(lp) || lnet_peer_aliveness_enabled(lp)) - aliveness = lp->lp_alive ? "up" : "down"; + aliveness = lp->lpni_alive ? 
"up" : "down"; CDEBUG(D_WARNING, "%-24s %4d %5s %5d %5d %5d %5d %5d %ld\n", - libcfs_nid2str(lp->lp_nid), lp->lp_refcount, - aliveness, lp->lp_net->net_tunables.lct_peer_tx_credits, - lp->lp_rtrcredits, lp->lp_minrtrcredits, - lp->lp_txcredits, lp->lp_mintxcredits, lp->lp_txqnob); + libcfs_nid2str(lp->lpni_nid), lp->lpni_refcount, + aliveness, lp->lpni_net->net_tunables.lct_peer_tx_credits, + lp->lpni_rtrcredits, lp->lpni_minrtrcredits, + lp->lpni_txcredits, lp->lpni_mintxcredits, lp->lpni_txqnob); lnet_peer_decref_locked(lp); @@ -420,7 +421,7 @@ lnet_get_peer_info(__u32 peer_index, __u64 *nid, for (j = 0; j < LNET_PEER_HASH_SIZE && !found; j++) { struct list_head *peers = &peer_table->pt_hash[j]; - list_for_each_entry(lp, peers, lp_hashlist) { + list_for_each_entry(lp, peers, lpni_hashlist) { if (peer_index-- > 0) continue; @@ -428,16 +429,16 @@ lnet_get_peer_info(__u32 peer_index, __u64 *nid, if (lnet_isrouter(lp) || lnet_peer_aliveness_enabled(lp)) snprintf(aliveness, LNET_MAX_STR_LEN, - lp->lp_alive ? "up" : "down"); + lp->lpni_alive ? 
"up" : "down"); - *nid = lp->lp_nid; - *refcount = lp->lp_refcount; + *nid = lp->lpni_nid; + *refcount = lp->lpni_refcount; *ni_peer_tx_credits = - lp->lp_net->net_tunables.lct_peer_tx_credits; - *peer_tx_credits = lp->lp_txcredits; - *peer_rtr_credits = lp->lp_rtrcredits; - *peer_min_rtr_credits = lp->lp_mintxcredits; - *peer_tx_qnob = lp->lp_txqnob; + lp->lpni_net->net_tunables.lct_peer_tx_credits; + *peer_tx_credits = lp->lpni_txcredits; + *peer_rtr_credits = lp->lpni_rtrcredits; + *peer_min_rtr_credits = lp->lpni_mintxcredits; + *peer_tx_qnob = lp->lpni_txqnob; found = true; } diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 2bbd1cf86a8c..2be1ffb6b720 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -103,30 +103,30 @@ void lnet_notify_locked(struct lnet_peer *lp, int notifylnd, int alive, time64_t when) { - if (lp->lp_timestamp > when) { /* out of date information */ + if (lp->lpni_timestamp > when) { /* out of date information */ CDEBUG(D_NET, "Out of date\n"); return; } - lp->lp_timestamp = when; /* update timestamp */ - lp->lp_ping_deadline = 0; /* disable ping timeout */ + lp->lpni_timestamp = when; /* update timestamp */ + lp->lpni_ping_deadline = 0; /* disable ping timeout */ - if (lp->lp_alive_count && /* got old news */ - (!lp->lp_alive) == (!alive)) { /* new date for old news */ + if (lp->lpni_alive_count && /* got old news */ + (!lp->lpni_alive) == (!alive)) { /* new date for old news */ CDEBUG(D_NET, "Old news\n"); return; } /* Flag that notification is outstanding */ - lp->lp_alive_count++; - lp->lp_alive = !(!alive); /* 1 bit! */ - lp->lp_notify = 1; - lp->lp_notifylnd |= notifylnd; - if (lp->lp_alive) - lp->lp_ping_feats = LNET_PING_FEAT_INVAL; /* reset */ + lp->lpni_alive_count++; + lp->lpni_alive = !(!alive); /* 1 bit! 
*/ + lp->lpni_notify = 1; + lp->lpni_notifylnd |= notifylnd; + if (lp->lpni_alive) + lp->lpni_ping_feats = LNET_PING_FEAT_INVAL; /* reset */ - CDEBUG(D_NET, "set %s %d\n", libcfs_nid2str(lp->lp_nid), alive); + CDEBUG(D_NET, "set %s %d\n", libcfs_nid2str(lp->lpni_nid), alive); } static void @@ -140,55 +140,56 @@ lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer *lp) * NB individual events can be missed; the only guarantee is that you * always get the most recent news */ - if (lp->lp_notifying || !ni) + if (lp->lpni_notifying || !ni) return; - lp->lp_notifying = 1; + lp->lpni_notifying = 1; - while (lp->lp_notify) { - alive = lp->lp_alive; - notifylnd = lp->lp_notifylnd; + while (lp->lpni_notify) { + alive = lp->lpni_alive; + notifylnd = lp->lpni_notifylnd; - lp->lp_notifylnd = 0; - lp->lp_notify = 0; + lp->lpni_notifylnd = 0; + lp->lpni_notify = 0; if (notifylnd && ni->ni_net->net_lnd->lnd_notify) { - lnet_net_unlock(lp->lp_cpt); + lnet_net_unlock(lp->lpni_cpt); /* * A new notification could happen now; I'll handle it * when control returns to me */ - ni->ni_net->net_lnd->lnd_notify(ni, lp->lp_nid, alive); + ni->ni_net->net_lnd->lnd_notify(ni, lp->lpni_nid, + alive); - lnet_net_lock(lp->lp_cpt); + lnet_net_lock(lp->lpni_cpt); } } - lp->lp_notifying = 0; + lp->lpni_notifying = 0; } static void lnet_rtr_addref_locked(struct lnet_peer *lp) { - LASSERT(lp->lp_refcount > 0); - LASSERT(lp->lp_rtr_refcount >= 0); + LASSERT(lp->lpni_refcount > 0); + LASSERT(lp->lpni_rtr_refcount >= 0); /* lnet_net_lock must be exclusively locked */ - lp->lp_rtr_refcount++; - if (lp->lp_rtr_refcount == 1) { + lp->lpni_rtr_refcount++; + if (lp->lpni_rtr_refcount == 1) { struct list_head *pos; /* a simple insertion sort */ list_for_each_prev(pos, &the_lnet.ln_routers) { struct lnet_peer *rtr; - rtr = list_entry(pos, struct lnet_peer, lp_rtr_list); - if (rtr->lp_nid < lp->lp_nid) + rtr = list_entry(pos, struct lnet_peer, lpni_rtr_list); + if (rtr->lpni_nid < lp->lpni_nid) break; } - 
list_add(&lp->lp_rtr_list, pos); + list_add(&lp->lpni_rtr_list, pos); /* addref for the_lnet.ln_routers */ lnet_peer_addref_locked(lp); the_lnet.ln_routers_version++; @@ -198,21 +199,21 @@ lnet_rtr_addref_locked(struct lnet_peer *lp) static void lnet_rtr_decref_locked(struct lnet_peer *lp) { - LASSERT(lp->lp_refcount > 0); - LASSERT(lp->lp_rtr_refcount > 0); + LASSERT(lp->lpni_refcount > 0); + LASSERT(lp->lpni_rtr_refcount > 0); /* lnet_net_lock must be exclusively locked */ - lp->lp_rtr_refcount--; - if (!lp->lp_rtr_refcount) { - LASSERT(list_empty(&lp->lp_routes)); + lp->lpni_rtr_refcount--; + if (!lp->lpni_rtr_refcount) { + LASSERT(list_empty(&lp->lpni_routes)); - if (lp->lp_rcd) { - list_add(&lp->lp_rcd->rcd_list, + if (lp->lpni_rcd) { + list_add(&lp->lpni_rcd->rcd_list, &the_lnet.ln_rcd_deathrow); - lp->lp_rcd = NULL; + lp->lpni_rcd = NULL; } - list_del(&lp->lp_rtr_list); + list_del(&lp->lpni_rtr_list); /* decref for the_lnet.ln_routers */ lnet_peer_decref_locked(lp); the_lnet.ln_routers_version++; @@ -279,7 +280,7 @@ lnet_add_route_to_rnet(struct lnet_remotenet *rnet, struct lnet_route *route) offset--; } list_add(&route->lr_list, e); - list_add(&route->lr_gwlist, &route->lr_gateway->lp_routes); + list_add(&route->lr_gwlist, &route->lr_gateway->lpni_routes); the_lnet.ln_remote_nets_version++; lnet_rtr_addref_locked(route->lr_gateway); @@ -364,14 +365,14 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, } /* our lookups must be true */ - LASSERT(route2->lr_gateway->lp_nid != gateway); + LASSERT(route2->lr_gateway->lpni_nid != gateway); } if (add_route) { lnet_peer_addref_locked(route->lr_gateway); /* +1 for notify */ lnet_add_route_to_rnet(rnet2, route); - ni = lnet_get_next_ni_locked(route->lr_gateway->lp_net, NULL); + ni = lnet_get_next_ni_locked(route->lr_gateway->lpni_net, NULL); lnet_net_unlock(LNET_LOCK_EX); /* XXX Assume alive */ @@ -426,12 +427,12 @@ lnet_check_routes(void) continue; } - if (route->lr_gateway->lp_net == - 
route2->lr_gateway->lp_net) + if (route->lr_gateway->lpni_net == + route2->lr_gateway->lpni_net) continue; - nid1 = route->lr_gateway->lp_nid; - nid2 = route2->lr_gateway->lp_nid; + nid1 = route->lr_gateway->lpni_nid; + nid2 = route2->lr_gateway->lpni_nid; net = rnet->lrn_net; lnet_net_unlock(cpt); @@ -481,7 +482,7 @@ lnet_del_route(__u32 net, lnet_nid_t gw_nid) list_for_each_entry(route, &rnet->lrn_routes, lr_list) { gateway = route->lr_gateway; if (!(gw_nid == LNET_NID_ANY || - gw_nid == gateway->lp_nid)) + gw_nid == gateway->lpni_nid)) continue; list_del(&route->lr_list); @@ -575,7 +576,7 @@ lnet_get_route(int idx, __u32 *net, __u32 *hops, *net = rnet->lrn_net; *hops = route->lr_hops; *priority = route->lr_priority; - *gateway = route->lr_gateway->lp_nid; + *gateway = route->lr_gateway->lpni_nid; *alive = lnet_is_route_alive(route); lnet_net_unlock(cpt); return 0; @@ -616,7 +617,7 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) struct lnet_peer *gw = rcd->rcd_gateway; struct lnet_route *rte; - if (!gw->lp_alive) + if (!gw->lpni_alive) return; if (info->pi_magic == __swab32(LNET_PROTO_PING_MAGIC)) @@ -625,27 +626,27 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) /* NB always racing with network! 
*/ if (info->pi_magic != LNET_PROTO_PING_MAGIC) { CDEBUG(D_NET, "%s: Unexpected magic %08x\n", - libcfs_nid2str(gw->lp_nid), info->pi_magic); - gw->lp_ping_feats = LNET_PING_FEAT_INVAL; + libcfs_nid2str(gw->lpni_nid), info->pi_magic); + gw->lpni_ping_feats = LNET_PING_FEAT_INVAL; return; } - gw->lp_ping_feats = info->pi_features; - if (!(gw->lp_ping_feats & LNET_PING_FEAT_MASK)) { + gw->lpni_ping_feats = info->pi_features; + if (!(gw->lpni_ping_feats & LNET_PING_FEAT_MASK)) { CDEBUG(D_NET, "%s: Unexpected features 0x%x\n", - libcfs_nid2str(gw->lp_nid), gw->lp_ping_feats); + libcfs_nid2str(gw->lpni_nid), gw->lpni_ping_feats); return; /* nothing I can understand */ } - if (!(gw->lp_ping_feats & LNET_PING_FEAT_NI_STATUS)) + if (!(gw->lpni_ping_feats & LNET_PING_FEAT_NI_STATUS)) return; /* can't carry NI status info */ - list_for_each_entry(rte, &gw->lp_routes, lr_gwlist) { + list_for_each_entry(rte, &gw->lpni_routes, lr_gwlist) { int down = 0; int up = 0; int i; - if (gw->lp_ping_feats & LNET_PING_FEAT_RTE_DISABLED) { + if (gw->lpni_ping_feats & LNET_PING_FEAT_RTE_DISABLED) { rte->lr_downis = 1; continue; } @@ -656,8 +657,8 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) if (nid == LNET_NID_ANY) { CDEBUG(D_NET, "%s: unexpected LNET_NID_ANY\n", - libcfs_nid2str(gw->lp_nid)); - gw->lp_ping_feats = LNET_PING_FEAT_INVAL; + libcfs_nid2str(gw->lpni_nid)); + gw->lpni_ping_feats = LNET_PING_FEAT_INVAL; return; } @@ -678,8 +679,8 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) } CDEBUG(D_NET, "%s: Unexpected status 0x%x\n", - libcfs_nid2str(gw->lp_nid), stat->ns_status); - gw->lp_ping_feats = LNET_PING_FEAT_INVAL; + libcfs_nid2str(gw->lpni_nid), stat->ns_status); + gw->lpni_ping_feats = LNET_PING_FEAT_INVAL; return; } @@ -722,14 +723,14 @@ lnet_router_checker_event(struct lnet_event *event) * places need to hold both locks at the same time, please take * care of lock ordering */ - lnet_net_lock(lp->lp_cpt); - if (!lnet_isrouter(lp) || lp->lp_rcd != rcd) { + 
lnet_net_lock(lp->lpni_cpt); + if (!lnet_isrouter(lp) || lp->lpni_rcd != rcd) { /* ignore if no longer a router or rcd is replaced */ goto out; } if (event->type == LNET_EVENT_SEND) { - lp->lp_ping_notsent = 0; + lp->lpni_ping_notsent = 0; if (!event->status) goto out; } @@ -753,7 +754,7 @@ lnet_router_checker_event(struct lnet_event *event) lnet_parse_rc_info(rcd); out: - lnet_net_unlock(lp->lp_cpt); + lnet_net_unlock(lp->lpni_cpt); } static void @@ -768,8 +769,8 @@ lnet_wait_known_routerstate(void) int cpt = lnet_net_lock_current(); all_known = 1; - list_for_each_entry(rtr, &the_lnet.ln_routers, lp_rtr_list) { - if (!rtr->lp_alive_count) { + list_for_each_entry(rtr, &the_lnet.ln_routers, lpni_rtr_list) { + if (!rtr->lpni_alive_count) { all_known = 0; break; } @@ -789,8 +790,8 @@ lnet_router_ni_update_locked(struct lnet_peer *gw, __u32 net) { struct lnet_route *rte; - if ((gw->lp_ping_feats & LNET_PING_FEAT_NI_STATUS)) { - list_for_each_entry(rte, &gw->lp_routes, lr_gwlist) { + if ((gw->lpni_ping_feats & LNET_PING_FEAT_NI_STATUS)) { + list_for_each_entry(rte, &gw->lpni_routes, lr_gwlist) { if (rte->lr_net == net) { rte->lr_downis = 0; break; @@ -849,7 +850,7 @@ lnet_destroy_rc_data(struct lnet_rc_data *rcd) LASSERT(LNetMDHandleIsInvalid(rcd->rcd_mdh)); if (rcd->rcd_gateway) { - int cpt = rcd->rcd_gateway->lp_cpt; + int cpt = rcd->rcd_gateway->lpni_cpt; lnet_net_lock(cpt); lnet_peer_decref_locked(rcd->rcd_gateway); @@ -870,7 +871,7 @@ lnet_create_rc_data_locked(struct lnet_peer *gateway) int rc; int i; - lnet_net_unlock(gateway->lp_cpt); + lnet_net_unlock(gateway->lpni_cpt); rcd = kzalloc(sizeof(*rcd), GFP_NOFS); if (!rcd) @@ -904,17 +905,17 @@ lnet_create_rc_data_locked(struct lnet_peer *gateway) } LASSERT(!rc); - lnet_net_lock(gateway->lp_cpt); + lnet_net_lock(gateway->lpni_cpt); /* router table changed or someone has created rcd for this gateway */ - if (!lnet_isrouter(gateway) || gateway->lp_rcd) { - lnet_net_unlock(gateway->lp_cpt); + if 
(!lnet_isrouter(gateway) || gateway->lpni_rcd) { + lnet_net_unlock(gateway->lpni_cpt); goto out; } lnet_peer_addref_locked(gateway); rcd->rcd_gateway = gateway; - gateway->lp_rcd = rcd; - gateway->lp_ping_notsent = 0; + gateway->lpni_rcd = rcd; + gateway->lpni_ping_notsent = 0; return rcd; @@ -927,8 +928,8 @@ lnet_create_rc_data_locked(struct lnet_peer *gateway) lnet_destroy_rc_data(rcd); } - lnet_net_lock(gateway->lp_cpt); - return gateway->lp_rcd; + lnet_net_lock(gateway->lpni_cpt); + return gateway->lpni_rcd; } static int @@ -936,7 +937,7 @@ lnet_router_check_interval(struct lnet_peer *rtr) { int secs; - secs = rtr->lp_alive ? live_router_check_interval : + secs = rtr->lpni_alive ? live_router_check_interval : dead_router_check_interval; if (secs < 0) secs = 0; @@ -954,12 +955,12 @@ lnet_ping_router_locked(struct lnet_peer *rtr) lnet_peer_addref_locked(rtr); - if (rtr->lp_ping_deadline && /* ping timed out? */ - now > rtr->lp_ping_deadline) + if (rtr->lpni_ping_deadline && /* ping timed out? */ + now > rtr->lpni_ping_deadline) lnet_notify_locked(rtr, 1, 0, now); /* Run any outstanding notifications */ - ni = lnet_get_next_ni_locked(rtr->lp_net, NULL); + ni = lnet_get_next_ni_locked(rtr->lpni_net, NULL); lnet_ni_notify_locked(ni, rtr); if (!lnet_isrouter(rtr) || @@ -969,8 +970,8 @@ lnet_ping_router_locked(struct lnet_peer *rtr) return; } - rcd = rtr->lp_rcd ? - rtr->lp_rcd : lnet_create_rc_data_locked(rtr); + rcd = rtr->lpni_rcd ? 
+ rtr->lpni_rcd : lnet_create_rc_data_locked(rtr); if (!rcd) return; @@ -978,39 +979,40 @@ lnet_ping_router_locked(struct lnet_peer *rtr) secs = lnet_router_check_interval(rtr); CDEBUG(D_NET, - "rtr %s %lldd: deadline %lld ping_notsent %d alive %d alive_count %d lp_ping_timestamp %lld\n", - libcfs_nid2str(rtr->lp_nid), secs, - rtr->lp_ping_deadline, rtr->lp_ping_notsent, - rtr->lp_alive, rtr->lp_alive_count, rtr->lp_ping_timestamp); - - if (secs && !rtr->lp_ping_notsent && - now > rtr->lp_ping_timestamp + secs) { + "rtr %s %lldd: deadline %lld ping_notsent %d alive %d alive_count %d lpni_ping_timestamp %lld\n", + libcfs_nid2str(rtr->lpni_nid), secs, + rtr->lpni_ping_deadline, rtr->lpni_ping_notsent, + rtr->lpni_alive, rtr->lpni_alive_count, + rtr->lpni_ping_timestamp); + + if (secs && !rtr->lpni_ping_notsent && + now > rtr->lpni_ping_timestamp + secs) { int rc; struct lnet_process_id id; struct lnet_handle_md mdh; - id.nid = rtr->lp_nid; + id.nid = rtr->lpni_nid; id.pid = LNET_PID_LUSTRE; CDEBUG(D_NET, "Check: %s\n", libcfs_id2str(id)); - rtr->lp_ping_notsent = 1; - rtr->lp_ping_timestamp = now; + rtr->lpni_ping_notsent = 1; + rtr->lpni_ping_timestamp = now; mdh = rcd->rcd_mdh; - if (!rtr->lp_ping_deadline) { - rtr->lp_ping_deadline = ktime_get_seconds() + + if (!rtr->lpni_ping_deadline) { + rtr->lpni_ping_deadline = ktime_get_seconds() + router_ping_timeout; } - lnet_net_unlock(rtr->lp_cpt); + lnet_net_unlock(rtr->lpni_cpt); rc = LNetGet(LNET_NID_ANY, mdh, id, LNET_RESERVED_PORTAL, LNET_PROTO_PING_MATCHBITS, 0); - lnet_net_lock(rtr->lp_cpt); + lnet_net_lock(rtr->lpni_cpt); if (rc) - rtr->lp_ping_notsent = 0; /* no event pending */ + rtr->lpni_ping_notsent = 0; /* no event pending */ } lnet_peer_decref_locked(rtr); @@ -1106,14 +1108,14 @@ lnet_prune_rc_data(int wait_unlink) if (the_lnet.ln_rc_state != LNET_RC_STATE_RUNNING) { /* router checker is stopping, prune all */ list_for_each_entry(lp, &the_lnet.ln_routers, - lp_rtr_list) { - if (!lp->lp_rcd) + 
lpni_rtr_list) { + if (!lp->lpni_rcd) continue; - LASSERT(list_empty(&lp->lp_rcd->rcd_list)); - list_add(&lp->lp_rcd->rcd_list, + LASSERT(list_empty(&lp->lpni_rcd->rcd_list)); + list_add(&lp->lpni_rcd->rcd_list, &the_lnet.ln_rcd_deathrow); - lp->lp_rcd = NULL; + lp->lpni_rcd = NULL; } } @@ -1206,8 +1208,8 @@ lnet_router_checker(void *arg) rescan: version = the_lnet.ln_routers_version; - list_for_each_entry(rtr, &the_lnet.ln_routers, lp_rtr_list) { - cpt2 = rtr->lp_cpt; + list_for_each_entry(rtr, &the_lnet.ln_routers, lpni_rtr_list) { + cpt2 = rtr->lpni_cpt; if (cpt != cpt2) { lnet_net_unlock(cpt); cpt = cpt2; @@ -1745,8 +1747,8 @@ lnet_notify(struct lnet_ni *ni, lnet_nid_t nid, int alive, time64_t when) * call us with when == _time_when_the_node_was_booted_ if * no connections were successfully established */ - if (ni && !alive && when < lp->lp_last_alive) - when = lp->lp_last_alive; + if (ni && !alive && when < lp->lpni_last_alive) + when = lp->lpni_last_alive; lnet_notify_locked(lp, !ni, alive, when); diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c index 52714b898aac..01c9ad44266f 100644 --- a/drivers/staging/lustre/lnet/lnet/router_proc.c +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c @@ -214,7 +214,7 @@ static int proc_lnet_routes(struct ctl_table *table, int write, __u32 net = rnet->lrn_net; __u32 hops = route->lr_hops; unsigned int priority = route->lr_priority; - lnet_nid_t nid = route->lr_gateway->lp_nid; + lnet_nid_t nid = route->lr_gateway->lpni_nid; int alive = lnet_is_route_alive(route); s += snprintf(s, tmpstr + tmpsiz - s, @@ -306,7 +306,7 @@ static int proc_lnet_routers(struct ctl_table *table, int write, while (r != &the_lnet.ln_routers) { struct lnet_peer *lp; - lp = list_entry(r, struct lnet_peer, lp_rtr_list); + lp = list_entry(r, struct lnet_peer, lpni_rtr_list); if (!skip) { peer = lp; break; @@ -317,21 +317,21 @@ static int proc_lnet_routers(struct ctl_table *table, int write, 
} if (peer) { - lnet_nid_t nid = peer->lp_nid; + lnet_nid_t nid = peer->lpni_nid; time64_t now = ktime_get_seconds(); - time64_t deadline = peer->lp_ping_deadline; - int nrefs = peer->lp_refcount; - int nrtrrefs = peer->lp_rtr_refcount; - int alive_cnt = peer->lp_alive_count; - int alive = peer->lp_alive; - int pingsent = !peer->lp_ping_notsent; - time64_t last_ping = now - peer->lp_ping_timestamp; + time64_t deadline = peer->lpni_ping_deadline; + int nrefs = peer->lpni_refcount; + int nrtrrefs = peer->lpni_rtr_refcount; + int alive_cnt = peer->lpni_alive_count; + int alive = peer->lpni_alive; + int pingsent = !peer->lpni_ping_notsent; + time64_t last_ping = now - peer->lpni_ping_timestamp; int down_ni = 0; struct lnet_route *rtr; - if ((peer->lp_ping_feats & + if ((peer->lpni_ping_feats & LNET_PING_FEAT_NI_STATUS)) { - list_for_each_entry(rtr, &peer->lp_routes, + list_for_each_entry(rtr, &peer->lpni_routes, lr_gwlist) { /* * downis on any route should be the @@ -452,16 +452,16 @@ static int proc_lnet_peers(struct ctl_table *table, int write, struct lnet_peer *lp; lp = list_entry(p, struct lnet_peer, - lp_hashlist); + lpni_hashlist); if (!skip) { peer = lp; /* * minor optimization: start from idx+1 * on next iteration if we've just - * drained lp_hashlist + * drained lpni_hashlist */ - if (lp->lp_hashlist.next == + if (lp->lpni_hashlist.next == &ptable->pt_hash[hash]) { hoff = 1; hash++; @@ -473,7 +473,7 @@ static int proc_lnet_peers(struct ctl_table *table, int write, } skip--; - p = lp->lp_hashlist.next; + p = lp->lpni_hashlist.next; } if (peer) @@ -485,25 +485,25 @@ static int proc_lnet_peers(struct ctl_table *table, int write, } if (peer) { - lnet_nid_t nid = peer->lp_nid; - int nrefs = peer->lp_refcount; + lnet_nid_t nid = peer->lpni_nid; + int nrefs = peer->lpni_refcount; time64_t lastalive = -1; char *aliveness = "NA"; - int maxcr = peer->lp_net->net_tunables.lct_peer_tx_credits; - int txcr = peer->lp_txcredits; - int mintxcr = peer->lp_mintxcredits; - int 
rtrcr = peer->lp_rtrcredits; - int minrtrcr = peer->lp_minrtrcredits; - int txqnob = peer->lp_txqnob; + int maxcr = peer->lpni_net->net_tunables.lct_peer_tx_credits; + int txcr = peer->lpni_txcredits; + int mintxcr = peer->lpni_mintxcredits; + int rtrcr = peer->lpni_rtrcredits; + int minrtrcr = peer->lpni_minrtrcredits; + int txqnob = peer->lpni_txqnob; if (lnet_isrouter(peer) || lnet_peer_aliveness_enabled(peer)) - aliveness = peer->lp_alive ? "up" : "down"; + aliveness = peer->lpni_alive ? "up" : "down"; if (lnet_peer_aliveness_enabled(peer)) { time64_t now = ktime_get_seconds(); - lastalive = now - peer->lp_last_alive; + lastalive = now - peer->lpni_last_alive; /* No need to mess up peers contents with * arbitrarily long integers - it suffices to

From patchwork Tue Sep 25 01:07:14 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:14 +1000
Message-ID: <153783763493.32103.9862696546795848039.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 02/34] lnet: change struct lnet_peer to struct lnet_peer_ni

Also remove the typedef:

sed -i -e 's/struct lnet_peer\b/struct lnet_peer_ni/g' -e 's/lnet_peer_t\b/struct lnet_peer_ni/g' `git ls-files lnet`

Then edit lib-types.h to remove the typedef, and clean up some long lines.
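The word-boundary `\b` in those sed expressions is what keeps longer identifiers such as `lnet_peer_table` untouched while every bare `struct lnet_peer` and `lnet_peer_t` is rewritten. A minimal sketch of the same substitution against a scratch file (the file name and its contents are illustrative, not from the patch; `\b` and suffix-less `-i` assume GNU sed):

```shell
# Scratch file with the two patterns that must change and one
# longer identifier (lnet_peer_table) that must survive.
cat > /tmp/lpni_demo.c <<'EOF'
struct lnet_peer *lp;
lnet_peer_t *old_style;
struct lnet_peer_table *ptable;
EOF

# The same two expressions the commit message runs over `git ls-files lnet`,
# here applied to the single scratch file.
sed -i -e 's/struct lnet_peer\b/struct lnet_peer_ni/g' \
       -e 's/lnet_peer_t\b/struct lnet_peer_ni/g' /tmp/lpni_demo.c

cat /tmp/lpni_demo.c
```

The first two declarations come out as `struct lnet_peer_ni`, while `struct lnet_peer_table` is left alone because `_` is a word character, so no boundary follows `lnet_peer` there.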
This is part of Commit: 58091af960fe ("LU-7734 lnet: Multi-Rail peer split") from upstream lustre, where it is marked: Signed-off-by: Amir Shehata WC-bug-id: https://jira.whamcloud.com/browse/LU-7734 Reviewed-on: http://review.whamcloud.com/18293 Reviewed-by: Olaf Weber Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 20 ++++++----- .../staging/lustre/include/linux/lnet/lib-types.h | 10 +++--- drivers/staging/lustre/lnet/lnet/lib-move.c | 26 +++++++-------- drivers/staging/lustre/lnet/lnet/peer.c | 31 +++++++++--------- drivers/staging/lustre/lnet/lnet/router.c | 35 ++++++++++---------- drivers/staging/lustre/lnet/lnet/router_proc.c | 12 +++---- 6 files changed, 68 insertions(+), 66 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 9b54a3d72290..ef53638e20f6 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -311,16 +311,16 @@ lnet_handle2me(struct lnet_handle_me *handle) } static inline void -lnet_peer_addref_locked(struct lnet_peer *lp) +lnet_peer_addref_locked(struct lnet_peer_ni *lp) { LASSERT(lp->lpni_refcount > 0); lp->lpni_refcount++; } -void lnet_destroy_peer_locked(struct lnet_peer *lp); +void lnet_destroy_peer_locked(struct lnet_peer_ni *lp); static inline void -lnet_peer_decref_locked(struct lnet_peer *lp) +lnet_peer_decref_locked(struct lnet_peer_ni *lp) { LASSERT(lp->lpni_refcount > 0); lp->lpni_refcount--; @@ -329,7 +329,7 @@ lnet_peer_decref_locked(struct lnet_peer *lp) } static inline int -lnet_isrouter(struct lnet_peer *lp) +lnet_isrouter(struct lnet_peer_ni *lp) { return lp->lpni_rtr_refcount ? 
1 : 0; } @@ -410,7 +410,7 @@ void lnet_lib_exit(void); int lnet_notify(struct lnet_ni *ni, lnet_nid_t peer, int alive, time64_t when); -void lnet_notify_locked(struct lnet_peer *lp, int notifylnd, int alive, +void lnet_notify_locked(struct lnet_peer_ni *lp, int notifylnd, int alive, time64_t when); int lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway_nid, unsigned int priority); @@ -624,7 +624,7 @@ int lnet_peer_buffer_credits(struct lnet_net *net); int lnet_router_checker_start(void); void lnet_router_checker_stop(void); -void lnet_router_ni_update_locked(struct lnet_peer *gw, __u32 net); +void lnet_router_ni_update_locked(struct lnet_peer_ni *gw, __u32 net); void lnet_swap_pinginfo(struct lnet_ping_info *info); int lnet_parse_ip2nets(char **networksp, char *ip2nets); @@ -635,9 +635,9 @@ bool lnet_net_unique(__u32 net_id, struct list_head *nilist, struct lnet_net **net); bool lnet_ni_unique_net(struct list_head *nilist, char *iface); -int lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt); -struct lnet_peer *lnet_find_peer_locked(struct lnet_peer_table *ptable, - lnet_nid_t nid); +int lnet_nid2peer_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); +struct lnet_peer_ni *lnet_find_peer_locked(struct lnet_peer_table *ptable, + lnet_nid_t nid); void lnet_peer_tables_cleanup(struct lnet_ni *ni); void lnet_peer_tables_destroy(void); int lnet_peer_tables_create(void); @@ -650,7 +650,7 @@ int lnet_get_peer_info(__u32 peer_index, __u64 *nid, __u32 *peer_tx_qnob); static inline void -lnet_peer_set_alive(struct lnet_peer *lp) +lnet_peer_set_alive(struct lnet_peer_ni *lp) { lp->lpni_last_query = ktime_get_seconds(); lp->lpni_last_alive = lp->lpni_last_query; diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 59a1a2620675..4b26801d7d29 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ 
-93,8 +93,8 @@ struct lnet_msg { unsigned int msg_onactivelist:1; /* on the activelist */ unsigned int msg_rdma_get:1; - struct lnet_peer *msg_txpeer; /* peer I'm sending to */ - struct lnet_peer *msg_rxpeer; /* peer I received from */ + struct lnet_peer_ni *msg_txpeer; /* peer I'm sending to */ + struct lnet_peer_ni *msg_rxpeer; /* peer I received from */ void *msg_private; struct lnet_libmd *msg_md; @@ -378,12 +378,12 @@ struct lnet_rc_data { /* ping buffer MD */ struct lnet_handle_md rcd_mdh; /* reference to gateway */ - struct lnet_peer *rcd_gateway; + struct lnet_peer_ni *rcd_gateway; /* ping buffer */ struct lnet_ping_info *rcd_pinginfo; }; -struct lnet_peer { +struct lnet_peer_ni { /* chain on peer hash */ struct list_head lpni_hashlist; /* messages blocking for tx credits */ @@ -474,7 +474,7 @@ struct lnet_route { /* chain on gateway */ struct list_head lr_gwlist; /* router node */ - struct lnet_peer *lr_gateway; + struct lnet_peer_ni *lr_gateway; /* remote network number */ __u32 lr_net; /* sequence for round-robin */ diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 5879a109d46a..4425406e441b 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -491,7 +491,7 @@ lnet_ni_eager_recv(struct lnet_ni *ni, struct lnet_msg *msg) /* NB: caller shall hold a ref on 'lp' as I'd drop lnet_net_lock */ static void -lnet_ni_query_locked(struct lnet_ni *ni, struct lnet_peer *lp) +lnet_ni_query_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) { time64_t last_alive = 0; @@ -510,7 +510,7 @@ lnet_ni_query_locked(struct lnet_ni *ni, struct lnet_peer *lp) /* NB: always called with lnet_net_lock held */ static inline int -lnet_peer_is_alive(struct lnet_peer *lp, unsigned long now) +lnet_peer_is_alive(struct lnet_peer_ni *lp, unsigned long now) { int alive; time64_t deadline; @@ -544,7 +544,7 @@ lnet_peer_is_alive(struct lnet_peer *lp, unsigned long now) * may 
drop the lnet_net_lock */ static int -lnet_peer_alive_locked(struct lnet_ni *ni, struct lnet_peer *lp) +lnet_peer_alive_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) { time64_t now = ktime_get_seconds(); @@ -599,7 +599,7 @@ lnet_peer_alive_locked(struct lnet_ni *ni, struct lnet_peer *lp) static int lnet_post_send_locked(struct lnet_msg *msg, int do_send) { - struct lnet_peer *lp = msg->msg_txpeer; + struct lnet_peer_ni *lp = msg->msg_txpeer; struct lnet_ni *ni = msg->msg_txni; int cpt = msg->msg_tx_cpt; struct lnet_tx_queue *tq = ni->ni_tx_queues[cpt]; @@ -710,7 +710,7 @@ lnet_post_routed_recv_locked(struct lnet_msg *msg, int do_recv) * I return LNET_CREDIT_WAIT if msg blocked and LNET_CREDIT_OK if * received or OK to receive */ - struct lnet_peer *lp = msg->msg_rxpeer; + struct lnet_peer_ni *lp = msg->msg_rxpeer; struct lnet_rtrbufpool *rbp; struct lnet_rtrbuf *rb; @@ -780,7 +780,7 @@ lnet_post_routed_recv_locked(struct lnet_msg *msg, int do_recv) void lnet_return_tx_credits_locked(struct lnet_msg *msg) { - struct lnet_peer *txpeer = msg->msg_txpeer; + struct lnet_peer_ni *txpeer = msg->msg_txpeer; struct lnet_msg *msg2; struct lnet_ni *txni = msg->msg_txni; @@ -881,7 +881,7 @@ lnet_drop_routed_msgs_locked(struct list_head *list, int cpt) void lnet_return_rx_credits_locked(struct lnet_msg *msg) { - struct lnet_peer *rxpeer = msg->msg_rxpeer; + struct lnet_peer_ni *rxpeer = msg->msg_rxpeer; struct lnet_ni *rxni = msg->msg_rxni; struct lnet_msg *msg2; @@ -971,8 +971,8 @@ lnet_return_rx_credits_locked(struct lnet_msg *msg) static int lnet_compare_routes(struct lnet_route *r1, struct lnet_route *r2) { - struct lnet_peer *p1 = r1->lr_gateway; - struct lnet_peer *p2 = r2->lr_gateway; + struct lnet_peer_ni *p1 = r1->lr_gateway; + struct lnet_peer_ni *p2 = r2->lr_gateway; int r1_hops = (r1->lr_hops == LNET_UNDEFINED_HOPS) ? 1 : r1->lr_hops; int r2_hops = (r2->lr_hops == LNET_UNDEFINED_HOPS) ? 
1 : r2->lr_hops; @@ -1006,7 +1006,7 @@ lnet_compare_routes(struct lnet_route *r1, struct lnet_route *r2) return -ERANGE; } -static struct lnet_peer * +static struct lnet_peer_ni * lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, lnet_nid_t rtr_nid) { @@ -1014,8 +1014,8 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, struct lnet_route *route; struct lnet_route *best_route; struct lnet_route *last_route; - struct lnet_peer *lpni_best; - struct lnet_peer *lp; + struct lnet_peer_ni *lpni_best; + struct lnet_peer_ni *lp; int rc; /* @@ -1076,7 +1076,7 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) lnet_nid_t dst_nid = msg->msg_target.nid; struct lnet_ni *src_ni; struct lnet_ni *local_ni; - struct lnet_peer *lp; + struct lnet_peer_ni *lp; int cpt; int cpt2; int rc; diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 619d016b1d89..67614309f242 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -106,8 +106,8 @@ lnet_peer_table_cleanup_locked(struct lnet_ni *ni, struct lnet_peer_table *ptable) { int i; - struct lnet_peer *lp; - struct lnet_peer *tmp; + struct lnet_peer_ni *lp; + struct lnet_peer_ni *tmp; for (i = 0; i < LNET_PEER_HASH_SIZE; i++) { list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i], @@ -146,8 +146,8 @@ lnet_peer_table_del_rtrs_locked(struct lnet_ni *ni, struct lnet_peer_table *ptable, int cpt_locked) { - struct lnet_peer *lp; - struct lnet_peer *tmp; + struct lnet_peer_ni *lp; + struct lnet_peer_ni *tmp; lnet_nid_t lpni_nid; int i; @@ -174,7 +174,7 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) { struct lnet_peer_table *ptable; struct list_head deathrow; - struct lnet_peer *lp; + struct lnet_peer_ni *lp; int i; INIT_LIST_HEAD(&deathrow); @@ -209,14 +209,15 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) } while (!list_empty(&deathrow)) { - lp = list_entry(deathrow.next, struct lnet_peer, 
lpni_hashlist); + lp = list_entry(deathrow.next, struct lnet_peer_ni, + lpni_hashlist); list_del(&lp->lpni_hashlist); kfree(lp); } } void -lnet_destroy_peer_locked(struct lnet_peer *lp) +lnet_destroy_peer_locked(struct lnet_peer_ni *lp) { struct lnet_peer_table *ptable; @@ -237,11 +238,11 @@ lnet_destroy_peer_locked(struct lnet_peer *lp) ptable->pt_zombies--; } -struct lnet_peer * +struct lnet_peer_ni * lnet_find_peer_locked(struct lnet_peer_table *ptable, lnet_nid_t nid) { struct list_head *peers; - struct lnet_peer *lp; + struct lnet_peer_ni *lp; LASSERT(!the_lnet.ln_shutdown); @@ -257,11 +258,11 @@ lnet_find_peer_locked(struct lnet_peer_table *ptable, lnet_nid_t nid) } int -lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) +lnet_nid2peer_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt) { struct lnet_peer_table *ptable; - struct lnet_peer *lp = NULL; - struct lnet_peer *lp2; + struct lnet_peer_ni *lp = NULL; + struct lnet_peer_ni *lp2; int cpt2; int rc = 0; @@ -281,7 +282,7 @@ lnet_nid2peer_locked(struct lnet_peer **lpp, lnet_nid_t nid, int cpt) if (!list_empty(&ptable->pt_deathrow)) { lp = list_entry(ptable->pt_deathrow.next, - struct lnet_peer, lpni_hashlist); + struct lnet_peer_ni, lpni_hashlist); list_del(&lp->lpni_hashlist); } @@ -359,7 +360,7 @@ void lnet_debug_peer(lnet_nid_t nid) { char *aliveness = "NA"; - struct lnet_peer *lp; + struct lnet_peer_ni *lp; int rc; int cpt; @@ -396,7 +397,7 @@ lnet_get_peer_info(__u32 peer_index, __u64 *nid, __u32 *peer_tx_qnob) { struct lnet_peer_table *peer_table; - struct lnet_peer *lp; + struct lnet_peer_ni *lp; bool found = false; int lncpt, j; diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 2be1ffb6b720..31685406dcc3 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -100,7 +100,7 @@ lnet_peers_start_down(void) } void -lnet_notify_locked(struct lnet_peer *lp, int notifylnd, int 
alive, +lnet_notify_locked(struct lnet_peer_ni *lp, int notifylnd, int alive, time64_t when) { if (lp->lpni_timestamp > when) { /* out of date information */ @@ -130,7 +130,7 @@ lnet_notify_locked(struct lnet_peer *lp, int notifylnd, int alive, } static void -lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer *lp) +lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) { int alive; int notifylnd; @@ -170,7 +170,7 @@ lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer *lp) } static void -lnet_rtr_addref_locked(struct lnet_peer *lp) +lnet_rtr_addref_locked(struct lnet_peer_ni *lp) { LASSERT(lp->lpni_refcount > 0); LASSERT(lp->lpni_rtr_refcount >= 0); @@ -182,9 +182,10 @@ lnet_rtr_addref_locked(struct lnet_peer *lp) /* a simple insertion sort */ list_for_each_prev(pos, &the_lnet.ln_routers) { - struct lnet_peer *rtr; + struct lnet_peer_ni *rtr; - rtr = list_entry(pos, struct lnet_peer, lpni_rtr_list); + rtr = list_entry(pos, struct lnet_peer_ni, + lpni_rtr_list); if (rtr->lpni_nid < lp->lpni_nid) break; } @@ -197,7 +198,7 @@ lnet_rtr_addref_locked(struct lnet_peer *lp) } static void -lnet_rtr_decref_locked(struct lnet_peer *lp) +lnet_rtr_decref_locked(struct lnet_peer_ni *lp) { LASSERT(lp->lpni_refcount > 0); LASSERT(lp->lpni_rtr_refcount > 0); @@ -453,7 +454,7 @@ lnet_check_routes(void) int lnet_del_route(__u32 net, lnet_nid_t gw_nid) { - struct lnet_peer *gateway; + struct lnet_peer_ni *gateway; struct lnet_remotenet *rnet; struct lnet_route *route; int rc = -ENOENT; @@ -614,7 +615,7 @@ static void lnet_parse_rc_info(struct lnet_rc_data *rcd) { struct lnet_ping_info *info = rcd->rcd_pinginfo; - struct lnet_peer *gw = rcd->rcd_gateway; + struct lnet_peer_ni *gw = rcd->rcd_gateway; struct lnet_route *rte; if (!gw->lpni_alive) @@ -703,7 +704,7 @@ static void lnet_router_checker_event(struct lnet_event *event) { struct lnet_rc_data *rcd = event->md.user_ptr; - struct lnet_peer *lp; + struct lnet_peer_ni *lp; LASSERT(rcd); @@ -760,7 +761,7 @@ 
lnet_router_checker_event(struct lnet_event *event) static void lnet_wait_known_routerstate(void) { - struct lnet_peer *rtr; + struct lnet_peer_ni *rtr; int all_known; LASSERT(the_lnet.ln_rc_state == LNET_RC_STATE_RUNNING); @@ -786,7 +787,7 @@ lnet_wait_known_routerstate(void) } void -lnet_router_ni_update_locked(struct lnet_peer *gw, __u32 net) +lnet_router_ni_update_locked(struct lnet_peer_ni *gw, __u32 net) { struct lnet_route *rte; @@ -863,7 +864,7 @@ lnet_destroy_rc_data(struct lnet_rc_data *rcd) } static struct lnet_rc_data * -lnet_create_rc_data_locked(struct lnet_peer *gateway) +lnet_create_rc_data_locked(struct lnet_peer_ni *gateway) { struct lnet_rc_data *rcd = NULL; struct lnet_ping_info *pi; @@ -933,7 +934,7 @@ lnet_create_rc_data_locked(struct lnet_peer *gateway) } static int -lnet_router_check_interval(struct lnet_peer *rtr) +lnet_router_check_interval(struct lnet_peer_ni *rtr) { int secs; @@ -946,7 +947,7 @@ lnet_router_check_interval(struct lnet_peer *rtr) } static void -lnet_ping_router_locked(struct lnet_peer *rtr) +lnet_ping_router_locked(struct lnet_peer_ni *rtr) { struct lnet_rc_data *rcd = NULL; time64_t now = ktime_get_seconds(); @@ -1092,7 +1093,7 @@ lnet_prune_rc_data(int wait_unlink) { struct lnet_rc_data *rcd; struct lnet_rc_data *tmp; - struct lnet_peer *lp; + struct lnet_peer_ni *lp; struct list_head head; int i = 2; @@ -1197,7 +1198,7 @@ lnet_router_checker_active(void) static int lnet_router_checker(void *arg) { - struct lnet_peer *rtr; + struct lnet_peer_ni *rtr; while (the_lnet.ln_rc_state == LNET_RC_STATE_RUNNING) { __u64 version; @@ -1693,7 +1694,7 @@ lnet_rtrpools_disable(void) int lnet_notify(struct lnet_ni *ni, lnet_nid_t nid, int alive, time64_t when) { - struct lnet_peer *lp = NULL; + struct lnet_peer_ni *lp = NULL; time64_t now = ktime_get_seconds(); int cpt = lnet_cpt_of_nid(nid, ni); diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c index 01c9ad44266f..d0340707feaa 
100644 --- a/drivers/staging/lustre/lnet/lnet/router_proc.c +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c @@ -289,7 +289,7 @@ static int proc_lnet_routers(struct ctl_table *table, int write, *ppos = LNET_PROC_POS_MAKE(0, ver, 0, off); } else { struct list_head *r; - struct lnet_peer *peer = NULL; + struct lnet_peer_ni *peer = NULL; int skip = off - 1; lnet_net_lock(0); @@ -304,9 +304,9 @@ static int proc_lnet_routers(struct ctl_table *table, int write, r = the_lnet.ln_routers.next; while (r != &the_lnet.ln_routers) { - struct lnet_peer *lp; + struct lnet_peer_ni *lp; - lp = list_entry(r, struct lnet_peer, lpni_rtr_list); + lp = list_entry(r, struct lnet_peer_ni, lpni_rtr_list); if (!skip) { peer = lp; break; @@ -425,7 +425,7 @@ static int proc_lnet_peers(struct ctl_table *table, int write, hoff++; } else { - struct lnet_peer *peer; + struct lnet_peer_ni *peer; struct list_head *p; int skip; again: @@ -449,9 +449,9 @@ static int proc_lnet_peers(struct ctl_table *table, int write, p = ptable->pt_hash[hash].next; while (p != &ptable->pt_hash[hash]) { - struct lnet_peer *lp; + struct lnet_peer_ni *lp; - lp = list_entry(p, struct lnet_peer, + lp = list_entry(p, struct lnet_peer_ni, lpni_hashlist); if (!skip) { peer = lp; From patchwork Tue Sep 25 01:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613157 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1C8FC1803 for ; Tue, 25 Sep 2018 01:10:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 20A312A04E for ; Tue, 25 Sep 2018 01:10:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 14CBD2A052; Tue, 25 Sep 2018 01:10:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763497.32103.13856612081166369948.stgit@noble>
Subject: [lustre-devel] [PATCH 03/34] lnet: Change lpni_refcount to atomic_t
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP This is part of Commit: 58091af960fe ("LU-7734 lnet: Multi-Rail peer split") from upstream lustre, where it is marked: Signed-off-by: Amir Shehata WC-bug-id: https://jira.whamcloud.com/browse/LU-7734 Reviewed-on: http://review.whamcloud.com/18293 Reviewed-by: Olaf Weber Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 10 +++++----- .../staging/lustre/include/linux/lnet/lib-types.h | 2 +- drivers/staging/lustre/lnet/lnet/peer.c | 8 ++++---- drivers/staging/lustre/lnet/lnet/router.c | 4 ++-- drivers/staging/lustre/lnet/lnet/router_proc.c | 4 ++-- 5 files changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index ef53638e20f6..88e010aa3f68 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -313,8 +313,8 @@ lnet_handle2me(struct lnet_handle_me *handle) static inline void lnet_peer_addref_locked(struct lnet_peer_ni *lp) { - LASSERT(lp->lpni_refcount > 0); - lp->lpni_refcount++; + LASSERT(atomic_read(&lp->lpni_refcount) > 0); + atomic_inc(&lp->lpni_refcount); } void lnet_destroy_peer_locked(struct lnet_peer_ni *lp); @@ -322,9 +322,9 @@ void lnet_destroy_peer_locked(struct lnet_peer_ni *lp); static inline void lnet_peer_decref_locked(struct lnet_peer_ni *lp) { - LASSERT(lp->lpni_refcount > 0); - lp->lpni_refcount--; - if (!lp->lpni_refcount) + LASSERT(atomic_read(&lp->lpni_refcount) > 0); + atomic_dec(&lp->lpni_refcount); + if (atomic_read(&lp->lpni_refcount) == 0) lnet_destroy_peer_locked(lp); } diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h 
b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 4b26801d7d29..9a2cf319dba9 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -429,7 +429,7 @@ struct lnet_peer_ni { /* peer's NID */ lnet_nid_t lpni_nid; /* # refs */ - int lpni_refcount; + atomic_t lpni_refcount; /* CPT this peer attached on */ int lpni_cpt; /* # refs from lnet_route::lr_gateway */ diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 67614309f242..7475678ea184 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -221,7 +221,7 @@ lnet_destroy_peer_locked(struct lnet_peer_ni *lp) { struct lnet_peer_table *ptable; - LASSERT(!lp->lpni_refcount); + LASSERT(atomic_read(&lp->lpni_refcount) == 0); LASSERT(!lp->lpni_rtr_refcount); LASSERT(list_empty(&lp->lpni_txq)); LASSERT(list_empty(&lp->lpni_hashlist)); @@ -320,7 +320,7 @@ lnet_nid2peer_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt) lp->lpni_ping_feats = LNET_PING_FEAT_INVAL; lp->lpni_nid = nid; lp->lpni_cpt = cpt2; - lp->lpni_refcount = 2; /* 1 for caller; 1 for hash */ + atomic_set(&lp->lpni_refcount, 2); /* 1 for caller; 1 for hash */ lp->lpni_rtr_refcount = 0; lnet_net_lock(cpt); @@ -378,7 +378,7 @@ lnet_debug_peer(lnet_nid_t nid) aliveness = lp->lpni_alive ? "up" : "down"; CDEBUG(D_WARNING, "%-24s %4d %5s %5d %5d %5d %5d %5d %ld\n", - libcfs_nid2str(lp->lpni_nid), lp->lpni_refcount, + libcfs_nid2str(lp->lpni_nid), atomic_read(&lp->lpni_refcount), aliveness, lp->lpni_net->net_tunables.lct_peer_tx_credits, lp->lpni_rtrcredits, lp->lpni_minrtrcredits, lp->lpni_txcredits, lp->lpni_mintxcredits, lp->lpni_txqnob); @@ -433,7 +433,7 @@ lnet_get_peer_info(__u32 peer_index, __u64 *nid, lp->lpni_alive ? 
"up" : "down"); *nid = lp->lpni_nid; - *refcount = lp->lpni_refcount; + *refcount = atomic_read(&lp->lpni_refcount); *ni_peer_tx_credits = lp->lpni_net->net_tunables.lct_peer_tx_credits; *peer_tx_credits = lp->lpni_txcredits; diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 31685406dcc3..bfd4b22cc28a 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -172,7 +172,7 @@ lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) static void lnet_rtr_addref_locked(struct lnet_peer_ni *lp) { - LASSERT(lp->lpni_refcount > 0); + LASSERT(atomic_read(&lp->lpni_refcount) > 0); LASSERT(lp->lpni_rtr_refcount >= 0); /* lnet_net_lock must be exclusively locked */ @@ -200,7 +200,7 @@ lnet_rtr_addref_locked(struct lnet_peer_ni *lp) static void lnet_rtr_decref_locked(struct lnet_peer_ni *lp) { - LASSERT(lp->lpni_refcount > 0); + LASSERT(atomic_read(&lp->lpni_refcount) > 0); LASSERT(lp->lpni_rtr_refcount > 0); /* lnet_net_lock must be exclusively locked */ diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c index d0340707feaa..12a4b1708d3c 100644 --- a/drivers/staging/lustre/lnet/lnet/router_proc.c +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c @@ -320,7 +320,7 @@ static int proc_lnet_routers(struct ctl_table *table, int write, lnet_nid_t nid = peer->lpni_nid; time64_t now = ktime_get_seconds(); time64_t deadline = peer->lpni_ping_deadline; - int nrefs = peer->lpni_refcount; + int nrefs = atomic_read(&peer->lpni_refcount); int nrtrrefs = peer->lpni_rtr_refcount; int alive_cnt = peer->lpni_alive_count; int alive = peer->lpni_alive; @@ -486,7 +486,7 @@ static int proc_lnet_peers(struct ctl_table *table, int write, if (peer) { lnet_nid_t nid = peer->lpni_nid; - int nrefs = peer->lpni_refcount; + int nrefs = atomic_read(&peer->lpni_refcount); time64_t lastalive = -1; char *aliveness = "NA"; int maxcr = 
peer->lpni_net->net_tunables.lct_peer_tx_credits;
From patchwork Tue Sep 25 01:07:15 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613159
From: NeilBrown
To: Oleg Drokin, Doug Oucharek,
James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763501.32103.12514510583840060159.stgit@noble>
Subject: [lustre-devel] [PATCH 04/34] lnet: change some function names - add 'ni'.
Cc: Lustre Development List

sed -i 's/lnet_peer_addref_locked/lnet_peer_ni_addref_locked/g' `git grep -l lnet_peer_addref_locked lnet/`
sed -i 's/lnet_destroy_peer_locked/lnet_destroy_peer_ni_locked/g' `git grep -l lnet_destroy_peer_locked lnet/`
sed -i 's/lnet_peer_decref_locked/lnet_peer_ni_decref_locked/g' `git grep -l lnet_peer_decref_locked lnet/`
sed -i 's/lnet_nid2peer_locked/lnet_nid2peerni_locked/g' `git grep -l lnet_nid2peer_locked lnet/`

This is part of Commit: 58091af960fe ("LU-7734 lnet: Multi-Rail peer split") from upstream lustre, where it is marked: Signed-off-by: Amir Shehata
WC-bug-id: https://jira.whamcloud.com/browse/LU-7734
Reviewed-on: http://review.whamcloud.com/18293
Reviewed-by: Olaf Weber
Reviewed-by: Doug Oucharek
Signed-off-by: NeilBrown
Reviewed-by: James Simmons
---
.../staging/lustre/include/linux/lnet/lib-lnet.h | 10 ++++----
drivers/staging/lustre/lnet/lnet/lib-move.c | 10 ++++----
drivers/staging/lustre/lnet/lnet/peer.c | 12 +++++-----
drivers/staging/lustre/lnet/lnet/router.c | 24 ++++++++++----------
4 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 88e010aa3f68..a1c581069eb1 100644
---
a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -311,21 +311,21 @@ lnet_handle2me(struct lnet_handle_me *handle) } static inline void -lnet_peer_addref_locked(struct lnet_peer_ni *lp) +lnet_peer_ni_addref_locked(struct lnet_peer_ni *lp) { LASSERT(atomic_read(&lp->lpni_refcount) > 0); atomic_inc(&lp->lpni_refcount); } -void lnet_destroy_peer_locked(struct lnet_peer_ni *lp); +void lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lp); static inline void -lnet_peer_decref_locked(struct lnet_peer_ni *lp) +lnet_peer_ni_decref_locked(struct lnet_peer_ni *lp) { LASSERT(atomic_read(&lp->lpni_refcount) > 0); atomic_dec(&lp->lpni_refcount); if (atomic_read(&lp->lpni_refcount) == 0) - lnet_destroy_peer_locked(lp); + lnet_destroy_peer_ni_locked(lp); } static inline int @@ -635,7 +635,7 @@ bool lnet_net_unique(__u32 net_id, struct list_head *nilist, struct lnet_net **net); bool lnet_ni_unique_net(struct list_head *nilist, char *iface); -int lnet_nid2peer_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); +int lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); struct lnet_peer_ni *lnet_find_peer_locked(struct lnet_peer_table *ptable, lnet_nid_t nid); void lnet_peer_tables_cleanup(struct lnet_ni *ni); diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 4425406e441b..edbec7e9ed7e 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -837,7 +837,7 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) if (txpeer) { msg->msg_txpeer = NULL; - lnet_peer_decref_locked(txpeer); + lnet_peer_ni_decref_locked(txpeer); } } @@ -964,7 +964,7 @@ lnet_return_rx_credits_locked(struct lnet_msg *msg) } if (rxpeer) { msg->msg_rxpeer = NULL; - lnet_peer_decref_locked(rxpeer); + lnet_peer_ni_decref_locked(rxpeer); } } @@ -1148,7 +1148,7 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg 
*msg, lnet_nid_t rtr_nid) return 0; } - rc = lnet_nid2peer_locked(&lp, dst_nid, cpt); + rc = lnet_nid2peerni_locked(&lp, dst_nid, cpt); if (rc) { lnet_net_unlock(cpt); LCONSOLE_WARN("Error %d finding peer %s\n", rc, @@ -1199,7 +1199,7 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) LASSERT(src_ni->ni_net == lp->lpni_net); } - lnet_peer_addref_locked(lp); + lnet_peer_ni_addref_locked(lp); LASSERT(src_nid != LNET_NID_ANY); lnet_msg_commit(msg, cpt); @@ -1810,7 +1810,7 @@ lnet_parse(struct lnet_ni *ni, struct lnet_hdr *hdr, lnet_nid_t from_nid, } lnet_net_lock(cpt); - rc = lnet_nid2peer_locked(&msg->msg_rxpeer, from_nid, cpt); + rc = lnet_nid2peerni_locked(&msg->msg_rxpeer, from_nid, cpt); if (rc) { lnet_net_unlock(cpt); CERROR("%s, src %s: Dropping %s (error %d looking up sender)\n", diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 7475678ea184..fcfad77b9f2c 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -117,7 +117,7 @@ lnet_peer_table_cleanup_locked(struct lnet_ni *ni, list_del_init(&lp->lpni_hashlist); /* Lose hash table's ref */ ptable->pt_zombies++; - lnet_peer_decref_locked(lp); + lnet_peer_ni_decref_locked(lp); } } } @@ -217,7 +217,7 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) } void -lnet_destroy_peer_locked(struct lnet_peer_ni *lp) +lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lp) { struct lnet_peer_table *ptable; @@ -249,7 +249,7 @@ lnet_find_peer_locked(struct lnet_peer_table *ptable, lnet_nid_t nid) peers = &ptable->pt_hash[lnet_nid2peerhash(nid)]; list_for_each_entry(lp, peers, lpni_hashlist) { if (lp->lpni_nid == nid) { - lnet_peer_addref_locked(lp); + lnet_peer_ni_addref_locked(lp); return lp; } } @@ -258,7 +258,7 @@ lnet_find_peer_locked(struct lnet_peer_table *ptable, lnet_nid_t nid) } int -lnet_nid2peer_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt) +lnet_nid2peerni_locked(struct lnet_peer_ni 
**lpp, lnet_nid_t nid, int cpt) { struct lnet_peer_table *ptable; struct lnet_peer_ni *lp = NULL; @@ -367,7 +367,7 @@ lnet_debug_peer(lnet_nid_t nid) cpt = lnet_cpt_of_nid(nid, NULL); lnet_net_lock(cpt); - rc = lnet_nid2peer_locked(&lp, nid, cpt); + rc = lnet_nid2peerni_locked(&lp, nid, cpt); if (rc) { lnet_net_unlock(cpt); CDEBUG(D_WARNING, "No peer %s\n", libcfs_nid2str(nid)); @@ -383,7 +383,7 @@ lnet_debug_peer(lnet_nid_t nid) lp->lpni_rtrcredits, lp->lpni_minrtrcredits, lp->lpni_txcredits, lp->lpni_mintxcredits, lp->lpni_txqnob); - lnet_peer_decref_locked(lp); + lnet_peer_ni_decref_locked(lp); lnet_net_unlock(cpt); } diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index bfd4b22cc28a..ba2b2b930576 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -192,7 +192,7 @@ lnet_rtr_addref_locked(struct lnet_peer_ni *lp) list_add(&lp->lpni_rtr_list, pos); /* addref for the_lnet.ln_routers */ - lnet_peer_addref_locked(lp); + lnet_peer_ni_addref_locked(lp); the_lnet.ln_routers_version++; } } @@ -216,7 +216,7 @@ lnet_rtr_decref_locked(struct lnet_peer_ni *lp) list_del(&lp->lpni_rtr_list); /* decref for the_lnet.ln_routers */ - lnet_peer_decref_locked(lp); + lnet_peer_ni_decref_locked(lp); the_lnet.ln_routers_version++; } } @@ -332,7 +332,7 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, lnet_net_lock(LNET_LOCK_EX); - rc = lnet_nid2peer_locked(&route->lr_gateway, gateway, LNET_LOCK_EX); + rc = lnet_nid2peerni_locked(&route->lr_gateway, gateway, LNET_LOCK_EX); if (rc) { lnet_net_unlock(LNET_LOCK_EX); @@ -370,7 +370,7 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, } if (add_route) { - lnet_peer_addref_locked(route->lr_gateway); /* +1 for notify */ + lnet_peer_ni_addref_locked(route->lr_gateway); /* +1 for notify */ lnet_add_route_to_rnet(rnet2, route); ni = lnet_get_next_ni_locked(route->lr_gateway->lpni_net, NULL); @@ -384,7 +384,7 @@ 
lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, } /* -1 for notify or !add_route */ - lnet_peer_decref_locked(route->lr_gateway); + lnet_peer_ni_decref_locked(route->lr_gateway); lnet_net_unlock(LNET_LOCK_EX); rc = 0; @@ -496,7 +496,7 @@ lnet_del_route(__u32 net, lnet_nid_t gw_nid) rnet = NULL; lnet_rtr_decref_locked(gateway); - lnet_peer_decref_locked(gateway); + lnet_peer_ni_decref_locked(gateway); lnet_net_unlock(LNET_LOCK_EX); @@ -854,7 +854,7 @@ lnet_destroy_rc_data(struct lnet_rc_data *rcd) int cpt = rcd->rcd_gateway->lpni_cpt; lnet_net_lock(cpt); - lnet_peer_decref_locked(rcd->rcd_gateway); + lnet_peer_ni_decref_locked(rcd->rcd_gateway); lnet_net_unlock(cpt); } @@ -913,7 +913,7 @@ lnet_create_rc_data_locked(struct lnet_peer_ni *gateway) goto out; } - lnet_peer_addref_locked(gateway); + lnet_peer_ni_addref_locked(gateway); rcd->rcd_gateway = gateway; gateway->lpni_rcd = rcd; gateway->lpni_ping_notsent = 0; @@ -954,7 +954,7 @@ lnet_ping_router_locked(struct lnet_peer_ni *rtr) time64_t secs; struct lnet_ni *ni; - lnet_peer_addref_locked(rtr); + lnet_peer_ni_addref_locked(rtr); if (rtr->lpni_ping_deadline && /* ping timed out? 
*/ now > rtr->lpni_ping_deadline) @@ -967,7 +967,7 @@ lnet_ping_router_locked(struct lnet_peer_ni *rtr) if (!lnet_isrouter(rtr) || the_lnet.ln_rc_state != LNET_RC_STATE_RUNNING) { /* router table changed or router checker is shutting down */ - lnet_peer_decref_locked(rtr); + lnet_peer_ni_decref_locked(rtr); return; } @@ -1016,7 +1016,7 @@ lnet_ping_router_locked(struct lnet_peer_ni *rtr) rtr->lpni_ping_notsent = 0; /* no event pending */ } - lnet_peer_decref_locked(rtr); + lnet_peer_ni_decref_locked(rtr); } int @@ -1756,7 +1756,7 @@ lnet_notify(struct lnet_ni *ni, lnet_nid_t nid, int alive, time64_t when) if (ni) lnet_ni_notify_locked(ni, lp); - lnet_peer_decref_locked(lp); + lnet_peer_ni_decref_locked(lp); lnet_net_unlock(cpt); return 0; From patchwork Tue Sep 25 01:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613161 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9F27A1803 for ; Tue, 25 Sep 2018 01:10:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A60652A04E for ; Tue, 25 Sep 2018 01:10:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9A5492A052; Tue, 25 Sep 2018 01:10:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4EA342A04E for ; Tue, 25 Sep 2018 01:10:23 +0000 (UTC) 
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763504.32103.13261120162709788445.stgit@noble>
Subject: [lustre-devel] [PATCH 05/34] lnet: make lnet_nid_cpt_hash non-static.

Future patch will need this.
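The helper this patch exports, lnet_nid_cpt_hash(), reduces a 64-bit NID to one of the configured CPU partitions (CPTs). A minimal userspace sketch of that hash-to-bucket pattern; the function name is real, but the mixing constant and exact arithmetic below are illustrative assumptions, not the kernel's implementation:

```c
#include <stdint.h>

/* Illustrative stand-in for lnet_nid_cpt_hash(): reduce a 64-bit
 * NID to a bucket in [0, number).  The multiplier is the 64-bit
 * golden-ratio constant; the in-kernel version may mix the bits
 * differently, so treat this only as the general pattern. */
static unsigned int nid_cpt_hash(uint64_t nid, unsigned int number)
{
	uint64_t key = nid;

	if (number == 1)
		return 0;

	key *= 0x9e3779b97f4a7c15ULL;	/* scramble adjacent NIDs apart */
	return (unsigned int)(key % number);
}
```

Making the symbol non-static presumably lets later patches in the series hash NIDs to CPTs outside api-ni.c, e.g. when sizing or indexing per-CPT peer tables.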
This is part of Commit: 58091af960fe ("LU-7734 lnet: Multi-Rail peer split") from upstream lustre, where it is marked: Signed-off-by: Amir Shehata WC-bug-id: https://jira.whamcloud.com/browse/LU-7734 Reviewed-on: http://review.whamcloud.com/18293 Reviewed-by: Olaf Weber Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 1 + drivers/staging/lustre/lnet/lnet/api-ni.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index a1c581069eb1..f925e3cd64ca 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -395,6 +395,7 @@ lnet_net2rnethash(__u32 net) extern struct lnet_lnd the_lolnd; extern int avoid_asym_router_failure; +unsigned int lnet_nid_cpt_hash(lnet_nid_t nid, unsigned int number); int lnet_cpt_of_nid_locked(lnet_nid_t nid, struct lnet_ni *ni); int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni); struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt); diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 1b47cbd6fd68..20fa3fea04b9 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -693,7 +693,7 @@ lnet_get_net_locked(__u32 net_id) return NULL; } -static unsigned int +unsigned int lnet_nid_cpt_hash(lnet_nid_t nid, unsigned int number) { __u64 key = nid; From patchwork Tue Sep 25 01:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613163 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 66E95112B for ; Tue, 25 Sep 2018 01:10:29 +0000 (UTC) Received: 
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763508.32103.13303107512736711408.stgit@noble>
Subject: [lustre-devel] [PATCH 06/34] lnet: introduce lnet_find_peer_ni_locked()
Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" X-Virus-Scanned: ClamAV using ClamSMTP Use in place of lnet_find_peer_locked() This is part of Commit: 58091af960fe ("LU-7734 lnet: Multi-Rail peer split") from upstream lustre, where it is marked: Signed-off-by: Amir Shehata WC-bug-id: https://jira.whamcloud.com/browse/LU-7734 Reviewed-on: http://review.whamcloud.com/18293 Reviewed-by: Olaf Weber Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown Reviewed-by: James Simmons --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 1 + drivers/staging/lustre/lnet/lnet/peer.c | 31 ++++++++++++++++++++ drivers/staging/lustre/lnet/lnet/router.c | 2 + 3 files changed, 33 insertions(+), 1 deletion(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index f925e3cd64ca..656177b64336 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -639,6 +639,7 @@ bool lnet_ni_unique_net(struct list_head *nilist, char *iface); int lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); struct lnet_peer_ni *lnet_find_peer_locked(struct lnet_peer_table *ptable, lnet_nid_t nid); +struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid, int cpt); void lnet_peer_tables_cleanup(struct lnet_ni *ni); void lnet_peer_tables_destroy(void); int lnet_peer_tables_create(void); diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index fcfad77b9f2c..53b0ca0a2021 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -216,6 +216,37 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) } } +static struct lnet_peer_ni * +lnet_get_peer_ni_locked(struct 
lnet_peer_table *ptable, lnet_nid_t nid) +{ + struct list_head *peers; + struct lnet_peer_ni *lp; + + LASSERT(!the_lnet.ln_shutdown); + + peers = &ptable->pt_hash[lnet_nid2peerhash(nid)]; + list_for_each_entry(lp, peers, lpni_hashlist) { + if (lp->lpni_nid == nid) { + lnet_peer_ni_addref_locked(lp); + return lp; + } + } + + return NULL; +} + +struct lnet_peer_ni * +lnet_find_peer_ni_locked(lnet_nid_t nid, int cpt) +{ + struct lnet_peer_ni *lpni; + struct lnet_peer_table *ptable; + + ptable = the_lnet.ln_peer_tables[cpt]; + lpni = lnet_get_peer_ni_locked(ptable, nid); + + return lpni; +} + void lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lp) { diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index ba2b2b930576..de037a77671d 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -1734,7 +1734,7 @@ lnet_notify(struct lnet_ni *ni, lnet_nid_t nid, int alive, time64_t when) return -ESHUTDOWN; } - lp = lnet_find_peer_locked(the_lnet.ln_peer_tables[cpt], nid); + lp = lnet_find_peer_ni_locked(nid, cpt); if (!lp) { /* nid not found */ lnet_net_unlock(cpt);
From patchwork Tue Sep 25 01:07:15 2018 X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613165 From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Tue, 25 Sep 2018 11:07:15 +1000 Message-ID: <153783763511.32103.5073326313518943569.stgit@noble> In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble> References: <153783752960.32103.8394391715843917125.stgit@noble> Subject: [lustre-devel] [PATCH 07/34] lnet: lnet_peer_tables_cleanup: use an exclusive lock. Why?? surely this will deadlock.
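For context on the change in this patch: lnet_net_lock() is a partitioned (per-CPT) lock, where locking a single CPT index takes only that partition and LNET_LOCK_EX takes every partition, so an exclusive holder excludes all per-CPT holders. The toy, single-threaded model below illustrates that shape only; it is not the libcfs implementation, and the names are made up.

```c
#include <assert.h>

#define NPART 4      /* stand-in for the number of CPTs */
#define LOCK_EX (-1) /* stand-in for LNET_LOCK_EX: take every partition */

/* Toy model of a partitioned lock: each partition is just a hold count.
 * In the kernel these are real locks; this only shows which partitions
 * each kind of caller ends up holding. */
struct percpt_lock {
	int held[NPART];
};

static void percpt_lock(struct percpt_lock *pl, int cpt)
{
	int i;

	if (cpt == LOCK_EX) {
		/* exclusive: acquire all partitions, in a fixed order,
		 * so an EX holder excludes every per-partition holder */
		for (i = 0; i < NPART; i++)
			pl->held[i]++;
	} else {
		pl->held[cpt]++;
	}
}

static void percpt_unlock(struct percpt_lock *pl, int cpt)
{
	int i;

	if (cpt == LOCK_EX) {
		for (i = NPART - 1; i >= 0; i--)
			pl->held[i]--;
	} else {
		pl->held[cpt]--;
	}
}
```

Note that in this model taking LOCK_EX while already holding a partition would self-deadlock with real locks, which is why callers drop their CPT lock before upgrading, as the later lnet_nid2peerni_locked() change in this series does.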
This is part of Commit: 58091af960fe ("LU-7734 lnet: Multi-Rail peer split") from upstream lustre, where it is marked: Signed-off-by: Amir Shehata WC-bug-id: https://jira.whamcloud.com/browse/LU-7734 Reviewed-on: http://review.whamcloud.com/18293 Reviewed-by: Olaf Weber Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown Reviewed-by: James Simmons Reviewed-by:. Maybe git can't handle the indent? --- drivers/staging/lustre/lnet/lnet/peer.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 53b0ca0a2021..376e3459fa92 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -185,9 +185,9 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) * peers are gateways for. */ cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { - lnet_net_lock(i); + lnet_net_lock(LNET_LOCK_EX); lnet_peer_table_del_rtrs_locked(ni, ptable, i); - lnet_net_unlock(i); + lnet_net_unlock(LNET_LOCK_EX); } /* @@ -195,17 +195,17 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) * deathrow. */ cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { - lnet_net_lock(i); + lnet_net_lock(LNET_LOCK_EX); lnet_peer_table_cleanup_locked(ni, ptable); - lnet_net_unlock(i); + lnet_net_unlock(LNET_LOCK_EX); } /* Cleanup all entries on deathrow. 
*/ cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { - lnet_net_lock(i); + lnet_net_lock(LNET_LOCK_EX); lnet_peer_table_deathrow_wait_locked(ptable, i); list_splice_init(&ptable->pt_deathrow, &deathrow); - lnet_net_unlock(i); + lnet_net_unlock(LNET_LOCK_EX); } while (!list_empty(&deathrow)) {
From patchwork Tue Sep 25 01:07:15 2018 X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613167 From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Tue, 25 Sep 2018 11:07:15 +1000 Message-ID: <153783763514.32103.5823547209878409504.stgit@noble> In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble> References: <153783752960.32103.8394391715843917125.stgit@noble> Subject: [lustre-devel] [PATCH 08/34] LU-7734 lnet: Multi-Rail peer split
From: Amir Shehata [[Note, the preceding few patches are part of this in the upstream lustre code - they were split for easier merging into linux. ]] Split the peer structure into peer/peer_net/peer_ni, as described in the Multi-Rail HLD. Removed the deathrow list for peers; instead peers are deleted immediately, since deathrow complicates memory management for peers to little gain. Moved to LNET_LOCK_EX for any operations which modify the peer tables, and CPT locks for any operations which read the peer tables. Therefore there is no need to use lnet_cpt_of_nid() to calculate the CPT of the peer NID; instead we use lnet_nid_cpt_hash() to distribute peers across multiple CPTs. It is no longer true that peers and NIs exist on the same CPT. In the new design peers and NIs don't have a 1-1 relationship.
You can send to the same peer from several NIs, which can exist on separate CPTs Signed-off-by: Amir Shehata Change-Id: Ida41d830d38d0ab2bb551476e4a8866d52a25fe2 Reviewed-on: http://review.whamcloud.com/18293 Reviewed-by: Olaf Weber Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 2 .../staging/lustre/include/linux/lnet/lib-types.h | 29 ++ drivers/staging/lustre/lnet/lnet/api-ni.c | 1 drivers/staging/lustre/lnet/lnet/peer.c | 260 ++++++++++++-------- drivers/staging/lustre/lnet/lnet/router_proc.c | 3 5 files changed, 191 insertions(+), 104 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 656177b64336..bf076298de71 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -637,8 +637,6 @@ bool lnet_net_unique(__u32 net_id, struct list_head *nilist, bool lnet_ni_unique_net(struct list_head *nilist, char *iface); int lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); -struct lnet_peer_ni *lnet_find_peer_locked(struct lnet_peer_table *ptable, - lnet_nid_t nid); struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid, int cpt); void lnet_peer_tables_cleanup(struct lnet_ni *ni); void lnet_peer_tables_destroy(void); diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 9a2cf319dba9..9f70c094cc4c 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -384,6 +384,7 @@ struct lnet_rc_data { }; struct lnet_peer_ni { + struct list_head lpni_on_peer_net_list; /* chain on peer hash */ struct list_head lpni_hashlist; /* messages blocking for tx credits */ @@ -394,6 +395,7 @@ struct lnet_peer_ni { struct list_head lpni_rtr_list; /* # tx credits available */ int lpni_txcredits; + struct 
lnet_peer_net *lpni_peer_net; /* low water mark */ int lpni_mintxcredits; /* # router credits */ @@ -442,6 +444,31 @@ struct lnet_peer_ni { struct lnet_rc_data *lpni_rcd; }; +struct lnet_peer { + /* chain on global peer list */ + struct list_head lp_on_lnet_peer_list; + + /* list of peer nets */ + struct list_head lp_peer_nets; + + /* primary NID of the peer */ + lnet_nid_t lp_primary_nid; +}; + +struct lnet_peer_net { + /* chain on peer block */ + struct list_head lpn_on_peer_list; + + /* list of peer_nis on this network */ + struct list_head lpn_peer_nis; + + /* pointer to the peer I'm part of */ + struct lnet_peer *lpn_peer; + + /* Net ID */ + __u32 lpn_net_id; +}; + /* peer hash size */ #define LNET_PEER_HASH_BITS 9 #define LNET_PEER_HASH_SIZE (1 << LNET_PEER_HASH_BITS) @@ -686,6 +713,8 @@ struct lnet { struct lnet_msg_container **ln_msg_containers; struct lnet_counters **ln_counters; struct lnet_peer_table **ln_peer_tables; + /* list of configured or discovered peers */ + struct list_head ln_peers; /* failure simulation */ struct list_head ln_test_peers; struct list_head ln_drop_rules; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 20fa3fea04b9..821b030f9621 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -542,6 +542,7 @@ lnet_prepare(lnet_pid_t requested_pid) the_lnet.ln_pid = requested_pid; INIT_LIST_HEAD(&the_lnet.ln_test_peers); + INIT_LIST_HEAD(&the_lnet.ln_peers); INIT_LIST_HEAD(&the_lnet.ln_nets); INIT_LIST_HEAD(&the_lnet.ln_routers); INIT_LIST_HEAD(&the_lnet.ln_drop_rules); diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 376e3459fa92..97ee1f5cfd2f 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -54,8 +54,6 @@ lnet_peer_tables_create(void) } cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { - 
INIT_LIST_HEAD(&ptable->pt_deathrow); - hash = kvmalloc_cpt(LNET_PEER_HASH_SIZE * sizeof(*hash), GFP_KERNEL, i); if (!hash) { @@ -88,8 +86,6 @@ lnet_peer_tables_destroy(void) if (!hash) /* not initialized */ break; - LASSERT(list_empty(&ptable->pt_deathrow)); - ptable->pt_hash = NULL; for (j = 0; j < LNET_PEER_HASH_SIZE; j++) LASSERT(list_empty(&hash[j])); @@ -123,7 +119,7 @@ lnet_peer_table_cleanup_locked(struct lnet_ni *ni, } static void -lnet_peer_table_deathrow_wait_locked(struct lnet_peer_table *ptable, +lnet_peer_table_finalize_wait_locked(struct lnet_peer_table *ptable, int cpt_locked) { int i; @@ -173,12 +169,8 @@ void lnet_peer_tables_cleanup(struct lnet_ni *ni) { struct lnet_peer_table *ptable; - struct list_head deathrow; - struct lnet_peer_ni *lp; int i; - INIT_LIST_HEAD(&deathrow); - LASSERT(the_lnet.ln_shutdown || ni); /* * If just deleting the peers for a NI, get rid of any routes these @@ -191,8 +183,7 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) } /* - * Start the process of moving the applicable peers to - * deathrow. + * Start the cleanup process */ cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { lnet_net_lock(LNET_LOCK_EX); @@ -200,20 +191,12 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) lnet_net_unlock(LNET_LOCK_EX); } - /* Cleanup all entries on deathrow. */ + /* Wait until all peers have been destroyed. 
*/ cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { lnet_net_lock(LNET_LOCK_EX); - lnet_peer_table_deathrow_wait_locked(ptable, i); - list_splice_init(&ptable->pt_deathrow, &deathrow); + lnet_peer_table_finalize_wait_locked(ptable, i); lnet_net_unlock(LNET_LOCK_EX); } - - while (!list_empty(&deathrow)) { - lp = list_entry(deathrow.next, struct lnet_peer_ni, - lpni_hashlist); - list_del(&lp->lpni_hashlist); - kfree(lp); - } } static struct lnet_peer_ni * @@ -247,74 +230,143 @@ lnet_find_peer_ni_locked(lnet_nid_t nid, int cpt) return lpni; } -void -lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lp) +static void +lnet_try_destroy_peer_hierarchy_locked(struct lnet_peer_ni *lpni) { - struct lnet_peer_table *ptable; + struct lnet_peer_net *peer_net; + struct lnet_peer *peer; - LASSERT(atomic_read(&lp->lpni_refcount) == 0); - LASSERT(!lp->lpni_rtr_refcount); - LASSERT(list_empty(&lp->lpni_txq)); - LASSERT(list_empty(&lp->lpni_hashlist)); - LASSERT(!lp->lpni_txqnob); + /* TODO: could the below situation happen? accessing an already + * destroyed peer? 
+ */ + if (!lpni->lpni_peer_net || + !lpni->lpni_peer_net->lpn_peer) + return; - ptable = the_lnet.ln_peer_tables[lp->lpni_cpt]; - LASSERT(ptable->pt_number > 0); - ptable->pt_number--; + peer_net = lpni->lpni_peer_net; + peer = lpni->lpni_peer_net->lpn_peer; - lp->lpni_net = NULL; + list_del_init(&lpni->lpni_on_peer_net_list); + lpni->lpni_peer_net = NULL; - list_add(&lp->lpni_hashlist, &ptable->pt_deathrow); - LASSERT(ptable->pt_zombies > 0); - ptable->pt_zombies--; + /* if peer_net is empty, then remove it from the peer */ + if (list_empty(&peer_net->lpn_peer_nis)) { + list_del_init(&peer_net->lpn_on_peer_list); + peer_net->lpn_peer = NULL; + kfree(peer_net); + + /* If the peer is empty then remove it from the + * the_lnet.ln_peers + */ + if (list_empty(&peer->lp_peer_nets)) { + list_del_init(&peer->lp_on_lnet_peer_list); + kfree(peer); + } + } } -struct lnet_peer_ni * -lnet_find_peer_locked(struct lnet_peer_table *ptable, lnet_nid_t nid) +static int +lnet_build_peer_hierarchy(struct lnet_peer_ni *lpni) { - struct list_head *peers; - struct lnet_peer_ni *lp; + struct lnet_peer *peer; + struct lnet_peer_net *peer_net; + __u32 lpni_net = LNET_NIDNET(lpni->lpni_nid); - LASSERT(!the_lnet.ln_shutdown); + peer = NULL; + peer_net = NULL; - peers = &ptable->pt_hash[lnet_nid2peerhash(nid)]; - list_for_each_entry(lp, peers, lpni_hashlist) { - if (lp->lpni_nid == nid) { - lnet_peer_ni_addref_locked(lp); - return lp; - } + peer = kzalloc(sizeof(*peer), GFP_KERNEL); + if (!peer) + return -ENOMEM; + + peer_net = kzalloc(sizeof(*peer_net), GFP_KERNEL); + if (!peer_net) { + kfree(peer); + return -ENOMEM; } - return NULL; + INIT_LIST_HEAD(&peer->lp_on_lnet_peer_list); + INIT_LIST_HEAD(&peer->lp_peer_nets); + INIT_LIST_HEAD(&peer_net->lpn_on_peer_list); + INIT_LIST_HEAD(&peer_net->lpn_peer_nis); + + /* build the hierarchy */ + peer_net->lpn_net_id = lpni_net; + peer_net->lpn_peer = peer; + lpni->lpni_peer_net = peer_net; + peer->lp_primary_nid = lpni->lpni_nid; + 
list_add_tail(&peer_net->lpn_on_peer_list, &peer->lp_peer_nets); + list_add_tail(&lpni->lpni_on_peer_net_list, &peer_net->lpn_peer_nis); + list_add_tail(&peer->lp_on_lnet_peer_list, &the_lnet.ln_peers); + + return 0; +} + +void +lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lpni) +{ + struct lnet_peer_table *ptable; + + LASSERT(atomic_read(&lpni->lpni_refcount) == 0); + LASSERT(lpni->lpni_rtr_refcount == 0); + LASSERT(list_empty(&lpni->lpni_txq)); + LASSERT(list_empty(&lpni->lpni_hashlist)); + LASSERT(lpni->lpni_txqnob == 0); + LASSERT(lpni->lpni_peer_net); + LASSERT(lpni->lpni_peer_net->lpn_peer); + + ptable = the_lnet.ln_peer_tables[lpni->lpni_cpt]; + LASSERT(ptable->pt_number > 0); + ptable->pt_number--; + + lpni->lpni_net = NULL; + + lnet_try_destroy_peer_hierarchy_locked(lpni); + + kfree(lpni); + + LASSERT(ptable->pt_zombies > 0); + ptable->pt_zombies--; } int -lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt) +lnet_nid2peerni_locked(struct lnet_peer_ni **lpnip, lnet_nid_t nid, int cpt) { struct lnet_peer_table *ptable; - struct lnet_peer_ni *lp = NULL; - struct lnet_peer_ni *lp2; + struct lnet_peer_ni *lpni = NULL; + struct lnet_peer_ni *lpni2; int cpt2; int rc = 0; - *lpp = NULL; + *lpnip = NULL; if (the_lnet.ln_shutdown) /* it's shutting down */ return -ESHUTDOWN; - /* cpt can be LNET_LOCK_EX if it's called from router functions */ - cpt2 = cpt != LNET_LOCK_EX ? cpt : lnet_cpt_of_nid_locked(nid, NULL); + /* + * calculate cpt2 with the standard hash function + * This cpt2 becomes the slot where we'll find or create the peer. + */ + cpt2 = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); - ptable = the_lnet.ln_peer_tables[cpt2]; - lp = lnet_find_peer_locked(ptable, nid); - if (lp) { - *lpp = lp; - return 0; + /* + * Any changes to the peer tables happen under exclusive write + * lock. Any reads to the peer tables can be done via a standard + * CPT read lock. 
+ */ + if (cpt != LNET_LOCK_EX) { + lnet_net_unlock(cpt); + lnet_net_lock(LNET_LOCK_EX); } - if (!list_empty(&ptable->pt_deathrow)) { - lp = list_entry(ptable->pt_deathrow.next, - struct lnet_peer_ni, lpni_hashlist); - list_del(&lp->lpni_hashlist); + ptable = the_lnet.ln_peer_tables[cpt2]; + lpni = lnet_get_peer_ni_locked(ptable, nid); + if (lpni) { + *lpnip = lpni; + if (cpt != LNET_LOCK_EX) { + lnet_net_unlock(LNET_LOCK_EX); + lnet_net_lock(cpt); + } + return 0; } /* @@ -322,68 +374,72 @@ lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt) * and destroyed locks and peer-table before I finish the allocation */ ptable->pt_number++; - lnet_net_unlock(cpt); + lnet_net_unlock(LNET_LOCK_EX); - if (lp) - memset(lp, 0, sizeof(*lp)); - else - lp = kzalloc_cpt(sizeof(*lp), GFP_NOFS, cpt2); - - if (!lp) { + lpni = kzalloc_cpt(sizeof(*lpni), GFP_KERNEL, cpt2); + if (!lpni) { rc = -ENOMEM; lnet_net_lock(cpt); goto out; } - INIT_LIST_HEAD(&lp->lpni_txq); - INIT_LIST_HEAD(&lp->lpni_rtrq); - INIT_LIST_HEAD(&lp->lpni_routes); - - lp->lpni_notify = 0; - lp->lpni_notifylnd = 0; - lp->lpni_notifying = 0; - lp->lpni_alive_count = 0; - lp->lpni_timestamp = 0; - lp->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! */ - lp->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ - lp->lpni_last_query = 0; /* haven't asked NI yet */ - lp->lpni_ping_timestamp = 0; - lp->lpni_ping_feats = LNET_PING_FEAT_INVAL; - lp->lpni_nid = nid; - lp->lpni_cpt = cpt2; - atomic_set(&lp->lpni_refcount, 2); /* 1 for caller; 1 for hash */ - lp->lpni_rtr_refcount = 0; + INIT_LIST_HEAD(&lpni->lpni_txq); + INIT_LIST_HEAD(&lpni->lpni_rtrq); + INIT_LIST_HEAD(&lpni->lpni_routes); - lnet_net_lock(cpt); + lpni->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! 
*/ + lpni->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ + lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL; + lpni->lpni_nid = nid; + lpni->lpni_cpt = cpt2; + atomic_set(&lpni->lpni_refcount, 2); /* 1 for caller; 1 for hash */ + + rc = lnet_build_peer_hierarchy(lpni); + if (rc != 0) + goto out; + + lnet_net_lock(LNET_LOCK_EX); if (the_lnet.ln_shutdown) { rc = -ESHUTDOWN; goto out; } - lp2 = lnet_find_peer_locked(ptable, nid); - if (lp2) { - *lpp = lp2; + lpni2 = lnet_get_peer_ni_locked(ptable, nid); + if (lpni2) { + *lpnip = lpni2; goto out; } - lp->lpni_net = lnet_get_net_locked(LNET_NIDNET(lp->lpni_nid)); - lp->lpni_txcredits = - lp->lpni_mintxcredits = - lp->lpni_net->net_tunables.lct_peer_tx_credits; - lp->lpni_rtrcredits = - lp->lpni_minrtrcredits = lnet_peer_buffer_credits(lp->lpni_net); + lpni->lpni_net = lnet_get_net_locked(LNET_NIDNET(lpni->lpni_nid)); + lpni->lpni_txcredits = + lpni->lpni_mintxcredits = + lpni->lpni_net->net_tunables.lct_peer_tx_credits; + lpni->lpni_rtrcredits = + lpni->lpni_minrtrcredits = + lnet_peer_buffer_credits(lpni->lpni_net); - list_add_tail(&lp->lpni_hashlist, + list_add_tail(&lpni->lpni_hashlist, &ptable->pt_hash[lnet_nid2peerhash(nid)]); ptable->pt_version++; - *lpp = lp; + *lpnip = lpni; + + if (cpt != LNET_LOCK_EX) { + lnet_net_unlock(LNET_LOCK_EX); + lnet_net_lock(cpt); + } return 0; out: - if (lp) - list_add(&lp->lpni_hashlist, &ptable->pt_deathrow); + if (lpni) { + lnet_try_destroy_peer_hierarchy_locked(lpni); + kfree(lpni); + } ptable->pt_number--; + if (cpt != LNET_LOCK_EX) { + lnet_net_unlock(LNET_LOCK_EX); + lnet_net_lock(cpt); + } return rc; } diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c index 12a4b1708d3c..977a937f261c 100644 --- a/drivers/staging/lustre/lnet/lnet/router_proc.c +++ b/drivers/staging/lustre/lnet/lnet/router_proc.c @@ -385,6 +385,9 @@ static int proc_lnet_routers(struct ctl_table *table, int write, return rc; } +/* TODO: there 
should be no direct access to ptable. We should add a set + * of APIs that give access to the ptable and its members + */ static int proc_lnet_peers(struct ctl_table *table, int write, void __user *buffer, size_t *lenp, loff_t *ppos) {
From patchwork Tue Sep 25 01:07:15 2018 X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613169 From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Tue, 25 Sep 2018 11:07:15 +1000 Message-ID: <153783763518.32103.4120463532750655807.stgit@noble> In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble> References: <153783752960.32103.8394391715843917125.stgit@noble> Subject: [lustre-devel] [PATCH 09/34] LU-7734 lnet: Multi-Rail local_ni/peer_ni selection
From: Amir Shehata This patch implements the local_ni/peer_ni selection algorithm. It adds APIs to the peer module to encapsulate iterating through the peer_nis in a peer and creating a peer.
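The iteration API mentioned above walks every peer_ni of a peer by traversing its peer_nets and, within each net, its peer_nis, using a prev-style cursor (the lnet_get_next_peer_ni_locked() declaration in the lib-lnet.h hunk below has that shape). A standalone sketch of that two-level walk over the peer -> peer_net -> peer_ni hierarchy from the previous patch; the struct and function names here are simplified stand-ins, not the lustre definitions, which use kernel list_head chains:

```c
#include <stddef.h>

struct peer_net;

/* Simplified stand-ins for the three-level Multi-Rail hierarchy:
 * peer -> peer_net -> peer_ni. */
struct peer_ni {
	unsigned long nid;
	struct peer_net *pnet;  /* like lpni_peer_net */
	struct peer_ni *next;   /* like lpni_on_peer_net_list */
};

struct peer_net {
	struct peer_ni *nis;    /* like lpn_peer_nis */
	struct peer_net *next;  /* like lpn_on_peer_list */
};

struct peer {
	struct peer_net *nets;  /* like lp_peer_nets */
};

/* Walk every peer_ni of a peer: pass NULL to start, then feed each
 * returned peer_ni back in as prev; returns NULL when done. */
static struct peer_ni *peer_ni_next(struct peer *p, struct peer_ni *prev)
{
	struct peer_net *net;

	if (!prev)
		return p->nets ? p->nets->nis : NULL;
	if (prev->next)
		return prev->next;      /* more NIs on the same net */
	/* this net is exhausted: first NI of the next non-empty net */
	for (net = prev->pnet->next; net; net = net->next)
		if (net->nis)
			return net->nis;
	return NULL;
}
```

The same cursor shape lets the selection code resume a round-robin scan from wherever the previous send left off, rather than restarting at the head each time.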
Signed-off-by: Amir Shehata Change-Id: Ifc0e5ebf84ab25753adfcfcb433b024100f35ace Reviewed-on: http://review.whamcloud.com/18383 Reviewed-by: Doug Oucharek Reviewed-by: Olaf Weber Tested-by: Jenkins Tested-by: Doug Oucharek Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 53 ++ .../staging/lustre/include/linux/lnet/lib-types.h | 17 + drivers/staging/lustre/lnet/lnet/api-ni.c | 20 + drivers/staging/lustre/lnet/lnet/lib-move.c | 522 +++++++++++++++----- drivers/staging/lustre/lnet/lnet/peer.c | 120 ++++- 5 files changed, 603 insertions(+), 129 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index bf076298de71..6ffe5c1c9925 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -403,6 +403,7 @@ struct lnet_ni *lnet_nid2ni_addref(lnet_nid_t nid); struct lnet_ni *lnet_net2ni_locked(__u32 net, int cpt); struct lnet_ni *lnet_net2ni(__u32 net); bool lnet_is_ni_healthy_locked(struct lnet_ni *ni); +struct lnet_net *lnet_get_net_locked(u32 net_id); extern int portal_rotor; @@ -635,13 +636,24 @@ int lnet_parse_networks(struct list_head *nilist, char *networks, bool lnet_net_unique(__u32 net_id, struct list_head *nilist, struct lnet_net **net); bool lnet_ni_unique_net(struct list_head *nilist, char *iface); - +void lnet_incr_dlc_seq(void); +u32 lnet_get_dlc_seq_locked(void); + +struct lnet_peer_ni *lnet_get_next_peer_ni_locked(struct lnet_peer *peer, + struct lnet_peer_net *peer_net, + struct lnet_peer_ni *prev); +int lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, + struct lnet_peer **peer); int lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid, int cpt); void lnet_peer_tables_cleanup(struct lnet_ni *ni); void lnet_peer_tables_destroy(void); int lnet_peer_tables_create(void); void 
lnet_debug_peer(lnet_nid_t nid); +struct lnet_peer_net *lnet_peer_get_net_locked(struct lnet_peer *peer, + u32 net_id); +bool lnet_peer_is_ni_pref_locked(struct lnet_peer_ni *lpni, + struct lnet_ni *ni); int lnet_get_peer_info(__u32 peer_index, __u64 *nid, char alivness[LNET_MAX_STR_LEN], __u32 *cpt_iter, __u32 *refcount, @@ -649,6 +661,45 @@ int lnet_get_peer_info(__u32 peer_index, __u64 *nid, __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credtis, __u32 *peer_tx_qnob); +static inline bool +lnet_is_peer_ni_healthy_locked(struct lnet_peer_ni *lpni) +{ + return lpni->lpni_healthy; +} + +static inline void +lnet_set_peer_ni_health_locked(struct lnet_peer_ni *lpni, bool health) +{ + lpni->lpni_healthy = health; +} + +static inline bool +lnet_is_peer_net_healthy_locked(struct lnet_peer_net *peer_net) +{ + struct lnet_peer_ni *lpni; + + list_for_each_entry(lpni, &peer_net->lpn_peer_nis, + lpni_on_peer_net_list) { + if (lnet_is_peer_ni_healthy_locked(lpni)) + return true; + } + + return false; +} + +static inline bool +lnet_is_peer_healthy_locked(struct lnet_peer *peer) +{ + struct lnet_peer_net *peer_net; + + list_for_each_entry(peer_net, &peer->lp_peer_nets, lpn_on_peer_list) { + if (lnet_is_peer_net_healthy_locked(peer_net)) + return true; + } + + return false; +} + static inline void lnet_peer_set_alive(struct lnet_peer_ni *lp) { diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 9f70c094cc4c..d935d273716d 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -346,6 +346,9 @@ struct lnet_ni { /* lnd tunables set explicitly */ bool ni_lnd_tunables_set; + /* sequence number used to round robin over nis within a net */ + u32 ni_seq; + /* * equivalent interfaces to use * This is an array because socklnd bonding can still be configured @@ -436,10 +439,18 @@ struct lnet_peer_ni { int lpni_cpt; /* # refs from 
lnet_route::lr_gateway */ int lpni_rtr_refcount; + /* sequence number used to round robin over peer nis within a net */ + u32 lpni_seq; + /* health flag */ + bool lpni_healthy; /* returned RC ping features */ unsigned int lpni_ping_feats; /* routers on this peer */ struct list_head lpni_routes; + /* array of preferred local nids */ + lnet_nid_t *lpni_pref_nids; + /* number of preferred NIDs in lnpi_pref_nids */ + u32 lpni_pref_nnids; /* router checker state */ struct lnet_rc_data *lpni_rcd; }; @@ -453,6 +464,9 @@ struct lnet_peer { /* primary NID of the peer */ lnet_nid_t lp_primary_nid; + + /* peer is Multi-Rail enabled peer */ + bool lp_multi_rail; }; struct lnet_peer_net { @@ -467,6 +481,9 @@ struct lnet_peer_net { /* Net ID */ __u32 lpn_net_id; + + /* health flag */ + bool lpn_healthy; }; /* peer hash size */ diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 821b030f9621..e8e0bc45d8aa 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -64,6 +64,15 @@ module_param(use_tcp_bonding, int, 0444); MODULE_PARM_DESC(use_tcp_bonding, "Set to 1 to use socklnd bonding. 0 to use Multi-Rail"); +/* + * This sequence number keeps track of how many times DLC was used to + * update the configuration. It is incremented on any DLC update and + * checked when sending a message to determine if there is a need to + * re-run the selection algorithm to handle configuration change. + * Look at lnet_select_pathway() for more details on its usage. 
+ */ +static atomic_t lnet_dlc_seq_no = ATOMIC_INIT(0); + static int lnet_ping(struct lnet_process_id id, signed long timeout, struct lnet_process_id __user *ids, int n_ids); @@ -1490,6 +1499,7 @@ lnet_startup_lndnet(struct lnet_net *net, struct lnet_lnd_tunables *tun) lnet_net_lock(LNET_LOCK_EX); list_splice_tail(&local_ni_list, &net_l->net_ni_list); + lnet_incr_dlc_seq(); lnet_net_unlock(LNET_LOCK_EX); /* if the network is not unique then we don't want to keep @@ -2165,6 +2175,16 @@ lnet_dyn_del_ni(__u32 net_id) return rc; } +void lnet_incr_dlc_seq(void) +{ + atomic_inc(&lnet_dlc_seq_no); +} + +u32 lnet_get_dlc_seq_locked(void) +{ + return atomic_read(&lnet_dlc_seq_no); +} + /** * LNet ioctl handler. * diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index edbec7e9ed7e..54e3093355c2 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -444,7 +444,6 @@ lnet_prep_send(struct lnet_msg *msg, int type, struct lnet_process_id target, memset(&msg->msg_hdr, 0, sizeof(msg->msg_hdr)); msg->msg_hdr.type = cpu_to_le32(type); - msg->msg_hdr.dest_nid = cpu_to_le64(target.nid); msg->msg_hdr.dest_pid = cpu_to_le32(target.pid); /* src_nid will be set later */ msg->msg_hdr.src_pid = cpu_to_le32(the_lnet.ln_pid); @@ -836,6 +835,15 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) } if (txpeer) { + /* + * TODO: + * Once the patch for the health comes in we need to set + * the health of the peer ni to bad when we fail to send + * a message. 
+ * int status = msg->msg_ev.status; + * if (status != 0) + * lnet_set_peer_ni_health_locked(txpeer, false) + */ msg->msg_txpeer = NULL; lnet_peer_ni_decref_locked(txpeer); } @@ -968,6 +976,24 @@ lnet_return_rx_credits_locked(struct lnet_msg *msg) } } +static int +lnet_compare_peers(struct lnet_peer_ni *p1, struct lnet_peer_ni *p2) +{ + if (p1->lpni_txqnob < p2->lpni_txqnob) + return 1; + + if (p1->lpni_txqnob > p2->lpni_txqnob) + return -1; + + if (p1->lpni_txcredits > p2->lpni_txcredits) + return 1; + + if (p1->lpni_txcredits < p2->lpni_txcredits) + return -1; + + return 0; +} + static int lnet_compare_routes(struct lnet_route *r1, struct lnet_route *r2) { @@ -975,35 +1001,28 @@ lnet_compare_routes(struct lnet_route *r1, struct lnet_route *r2) struct lnet_peer_ni *p2 = r2->lr_gateway; int r1_hops = (r1->lr_hops == LNET_UNDEFINED_HOPS) ? 1 : r1->lr_hops; int r2_hops = (r2->lr_hops == LNET_UNDEFINED_HOPS) ? 1 : r2->lr_hops; + int rc; if (r1->lr_priority < r2->lr_priority) return 1; if (r1->lr_priority > r2->lr_priority) - return -ERANGE; + return -1; if (r1_hops < r2_hops) return 1; if (r1_hops > r2_hops) - return -ERANGE; - - if (p1->lpni_txqnob < p2->lpni_txqnob) - return 1; - - if (p1->lpni_txqnob > p2->lpni_txqnob) - return -ERANGE; - - if (p1->lpni_txcredits > p2->lpni_txcredits) - return 1; + return -1; - if (p1->lpni_txcredits < p2->lpni_txcredits) - return -ERANGE; + rc = lnet_compare_peers(p1, p2); + if (rc) + return rc; if (r1->lr_seq - r2->lr_seq <= 0) return 1; - return -ERANGE; + return -1; } static struct lnet_peer_ni * @@ -1070,171 +1089,430 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, return lpni_best; } -int -lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) +static int +lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, + struct lnet_msg *msg, lnet_nid_t rtr_nid, bool *lo_sent) { - lnet_nid_t dst_nid = msg->msg_target.nid; - struct lnet_ni *src_ni; - struct lnet_ni *local_ni; - struct 
lnet_peer_ni *lp; - int cpt; - int cpt2; - int rc; - + struct lnet_ni *best_ni = NULL; + struct lnet_peer_ni *best_lpni = NULL; + struct lnet_peer_ni *net_gw = NULL; + struct lnet_peer_ni *best_gw = NULL; + struct lnet_peer_ni *lpni; + struct lnet_peer *peer = NULL; + struct lnet_peer_net *peer_net; + struct lnet_net *local_net; + struct lnet_ni *ni = NULL; + int cpt, cpt2, rc; + bool routing = false; + bool ni_is_pref = false; + bool preferred = false; + int best_credits = 0; + u32 seq, seq2; + int best_lpni_credits = INT_MIN; + +again: /* - * NB: rtr_nid is set to LNET_NID_ANY for all current use-cases, - * but we might want to use pre-determined router for ACK/REPLY - * in the future + * get an initial CPT to use for locking. The idea here is not to + * serialize the calls to select_pathway, so that as many + * operations can run concurrently as possible. To do that we use + * the CPT where this call is being executed. Later on when we + * determine the CPT to use in lnet_message_commit, we switch the + * lock and check if there was any configuration changes, if none, + * then we proceed, if there is, then we'll need to update the cpt + * and redo the operation. */ - /* NB: ni == interface pre-determined (ACK/REPLY) */ - LASSERT(!msg->msg_txpeer); - LASSERT(!msg->msg_sending); - LASSERT(!msg->msg_target_is_router); - LASSERT(!msg->msg_receiving); + cpt = lnet_net_lock_current(); - msg->msg_sending = 1; - - LASSERT(!msg->msg_tx_committed); - local_ni = lnet_net2ni(LNET_NIDNET(dst_nid)); - cpt = lnet_cpt_of_nid(rtr_nid == LNET_NID_ANY ? 
dst_nid : rtr_nid, - local_ni); - again: - lnet_net_lock(cpt); + best_gw = NULL; + routing = false; + local_net = NULL; + best_ni = NULL; if (the_lnet.ln_shutdown) { lnet_net_unlock(cpt); return -ESHUTDOWN; } - if (src_nid == LNET_NID_ANY) { - src_ni = NULL; - } else { - src_ni = lnet_nid2ni_locked(src_nid, cpt); - if (!src_ni) { + /* + * initialize the variables which could be reused if we go to + * again + */ + lpni = NULL; + seq = lnet_get_dlc_seq_locked(); + + rc = lnet_find_or_create_peer_locked(dst_nid, cpt, &peer); + if (rc != 0) { + lnet_net_unlock(cpt); + return rc; + } + + /* If peer is not healthy then we cannot send anything to it */ + if (!lnet_is_peer_healthy_locked(peer)) { + lnet_net_unlock(cpt); + return -EHOSTUNREACH; + } + + /* + * STEP 1: first jab at determining best_ni + * if src_nid is explicitly specified, then best_ni is already + * pre-determined for us. Otherwise we need to select the best + * one to use later on + */ + if (src_nid != LNET_NID_ANY) { + best_ni = lnet_nid2ni_locked(src_nid, cpt); + if (!best_ni) { lnet_net_unlock(cpt); LCONSOLE_WARN("Can't send to %s: src %s is not a local nid\n", libcfs_nid2str(dst_nid), libcfs_nid2str(src_nid)); return -EINVAL; } - LASSERT(!msg->msg_routing); - } - - /* Is this for someone on a local network? 
*/ - local_ni = lnet_net2ni_locked(LNET_NIDNET(dst_nid), cpt); - if (local_ni) { - if (!src_ni) { - src_ni = local_ni; - src_nid = src_ni->ni_nid; - } else if (src_ni != local_ni) { + if (best_ni->ni_net->net_id != LNET_NIDNET(dst_nid)) { lnet_net_unlock(cpt); LCONSOLE_WARN("No route to %s via from %s\n", libcfs_nid2str(dst_nid), libcfs_nid2str(src_nid)); return -EINVAL; } + } - LASSERT(src_nid != LNET_NID_ANY); + if (best_ni == the_lnet.ln_loni) { + /* No send credit hassles with LOLND */ + msg->msg_hdr.dest_nid = cpu_to_le64(best_ni->ni_nid); + if (!msg->msg_routing) + msg->msg_hdr.src_nid = cpu_to_le64(best_ni->ni_nid); + msg->msg_target.nid = best_ni->ni_nid; lnet_msg_commit(msg, cpt); - if (!msg->msg_routing) - msg->msg_hdr.src_nid = cpu_to_le64(src_nid); + lnet_ni_addref_locked(best_ni, cpt); + lnet_net_unlock(cpt); + msg->msg_txni = best_ni; + lnet_ni_send(best_ni, msg); - if (src_ni == the_lnet.ln_loni) { - /* No send credit hassles with LOLND */ - lnet_net_unlock(cpt); - lnet_ni_send(src_ni, msg); - return 0; + *lo_sent = true; + return 0; + } + + if (best_ni) + goto pick_peer; + + /* + * Decide whether we need to route to peer_ni. + * Get the local net that I need to be on to be able to directly + * send to that peer. + * + * a. Find the peer which the dst_nid belongs to. + * b. 
Iterate through each of the peer_nets/nis to decide + * the best peer/local_ni pair to use + */ + list_for_each_entry(peer_net, &peer->lp_peer_nets, lpn_on_peer_list) { + if (!lnet_is_peer_net_healthy_locked(peer_net)) + continue; + + local_net = lnet_get_net_locked(peer_net->lpn_net_id); + if (!local_net) { + /* + * go through each peer_ni on that peer_net and + * determine the best possible gw to go through + */ + list_for_each_entry(lpni, &peer_net->lpn_peer_nis, + lpni_on_peer_net_list) { + net_gw = lnet_find_route_locked(NULL, + lpni->lpni_nid, + rtr_nid); + + /* + * if no route is found for that network then + * move onto the next peer_ni in the peer + */ + if (!net_gw) + continue; + + if (!best_gw) { + best_gw = net_gw; + best_lpni = lpni; + } else { + rc = lnet_compare_peers(net_gw, + best_gw); + if (rc > 0) { + best_gw = net_gw; + best_lpni = lpni; + } + } + } + + if (!best_gw) + continue; + + local_net = lnet_get_net_locked + (LNET_NIDNET(best_gw->lpni_nid)); + routing = true; + } else { + routing = false; + best_gw = NULL; } - rc = lnet_nid2peerni_locked(&lp, dst_nid, cpt); - if (rc) { - lnet_net_unlock(cpt); - LCONSOLE_WARN("Error %d finding peer %s\n", rc, - libcfs_nid2str(dst_nid)); - /* ENOMEM or shutting down */ - return rc; + /* no routable net found go on to a different net */ + if (!local_net) + continue; + + /* + * Second jab at determining best_ni + * if we get here then the peer we're trying to send + * to is on a directly connected network, and we'll + * need to pick the local_ni on that network to send + * from + */ + while ((ni = lnet_get_next_ni_locked(local_net, ni))) { + if (!lnet_is_ni_healthy_locked(ni)) + continue; + /* TODO: compare NUMA distance */ + if (ni->ni_tx_queues[cpt]->tq_credits <= + best_credits) { + /* + * all we want is to read tq_credits + * value as an approximation of how + * busy the NI is. 
No need to grab a lock + */ + continue; + } else if (best_ni) { + if ((best_ni)->ni_seq - ni->ni_seq <= 0) + continue; + (best_ni)->ni_seq = ni->ni_seq + 1; + } + + best_ni = ni; + best_credits = ni->ni_tx_queues[cpt]->tq_credits; } - LASSERT(lp->lpni_net == src_ni->ni_net); - } else { - /* sending to a remote network */ - lp = lnet_find_route_locked(src_ni ? src_ni->ni_net : NULL, - dst_nid, rtr_nid); - if (!lp) { - lnet_net_unlock(cpt); + } - LCONSOLE_WARN("No route to %s via %s (all routers down)\n", - libcfs_id2str(msg->msg_target), - libcfs_nid2str(src_nid)); - return -EHOSTUNREACH; + if (!best_ni) { + lnet_net_unlock(cpt); + LCONSOLE_WARN("No local ni found to send from to %s\n", + libcfs_nid2str(dst_nid)); + return -EINVAL; + } + + if (routing) + goto send; + +pick_peer: + lpni = NULL; + + if (msg->msg_type == LNET_MSG_REPLY || + msg->msg_type == LNET_MSG_ACK) { + /* + * for replies we want to respond on the same peer_ni we + * received the message on if possible. If not, then pick + * a peer_ni to send to + */ + best_lpni = lnet_find_peer_ni_locked(dst_nid, cpt); + if (best_lpni) { + lnet_peer_ni_decref_locked(best_lpni); + goto send; + } else { + CDEBUG(D_NET, + "unable to send msg_type %d to originating %s\n", + msg->msg_type, + libcfs_nid2str(dst_nid)); } + } + peer_net = lnet_peer_get_net_locked(peer, + best_ni->ni_net->net_id); + /* + * peer_net is not available or the src_nid is explicitly defined + * and the peer_net for that src_nid is unhealthy. find a route to + * the destination nid. 
+ */ + if (!peer_net || + (src_nid != LNET_NID_ANY && + !lnet_is_peer_net_healthy_locked(peer_net))) { + best_gw = lnet_find_route_locked(best_ni->ni_net, + dst_nid, + rtr_nid); /* - * rtr_nid is LNET_NID_ANY or NID of pre-determined router, - * it's possible that rtr_nid isn't LNET_NID_ANY and lp isn't - * pre-determined router, this can happen if router table - * was changed when we release the lock + * if no route is found for that network then + * move onto the next peer_ni in the peer */ - if (rtr_nid != lp->lpni_nid) { - cpt2 = lp->lpni_cpt; - if (cpt2 != cpt) { - lnet_net_unlock(cpt); - - rtr_nid = lp->lpni_nid; - cpt = cpt2; - goto again; - } + if (!best_gw) { + lnet_net_unlock(cpt); + LCONSOLE_WARN("No route to peer from %s\n", + libcfs_nid2str(best_ni->ni_nid)); + return -EHOSTUNREACH; + } CDEBUG(D_NET, "Best route to %s via %s for %s %d\n", - libcfs_nid2str(dst_nid), libcfs_nid2str(lp->lpni_nid), - lnet_msgtyp2str(msg->msg_type), msg->msg_len); + libcfs_nid2str(lpni->lpni_nid), + libcfs_nid2str(best_gw->lpni_nid), + lnet_msgtyp2str(msg->msg_type), msg->msg_len); - if (!src_ni) { - src_ni = lnet_get_next_ni_locked(lp->lpni_net, NULL); - LASSERT(src_ni); - src_nid = src_ni->ni_nid; - } else { - LASSERT(src_ni->ni_net == lp->lpni_net); + best_lpni = lnet_find_peer_ni_locked(dst_nid, cpt); + LASSERT(best_lpni); + lnet_peer_ni_decref_locked(best_lpni); + + routing = true; + + goto send; + } else if (!lnet_is_peer_net_healthy_locked(peer_net)) { + /* + * this peer_net is unhealthy but we still have an opportunity + * to find another peer_net that we can use + */ + u32 net_id = peer_net->lpn_net_id; + + lnet_net_unlock(cpt); + if (!best_lpni) + LCONSOLE_WARN("peer net %s unhealthy\n", + libcfs_net2str(net_id)); + goto again; + } + + best_lpni = NULL; + while ((lpni = lnet_get_next_peer_ni_locked(peer, peer_net, lpni))) { + /* + * if this peer ni is not healthy just skip it, no point in + * examining it further + */ + if (!lnet_is_peer_ni_healthy_locked(lpni)) + 
continue; + ni_is_pref = lnet_peer_is_ni_pref_locked(lpni, best_ni); + + if (!preferred && ni_is_pref) { + preferred = true; + } else if (preferred && !ni_is_pref) { + continue; + } else if (lpni->lpni_txcredits <= best_lpni_credits) { + continue; + } else if (best_lpni) { + if (best_lpni->lpni_seq - lpni->lpni_seq <= 0) + continue; + best_lpni->lpni_seq = lpni->lpni_seq + 1; } - lnet_peer_ni_addref_locked(lp); + best_lpni = lpni; + best_lpni_credits = lpni->lpni_txcredits; + } - LASSERT(src_nid != LNET_NID_ANY); - lnet_msg_commit(msg, cpt); + /* if we still can't find a peer ni then we can't reach it */ + if (!best_lpni) { + u32 net_id = peer_net ? peer_net->lpn_net_id : + LNET_NIDNET(dst_nid); + + lnet_net_unlock(cpt); + LCONSOLE_WARN("no peer_ni found on peer net %s\n", + libcfs_net2str(net_id)); + goto again; + } - if (!msg->msg_routing) { - /* I'm the source and now I know which NI to send on */ - msg->msg_hdr.src_nid = cpu_to_le64(src_nid); +send: + /* + * determine the cpt to use and if it has changed then + * lock the new cpt and check if the config has changed. + * If it has changed then repeat the algorithm since the + * ni or peer list could have changed and the algorithm + * would end up picking a different ni/peer_ni pair. + */ + cpt2 = best_lpni->lpni_cpt; + if (cpt != cpt2) { + lnet_net_unlock(cpt); + cpt = cpt2; + lnet_net_lock(cpt); + seq2 = lnet_get_dlc_seq_locked(); + if (seq2 != seq) { + lnet_net_unlock(cpt); + goto again; } + } + + /* + * store the best_lpni in the message right away to avoid having + * to do the same operation under different conditions + */ + msg->msg_txpeer = (routing) ? best_gw : best_lpni; + msg->msg_txni = best_ni; + /* + * grab a reference for the best_ni since now it's in use in this + * send. 
the reference will need to be dropped when the message is + * finished in lnet_finalize() + */ + lnet_ni_addref_locked(msg->msg_txni, cpt); + lnet_peer_ni_addref_locked(msg->msg_txpeer); + + /* + * set the destination nid in the message here because it's + * possible that we'd be sending to a different nid than the one + * originally given. + */ + msg->msg_hdr.dest_nid = cpu_to_le64(msg->msg_txpeer->lpni_nid); + /* + * Always set the target.nid to the best peer picked. Either the + * nid will be one of the preconfigured NIDs, or the same NID as + * what was originally set in the target or it will be the NID of + * a router if this message should be routed + */ + msg->msg_target.nid = msg->msg_txpeer->lpni_nid; + + /* + * lnet_msg_commit assigns the correct cpt to the message, which + * is used to decrement the correct refcount on the ni when it's + * time to return the credits + */ + lnet_msg_commit(msg, cpt); + + /* + * If we are routing the message then we don't need to overwrite + * the src_nid since it would've been set at the origin. Otherwise + * we are the originator so we need to set it. 
+ */ + if (!msg->msg_routing) + msg->msg_hdr.src_nid = cpu_to_le64(msg->msg_txni->ni_nid); + + if (routing) { msg->msg_target_is_router = 1; - msg->msg_target.nid = lp->lpni_nid; msg->msg_target.pid = LNET_PID_LUSTRE; } - /* 'lp' is our best choice of peer */ + rc = lnet_post_send_locked(msg, 0); - LASSERT(!msg->msg_peertxcredit); - LASSERT(!msg->msg_txcredit); + lnet_net_unlock(cpt); + + return rc; +} + +int +lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) +{ + lnet_nid_t dst_nid = msg->msg_target.nid; + int rc; + bool lo_sent = false; + + /* + * NB: rtr_nid is set to LNET_NID_ANY for all current use-cases, + * but we might want to use pre-determined router for ACK/REPLY + * in the future + */ + /* NB: !ni == interface pre-determined (ACK/REPLY) */ LASSERT(!msg->msg_txpeer); + LASSERT(!msg->msg_sending); + LASSERT(!msg->msg_target_is_router); + LASSERT(!msg->msg_receiving); - msg->msg_txpeer = lp; /* msg takes my ref on lp */ - /* set the NI for this message */ - msg->msg_txni = src_ni; - lnet_ni_addref_locked(msg->msg_txni, cpt); + msg->msg_sending = 1; - rc = lnet_post_send_locked(msg, 0); - lnet_net_unlock(cpt); + LASSERT(!msg->msg_tx_committed); - if (rc < 0) + rc = lnet_select_pathway(src_nid, dst_nid, msg, rtr_nid, &lo_sent); + if (rc < 0 || lo_sent) return rc; if (rc == LNET_CREDIT_OK) - lnet_ni_send(src_ni, msg); + lnet_ni_send(msg->msg_txni, msg); - return 0; /* rc == LNET_CREDIT_OK or LNET_CREDIT_WAIT */ + /* rc == LNET_CREDIT_OK or LNET_CREDIT_WAIT */ + return 0; } void diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 97ee1f5cfd2f..edba1b1d87cc 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -230,6 +230,95 @@ lnet_find_peer_ni_locked(lnet_nid_t nid, int cpt) return lpni; } +int +lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, + struct lnet_peer **peer) +{ + struct lnet_peer_ni *lpni; + + lpni = 
lnet_find_peer_ni_locked(dst_nid, cpt); + if (!lpni) { + int rc; + + rc = lnet_nid2peerni_locked(&lpni, dst_nid, cpt); + if (rc != 0) + return rc; + } + + *peer = lpni->lpni_peer_net->lpn_peer; + lnet_peer_ni_decref_locked(lpni); + + return 0; +} + +struct lnet_peer_ni * +lnet_get_next_peer_ni_locked(struct lnet_peer *peer, + struct lnet_peer_net *peer_net, + struct lnet_peer_ni *prev) +{ + struct lnet_peer_ni *lpni; + struct lnet_peer_net *net = peer_net; + + if (!prev) { + if (!net) + net = list_entry(peer->lp_peer_nets.next, + struct lnet_peer_net, + lpn_on_peer_list); + lpni = list_entry(net->lpn_peer_nis.next, struct lnet_peer_ni, + lpni_on_peer_net_list); + + return lpni; + } + + if (prev->lpni_on_peer_net_list.next == + &prev->lpni_peer_net->lpn_peer_nis) { + /* + * if you reached the end of the peer ni list and the peer + * net is specified then there are no more peer nis in that + * net. + */ + if (net) + return NULL; + + /* + * we reached the end of this net ni list. move to the + * next net + */ + if (prev->lpni_peer_net->lpn_on_peer_list.next == + &peer->lp_peer_nets) + /* no more nets and no more NIs. 
*/ + return NULL; + + /* get the next net */ + net = list_entry(prev->lpni_peer_net->lpn_on_peer_list.next, + struct lnet_peer_net, + lpn_on_peer_list); + /* get the ni on it */ + lpni = list_entry(net->lpn_peer_nis.next, struct lnet_peer_ni, + lpni_on_peer_net_list); + + return lpni; + } + + /* there are more nis left */ + lpni = list_entry(prev->lpni_on_peer_net_list.next, + struct lnet_peer_ni, lpni_on_peer_net_list); + + return lpni; +} + +bool +lnet_peer_is_ni_pref_locked(struct lnet_peer_ni *lpni, struct lnet_ni *ni) +{ + int i; + + for (i = 0; i < lpni->lpni_pref_nnids; i++) { + if (lpni->lpni_pref_nids[i] == ni->ni_nid) + return true; + } + return false; +} + static void lnet_try_destroy_peer_hierarchy_locked(struct lnet_peer_ni *lpni) { @@ -302,6 +391,18 @@ lnet_build_peer_hierarchy(struct lnet_peer_ni *lpni) return 0; } +struct lnet_peer_net * +lnet_peer_get_net_locked(struct lnet_peer *peer, u32 net_id) +{ + struct lnet_peer_net *peer_net; + + list_for_each_entry(peer_net, &peer->lp_peer_nets, lpn_on_peer_list) { + if (peer_net->lpn_net_id == net_id) + return peer_net; + } + return NULL; +} + void lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lpni) { @@ -412,12 +513,19 @@ lnet_nid2peerni_locked(struct lnet_peer_ni **lpnip, lnet_nid_t nid, int cpt) } lpni->lpni_net = lnet_get_net_locked(LNET_NIDNET(lpni->lpni_nid)); - lpni->lpni_txcredits = - lpni->lpni_mintxcredits = - lpni->lpni_net->net_tunables.lct_peer_tx_credits; - lpni->lpni_rtrcredits = - lpni->lpni_minrtrcredits = - lnet_peer_buffer_credits(lpni->lpni_net); + if (lpni->lpni_net) { + lpni->lpni_txcredits = + lpni->lpni_mintxcredits = + lpni->lpni_net->net_tunables.lct_peer_tx_credits; + lpni->lpni_rtrcredits = + lpni->lpni_minrtrcredits = + lnet_peer_buffer_credits(lpni->lpni_net); + } else { + CDEBUG(D_NET, "peer_ni %s is not directly connected\n", + libcfs_nid2str(nid)); + } + + lnet_set_peer_ni_health_locked(lpni, true); list_add_tail(&lpni->lpni_hashlist, 
&ptable->pt_hash[lnet_nid2peerhash(nid)]; From patchwork Tue Sep 25 01:07:15 2018 X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613171 From: NeilBrown To: Oleg Drokin , Doug Oucharek , James
Simmons , Andreas Dilger Date: Tue, 25 Sep 2018 11:07:15 +1000 Message-ID: <153783763522.32103.731439682287514589.stgit@noble> In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble> References: <153783752960.32103.8394391715843917125.stgit@noble> User-Agent: StGit/0.17.1-dirty Subject: [lustre-devel] [PATCH 10/34] LU-7734 lnet: configure peers from DLC From: Amir Shehata This patch adds the ability to configure peers from the DLC interface. When a peer is added, a primary NID should be provided. If none is provided, the first NID in the list of NIDs is used as the primary NID. Basic error checking is done at the DLC level to ensure properly formatted NIDs. However, a duplicate NID is only detected when it is added in the kernel. The operation is then halted, which means some peer NIDs might already have been added, but not the entire set; it is the caller's role to backtrack and remove the peer that failed to add. When deleting a peer, either a primary NID or a normal NID can be provided. If a normal NID is provided, the peer is found and the primary NID is compared to the peer_ni. If they are the same, the entire peer is deleted; otherwise only the identified peer_ni is deleted. If a set of NIDs is provided, each one is removed in turn from the peer identified by the peer NID. The existing show peer credits API can be used to show peer information. 
Signed-off-by: Amir Shehata Change-Id: Iaf588a062b44d74305aa9aa7d31c7341c6c384b9 Reviewed-on: http://review.whamcloud.com/18476 Reviewed-by: Doug Oucharek Tested-by: Maloo Reviewed-by: Olaf Weber Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 20 + .../staging/lustre/include/linux/lnet/lib-types.h | 4 .../lustre/include/uapi/linux/lnet/libcfs_ioctl.h | 5 .../lustre/include/uapi/linux/lnet/lnet-dlc.h | 32 +- drivers/staging/lustre/lnet/lnet/api-ni.c | 39 ++ drivers/staging/lustre/lnet/lnet/lib-move.c | 4 drivers/staging/lustre/lnet/lnet/peer.c | 387 ++++++++++++++++++-- drivers/staging/lustre/lnet/lnet/router.c | 2 8 files changed, 433 insertions(+), 60 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 6ffe5c1c9925..11642f8aee90 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -645,21 +645,25 @@ struct lnet_peer_ni *lnet_get_next_peer_ni_locked(struct lnet_peer *peer, int lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, struct lnet_peer **peer); int lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); -struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid, int cpt); +struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid); void lnet_peer_tables_cleanup(struct lnet_ni *ni); -void lnet_peer_tables_destroy(void); +void lnet_peer_uninit(void); int lnet_peer_tables_create(void); void lnet_debug_peer(lnet_nid_t nid); struct lnet_peer_net *lnet_peer_get_net_locked(struct lnet_peer *peer, u32 net_id); bool lnet_peer_is_ni_pref_locked(struct lnet_peer_ni *lpni, struct lnet_ni *ni); -int lnet_get_peer_info(__u32 peer_index, __u64 *nid, - char alivness[LNET_MAX_STR_LEN], - __u32 *cpt_iter, __u32 *refcount, - __u32 *ni_peer_tx_credits, __u32 *peer_tx_credits, - __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credtis, - __u32 
*peer_tx_qnob); +int lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid); +int lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid); +int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, + struct lnet_peer_ni_credit_info *peer_ni_info); +int lnet_get_peer_ni_info(__u32 peer_index, __u64 *nid, + char alivness[LNET_MAX_STR_LEN], + __u32 *cpt_iter, __u32 *refcount, + __u32 *ni_peer_tx_credits, __u32 *peer_tx_credits, + __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credtis, + __u32 *peer_tx_qnob); static inline bool lnet_is_peer_ni_healthy_locked(struct lnet_peer_ni *lpni) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index d935d273716d..22b141cb6cff 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -388,6 +388,8 @@ struct lnet_rc_data { struct lnet_peer_ni { struct list_head lpni_on_peer_net_list; + /* chain on remote peer list */ + struct list_head lpni_on_remote_peer_ni_list; /* chain on peer hash */ struct list_head lpni_hashlist; /* messages blocking for tx credits */ @@ -732,6 +734,8 @@ struct lnet { struct lnet_peer_table **ln_peer_tables; /* list of configured or discovered peers */ struct list_head ln_peers; + /* list of peer nis not on a local network */ + struct list_head ln_remote_peer_ni_list; /* failure simulation */ struct list_head ln_test_peers; struct list_head ln_drop_rules; diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h b/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h index cce6b58e3682..d5a3e7c85aa4 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h @@ -136,6 +136,9 @@ struct libcfs_debug_ioctl_data { #define IOC_LIBCFS_GET_BUF _IOWR(IOC_LIBCFS_TYPE, 89, IOCTL_CONFIG_SIZE) #define IOC_LIBCFS_GET_PEER_INFO 
_IOWR(IOC_LIBCFS_TYPE, 90, IOCTL_CONFIG_SIZE) #define IOC_LIBCFS_GET_LNET_STATS _IOWR(IOC_LIBCFS_TYPE, 91, IOCTL_CONFIG_SIZE) -#define IOC_LIBCFS_MAX_NR 91 +#define IOC_LIBCFS_ADD_PEER_NI _IOWR(IOC_LIBCFS_TYPE, 92, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_DEL_PEER_NI _IOWR(IOC_LIBCFS_TYPE, 93, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_GET_PEER_NI _IOWR(IOC_LIBCFS_TYPE, 94, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_MAX_NR 94 #endif /* __LIBCFS_IOCTL_H__ */ diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h index ac29f9d24d5d..9c4e05e1b683 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h @@ -126,26 +126,36 @@ struct lnet_ioctl_config_data { char cfg_bulk[0]; }; +struct lnet_peer_ni_credit_info { + char cr_aliveness[LNET_MAX_STR_LEN]; + __u32 cr_refcount; + __s32 cr_ni_peer_tx_credits; + __s32 cr_peer_tx_credits; + __s32 cr_peer_rtr_credits; + __s32 cr_peer_min_rtr_credits; + __u32 cr_peer_tx_qnob; + __u32 cr_ncpt; +}; + struct lnet_ioctl_peer { struct libcfs_ioctl_hdr pr_hdr; __u32 pr_count; __u32 pr_pad; - __u64 pr_nid; + lnet_nid_t pr_nid; union { - struct { - char cr_aliveness[LNET_MAX_STR_LEN]; - __u32 cr_refcount; - __u32 cr_ni_peer_tx_credits; - __u32 cr_peer_tx_credits; - __u32 cr_peer_rtr_credits; - __u32 cr_peer_min_rtr_credits; - __u32 cr_peer_tx_qnob; - __u32 cr_ncpt; - } pr_peer_credits; + struct lnet_peer_ni_credit_info pr_peer_credits; } pr_lnd_u; }; +struct lnet_ioctl_peer_cfg { + struct libcfs_ioctl_hdr prcfg_hdr; + lnet_nid_t prcfg_key_nid; + lnet_nid_t prcfg_cfg_nid; + __u32 prcfg_idx; + char prcfg_bulk[0]; +}; + struct lnet_ioctl_lnet_stats { struct libcfs_ioctl_hdr st_hdr; struct lnet_counters st_cntrs; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index e8e0bc45d8aa..710f8a0be934 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ 
b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -552,6 +552,7 @@ lnet_prepare(lnet_pid_t requested_pid) INIT_LIST_HEAD(&the_lnet.ln_test_peers); INIT_LIST_HEAD(&the_lnet.ln_peers); + INIT_LIST_HEAD(&the_lnet.ln_remote_peer_ni_list); INIT_LIST_HEAD(&the_lnet.ln_nets); INIT_LIST_HEAD(&the_lnet.ln_routers); INIT_LIST_HEAD(&the_lnet.ln_drop_rules); @@ -646,7 +647,7 @@ lnet_unprepare(void) lnet_res_container_cleanup(&the_lnet.ln_eq_container); lnet_msg_containers_destroy(); - lnet_peer_tables_destroy(); + lnet_peer_uninit(); lnet_rtrpools_free(0); if (the_lnet.ln_counters) { @@ -2318,13 +2319,33 @@ LNetCtl(unsigned int cmd, void *arg) return lnet_get_rtr_pool_cfg(config->cfg_count, pool_cfg); } + case IOC_LIBCFS_ADD_PEER_NI: { + struct lnet_ioctl_peer_cfg *cfg = arg; + + if (cfg->prcfg_hdr.ioc_len < sizeof(*cfg)) + return -EINVAL; + + return lnet_add_peer_ni_to_peer(cfg->prcfg_key_nid, + cfg->prcfg_cfg_nid); + } + + case IOC_LIBCFS_DEL_PEER_NI: { + struct lnet_ioctl_peer_cfg *cfg = arg; + + if (cfg->prcfg_hdr.ioc_len < sizeof(*cfg)) + return -EINVAL; + + return lnet_del_peer_ni_from_peer(cfg->prcfg_key_nid, + cfg->prcfg_cfg_nid); + } + case IOC_LIBCFS_GET_PEER_INFO: { struct lnet_ioctl_peer *peer_info = arg; if (peer_info->pr_hdr.ioc_len < sizeof(*peer_info)) return -EINVAL; - return lnet_get_peer_info(peer_info->pr_count, + return lnet_get_peer_ni_info(peer_info->pr_count, &peer_info->pr_nid, peer_info->pr_lnd_u.pr_peer_credits.cr_aliveness, &peer_info->pr_lnd_u.pr_peer_credits.cr_ncpt, @@ -2336,6 +2357,20 @@ LNetCtl(unsigned int cmd, void *arg) &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_tx_qnob); } + case IOC_LIBCFS_GET_PEER_NI: { + struct lnet_ioctl_peer_cfg *cfg = arg; + struct lnet_peer_ni_credit_info *lpni_cri; + size_t total = sizeof(*cfg) + sizeof(*lpni_cri); + + if (cfg->prcfg_hdr.ioc_len < total) + return -EINVAL; + + lpni_cri = (struct lnet_peer_ni_credit_info *)cfg->prcfg_bulk; + + return lnet_get_peer_info(cfg->prcfg_idx, &cfg->prcfg_key_nid, + 
&cfg->prcfg_cfg_nid, lpni_cri); + } + case IOC_LIBCFS_NOTIFY_ROUTER: { time64_t deadline = ktime_get_real_seconds() - data->ioc_u64[0]; diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 54e3093355c2..fbf209610ff9 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1307,7 +1307,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, * received the message on if possible. If not, then pick * a peer_ni to send to */ - best_lpni = lnet_find_peer_ni_locked(dst_nid, cpt); + best_lpni = lnet_find_peer_ni_locked(dst_nid); if (best_lpni) { lnet_peer_ni_decref_locked(best_lpni); goto send; @@ -1348,7 +1348,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, libcfs_nid2str(best_gw->lpni_nid), lnet_msgtyp2str(msg->msg_type), msg->msg_len); - best_lpni = lnet_find_peer_ni_locked(dst_nid, cpt); + best_lpni = lnet_find_peer_ni_locked(dst_nid); LASSERT(best_lpni); lnet_peer_ni_decref_locked(best_lpni); diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index edba1b1d87cc..d081440579e0 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -38,6 +38,65 @@ #include #include +static void +lnet_peer_remove_from_remote_list(struct lnet_peer_ni *lpni) +{ + if (!list_empty(&lpni->lpni_on_remote_peer_ni_list)) { + list_del_init(&lpni->lpni_on_remote_peer_ni_list); + lnet_peer_ni_decref_locked(lpni); + } +} + +void +lnet_peer_tables_destroy(void) +{ + struct lnet_peer_table *ptable; + struct list_head *hash; + int i; + int j; + + if (!the_lnet.ln_peer_tables) + return; + + cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { + hash = ptable->pt_hash; + if (!hash) /* not initialized */ + break; + + ptable->pt_hash = NULL; + for (j = 0; j < LNET_PEER_HASH_SIZE; j++) + LASSERT(list_empty(&hash[j])); + + kvfree(hash); + } + + 
cfs_percpt_free(the_lnet.ln_peer_tables); + the_lnet.ln_peer_tables = NULL; +} + +void lnet_peer_uninit(void) +{ + int cpt; + struct lnet_peer_ni *lpni, *tmp; + struct lnet_peer_table *ptable = NULL; + + /* remove all peer_nis from the remote peer and the hash list */ + list_for_each_entry_safe(lpni, tmp, &the_lnet.ln_remote_peer_ni_list, + lpni_on_remote_peer_ni_list) { + list_del_init(&lpni->lpni_on_remote_peer_ni_list); + lnet_peer_ni_decref_locked(lpni); + + cpt = lnet_cpt_of_nid_locked(lpni->lpni_nid, NULL); + ptable = the_lnet.ln_peer_tables[cpt]; + ptable->pt_zombies++; + + list_del_init(&lpni->lpni_hashlist); + lnet_peer_ni_decref_locked(lpni); + } + + lnet_peer_tables_destroy(); +} + int lnet_peer_tables_create(void) { @@ -70,33 +129,6 @@ lnet_peer_tables_create(void) return 0; } -void -lnet_peer_tables_destroy(void) -{ - struct lnet_peer_table *ptable; - struct list_head *hash; - int i; - int j; - - if (!the_lnet.ln_peer_tables) - return; - - cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { - hash = ptable->pt_hash; - if (!hash) /* not initialized */ - break; - - ptable->pt_hash = NULL; - for (j = 0; j < LNET_PEER_HASH_SIZE; j++) - LASSERT(list_empty(&hash[j])); - - kvfree(hash); - } - - cfs_percpt_free(the_lnet.ln_peer_tables); - the_lnet.ln_peer_tables = NULL; -} - static void lnet_peer_table_cleanup_locked(struct lnet_ni *ni, struct lnet_peer_table *ptable) @@ -219,10 +251,13 @@ lnet_get_peer_ni_locked(struct lnet_peer_table *ptable, lnet_nid_t nid) } struct lnet_peer_ni * -lnet_find_peer_ni_locked(lnet_nid_t nid, int cpt) +lnet_find_peer_ni_locked(lnet_nid_t nid) { struct lnet_peer_ni *lpni; struct lnet_peer_table *ptable; + int cpt; + + cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); ptable = the_lnet.ln_peer_tables[cpt]; lpni = lnet_get_peer_ni_locked(ptable, nid); @@ -236,7 +271,7 @@ lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, { struct lnet_peer_ni *lpni; - lpni = lnet_find_peer_ni_locked(dst_nid, cpt); + lpni = 
lnet_find_peer_ni_locked(dst_nid); if (!lpni) { int rc; @@ -251,6 +286,25 @@ lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, return 0; } +struct lnet_peer_ni * +lnet_get_peer_ni_idx_locked(int idx, struct lnet_peer_net **lpn, + struct lnet_peer **lp) +{ + struct lnet_peer_ni *lpni; + + list_for_each_entry((*lp), &the_lnet.ln_peers, lp_on_lnet_peer_list) { + list_for_each_entry((*lpn), &((*lp)->lp_peer_nets), + lpn_on_peer_list) { + list_for_each_entry(lpni, &((*lpn)->lpn_peer_nis), + lpni_on_peer_net_list) + if (idx-- == 0) + return lpni; + } + } + + return NULL; +} + struct lnet_peer_ni * lnet_get_next_peer_ni_locked(struct lnet_peer *peer, struct lnet_peer_net *peer_net, @@ -403,6 +457,223 @@ lnet_peer_get_net_locked(struct lnet_peer *peer, u32 net_id) return NULL; } +/* + * given the key nid find the peer to add the new peer NID to. If the key + * nid is NULL, then create a new peer, but first make sure that the NID + * is unique + */ +int +lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid) +{ + struct lnet_peer_ni *lpni, *lpni2; + struct lnet_peer *peer; + struct lnet_peer_net *peer_net, *pn; + int cpt, cpt2, rc; + struct lnet_peer_table *ptable = NULL; + __u32 net_id = LNET_NIDNET(nid); + + if (nid == LNET_NID_ANY) + return -EINVAL; + + /* check that nid is unique */ + cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); + lnet_net_lock(cpt); + lpni = lnet_find_peer_ni_locked(nid); + if (lpni) { + lnet_peer_ni_decref_locked(lpni); + lnet_net_unlock(cpt); + return -EEXIST; + } + lnet_net_unlock(cpt); + + if (key_nid != LNET_NID_ANY) { + cpt2 = lnet_nid_cpt_hash(key_nid, LNET_CPT_NUMBER); + lnet_net_lock(cpt2); + lpni = lnet_find_peer_ni_locked(key_nid); + if (!lpni) { + lnet_net_unlock(cpt2); + /* key_nid refers to a non-existent peer_ni.*/ + return -EINVAL; + } + peer = lpni->lpni_peer_net->lpn_peer; + peer->lp_multi_rail = true; + lnet_peer_ni_decref_locked(lpni); + lnet_net_unlock(cpt2); + } else { + lnet_net_lock(LNET_LOCK_EX); + rc = 
lnet_nid2peerni_locked(&lpni, nid, LNET_LOCK_EX); + if (rc == 0) { + lpni->lpni_peer_net->lpn_peer->lp_multi_rail = true; + lnet_peer_ni_decref_locked(lpni); + } + lnet_net_unlock(LNET_LOCK_EX); + return rc; + } + + lpni = kzalloc_cpt(sizeof(*lpni), GFP_KERNEL, cpt); + if (!lpni) + return -ENOMEM; + + INIT_LIST_HEAD(&lpni->lpni_txq); + INIT_LIST_HEAD(&lpni->lpni_rtrq); + INIT_LIST_HEAD(&lpni->lpni_routes); + INIT_LIST_HEAD(&lpni->lpni_hashlist); + INIT_LIST_HEAD(&lpni->lpni_on_peer_net_list); + INIT_LIST_HEAD(&lpni->lpni_on_remote_peer_ni_list); + + lpni->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! */ + lpni->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ + lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL; + lpni->lpni_nid = nid; + lpni->lpni_cpt = cpt; + lnet_set_peer_ni_health_locked(lpni, true); + + /* allocate here in case we need to add a new peer_net */ + peer_net = NULL; + peer_net = kzalloc(sizeof(*peer_net), GFP_KERNEL); + if (!peer_net) { + rc = -ENOMEM; + kfree(lpni); + return rc; + } + + lnet_net_lock(LNET_LOCK_EX); + + ptable = the_lnet.ln_peer_tables[cpt]; + ptable->pt_number++; + + lpni2 = lnet_find_peer_ni_locked(nid); + if (lpni2) { + lnet_peer_ni_decref_locked(lpni2); + /* sanity check that lpni2's peer is what we expect */ + if (lpni2->lpni_peer_net->lpn_peer != peer) + rc = -EEXIST; + else + rc = -EINVAL; + + ptable->pt_number--; + /* another thread has already added it */ + lnet_net_unlock(LNET_LOCK_EX); + kfree(peer_net); + return rc; + } + + lpni->lpni_net = lnet_get_net_locked(LNET_NIDNET(lpni->lpni_nid)); + if (lpni->lpni_net) { + lpni->lpni_txcredits = + lpni->lpni_mintxcredits = + lpni->lpni_net->net_tunables.lct_peer_tx_credits; + lpni->lpni_rtrcredits = + lpni->lpni_minrtrcredits = + lnet_peer_buffer_credits(lpni->lpni_net); + } else { + /* + * if you're adding a peer which is not on a local network + * then we can't assign any of the credits. It won't be + * picked for sending anyway. 
Eventually a network can be + * added, in this case we need to revisit this peer and + * update its credits. + */ + + /* increment refcount for remote peer list */ + atomic_inc(&lpni->lpni_refcount); + list_add_tail(&lpni->lpni_on_remote_peer_ni_list, + &the_lnet.ln_remote_peer_ni_list); + } + + /* increment refcount for peer on hash list */ + atomic_inc(&lpni->lpni_refcount); + + list_add_tail(&lpni->lpni_hashlist, + &ptable->pt_hash[lnet_nid2peerhash(nid)]); + ptable->pt_version++; + + /* add the lpni to a net */ + list_for_each_entry(pn, &peer->lp_peer_nets, lpn_on_peer_list) { + if (pn->lpn_net_id == net_id) { + list_add_tail(&lpni->lpni_on_peer_net_list, + &pn->lpn_peer_nis); + lpni->lpni_peer_net = pn; + lnet_net_unlock(LNET_LOCK_EX); + kfree(peer_net); + return 0; + } + } + + INIT_LIST_HEAD(&peer_net->lpn_on_peer_list); + INIT_LIST_HEAD(&peer_net->lpn_peer_nis); + + /* build the hierarchy */ + peer_net->lpn_net_id = net_id; + peer_net->lpn_peer = peer; + lpni->lpni_peer_net = peer_net; + list_add_tail(&lpni->lpni_on_peer_net_list, &peer_net->lpn_peer_nis); + list_add_tail(&peer_net->lpn_on_peer_list, &peer->lp_peer_nets); + + lnet_net_unlock(LNET_LOCK_EX); + return 0; +} + +int +lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid) +{ + int cpt; + lnet_nid_t local_nid; + struct lnet_peer *peer; + struct lnet_peer_ni *lpni, *lpni2; + struct lnet_peer_table *ptable = NULL; + + if (key_nid == LNET_NID_ANY) + return -EINVAL; + + local_nid = (nid != LNET_NID_ANY) ? 
nid : key_nid; + cpt = lnet_nid_cpt_hash(local_nid, LNET_CPT_NUMBER); + lnet_net_lock(LNET_LOCK_EX); + + lpni = lnet_find_peer_ni_locked(local_nid); + if (!lpni) { + lnet_net_unlock(cpt); + return -EINVAL; + } + lnet_peer_ni_decref_locked(lpni); + + peer = lpni->lpni_peer_net->lpn_peer; + LASSERT(peer); + + if (peer->lp_primary_nid == lpni->lpni_nid) { + /* + * deleting the primary ni is equivalent to deleting the + * entire peer + */ + lpni = NULL; + lpni = lnet_get_next_peer_ni_locked(peer, NULL, lpni); + while (lpni) { + lpni2 = lnet_get_next_peer_ni_locked(peer, NULL, lpni); + cpt = lnet_nid_cpt_hash(lpni->lpni_nid, + LNET_CPT_NUMBER); + lnet_peer_remove_from_remote_list(lpni); + ptable = the_lnet.ln_peer_tables[cpt]; + ptable->pt_zombies++; + list_del_init(&lpni->lpni_hashlist); + lnet_peer_ni_decref_locked(lpni); + lpni = lpni2; + } + lnet_net_unlock(LNET_LOCK_EX); + + return 0; + } + + lnet_peer_remove_from_remote_list(lpni); + cpt = lnet_nid_cpt_hash(lpni->lpni_nid, LNET_CPT_NUMBER); + ptable = the_lnet.ln_peer_tables[cpt]; + ptable->pt_zombies++; + list_del_init(&lpni->lpni_hashlist); + lnet_peer_ni_decref_locked(lpni); + lnet_net_unlock(LNET_LOCK_EX); + + return 0; +} + void lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lpni) { @@ -487,6 +758,9 @@ lnet_nid2peerni_locked(struct lnet_peer_ni **lpnip, lnet_nid_t nid, int cpt) INIT_LIST_HEAD(&lpni->lpni_txq); INIT_LIST_HEAD(&lpni->lpni_rtrq); INIT_LIST_HEAD(&lpni->lpni_routes); + INIT_LIST_HEAD(&lpni->lpni_hashlist); + INIT_LIST_HEAD(&lpni->lpni_on_peer_net_list); + INIT_LIST_HEAD(&lpni->lpni_on_remote_peer_ni_list); lpni->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! 
*/ lpni->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ @@ -521,8 +795,20 @@ lnet_nid2peerni_locked(struct lnet_peer_ni **lpnip, lnet_nid_t nid, int cpt) lpni->lpni_minrtrcredits = lnet_peer_buffer_credits(lpni->lpni_net); } else { + /* + * if you're adding a peer which is not on a local network + * then we can't assign any of the credits. It won't be + * picked for sending anyway. Eventually a network can be + * added, in this case we need to revisit this peer and + * update its credits. + */ + CDEBUG(D_NET, "peer_ni %s is not directly connected\n", libcfs_nid2str(nid)); + /* increment refcount for remote peer list */ + atomic_inc(&lpni->lpni_refcount); + list_add_tail(&lpni->lpni_on_remote_peer_ni_list, + &the_lnet.ln_remote_peer_ni_list); } lnet_set_peer_ni_health_locked(lpni, true); @@ -584,12 +870,12 @@ lnet_debug_peer(lnet_nid_t nid) } int -lnet_get_peer_info(__u32 peer_index, __u64 *nid, - char aliveness[LNET_MAX_STR_LEN], - __u32 *cpt_iter, __u32 *refcount, - __u32 *ni_peer_tx_credits, __u32 *peer_tx_credits, - __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credits, - __u32 *peer_tx_qnob) +lnet_get_peer_ni_info(__u32 peer_index, __u64 *nid, + char aliveness[LNET_MAX_STR_LEN], + __u32 *cpt_iter, __u32 *refcount, + __u32 *ni_peer_tx_credits, __u32 *peer_tx_credits, + __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credits, + __u32 *peer_tx_qnob) { struct lnet_peer_table *peer_table; struct lnet_peer_ni *lp; @@ -645,3 +931,34 @@ lnet_get_peer_info(__u32 peer_index, __u64 *nid, return found ? 
0 : -ENOENT; } + +int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, + struct lnet_peer_ni_credit_info *peer_ni_info) +{ + struct lnet_peer_ni *lpni = NULL; + struct lnet_peer_net *lpn = NULL; + struct lnet_peer *lp = NULL; + + lpni = lnet_get_peer_ni_idx_locked(idx, &lpn, &lp); + + if (!lpni) + return -ENOENT; + + *primary_nid = lp->lp_primary_nid; + *nid = lpni->lpni_nid; + snprintf(peer_ni_info->cr_aliveness, LNET_MAX_STR_LEN, "NA"); + if (lnet_isrouter(lpni) || + lnet_peer_aliveness_enabled(lpni)) + snprintf(peer_ni_info->cr_aliveness, LNET_MAX_STR_LEN, + lpni->lpni_alive ? "up" : "down"); + + peer_ni_info->cr_refcount = atomic_read(&lpni->lpni_refcount); + peer_ni_info->cr_ni_peer_tx_credits = lpni->lpni_net ? + lpni->lpni_net->net_tunables.lct_peer_tx_credits : 0; + peer_ni_info->cr_peer_tx_credits = lpni->lpni_txcredits; + peer_ni_info->cr_peer_rtr_credits = lpni->lpni_rtrcredits; + peer_ni_info->cr_peer_min_rtr_credits = lpni->lpni_mintxcredits; + peer_ni_info->cr_peer_tx_qnob = lpni->lpni_txqnob; + + return 0; +} diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index de037a77671d..7913914620f3 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -1734,7 +1734,7 @@ lnet_notify(struct lnet_ni *ni, lnet_nid_t nid, int alive, time64_t when) return -ESHUTDOWN; } - lp = lnet_find_peer_ni_locked(nid, cpt); + lp = lnet_find_peer_ni_locked(nid); if (!lp) { /* nid not found */ lnet_net_unlock(cpt);
From patchwork Tue Sep 25 01:07:15 2018 X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613173 From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Tue, 25 Sep 2018 11:07:15 +1000 Message-ID: <153783763527.32103.6166409833321456335.stgit@noble> In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble> References: <153783752960.32103.8394391715843917125.stgit@noble> User-Agent: StGit/0.17.1-dirty Subject: [lustre-devel] [PATCH 11/34] LU-7734 lnet: configure local NI from DLC X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list
List-Id: "For discussing Lustre software development." Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Amir Shehata This patch adds the ability to configure multiple network interfaces on the same network. This can be done via the lnetctl CLI interface or through a YAML configuration. Refer to the multi-rail HLD for more details on the syntax. It also deprecates ip2nets kernel parsing. All string parsing and network matching now happens in the DLC userspace library. New IOCTLs are added for adding/deleting local NIs, to keep backwards compatibility with older versions of the DLC and lnetctl. The changes also include parsing and matching ip2nets syntax at the user level and then passing the network interfaces down to the kernel to be configured. Signed-off-by: Amir Shehata Change-Id: I19ee7dc76514beb6f34de6517d19654d6468bcec Reviewed-on: http://review.whamcloud.com/18886 Tested-by: Maloo Signed-off-by: NeilBrown --- .../lustre/include/linux/libcfs/libcfs_string.h | 12 - .../staging/lustre/include/linux/lnet/lib-lnet.h | 13 - .../lustre/include/uapi/linux/lnet/libcfs_ioctl.h | 6 .../lustre/include/uapi/linux/lnet/lnet-dlc.h | 57 ++ .../staging/lustre/lnet/klnds/socklnd/socklnd.c | 3 drivers/staging/lustre/lnet/lnet/api-ni.c | 479 +++++++++++++++++--- drivers/staging/lustre/lnet/lnet/config.c | 107 +++- drivers/staging/lustre/lnet/lnet/module.c | 70 ++- drivers/staging/lustre/lnet/lnet/peer.c | 21 + drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 2 drivers/staging/lustre/lustre/ptlrpc/service.c | 4 11 files changed, 650 insertions(+), 124 deletions(-) diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_string.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_string.h index cd7c3ccb2dc0..3117708b9ebb 100644 --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_string.h +++ 
b/drivers/staging/lustre/include/linux/libcfs/libcfs_string.h @@ -83,20 +83,10 @@ int cfs_expr_list_print(char *buffer, int count, struct cfs_expr_list *expr_list); int cfs_expr_list_values(struct cfs_expr_list *expr_list, int max, u32 **values); -static inline void -cfs_expr_list_values_free(u32 *values, int num) -{ - /* - * This array is allocated by kvalloc(), so it shouldn't be freed - * by OBD_FREE() if it's called by module other than libcfs & LNet, - * otherwise we will see fake memory leak - */ - kvfree(values); -} - void cfs_expr_list_free(struct cfs_expr_list *expr_list); int cfs_expr_list_parse(char *str, int len, unsigned int min, unsigned int max, struct cfs_expr_list **elpp); +void cfs_expr_list_free(struct cfs_expr_list *expr_list); void cfs_expr_list_free_list(struct list_head *list); #endif diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 11642f8aee90..a7cff6426ad8 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -377,6 +377,9 @@ lnet_net_alloc(__u32 net_type, struct list_head *netlist); struct lnet_ni * lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface); +struct lnet_ni * +lnet_ni_alloc_w_cpt_array(struct lnet_net *net, __u32 *cpts, __u32 ncpts, + char *iface); static inline int lnet_nid2peerhash(lnet_nid_t nid) @@ -401,7 +404,7 @@ int lnet_cpt_of_nid(lnet_nid_t nid, struct lnet_ni *ni); struct lnet_ni *lnet_nid2ni_locked(lnet_nid_t nid, int cpt); struct lnet_ni *lnet_nid2ni_addref(lnet_nid_t nid); struct lnet_ni *lnet_net2ni_locked(__u32 net, int cpt); -struct lnet_ni *lnet_net2ni(__u32 net); +struct lnet_ni *lnet_net2ni_addref(__u32 net); bool lnet_is_ni_healthy_locked(struct lnet_ni *ni); struct lnet_net *lnet_get_net_locked(u32 net_id); @@ -435,9 +438,10 @@ int lnet_rtrpools_enable(void); void lnet_rtrpools_disable(void); void lnet_rtrpools_free(int keep_pools); 
struct lnet_remotenet *lnet_find_rnet_locked(__u32 net); -int lnet_dyn_add_ni(lnet_pid_t requested_pid, - struct lnet_ioctl_config_data *conf); -int lnet_dyn_del_ni(__u32 net); +int lnet_dyn_add_net(struct lnet_ioctl_config_data *conf); +int lnet_dyn_del_net(__u32 net); +int lnet_dyn_add_ni(struct lnet_ioctl_config_ni *conf); +int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf); int lnet_clear_lazy_portal(struct lnet_ni *ni, int portal, char *reason); struct lnet_net *lnet_get_net_locked(__u32 net_id); @@ -646,6 +650,7 @@ int lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, struct lnet_peer **peer); int lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid); +void lnet_peer_net_added(struct lnet_net *net); void lnet_peer_tables_cleanup(struct lnet_ni *ni); void lnet_peer_uninit(void); int lnet_peer_tables_create(void); diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h b/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h index d5a3e7c85aa4..fa58aaf6ad9d 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h @@ -139,6 +139,10 @@ struct libcfs_debug_ioctl_data { #define IOC_LIBCFS_ADD_PEER_NI _IOWR(IOC_LIBCFS_TYPE, 92, IOCTL_CONFIG_SIZE) #define IOC_LIBCFS_DEL_PEER_NI _IOWR(IOC_LIBCFS_TYPE, 93, IOCTL_CONFIG_SIZE) #define IOC_LIBCFS_GET_PEER_NI _IOWR(IOC_LIBCFS_TYPE, 94, IOCTL_CONFIG_SIZE) -#define IOC_LIBCFS_MAX_NR 94 +#define IOC_LIBCFS_ADD_LOCAL_NI _IOWR(IOC_LIBCFS_TYPE, 95, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_DEL_LOCAL_NI _IOWR(IOC_LIBCFS_TYPE, 96, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_GET_LOCAL_NI _IOWR(IOC_LIBCFS_TYPE, 97, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_DBG _IOWR(IOC_LIBCFS_TYPE, 98, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_MAX_NR 98 #endif /* __LIBCFS_IOCTL_H__ */ diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h 
b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h index 9c4e05e1b683..bfd9fc6bc4df 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h @@ -37,6 +37,18 @@ #define LNET_MAX_SHOW_NUM_CPT 128 #define LNET_UNDEFINED_HOPS ((__u32)(-1)) +/* + * To allow for future enhancements to extend the tunables + * add a hdr to this structure, so that the version can be set + * and checked for backwards compatibility. Newer versions of LNet + * can still work with older versions of lnetctl. The restriction is + * that the structure can be added to and not removed from in order + * to not invalidate older lnetctl utilities. Moreover, the order of + * fields must remain the same, and new fields appended to the structure + * + * That said all existing LND tunables will be added in this structure + * to avoid future changes. + */ struct lnet_ioctl_config_lnd_cmn_tunables { __u32 lct_version; __s32 lct_peer_timeout; @@ -82,6 +94,10 @@ struct lnet_ioctl_net_config { /* # different router buffer pools */ #define LNET_NRBPOOLS (LNET_LARGE_BUF_IDX + 1) +enum lnet_dbg_task { + LNET_DBG_INCR_DLC_SEQ = 0 +}; + struct lnet_ioctl_pool_cfg { struct { __u32 pl_npages; @@ -126,6 +142,29 @@ struct lnet_ioctl_config_data { char cfg_bulk[0]; }; +/* + * lnet_ioctl_config_ni + * This structure describes an NI configuration. 
There are multiple components + * when configuring an NI: Net, Interfaces, CPT list and LND tunables + * A network is passed as a string to the DLC and translated using + * libcfs_str2net() + * An interface is the name of the system configured interface + * (ex eth0, ib1) + * CPT is the list of CPTs + * LND tunables are passed in the lic_bulk area + */ +struct lnet_ioctl_config_ni { + struct libcfs_ioctl_hdr lic_cfg_hdr; + lnet_nid_t lic_nid; + char lic_ni_intf[LNET_MAX_INTERFACES][LNET_MAX_STR_LEN]; + char lic_legacy_ip2nets[LNET_MAX_STR_LEN]; + __u32 lic_cpts[LNET_MAX_SHOW_NUM_CPT]; + __u32 lic_ncpts; + __u32 lic_status; + __u32 lic_tcp_bonding; + __u32 lic_idx; + char lic_bulk[0]; +}; + struct lnet_peer_ni_credit_info { char cr_aliveness[LNET_MAX_STR_LEN]; __u32 cr_refcount; @@ -148,6 +187,24 @@ struct lnet_ioctl_peer { } pr_lnd_u; }; +struct lnet_dbg_task_info { + /* + * TODO: a union can be added if the task requires more + * information from user space to be carried out in kernel space. + */ +}; + +/* + * This structure is intended to allow execution of debugging tasks. This + * is not intended to be backwards compatible. 
Extra tasks can be added in + * the future + */ +struct lnet_ioctl_dbg { + struct libcfs_ioctl_hdr dbg_hdr; + enum lnet_dbg_task dbg_task; + char dbg_bulk[0]; +}; + struct lnet_ioctl_peer_cfg { struct libcfs_ioctl_hdr prcfg_hdr; lnet_nid_t prcfg_key_nid; diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c index 766f0d525661..9df66c6d160f 100644 --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c @@ -2700,7 +2700,8 @@ ksocknal_net_start_threads(struct ksock_net *net, __u32 *cpts, int ncpts) int rc; int i; - LASSERT(ncpts > 0 && ncpts <= cfs_cpt_number(lnet_cpt_table())); + if (ncpts > 0 && ncpts > cfs_cpt_number(lnet_cpt_table())) + return -EINVAL; for (i = 0; i < ncpts; i++) { struct ksock_sched_info *info; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 710f8a0be934..1ef9a39b517d 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -679,17 +679,19 @@ lnet_net2ni_locked(__u32 net_id, int cpt) } struct lnet_ni * -lnet_net2ni(__u32 net) +lnet_net2ni_addref(__u32 net) { struct lnet_ni *ni; lnet_net_lock(0); ni = lnet_net2ni_locked(net, 0); + if (ni) + lnet_ni_addref_locked(ni, 0); lnet_net_unlock(0); return ni; } -EXPORT_SYMBOL(lnet_net2ni); +EXPORT_SYMBOL(lnet_net2ni_addref); struct lnet_net * lnet_get_net_locked(__u32 net_id) @@ -897,6 +899,18 @@ lnet_get_net_ni_count_locked(struct lnet_net *net) return count; } +static inline int +lnet_get_net_ni_count_pre(struct lnet_net *net) +{ + struct lnet_ni *ni; + int count = 0; + + list_for_each_entry(ni, &net->net_ni_added, ni_netlist) + count++; + + return count; +} + static inline int lnet_get_ni_count(void) { @@ -1839,15 +1853,91 @@ LNetNIFini(void) } EXPORT_SYMBOL(LNetNIFini); +static int lnet_handle_dbg_task(struct lnet_ioctl_dbg *dbg, + struct lnet_dbg_task_info *dbg_info) +{ + 
switch (dbg->dbg_task) { + case LNET_DBG_INCR_DLC_SEQ: + lnet_incr_dlc_seq(); + } + + return 0; +} + /** * Grabs the ni data from the ni structure and fills the out * parameters * - * \param[in] ni network interface structure - * \param[out] config NI configuration + * \param[in] ni network interface structure + * \param[out] cfg_ni NI config information + * \param[out] tun network and LND tunables */ static void -lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_data *config) +lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_ni *cfg_ni, + struct lnet_ioctl_config_lnd_tunables *tun, + __u32 tun_size) +{ + size_t min_size = 0; + int i; + + if (!ni || !cfg_ni || !tun) + return; + + if (ni->ni_interfaces[0]) { + for (i = 0; i < ARRAY_SIZE(ni->ni_interfaces); i++) { + if (ni->ni_interfaces[i]) { + strncpy(cfg_ni->lic_ni_intf[i], + ni->ni_interfaces[i], + sizeof(cfg_ni->lic_ni_intf[i])); + } + } + } + + cfg_ni->lic_nid = ni->ni_nid; + cfg_ni->lic_status = ni->ni_status->ns_status; + cfg_ni->lic_tcp_bonding = use_tcp_bonding; + + memcpy(&tun->lt_cmn, &ni->ni_net->net_tunables, sizeof(tun->lt_cmn)); + + /* + * tun->lt_tun will always be present, but in order to be + * backwards compatible, we need to deal with the cases when + * tun->lt_tun is smaller than what the kernel has, because it + * comes from an older version of a userspace program, then we'll + * need to copy as much information as we have available space. 
+ */ + min_size = tun_size - sizeof(tun->lt_cmn); + memcpy(&tun->lt_tun, &ni->ni_lnd_tunables, min_size); + + /* copy over the cpts */ + if (ni->ni_ncpts == LNET_CPT_NUMBER && + !ni->ni_cpts) { + for (i = 0; i < ni->ni_ncpts; i++) + cfg_ni->lic_cpts[i] = i; + } else { + for (i = 0; + ni->ni_cpts && i < ni->ni_ncpts && + i < LNET_MAX_SHOW_NUM_CPT; + i++) + cfg_ni->lic_cpts[i] = ni->ni_cpts[i]; + } + cfg_ni->lic_ncpts = ni->ni_ncpts; +} + +/** + * NOTE: This is a legacy function left in the code to be backwards + * compatible with older userspace programs. It should eventually be + * removed. + * + * Grabs the ni data from the ni structure and fills the out + * parameters + * + * \param[in] ni network interface structure + * \param[out] config config information + */ +static void +lnet_fill_ni_info_legacy(struct lnet_ni *ni, + struct lnet_ioctl_config_data *config) { struct lnet_ioctl_config_lnd_tunables *lnd_cfg = NULL; struct lnet_ioctl_net_config *net_config; @@ -1994,7 +2084,7 @@ lnet_get_net_config(struct lnet_ioctl_config_data *config) if (ni) { rc = 0; lnet_ni_lock(ni); - lnet_fill_ni_info(ni, config); + lnet_fill_ni_info_legacy(ni, config); lnet_ni_unlock(ni); } @@ -2003,38 +2093,43 @@ lnet_get_net_config(struct lnet_ioctl_config_data *config) } int -lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) +lnet_get_ni_config(struct lnet_ioctl_config_ni *cfg_ni, + struct lnet_ioctl_config_lnd_tunables *tun, + __u32 tun_size) { - char *nets = conf->cfg_config_u.cfg_net.net_intf; - struct lnet_ping_info *pinfo; - struct lnet_handle_md md_handle; - struct lnet_net *net; - struct list_head net_head; - struct lnet_remotenet *rnet; - int rc; - int net_ni_count; - int num_acceptor_nets; - u32 net_type; - struct lnet_ioctl_config_lnd_tunables *lnd_tunables = NULL; - - INIT_LIST_HEAD(&net_head); + struct lnet_ni *ni; + int cpt; + int rc = -ENOENT; - if (conf && conf->cfg_hdr.ioc_len > sizeof(*conf)) - lnd_tunables = (struct 
lnet_ioctl_config_lnd_tunables *)conf->cfg_bulk; + if (!cfg_ni || !tun) + return -EINVAL; - /* Create a net/ni structures for the network string */ - rc = lnet_parse_networks(&net_head, nets, use_tcp_bonding); - if (rc <= 0) - return !rc ? -EINVAL : rc; + cpt = lnet_net_lock_current(); - mutex_lock(&the_lnet.ln_api_mutex); + ni = lnet_get_ni_idx_locked(cfg_ni->lic_idx); - if (rc > 1) { - rc = -EINVAL; /* only add one network per call */ - goto failed0; + if (ni) { + rc = 0; + lnet_ni_lock(ni); + lnet_fill_ni_info(ni, cfg_ni, tun, tun_size); + lnet_ni_unlock(ni); } - net = list_entry(net_head.next, struct lnet_net, net_list); + lnet_net_unlock(cpt); + return rc; +} + +static int lnet_add_net_common(struct lnet_net *net, + struct lnet_ioctl_config_lnd_tunables *tun) +{ + struct lnet_net *netl = NULL; + u32 net_id; + struct lnet_ping_info *pinfo; + struct lnet_handle_md md_handle; + int rc; + struct lnet_remotenet *rnet; + int net_ni_count; + int num_acceptor_nets; lnet_net_lock(LNET_LOCK_EX); rnet = lnet_find_rnet_locked(net->net_id); @@ -2045,9 +2140,9 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) */ if (rnet) { CERROR("Adding net %s will invalidate routing configuration\n", - nets); + libcfs_net2str(net->net_id)); rc = -EUSERS; - goto failed0; + goto failed1; } /* @@ -2056,21 +2151,21 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) * we should allocate enough slots to accomodate the number of NIs * which will be added. 
* - * We can use lnet_get_net_ni_count_locked() since the net is not - * on a public list yet, so locking is not a problem + * since ni hasn't been configured yet, use + * lnet_get_net_ni_count_pre() which checks the net_ni_added list */ - net_ni_count = lnet_get_net_ni_count_locked(net); + net_ni_count = lnet_get_net_ni_count_pre(net); rc = lnet_ping_info_setup(&pinfo, &md_handle, net_ni_count + lnet_get_ni_count(), false); - if (rc) - goto failed0; - - list_del_init(&net->net_list); - if (lnd_tunables) + if (rc < 0) + goto failed1; + if (tun) memcpy(&net->net_tunables, - &lnd_tunables->lt_cmn, sizeof(lnd_tunables->lt_cmn)); + &tun->lt_cmn, sizeof(net->net_tunables)); + else + memset(&net->net_tunables, -1, sizeof(net->net_tunables)); /* * before starting this network get a count of the current TCP @@ -2080,47 +2175,269 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) */ num_acceptor_nets = lnet_count_acceptor_nets(); - /* - * lnd_startup_lndnet() can deallocate 'net' even if it it returns - * success, because we endded up adding interfaces to an existing - * network. So grab the net_type now - */ - net_type = LNET_NETTYP(net->net_id); + net_id = net->net_id; - rc = lnet_startup_lndnet(net, (lnd_tunables ? - &lnd_tunables->lt_tun : NULL)); + rc = lnet_startup_lndnet(net, (tun ? + &tun->lt_tun : NULL)); if (rc < 0) - goto failed1; + goto failed; + + lnet_net_lock(LNET_LOCK_EX); + netl = lnet_get_net_locked(net_id); + lnet_net_unlock(LNET_LOCK_EX); + + LASSERT(netl); /* * Start the acceptor thread if this is the first network * being added that requires the thread. 
*/ - if (net_type == SOCKLND && num_acceptor_nets == 0) { + if (netl->net_lnd->lnd_accept && + num_acceptor_nets == 0) { rc = lnet_acceptor_start(); if (rc < 0) { /* shutdown the net that we just started */ CERROR("Failed to start up acceptor thread\n"); - /* - * Note that if we needed to start the acceptor - * thread, then 'net' must have been the first TCP - * network, therefore was unique, and therefore - * wasn't deallocated by lnet_startup_lndnet() - */ lnet_shutdown_lndnet(net); - goto failed1; + goto failed; } } + lnet_net_lock(LNET_LOCK_EX); + lnet_peer_net_added(netl); + lnet_net_unlock(LNET_LOCK_EX); + lnet_ping_target_update(pinfo, md_handle); - mutex_unlock(&the_lnet.ln_api_mutex); return 0; -failed1: +failed: lnet_ping_md_unlink(pinfo, &md_handle); lnet_ping_info_free(pinfo); -failed0: +failed1: + lnet_net_free(net); + return rc; +} + +static int lnet_handle_legacy_ip2nets(char *ip2nets, + struct lnet_ioctl_config_lnd_tunables *tun) +{ + struct lnet_net *net; + char *nets; + int rc; + struct list_head net_head; + + INIT_LIST_HEAD(&net_head); + + rc = lnet_parse_ip2nets(&nets, ip2nets); + if (rc < 0) + return rc; + + rc = lnet_parse_networks(&net_head, nets, use_tcp_bonding); + if (rc < 0) + return rc; + + mutex_lock(&the_lnet.ln_api_mutex); + while (!list_empty(&net_head)) { + net = list_entry(net_head.next, struct lnet_net, net_list); + list_del_init(&net->net_list); + rc = lnet_add_net_common(net, tun); + if (rc < 0) + goto out; + } + +out: + mutex_unlock(&the_lnet.ln_api_mutex); + + while (!list_empty(&net_head)) { + net = list_entry(net_head.next, struct lnet_net, net_list); + list_del_init(&net->net_list); + lnet_net_free(net); + } + return rc; +} + +int lnet_dyn_add_ni(struct lnet_ioctl_config_ni *conf) +{ + struct lnet_net *net; + struct lnet_ni *ni; + struct lnet_ioctl_config_lnd_tunables *tun = NULL; + int rc; + u32 net_id; + + /* get the tunables if they are available */ + if (conf->lic_cfg_hdr.ioc_len >= + sizeof(*conf) + sizeof(*tun)) + tun 
= (struct lnet_ioctl_config_lnd_tunables *) + conf->lic_bulk; + + /* handle legacy ip2nets from DLC */ + if (conf->lic_legacy_ip2nets[0] != '\0') + return lnet_handle_legacy_ip2nets(conf->lic_legacy_ip2nets, + tun); + + net_id = LNET_NIDNET(conf->lic_nid); + + net = lnet_net_alloc(net_id, NULL); + if (!net) + return -ENOMEM; + + ni = lnet_ni_alloc_w_cpt_array(net, conf->lic_cpts, conf->lic_ncpts, + conf->lic_ni_intf[0]); + if (!ni) + return -ENOMEM; + + mutex_lock(&the_lnet.ln_api_mutex); + + rc = lnet_add_net_common(net, tun); + + mutex_unlock(&the_lnet.ln_api_mutex); + + return rc; +} + +int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf) +{ + struct lnet_net *net; + struct lnet_ni *ni; + u32 net_id = LNET_NIDNET(conf->lic_nid); + struct lnet_ping_info *pinfo; + struct lnet_handle_md md_handle; + int rc; + int net_count; + u32 addr; + + /* don't allow userspace to shutdown the LOLND */ + if (LNET_NETTYP(net_id) == LOLND) + return -EINVAL; + + mutex_lock(&the_lnet.ln_api_mutex); + + lnet_net_lock(0); + + net = lnet_get_net_locked(net_id); + if (!net) { + CERROR("net %s not found\n", + libcfs_net2str(net_id)); + rc = -ENOENT; + goto net_unlock; + } + + addr = LNET_NIDADDR(conf->lic_nid); + if (addr == 0) { + /* remove the entire net */ + net_count = lnet_get_net_ni_count_locked(net); + + lnet_net_unlock(0); + + /* create and link a new ping info, before removing the old one */ + rc = lnet_ping_info_setup(&pinfo, &md_handle, + lnet_get_ni_count() - net_count, + false); + if (rc != 0) + goto out; + + lnet_shutdown_lndnet(net); + + if (lnet_count_acceptor_nets() == 0) + lnet_acceptor_stop(); + + lnet_ping_target_update(pinfo, md_handle); + + goto out; + } + + ni = lnet_nid2ni_locked(conf->lic_nid, 0); + if (!ni) { + CERROR("nid %s not found\n", + libcfs_nid2str(conf->lic_nid)); + rc = -ENOENT; + goto net_unlock; + } + + net_count = lnet_get_net_ni_count_locked(net); + + lnet_net_unlock(0); + + /* create and link a new ping info, before removing the old one */ + rc 
= lnet_ping_info_setup(&pinfo, &md_handle,
+				  lnet_get_ni_count() - 1, false);
+	if (rc != 0)
+		goto out;
+
+	lnet_shutdown_lndni(ni);
+
+	if (lnet_count_acceptor_nets() == 0)
+		lnet_acceptor_stop();
+
+	lnet_ping_target_update(pinfo, md_handle);
+
+	/* check if the net is empty and remove it if it is */
+	if (net_count == 1)
+		lnet_shutdown_lndnet(net);
+
+	goto out;
+
+net_unlock:
+	lnet_net_unlock(0);
+out:
+	mutex_unlock(&the_lnet.ln_api_mutex);
+
+	return rc;
+}
+
+/*
+ * lnet_dyn_add_net and lnet_dyn_del_net are now deprecated.
+ * They are only expected to be called for unique networks.
+ * That can be as a result of older DLC library
+ * calls. Multi-Rail DLC and beyond no longer uses these APIs.
+ */
+int
+lnet_dyn_add_net(struct lnet_ioctl_config_data *conf)
+{
+	struct lnet_net *net;
+	struct list_head net_head;
+	int rc;
+	struct lnet_ioctl_config_lnd_tunables tun;
+	char *nets = conf->cfg_config_u.cfg_net.net_intf;
+
+	INIT_LIST_HEAD(&net_head);
+
+	/* Create a net/ni structures for the network string */
+	rc = lnet_parse_networks(&net_head, nets, use_tcp_bonding);
+	if (rc <= 0)
+		return rc == 0 ?
-EINVAL : rc; + + mutex_lock(&the_lnet.ln_api_mutex); + + if (rc > 1) { + rc = -EINVAL; /* only add one network per call */ + goto failed; + } + + net = list_entry(net_head.next, struct lnet_net, net_list); + list_del_init(&net->net_list); + + LASSERT(lnet_net_unique(net->net_id, &the_lnet.ln_nets, NULL)); + + memset(&tun, 0, sizeof(tun)); + + tun.lt_cmn.lct_peer_timeout = + conf->cfg_config_u.cfg_net.net_peer_timeout; + tun.lt_cmn.lct_peer_tx_credits = + conf->cfg_config_u.cfg_net.net_peer_tx_credits; + tun.lt_cmn.lct_peer_rtr_credits = + conf->cfg_config_u.cfg_net.net_peer_rtr_credits; + tun.lt_cmn.lct_max_tx_credits = + conf->cfg_config_u.cfg_net.net_max_tx_credits; + + rc = lnet_add_net_common(net, &tun); + if (rc != 0) + goto failed; + + return 0; + +failed: mutex_unlock(&the_lnet.ln_api_mutex); while (!list_empty(&net_head)) { net = list_entry(net_head.next, struct lnet_net, net_list); @@ -2131,7 +2448,7 @@ lnet_dyn_add_ni(lnet_pid_t requested_pid, struct lnet_ioctl_config_data *conf) } int -lnet_dyn_del_ni(__u32 net_id) +lnet_dyn_del_net(__u32 net_id) { struct lnet_net *net; struct lnet_ping_info *pinfo; @@ -2256,6 +2573,25 @@ LNetCtl(unsigned int cmd, void *arg) &config->cfg_config_u.cfg_route.rtr_flags, &config->cfg_config_u.cfg_route.rtr_priority); + case IOC_LIBCFS_GET_LOCAL_NI: { + struct lnet_ioctl_config_ni *cfg_ni; + struct lnet_ioctl_config_lnd_tunables *tun = NULL; + __u32 tun_size; + + cfg_ni = arg; + /* get the tunables if they are available */ + if (cfg_ni->lic_cfg_hdr.ioc_len < + sizeof(*cfg_ni) + sizeof(*tun)) + return -EINVAL; + + tun = (struct lnet_ioctl_config_lnd_tunables *) + cfg_ni->lic_bulk; + + tun_size = cfg_ni->lic_cfg_hdr.ioc_len - sizeof(*cfg_ni); + + return lnet_get_ni_config(cfg_ni, tun, tun_size); + } + case IOC_LIBCFS_GET_NET: { size_t total = sizeof(*config) + sizeof(struct lnet_ioctl_net_config); @@ -2423,8 +2759,22 @@ LNetCtl(unsigned int cmd, void *arg) data->ioc_count = rc; return 0; } + + case IOC_LIBCFS_DBG: { + struct 
lnet_ioctl_dbg *dbg = arg; + struct lnet_dbg_task_info *dbg_info; + size_t total = sizeof(*dbg) + sizeof(*dbg_info); + + if (dbg->dbg_hdr.ioc_len < total) + return -EINVAL; + + dbg_info = (struct lnet_dbg_task_info *)dbg->dbg_bulk; + + return lnet_handle_dbg_task(dbg, dbg_info); + } + default: - ni = lnet_net2ni(data->ioc_net); + ni = lnet_net2ni_addref(data->ioc_net); if (!ni) return -EINVAL; @@ -2433,6 +2783,7 @@ LNetCtl(unsigned int cmd, void *arg) else rc = ni->ni_net->net_lnd->lnd_ctl(ni, cmd, arg); + lnet_ni_decref(ni); return rc; } /* not reached */ diff --git a/drivers/staging/lustre/lnet/lnet/config.c b/drivers/staging/lustre/lnet/lnet/config.c index 9539ce07ae05..c11821a5838c 100644 --- a/drivers/staging/lustre/lnet/lnet/config.c +++ b/drivers/staging/lustre/lnet/lnet/config.c @@ -87,6 +87,9 @@ lnet_net_unique(__u32 net_id, struct list_head *netlist, { struct lnet_net *net_l; + if (!netlist) + return true; + list_for_each_entry(net_l, netlist, net_list) { if (net_l->net_id == net_id) { if (net) @@ -172,6 +175,7 @@ lnet_net_append_cpts(__u32 *cpts, __u32 ncpts, struct lnet_net *net) if (!net->net_cpts) return -ENOMEM; memcpy(net->net_cpts, cpts, ncpts); + net->net_ncpts = ncpts; return 0; } @@ -298,8 +302,7 @@ lnet_ni_free(struct lnet_ni *ni) if (ni->ni_tx_queues) cfs_percpt_free(ni->ni_tx_queues); - if (ni->ni_cpts) - cfs_expr_list_values_free(ni->ni_cpts, ni->ni_ncpts); + kfree(ni->ni_cpts); for (i = 0; i < LNET_MAX_INTERFACES && ni->ni_interfaces[i]; i++) kfree(ni->ni_interfaces[i]); @@ -371,7 +374,8 @@ lnet_net_alloc(__u32 net_id, struct list_head *net_list) net->net_tunables.lct_peer_tx_credits = -1; net->net_tunables.lct_peer_rtr_credits = -1; - list_add_tail(&net->net_list, net_list); + if (net_list) + list_add_tail(&net->net_list, net_list); return net; } @@ -414,13 +418,11 @@ lnet_ni_add_interface(struct lnet_ni *ni, char *iface) return 0; } -/* allocate and add to the provided network */ -struct lnet_ni * -lnet_ni_alloc(struct lnet_net *net, 
struct cfs_expr_list *el, char *iface) +static struct lnet_ni * +lnet_ni_alloc_common(struct lnet_net *net, char *iface) { struct lnet_tx_queue *tq; struct lnet_ni *ni; - int rc; int i; if (iface) @@ -452,6 +454,45 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) cfs_percpt_for_each(tq, i, ni->ni_tx_queues) INIT_LIST_HEAD(&tq->tq_delayed); + ni->ni_net = net; + /* LND will fill in the address part of the NID */ + ni->ni_nid = LNET_MKNID(net->net_id, 0); + + /* Store net namespace in which current ni is being created */ + if (current->nsproxy->net_ns) + ni->ni_net_ns = get_net(current->nsproxy->net_ns); + else + ni->ni_net_ns = NULL; + + ni->ni_last_alive = ktime_get_real_seconds(); + ni->ni_state = LNET_NI_STATE_INIT; + list_add_tail(&ni->ni_netlist, &net->net_ni_added); + + /* + * if an interface name is provided then make sure to add in that + * interface name in NI + */ + if (iface) + if (lnet_ni_add_interface(ni, iface) != 0) + goto failed; + + return ni; +failed: + lnet_ni_free(ni); + return NULL; +} + +/* allocate and add to the provided network */ +struct lnet_ni * +lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) +{ + struct lnet_ni *ni; + int rc; + + ni = lnet_ni_alloc_common(net, iface); + if (!ni) + return NULL; + if (!el) { ni->ni_cpts = NULL; ni->ni_ncpts = LNET_CPT_NUMBER; @@ -466,35 +507,51 @@ lnet_ni_alloc(struct lnet_net *net, struct cfs_expr_list *el, char *iface) LASSERT(rc <= LNET_CPT_NUMBER); if (rc == LNET_CPT_NUMBER) { - cfs_expr_list_values_free(ni->ni_cpts, LNET_CPT_NUMBER); + kfree(ni->ni_cpts); ni->ni_cpts = NULL; } ni->ni_ncpts = rc; } - ni->ni_net = net; - /* LND will fill in the address part of the NID */ - ni->ni_nid = LNET_MKNID(net->net_id, 0); - - /* Store net namespace in which current ni is being created */ - if (current->nsproxy->net_ns) - ni->ni_net_ns = get_net(current->nsproxy->net_ns); - else - ni->ni_net_ns = NULL; - - ni->ni_last_alive = ktime_get_real_seconds(); - 
ni->ni_state = LNET_NI_STATE_INIT; rc = lnet_net_append_cpts(ni->ni_cpts, ni->ni_ncpts, net); if (rc != 0) goto failed; - list_add_tail(&ni->ni_netlist, &net->net_ni_added); - /* if an interface name is provided then make sure to add in that - * interface name in NI */ - if (iface) - if (lnet_ni_add_interface(ni, iface) != 0) + return ni; +failed: + lnet_ni_free(ni); + return NULL; +} + +struct lnet_ni * +lnet_ni_alloc_w_cpt_array(struct lnet_net *net, __u32 *cpts, __u32 ncpts, + char *iface) +{ + struct lnet_ni *ni; + int rc; + + ni = lnet_ni_alloc_common(net, iface); + if (!ni) + return NULL; + + if (ncpts == 0) { + ni->ni_cpts = NULL; + ni->ni_ncpts = LNET_CPT_NUMBER; + } else { + size_t array_size = ncpts * sizeof(ni->ni_cpts[0]); + + ni->ni_cpts = kmalloc_array(ncpts, sizeof(ni->ni_cpts[0]), + GFP_KERNEL); + if (!ni->ni_cpts) goto failed; + memcpy(ni->ni_cpts, cpts, array_size); + ni->ni_ncpts = ncpts; + } + + rc = lnet_net_append_cpts(ni->ni_cpts, ni->ni_ncpts, net); + if (rc != 0) + goto failed; return ni; failed: diff --git a/drivers/staging/lustre/lnet/lnet/module.c b/drivers/staging/lustre/lnet/lnet/module.c index 9d06664f0c17..c82d27592391 100644 --- a/drivers/staging/lustre/lnet/lnet/module.c +++ b/drivers/staging/lustre/lnet/lnet/module.c @@ -92,7 +92,7 @@ lnet_unconfigure(void) } static int -lnet_dyn_configure(struct libcfs_ioctl_hdr *hdr) +lnet_dyn_configure_net(struct libcfs_ioctl_hdr *hdr) { struct lnet_ioctl_config_data *conf = (struct lnet_ioctl_config_data *)hdr; @@ -102,19 +102,17 @@ lnet_dyn_configure(struct libcfs_ioctl_hdr *hdr) return -EINVAL; mutex_lock(&lnet_config_mutex); - if (!the_lnet.ln_niinit_self) { + if (the_lnet.ln_niinit_self) + rc = lnet_dyn_add_net(conf); + else rc = -EINVAL; - goto out_unlock; - } - rc = lnet_dyn_add_ni(LNET_PID_LUSTRE, conf); -out_unlock: mutex_unlock(&lnet_config_mutex); return rc; } static int -lnet_dyn_unconfigure(struct libcfs_ioctl_hdr *hdr) +lnet_dyn_unconfigure_net(struct libcfs_ioctl_hdr *hdr) { 
 	struct lnet_ioctl_config_data *conf =
 		(struct lnet_ioctl_config_data *)hdr;

@@ -124,12 +122,50 @@ lnet_dyn_unconfigure(struct libcfs_ioctl_hdr *hdr)
 		return -EINVAL;

 	mutex_lock(&lnet_config_mutex);
-	if (!the_lnet.ln_niinit_self) {
+	if (the_lnet.ln_niinit_self)
+		rc = lnet_dyn_del_net(conf->cfg_net);
+	else
+		rc = -EINVAL;
+	mutex_unlock(&lnet_config_mutex);
+
+	return rc;
+}
+
+static int
+lnet_dyn_configure_ni(struct libcfs_ioctl_hdr *hdr)
+{
+	struct lnet_ioctl_config_ni *conf =
+		(struct lnet_ioctl_config_ni *)hdr;
+	int rc;
+
+	if (conf->lic_cfg_hdr.ioc_len < sizeof(*conf))
+		return -EINVAL;
+
+	mutex_lock(&lnet_config_mutex);
+	if (the_lnet.ln_niinit_self)
+		rc = lnet_dyn_add_ni(conf);
+	else
+		rc = -EINVAL;
+	mutex_unlock(&lnet_config_mutex);
+
+	return rc;
+}
+
+static int
+lnet_dyn_unconfigure_ni(struct libcfs_ioctl_hdr *hdr)
+{
+	struct lnet_ioctl_config_ni *conf =
+		(struct lnet_ioctl_config_ni *)hdr;
+	int rc;
+
+	if (conf->lic_cfg_hdr.ioc_len < sizeof(*conf))
+		return -EINVAL;
+
+	mutex_lock(&lnet_config_mutex);
+	if (the_lnet.ln_niinit_self)
+		rc = lnet_dyn_del_ni(conf);
+	else
 		rc = -EINVAL;
-		goto out_unlock;
-	}
-	rc = lnet_dyn_del_ni(conf->cfg_net);
-out_unlock:
 	mutex_unlock(&lnet_config_mutex);

 	return rc;
@@ -161,11 +197,17 @@ lnet_ioctl(struct notifier_block *nb,
 		break;

 	case IOC_LIBCFS_ADD_NET:
-		rc = lnet_dyn_configure(hdr);
+		rc = lnet_dyn_configure_net(hdr);
 		break;

 	case IOC_LIBCFS_DEL_NET:
-		rc = lnet_dyn_unconfigure(hdr);
+		rc = lnet_dyn_unconfigure_net(hdr);
 		break;
+
+	case IOC_LIBCFS_ADD_LOCAL_NI:
+		return lnet_dyn_configure_ni(hdr);
+
+	case IOC_LIBCFS_DEL_LOCAL_NI:
+		return lnet_dyn_unconfigure_ni(hdr);

 	default:
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index d081440579e0..a760e43bcf7e 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -47,6 +47,27 @@ lnet_peer_remove_from_remote_list(struct lnet_peer_ni *lpni)
 	}
 }

+void
+lnet_peer_net_added(struct lnet_net *net)
+{
+	struct lnet_peer_ni *lpni, *tmp;
+
+	list_for_each_entry_safe(lpni, tmp, &the_lnet.ln_remote_peer_ni_list,
+				 lpni_on_remote_peer_ni_list) {
+		if (LNET_NIDNET(lpni->lpni_nid) == net->net_id) {
+			lpni->lpni_net = net;
+			lpni->lpni_txcredits =
+			lpni->lpni_mintxcredits =
+				lpni->lpni_net->net_tunables.lct_peer_tx_credits;
+			lpni->lpni_rtrcredits =
+			lpni->lpni_minrtrcredits =
+				lnet_peer_buffer_credits(lpni->lpni_net);
+
+			lnet_peer_remove_from_remote_list(lpni);
+		}
+	}
+}
+
 void
 lnet_peer_tables_destroy(void)
 {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index 66295b4fcdab..c201a8871943 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -726,7 +726,7 @@ static int ptlrpcd_init(void)
 			ptlrpcds_cpt_idx[cpt] = i;
 		}

-		cfs_expr_list_values_free(cpts, rc);
+		kfree(cpts);
 		ncpts = rc;
 	}
 	ptlrpcds_num = ncpts;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/service.c b/drivers/staging/lustre/lustre/ptlrpc/service.c
index 55f68b9b3818..79baadc0d09f 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/service.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/service.c
@@ -2780,9 +2780,7 @@ ptlrpc_service_free(struct ptlrpc_service *svc)
 	ptlrpc_service_for_each_part(svcpt, i, svc)
 		kfree(svcpt);

-	if (svc->srv_cpts)
-		cfs_expr_list_values_free(svc->srv_cpts, svc->srv_ncpts);
-
+	kfree(svc->srv_cpts);
 	kfree(svc);
 }

From patchwork Tue Sep 25 01:07:15 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763531.32103.14595088461405832909.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 12/34] LU-7734 lnet: NUMA support
From: Amir Shehata

This patch adds NUMA node support. NUMA node information is stored in
the CPT table. A NUMA node mask is maintained for the entire table as
well as for each CPT, to track the NUMA nodes related to each CPT.

The following key APIs are added:
  cfs_cpt_of_node():  returns the CPT of a particular NUMA node
  cfs_cpt_distance(): calculates the distance between two CPTs

When the LND device is started it finds the NUMA node of the physical
device, and from there it finds the CPT, which is then stored in the
NI structure. When selecting the NI, the MD CPT is determined and the
distance between the MD CPT and the device CPT is calculated. The NI
with the shortest distance is preferred.

If the device or system is not NUMA aware then the CPT for the device
defaults to CFS_CPT_ANY, and the distance calculated for CFS_CPT_ANY
is the largest in the system; that is, non-NUMA-aware devices are
least preferred.

A NUMA range value can be set. If the value is large enough it
effectively disables the NUMA criterion altogether.
Signed-off-by: Amir Shehata Change-Id: I2d7c63f8e8fc8e8a6a249b0d6bfdd08fd090a837 Reviewed-on: http://review.whamcloud.com/18916 Tested-by: Jenkins Tested-by: Maloo Reviewed-by: Olaf Weber Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 1 .../staging/lustre/include/linux/lnet/lib-types.h | 3 .../lustre/include/uapi/linux/lnet/libcfs_ioctl.h | 6 + .../lustre/include/uapi/linux/lnet/lnet-dlc.h | 6 + .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 4 + .../staging/lustre/lnet/klnds/socklnd/socklnd.c | 13 ++ drivers/staging/lustre/lnet/lnet/api-ni.c | 27 +++ drivers/staging/lustre/lnet/lnet/lib-move.c | 160 +++++++++++++++++--- 8 files changed, 195 insertions(+), 25 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index a7cff6426ad8..c338e31b2cdd 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -408,6 +408,7 @@ struct lnet_ni *lnet_net2ni_addref(__u32 net); bool lnet_is_ni_healthy_locked(struct lnet_ni *ni); struct lnet_net *lnet_get_net_locked(u32 net_id); +extern unsigned int lnet_numa_range; extern int portal_rotor; int lnet_lib_init(void); diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 22b141cb6cff..5083b72ca20f 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -346,6 +346,9 @@ struct lnet_ni { /* lnd tunables set explicitly */ bool ni_lnd_tunables_set; + /* physical device CPT */ + int dev_cpt; + /* sequence number used to round robin over nis within a net */ u32 ni_seq; diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h b/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h index fa58aaf6ad9d..a231f6d89e95 100644 --- 
a/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/libcfs_ioctl.h @@ -142,7 +142,9 @@ struct libcfs_debug_ioctl_data { #define IOC_LIBCFS_ADD_LOCAL_NI _IOWR(IOC_LIBCFS_TYPE, 95, IOCTL_CONFIG_SIZE) #define IOC_LIBCFS_DEL_LOCAL_NI _IOWR(IOC_LIBCFS_TYPE, 96, IOCTL_CONFIG_SIZE) #define IOC_LIBCFS_GET_LOCAL_NI _IOWR(IOC_LIBCFS_TYPE, 97, IOCTL_CONFIG_SIZE) -#define IOC_LIBCFS_DBG _IOWR(IOC_LIBCFS_TYPE, 98, IOCTL_CONFIG_SIZE) -#define IOC_LIBCFS_MAX_NR 98 +#define IOC_LIBCFS_SET_NUMA_RANGE _IOWR(IOC_LIBCFS_TYPE, 98, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_GET_NUMA_RANGE _IOWR(IOC_LIBCFS_TYPE, 99, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_DBG _IOWR(IOC_LIBCFS_TYPE, 100, IOCTL_CONFIG_SIZE) +#define IOC_LIBCFS_MAX_NR 100 #endif /* __LIBCFS_IOCTL_H__ */ diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h index bfd9fc6bc4df..5eaaf0eae470 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h @@ -162,6 +162,7 @@ struct lnet_ioctl_config_ni { __u32 lic_status; __u32 lic_tcp_bonding; __u32 lic_idx; + __s32 lic_dev_cpt; char lic_bulk[0]; }; @@ -213,6 +214,11 @@ struct lnet_ioctl_peer_cfg { char prcfg_bulk[0]; }; +struct lnet_ioctl_numa_range { + struct libcfs_ioctl_hdr nr_hdr; + __u32 nr_range; +}; + struct lnet_ioctl_lnet_stats { struct libcfs_ioctl_hdr st_hdr; struct lnet_counters st_cntrs; diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index 958ac9a99045..2e71abbf8a0c 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -2829,6 +2829,7 @@ static int kiblnd_startup(struct lnet_ni *ni) unsigned long flags; int rc; int newdev; + int node_id; LASSERT(ni->ni_net->net_lnd == &the_o2iblnd); @@ -2878,6 +2879,9 @@ static int 
kiblnd_startup(struct lnet_ni *ni) if (!ibdev) goto failed; + node_id = dev_to_node(ibdev->ibd_hdev->ibh_ibdev->dma_device); + ni->dev_cpt = cfs_cpt_of_node(lnet_cpt_table(), node_id); + net->ibn_dev = ibdev; ni->ni_nid = LNET_MKNID(LNET_NIDNET(ni->ni_nid), ibdev->ibd_ifip); diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c index 9df66c6d160f..ba1ec35a017a 100644 --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c @@ -38,6 +38,7 @@ * Author: Eric Barton */ +#include #include "socklnd.h" #include @@ -2726,6 +2727,8 @@ ksocknal_startup(struct lnet_ni *ni) struct ksock_net *net; int rc; int i; + struct net_device *net_dev; + int node_id; LASSERT(ni->ni_net->net_lnd == &the_ksocklnd); @@ -2773,6 +2776,16 @@ ksocknal_startup(struct lnet_ni *ni) } } + net_dev = dev_get_by_name(&init_net, + net->ksnn_interfaces[0].ksni_name); + if (net_dev) { + node_id = dev_to_node(&net_dev->dev); + ni->dev_cpt = cfs_cpt_of_node(lnet_cpt_table(), node_id); + dev_put(net_dev); + } else { + ni->dev_cpt = CFS_CPT_ANY; + } + /* call it before add it to ksocknal_data.ksnd_nets */ rc = ksocknal_net_start_threads(net, ni->ni_cpts, ni->ni_ncpts); if (rc) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 1ef9a39b517d..67a3301258d4 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -64,6 +64,12 @@ module_param(use_tcp_bonding, int, 0444); MODULE_PARM_DESC(use_tcp_bonding, "Set to 1 to use socklnd bonding. 0 to use Multi-Rail"); +unsigned int lnet_numa_range; +EXPORT_SYMBOL(lnet_numa_range); +module_param(lnet_numa_range, uint, 0444); +MODULE_PARM_DESC(lnet_numa_range, + "NUMA range to consider during Multi-Rail selection"); + /* * This sequence number keeps track of how many times DLC was used to * update the configuration. 
It is incremented on any DLC update and @@ -1896,6 +1902,7 @@ lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_ni *cfg_ni, cfg_ni->lic_nid = ni->ni_nid; cfg_ni->lic_status = ni->ni_status->ns_status; cfg_ni->lic_tcp_bonding = use_tcp_bonding; + cfg_ni->lic_dev_cpt = ni->dev_cpt; memcpy(&tun->lt_cmn, &ni->ni_net->net_tunables, sizeof(tun->lt_cmn)); @@ -2642,6 +2649,26 @@ LNetCtl(unsigned int cmd, void *arg) mutex_unlock(&the_lnet.ln_api_mutex); return rc; + case IOC_LIBCFS_SET_NUMA_RANGE: { + struct lnet_ioctl_numa_range *numa; + + numa = arg; + if (numa->nr_hdr.ioc_len != sizeof(*numa)) + return -EINVAL; + lnet_numa_range = numa->nr_range; + return 0; + } + + case IOC_LIBCFS_GET_NUMA_RANGE: { + struct lnet_ioctl_numa_range *numa; + + numa = arg; + if (numa->nr_hdr.ioc_len != sizeof(*numa)) + return -EINVAL; + numa->nr_range = lnet_numa_range; + return 0; + } + case IOC_LIBCFS_GET_BUF: { struct lnet_ioctl_pool_cfg *pool_cfg; size_t total = sizeof(*config) + sizeof(*pool_cfg); diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index fbf209610ff9..bf2256da6122 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1109,6 +1109,10 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, int best_credits = 0; u32 seq, seq2; int best_lpni_credits = INT_MIN; + int md_cpt = 0; + unsigned int shortest_distance = UINT_MAX; + unsigned int distance = 0; + bool found_ir = false; again: /* @@ -1127,12 +1131,20 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, routing = false; local_net = NULL; best_ni = NULL; + shortest_distance = UINT_MAX; + found_ir = false; if (the_lnet.ln_shutdown) { lnet_net_unlock(cpt); return -ESHUTDOWN; } + if (msg->msg_md) + /* get the cpt of the MD, used during NUMA based selection */ + md_cpt = lnet_cpt_of_cookie(msg->msg_md->md_lh.lh_cookie); + else + md_cpt = CFS_CPT_ANY; + /* * initialize the variables which 
could be reused if we go to * again @@ -1258,34 +1270,113 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, continue; /* - * Second jab at determining best_ni - * if we get here then the peer we're trying to send - * to is on a directly connected network, and we'll - * need to pick the local_ni on that network to send - * from + * Iterate through the NIs in this local Net and select + * the NI to send from. The selection is determined by + * these 3 criterion in the following priority: + * 1. NUMA + * 2. NI available credits + * 3. Round Robin */ while ((ni = lnet_get_next_ni_locked(local_net, ni))) { if (!lnet_is_ni_healthy_locked(ni)) continue; - /* TODO: compare NUMA distance */ - if (ni->ni_tx_queues[cpt]->tq_credits <= - best_credits) { + + /* + * calculate the distance from the cpt on which + * the message memory is allocated to the CPT of + * the NI's physical device + */ + distance = cfs_cpt_distance(lnet_cpt_table(), + md_cpt, + ni->dev_cpt); + + /* + * If we already have a closer NI within the NUMA + * range provided, then there is no need to + * consider the current NI. Move on to the next + * one. + */ + if (distance > shortest_distance && + distance > lnet_numa_range) + continue; + + if (distance < shortest_distance && + distance > lnet_numa_range) { /* - * all we want is to read tq_credits - * value as an approximation of how - * busy the NI is. No need to grab a lock + * The current NI is the closest one that we + * have found, even though it's not in the + * NUMA range specified. This occurs if + * the NUMA range is less than the least + * of the distances in the system. + * In effect NUMA range consideration is + * turned off. */ - continue; - } else if (best_ni) { - if ((best_ni)->ni_seq - ni->ni_seq <= 0) + shortest_distance = distance; + } else if ((distance <= shortest_distance && + distance < lnet_numa_range) || + distance == shortest_distance) { + /* + * This NI is either within range or it's + * equidistant. 
In both of these cases we + * would want to select the NI based on + * its available credits first, and then + * via Round Robin. + */ + if (distance <= shortest_distance && + distance < lnet_numa_range) { + /* + * If this is the first NI that's + * within range, then set the + * shortest distance to the range + * specified by the user. In + * effect we're saying that all + * NIs that fall within this NUMA + * range shall be dealt with as + * having equal NUMA weight. Which + * will mean that we should select + * through that set by their + * available credits first + * followed by Round Robin. + * + * And since this is the first NI + * in the range, let's just set it + * as our best_ni for now. The + * following NIs found in the + * range will be dealt with as + * mentioned previously. + */ + shortest_distance = lnet_numa_range; + if (!found_ir) { + found_ir = true; + goto set_ni; + } + } + /* + * This NI is NUMA equidistant let's + * select using credits followed by Round + * Robin. + */ + if (ni->ni_tx_queues[cpt]->tq_credits < + best_credits) { continue; - (best_ni)->ni_seq = ni->ni_seq + 1; + } else if (ni->ni_tx_queues[cpt]->tq_credits == + best_credits) { + if (best_ni && + best_ni->ni_seq <= ni->ni_seq) + continue; + } } - +set_ni: best_ni = ni; best_credits = ni->ni_tx_queues[cpt]->tq_credits; } } + /* + * Now that we selected the NI to use increment its sequence + * number so the Round Robin algorithm will detect that it has + * been used and pick the next NI. 
+ */ + best_ni->ni_seq++; if (!best_ni) { lnet_net_unlock(cpt); @@ -1372,29 +1463,52 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, best_lpni = NULL; while ((lpni = lnet_get_next_peer_ni_locked(peer, peer_net, lpni))) { /* - * if this peer ni is not healty just skip it, no point in + * if this peer ni is not healthy just skip it, no point in * examining it further */ if (!lnet_is_peer_ni_healthy_locked(lpni)) continue; ni_is_pref = lnet_peer_is_ni_pref_locked(lpni, best_ni); + /* if this is a preferred peer use it */ if (!preferred && ni_is_pref) { preferred = true; } else if (preferred && !ni_is_pref) { + /* + * this is not the preferred peer so let's ignore + * it. + */ continue; - } else if (lpni->lpni_txcredits <= best_lpni_credits) { + } else if (lpni->lpni_txcredits < best_lpni_credits) { + /* + * We already have a peer that has more credits + * available than this one. No need to consider + * this peer further. + */ continue; - } else if (best_lpni) { - if (best_lpni->lpni_seq - lpni->lpni_seq <= 0) - continue; - best_lpni->lpni_seq = lpni->lpni_seq + 1; + } else if (lpni->lpni_txcredits == best_lpni_credits) { + /* + * The best peer found so far and the current peer + * have the same number of available credits let's + * make sure to select between them using Round + * Robin + */ + if (best_lpni) { + if (best_lpni->lpni_seq <= lpni->lpni_seq) + continue; + } } best_lpni = lpni; best_lpni_credits = lpni->lpni_txcredits; } + /* + * Increment sequence number of the peer selected so that we can + * pick the next one in Round Robin. + */ + best_lpni->lpni_seq++; + /* if we still can't find a peer ni then we can't reach it */ if (!best_lpni) { u32 net_id = peer_net ? 
 			peer_net->lpn_net_id :
@@ -1403,7 +1517,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 		lnet_net_unlock(cpt);
 		LCONSOLE_WARN("no peer_ni found on peer net %s\n",
 			      libcfs_net2str(net_id));
-		goto again;
+		return -EHOSTUNREACH;
 	}

 send:

From patchwork Tue Sep 25 01:07:15 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763535.32103.1496474509485326901.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 13/34] LU-7734 lnet: Primary NID and traffic distribution
From: Amir Shehata

When receiving messages from a multi-rail peer we must keep track of
both the source NID and the primary NID of the peer. When sending a
reply message or RPC response, the source NID is preferred. But most
other uses require identification of the peer regardless of which
source NID the message came from, and so the primary NID of the peer
must then be used. An example of this is the creation of match
entries. Another occurs when an event is created: the initiator
should be the primary NID, to ensure upper layers (PtlRPC and Lustre)
always see the same NID for that peer.

This change also contains code to have PtlRPC use LNET_NID_ANY for
the 'self' parameter of LNetPut() and LNetGet() when it doesn't care
which NI it sends from, and to provide a local/peer NID pair when it
does. This can be broken out into a separate change.
Signed-off-by: Olaf Weber Signed-off-by: Amir Shehata Change-Id: If4391f2537a94f5784e8c61ae03aad266b2f8e7d Reviewed-on: http://review.whamcloud.com/18938 Tested-by: Maloo Reviewed-by: Doug Oucharek Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 1 .../staging/lustre/include/linux/lnet/lib-types.h | 2 + .../lustre/include/uapi/linux/lnet/lnet-types.h | 4 +- drivers/staging/lustre/lnet/lnet/lib-move.c | 49 ++++++++++++-------- drivers/staging/lustre/lnet/lnet/lib-msg.c | 10 +++- drivers/staging/lustre/lnet/lnet/lib-ptl.c | 3 + drivers/staging/lustre/lnet/lnet/peer.c | 18 +++++++ drivers/staging/lustre/lustre/include/lustre_net.h | 2 + drivers/staging/lustre/lustre/ptlrpc/events.c | 5 ++ drivers/staging/lustre/lustre/ptlrpc/niobuf.c | 16 +++---- 10 files changed, 77 insertions(+), 33 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index c338e31b2cdd..0259cd2251ed 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -652,6 +652,7 @@ int lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, int lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid); void lnet_peer_net_added(struct lnet_net *net); +lnet_nid_t lnet_peer_primary_nid(lnet_nid_t nid); void lnet_peer_tables_cleanup(struct lnet_ni *ni); void lnet_peer_uninit(void); int lnet_peer_tables_create(void); diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 5083b72ca20f..dbcd9b3da914 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -61,6 +61,8 @@ struct lnet_msg { struct list_head msg_list; /* Q for credits/MD */ struct lnet_process_id msg_target; + /* Primary NID of the source. 
*/ + lnet_nid_t msg_initiator; /* where is it from, it's only for building event */ lnet_nid_t msg_from; __u32 msg_type; diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h index 5770876201c8..e80ef4182e5d 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h @@ -563,10 +563,12 @@ struct lnet_event { struct lnet_process_id target; /** The identifier (nid, pid) of the initiator. */ struct lnet_process_id initiator; + /** The source NID on the initiator. */ + struct lnet_process_id source; /** * The NID of the immediate sender. If the request has been forwarded * by routers, this is the NID of the last hop; otherwise it's the - * same as the initiator. + * same as the source. */ lnet_nid_t sender; /** Indicates the type of the event. */ diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index bf2256da6122..5153de984ede 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1189,23 +1189,6 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, } } - if (best_ni == the_lnet.ln_loni) { - /* No send credit hassles with LOLND */ - msg->msg_hdr.dest_nid = cpu_to_le64(best_ni->ni_nid); - if (!msg->msg_routing) - msg->msg_hdr.src_nid = cpu_to_le64(best_ni->ni_nid); - msg->msg_target.nid = best_ni->ni_nid; - lnet_msg_commit(msg, cpt); - - lnet_ni_addref_locked(best_ni, cpt); - lnet_net_unlock(cpt); - msg->msg_txni = best_ni; - lnet_ni_send(best_ni, msg); - - *lo_sent = true; - return 0; - } - if (best_ni) goto pick_peer; @@ -1389,6 +1372,23 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, goto send; pick_peer: + if (best_ni == the_lnet.ln_loni) { + /* No send credit hassles with LOLND */ + lnet_ni_addref_locked(best_ni, cpt); + msg->msg_hdr.dest_nid = cpu_to_le64(best_ni->ni_nid); + 
if (!msg->msg_routing) + msg->msg_hdr.src_nid = cpu_to_le64(best_ni->ni_nid); + msg->msg_target.nid = best_ni->ni_nid; + lnet_msg_commit(msg, cpt); + + lnet_net_unlock(cpt); + msg->msg_txni = best_ni; + lnet_ni_send(best_ni, msg); + + *lo_sent = true; + return 0; + } + lpni = NULL; if (msg->msg_type == LNET_MSG_REPLY || @@ -1674,7 +1674,8 @@ lnet_parse_put(struct lnet_ni *ni, struct lnet_msg *msg) le32_to_cpus(&hdr->msg.put.ptl_index); le32_to_cpus(&hdr->msg.put.offset); - info.mi_id.nid = hdr->src_nid; + /* Primary peer NID. */ + info.mi_id.nid = msg->msg_initiator; info.mi_id.pid = hdr->src_pid; info.mi_opc = LNET_MD_OP_PUT; info.mi_portal = hdr->msg.put.ptl_index; @@ -1725,6 +1726,7 @@ lnet_parse_get(struct lnet_ni *ni, struct lnet_msg *msg, int rdma_get) { struct lnet_match_info info; struct lnet_hdr *hdr = &msg->msg_hdr; + struct lnet_process_id source_id; struct lnet_handle_wire reply_wmd; int rc; @@ -1734,7 +1736,10 @@ lnet_parse_get(struct lnet_ni *ni, struct lnet_msg *msg, int rdma_get) le32_to_cpus(&hdr->msg.get.sink_length); le32_to_cpus(&hdr->msg.get.src_offset); - info.mi_id.nid = hdr->src_nid; + source_id.nid = hdr->src_nid; + source_id.pid = hdr->src_pid; + /* Primary peer NID */ + info.mi_id.nid = msg->msg_initiator; info.mi_id.pid = hdr->src_pid; info.mi_opc = LNET_MD_OP_GET; info.mi_portal = hdr->msg.get.ptl_index; @@ -1756,7 +1761,7 @@ lnet_parse_get(struct lnet_ni *ni, struct lnet_msg *msg, int rdma_get) reply_wmd = hdr->msg.get.return_wmd; - lnet_prep_send(msg, LNET_MSG_REPLY, info.mi_id, + lnet_prep_send(msg, LNET_MSG_REPLY, source_id, msg->msg_offset, msg->msg_wanted); msg->msg_hdr.msg.reply.dst_wmd = reply_wmd; @@ -2200,6 +2205,8 @@ lnet_parse(struct lnet_ni *ni, struct lnet_hdr *hdr, lnet_nid_t from_nid, msg->msg_hdr.dest_pid = dest_pid; msg->msg_hdr.payload_length = payload_length; } + /* Multi-Rail: Primary NID of source. 
*/ + msg->msg_initiator = lnet_peer_primary_nid(src_nid); lnet_net_lock(cpt); rc = lnet_nid2peerni_locked(&msg->msg_rxpeer, from_nid, cpt); @@ -2518,6 +2525,8 @@ lnet_create_reply_msg(struct lnet_ni *ni, struct lnet_msg *getmsg) libcfs_nid2str(ni->ni_nid), libcfs_id2str(peer_id), getmd); /* setup information for lnet_build_msg_event */ + msg->msg_initiator = lnet_peer_primary_nid(peer_id.nid); + /* Cheaper: msg->msg_initiator = getmsg->msg_txpeer->lp_nid; */ msg->msg_from = peer_id.nid; msg->msg_type = LNET_MSG_GET; /* flag this msg as an "optimized" GET */ msg->msg_hdr.src_nid = peer_id.nid; diff --git a/drivers/staging/lustre/lnet/lnet/lib-msg.c b/drivers/staging/lustre/lnet/lnet/lib-msg.c index 27bdefa161cc..8628899e1631 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-msg.c +++ b/drivers/staging/lustre/lnet/lnet/lib-msg.c @@ -70,13 +70,19 @@ lnet_build_msg_event(struct lnet_msg *msg, enum lnet_event_kind ev_type) ev->target.pid = le32_to_cpu(hdr->dest_pid); ev->initiator.nid = LNET_NID_ANY; ev->initiator.pid = the_lnet.ln_pid; + ev->source.nid = LNET_NID_ANY; + ev->source.pid = the_lnet.ln_pid; ev->sender = LNET_NID_ANY; } else { /* event for passive message */ ev->target.pid = hdr->dest_pid; ev->target.nid = hdr->dest_nid; ev->initiator.pid = hdr->src_pid; - ev->initiator.nid = hdr->src_nid; + /* Multi-Rail: resolve src_nid to "primary" peer NID */ + ev->initiator.nid = msg->msg_initiator; + /* Multi-Rail: track source NID. 
*/ + ev->source.pid = hdr->src_pid; + ev->source.nid = hdr->src_nid; ev->rlength = hdr->payload_length; ev->sender = msg->msg_from; ev->mlength = msg->msg_wanted; @@ -381,7 +387,7 @@ lnet_complete_msg_locked(struct lnet_msg *msg, int cpt) ack_wmd = msg->msg_hdr.msg.put.ack_wmd; - lnet_prep_send(msg, LNET_MSG_ACK, msg->msg_ev.initiator, 0, 0); + lnet_prep_send(msg, LNET_MSG_ACK, msg->msg_ev.source, 0, 0); msg->msg_hdr.msg.ack.dst_wmd = ack_wmd; msg->msg_hdr.msg.ack.match_bits = msg->msg_ev.match_bits; diff --git a/drivers/staging/lustre/lnet/lnet/lib-ptl.c b/drivers/staging/lustre/lnet/lnet/lib-ptl.c index c8d8162cc706..d4033530112e 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-ptl.c +++ b/drivers/staging/lustre/lnet/lnet/lib-ptl.c @@ -687,7 +687,8 @@ lnet_ptl_attach_md(struct lnet_me *me, struct lnet_libmd *md, LASSERT(msg->msg_rx_delayed || head == &ptl->ptl_msg_stealing); hdr = &msg->msg_hdr; - info.mi_id.nid = hdr->src_nid; + /* Multi-Rail: Primary peer NID */ + info.mi_id.nid = msg->msg_initiator; info.mi_id.pid = hdr->src_pid; info.mi_opc = LNET_MD_OP_PUT; info.mi_portal = hdr->msg.put.ptl_index; diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index a760e43bcf7e..bde7b6214668 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -394,6 +394,24 @@ lnet_peer_is_ni_pref_locked(struct lnet_peer_ni *lpni, struct lnet_ni *ni) return false; } +lnet_nid_t +lnet_peer_primary_nid(lnet_nid_t nid) +{ + struct lnet_peer_ni *lpni; + lnet_nid_t primary_nid = nid; + int cpt; + + cpt = lnet_net_lock_current(); + lpni = lnet_find_peer_ni_locked(nid); + if (lpni) { + primary_nid = lpni->lpni_peer_net->lpn_peer->lp_primary_nid; + lnet_peer_ni_decref_locked(lpni); + } + lnet_net_unlock(cpt); + + return primary_nid; +} + static void lnet_try_destroy_peer_hierarchy_locked(struct lnet_peer_ni *lpni) { diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h 
b/drivers/staging/lustre/lustre/include/lustre_net.h index 361b8970368e..2dbd20851b39 100644 --- a/drivers/staging/lustre/lustre/include/lustre_net.h +++ b/drivers/staging/lustre/lustre/include/lustre_net.h @@ -882,6 +882,8 @@ struct ptlrpc_request { lnet_nid_t rq_self; /** Peer description (the other side) */ struct lnet_process_id rq_peer; + /** Descriptor for the NID from which the peer sent the request. */ + struct lnet_process_id rq_source; /** * service time estimate (secs) * If the request is not served by this time, it is marked as timed out. diff --git a/drivers/staging/lustre/lustre/ptlrpc/events.c b/drivers/staging/lustre/lustre/ptlrpc/events.c index ebf985ec17a1..ab6dd74d0ae3 100644 --- a/drivers/staging/lustre/lustre/ptlrpc/events.c +++ b/drivers/staging/lustre/lustre/ptlrpc/events.c @@ -342,7 +342,9 @@ void request_in_callback(struct lnet_event *ev) if (ev->type == LNET_EVENT_PUT && ev->status == 0) req->rq_reqdata_len = ev->mlength; ktime_get_real_ts64(&req->rq_arrival_time); + /* Multi-Rail: keep track of both initiator and source NID. 
*/ req->rq_peer = ev->initiator; + req->rq_source = ev->source; req->rq_self = ev->target.nid; req->rq_rqbd = rqbd; req->rq_phase = RQ_PHASE_NEW; @@ -350,7 +352,8 @@ void request_in_callback(struct lnet_event *ev) CDEBUG(D_INFO, "incoming req@%p x%llu msgsize %u\n", req, req->rq_xid, ev->mlength); - CDEBUG(D_RPCTRACE, "peer: %s\n", libcfs_id2str(req->rq_peer)); + CDEBUG(D_RPCTRACE, "peer: %s (source: %s)\n", + libcfs_id2str(req->rq_peer), libcfs_id2str(req->rq_source)); spin_lock(&svcpt->scp_lock); diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c index 2897afb8806c..d0bcd8827f8a 100644 --- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c +++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c @@ -47,14 +47,14 @@ */ static int ptl_send_buf(struct lnet_handle_md *mdh, void *base, int len, enum lnet_ack_req ack, struct ptlrpc_cb_id *cbid, - struct ptlrpc_connection *conn, int portal, __u64 xid, - unsigned int offset) + lnet_nid_t self, struct lnet_process_id peer_id, + int portal, __u64 xid, unsigned int offset) { int rc; struct lnet_md md; LASSERT(portal != 0); - CDEBUG(D_INFO, "conn=%p id %s\n", conn, libcfs_id2str(conn->c_peer)); + CDEBUG(D_INFO, "peer_id %s\n", libcfs_id2str(peer_id)); md.start = base; md.length = len; md.threshold = (ack == LNET_ACK_REQ) ? 2 : 1; @@ -79,8 +79,8 @@ static int ptl_send_buf(struct lnet_handle_md *mdh, void *base, int len, CDEBUG(D_NET, "Sending %d bytes to portal %d, xid %lld, offset %u\n", len, portal, xid, offset); - rc = LNetPut(conn->c_self, *mdh, ack, - conn->c_peer, portal, xid, offset, 0); + rc = LNetPut(self, *mdh, ack, + peer_id, portal, xid, offset, 0); if (unlikely(rc != 0)) { int rc2; /* We're going to get an UNLINK event when I unlink below, @@ -88,7 +88,7 @@ static int ptl_send_buf(struct lnet_handle_md *mdh, void *base, int len, * I fall through and return success here! 
*/ CERROR("LNetPut(%s, %d, %lld) failed: %d\n", - libcfs_id2str(conn->c_peer), portal, xid, rc); + libcfs_id2str(peer_id), portal, xid, rc); rc2 = LNetMDUnlink(*mdh); LASSERTF(rc2 == 0, "rc2 = %d\n", rc2); } @@ -415,7 +415,7 @@ int ptlrpc_send_reply(struct ptlrpc_request *req, int flags) rc = ptl_send_buf(&rs->rs_md_h, rs->rs_repbuf, rs->rs_repdata_len, (rs->rs_difficult && !rs->rs_no_ack) ? LNET_ACK_REQ : LNET_NOACK_REQ, - &rs->rs_cb_id, conn, + &rs->rs_cb_id, req->rq_self, req->rq_source, ptlrpc_req2svc(req)->srv_rep_portal, req->rq_xid, req->rq_reply_off); out: @@ -683,7 +683,7 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply) rc = ptl_send_buf(&request->rq_req_md_h, request->rq_reqbuf, request->rq_reqdata_len, LNET_NOACK_REQ, &request->rq_req_cbid, - connection, + LNET_NID_ANY, connection->c_peer, request->rq_request_portal, request->rq_xid, 0); if (likely(rc == 0))

From patchwork Tue Sep 25 01:07:15 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763540.32103.13948722910331939075.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 14/34] LU-7734 lnet: handle non-MR peers

From: Amir Shehata

Add the ability to declare a peer non-MR through the DLC interface. By default, a peer configured from DLC is assumed to be MR capable, unless the non-mr flag is set. For non-MR peers, always use the same NI to communicate with them.
If multiple NIs are used to communicate with a non-MR peer the peer will consider that it's talking to different peers which could cause upper layers to be confused. Signed-off-by: Amir Shehata Change-Id: Ie3ec45f5f44fa7d72e3e0335b1383f9c3cc92627 Reviewed-on: http://review.whamcloud.com/19305 Tested-by: Jenkins Reviewed-by: Doug Oucharek Tested-by: Maloo Reviewed-by: Olaf Weber Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 17 ++++++++++++++++- .../lustre/include/uapi/linux/lnet/lnet-dlc.h | 1 + drivers/staging/lustre/lnet/lnet/api-ni.c | 3 ++- drivers/staging/lustre/lnet/lnet/lib-move.c | 13 +++++++++++++ drivers/staging/lustre/lnet/lnet/peer.c | 7 ++++--- 5 files changed, 36 insertions(+), 5 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 0259cd2251ed..08fc4abad332 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -661,7 +661,7 @@ struct lnet_peer_net *lnet_peer_get_net_locked(struct lnet_peer *peer, u32 net_id); bool lnet_peer_is_ni_pref_locked(struct lnet_peer_ni *lpni, struct lnet_ni *ni); -int lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid); +int lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid, bool mr); int lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid); int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, struct lnet_peer_ni_credit_info *peer_ni_info); @@ -672,6 +672,21 @@ int lnet_get_peer_ni_info(__u32 peer_index, __u64 *nid, __u32 *peer_rtr_credits, __u32 *peer_min_rtr_credtis, __u32 *peer_tx_qnob); +static inline __u32 +lnet_get_num_peer_nis(struct lnet_peer *peer) +{ + struct lnet_peer_net *lpn; + struct lnet_peer_ni *lpni; + __u32 count = 0; + + list_for_each_entry(lpn, &peer->lp_peer_nets, lpn_on_peer_list) + list_for_each_entry(lpni, &lpn->lpn_peer_nis, + lpni_on_peer_net_list) + 
count++; + + return count; +} + static inline bool lnet_is_peer_ni_healthy_locked(struct lnet_peer_ni *lpni) { diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h index 5eaaf0eae470..8be322dd4bd2 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h @@ -211,6 +211,7 @@ struct lnet_ioctl_peer_cfg { lnet_nid_t prcfg_key_nid; lnet_nid_t prcfg_cfg_nid; __u32 prcfg_idx; + bool prcfg_mr; char prcfg_bulk[0]; }; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 67a3301258d4..2d5d657de058 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -2689,7 +2689,8 @@ LNetCtl(unsigned int cmd, void *arg) return -EINVAL; return lnet_add_peer_ni_to_peer(cfg->prcfg_key_nid, - cfg->prcfg_cfg_nid); + cfg->prcfg_cfg_nid, + cfg->prcfg_mr); } case IOC_LIBCFS_DEL_PEER_NI: { diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 5153de984ede..6c5bb953a6d3 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1164,6 +1164,12 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, return -EHOSTUNREACH; } + if (!peer->lp_multi_rail && lnet_get_num_peer_nis(peer) > 1) { + CERROR("peer %s is declared to be non MR capable, yet configured with more than one NID\n", + libcfs_nid2str(dst_nid)); + return -EINVAL; + } + /* * STEP 1: first jab at determineing best_ni * if src_nid is explicitly specified, then best_ni is already @@ -1361,6 +1367,13 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, */ best_ni->ni_seq++; + /* + * if the peer is not MR capable, then we should always send to it + * using the first NI in the NET we determined. 
+ */ + if (!peer->lp_multi_rail && local_net) + best_ni = lnet_net2ni_locked(local_net->net_id, cpt); + if (!best_ni) { lnet_net_unlock(cpt); LCONSOLE_WARN("No local ni found to send from to %s\n", diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index bde7b6214668..ecbd276703f1 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -477,6 +477,7 @@ lnet_build_peer_hierarchy(struct lnet_peer_ni *lpni) peer_net->lpn_peer = peer; lpni->lpni_peer_net = peer_net; peer->lp_primary_nid = lpni->lpni_nid; + peer->lp_multi_rail = false; list_add_tail(&peer_net->lpn_on_peer_list, &peer->lp_peer_nets); list_add_tail(&lpni->lpni_on_peer_net_list, &peer_net->lpn_peer_nis); list_add_tail(&peer->lp_on_lnet_peer_list, &the_lnet.ln_peers); @@ -502,7 +503,7 @@ lnet_peer_get_net_locked(struct lnet_peer *peer, u32 net_id) * is unique */ int -lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid) +lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid, bool mr) { struct lnet_peer_ni *lpni, *lpni2; struct lnet_peer *peer; @@ -535,14 +536,14 @@ lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid) return -EINVAL; } peer = lpni->lpni_peer_net->lpn_peer; - peer->lp_multi_rail = true; + peer->lp_multi_rail = mr; lnet_peer_ni_decref_locked(lpni); lnet_net_unlock(cpt2); } else { lnet_net_lock(LNET_LOCK_EX); rc = lnet_nid2peerni_locked(&lpni, nid, LNET_LOCK_EX); if (rc == 0) { - lpni->lpni_peer_net->lpn_peer->lp_multi_rail = true; + lpni->lpni_peer_net->lpn_peer->lp_multi_rail = mr; lnet_peer_ni_decref_locked(lpni); } lnet_net_unlock(LNET_LOCK_EX); From patchwork Tue Sep 25 01:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613181 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org 
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763544.32103.10828937501297905304.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 15/34] LU-7734 lnet: handle N NIs to 1
LND peer

From: Amir Shehata

This patch changes o2iblnd only, as socklnd already handles this case. In the new design multiple NIs can communicate with one peer. In o2iblnd the kib_peer has a pointer to the NI, which implies a 1:1 relationship. This patch changes kiblnd_find_peer_locked() to use both the peer NID and the NI NID as the key, so a new peer is created for each unique NI/peer_NI pair. This is similar to how socklnd handles this case.

Signed-off-by: Amir Shehata Change-Id: Ifab7764489757ea473b15c46c1a22ef9ceeeceea Reviewed-on: http://review.whamcloud.com/19306 Reviewed-by: Doug Oucharek Tested-by: Doug Oucharek Signed-off-by: NeilBrown --- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 13 ++++++++++--- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h | 2 +- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 8 ++++---- 3 files changed, 15 insertions(+), 8 deletions(-) diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index 2e71abbf8a0c..64df49146413 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -379,7 +379,7 @@ void kiblnd_destroy_peer(struct kib_peer *peer) atomic_dec(&net->ibn_npeers); } -struct kib_peer *kiblnd_find_peer_locked(lnet_nid_t nid) +struct kib_peer *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid) { /* * the caller is responsible for accounting the additional reference @@ -391,7 +391,14 @@ struct kib_peer *kiblnd_find_peer_locked(lnet_nid_t nid) list_for_each_entry(peer, peer_list, ibp_list) {
LASSERT(!kiblnd_peer_idle(peer)); - if (peer->ibp_nid != nid) + /* + * Match a peer if its NID and the NID of the local NI it + * communicates over are the same. Otherwise don't match + * the peer, which will result in a new lnd peer being + * created. + */ + if (peer->ibp_nid != nid || + peer->ibp_ni->ni_nid != ni->ni_nid) continue; CDEBUG(D_NET, "got peer [%p] -> %s (%d) version: %x\n", @@ -1041,7 +1048,7 @@ static void kiblnd_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when) read_lock_irqsave(glock, flags); - peer = kiblnd_find_peer_locked(nid); + peer = kiblnd_find_peer_locked(ni, nid); if (peer) last_alive = peer->ibp_last_alive; diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h index 522eb150d9a6..520f586015f4 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h @@ -1019,7 +1019,7 @@ void kiblnd_destroy_peer(struct kib_peer *peer); bool kiblnd_reconnect_peer(struct kib_peer *peer); void kiblnd_destroy_dev(struct kib_dev *dev); void kiblnd_unlink_peer_locked(struct kib_peer *peer); -struct kib_peer *kiblnd_find_peer_locked(lnet_nid_t nid); +struct kib_peer *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid); int kiblnd_close_stale_conns_locked(struct kib_peer *peer, int version, __u64 incarnation); int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why); diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c index af8f863b6a68..f4b76347e1c6 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -1370,7 +1370,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid) */ read_lock_irqsave(g_lock, flags); - peer = kiblnd_find_peer_locked(nid); + peer = kiblnd_find_peer_locked(ni, nid); if (peer && !list_empty(&peer->ibp_conns)) { /* 
Found a peer with an established connection */ conn = kiblnd_get_conn_locked(peer); @@ -1388,7 +1388,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid) /* Re-try with a write lock */ write_lock(g_lock); - peer = kiblnd_find_peer_locked(nid); + peer = kiblnd_find_peer_locked(ni, nid); if (peer) { if (list_empty(&peer->ibp_conns)) { /* found a peer, but it's still connecting... */ @@ -1426,7 +1426,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid) write_lock_irqsave(g_lock, flags); - peer2 = kiblnd_find_peer_locked(nid); + peer2 = kiblnd_find_peer_locked(ni, nid); if (peer2) { if (list_empty(&peer2->ibp_conns)) { /* found a peer, but it's still connecting... */ @@ -2388,7 +2388,7 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob) write_lock_irqsave(g_lock, flags); - peer2 = kiblnd_find_peer_locked(nid); + peer2 = kiblnd_find_peer_locked(ni, nid); if (peer2) { if (!peer2->ibp_version) { peer2->ibp_version = version; From patchwork Tue Sep 25 01:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10614203 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5E544913 for ; Tue, 25 Sep 2018 15:22:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 38A232A725 for ; Tue, 25 Sep 2018 15:22:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2A4972A789; Tue, 25 Sep 2018 15:22:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from pdx1-mailman02.dreamhost.com 
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763547.32103.18087278859321916658.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 16/34] LU-7734 lnet: rename LND peer to peer_ni

From: Amir Shehata

Rename LND peers to peer_ni to reflect that each of these constructs represents an actual connection between a local NI and a remote peer NI.
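The per-connection nature of a peer_ni can be sketched stand-alone. The types and names here (demo_peer_ni, demo_find_peer_ni) are hypothetical simplifications, not the o2iblnd code: the lookup matches on both the remote NID and the local NI's NID, the same rule kiblnd_find_peer_locked() applies after this series.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lnet_nid_t;

/* Hypothetical, simplified peer_ni record (not the kernel's
 * struct kib_peer_ni): keyed by the remote NID *and* the local NI
 * it is reached through. */
struct demo_peer_ni {
	lnet_nid_t remote_nid;	/* cf. ibp_nid */
	lnet_nid_t local_nid;	/* cf. ibp_ni->ni_nid */
};

/* Return the index of the matching peer_ni, or -1 so the caller
 * creates a new record: one per unique local-NI/peer-NI pair. */
int demo_find_peer_ni(const struct demo_peer_ni *tab, int n,
		      lnet_nid_t local_nid, lnet_nid_t remote_nid)
{
	int i;

	for (i = 0; i < n; i++)
		if (tab[i].remote_nid == remote_nid &&
		    tab[i].local_nid == local_nid)
			return i;
	return -1;
}
```

The same remote NID reached over two different local NIs thus resolves to two distinct records, which is why the rename fits: each record describes one NI-to-peer-NI connection, not the peer as a whole.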
Signed-off-by: Amir Shehata Change-Id: I1c25a12eae61d8822a8c4ada2e077a5b2011ba22 Reviewed-on: http://review.whamcloud.com/19307 Reviewed-by: Doug Oucharek Tested-by: Doug Oucharek Signed-off-by: NeilBrown --- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 232 ++++--- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h | 118 ++-- .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 512 ++++++++------- .../staging/lustre/lnet/klnds/socklnd/socklnd.c | 662 ++++++++++---------- .../staging/lustre/lnet/klnds/socklnd/socklnd.h | 66 +- .../staging/lustre/lnet/klnds/socklnd/socklnd_cb.c | 207 +++--- .../lustre/lnet/klnds/socklnd/socklnd_lib.c | 4 .../lustre/lnet/klnds/socklnd/socklnd_proto.c | 14 8 files changed, 928 insertions(+), 887 deletions(-) diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c index 64df49146413..71256500f245 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c @@ -258,7 +258,7 @@ int kiblnd_unpack_msg(struct kib_msg *msg, int nob) msg->ibm_cksum = msg_cksum; if (flip) { - /* leave magic unflipped as a clue to peer endianness */ + /* leave magic unflipped as a clue to peer_ni endianness */ msg->ibm_version = version; BUILD_BUG_ON(sizeof(msg->ibm_type) != 1); BUILD_BUG_ON(sizeof(msg->ibm_credits) != 1); @@ -315,10 +315,10 @@ int kiblnd_unpack_msg(struct kib_msg *msg, int nob) return 0; } -int kiblnd_create_peer(struct lnet_ni *ni, struct kib_peer **peerp, +int kiblnd_create_peer(struct lnet_ni *ni, struct kib_peer_ni **peerp, lnet_nid_t nid) { - struct kib_peer *peer; + struct kib_peer_ni *peer_ni; struct kib_net *net = ni->ni_data; int cpt = lnet_cpt_of_nid(nid, ni); unsigned long flags; @@ -326,23 +326,23 @@ int kiblnd_create_peer(struct lnet_ni *ni, struct kib_peer **peerp, LASSERT(net); LASSERT(nid != LNET_NID_ANY); - peer = kzalloc_cpt(sizeof(*peer), GFP_NOFS, cpt); - if (!peer) { - CERROR("Cannot allocate 
peer\n"); + peer_ni = kzalloc_cpt(sizeof(*peer_ni), GFP_NOFS, cpt); + if (!peer_ni) { + CERROR("Cannot allocate peer_ni\n"); return -ENOMEM; } - peer->ibp_ni = ni; - peer->ibp_nid = nid; - peer->ibp_error = 0; - peer->ibp_last_alive = 0; - peer->ibp_max_frags = kiblnd_cfg_rdma_frags(peer->ibp_ni); - peer->ibp_queue_depth = ni->ni_net->net_tunables.lct_peer_tx_credits; - atomic_set(&peer->ibp_refcount, 1); /* 1 ref for caller */ + peer_ni->ibp_ni = ni; + peer_ni->ibp_nid = nid; + peer_ni->ibp_error = 0; + peer_ni->ibp_last_alive = 0; + peer_ni->ibp_max_frags = kiblnd_cfg_rdma_frags(peer_ni->ibp_ni); + peer_ni->ibp_queue_depth = ni->ni_net->net_tunables.lct_peer_tx_credits; + atomic_set(&peer_ni->ibp_refcount, 1); /* 1 ref for caller */ - INIT_LIST_HEAD(&peer->ibp_list); /* not in the peer table yet */ - INIT_LIST_HEAD(&peer->ibp_conns); - INIT_LIST_HEAD(&peer->ibp_tx_queue); + INIT_LIST_HEAD(&peer_ni->ibp_list); + INIT_LIST_HEAD(&peer_ni->ibp_conns); + INIT_LIST_HEAD(&peer_ni->ibp_tx_queue); write_lock_irqsave(&kiblnd_data.kib_global_lock, flags); @@ -354,93 +354,94 @@ int kiblnd_create_peer(struct lnet_ni *ni, struct kib_peer **peerp, write_unlock_irqrestore(&kiblnd_data.kib_global_lock, flags); - *peerp = peer; + *peerp = peer_ni; return 0; } -void kiblnd_destroy_peer(struct kib_peer *peer) +void kiblnd_destroy_peer(struct kib_peer_ni *peer_ni) { - struct kib_net *net = peer->ibp_ni->ni_data; + struct kib_net *net = peer_ni->ibp_ni->ni_data; LASSERT(net); - LASSERT(!atomic_read(&peer->ibp_refcount)); - LASSERT(!kiblnd_peer_active(peer)); - LASSERT(kiblnd_peer_idle(peer)); - LASSERT(list_empty(&peer->ibp_tx_queue)); + LASSERT(!atomic_read(&peer_ni->ibp_refcount)); + LASSERT(!kiblnd_peer_active(peer_ni)); + LASSERT(kiblnd_peer_idle(peer_ni)); + LASSERT(list_empty(&peer_ni->ibp_tx_queue)); - kfree(peer); + kfree(peer_ni); /* - * NB a peer's connections keep a reference on their peer until + * NB a peer_ni's connections keep a reference on their peer_ni until * they 
are destroyed, so we can be assured that _all_ state to do - * with this peer has been cleaned up when its refcount drops to + * with this peer_ni has been cleaned up when its refcount drops to * zero. */ atomic_dec(&net->ibn_npeers); } -struct kib_peer *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid) +struct kib_peer_ni *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid) { /* * the caller is responsible for accounting the additional reference * that this creates */ struct list_head *peer_list = kiblnd_nid2peerlist(nid); - struct kib_peer *peer; + struct kib_peer_ni *peer_ni; - list_for_each_entry(peer, peer_list, ibp_list) { - LASSERT(!kiblnd_peer_idle(peer)); + list_for_each_entry(peer_ni, peer_list, ibp_list) { + LASSERT(!kiblnd_peer_idle(peer_ni)); /* - * Match a peer if its NID and the NID of the local NI it + * Match a peer_ni if its NID and the NID of the local NI it * communicates over are the same. Otherwise don't match - * the peer, which will result in a new lnd peer being + * the peer_ni, which will result in a new lnd peer_ni being * created. 
*/ - if (peer->ibp_nid != nid || - peer->ibp_ni->ni_nid != ni->ni_nid) + if (peer_ni->ibp_nid != nid || + peer_ni->ibp_ni->ni_nid != ni->ni_nid) continue; - CDEBUG(D_NET, "got peer [%p] -> %s (%d) version: %x\n", - peer, libcfs_nid2str(nid), - atomic_read(&peer->ibp_refcount), - peer->ibp_version); - return peer; + CDEBUG(D_NET, "got peer_ni [%p] -> %s (%d) version: %x\n", + peer_ni, libcfs_nid2str(nid), + atomic_read(&peer_ni->ibp_refcount), + peer_ni->ibp_version); + return peer_ni; } return NULL; } -void kiblnd_unlink_peer_locked(struct kib_peer *peer) +void kiblnd_unlink_peer_locked(struct kib_peer_ni *peer_ni) { - LASSERT(list_empty(&peer->ibp_conns)); + LASSERT(list_empty(&peer_ni->ibp_conns)); - LASSERT(kiblnd_peer_active(peer)); - list_del_init(&peer->ibp_list); + LASSERT(kiblnd_peer_active(peer_ni)); + list_del_init(&peer_ni->ibp_list); /* lose peerlist's ref */ - kiblnd_peer_decref(peer); + kiblnd_peer_decref(peer_ni); } static int kiblnd_get_peer_info(struct lnet_ni *ni, int index, lnet_nid_t *nidp, int *count) { - struct kib_peer *peer; + struct kib_peer_ni *peer_ni; int i; unsigned long flags; read_lock_irqsave(&kiblnd_data.kib_global_lock, flags); for (i = 0; i < kiblnd_data.kib_peer_hash_size; i++) { - list_for_each_entry(peer, &kiblnd_data.kib_peers[i], ibp_list) { - LASSERT(!kiblnd_peer_idle(peer)); + list_for_each_entry(peer_ni, &kiblnd_data.kib_peers[i], + ibp_list) { + LASSERT(!kiblnd_peer_idle(peer_ni)); - if (peer->ibp_ni != ni) + if (peer_ni->ibp_ni != ni) continue; if (index-- > 0) continue; - *nidp = peer->ibp_nid; - *count = atomic_read(&peer->ibp_refcount); + *nidp = peer_ni->ibp_nid; + *count = atomic_read(&peer_ni->ibp_refcount); read_unlock_irqrestore(&kiblnd_data.kib_global_lock, flags); @@ -452,34 +453,33 @@ static int kiblnd_get_peer_info(struct lnet_ni *ni, int index, return -ENOENT; } -static void kiblnd_del_peer_locked(struct kib_peer *peer) +static void kiblnd_del_peer_locked(struct kib_peer_ni *peer_ni) { struct list_head 
*ctmp; struct list_head *cnxt; struct kib_conn *conn; - if (list_empty(&peer->ibp_conns)) { - kiblnd_unlink_peer_locked(peer); + if (list_empty(&peer_ni->ibp_conns)) { + kiblnd_unlink_peer_locked(peer_ni); } else { - list_for_each_safe(ctmp, cnxt, &peer->ibp_conns) { + list_for_each_safe(ctmp, cnxt, &peer_ni->ibp_conns) { conn = list_entry(ctmp, struct kib_conn, ibc_list); kiblnd_close_conn_locked(conn, 0); } - /* NB closing peer's last conn unlinked it. */ + /* NB closing peer_ni's last conn unlinked it. */ } /* - * NB peer now unlinked; might even be freed if the peer table had the - * last ref on it. + * NB peer_ni now unlinked; might even be freed if the peer_ni + * table had the last ref on it. */ } static int kiblnd_del_peer(struct lnet_ni *ni, lnet_nid_t nid) { LIST_HEAD(zombies); - struct list_head *ptmp; - struct list_head *pnxt; - struct kib_peer *peer; + struct kib_peer_ni *pnxt; + struct kib_peer_ni *peer_ni; int lo; int hi; int i; @@ -497,24 +497,24 @@ static int kiblnd_del_peer(struct lnet_ni *ni, lnet_nid_t nid) } for (i = lo; i <= hi; i++) { - list_for_each_safe(ptmp, pnxt, &kiblnd_data.kib_peers[i]) { - peer = list_entry(ptmp, struct kib_peer, ibp_list); - LASSERT(!kiblnd_peer_idle(peer)); + list_for_each_entry_safe(peer_ni, pnxt, + &kiblnd_data.kib_peers[i], ibp_list) { + LASSERT(!kiblnd_peer_idle(peer_ni)); - if (peer->ibp_ni != ni) + if (peer_ni->ibp_ni != ni) continue; - if (!(nid == LNET_NID_ANY || peer->ibp_nid == nid)) + if (!(nid == LNET_NID_ANY || peer_ni->ibp_nid == nid)) continue; - if (!list_empty(&peer->ibp_tx_queue)) { - LASSERT(list_empty(&peer->ibp_conns)); + if (!list_empty(&peer_ni->ibp_tx_queue)) { + LASSERT(list_empty(&peer_ni->ibp_conns)); - list_splice_init(&peer->ibp_tx_queue, + list_splice_init(&peer_ni->ibp_tx_queue, &zombies); } - kiblnd_del_peer_locked(peer); + kiblnd_del_peer_locked(peer_ni); rc = 0; /* matched something */ } } @@ -528,7 +528,7 @@ static int kiblnd_del_peer(struct lnet_ni *ni, lnet_nid_t nid) static 
struct kib_conn *kiblnd_get_conn_by_idx(struct lnet_ni *ni, int index) { - struct kib_peer *peer; + struct kib_peer_ni *peer_ni; struct kib_conn *conn; int i; unsigned long flags; @@ -536,13 +536,15 @@ static struct kib_conn *kiblnd_get_conn_by_idx(struct lnet_ni *ni, int index) read_lock_irqsave(&kiblnd_data.kib_global_lock, flags); for (i = 0; i < kiblnd_data.kib_peer_hash_size; i++) { - list_for_each_entry(peer, &kiblnd_data.kib_peers[i], ibp_list) { - LASSERT(!kiblnd_peer_idle(peer)); + list_for_each_entry(peer_ni, &kiblnd_data.kib_peers[i], + ibp_list) { + LASSERT(!kiblnd_peer_idle(peer_ni)); - if (peer->ibp_ni != ni) + if (peer_ni->ibp_ni != ni) continue; - list_for_each_entry(conn, &peer->ibp_conns, ibc_list) { + list_for_each_entry(conn, &peer_ni->ibp_conns, + ibc_list) { if (index-- > 0) continue; @@ -620,20 +622,23 @@ static int kiblnd_get_completion_vector(struct kib_conn *conn, int cpt) return 1; } -struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, struct rdma_cm_id *cmid, +struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, + struct rdma_cm_id *cmid, int state, int version) { /* * CAVEAT EMPTOR: - * If the new conn is created successfully it takes over the caller's - * ref on 'peer'. It also "owns" 'cmid' and destroys it when it itself - * is destroyed. On failure, the caller's ref on 'peer' remains and - * she must dispose of 'cmid'. (Actually I'd block forever if I tried - * to destroy 'cmid' here since I'm called from the CM which still has + * + * If the new conn is created successfully it takes over the + * caller's ref on 'peer_ni'. It also "owns" 'cmid' and + * destroys it when it itself is destroyed. On failure, the + * caller's ref on 'peer_ni' remains and she must dispose of + * 'cmid'. (Actually I'd block forever if I tried to destroy + * 'cmid' here since I'm called from the CM which still has * its ref on 'cmid'). 
*/ rwlock_t *glock = &kiblnd_data.kib_global_lock; - struct kib_net *net = peer->ibp_ni->ni_data; + struct kib_net *net = peer_ni->ibp_ni->ni_data; struct kib_dev *dev; struct ib_qp_init_attr *init_qp_attr; struct kib_sched_info *sched; @@ -650,7 +655,7 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, struct rdma_cm_id *cm dev = net->ibn_dev; - cpt = lnet_cpt_of_nid(peer->ibp_nid, peer->ibp_ni); + cpt = lnet_cpt_of_nid(peer_ni->ibp_nid, peer_ni->ibp_ni); sched = kiblnd_data.kib_scheds[cpt]; LASSERT(sched->ibs_nthreads > 0); @@ -658,24 +663,24 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, struct rdma_cm_id *cm init_qp_attr = kzalloc_cpt(sizeof(*init_qp_attr), GFP_NOFS, cpt); if (!init_qp_attr) { CERROR("Can't allocate qp_attr for %s\n", - libcfs_nid2str(peer->ibp_nid)); + libcfs_nid2str(peer_ni->ibp_nid)); goto failed_0; } conn = kzalloc_cpt(sizeof(*conn), GFP_NOFS, cpt); if (!conn) { CERROR("Can't allocate connection for %s\n", - libcfs_nid2str(peer->ibp_nid)); + libcfs_nid2str(peer_ni->ibp_nid)); goto failed_1; } conn->ibc_state = IBLND_CONN_INIT; conn->ibc_version = version; - conn->ibc_peer = peer; /* I take the caller's ref */ + conn->ibc_peer = peer_ni; /* I take the caller's ref */ cmid->context = conn; /* for future CM callbacks */ conn->ibc_cmid = cmid; - conn->ibc_max_frags = peer->ibp_max_frags; - conn->ibc_queue_depth = peer->ibp_queue_depth; + conn->ibc_max_frags = peer_ni->ibp_max_frags; + conn->ibc_queue_depth = peer_ni->ibp_queue_depth; INIT_LIST_HEAD(&conn->ibc_early_rxs); INIT_LIST_HEAD(&conn->ibc_tx_noops); @@ -834,7 +839,7 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, struct rdma_cm_id *cm void kiblnd_destroy_conn(struct kib_conn *conn) { struct rdma_cm_id *cmid = conn->ibc_cmid; - struct kib_peer *peer = conn->ibc_peer; + struct kib_peer_ni *peer_ni = conn->ibc_peer; int rc; LASSERT(!in_interrupt()); @@ -883,26 +888,26 @@ void kiblnd_destroy_conn(struct kib_conn *conn) /* See CAVEAT EMPTOR above in 
kiblnd_create_conn */ if (conn->ibc_state != IBLND_CONN_INIT) { - struct kib_net *net = peer->ibp_ni->ni_data; + struct kib_net *net = peer_ni->ibp_ni->ni_data; - kiblnd_peer_decref(peer); + kiblnd_peer_decref(peer_ni); rdma_destroy_id(cmid); atomic_dec(&net->ibn_nconns); } } -int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why) +int kiblnd_close_peer_conns_locked(struct kib_peer_ni *peer_ni, int why) { struct kib_conn *conn; struct list_head *ctmp; struct list_head *cnxt; int count = 0; - list_for_each_safe(ctmp, cnxt, &peer->ibp_conns) { + list_for_each_safe(ctmp, cnxt, &peer_ni->ibp_conns) { conn = list_entry(ctmp, struct kib_conn, ibc_list); CDEBUG(D_NET, "Closing conn -> %s, version: %x, reason: %d\n", - libcfs_nid2str(peer->ibp_nid), + libcfs_nid2str(peer_ni->ibp_nid), conn->ibc_version, why); kiblnd_close_conn_locked(conn, why); @@ -912,7 +917,7 @@ int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why) return count; } -int kiblnd_close_stale_conns_locked(struct kib_peer *peer, +int kiblnd_close_stale_conns_locked(struct kib_peer_ni *peer_ni, int version, __u64 incarnation) { struct kib_conn *conn; @@ -920,7 +925,7 @@ int kiblnd_close_stale_conns_locked(struct kib_peer *peer, struct list_head *cnxt; int count = 0; - list_for_each_safe(ctmp, cnxt, &peer->ibp_conns) { + list_for_each_safe(ctmp, cnxt, &peer_ni->ibp_conns) { conn = list_entry(ctmp, struct kib_conn, ibc_list); if (conn->ibc_version == version && @@ -929,7 +934,7 @@ int kiblnd_close_stale_conns_locked(struct kib_peer *peer, CDEBUG(D_NET, "Closing stale conn -> %s version: %x, incarnation:%#llx(%x, %#llx)\n", - libcfs_nid2str(peer->ibp_nid), + libcfs_nid2str(peer_ni->ibp_nid), conn->ibc_version, conn->ibc_incarnation, version, incarnation); @@ -942,9 +947,8 @@ int kiblnd_close_stale_conns_locked(struct kib_peer *peer, static int kiblnd_close_matching_conns(struct lnet_ni *ni, lnet_nid_t nid) { - struct kib_peer *peer; - struct list_head *ptmp; - struct list_head *pnxt; + 
struct kib_peer_ni *peer_ni; + struct kib_peer_ni *pnxt; int lo; int hi; int i; @@ -962,17 +966,17 @@ static int kiblnd_close_matching_conns(struct lnet_ni *ni, lnet_nid_t nid) } for (i = lo; i <= hi; i++) { - list_for_each_safe(ptmp, pnxt, &kiblnd_data.kib_peers[i]) { - peer = list_entry(ptmp, struct kib_peer, ibp_list); - LASSERT(!kiblnd_peer_idle(peer)); + list_for_each_entry_safe(peer_ni, pnxt, + &kiblnd_data.kib_peers[i], ibp_list) { + LASSERT(!kiblnd_peer_idle(peer_ni)); - if (peer->ibp_ni != ni) + if (peer_ni->ibp_ni != ni) continue; - if (!(nid == LNET_NID_ANY || nid == peer->ibp_nid)) + if (!(nid == LNET_NID_ANY || nid == peer_ni->ibp_nid)) continue; - count += kiblnd_close_peer_conns_locked(peer, 0); + count += kiblnd_close_peer_conns_locked(peer_ni, 0); } } @@ -1043,14 +1047,14 @@ static void kiblnd_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when) time64_t last_alive = 0; time64_t now = ktime_get_seconds(); rwlock_t *glock = &kiblnd_data.kib_global_lock; - struct kib_peer *peer; + struct kib_peer_ni *peer_ni; unsigned long flags; read_lock_irqsave(glock, flags); - peer = kiblnd_find_peer_locked(ni, nid); - if (peer) - last_alive = peer->ibp_last_alive; + peer_ni = kiblnd_find_peer_locked(ni, nid); + if (peer_ni) + last_alive = peer_ni->ibp_last_alive; read_unlock_irqrestore(glock, flags); @@ -1058,14 +1062,14 @@ static void kiblnd_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when) *when = last_alive; /* - * peer is not persistent in hash, trigger peer creation + * peer_ni is not persistent in hash, trigger peer_ni creation * and connection establishment with a NULL tx */ - if (!peer) + if (!peer_ni) kiblnd_launch_tx(ni, NULL, nid); - CDEBUG(D_NET, "Peer %s %p, alive %lld secs ago\n", - libcfs_nid2str(nid), peer, + CDEBUG(D_NET, "peer_ni %s %p, alive %lld secs ago\n", + libcfs_nid2str(nid), peer_ni, last_alive ? 
now - last_alive : -1); } @@ -2595,7 +2599,7 @@ static void kiblnd_shutdown(struct lnet_ni *ni) /* nuke all existing peers within this net */ kiblnd_del_peer(ni, LNET_NID_ANY); - /* Wait for all peer state to clean up */ + /* Wait for all peer_ni state to clean up */ i = 2; while (atomic_read(&net->ibn_npeers)) { i++; diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h index 520f586015f4..b1851b529ef8 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h @@ -66,7 +66,7 @@ #include -#define IBLND_PEER_HASH_SIZE 101 /* # peer lists */ +#define IBLND_PEER_HASH_SIZE 101 /* # peer_ni lists */ /* # scheduler loops before reschedule */ #define IBLND_RESCHED 100 @@ -96,8 +96,9 @@ extern struct kib_tunables kiblnd_tunables; #define IBLND_MSG_QUEUE_SIZE_V1 8 /* V1 only : # messages/RDMAs in-flight */ #define IBLND_CREDIT_HIGHWATER_V1 7 /* V1 only : when eagerly to return credits */ -#define IBLND_CREDITS_DEFAULT 8 /* default # of peer credits */ -#define IBLND_CREDITS_MAX ((typeof(((struct kib_msg *)0)->ibm_credits)) - 1) /* Max # of peer credits */ +#define IBLND_CREDITS_DEFAULT 8 /* default # of peer_ni credits */ +/* Max # of peer_ni credits */ +#define IBLND_CREDITS_MAX ((typeof(((struct kib_msg *)0)->ibm_credits)) - 1) /* when eagerly to return credits */ #define IBLND_CREDITS_HIGHWATER(t, v) ((v) == IBLND_MSG_VERSION_1 ? 
\ @@ -324,7 +325,7 @@ struct kib_data { struct list_head kib_failed_devs; /* list head of failed devices */ wait_queue_head_t kib_failover_waitq; /* schedulers sleep here */ atomic_t kib_nthreads; /* # live threads */ - rwlock_t kib_global_lock; /* stabilize net/dev/peer/conn ops */ + rwlock_t kib_global_lock; /* stabilize net/dev/peer_ni/conn ops */ struct list_head *kib_peers; /* hash table of all my known peers */ int kib_peer_hash_size; /* size of kib_peers */ void *kib_connd; /* the connd task (serialisation assertions) */ @@ -445,7 +446,7 @@ struct kib_rej { __u16 ibr_version; /* sender's version */ __u8 ibr_why; /* reject reason */ __u8 ibr_padding; /* padding */ - __u64 ibr_incarnation; /* incarnation of peer */ + __u64 ibr_incarnation; /* incarnation of peer_ni */ struct kib_connparams ibr_cp; /* connection parameters */ } __packed; @@ -453,11 +454,11 @@ struct kib_rej { #define IBLND_REJECT_CONN_RACE 1 /* You lost connection race */ #define IBLND_REJECT_NO_RESOURCES 2 /* Out of memory/conns etc */ #define IBLND_REJECT_FATAL 3 /* Anything else */ -#define IBLND_REJECT_CONN_UNCOMPAT 4 /* incompatible version peer */ -#define IBLND_REJECT_CONN_STALE 5 /* stale peer */ -/* peer's rdma frags doesn't match mine */ +#define IBLND_REJECT_CONN_UNCOMPAT 4 /* incompatible version peer_ni */ +#define IBLND_REJECT_CONN_STALE 5 /* stale peer_ni */ +/* peer_ni's rdma frags doesn't match mine */ #define IBLND_REJECT_RDMA_FRAGS 6 -/* peer's msg queue size doesn't match mine */ +/* peer_ni's msg queue size doesn't match mine */ #define IBLND_REJECT_MSG_QUEUE_SIZE 7 /***********************************************************************/ @@ -476,7 +477,7 @@ struct kib_rx { /* receive message */ #define IBLND_POSTRX_DONT_POST 0 /* don't post */ #define IBLND_POSTRX_NO_CREDIT 1 /* post: no credits */ -#define IBLND_POSTRX_PEER_CREDIT 2 /* post: give peer back 1 credit */ +#define IBLND_POSTRX_PEER_CREDIT 2 /* post: give peer_ni back 1 credit */ #define 
IBLND_POSTRX_RSRVD_CREDIT 3 /* post: give self back 1 reserved credit */ struct kib_tx { /* transmit message */ @@ -485,7 +486,7 @@ struct kib_tx { /* transmit message */ struct kib_conn *tx_conn; /* owning conn */ short tx_sending; /* # tx callbacks outstanding */ short tx_queued; /* queued for sending */ - short tx_waiting; /* waiting for peer */ + short tx_waiting; /* waiting for peer_ni */ int tx_status; /* LNET completion status */ ktime_t tx_deadline; /* completion deadline */ __u64 tx_cookie; /* completion cookie */ @@ -510,14 +511,14 @@ struct kib_connvars { struct kib_conn { struct kib_sched_info *ibc_sched; /* scheduler information */ - struct kib_peer *ibc_peer; /* owning peer */ + struct kib_peer_ni *ibc_peer; /* owning peer_ni */ struct kib_hca_dev *ibc_hdev; /* HCA bound on */ - struct list_head ibc_list; /* stash on peer's conn list */ + struct list_head ibc_list; /* stash on peer_ni's conn list */ struct list_head ibc_sched_list; /* schedule for attention */ __u16 ibc_version; /* version of connection */ /* reconnect later */ __u16 ibc_reconnect:1; - __u64 ibc_incarnation; /* which instance of the peer */ + __u64 ibc_incarnation; /* which instance of the peer_ni */ atomic_t ibc_refcount; /* # users */ int ibc_state; /* what's happening */ int ibc_nsends_posted; /* # uncompleted sends */ @@ -562,32 +563,32 @@ struct kib_conn { #define IBLND_CONN_CLOSING 4 /* being closed */ #define IBLND_CONN_DISCONNECTED 5 /* disconnected */ -struct kib_peer { - struct list_head ibp_list; /* stash on global peer list */ +struct kib_peer_ni { + struct list_head ibp_list; /* stash on global peer_ni list */ lnet_nid_t ibp_nid; /* who's on the other end(s) */ struct lnet_ni *ibp_ni; /* LNet interface */ struct list_head ibp_conns; /* all active connections */ struct kib_conn *ibp_next_conn; /* next connection to send on for * round robin */ struct list_head ibp_tx_queue; /* msgs waiting for a conn */ - __u64 ibp_incarnation; /* incarnation of peer */ + __u64 
ibp_incarnation; /* incarnation of peer_ni */ /* when (in seconds) I was last alive */ time64_t ibp_last_alive; /* # users */ atomic_t ibp_refcount; - /* version of peer */ + /* version of peer_ni */ __u16 ibp_version; /* current passive connection attempts */ unsigned short ibp_accepting; /* current active connection attempts */ unsigned short ibp_connecting; - /* reconnect this peer later */ + /* reconnect this peer_ni later */ unsigned char ibp_reconnecting; /* counter of how many times we triggered a conn race */ unsigned char ibp_races; - /* # consecutive reconnection attempts to this peer */ + /* # consecutive reconnection attempts to this peer_ni */ unsigned int ibp_reconnected; - /* errno on closing this peer */ + /* errno on closing this peer_ni */ int ibp_error; /* max map_on_demand */ __u16 ibp_max_frags; @@ -694,36 +695,37 @@ do { \ } \ } while (0) -#define kiblnd_peer_addref(peer) \ +#define kiblnd_peer_addref(peer_ni) \ do { \ - CDEBUG(D_NET, "peer[%p] -> %s (%d)++\n", \ - (peer), libcfs_nid2str((peer)->ibp_nid), \ - atomic_read(&(peer)->ibp_refcount)); \ - atomic_inc(&(peer)->ibp_refcount); \ + CDEBUG(D_NET, "peer_ni[%p] -> %s (%d)++\n", \ + (peer_ni), libcfs_nid2str((peer_ni)->ibp_nid), \ + atomic_read(&(peer_ni)->ibp_refcount)); \ + atomic_inc(&(peer_ni)->ibp_refcount); \ } while (0) -#define kiblnd_peer_decref(peer) \ +#define kiblnd_peer_decref(peer_ni) \ do { \ - CDEBUG(D_NET, "peer[%p] -> %s (%d)--\n", \ - (peer), libcfs_nid2str((peer)->ibp_nid), \ - atomic_read(&(peer)->ibp_refcount)); \ - LASSERT_ATOMIC_POS(&(peer)->ibp_refcount); \ - if (atomic_dec_and_test(&(peer)->ibp_refcount)) \ - kiblnd_destroy_peer(peer); \ + CDEBUG(D_NET, "peer_ni[%p] -> %s (%d)--\n", \ + (peer_ni), libcfs_nid2str((peer_ni)->ibp_nid), \ + atomic_read(&(peer_ni)->ibp_refcount)); \ + LASSERT_ATOMIC_POS(&(peer_ni)->ibp_refcount); \ + if (atomic_dec_and_test(&(peer_ni)->ibp_refcount)) \ + kiblnd_destroy_peer(peer_ni); \ } while (0) static inline bool 
-kiblnd_peer_connecting(struct kib_peer *peer) +kiblnd_peer_connecting(struct kib_peer_ni *peer_ni) { - return peer->ibp_connecting || - peer->ibp_reconnecting || - peer->ibp_accepting; + return peer_ni->ibp_connecting || + peer_ni->ibp_reconnecting || + peer_ni->ibp_accepting; } static inline bool -kiblnd_peer_idle(struct kib_peer *peer) +kiblnd_peer_idle(struct kib_peer_ni *peer_ni) { - return !kiblnd_peer_connecting(peer) && list_empty(&peer->ibp_conns); + return !kiblnd_peer_connecting(peer_ni) && + list_empty(&peer_ni->ibp_conns); } static inline struct list_head * @@ -736,28 +738,28 @@ kiblnd_nid2peerlist(lnet_nid_t nid) } static inline int -kiblnd_peer_active(struct kib_peer *peer) +kiblnd_peer_active(struct kib_peer_ni *peer_ni) { - /* Am I in the peer hash table? */ - return !list_empty(&peer->ibp_list); + /* Am I in the peer_ni hash table? */ + return !list_empty(&peer_ni->ibp_list); } static inline struct kib_conn * -kiblnd_get_conn_locked(struct kib_peer *peer) +kiblnd_get_conn_locked(struct kib_peer_ni *peer_ni) { struct list_head *next; - LASSERT(!list_empty(&peer->ibp_conns)); + LASSERT(!list_empty(&peer_ni->ibp_conns)); /* Advance to next connection, be sure to skip the head node */ - if (!peer->ibp_next_conn || - peer->ibp_next_conn->ibc_list.next == &peer->ibp_conns) - next = peer->ibp_conns.next; + if (!peer_ni->ibp_next_conn || + peer_ni->ibp_next_conn->ibc_list.next == &peer_ni->ibp_conns) + next = peer_ni->ibp_conns.next; else - next = peer->ibp_next_conn->ibc_list.next; - peer->ibp_next_conn = list_entry(next, struct kib_conn, ibc_list); + next = peer_ni->ibp_next_conn->ibc_list.next; + peer_ni->ibp_next_conn = list_entry(next, struct kib_conn, ibc_list); - return peer->ibp_next_conn; + return peer_ni->ibp_next_conn; } static inline int @@ -1013,18 +1015,18 @@ int kiblnd_cm_callback(struct rdma_cm_id *cmid, int kiblnd_translate_mtu(int value); int kiblnd_dev_failover(struct kib_dev *dev); -int kiblnd_create_peer(struct lnet_ni *ni, struct 
kib_peer **peerp, +int kiblnd_create_peer(struct lnet_ni *ni, struct kib_peer_ni **peerp, lnet_nid_t nid); -void kiblnd_destroy_peer(struct kib_peer *peer); -bool kiblnd_reconnect_peer(struct kib_peer *peer); +void kiblnd_destroy_peer(struct kib_peer_ni *peer_ni); +bool kiblnd_reconnect_peer(struct kib_peer_ni *peer_ni); void kiblnd_destroy_dev(struct kib_dev *dev); -void kiblnd_unlink_peer_locked(struct kib_peer *peer); -struct kib_peer *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid); -int kiblnd_close_stale_conns_locked(struct kib_peer *peer, +void kiblnd_unlink_peer_locked(struct kib_peer_ni *peer_ni); +struct kib_peer_ni *kiblnd_find_peer_locked(struct lnet_ni *ni, lnet_nid_t nid); +int kiblnd_close_stale_conns_locked(struct kib_peer_ni *peer_ni, int version, __u64 incarnation); -int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why); +int kiblnd_close_peer_conns_locked(struct kib_peer_ni *peer_ni, int why); -struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, +struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, struct rdma_cm_id *cmid, int state, int version); void kiblnd_destroy_conn(struct kib_conn *conn); diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c index f4b76347e1c6..cb752dcd35d9 100644 --- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -40,8 +40,9 @@ #define MAX_CONN_RACES_BEFORE_ABORT 20 -static void kiblnd_peer_alive(struct kib_peer *peer); -static void kiblnd_peer_connect_failed(struct kib_peer *peer, int active, int error); +static void kiblnd_peer_alive(struct kib_peer_ni *peer_ni); +static void kiblnd_peer_connect_failed(struct kib_peer_ni *peer_ni, int active, + int error); static void kiblnd_init_tx_msg(struct lnet_ni *ni, struct kib_tx *tx, int type, int body_nob); static int kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type, @@ 
-62,9 +63,9 @@ kiblnd_tx_done(struct lnet_ni *ni, struct kib_tx *tx) LASSERT(net); LASSERT(!in_interrupt()); - LASSERT(!tx->tx_queued); /* mustn't be queued for sending */ - LASSERT(!tx->tx_sending); /* mustn't be awaiting sent callback */ - LASSERT(!tx->tx_waiting); /* mustn't be awaiting peer response */ + LASSERT(!tx->tx_queued); /* mustn't be queued for sending */ + LASSERT(!tx->tx_sending); /* mustn't be awaiting sent callback */ + LASSERT(!tx->tx_waiting); /* mustn't be awaiting peer_ni response */ LASSERT(tx->tx_pool); kiblnd_unmap_tx(tx); @@ -414,7 +415,7 @@ kiblnd_handle_rx(struct kib_rx *rx) LASSERT(tx->tx_waiting); /* * CAVEAT EMPTOR: I could be racing with tx_complete, but... - * (a) I can overwrite tx_msg since my peer has received it! + * (a) I can overwrite tx_msg since my peer_ni has received it! * (b) tx_waiting set tells tx_complete() it's not done. */ tx->tx_nwrq = 0; /* overwrite PUT_REQ */ @@ -579,8 +580,8 @@ kiblnd_fmr_map_tx(struct kib_net *net, struct kib_tx *tx, struct kib_rdma_desc * } /* - * If rd is not tx_rd, it's going to get sent to a peer, who will need - * the rkey + * If rd is not tx_rd, it's going to get sent to a peer_ni, + * who will need the rkey */ rd->rd_key = tx->fmr.fmr_key; rd->rd_frags[0].rf_addr &= ~hdev->ibh_page_mask; @@ -611,7 +612,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, int i; /* - * If rd is not tx_rd, it's going to get sent to a peer and I'm the + * If rd is not tx_rd, it's going to get sent to a peer_ni and I'm the * RDMA sink */ tx->tx_dmadir = (rd != tx->tx_rd) ? 
DMA_FROM_DEVICE : DMA_TO_DEVICE; @@ -742,8 +743,8 @@ kiblnd_post_tx_locked(struct kib_conn *conn, struct kib_tx *tx, int credit) __must_hold(&conn->ibc_lock) { struct kib_msg *msg = tx->tx_msg; - struct kib_peer *peer = conn->ibc_peer; - struct lnet_ni *ni = peer->ibp_ni; + struct kib_peer_ni *peer_ni = conn->ibc_peer; + struct lnet_ni *ni = peer_ni->ibp_ni; int ver = conn->ibc_version; int rc; int done; @@ -761,13 +762,13 @@ kiblnd_post_tx_locked(struct kib_conn *conn, struct kib_tx *tx, int credit) if (conn->ibc_nsends_posted == kiblnd_concurrent_sends(ver, ni)) { /* tx completions outstanding... */ CDEBUG(D_NET, "%s: posted enough\n", - libcfs_nid2str(peer->ibp_nid)); + libcfs_nid2str(peer_ni->ibp_nid)); return -EAGAIN; } if (credit && !conn->ibc_credits) { /* no credits */ CDEBUG(D_NET, "%s: no credits\n", - libcfs_nid2str(peer->ibp_nid)); + libcfs_nid2str(peer_ni->ibp_nid)); return -EAGAIN; } @@ -775,7 +776,7 @@ kiblnd_post_tx_locked(struct kib_conn *conn, struct kib_tx *tx, int credit) conn->ibc_credits == 1 && /* last credit reserved */ msg->ibm_type != IBLND_MSG_NOOP) { /* for NOOP */ CDEBUG(D_NET, "%s: not using last credit\n", - libcfs_nid2str(peer->ibp_nid)); + libcfs_nid2str(peer_ni->ibp_nid)); return -EAGAIN; } @@ -793,16 +794,17 @@ kiblnd_post_tx_locked(struct kib_conn *conn, struct kib_tx *tx, int credit) * posted NOOPs complete */ spin_unlock(&conn->ibc_lock); - kiblnd_tx_done(peer->ibp_ni, tx); + kiblnd_tx_done(peer_ni->ibp_ni, tx); spin_lock(&conn->ibc_lock); CDEBUG(D_NET, "%s(%d): redundant or enough NOOP\n", - libcfs_nid2str(peer->ibp_nid), + libcfs_nid2str(peer_ni->ibp_nid), conn->ibc_noops_posted); return 0; } - kiblnd_pack_msg(peer->ibp_ni, msg, ver, conn->ibc_outstanding_credits, - peer->ibp_nid, conn->ibc_incarnation); + kiblnd_pack_msg(peer_ni->ibp_ni, msg, ver, + conn->ibc_outstanding_credits, + peer_ni->ibp_nid, conn->ibc_incarnation); conn->ibc_credits -= credit; conn->ibc_outstanding_credits = 0; @@ -844,7 +846,7 @@ 
kiblnd_post_tx_locked(struct kib_conn *conn, struct kib_tx *tx, int credit) } LASSERTF(bad->wr_id == kiblnd_ptr2wreqid(tx, IBLND_WID_TX), - "bad wr_id %llx, opc %d, flags %d, peer: %s\n", + "bad wr_id %llx, opc %d, flags %d, peer_ni: %s\n", bad->wr_id, bad->opcode, bad->send_flags, libcfs_nid2str(conn->ibc_peer->ibp_nid)); bad = NULL; @@ -878,15 +880,15 @@ kiblnd_post_tx_locked(struct kib_conn *conn, struct kib_tx *tx, int credit) if (conn->ibc_state == IBLND_CONN_ESTABLISHED) CERROR("Error %d posting transmit to %s\n", - rc, libcfs_nid2str(peer->ibp_nid)); + rc, libcfs_nid2str(peer_ni->ibp_nid)); else CDEBUG(D_NET, "Error %d posting transmit to %s\n", - rc, libcfs_nid2str(peer->ibp_nid)); + rc, libcfs_nid2str(peer_ni->ibp_nid)); kiblnd_close_conn(conn, rc); if (done) - kiblnd_tx_done(peer->ibp_ni, tx); + kiblnd_tx_done(peer_ni->ibp_ni, tx); spin_lock(&conn->ibc_lock); @@ -991,12 +993,12 @@ kiblnd_tx_complete(struct kib_tx *tx, int status) conn->ibc_noops_posted--; if (failed) { - tx->tx_waiting = 0; /* don't wait for peer */ + tx->tx_waiting = 0; /* don't wait for peer_ni */ tx->tx_status = -EIO; } idle = !tx->tx_sending && /* This is the final callback */ - !tx->tx_waiting && /* Not waiting for peer */ + !tx->tx_waiting && /* Not waiting for peer_ni */ !tx->tx_queued; /* Not re-queued (PUT_DONE) */ if (idle) list_del(&tx->tx_list); @@ -1058,7 +1060,7 @@ kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type, type == IBLND_MSG_PUT_DONE); if (kiblnd_rd_size(srcrd) > conn->ibc_max_frags << PAGE_SHIFT) { - CERROR("RDMA is too large for peer %s (%d), src size: %d dst size: %d\n", + CERROR("RDMA is too large for peer_ni %s (%d), src size: %d dst size: %d\n", libcfs_nid2str(conn->ibc_peer->ibp_nid), conn->ibc_max_frags << PAGE_SHIFT, kiblnd_rd_size(srcrd), kiblnd_rd_size(dstrd)); @@ -1080,7 +1082,7 @@ kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type, } if (tx->tx_nwrq >= IBLND_MAX_RDMA_FRAGS) { - CERROR("RDMA has too many fragments for 
peer %s (%d), src idx/frags: %d/%d dst idx/frags: %d/%d\n",
+		CERROR("RDMA has too many fragments for peer_ni %s (%d), src idx/frags: %d/%d dst idx/frags: %d/%d\n",
 		       libcfs_nid2str(conn->ibc_peer->ibp_nid),
 		       IBLND_MAX_RDMA_FRAGS,
 		       srcidx, srcrd->rd_nfrags,
@@ -1234,24 +1236,24 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 }
 
 static void
-kiblnd_connect_peer(struct kib_peer *peer)
+kiblnd_connect_peer(struct kib_peer_ni *peer_ni)
 {
 	struct rdma_cm_id *cmid;
 	struct kib_dev *dev;
-	struct kib_net *net = peer->ibp_ni->ni_data;
+	struct kib_net *net = peer_ni->ibp_ni->ni_data;
 	struct sockaddr_in srcaddr;
 	struct sockaddr_in dstaddr;
 	int rc;
 
 	LASSERT(net);
-	LASSERT(peer->ibp_connecting > 0);
+	LASSERT(peer_ni->ibp_connecting > 0);
 
-	cmid = kiblnd_rdma_create_id(kiblnd_cm_callback, peer, RDMA_PS_TCP,
+	cmid = kiblnd_rdma_create_id(kiblnd_cm_callback, peer_ni, RDMA_PS_TCP,
 				     IB_QPT_RC);
 
 	if (IS_ERR(cmid)) {
 		CERROR("Can't create CMID for %s: %ld\n",
-		       libcfs_nid2str(peer->ibp_nid), PTR_ERR(cmid));
+		       libcfs_nid2str(peer_ni->ibp_nid), PTR_ERR(cmid));
 		rc = PTR_ERR(cmid);
 		goto failed;
 	}
@@ -1264,9 +1266,9 @@ kiblnd_connect_peer(struct kib_peer *peer)
 	memset(&dstaddr, 0, sizeof(dstaddr));
 	dstaddr.sin_family = AF_INET;
 	dstaddr.sin_port = htons(*kiblnd_tunables.kib_service);
-	dstaddr.sin_addr.s_addr = htonl(LNET_NIDADDR(peer->ibp_nid));
+	dstaddr.sin_addr.s_addr = htonl(LNET_NIDADDR(peer_ni->ibp_nid));
 
-	kiblnd_peer_addref(peer); /* cmid's ref */
+	kiblnd_peer_addref(peer_ni); /* cmid's ref */
 
 	if (*kiblnd_tunables.kib_use_priv_port) {
 		rc = kiblnd_resolve_addr(cmid, &srcaddr, &dstaddr,
@@ -1280,23 +1282,23 @@ kiblnd_connect_peer(struct kib_peer *peer)
 	if (rc) {
 		/* Can't initiate address resolution: */
 		CERROR("Can't resolve addr for %s: %d\n",
-		       libcfs_nid2str(peer->ibp_nid), rc);
+		       libcfs_nid2str(peer_ni->ibp_nid), rc);
 		goto failed2;
 	}
 
 	return;
 
 failed2:
-	kiblnd_peer_connect_failed(peer, 1, rc);
-	kiblnd_peer_decref(peer); /* cmid's ref */
+	kiblnd_peer_connect_failed(peer_ni, 1, rc);
+	kiblnd_peer_decref(peer_ni); /* cmid's ref */
 	rdma_destroy_id(cmid);
 	return;
 failed:
-	kiblnd_peer_connect_failed(peer, 1, rc);
+	kiblnd_peer_connect_failed(peer_ni, 1, rc);
 }
 
 bool
-kiblnd_reconnect_peer(struct kib_peer *peer)
+kiblnd_reconnect_peer(struct kib_peer_ni *peer_ni)
 {
 	rwlock_t *glock = &kiblnd_data.kib_global_lock;
 	char *reason = NULL;
@@ -1306,12 +1308,12 @@ kiblnd_reconnect_peer(struct kib_peer *peer)
 	INIT_LIST_HEAD(&txs);
 
 	write_lock_irqsave(glock, flags);
-	if (!peer->ibp_reconnecting) {
-		if (peer->ibp_accepting)
+	if (!peer_ni->ibp_reconnecting) {
+		if (peer_ni->ibp_accepting)
 			reason = "accepting";
-		else if (peer->ibp_connecting)
+		else if (peer_ni->ibp_connecting)
 			reason = "connecting";
-		else if (!list_empty(&peer->ibp_conns))
+		else if (!list_empty(&peer_ni->ibp_conns))
 			reason = "connected";
 		else /* connected then closed */
 			reason = "closed";
@@ -1319,37 +1321,37 @@ kiblnd_reconnect_peer(struct kib_peer *peer)
 		goto no_reconnect;
 	}
 
-	LASSERT(!peer->ibp_accepting && !peer->ibp_connecting &&
-		list_empty(&peer->ibp_conns));
-	peer->ibp_reconnecting--;
+	LASSERT(!peer_ni->ibp_accepting && !peer_ni->ibp_connecting &&
+		list_empty(&peer_ni->ibp_conns));
+	peer_ni->ibp_reconnecting--;
 
-	if (!kiblnd_peer_active(peer)) {
-		list_splice_init(&peer->ibp_tx_queue, &txs);
+	if (!kiblnd_peer_active(peer_ni)) {
+		list_splice_init(&peer_ni->ibp_tx_queue, &txs);
 		reason = "unlinked";
 		goto no_reconnect;
 	}
 
-	peer->ibp_connecting++;
-	peer->ibp_reconnected++;
+	peer_ni->ibp_connecting++;
+	peer_ni->ibp_reconnected++;
 	write_unlock_irqrestore(glock, flags);
 
-	kiblnd_connect_peer(peer);
+	kiblnd_connect_peer(peer_ni);
 	return true;
 
 no_reconnect:
 	write_unlock_irqrestore(glock, flags);
 
 	CWARN("Abort reconnection of %s: %s\n",
-	      libcfs_nid2str(peer->ibp_nid), reason);
-	kiblnd_txlist_done(peer->ibp_ni, &txs, -ECONNABORTED);
+	      libcfs_nid2str(peer_ni->ibp_nid), reason);
+	kiblnd_txlist_done(peer_ni->ibp_ni, &txs, -ECONNABORTED);
 	return false;
 }
 
 void
 kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 {
-	struct kib_peer *peer;
-	struct kib_peer *peer2;
+	struct kib_peer_ni *peer_ni;
+	struct kib_peer_ni *peer2;
 	struct kib_conn *conn;
 	rwlock_t *g_lock = &kiblnd_data.kib_global_lock;
 	unsigned long flags;
@@ -1370,10 +1372,10 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 	 */
 	read_lock_irqsave(g_lock, flags);
 
-	peer = kiblnd_find_peer_locked(ni, nid);
-	if (peer && !list_empty(&peer->ibp_conns)) {
-		/* Found a peer with an established connection */
-		conn = kiblnd_get_conn_locked(peer);
+	peer_ni = kiblnd_find_peer_locked(ni, nid);
+	if (peer_ni && !list_empty(&peer_ni->ibp_conns)) {
+		/* Found a peer_ni with an established connection */
+		conn = kiblnd_get_conn_locked(peer_ni);
 		kiblnd_conn_addref(conn); /* 1 ref for me... */
 
 		read_unlock_irqrestore(g_lock, flags);
@@ -1388,17 +1390,17 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 	/* Re-try with a write lock */
 	write_lock(g_lock);
 
-	peer = kiblnd_find_peer_locked(ni, nid);
-	if (peer) {
-		if (list_empty(&peer->ibp_conns)) {
-			/* found a peer, but it's still connecting... */
-			LASSERT(kiblnd_peer_connecting(peer));
+	peer_ni = kiblnd_find_peer_locked(ni, nid);
+	if (peer_ni) {
+		if (list_empty(&peer_ni->ibp_conns)) {
+			/* found a peer_ni, but it's still connecting... */
+			LASSERT(kiblnd_peer_connecting(peer_ni));
 			if (tx)
 				list_add_tail(&tx->tx_list,
-					      &peer->ibp_tx_queue);
+					      &peer_ni->ibp_tx_queue);
 			write_unlock_irqrestore(g_lock, flags);
 		} else {
-			conn = kiblnd_get_conn_locked(peer);
+			conn = kiblnd_get_conn_locked(peer_ni);
 			kiblnd_conn_addref(conn); /* 1 ref for me... */
 
 			write_unlock_irqrestore(g_lock, flags);
@@ -1412,10 +1414,10 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 
 	write_unlock_irqrestore(g_lock, flags);
 
-	/* Allocate a peer ready to add to the peer table and retry */
-	rc = kiblnd_create_peer(ni, &peer, nid);
+	/* Allocate a peer_ni ready to add to the peer_ni table and retry */
+	rc = kiblnd_create_peer(ni, &peer_ni, nid);
 	if (rc) {
-		CERROR("Can't create peer %s\n", libcfs_nid2str(nid));
+		CERROR("Can't create peer_ni %s\n", libcfs_nid2str(nid));
 		if (tx) {
 			tx->tx_status = -EHOSTUNREACH;
 			tx->tx_waiting = 0;
@@ -1429,7 +1431,7 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 	peer2 = kiblnd_find_peer_locked(ni, nid);
 	if (peer2) {
 		if (list_empty(&peer2->ibp_conns)) {
-			/* found a peer, but it's still connecting... */
+			/* found a peer_ni, but it's still connecting... */
 			LASSERT(kiblnd_peer_connecting(peer2));
 			if (tx)
 				list_add_tail(&tx->tx_list,
@@ -1446,29 +1448,29 @@ kiblnd_launch_tx(struct lnet_ni *ni, struct kib_tx *tx, lnet_nid_t nid)
 			kiblnd_conn_decref(conn); /* ...to here */
 		}
 
-		kiblnd_peer_decref(peer);
+		kiblnd_peer_decref(peer_ni);
 		return;
 	}
 
-	/* Brand new peer */
-	LASSERT(!peer->ibp_connecting);
-	tunables = &peer->ibp_ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib;
-	peer->ibp_connecting = tunables->lnd_conns_per_peer;
+	/* Brand new peer_ni */
+	LASSERT(!peer_ni->ibp_connecting);
+	tunables = &peer_ni->ibp_ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib;
+	peer_ni->ibp_connecting = tunables->lnd_conns_per_peer;
 
 	/* always called with a ref on ni, which prevents ni being shutdown */
 	LASSERT(!((struct kib_net *)ni->ni_data)->ibn_shutdown);
 
 	if (tx)
-		list_add_tail(&tx->tx_list, &peer->ibp_tx_queue);
+		list_add_tail(&tx->tx_list, &peer_ni->ibp_tx_queue);
 
-	kiblnd_peer_addref(peer);
-	list_add_tail(&peer->ibp_list, kiblnd_nid2peerlist(nid));
+	kiblnd_peer_addref(peer_ni);
+	list_add_tail(&peer_ni->ibp_list, kiblnd_nid2peerlist(nid));
 
 	write_unlock_irqrestore(g_lock, flags);
 
 	for (i = 0; i < tunables->lnd_conns_per_peer; i++)
-		kiblnd_connect_peer(peer);
-	kiblnd_peer_decref(peer);
+		kiblnd_connect_peer(peer_ni);
+	kiblnd_peer_decref(peer_ni);
 }
 
 int
@@ -1787,7 +1789,7 @@ kiblnd_recv(struct lnet_ni *ni, void *private, struct lnet_msg *lntmsg,
 			CERROR("Can't setup PUT sink for %s: %d\n",
 			       libcfs_nid2str(conn->ibc_peer->ibp_nid), rc);
 			kiblnd_tx_done(ni, tx);
-			/* tell peer it's over */
+			/* tell peer_ni it's over */
 			kiblnd_send_completion(rx->rx_conn, IBLND_MSG_PUT_NAK,
 					       rc,
 					       rxmsg->ibm_u.putreq.ibprm_cookie);
 			break;
@@ -1844,15 +1846,15 @@ kiblnd_thread_fini(void)
 }
 
 static void
-kiblnd_peer_alive(struct kib_peer *peer)
+kiblnd_peer_alive(struct kib_peer_ni *peer_ni)
 {
 	/* This is racy, but everyone's only writing ktime_get_seconds() */
-	peer->ibp_last_alive = ktime_get_seconds();
+	peer_ni->ibp_last_alive = ktime_get_seconds();
 	mb();
 }
 
 static void
-kiblnd_peer_notify(struct kib_peer *peer)
+kiblnd_peer_notify(struct kib_peer_ni *peer_ni)
 {
 	int error = 0;
 	time64_t last_alive = 0;
@@ -1860,18 +1862,18 @@ kiblnd_peer_notify(struct kib_peer *peer)
 
 	read_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
 
-	if (kiblnd_peer_idle(peer) && peer->ibp_error) {
-		error = peer->ibp_error;
-		peer->ibp_error = 0;
+	if (kiblnd_peer_idle(peer_ni) && peer_ni->ibp_error) {
+		error = peer_ni->ibp_error;
+		peer_ni->ibp_error = 0;
 
-		last_alive = peer->ibp_last_alive;
+		last_alive = peer_ni->ibp_last_alive;
 	}
 
 	read_unlock_irqrestore(&kiblnd_data.kib_global_lock, flags);
 
 	if (error)
-		lnet_notify(peer->ibp_ni,
-			    peer->ibp_nid, 0, last_alive);
+		lnet_notify(peer_ni->ibp_ni,
+			    peer_ni->ibp_nid, 0, last_alive);
 }
 
 void
@@ -1885,7 +1887,7 @@ kiblnd_close_conn_locked(struct kib_conn *conn, int error)
 	 * already dealing with it (either to set it up or tear it down).
 	 * Caller holds kib_global_lock exclusively in irq context
 	 */
-	struct kib_peer *peer = conn->ibc_peer;
+	struct kib_peer_ni *peer_ni = conn->ibc_peer;
 	struct kib_dev *dev;
 	unsigned long flags;
 
@@ -1904,10 +1906,10 @@ kiblnd_close_conn_locked(struct kib_conn *conn, int error)
 	    list_empty(&conn->ibc_tx_queue_nocred) &&
 	    list_empty(&conn->ibc_active_txs)) {
 		CDEBUG(D_NET, "closing conn to %s\n",
-		       libcfs_nid2str(peer->ibp_nid));
+		       libcfs_nid2str(peer_ni->ibp_nid));
 	} else {
 		CNETERR("Closing conn to %s: error %d%s%s%s%s%s\n",
-			libcfs_nid2str(peer->ibp_nid), error,
+			libcfs_nid2str(peer_ni->ibp_nid), error,
 			list_empty(&conn->ibc_tx_queue) ? "" : "(sending)",
 			list_empty(&conn->ibc_tx_noops) ? "" : "(sending_noops)",
 			list_empty(&conn->ibc_tx_queue_rsrvd) ? "" : "(sending_rsrvd)",
@@ -1915,19 +1917,19 @@ kiblnd_close_conn_locked(struct kib_conn *conn, int error)
 			list_empty(&conn->ibc_active_txs) ? "" : "(waiting)");
 	}
 
-	dev = ((struct kib_net *)peer->ibp_ni->ni_data)->ibn_dev;
-	if (peer->ibp_next_conn == conn)
+	dev = ((struct kib_net *)peer_ni->ibp_ni->ni_data)->ibn_dev;
+	if (peer_ni->ibp_next_conn == conn)
 		/* clear next_conn so it won't be used */
-		peer->ibp_next_conn = NULL;
+		peer_ni->ibp_next_conn = NULL;
 	list_del(&conn->ibc_list);
 	/* connd (see below) takes over ibc_list's ref */
 
-	if (list_empty(&peer->ibp_conns) &&	/* no more conns */
-	    kiblnd_peer_active(peer)) {		/* still in peer table */
-		kiblnd_unlink_peer_locked(peer);
+	if (list_empty(&peer_ni->ibp_conns) &&	/* no more conns */
+	    kiblnd_peer_active(peer_ni)) {	/* still in peer_ni table */
+		kiblnd_unlink_peer_locked(peer_ni);
 
 		/* set/clear error on last conn */
-		peer->ibp_error = conn->ibc_comms_error;
+		peer_ni->ibp_error = conn->ibc_comms_error;
 	}
 
 	kiblnd_set_conn_state(conn, IBLND_CONN_CLOSING);
@@ -2046,7 +2048,7 @@ kiblnd_finalise_conn(struct kib_conn *conn)
 }
 
 static void
-kiblnd_peer_connect_failed(struct kib_peer *peer, int active, int error)
+kiblnd_peer_connect_failed(struct kib_peer_ni *peer_ni, int active, int error)
 {
 	LIST_HEAD(zombies);
 	unsigned long flags;
@@ -2057,52 +2059,52 @@ kiblnd_peer_connect_failed(struct kib_peer *peer, int active, int error)
 	write_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
 
 	if (active) {
-		LASSERT(peer->ibp_connecting > 0);
-		peer->ibp_connecting--;
+		LASSERT(peer_ni->ibp_connecting > 0);
+		peer_ni->ibp_connecting--;
 	} else {
-		LASSERT(peer->ibp_accepting > 0);
-		peer->ibp_accepting--;
+		LASSERT(peer_ni->ibp_accepting > 0);
+		peer_ni->ibp_accepting--;
 	}
 
-	if (kiblnd_peer_connecting(peer)) {
+	if (kiblnd_peer_connecting(peer_ni)) {
 		/* another connection attempt under way... */
 		write_unlock_irqrestore(&kiblnd_data.kib_global_lock, flags);
 		return;
 	}
 
-	peer->ibp_reconnected = 0;
-	if (list_empty(&peer->ibp_conns)) {
-		/* Take peer's blocked transmits to complete with error */
-		list_add(&zombies, &peer->ibp_tx_queue);
-		list_del_init(&peer->ibp_tx_queue);
+	peer_ni->ibp_reconnected = 0;
+	if (list_empty(&peer_ni->ibp_conns)) {
+		/* Take peer_ni's blocked transmits to complete with error */
+		list_add(&zombies, &peer_ni->ibp_tx_queue);
+		list_del_init(&peer_ni->ibp_tx_queue);
 
-		if (kiblnd_peer_active(peer))
-			kiblnd_unlink_peer_locked(peer);
+		if (kiblnd_peer_active(peer_ni))
+			kiblnd_unlink_peer_locked(peer_ni);
 
-		peer->ibp_error = error;
+		peer_ni->ibp_error = error;
 	} else {
 		/* Can't have blocked transmits if there are connections */
-		LASSERT(list_empty(&peer->ibp_tx_queue));
+		LASSERT(list_empty(&peer_ni->ibp_tx_queue));
 	}
 
 	write_unlock_irqrestore(&kiblnd_data.kib_global_lock, flags);
 
-	kiblnd_peer_notify(peer);
+	kiblnd_peer_notify(peer_ni);
 
 	if (list_empty(&zombies))
 		return;
 
 	CNETERR("Deleting messages for %s: connection failed\n",
-		libcfs_nid2str(peer->ibp_nid));
+		libcfs_nid2str(peer_ni->ibp_nid));
 
-	kiblnd_txlist_done(peer->ibp_ni, &zombies, -EHOSTUNREACH);
+	kiblnd_txlist_done(peer_ni->ibp_ni, &zombies, -EHOSTUNREACH);
 }
 
 static void
 kiblnd_connreq_done(struct kib_conn *conn, int status)
 {
-	struct kib_peer *peer = conn->ibc_peer;
+	struct kib_peer_ni *peer_ni = conn->ibc_peer;
 	struct kib_tx *tx;
 	struct list_head txs;
 	unsigned long flags;
@@ -2111,21 +2113,21 @@ kiblnd_connreq_done(struct kib_conn *conn, int status)
 	active = (conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT);
 
 	CDEBUG(D_NET, "%s: active(%d), version(%x), status(%d)\n",
-	       libcfs_nid2str(peer->ibp_nid), active,
+	       libcfs_nid2str(peer_ni->ibp_nid), active,
 	       conn->ibc_version, status);
 
 	LASSERT(!in_interrupt());
 	LASSERT((conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT &&
-		 peer->ibp_connecting > 0) ||
+		 peer_ni->ibp_connecting > 0) ||
 		(conn->ibc_state == IBLND_CONN_PASSIVE_WAIT &&
-		 peer->ibp_accepting > 0));
+		 peer_ni->ibp_accepting > 0));
 
 	kfree(conn->ibc_connvars);
 	conn->ibc_connvars = NULL;
 
 	if (status) {
 		/* failed to establish connection */
-		kiblnd_peer_connect_failed(peer, active, status);
+		kiblnd_peer_connect_failed(peer_ni, active, status);
 		kiblnd_finalise_conn(conn);
 		return;
 	}
@@ -2135,40 +2137,40 @@ kiblnd_connreq_done(struct kib_conn *conn, int status)
 	conn->ibc_last_send = ktime_get();
 	kiblnd_set_conn_state(conn, IBLND_CONN_ESTABLISHED);
-	kiblnd_peer_alive(peer);
+	kiblnd_peer_alive(peer_ni);
 
 	/*
-	 * Add conn to peer's list and nuke any dangling conns from a different
-	 * peer instance...
+	 * Add conn to peer_ni's list and nuke any dangling conns from
+	 * a different peer_ni instance...
 	 */
 	kiblnd_conn_addref(conn);	/* +1 ref for ibc_list */
-	list_add(&conn->ibc_list, &peer->ibp_conns);
-	peer->ibp_reconnected = 0;
+	list_add(&conn->ibc_list, &peer_ni->ibp_conns);
+	peer_ni->ibp_reconnected = 0;
 	if (active)
-		peer->ibp_connecting--;
+		peer_ni->ibp_connecting--;
 	else
-		peer->ibp_accepting--;
+		peer_ni->ibp_accepting--;
 
-	if (!peer->ibp_version) {
-		peer->ibp_version = conn->ibc_version;
-		peer->ibp_incarnation = conn->ibc_incarnation;
+	if (!peer_ni->ibp_version) {
+		peer_ni->ibp_version = conn->ibc_version;
+		peer_ni->ibp_incarnation = conn->ibc_incarnation;
 	}
 
-	if (peer->ibp_version != conn->ibc_version ||
-	    peer->ibp_incarnation != conn->ibc_incarnation) {
-		kiblnd_close_stale_conns_locked(peer, conn->ibc_version,
+	if (peer_ni->ibp_version != conn->ibc_version ||
+	    peer_ni->ibp_incarnation != conn->ibc_incarnation) {
+		kiblnd_close_stale_conns_locked(peer_ni, conn->ibc_version,
 						conn->ibc_incarnation);
-		peer->ibp_version = conn->ibc_version;
-		peer->ibp_incarnation = conn->ibc_incarnation;
+		peer_ni->ibp_version = conn->ibc_version;
+		peer_ni->ibp_incarnation = conn->ibc_incarnation;
 	}
 
 	/* grab pending txs while I have the lock */
-	list_add(&txs, &peer->ibp_tx_queue);
-	list_del_init(&peer->ibp_tx_queue);
+	list_add(&txs, &peer_ni->ibp_tx_queue);
+	list_del_init(&peer_ni->ibp_tx_queue);
 
-	if (!kiblnd_peer_active(peer) ||	/* peer has been deleted */
+	if (!kiblnd_peer_active(peer_ni) || /* peer_ni has been deleted */
 	    conn->ibc_comms_error) {		/* error has happened already */
-		struct lnet_ni *ni = peer->ibp_ni;
+		struct lnet_ni *ni = peer_ni->ibp_ni;
 
 		/* start to shut down connection */
 		kiblnd_close_conn_locked(conn, -ECONNABORTED);
@@ -2181,7 +2183,7 @@ kiblnd_connreq_done(struct kib_conn *conn, int status)
 
 	/*
 	 * +1 ref for myself, this connection is visible to other threads
-	 * now, refcount of peer:ibp_conns can be released by connection
+	 * now, refcount of peer_ni:ibp_conns can be released by connection
 	 * close from either a different thread, or the calling of
 	 * kiblnd_check_sends_locked() below. See bz21911 for details.
 	 */
@@ -2227,8 +2229,8 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 	struct kib_msg *reqmsg = priv;
 	struct kib_msg *ackmsg;
 	struct kib_dev *ibdev;
-	struct kib_peer *peer;
-	struct kib_peer *peer2;
+	struct kib_peer_ni *peer_ni;
+	struct kib_peer_ni *peer2;
 	struct kib_conn *conn;
 	struct lnet_ni *ni = NULL;
 	struct kib_net *net = NULL;
@@ -2257,7 +2259,7 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 	    ntohs(peer_addr->sin_port) >= PROT_SOCK) {
 		__u32 ip = ntohl(peer_addr->sin_addr.s_addr);
 
-		CERROR("Peer's port (%pI4h:%hu) is not privileged\n",
+		CERROR("peer_ni's port (%pI4h:%hu) is not privileged\n",
 		       &ip, ntohs(peer_addr->sin_port));
 		goto failed;
 	}
@@ -2272,7 +2274,7 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 	 * o2iblnd-specific protocol changes, or when LNET unifies
 	 * protocols over all LNDs, the initial connection will
 	 * negotiate a protocol version. I trap this here to avoid
-	 * console errors; the reject tells the peer which protocol I
+	 * console errors; the reject tells the peer_ni which protocol I
 	 * speak.
 	 */
 	if (reqmsg->ibm_magic == LNET_PROTO_MAGIC ||
@@ -2322,7 +2324,7 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 		goto failed;
 	}
 
-	/* I can accept peer's version */
+	/* I can accept peer_ni's version */
 	version = reqmsg->ibm_version;
 
 	if (reqmsg->ibm_type != IBLND_MSG_CONNREQ) {
@@ -2374,17 +2376,17 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 		goto failed;
 	}
 
-	/* assume 'nid' is a new peer; create */
-	rc = kiblnd_create_peer(ni, &peer, nid);
+	/* assume 'nid' is a new peer_ni; create */
+	rc = kiblnd_create_peer(ni, &peer_ni, nid);
 	if (rc) {
-		CERROR("Can't create peer for %s\n", libcfs_nid2str(nid));
+		CERROR("Can't create peer_ni for %s\n", libcfs_nid2str(nid));
 		rej.ibr_why = IBLND_REJECT_NO_RESOURCES;
 		goto failed;
 	}
 
-	/* We have validated the peer's parameters so use those */
-	peer->ibp_max_frags = max_frags;
-	peer->ibp_queue_depth = reqmsg->ibm_u.connparams.ibcp_queue_depth;
+	/* We have validated the peer_ni's parameters so use those */
+	peer_ni->ibp_max_frags = max_frags;
+	peer_ni->ibp_queue_depth = reqmsg->ibm_u.connparams.ibcp_queue_depth;
 
 	write_lock_irqsave(g_lock, flags);
 
@@ -2410,7 +2412,7 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 				libcfs_nid2str(nid), peer2->ibp_version, version,
 				peer2->ibp_incarnation, reqmsg->ibm_srcstamp);
 
-			kiblnd_peer_decref(peer);
+			kiblnd_peer_decref(peer_ni);
 			rej.ibr_why = IBLND_REJECT_CONN_STALE;
 			goto failed;
 		}
@@ -2432,7 +2434,7 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 				CDEBUG(D_NET, "Conn race %s\n",
 				       libcfs_nid2str(peer2->ibp_nid));
 
-				kiblnd_peer_decref(peer);
+				kiblnd_peer_decref(peer_ni);
 				rej.ibr_why = IBLND_REJECT_CONN_RACE;
 				goto failed;
 			}
@@ -2440,9 +2442,9 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 			CNETERR("Conn race %s: unresolved after %d attempts, letting lower NID win\n",
 				libcfs_nid2str(peer2->ibp_nid),
 				MAX_CONN_RACES_BEFORE_ABORT);
-		/**
-		 * passive connection is allowed even this peer is waiting for
-		 * reconnection.
+		/*
+		 * passive connection is allowed even this peer_ni is
+		 * waiting for reconnection.
 		 */
 		peer2->ibp_reconnecting = 0;
 		peer2->ibp_races = 0;
@@ -2452,38 +2454,38 @@ kiblnd_passive_connect(struct rdma_cm_id *cmid, void *priv, int priv_nob)
 		/**
 		 * Race with kiblnd_launch_tx (active connect) to create peer
 		 * so copy validated parameters since we now know what the
-		 * peer's limits are
+		 * peer_ni's limits are
 		 */
-		peer2->ibp_max_frags = peer->ibp_max_frags;
-		peer2->ibp_queue_depth = peer->ibp_queue_depth;
+		peer2->ibp_max_frags = peer_ni->ibp_max_frags;
+		peer2->ibp_queue_depth = peer_ni->ibp_queue_depth;
 
 		write_unlock_irqrestore(g_lock, flags);
-		kiblnd_peer_decref(peer);
-		peer = peer2;
+		kiblnd_peer_decref(peer_ni);
+		peer_ni = peer2;
 	} else {
-		/* Brand new peer */
-		LASSERT(!peer->ibp_accepting);
-		LASSERT(!peer->ibp_version &&
-			!peer->ibp_incarnation);
+		/* Brand new peer_ni */
+		LASSERT(!peer_ni->ibp_accepting);
+		LASSERT(!peer_ni->ibp_version &&
+			!peer_ni->ibp_incarnation);
 
-		peer->ibp_accepting = 1;
-		peer->ibp_version = version;
-		peer->ibp_incarnation = reqmsg->ibm_srcstamp;
+		peer_ni->ibp_accepting = 1;
+		peer_ni->ibp_version = version;
+		peer_ni->ibp_incarnation = reqmsg->ibm_srcstamp;
 
 		/* I have a ref on ni that prevents it being shutdown */
 		LASSERT(!net->ibn_shutdown);
 
-		kiblnd_peer_addref(peer);
-		list_add_tail(&peer->ibp_list, kiblnd_nid2peerlist(nid));
+		kiblnd_peer_addref(peer_ni);
+		list_add_tail(&peer_ni->ibp_list, kiblnd_nid2peerlist(nid));
 
 		write_unlock_irqrestore(g_lock, flags);
 	}
 
-	conn = kiblnd_create_conn(peer, cmid, IBLND_CONN_PASSIVE_WAIT,
+	conn = kiblnd_create_conn(peer_ni, cmid, IBLND_CONN_PASSIVE_WAIT,
 				  version);
 	if (!conn) {
-		kiblnd_peer_connect_failed(peer, 0, -ENOMEM);
-		kiblnd_peer_decref(peer);
+		kiblnd_peer_connect_failed(peer_ni, 0, -ENOMEM);
+		kiblnd_peer_decref(peer_ni);
 		rej.ibr_why = IBLND_REJECT_NO_RESOURCES;
 		goto failed;
 	}
@@ -2552,7 +2554,7 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version,
 		       __u64 incarnation, int why, struct kib_connparams *cp)
 {
 	rwlock_t *glock = &kiblnd_data.kib_global_lock;
-	struct kib_peer *peer = conn->ibc_peer;
+	struct kib_peer_ni *peer_ni = conn->ibc_peer;
 	char *reason;
 	int msg_size = IBLND_MSG_SIZE;
 	int frag_num = -1;
@@ -2561,7 +2563,7 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version,
 	unsigned long flags;
 
 	LASSERT(conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT);
-	LASSERT(peer->ibp_connecting > 0);	/* 'conn' at least */
+	LASSERT(peer_ni->ibp_connecting > 0);	/* 'conn' at least */
 
 	if (cp) {
 		msg_size = cp->ibcp_max_msg_size;
@@ -2577,10 +2579,10 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version,
 	 * empty if ibp_version != version because reconnect may be
 	 * initiated by kiblnd_query()
 	 */
-	reconnect = (!list_empty(&peer->ibp_tx_queue) ||
-		     peer->ibp_version != version) &&
-		    peer->ibp_connecting &&
-		    !peer->ibp_accepting;
+	reconnect = (!list_empty(&peer_ni->ibp_tx_queue) ||
+		     peer_ni->ibp_version != version) &&
+		    peer_ni->ibp_connecting &&
+		    !peer_ni->ibp_accepting;
 	if (!reconnect) {
 		reason = "no need";
 		goto out;
 	}
@@ -2598,7 +2600,7 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version,
 			reason = "can't negotiate max frags";
 			goto out;
 		}
-		tunables = &peer->ibp_ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib;
+		tunables = &peer_ni->ibp_ni->ni_lnd_tunables.lnd_tun_u.lnd_o2ib;
 		if (!tunables->lnd_map_on_demand) {
 			reason = "map_on_demand must be enabled";
 			goto out;
 		}
@@ -2608,7 +2610,7 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version,
 			goto out;
 		}
 
-		peer->ibp_max_frags = frag_num;
+		peer_ni->ibp_max_frags = frag_num;
 		reason = "rdma fragments";
 		break;
 	}
@@ -2622,7 +2624,7 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version,
 			goto out;
 		}
 
-		peer->ibp_queue_depth = queue_dep;
+		peer_ni->ibp_queue_depth = queue_dep;
 		reason = "queue depth";
 		break;
 
@@ -2640,15 +2642,15 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version,
 	}
 
 	conn->ibc_reconnect = 1;
-	peer->ibp_reconnecting++;
-	peer->ibp_version = version;
+	peer_ni->ibp_reconnecting++;
+	peer_ni->ibp_version = version;
 	if (incarnation)
-		peer->ibp_incarnation = incarnation;
+		peer_ni->ibp_incarnation = incarnation;
 out:
 	write_unlock_irqrestore(glock, flags);
 
 	CNETERR("%s: %s (%s), %x, %x, msg_size: %d, queue_depth: %d/%d, max_frags: %d/%d\n",
-		libcfs_nid2str(peer->ibp_nid),
+		libcfs_nid2str(peer_ni->ibp_nid),
 		reconnect ? "reconnect" : "don't reconnect",
 		reason, IBLND_MSG_VERSION, version, msg_size,
 		conn->ibc_queue_depth, queue_dep,
@@ -2662,7 +2664,7 @@ kiblnd_check_reconnect(struct kib_conn *conn, int version,
 static void
 kiblnd_rejected(struct kib_conn *conn, int reason, void *priv, int priv_nob)
 {
-	struct kib_peer *peer = conn->ibc_peer;
+	struct kib_peer_ni *peer_ni = conn->ibc_peer;
 
 	LASSERT(!in_interrupt());
 	LASSERT(conn->ibc_state == IBLND_CONN_ACTIVE_CONNECT);
@@ -2675,7 +2677,7 @@ kiblnd_rejected(struct kib_conn *conn, int reason, void *priv, int priv_nob)
 
 	case IB_CM_REJ_INVALID_SERVICE_ID:
 		CNETERR("%s rejected: no listener at %d\n",
-			libcfs_nid2str(peer->ibp_nid),
+			libcfs_nid2str(peer_ni->ibp_nid),
 			*kiblnd_tunables.kib_service);
 		break;
 
@@ -2691,7 +2693,7 @@ kiblnd_rejected(struct kib_conn *conn, int reason, void *priv, int priv_nob)
 		 * b) V2 will provide incarnation while rejecting me,
 		 *    -1 will be overwrote.
 		 *
-		 * if I try to connect to a V1 peer with V2 protocol,
+		 * if I try to connect to a V1 peer_ni with V2 protocol,
 		 * it rejected me then upgrade to V2, I have no idea
 		 * about the upgrading and try to reconnect with V1,
 		 * in this case upgraded V2 can find out I'm trying to
@@ -2727,22 +2729,24 @@ kiblnd_rejected(struct kib_conn *conn, int reason, void *priv, int priv_nob)
 		if (rej->ibr_magic != IBLND_MSG_MAGIC &&
 		    rej->ibr_magic != LNET_PROTO_MAGIC) {
 			CERROR("%s rejected: consumer defined fatal error\n",
-			       libcfs_nid2str(peer->ibp_nid));
+			       libcfs_nid2str(peer_ni->ibp_nid));
 			break;
 		}
 
 		if (rej->ibr_version != IBLND_MSG_VERSION &&
 		    rej->ibr_version != IBLND_MSG_VERSION_1) {
 			CERROR("%s rejected: o2iblnd version %x error\n",
-			       libcfs_nid2str(peer->ibp_nid),
+			       libcfs_nid2str(peer_ni->ibp_nid),
 			       rej->ibr_version);
 			break;
 		}
 
 		if (rej->ibr_why == IBLND_REJECT_FATAL &&
 		    rej->ibr_version == IBLND_MSG_VERSION_1) {
-			CDEBUG(D_NET, "rejected by old version peer %s: %x\n",
-			       libcfs_nid2str(peer->ibp_nid), rej->ibr_version);
+			CDEBUG(D_NET,
+			       "rejected by old version peer_ni %s: %x\n",
+			       libcfs_nid2str(peer_ni->ibp_nid),
+			       rej->ibr_version);
 
 			if (conn->ibc_version != IBLND_MSG_VERSION_1)
 				rej->ibr_why = IBLND_REJECT_CONN_UNCOMPAT;
@@ -2761,17 +2765,17 @@ kiblnd_rejected(struct kib_conn *conn, int reason, void *priv, int priv_nob)
 
 		case IBLND_REJECT_NO_RESOURCES:
 			CERROR("%s rejected: o2iblnd no resources\n",
-			       libcfs_nid2str(peer->ibp_nid));
+			       libcfs_nid2str(peer_ni->ibp_nid));
 			break;
 
 		case IBLND_REJECT_FATAL:
 			CERROR("%s rejected: o2iblnd fatal error\n",
-			       libcfs_nid2str(peer->ibp_nid));
+			       libcfs_nid2str(peer_ni->ibp_nid));
 			break;
 
 		default:
 			CERROR("%s rejected: o2iblnd reason %d\n",
-			       libcfs_nid2str(peer->ibp_nid),
+			       libcfs_nid2str(peer_ni->ibp_nid),
 			       rej->ibr_why);
 			break;
 		}
@@ -2780,7 +2784,7 @@ kiblnd_rejected(struct kib_conn *conn, int reason, void *priv, int priv_nob)
 		/* fall through */
 	default:
 		CNETERR("%s rejected: reason %d, size %d\n",
-			libcfs_nid2str(peer->ibp_nid), reason, priv_nob);
+			libcfs_nid2str(peer_ni->ibp_nid), reason, priv_nob);
 		break;
 	}
 
@@ -2790,8 +2794,8 @@ kiblnd_rejected(struct kib_conn *conn, int reason, void *priv, int priv_nob)
 static void
 kiblnd_check_connreply(struct kib_conn *conn, void *priv, int priv_nob)
 {
-	struct kib_peer *peer = conn->ibc_peer;
-	struct lnet_ni *ni = peer->ibp_ni;
+	struct kib_peer_ni *peer_ni = conn->ibc_peer;
+	struct lnet_ni *ni = peer_ni->ibp_ni;
 	struct kib_net *net = ni->ni_data;
 	struct kib_msg *msg = priv;
 	int ver = conn->ibc_version;
@@ -2802,20 +2806,20 @@ kiblnd_check_connreply(struct kib_conn *conn, void *priv, int priv_nob)
 
 	if (rc) {
 		CERROR("Can't unpack connack from %s: %d\n",
-		       libcfs_nid2str(peer->ibp_nid), rc);
+		       libcfs_nid2str(peer_ni->ibp_nid), rc);
 		goto failed;
 	}
 
 	if (msg->ibm_type != IBLND_MSG_CONNACK) {
 		CERROR("Unexpected message %d from %s\n",
-		       msg->ibm_type, libcfs_nid2str(peer->ibp_nid));
+		       msg->ibm_type, libcfs_nid2str(peer_ni->ibp_nid));
 		rc = -EPROTO;
 		goto failed;
 	}
 
 	if (ver != msg->ibm_version) {
 		CERROR("%s replied version %x is different with requested version %x\n",
-		       libcfs_nid2str(peer->ibp_nid), msg->ibm_version, ver);
+		       libcfs_nid2str(peer_ni->ibp_nid), msg->ibm_version, ver);
 		rc = -EPROTO;
 		goto failed;
 	}
@@ -2823,7 +2827,7 @@ kiblnd_check_connreply(struct kib_conn *conn, void *priv, int priv_nob)
 	if (msg->ibm_u.connparams.ibcp_queue_depth >
 	    conn->ibc_queue_depth) {
 		CERROR("%s has incompatible queue depth %d (<=%d wanted)\n",
-		       libcfs_nid2str(peer->ibp_nid),
+		       libcfs_nid2str(peer_ni->ibp_nid),
 		       msg->ibm_u.connparams.ibcp_queue_depth,
 		       conn->ibc_queue_depth);
 		rc = -EPROTO;
@@ -2833,7 +2837,7 @@ kiblnd_check_connreply(struct kib_conn *conn, void *priv, int priv_nob)
 	if ((msg->ibm_u.connparams.ibcp_max_frags >> IBLND_FRAG_SHIFT) >
 	    conn->ibc_max_frags) {
 		CERROR("%s has incompatible max_frags %d (<=%d wanted)\n",
-		       libcfs_nid2str(peer->ibp_nid),
+		       libcfs_nid2str(peer_ni->ibp_nid),
 		       msg->ibm_u.connparams.ibcp_max_frags >> IBLND_FRAG_SHIFT,
 		       conn->ibc_max_frags);
 		rc = -EPROTO;
@@ -2842,7 +2846,7 @@ kiblnd_check_connreply(struct kib_conn *conn, void *priv, int priv_nob)
 
 	if (msg->ibm_u.connparams.ibcp_max_msg_size > IBLND_MSG_SIZE) {
 		CERROR("%s max message size %d too big (%d max)\n",
-		       libcfs_nid2str(peer->ibp_nid),
+		       libcfs_nid2str(peer_ni->ibp_nid),
 		       msg->ibm_u.connparams.ibcp_max_msg_size,
 		       IBLND_MSG_SIZE);
 		rc = -EPROTO;
@@ -2859,7 +2863,7 @@ kiblnd_check_connreply(struct kib_conn *conn, void *priv, int priv_nob)
 
 	if (rc) {
 		CERROR("Bad connection reply from %s, rc = %d, version: %x max_frags: %d\n",
-		       libcfs_nid2str(peer->ibp_nid), rc,
+		       libcfs_nid2str(peer_ni->ibp_nid), rc,
 		       msg->ibm_version, msg->ibm_u.connparams.ibcp_max_frags);
 		goto failed;
 	}
@@ -2890,7 +2894,7 @@ kiblnd_check_connreply(struct kib_conn *conn, void *priv, int priv_nob)
 static int
 kiblnd_active_connect(struct rdma_cm_id *cmid)
 {
-	struct kib_peer *peer = (struct kib_peer *)cmid->context;
+	struct kib_peer_ni *peer_ni = (struct kib_peer_ni *)cmid->context;
 	struct kib_conn *conn;
 	struct kib_msg *msg;
 	struct rdma_conn_param cp;
@@ -2901,17 +2905,17 @@ kiblnd_active_connect(struct rdma_cm_id *cmid)
 
 	read_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
 
-	incarnation = peer->ibp_incarnation;
-	version = !peer->ibp_version ? IBLND_MSG_VERSION :
-		  peer->ibp_version;
+	incarnation = peer_ni->ibp_incarnation;
+	version = !peer_ni->ibp_version ? IBLND_MSG_VERSION :
+		  peer_ni->ibp_version;
 
 	read_unlock_irqrestore(&kiblnd_data.kib_global_lock, flags);
 
-	conn = kiblnd_create_conn(peer, cmid, IBLND_CONN_ACTIVE_CONNECT,
+	conn = kiblnd_create_conn(peer_ni, cmid, IBLND_CONN_ACTIVE_CONNECT,
 				  version);
 	if (!conn) {
-		kiblnd_peer_connect_failed(peer, 1, -ENOMEM);
-		kiblnd_peer_decref(peer); /* lose cmid's ref */
+		kiblnd_peer_connect_failed(peer_ni, 1, -ENOMEM);
+		kiblnd_peer_decref(peer_ni); /* lose cmid's ref */
 		return -ENOMEM;
 	}
@@ -2928,8 +2932,8 @@ kiblnd_active_connect(struct rdma_cm_id *cmid)
 	msg->ibm_u.connparams.ibcp_max_frags = conn->ibc_max_frags << IBLND_FRAG_SHIFT;
 	msg->ibm_u.connparams.ibcp_max_msg_size = IBLND_MSG_SIZE;
 
-	kiblnd_pack_msg(peer->ibp_ni, msg, version,
-			0, peer->ibp_nid, incarnation);
+	kiblnd_pack_msg(peer_ni->ibp_ni, msg, version,
+			0, peer_ni->ibp_nid, incarnation);
 
 	memset(&cp, 0, sizeof(cp));
 	cp.private_data = msg;
@@ -2946,7 +2950,7 @@ kiblnd_active_connect(struct rdma_cm_id *cmid)
 	rc = rdma_connect(cmid, &cp);
 	if (rc) {
 		CERROR("Can't connect to %s: %d\n",
-		       libcfs_nid2str(peer->ibp_nid), rc);
+		       libcfs_nid2str(peer_ni->ibp_nid), rc);
 		kiblnd_connreq_done(conn, rc);
 		kiblnd_conn_decref(conn);
 	}
@@ -2957,7 +2961,7 @@ kiblnd_active_connect(struct rdma_cm_id *cmid)
 int
 kiblnd_cm_callback(struct rdma_cm_id *cmid, struct rdma_cm_event *event)
 {
-	struct kib_peer *peer;
+	struct kib_peer_ni *peer_ni;
 	struct kib_conn *conn;
 	int rc;
@@ -2976,33 +2980,34 @@ kiblnd_cm_callback(struct rdma_cm_id *cmid, struct rdma_cm_event *event)
 		return rc;
 
 	case RDMA_CM_EVENT_ADDR_ERROR:
-		peer = (struct kib_peer *)cmid->context;
+		peer_ni = (struct kib_peer_ni *)cmid->context;
 		CNETERR("%s: ADDR ERROR %d\n",
-			libcfs_nid2str(peer->ibp_nid), event->status);
-		kiblnd_peer_connect_failed(peer, 1, -EHOSTUNREACH);
-		kiblnd_peer_decref(peer);
+			libcfs_nid2str(peer_ni->ibp_nid), event->status);
+		kiblnd_peer_connect_failed(peer_ni, 1, -EHOSTUNREACH);
+		kiblnd_peer_decref(peer_ni);
 		return -EHOSTUNREACH;	/* rc destroys cmid */
 
 	case RDMA_CM_EVENT_ADDR_RESOLVED:
-		peer = (struct kib_peer *)cmid->context;
+		peer_ni = (struct kib_peer_ni *)cmid->context;
 
 		CDEBUG(D_NET, "%s Addr resolved: %d\n",
-		       libcfs_nid2str(peer->ibp_nid), event->status);
+		       libcfs_nid2str(peer_ni->ibp_nid), event->status);
 
 		if (event->status) {
 			CNETERR("Can't resolve address for %s: %d\n",
-				libcfs_nid2str(peer->ibp_nid), event->status);
+				libcfs_nid2str(peer_ni->ibp_nid),
				event->status);
 			rc = event->status;
 		} else {
 			rc = rdma_resolve_route(
 				cmid, *kiblnd_tunables.kib_timeout * 1000);
 			if (!rc) {
-				struct kib_net *net = peer->ibp_ni->ni_data;
+				struct kib_net *net = peer_ni->ibp_ni->ni_data;
 				struct kib_dev *dev = net->ibn_dev;
 
 				CDEBUG(D_NET, "%s: connection bound to "\
 				       "%s:%pI4h:%s\n",
-				       libcfs_nid2str(peer->ibp_nid),
+				       libcfs_nid2str(peer_ni->ibp_nid),
 				       dev->ibd_ifname,
 				       &dev->ibd_ifip, cmid->device->name);
 
@@ -3011,32 +3016,32 @@ kiblnd_cm_callback(struct rdma_cm_id *cmid, struct rdma_cm_event *event)
 			/* Can't initiate route resolution */
 			CERROR("Can't resolve route for %s: %d\n",
-			       libcfs_nid2str(peer->ibp_nid), rc);
+			       libcfs_nid2str(peer_ni->ibp_nid), rc);
 		}
-		kiblnd_peer_connect_failed(peer, 1, rc);
-		kiblnd_peer_decref(peer);
+		kiblnd_peer_connect_failed(peer_ni, 1, rc);
+		kiblnd_peer_decref(peer_ni);
 		return rc;		/* rc destroys cmid */
 
 	case RDMA_CM_EVENT_ROUTE_ERROR:
-		peer = (struct kib_peer *)cmid->context;
+		peer_ni = (struct kib_peer_ni *)cmid->context;
 		CNETERR("%s: ROUTE ERROR %d\n",
-			libcfs_nid2str(peer->ibp_nid), event->status);
-		kiblnd_peer_connect_failed(peer, 1, -EHOSTUNREACH);
-		kiblnd_peer_decref(peer);
+			libcfs_nid2str(peer_ni->ibp_nid), event->status);
+		kiblnd_peer_connect_failed(peer_ni, 1, -EHOSTUNREACH);
+		kiblnd_peer_decref(peer_ni);
 		return -EHOSTUNREACH;	/* rc destroys cmid */
 
 	case RDMA_CM_EVENT_ROUTE_RESOLVED:
-		peer = (struct kib_peer *)cmid->context;
+		peer_ni = (struct kib_peer_ni *)cmid->context;
 		CDEBUG(D_NET, "%s Route resolved: %d\n",
-		       libcfs_nid2str(peer->ibp_nid), event->status);
+		       libcfs_nid2str(peer_ni->ibp_nid), event->status);
 
 		if (!event->status)
 			return kiblnd_active_connect(cmid);
 
 		CNETERR("Can't resolve route for %s: %d\n",
-			libcfs_nid2str(peer->ibp_nid), event->status);
-		kiblnd_peer_connect_failed(peer, 1, event->status);
-		kiblnd_peer_decref(peer);
+			libcfs_nid2str(peer_ni->ibp_nid), event->status);
+		kiblnd_peer_connect_failed(peer_ni, 1, event->status);
+		kiblnd_peer_decref(peer_ni);
 		return event->status;	/* rc destroys cmid */
 
	case RDMA_CM_EVENT_UNREACHABLE:
@@ -3177,7 +3182,7 @@ kiblnd_check_conns(int idx)
 	LIST_HEAD(closes);
 	LIST_HEAD(checksends);
 	struct list_head *peers = &kiblnd_data.kib_peers[idx];
-	struct kib_peer *peer;
+	struct kib_peer_ni *peer_ni;
 	struct kib_conn *conn;
 	unsigned long flags;
@@ -3188,9 +3193,9 @@ kiblnd_check_conns(int idx)
 	 */
 	read_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
 
-	list_for_each_entry(peer, peers, ibp_list) {
+	list_for_each_entry(peer_ni, peers, ibp_list) {
 
-		list_for_each_entry(conn, &peer->ibp_conns, ibc_list) {
+		list_for_each_entry(conn, &peer_ni->ibp_conns, ibc_list) {
 			int timedout;
 			int sendnoop;
@@ -3207,8 +3212,9 @@ kiblnd_check_conns(int idx)
 			if (timedout) {
 				CERROR("Timed out RDMA with %s (%lld): c: %u, oc: %u, rc: %u\n",
-				       libcfs_nid2str(peer->ibp_nid),
-				       ktime_get_seconds() - peer->ibp_last_alive,
+				       libcfs_nid2str(peer_ni->ibp_nid),
+				       (ktime_get_seconds() -
+					peer_ni->ibp_last_alive),
 				       conn->ibc_credits,
 				       conn->ibc_outstanding_credits,
 				       conn->ibc_reserved_credits);
@@ -3268,7 +3274,7 @@ kiblnd_disconnect_conn(struct kib_conn *conn)
 }
 
 /**
- * High-water for reconnection to the same peer, reconnection attempt should
+ * High-water for reconnection to the same peer_ni, reconnection attempt should
 * be delayed after trying more than KIB_RECONN_HIGH_RACE.
 */
 #define KIB_RECONN_HIGH_RACE 10
@@ -3302,14 +3308,14 @@ kiblnd_connd(void *arg)
 		dropped_lock = 0;
 
 		if (!list_empty(&kiblnd_data.kib_connd_zombies)) {
-			struct kib_peer *peer = NULL;
+			struct kib_peer_ni *peer_ni = NULL;
 
 			conn = list_entry(kiblnd_data.kib_connd_zombies.next,
 					  struct kib_conn, ibc_list);
 			list_del(&conn->ibc_list);
 			if (conn->ibc_reconnect) {
-				peer = conn->ibc_peer;
-				kiblnd_peer_addref(peer);
+				peer_ni = conn->ibc_peer;
+				kiblnd_peer_addref(peer_ni);
 			}
 
 			spin_unlock_irqrestore(lock, flags);
@@ -3318,13 +3324,13 @@ kiblnd_connd(void *arg)
 			kiblnd_destroy_conn(conn);
 
 			spin_lock_irqsave(lock, flags);
-			if (!peer) {
+			if (!peer_ni) {
 				kfree(conn);
 				continue;
 			}
 
-			conn->ibc_peer = peer;
-			if (peer->ibp_reconnected < KIB_RECONN_HIGH_RACE)
+			conn->ibc_peer = peer_ni;
+			if (peer_ni->ibp_reconnected < KIB_RECONN_HIGH_RACE)
 				list_add_tail(&conn->ibc_list,
 					      &kiblnd_data.kib_reconn_list);
 			else
@@ -3384,7 +3390,7 @@ kiblnd_connd(void *arg)
 		/*
 		 * Time to check for RDMA timeouts on a few more
 		 * peers: I do checks every 'p' seconds on a
-		 * proportion of the peer table and I need to check
+		 * proportion of the peer_ni table and I need to check
 		 * every connection 'n' times within a timeout
 		 * interval, to ensure I detect a timeout on any
 		 * connection within (n+1)/n times the timeout
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
index ba1ec35a017a..c14711804d7b 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
@@ -104,38 +104,38 @@ ksocknal_create_peer(struct ksock_peer **peerp, struct lnet_ni *ni,
 {
 	int cpt = lnet_cpt_of_nid(id.nid, ni);
 	struct ksock_net *net = ni->ni_data;
-	struct ksock_peer *peer;
+	struct ksock_peer *peer_ni;
 
 	LASSERT(id.nid != LNET_NID_ANY);
 	LASSERT(id.pid != LNET_PID_ANY);
 	LASSERT(!in_interrupt());
 
-	peer = kzalloc_cpt(sizeof(*peer), GFP_NOFS, cpt);
-	if (!peer)
+	peer_ni = kzalloc_cpt(sizeof(*peer_ni), GFP_NOFS,
cpt); + if (!peer_ni) return -ENOMEM; - peer->ksnp_ni = ni; - peer->ksnp_id = id; - atomic_set(&peer->ksnp_refcount, 1); /* 1 ref for caller */ - peer->ksnp_closing = 0; - peer->ksnp_accepting = 0; - peer->ksnp_proto = NULL; - peer->ksnp_last_alive = 0; - peer->ksnp_zc_next_cookie = SOCKNAL_KEEPALIVE_PING + 1; - - INIT_LIST_HEAD(&peer->ksnp_conns); - INIT_LIST_HEAD(&peer->ksnp_routes); - INIT_LIST_HEAD(&peer->ksnp_tx_queue); - INIT_LIST_HEAD(&peer->ksnp_zc_req_list); - spin_lock_init(&peer->ksnp_lock); + peer_ni->ksnp_ni = ni; + peer_ni->ksnp_id = id; + atomic_set(&peer_ni->ksnp_refcount, 1); /* 1 ref for caller */ + peer_ni->ksnp_closing = 0; + peer_ni->ksnp_accepting = 0; + peer_ni->ksnp_proto = NULL; + peer_ni->ksnp_last_alive = 0; + peer_ni->ksnp_zc_next_cookie = SOCKNAL_KEEPALIVE_PING + 1; + + INIT_LIST_HEAD(&peer_ni->ksnp_conns); + INIT_LIST_HEAD(&peer_ni->ksnp_routes); + INIT_LIST_HEAD(&peer_ni->ksnp_tx_queue); + INIT_LIST_HEAD(&peer_ni->ksnp_zc_req_list); + spin_lock_init(&peer_ni->ksnp_lock); spin_lock_bh(&net->ksnn_lock); if (net->ksnn_shutdown) { spin_unlock_bh(&net->ksnn_lock); - kfree(peer); - CERROR("Can't create peer: network shutdown\n"); + kfree(peer_ni); + CERROR("Can't create peer_ni: network shutdown\n"); return -ESHUTDOWN; } @@ -143,31 +143,31 @@ ksocknal_create_peer(struct ksock_peer **peerp, struct lnet_ni *ni, spin_unlock_bh(&net->ksnn_lock); - *peerp = peer; + *peerp = peer_ni; return 0; } void -ksocknal_destroy_peer(struct ksock_peer *peer) +ksocknal_destroy_peer(struct ksock_peer *peer_ni) { - struct ksock_net *net = peer->ksnp_ni->ni_data; + struct ksock_net *net = peer_ni->ksnp_ni->ni_data; - CDEBUG(D_NET, "peer %s %p deleted\n", - libcfs_id2str(peer->ksnp_id), peer); + CDEBUG(D_NET, "peer_ni %s %p deleted\n", + libcfs_id2str(peer_ni->ksnp_id), peer_ni); - LASSERT(!atomic_read(&peer->ksnp_refcount)); - LASSERT(!peer->ksnp_accepting); - LASSERT(list_empty(&peer->ksnp_conns)); - LASSERT(list_empty(&peer->ksnp_routes)); - 
LASSERT(list_empty(&peer->ksnp_tx_queue)); - LASSERT(list_empty(&peer->ksnp_zc_req_list)); + LASSERT(!atomic_read(&peer_ni->ksnp_refcount)); + LASSERT(!peer_ni->ksnp_accepting); + LASSERT(list_empty(&peer_ni->ksnp_conns)); + LASSERT(list_empty(&peer_ni->ksnp_routes)); + LASSERT(list_empty(&peer_ni->ksnp_tx_queue)); + LASSERT(list_empty(&peer_ni->ksnp_zc_req_list)); - kfree(peer); + kfree(peer_ni); /* - * NB a peer's connections and routes keep a reference on their peer + * NB a peer_ni's connections and routes keep a reference on their peer * until they are destroyed, so we can be assured that _all_ state to - * do with this peer has been cleaned up when its refcount drops to + * do with this peer_ni has been cleaned up when its refcount drops to * zero. */ spin_lock_bh(&net->ksnn_lock); @@ -179,22 +179,22 @@ struct ksock_peer * ksocknal_find_peer_locked(struct lnet_ni *ni, struct lnet_process_id id) { struct list_head *peer_list = ksocknal_nid2peerlist(id.nid); - struct ksock_peer *peer; + struct ksock_peer *peer_ni; - list_for_each_entry(peer, peer_list, ksnp_list) { - LASSERT(!peer->ksnp_closing); + list_for_each_entry(peer_ni, peer_list, ksnp_list) { + LASSERT(!peer_ni->ksnp_closing); - if (peer->ksnp_ni != ni) + if (peer_ni->ksnp_ni != ni) continue; - if (peer->ksnp_id.nid != id.nid || - peer->ksnp_id.pid != id.pid) + if (peer_ni->ksnp_id.nid != id.nid || + peer_ni->ksnp_id.pid != id.pid) continue; - CDEBUG(D_NET, "got peer [%p] -> %s (%d)\n", - peer, libcfs_id2str(id), - atomic_read(&peer->ksnp_refcount)); - return peer; + CDEBUG(D_NET, "got peer_ni [%p] -> %s (%d)\n", + peer_ni, libcfs_id2str(id), + atomic_read(&peer_ni->ksnp_refcount)); + return peer_ni; } return NULL; } @@ -202,47 +202,47 @@ ksocknal_find_peer_locked(struct lnet_ni *ni, struct lnet_process_id id) struct ksock_peer * ksocknal_find_peer(struct lnet_ni *ni, struct lnet_process_id id) { - struct ksock_peer *peer; + struct ksock_peer *peer_ni; read_lock(&ksocknal_data.ksnd_global_lock); - peer 
= ksocknal_find_peer_locked(ni, id); - if (peer) /* +1 ref for caller? */ - ksocknal_peer_addref(peer); + peer_ni = ksocknal_find_peer_locked(ni, id); + if (peer_ni) /* +1 ref for caller? */ + ksocknal_peer_addref(peer_ni); read_unlock(&ksocknal_data.ksnd_global_lock); - return peer; + return peer_ni; } static void -ksocknal_unlink_peer_locked(struct ksock_peer *peer) +ksocknal_unlink_peer_locked(struct ksock_peer *peer_ni) { int i; __u32 ip; struct ksock_interface *iface; - for (i = 0; i < peer->ksnp_n_passive_ips; i++) { + for (i = 0; i < peer_ni->ksnp_n_passive_ips; i++) { LASSERT(i < LNET_MAX_INTERFACES); - ip = peer->ksnp_passive_ips[i]; + ip = peer_ni->ksnp_passive_ips[i]; - iface = ksocknal_ip2iface(peer->ksnp_ni, ip); + iface = ksocknal_ip2iface(peer_ni->ksnp_ni, ip); /* - * All IPs in peer->ksnp_passive_ips[] come from the + * All IPs in peer_ni->ksnp_passive_ips[] come from the * interface list, therefore the call must succeed. */ LASSERT(iface); - CDEBUG(D_NET, "peer=%p iface=%p ksni_nroutes=%d\n", - peer, iface, iface->ksni_nroutes); + CDEBUG(D_NET, "peer_ni=%p iface=%p ksni_nroutes=%d\n", + peer_ni, iface, iface->ksni_nroutes); iface->ksni_npeers--; } - LASSERT(list_empty(&peer->ksnp_conns)); - LASSERT(list_empty(&peer->ksnp_routes)); - LASSERT(!peer->ksnp_closing); - peer->ksnp_closing = 1; - list_del(&peer->ksnp_list); + LASSERT(list_empty(&peer_ni->ksnp_conns)); + LASSERT(list_empty(&peer_ni->ksnp_routes)); + LASSERT(!peer_ni->ksnp_closing); + peer_ni->ksnp_closing = 1; + list_del(&peer_ni->ksnp_list); /* lose peerlist's ref */ - ksocknal_peer_decref(peer); + ksocknal_peer_decref(peer_ni); } static int @@ -250,7 +250,7 @@ ksocknal_get_peer_info(struct lnet_ni *ni, int index, struct lnet_process_id *id, __u32 *myip, __u32 *peer_ip, int *port, int *conn_count, int *share_count) { - struct ksock_peer *peer; + struct ksock_peer *peer_ni; struct ksock_route *route; int i; int j; @@ -259,17 +259,17 @@ ksocknal_get_peer_info(struct lnet_ni *ni, int index, 
read_lock(&ksocknal_data.ksnd_global_lock); for (i = 0; i < ksocknal_data.ksnd_peer_hash_size; i++) { - list_for_each_entry(peer, &ksocknal_data.ksnd_peers[i], ksnp_list) { - - if (peer->ksnp_ni != ni) + list_for_each_entry(peer_ni, &ksocknal_data.ksnd_peers[i], + ksnp_list) { + if (peer_ni->ksnp_ni != ni) continue; - if (!peer->ksnp_n_passive_ips && - list_empty(&peer->ksnp_routes)) { + if (!peer_ni->ksnp_n_passive_ips && + list_empty(&peer_ni->ksnp_routes)) { if (index-- > 0) continue; - *id = peer->ksnp_id; + *id = peer_ni->ksnp_id; *myip = 0; *peer_ip = 0; *port = 0; @@ -279,12 +279,12 @@ ksocknal_get_peer_info(struct lnet_ni *ni, int index, goto out; } - for (j = 0; j < peer->ksnp_n_passive_ips; j++) { + for (j = 0; j < peer_ni->ksnp_n_passive_ips; j++) { if (index-- > 0) continue; - *id = peer->ksnp_id; - *myip = peer->ksnp_passive_ips[j]; + *id = peer_ni->ksnp_id; + *myip = peer_ni->ksnp_passive_ips[j]; *peer_ip = 0; *port = 0; *conn_count = 0; @@ -293,12 +293,12 @@ ksocknal_get_peer_info(struct lnet_ni *ni, int index, goto out; } - list_for_each_entry(route, &peer->ksnp_routes, + list_for_each_entry(route, &peer_ni->ksnp_routes, ksnr_list) { if (index-- > 0) continue; - *id = peer->ksnp_id; + *id = peer_ni->ksnp_id; *myip = route->ksnr_myipaddr; *peer_ip = route->ksnr_ipaddr; *port = route->ksnr_port; @@ -318,7 +318,7 @@ static void ksocknal_associate_route_conn_locked(struct ksock_route *route, struct ksock_conn *conn) { - struct ksock_peer *peer = route->ksnr_peer; + struct ksock_peer *peer_ni = route->ksnr_peer; int type = conn->ksnc_type; struct ksock_interface *iface; @@ -329,12 +329,12 @@ ksocknal_associate_route_conn_locked(struct ksock_route *route, if (!route->ksnr_myipaddr) { /* route wasn't bound locally yet (the initial route) */ CDEBUG(D_NET, "Binding %s %pI4h to %pI4h\n", - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &route->ksnr_ipaddr, &conn->ksnc_myipaddr); } else { CDEBUG(D_NET, "Rebinding %s %pI4h from %pI4h to 
%pI4h\n", - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &route->ksnr_ipaddr, &route->ksnr_myipaddr, &conn->ksnc_myipaddr); @@ -362,33 +362,33 @@ ksocknal_associate_route_conn_locked(struct ksock_route *route, } static void -ksocknal_add_route_locked(struct ksock_peer *peer, struct ksock_route *route) +ksocknal_add_route_locked(struct ksock_peer *peer_ni, struct ksock_route *route) { struct ksock_conn *conn; struct ksock_route *route2; - LASSERT(!peer->ksnp_closing); + LASSERT(!peer_ni->ksnp_closing); LASSERT(!route->ksnr_peer); LASSERT(!route->ksnr_scheduled); LASSERT(!route->ksnr_connecting); LASSERT(!route->ksnr_connected); /* LASSERT(unique) */ - list_for_each_entry(route2, &peer->ksnp_routes, ksnr_list) { + list_for_each_entry(route2, &peer_ni->ksnp_routes, ksnr_list) { if (route2->ksnr_ipaddr == route->ksnr_ipaddr) { CERROR("Duplicate route %s %pI4h\n", - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &route->ksnr_ipaddr); LBUG(); } } - route->ksnr_peer = peer; - ksocknal_peer_addref(peer); - /* peer's routelist takes over my ref on 'route' */ - list_add_tail(&route->ksnr_list, &peer->ksnp_routes); + route->ksnr_peer = peer_ni; + ksocknal_peer_addref(peer_ni); + /* peer_ni's routelist takes over my ref on 'route' */ + list_add_tail(&route->ksnr_list, &peer_ni->ksnp_routes); - list_for_each_entry(conn, &peer->ksnp_conns, ksnc_list) { + list_for_each_entry(conn, &peer_ni->ksnp_conns, ksnc_list) { if (conn->ksnc_ipaddr != route->ksnr_ipaddr) continue; @@ -400,7 +400,7 @@ ksocknal_add_route_locked(struct ksock_peer *peer, struct ksock_route *route) static void ksocknal_del_route_locked(struct ksock_route *route) { - struct ksock_peer *peer = route->ksnr_peer; + struct ksock_peer *peer_ni = route->ksnr_peer; struct ksock_interface *iface; struct ksock_conn *conn; struct list_head *ctmp; @@ -409,7 +409,7 @@ ksocknal_del_route_locked(struct ksock_route *route) LASSERT(!route->ksnr_deleted); /* Close associated conns */ - 
list_for_each_safe(ctmp, cnxt, &peer->ksnp_conns) { + list_for_each_safe(ctmp, cnxt, &peer_ni->ksnp_conns) { conn = list_entry(ctmp, struct ksock_conn, ksnc_list); if (conn->ksnc_route != route) @@ -427,15 +427,15 @@ ksocknal_del_route_locked(struct ksock_route *route) route->ksnr_deleted = 1; list_del(&route->ksnr_list); - ksocknal_route_decref(route); /* drop peer's ref */ + ksocknal_route_decref(route); /* drop peer_ni's ref */ - if (list_empty(&peer->ksnp_routes) && - list_empty(&peer->ksnp_conns)) { + if (list_empty(&peer_ni->ksnp_routes) && + list_empty(&peer_ni->ksnp_conns)) { /* - * I've just removed the last route to a peer with no active + * I've just removed the last route to a peer_ni with no active * connections */ - ksocknal_unlink_peer_locked(peer); + ksocknal_unlink_peer_locked(peer_ni); } } @@ -443,7 +443,7 @@ int ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id, __u32 ipaddr, int port) { - struct ksock_peer *peer; + struct ksock_peer *peer_ni; struct ksock_peer *peer2; struct ksock_route *route; struct ksock_route *route2; @@ -453,14 +453,14 @@ ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id, __u32 ipaddr, id.pid == LNET_PID_ANY) return -EINVAL; - /* Have a brand new peer ready... */ - rc = ksocknal_create_peer(&peer, ni, id); + /* Have a brand new peer_ni ready... 
*/ + rc = ksocknal_create_peer(&peer_ni, ni, id); if (rc) return rc; route = ksocknal_create_route(ipaddr, port); if (!route) { - ksocknal_peer_decref(peer); + ksocknal_peer_decref(peer_ni); return -ENOMEM; } @@ -471,15 +471,15 @@ ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id, __u32 ipaddr, peer2 = ksocknal_find_peer_locked(ni, id); if (peer2) { - ksocknal_peer_decref(peer); - peer = peer2; + ksocknal_peer_decref(peer_ni); + peer_ni = peer2; } else { - /* peer table takes my ref on peer */ - list_add_tail(&peer->ksnp_list, + /* peer_ni table takes my ref on peer_ni */ + list_add_tail(&peer_ni->ksnp_list, ksocknal_nid2peerlist(id.nid)); } - list_for_each_entry(route2, &peer->ksnp_routes, ksnr_list) { + list_for_each_entry(route2, &peer_ni->ksnp_routes, ksnr_list) { if (route2->ksnr_ipaddr == ipaddr) { /* Route already exists, use the old one */ ksocknal_route_decref(route); @@ -488,7 +488,7 @@ ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id, __u32 ipaddr, } } /* Route doesn't already exist, add the new one */ - ksocknal_add_route_locked(peer, route); + ksocknal_add_route_locked(peer_ni, route); route->ksnr_share_count++; out: write_unlock_bh(&ksocknal_data.ksnd_global_lock); @@ -497,7 +497,7 @@ ksocknal_add_peer(struct lnet_ni *ni, struct lnet_process_id id, __u32 ipaddr, } static void -ksocknal_del_peer_locked(struct ksock_peer *peer, __u32 ip) +ksocknal_del_peer_locked(struct ksock_peer *peer_ni, __u32 ip) { struct ksock_conn *conn; struct ksock_route *route; @@ -505,12 +505,12 @@ ksocknal_del_peer_locked(struct ksock_peer *peer, __u32 ip) struct list_head *nxt; int nshared; - LASSERT(!peer->ksnp_closing); + LASSERT(!peer_ni->ksnp_closing); - /* Extra ref prevents peer disappearing until I'm done with it */ - ksocknal_peer_addref(peer); + /* Extra ref prevents peer_ni disappearing until I'm done with it */ + ksocknal_peer_addref(peer_ni); - list_for_each_safe(tmp, nxt, &peer->ksnp_routes) { + list_for_each_safe(tmp, nxt, 
&peer_ni->ksnp_routes) { route = list_entry(tmp, struct ksock_route, ksnr_list); /* no match */ @@ -523,7 +523,7 @@ ksocknal_del_peer_locked(struct ksock_peer *peer, __u32 ip) } nshared = 0; - list_for_each_safe(tmp, nxt, &peer->ksnp_routes) { + list_for_each_safe(tmp, nxt, &peer_ni->ksnp_routes) { route = list_entry(tmp, struct ksock_route, ksnr_list); nshared += route->ksnr_share_count; } @@ -533,7 +533,7 @@ ksocknal_del_peer_locked(struct ksock_peer *peer, __u32 ip) * remove everything else if there are no explicit entries * left */ - list_for_each_safe(tmp, nxt, &peer->ksnp_routes) { + list_for_each_safe(tmp, nxt, &peer_ni->ksnp_routes) { route = list_entry(tmp, struct ksock_route, ksnr_list); /* we should only be removing auto-entries */ @@ -541,24 +541,23 @@ ksocknal_del_peer_locked(struct ksock_peer *peer, __u32 ip) ksocknal_del_route_locked(route); } - list_for_each_safe(tmp, nxt, &peer->ksnp_conns) { + list_for_each_safe(tmp, nxt, &peer_ni->ksnp_conns) { conn = list_entry(tmp, struct ksock_conn, ksnc_list); ksocknal_close_conn_locked(conn, 0); } } - ksocknal_peer_decref(peer); - /* NB peer unlinks itself when last conn/route is removed */ + ksocknal_peer_decref(peer_ni); + /* NB peer_ni unlinks itself when last conn/route is removed */ } static int ksocknal_del_peer(struct lnet_ni *ni, struct lnet_process_id id, __u32 ip) { LIST_HEAD(zombies); - struct list_head *ptmp; - struct list_head *pnxt; - struct ksock_peer *peer; + struct ksock_peer *pnxt; + struct ksock_peer *peer_ni; int lo; int hi; int i; @@ -575,30 +574,32 @@ ksocknal_del_peer(struct lnet_ni *ni, struct lnet_process_id id, __u32 ip) } for (i = lo; i <= hi; i++) { - list_for_each_safe(ptmp, pnxt, &ksocknal_data.ksnd_peers[i]) { - peer = list_entry(ptmp, struct ksock_peer, ksnp_list); - - if (peer->ksnp_ni != ni) + list_for_each_entry_safe(peer_ni, pnxt, + &ksocknal_data.ksnd_peers[i], + ksnp_list) { + if (peer_ni->ksnp_ni != ni) continue; - if (!((id.nid == LNET_NID_ANY || peer->ksnp_id.nid == 
id.nid) && - (id.pid == LNET_PID_ANY || peer->ksnp_id.pid == id.pid))) + if (!((id.nid == LNET_NID_ANY || + peer_ni->ksnp_id.nid == id.nid) && + (id.pid == LNET_PID_ANY || + peer_ni->ksnp_id.pid == id.pid))) continue; - ksocknal_peer_addref(peer); /* a ref for me... */ + ksocknal_peer_addref(peer_ni); /* a ref for me... */ - ksocknal_del_peer_locked(peer, ip); + ksocknal_del_peer_locked(peer_ni, ip); - if (peer->ksnp_closing && - !list_empty(&peer->ksnp_tx_queue)) { - LASSERT(list_empty(&peer->ksnp_conns)); - LASSERT(list_empty(&peer->ksnp_routes)); + if (peer_ni->ksnp_closing && + !list_empty(&peer_ni->ksnp_tx_queue)) { + LASSERT(list_empty(&peer_ni->ksnp_conns)); + LASSERT(list_empty(&peer_ni->ksnp_routes)); - list_splice_init(&peer->ksnp_tx_queue, + list_splice_init(&peer_ni->ksnp_tx_queue, &zombies); } - ksocknal_peer_decref(peer); /* ...till here */ + ksocknal_peer_decref(peer_ni); /* ...till here */ rc = 0; /* matched! */ } @@ -614,20 +615,22 @@ ksocknal_del_peer(struct lnet_ni *ni, struct lnet_process_id id, __u32 ip) static struct ksock_conn * ksocknal_get_conn_by_idx(struct lnet_ni *ni, int index) { - struct ksock_peer *peer; + struct ksock_peer *peer_ni; struct ksock_conn *conn; int i; read_lock(&ksocknal_data.ksnd_global_lock); for (i = 0; i < ksocknal_data.ksnd_peer_hash_size; i++) { - list_for_each_entry(peer, &ksocknal_data.ksnd_peers[i], ksnp_list) { - LASSERT(!peer->ksnp_closing); + list_for_each_entry(peer_ni, &ksocknal_data.ksnd_peers[i], + ksnp_list) { + LASSERT(!peer_ni->ksnp_closing); - if (peer->ksnp_ni != ni) + if (peer_ni->ksnp_ni != ni) continue; - list_for_each_entry(conn, &peer->ksnp_conns, ksnc_list) { + list_for_each_entry(conn, &peer_ni->ksnp_conns, + ksnc_list) { if (index-- > 0) continue; @@ -728,10 +731,10 @@ ksocknal_match_peerip(struct ksock_interface *iface, __u32 *ips, int nips) } static int -ksocknal_select_ips(struct ksock_peer *peer, __u32 *peerips, int n_peerips) +ksocknal_select_ips(struct ksock_peer *peer_ni, __u32 
*peerips, int n_peerips) { rwlock_t *global_lock = &ksocknal_data.ksnd_global_lock; - struct ksock_net *net = peer->ksnp_ni->ni_data; + struct ksock_net *net = peer_ni->ksnp_ni->ni_data; struct ksock_interface *iface; struct ksock_interface *best_iface; int n_ips; @@ -766,26 +769,26 @@ ksocknal_select_ips(struct ksock_peer *peer, __u32 *peerips, int n_peerips) n_ips = (net->ksnn_ninterfaces < 2) ? 0 : min(n_peerips, net->ksnn_ninterfaces); - for (i = 0; peer->ksnp_n_passive_ips < n_ips; i++) { + for (i = 0; peer_ni->ksnp_n_passive_ips < n_ips; i++) { /* ^ yes really... */ /* * If we have any new interfaces, first tick off all the - * peer IPs that match old interfaces, then choose new - * interfaces to match the remaining peer IPS. + * peer_ni IPs that match old interfaces, then choose new + * interfaces to match the remaining peer_ni IPS. * We don't forget interfaces we've stopped using; we might * start using them again... */ - if (i < peer->ksnp_n_passive_ips) { + if (i < peer_ni->ksnp_n_passive_ips) { /* Old interface. 
*/ - ip = peer->ksnp_passive_ips[i]; - best_iface = ksocknal_ip2iface(peer->ksnp_ni, ip); + ip = peer_ni->ksnp_passive_ips[i]; + best_iface = ksocknal_ip2iface(peer_ni->ksnp_ni, ip); - /* peer passive ips are kept up to date */ + /* peer_ni passive ips are kept up to date */ LASSERT(best_iface); } else { /* choose a new interface */ - LASSERT(i == peer->ksnp_n_passive_ips); + LASSERT(i == peer_ni->ksnp_n_passive_ips); best_iface = NULL; best_netmatch = 0; @@ -795,11 +798,14 @@ ksocknal_select_ips(struct ksock_peer *peer, __u32 *peerips, int n_peerips) iface = &net->ksnn_interfaces[j]; ip = iface->ksni_ipaddr; - for (k = 0; k < peer->ksnp_n_passive_ips; k++) - if (peer->ksnp_passive_ips[k] == ip) + for (k = 0; + k < peer_ni->ksnp_n_passive_ips; + k++) + if (peer_ni->ksnp_passive_ips[k] == ip) break; - if (k < peer->ksnp_n_passive_ips) /* using it already */ + if (k < peer_ni->ksnp_n_passive_ips) + /* using it already */ continue; k = ksocknal_match_peerip(iface, peerips, @@ -822,17 +828,17 @@ ksocknal_select_ips(struct ksock_peer *peer, __u32 *peerips, int n_peerips) best_iface->ksni_npeers++; ip = best_iface->ksni_ipaddr; - peer->ksnp_passive_ips[i] = ip; - peer->ksnp_n_passive_ips = i + 1; + peer_ni->ksnp_passive_ips[i] = ip; + peer_ni->ksnp_n_passive_ips = i + 1; } - /* mark the best matching peer IP used */ + /* mark the best matching peer_ni IP used */ j = ksocknal_match_peerip(best_iface, peerips, n_peerips); peerips[j] = 0; } - /* Overwrite input peer IP addresses */ - memcpy(peerips, peer->ksnp_passive_ips, n_ips * sizeof(*peerips)); + /* Overwrite input peer_ni IP addresses */ + memcpy(peerips, peer_ni->ksnp_passive_ips, n_ips * sizeof(*peerips)); write_unlock_bh(global_lock); @@ -840,12 +846,12 @@ ksocknal_select_ips(struct ksock_peer *peer, __u32 *peerips, int n_peerips) } static void -ksocknal_create_routes(struct ksock_peer *peer, int port, +ksocknal_create_routes(struct ksock_peer *peer_ni, int port, __u32 *peer_ipaddrs, int npeer_ipaddrs) { struct 
ksock_route *newroute = NULL; rwlock_t *global_lock = &ksocknal_data.ksnd_global_lock; - struct lnet_ni *ni = peer->ksnp_ni; + struct lnet_ni *ni = peer_ni->ksnp_ni; struct ksock_net *net = ni->ni_data; struct ksock_route *route; struct ksock_interface *iface; @@ -888,13 +894,13 @@ ksocknal_create_routes(struct ksock_peer *peer, int port, write_lock_bh(global_lock); } - if (peer->ksnp_closing) { - /* peer got closed under me */ + if (peer_ni->ksnp_closing) { + /* peer_ni got closed under me */ break; } /* Already got a route? */ - list_for_each_entry(route, &peer->ksnp_routes, ksnr_list) + list_for_each_entry(route, &peer_ni->ksnp_routes, ksnr_list) if (route->ksnr_ipaddr != newroute->ksnr_ipaddr) goto next_ipaddr; @@ -909,7 +915,8 @@ ksocknal_create_routes(struct ksock_peer *peer, int port, iface = &net->ksnn_interfaces[j]; /* Using this interface already? */ - list_for_each_entry(route, &peer->ksnp_routes, ksnr_list) + list_for_each_entry(route, &peer_ni->ksnp_routes, + ksnr_list) if (route->ksnr_myipaddr == iface->ksni_ipaddr) goto next_iface; @@ -935,7 +942,7 @@ ksocknal_create_routes(struct ksock_peer *peer, int port, newroute->ksnr_myipaddr = best_iface->ksni_ipaddr; best_iface->ksni_nroutes++; - ksocknal_add_route_locked(peer, newroute); + ksocknal_add_route_locked(peer_ni, newroute); newroute = NULL; next_ipaddr:; } @@ -977,11 +984,11 @@ ksocknal_accept(struct lnet_ni *ni, struct socket *sock) } static int -ksocknal_connecting(struct ksock_peer *peer, __u32 ipaddr) +ksocknal_connecting(struct ksock_peer *peer_ni, __u32 ipaddr) { struct ksock_route *route; - list_for_each_entry(route, &peer->ksnp_routes, ksnr_list) { + list_for_each_entry(route, &peer_ni->ksnp_routes, ksnr_list) { if (route->ksnr_ipaddr == ipaddr) return route->ksnr_connecting; } @@ -998,7 +1005,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, __u64 incarnation; struct ksock_conn *conn; struct ksock_conn *conn2; - struct ksock_peer *peer = NULL; + struct ksock_peer 
*peer_ni = NULL; struct ksock_peer *peer2; struct ksock_sched *sched; struct ksock_hello_msg *hello; @@ -1054,21 +1061,21 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, goto failed_1; /* - * Find out/confirm peer's NID and connection type and get the + * Find out/confirm peer_ni's NID and connection type and get the * vector of interfaces she's willing to let me connect to. - * Passive connections use the listener timeout since the peer sends + * Passive connections use the listener timeout since the peer_ni sends * eagerly */ if (active) { - peer = route->ksnr_peer; - LASSERT(ni == peer->ksnp_ni); + peer_ni = route->ksnr_peer; + LASSERT(ni == peer_ni->ksnp_ni); /* Active connection sends HELLO eagerly */ hello->kshm_nips = ksocknal_local_ipvec(ni, hello->kshm_ips); - peerid = peer->ksnp_id; + peerid = peer_ni->ksnp_id; write_lock_bh(global_lock); - conn->ksnc_proto = peer->ksnp_proto; + conn->ksnc_proto = peer_ni->ksnp_proto; write_unlock_bh(global_lock); if (!conn->ksnc_proto) { @@ -1088,7 +1095,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, peerid.nid = LNET_NID_ANY; peerid.pid = LNET_PID_ANY; - /* Passive, get protocol from peer */ + /* Passive, get protocol from peer_ni */ conn->ksnc_proto = NULL; } @@ -1103,10 +1110,10 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, cpt = lnet_cpt_of_nid(peerid.nid, ni); if (active) { - ksocknal_peer_addref(peer); + ksocknal_peer_addref(peer_ni); write_lock_bh(global_lock); } else { - rc = ksocknal_create_peer(&peer, ni, peerid); + rc = ksocknal_create_peer(&peer_ni, ni, peerid); if (rc) goto failed_1; @@ -1118,61 +1125,61 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, peer2 = ksocknal_find_peer_locked(ni, peerid); if (!peer2) { /* - * NB this puts an "empty" peer in the peer + * NB this puts an "empty" peer_ni in the peer * table (which takes my ref) */ - list_add_tail(&peer->ksnp_list, + list_add_tail(&peer_ni->ksnp_list, 
ksocknal_nid2peerlist(peerid.nid)); } else { - ksocknal_peer_decref(peer); - peer = peer2; + ksocknal_peer_decref(peer_ni); + peer_ni = peer2; } /* +1 ref for me */ - ksocknal_peer_addref(peer); - peer->ksnp_accepting++; + ksocknal_peer_addref(peer_ni); + peer_ni->ksnp_accepting++; /* * Am I already connecting to this guy? Resolve in * favour of higher NID... */ if (peerid.nid < ni->ni_nid && - ksocknal_connecting(peer, conn->ksnc_ipaddr)) { + ksocknal_connecting(peer_ni, conn->ksnc_ipaddr)) { rc = EALREADY; warn = "connection race resolution"; goto failed_2; } } - if (peer->ksnp_closing || + if (peer_ni->ksnp_closing || (active && route->ksnr_deleted)) { - /* peer/route got closed under me */ + /* peer_ni/route got closed under me */ rc = -ESTALE; - warn = "peer/route removed"; + warn = "peer_ni/route removed"; goto failed_2; } - if (!peer->ksnp_proto) { + if (!peer_ni->ksnp_proto) { /* * Never connected before. * NB recv_hello may have returned EPROTO to signal my peer * wants a different protocol than the one I asked for. */ - LASSERT(list_empty(&peer->ksnp_conns)); + LASSERT(list_empty(&peer_ni->ksnp_conns)); - peer->ksnp_proto = conn->ksnc_proto; - peer->ksnp_incarnation = incarnation; + peer_ni->ksnp_proto = conn->ksnc_proto; + peer_ni->ksnp_incarnation = incarnation; } - if (peer->ksnp_proto != conn->ksnc_proto || - peer->ksnp_incarnation != incarnation) { - /* Peer rebooted or I've got the wrong protocol version */ - ksocknal_close_peer_conns_locked(peer, 0, 0); + if (peer_ni->ksnp_proto != conn->ksnc_proto || + peer_ni->ksnp_incarnation != incarnation) { + /* peer_ni rebooted or I've got the wrong protocol version */ + ksocknal_close_peer_conns_locked(peer_ni, 0, 0); - peer->ksnp_proto = NULL; + peer_ni->ksnp_proto = NULL; rc = ESTALE; - warn = peer->ksnp_incarnation != incarnation ? - "peer rebooted" : + warn = peer_ni->ksnp_incarnation != incarnation ? 
+ "peer_ni rebooted" : "wrong proto version"; goto failed_2; } @@ -1195,7 +1202,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, * loopback connection */ if (conn->ksnc_ipaddr != conn->ksnc_myipaddr) { - list_for_each_entry(conn2, &peer->ksnp_conns, ksnc_list) { + list_for_each_entry(conn2, &peer_ni->ksnp_conns, ksnc_list) { if (conn2->ksnc_ipaddr != conn->ksnc_ipaddr || conn2->ksnc_myipaddr != conn->ksnc_myipaddr || @@ -1223,7 +1230,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, if (active && route->ksnr_ipaddr != conn->ksnc_ipaddr) { CERROR("Route %s %pI4h connected to %pI4h\n", - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &route->ksnr_ipaddr, &conn->ksnc_ipaddr); } @@ -1231,10 +1238,10 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, /* * Search for a route corresponding to the new connection and * create an association. This allows incoming connections created - * by routes in my peer to match my own route entries so I don't + * by routes in my peer_ni to match my own route entries so I don't * continually create duplicate routes. 
*/ - list_for_each_entry(route, &peer->ksnp_routes, ksnr_list) { + list_for_each_entry(route, &peer_ni->ksnp_routes, ksnr_list) { if (route->ksnr_ipaddr != conn->ksnc_ipaddr) continue; @@ -1242,10 +1249,10 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, break; } - conn->ksnc_peer = peer; /* conn takes my ref on peer */ - peer->ksnp_last_alive = ktime_get_seconds(); - peer->ksnp_send_keepalive = 0; - peer->ksnp_error = 0; + conn->ksnc_peer = peer_ni; /* conn takes my ref on peer_ni */ + peer_ni->ksnp_last_alive = ktime_get_seconds(); + peer_ni->ksnp_send_keepalive = 0; + peer_ni->ksnp_error = 0; sched = ksocknal_choose_scheduler_locked(cpt); sched->kss_nconns++; @@ -1256,9 +1263,9 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, conn->ksnc_tx_bufnob = sock->sk->sk_wmem_queued; conn->ksnc_tx_deadline = ktime_get_seconds() + *ksocknal_tunables.ksnd_timeout; - mb(); /* order with adding to peer's conn list */ + mb(); /* order with adding to peer_ni's conn list */ - list_add(&conn->ksnc_list, &peer->ksnp_conns); + list_add(&conn->ksnc_list, &peer_ni->ksnp_conns); ksocknal_conn_addref(conn); ksocknal_new_packet(conn, 0); @@ -1266,7 +1273,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, conn->ksnc_zc_capable = ksocknal_lib_zc_capable(conn); /* Take packets blocking for this connection. */ - list_for_each_entry_safe(tx, txtmp, &peer->ksnp_tx_queue, tx_list) { + list_for_each_entry_safe(tx, txtmp, &peer_ni->ksnp_tx_queue, tx_list) { int match = conn->ksnc_proto->pro_match_tx(conn, tx, tx->tx_nonblk); @@ -1295,10 +1302,10 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route, if (active) { /* additional routes after interface exchange? 
 		 */
-		ksocknal_create_routes(peer, conn->ksnc_port,
+		ksocknal_create_routes(peer_ni, conn->ksnc_port,
 				       hello->kshm_ips, hello->kshm_nips);
 	} else {
-		hello->kshm_nips = ksocknal_select_ips(peer, hello->kshm_ips,
+		hello->kshm_nips = ksocknal_select_ips(peer_ni, hello->kshm_ips,
 						       hello->kshm_nips);
 		rc = ksocknal_send_hello(ni, conn, peerid.nid, hello);
 	}
@@ -1321,7 +1328,7 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route,
 	ksocknal_lib_set_callback(sock, conn);
 
 	if (!active)
-		peer->ksnp_accepting--;
+		peer_ni->ksnp_accepting--;
 
 	write_unlock_bh(global_lock);
 
@@ -1344,12 +1351,12 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route,
 	return rc;
 
 failed_2:
-	if (!peer->ksnp_closing &&
-	    list_empty(&peer->ksnp_conns) &&
-	    list_empty(&peer->ksnp_routes)) {
-		list_add(&zombies, &peer->ksnp_tx_queue);
-		list_del_init(&peer->ksnp_tx_queue);
-		ksocknal_unlink_peer_locked(peer);
+	if (!peer_ni->ksnp_closing &&
+	    list_empty(&peer_ni->ksnp_conns) &&
+	    list_empty(&peer_ni->ksnp_routes)) {
+		list_add(&zombies, &peer_ni->ksnp_tx_queue);
+		list_del_init(&peer_ni->ksnp_tx_queue);
+		ksocknal_unlink_peer_locked(peer_ni);
 	}
 
 	write_unlock_bh(global_lock);
@@ -1375,12 +1382,12 @@ ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route,
 		}
 
 		write_lock_bh(global_lock);
-		peer->ksnp_accepting--;
+		peer_ni->ksnp_accepting--;
 		write_unlock_bh(global_lock);
 	}
 
 	ksocknal_txlist_done(ni, &zombies, 1);
-	ksocknal_peer_decref(peer);
+	ksocknal_peer_decref(peer_ni);
 
 failed_1:
 	kvfree(hello);
@@ -1400,15 +1407,15 @@ ksocknal_close_conn_locked(struct ksock_conn *conn, int error)
 	 * connection for the reaper to terminate.
 	 * Caller holds ksnd_global_lock exclusively in irq context */
-	struct ksock_peer *peer = conn->ksnc_peer;
+	struct ksock_peer *peer_ni = conn->ksnc_peer;
 	struct ksock_route *route;
 	struct ksock_conn *conn2;
 
-	LASSERT(!peer->ksnp_error);
+	LASSERT(!peer_ni->ksnp_error);
 	LASSERT(!conn->ksnc_closing);
 	conn->ksnc_closing = 1;
 
-	/* ksnd_deathrow_conns takes over peer's ref */
+	/* ksnd_deathrow_conns takes over peer_ni's ref */
 	list_del(&conn->ksnc_list);
 
 	route = conn->ksnc_route;
@@ -1417,7 +1424,7 @@ ksocknal_close_conn_locked(struct ksock_conn *conn, int error)
 		LASSERT(!route->ksnr_deleted);
 		LASSERT(route->ksnr_connected & (1 << conn->ksnc_type));
 
-		list_for_each_entry(conn2, &peer->ksnp_conns, ksnc_list) {
+		list_for_each_entry(conn2, &peer_ni->ksnp_conns, ksnc_list) {
 			if (conn2->ksnc_route == route &&
 			    conn2->ksnc_type == conn->ksnc_type)
 				goto conn2_found;
@@ -1429,10 +1436,10 @@ ksocknal_close_conn_locked(struct ksock_conn *conn, int error)
 		ksocknal_route_decref(route); /* drop conn's ref on route */
 	}
 
-	if (list_empty(&peer->ksnp_conns)) {
-		/* No more connections to this peer */
+	if (list_empty(&peer_ni->ksnp_conns)) {
+		/* No more connections to this peer_ni */
 
-		if (!list_empty(&peer->ksnp_tx_queue)) {
+		if (!list_empty(&peer_ni->ksnp_tx_queue)) {
 			struct ksock_tx *tx;
 
 			LASSERT(conn->ksnc_proto == &ksocknal_protocol_v3x);
@@ -1441,25 +1448,25 @@ ksocknal_close_conn_locked(struct ksock_conn *conn, int error)
 			 * throw them to the last connection...,
 			 * these TXs will be send to /dev/null by scheduler
 			 */
-			list_for_each_entry(tx, &peer->ksnp_tx_queue,
+			list_for_each_entry(tx, &peer_ni->ksnp_tx_queue,
 					    tx_list)
 				ksocknal_tx_prep(conn, tx);
 
 			spin_lock_bh(&conn->ksnc_scheduler->kss_lock);
-			list_splice_init(&peer->ksnp_tx_queue,
+			list_splice_init(&peer_ni->ksnp_tx_queue,
 					 &conn->ksnc_tx_queue);
 			spin_unlock_bh(&conn->ksnc_scheduler->kss_lock);
 		}
 
-		peer->ksnp_proto = NULL; /* renegotiate protocol version */
-		peer->ksnp_error = error; /* stash last conn close reason */
+		peer_ni->ksnp_proto = NULL; /* renegotiate protocol version */
+		peer_ni->ksnp_error = error; /* stash last conn close reason */
 
-		if (list_empty(&peer->ksnp_routes)) {
+		if (list_empty(&peer_ni->ksnp_routes)) {
 			/*
 			 * I've just closed last conn belonging to a
-			 * peer with no routes to it
+			 * peer_ni with no routes to it
 			 */
-			ksocknal_unlink_peer_locked(peer);
+			ksocknal_unlink_peer_locked(peer_ni);
 		}
 	}
 
@@ -1473,37 +1480,37 @@ ksocknal_close_conn_locked(struct ksock_conn *conn, int error)
 }
 
 void
-ksocknal_peer_failed(struct ksock_peer *peer)
+ksocknal_peer_failed(struct ksock_peer *peer_ni)
 {
 	int notify = 0;
 	time64_t last_alive = 0;
 
 	/*
 	 * There has been a connection failure or comms error; but I'll only
-	 * tell LNET I think the peer is dead if it's to another kernel and
+	 * tell LNET I think the peer_ni is dead if it's to another kernel and
 	 * there are no connections or connection attempts in existence.
 	 */
 	read_lock(&ksocknal_data.ksnd_global_lock);
 
-	if (!(peer->ksnp_id.pid & LNET_PID_USERFLAG) &&
-	    list_empty(&peer->ksnp_conns) &&
-	    !peer->ksnp_accepting &&
-	    !ksocknal_find_connecting_route_locked(peer)) {
+	if (!(peer_ni->ksnp_id.pid & LNET_PID_USERFLAG) &&
+	    list_empty(&peer_ni->ksnp_conns) &&
+	    !peer_ni->ksnp_accepting &&
+	    !ksocknal_find_connecting_route_locked(peer_ni)) {
 		notify = 1;
-		last_alive = peer->ksnp_last_alive;
+		last_alive = peer_ni->ksnp_last_alive;
 	}
 
 	read_unlock(&ksocknal_data.ksnd_global_lock);
 
 	if (notify)
-		lnet_notify(peer->ksnp_ni, peer->ksnp_id.nid, 0,
+		lnet_notify(peer_ni->ksnp_ni, peer_ni->ksnp_id.nid, 0,
 			    last_alive);
 }
 
 void
 ksocknal_finalize_zcreq(struct ksock_conn *conn)
 {
-	struct ksock_peer *peer = conn->ksnc_peer;
+	struct ksock_peer *peer_ni = conn->ksnc_peer;
 	struct ksock_tx *tx;
 	struct ksock_tx *tmp;
 	LIST_HEAD(zlist);
@@ -1514,9 +1521,10 @@ ksocknal_finalize_zcreq(struct ksock_conn *conn)
 	 */
 	LASSERT(!conn->ksnc_sock);
 
-	spin_lock(&peer->ksnp_lock);
+	spin_lock(&peer_ni->ksnp_lock);
 
-	list_for_each_entry_safe(tx, tmp, &peer->ksnp_zc_req_list, tx_zc_list) {
+	list_for_each_entry_safe(tx, tmp, &peer_ni->ksnp_zc_req_list,
+				 tx_zc_list) {
 		if (tx->tx_conn != conn)
 			continue;
@@ -1528,7 +1536,7 @@ ksocknal_finalize_zcreq(struct ksock_conn *conn)
 		list_add(&tx->tx_zc_list, &zlist);
 	}
 
-	spin_unlock(&peer->ksnp_lock);
+	spin_unlock(&peer_ni->ksnp_lock);
 
 	while (!list_empty(&zlist)) {
 		tx = list_entry(zlist.next, struct ksock_tx, tx_zc_list);
@@ -1547,7 +1555,7 @@ ksocknal_terminate_conn(struct ksock_conn *conn)
 	 * ksnc_refcount will eventually hit zero, and then the reaper will
 	 * destroy it.
 	 */
-	struct ksock_peer *peer = conn->ksnc_peer;
+	struct ksock_peer *peer_ni = conn->ksnc_peer;
 	struct ksock_sched *sched = conn->ksnc_scheduler;
 	int failed = 0;
@@ -1583,17 +1591,17 @@ ksocknal_terminate_conn(struct ksock_conn *conn)
 	 */
 	conn->ksnc_scheduler->kss_nconns--;
 
-	if (peer->ksnp_error) {
-		/* peer's last conn closed in error */
-		LASSERT(list_empty(&peer->ksnp_conns));
+	if (peer_ni->ksnp_error) {
+		/* peer_ni's last conn closed in error */
+		LASSERT(list_empty(&peer_ni->ksnp_conns));
 		failed = 1;
-		peer->ksnp_error = 0; /* avoid multiple notifications */
+		peer_ni->ksnp_error = 0; /* avoid multiple notifications */
 	}
 
 	write_unlock_bh(&ksocknal_data.ksnd_global_lock);
 
 	if (failed)
-		ksocknal_peer_failed(peer);
+		ksocknal_peer_failed(peer_ni);
 
 	/*
 	 * The socket is closed on the final put; either here, or in
@@ -1679,14 +1687,15 @@ ksocknal_destroy_conn(struct ksock_conn *conn)
 }
 
 int
-ksocknal_close_peer_conns_locked(struct ksock_peer *peer, __u32 ipaddr, int why)
+ksocknal_close_peer_conns_locked(struct ksock_peer *peer_ni,
+				 __u32 ipaddr, int why)
 {
 	struct ksock_conn *conn;
 	struct list_head *ctmp;
 	struct list_head *cnxt;
 	int count = 0;
 
-	list_for_each_safe(ctmp, cnxt, &peer->ksnp_conns) {
+	list_for_each_safe(ctmp, cnxt, &peer_ni->ksnp_conns) {
 		conn = list_entry(ctmp, struct ksock_conn, ksnc_list);
 
 		if (!ipaddr || conn->ksnc_ipaddr == ipaddr) {
@@ -1701,13 +1710,13 @@ ksocknal_close_peer_conns_locked(struct ksock_peer *peer, __u32 ipaddr, int why)
 int
 ksocknal_close_conn_and_siblings(struct ksock_conn *conn, int why)
 {
-	struct ksock_peer *peer = conn->ksnc_peer;
+	struct ksock_peer *peer_ni = conn->ksnc_peer;
 	__u32 ipaddr = conn->ksnc_ipaddr;
 	int count;
 
 	write_lock_bh(&ksocknal_data.ksnd_global_lock);
 
-	count = ksocknal_close_peer_conns_locked(peer, ipaddr, why);
+	count = ksocknal_close_peer_conns_locked(peer_ni, ipaddr, why);
 
 	write_unlock_bh(&ksocknal_data.ksnd_global_lock);
 
@@ -1717,9 +1726,8 @@ ksocknal_close_conn_and_siblings(struct ksock_conn *conn, int why)
 int
 ksocknal_close_matching_conns(struct lnet_process_id id, __u32 ipaddr)
 {
-	struct ksock_peer *peer;
-	struct list_head *ptmp;
-	struct list_head *pnxt;
+	struct ksock_peer *peer_ni;
+	struct ksock_peer *pnxt;
 	int lo;
 	int hi;
 	int i;
@@ -1736,16 +1744,17 @@ ksocknal_close_matching_conns(struct lnet_process_id id, __u32 ipaddr)
 	}
 
 	for (i = lo; i <= hi; i++) {
-		list_for_each_safe(ptmp, pnxt,
-				   &ksocknal_data.ksnd_peers[i]) {
-			peer = list_entry(ptmp, struct ksock_peer, ksnp_list);
-
-			if (!((id.nid == LNET_NID_ANY || id.nid == peer->ksnp_id.nid) &&
-			      (id.pid == LNET_PID_ANY || id.pid == peer->ksnp_id.pid)))
+		list_for_each_entry_safe(peer_ni, pnxt,
+					 &ksocknal_data.ksnd_peers[i],
+					 ksnp_list) {
+			if (!((id.nid == LNET_NID_ANY ||
+			       id.nid == peer_ni->ksnp_id.nid) &&
+			      (id.pid == LNET_PID_ANY ||
+			       id.pid == peer_ni->ksnp_id.pid)))
 				continue;
 
-			count += ksocknal_close_peer_conns_locked(peer, ipaddr,
-								  0);
+			count += ksocknal_close_peer_conns_locked(peer_ni,
								  ipaddr, 0);
 		}
 	}
 
@@ -1794,7 +1803,7 @@ ksocknal_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when)
 	int connect = 1;
 	time64_t last_alive = 0;
 	time64_t now = ktime_get_seconds();
-	struct ksock_peer *peer = NULL;
+	struct ksock_peer *peer_ni = NULL;
 	rwlock_t *glock = &ksocknal_data.ksnd_global_lock;
 	struct lnet_process_id id = {
 		.nid = nid,
@@ -1803,25 +1812,25 @@ ksocknal_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when)
 	read_lock(glock);
 
-	peer = ksocknal_find_peer_locked(ni, id);
-	if (peer) {
+	peer_ni = ksocknal_find_peer_locked(ni, id);
+	if (peer_ni) {
 		struct ksock_conn *conn;
 		int bufnob;
 
-		list_for_each_entry(conn, &peer->ksnp_conns, ksnc_list) {
+		list_for_each_entry(conn, &peer_ni->ksnp_conns, ksnc_list) {
 			bufnob = conn->ksnc_sock->sk->sk_wmem_queued;
 
 			if (bufnob < conn->ksnc_tx_bufnob) {
 				/* something got ACKed */
 				conn->ksnc_tx_deadline = ktime_get_seconds() +
 							 *ksocknal_tunables.ksnd_timeout;
-				peer->ksnp_last_alive = now;
+				peer_ni->ksnp_last_alive = now;
 				conn->ksnc_tx_bufnob = bufnob;
 			}
 		}
 
-		last_alive = peer->ksnp_last_alive;
-		if (!ksocknal_find_connectable_route_locked(peer))
+		last_alive = peer_ni->ksnp_last_alive;
+		if (!ksocknal_find_connectable_route_locked(peer_ni))
 			connect = 0;
 	}
 
@@ -1830,8 +1839,8 @@ ksocknal_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when)
 	if (last_alive)
 		*when = last_alive * HZ;
 
-	CDEBUG(D_NET, "Peer %s %p, alive %lld secs ago, connect %d\n",
-	       libcfs_nid2str(nid), peer,
+	CDEBUG(D_NET, "peer_ni %s %p, alive %lld secs ago, connect %d\n",
+	       libcfs_nid2str(nid), peer_ni,
 	       last_alive ? now - last_alive : -1, connect);
 
@@ -1842,15 +1851,15 @@ ksocknal_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when)
 
 	write_lock_bh(glock);
 
-	peer = ksocknal_find_peer_locked(ni, id);
-	if (peer)
-		ksocknal_launch_all_connections_locked(peer);
+	peer_ni = ksocknal_find_peer_locked(ni, id);
+	if (peer_ni)
+		ksocknal_launch_all_connections_locked(peer_ni);
 
 	write_unlock_bh(glock);
 }
 
 static void
-ksocknal_push_peer(struct ksock_peer *peer)
+ksocknal_push_peer(struct ksock_peer *peer_ni)
 {
 	int index;
 	int i;
@@ -1862,7 +1871,7 @@ ksocknal_push_peer(struct ksock_peer *peer)
 		i = 0;
 		conn = NULL;
 
-		list_for_each_entry(conn, &peer->ksnp_conns, ksnc_list) {
+		list_for_each_entry(conn, &peer_ni->ksnp_conns, ksnc_list) {
 			if (i++ == index) {
 				ksocknal_conn_addref(conn);
 				break;
@@ -1896,22 +1905,22 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 	}
 
 	for (tmp = start; tmp <= end; tmp++) {
-		int peer_off; /* searching offset in peer hash table */
+		int peer_off; /* searching offset in peer_ni hash table */
 
 		for (peer_off = 0; ; peer_off++) {
-			struct ksock_peer *peer;
+			struct ksock_peer *peer_ni;
 			int i = 0;
 
 			read_lock(&ksocknal_data.ksnd_global_lock);
-			list_for_each_entry(peer, tmp, ksnp_list) {
+			list_for_each_entry(peer_ni, tmp, ksnp_list) {
 				if (!((id.nid == LNET_NID_ANY ||
-				       id.nid == peer->ksnp_id.nid) &&
+				       id.nid == peer_ni->ksnp_id.nid) &&
 				      (id.pid == LNET_PID_ANY ||
-				       id.pid == peer->ksnp_id.pid)))
+				       id.pid == peer_ni->ksnp_id.pid)))
 					continue;
 
 				if (i++ == peer_off) {
-					ksocknal_peer_addref(peer);
+					ksocknal_peer_addref(peer_ni);
 					break;
 				}
 			}
@@ -1921,8 +1930,8 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id)
 				break;
 
 			rc = 0;
-			ksocknal_push_peer(peer);
-			ksocknal_peer_decref(peer);
+			ksocknal_push_peer(peer_ni);
+			ksocknal_peer_decref(peer_ni);
 		}
 	}
 	return rc;
@@ -1936,7 +1945,7 @@ ksocknal_add_interface(struct lnet_ni *ni, __u32 ipaddress, __u32 netmask)
 	int rc;
 	int i;
 	int j;
-	struct ksock_peer *peer;
+	struct ksock_peer *peer_ni;
 	struct ksock_route *route;
 
 	if (!ipaddress || !netmask)
@@ -1959,14 +1968,19 @@ ksocknal_add_interface(struct lnet_ni *ni, __u32 ipaddress, __u32 netmask)
 		iface->ksni_npeers = 0;
 
 		for (i = 0; i < ksocknal_data.ksnd_peer_hash_size; i++) {
-			list_for_each_entry(peer, &ksocknal_data.ksnd_peers[i],
+			list_for_each_entry(peer_ni,
+					    &ksocknal_data.ksnd_peers[i],
 					    ksnp_list) {
-				for (j = 0; j < peer->ksnp_n_passive_ips; j++)
-					if (peer->ksnp_passive_ips[j] == ipaddress)
+				for (j = 0;
+				     j < peer_ni->ksnp_n_passive_ips;
+				     j++)
+					if (peer_ni->ksnp_passive_ips[j] ==
+					    ipaddress)
 						iface->ksni_npeers++;
 
-				list_for_each_entry(route, &peer->ksnp_routes,
+				list_for_each_entry(route,
+						    &peer_ni->ksnp_routes,
 						    ksnr_list) {
 					if (route->ksnr_myipaddr == ipaddress)
 						iface->ksni_nroutes++;
@@ -1987,7 +2001,7 @@ ksocknal_add_interface(struct lnet_ni *ni, __u32 ipaddress, __u32 netmask)
 }
 
 static void
-ksocknal_peer_del_interface_locked(struct ksock_peer *peer, __u32 ipaddr)
+ksocknal_peer_del_interface_locked(struct ksock_peer *peer_ni, __u32 ipaddr)
 {
 	struct list_head *tmp;
 	struct list_head *nxt;
@@ -1996,16 +2010,16 @@ ksocknal_peer_del_interface_locked(struct ksock_peer *peer, __u32 ipaddr)
 	int i;
 	int j;
 
-	for (i = 0; i < peer->ksnp_n_passive_ips; i++)
-		if (peer->ksnp_passive_ips[i] == ipaddr) {
-			for (j = i + 1; j < peer->ksnp_n_passive_ips; j++)
-				peer->ksnp_passive_ips[j - 1] =
-					peer->ksnp_passive_ips[j];
-			peer->ksnp_n_passive_ips--;
+	for (i = 0; i < peer_ni->ksnp_n_passive_ips; i++)
+		if (peer_ni->ksnp_passive_ips[i] == ipaddr) {
+			for (j = i + 1; j < peer_ni->ksnp_n_passive_ips; j++)
+				peer_ni->ksnp_passive_ips[j - 1] =
+					peer_ni->ksnp_passive_ips[j];
+			peer_ni->ksnp_n_passive_ips--;
 			break;
 		}
 
-	list_for_each_safe(tmp, nxt, &peer->ksnp_routes) {
+	list_for_each_safe(tmp, nxt, &peer_ni->ksnp_routes) {
 		route = list_entry(tmp, struct ksock_route, ksnr_list);
 
 		if (route->ksnr_myipaddr != ipaddr)
@@ -2019,7 +2033,7 @@ ksocknal_peer_del_interface_locked(struct ksock_peer *peer, __u32 ipaddr)
 		}
 	}
 
-	list_for_each_safe(tmp, nxt, &peer->ksnp_conns) {
+	list_for_each_safe(tmp, nxt, &peer_ni->ksnp_conns) {
 		conn = list_entry(tmp, struct ksock_conn, ksnc_list);
 
 		if (conn->ksnc_myipaddr == ipaddr)
@@ -2032,9 +2046,8 @@ ksocknal_del_interface(struct lnet_ni *ni, __u32 ipaddress)
 {
 	struct ksock_net *net = ni->ni_data;
 	int rc = -ENOENT;
-	struct list_head *tmp;
-	struct list_head *nxt;
-	struct ksock_peer *peer;
+	struct ksock_peer *nxt;
+	struct ksock_peer *peer_ni;
 	__u32 this_ip;
 	int i;
 	int j;
@@ -2056,14 +2069,14 @@ ksocknal_del_interface(struct lnet_ni *ni, __u32 ipaddress)
 			for (j = 0; j < ksocknal_data.ksnd_peer_hash_size; j++) {
-				list_for_each_safe(tmp, nxt,
-						   &ksocknal_data.ksnd_peers[j]) {
-					peer = list_entry(tmp, struct ksock_peer, ksnp_list);
-
-					if (peer->ksnp_ni != ni)
+				list_for_each_entry_safe(peer_ni, nxt,
							 &ksocknal_data.ksnd_peers[j],
							 ksnp_list) {
+					if (peer_ni->ksnp_ni != ni)
 						continue;
 
-					ksocknal_peer_del_interface_locked(peer, this_ip);
+					ksocknal_peer_del_interface_locked(peer_ni,
									   this_ip);
 				}
 			}
 		}
 	}
@@ -2461,36 +2474,41 @@ ksocknal_base_startup(void)
 static void
 ksocknal_debug_peerhash(struct lnet_ni *ni)
 {
-	struct ksock_peer *peer = NULL;
+	struct ksock_peer *peer_ni = NULL;
 	int i;
 
 	read_lock(&ksocknal_data.ksnd_global_lock);
 
 	for (i = 0; i < ksocknal_data.ksnd_peer_hash_size; i++) {
-		list_for_each_entry(peer, &ksocknal_data.ksnd_peers[i], ksnp_list) {
+		list_for_each_entry(peer_ni, &ksocknal_data.ksnd_peers[i],
+				    ksnp_list) {
 			struct ksock_route *route;
 			struct ksock_conn *conn;
 
-			if (peer->ksnp_ni != ni)
+			if (peer_ni->ksnp_ni != ni)
 				continue;
 
-			CWARN("Active peer on shutdown: %s, ref %d, scnt %d, closing %d, accepting %d, err %d, zcookie %llu, txq %d, zc_req %d\n",
-			      libcfs_id2str(peer->ksnp_id),
-			      atomic_read(&peer->ksnp_refcount),
-			      peer->ksnp_sharecount, peer->ksnp_closing,
-			      peer->ksnp_accepting, peer->ksnp_error,
-			      peer->ksnp_zc_next_cookie,
-			      !list_empty(&peer->ksnp_tx_queue),
-			      !list_empty(&peer->ksnp_zc_req_list));
+			CWARN("Active peer_ni on shutdown: %s, ref %d, scnt %d, closing %d, accepting %d, err %d, zcookie %llu, txq %d, zc_req %d\n",
+			      libcfs_id2str(peer_ni->ksnp_id),
+			      atomic_read(&peer_ni->ksnp_refcount),
+			      peer_ni->ksnp_sharecount, peer_ni->ksnp_closing,
+			      peer_ni->ksnp_accepting, peer_ni->ksnp_error,
+			      peer_ni->ksnp_zc_next_cookie,
+			      !list_empty(&peer_ni->ksnp_tx_queue),
+			      !list_empty(&peer_ni->ksnp_zc_req_list));
 
-			list_for_each_entry(route, &peer->ksnp_routes, ksnr_list) {
+			list_for_each_entry(route, &peer_ni->ksnp_routes,
+					    ksnr_list) {
 				CWARN("Route: ref %d, schd %d, conn %d, cnted %d, del %d\n",
 				      atomic_read(&route->ksnr_refcount),
-				      route->ksnr_scheduled, route->ksnr_connecting,
-				      route->ksnr_connected, route->ksnr_deleted);
+				      route->ksnr_scheduled,
+				      route->ksnr_connecting,
+				      route->ksnr_connected,
+				      route->ksnr_deleted);
 			}
 
-			list_for_each_entry(conn, &peer->ksnp_conns, ksnc_list) {
+			list_for_each_entry(conn, &peer_ni->ksnp_conns,
+					    ksnc_list) {
 				CWARN("Conn: ref %d, sref %d, t %d, c %d\n",
 				      atomic_read(&conn->ksnc_conn_refcount),
 				      atomic_read(&conn->ksnc_sock_refcount),
@@ -2523,7 +2541,7 @@ ksocknal_shutdown(struct lnet_ni *ni)
 	/* Delete all peers */
 	ksocknal_del_peer(ni, anyid, 0);
 
-	/* Wait for all peer state to clean up */
+	/* Wait for all peer_ni state to clean up */
 	i = 2;
 	spin_lock_bh(&net->ksnn_lock);
 	while (net->ksnn_npeers) {
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
index 2a619291fccc..cc813e4c1422 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
@@ -54,7 +54,7 @@
 #define SOCKNAL_NSCHEDS		3
 #define SOCKNAL_NSCHEDS_HIGH	(SOCKNAL_NSCHEDS << 1)
 
-#define SOCKNAL_PEER_HASH_SIZE	101	/* # peer lists */
+#define SOCKNAL_PEER_HASH_SIZE	101	/* # peer_ni lists */
 #define SOCKNAL_RESCHED		100	/* # scheduler loops before reschedule */
 #define SOCKNAL_INSANITY_RECONN	5000	/* connd is trying on reconn infinitely */
 #define SOCKNAL_ENOMEM_RETRY	1	/* seconds between retries */
@@ -142,10 +142,11 @@ struct ksock_tunables {
 	int	*ksnd_credits;		/* # concurrent sends */
 	int	*ksnd_peertxcredits;	/* # concurrent sends to 1 peer */
-	int	*ksnd_peerrtrcredits;	/* # per-peer router buffer
+	int	*ksnd_peerrtrcredits;	/* # per-peer_ni router buffer
 					 * credits
 					 */
-	int	*ksnd_peertimeout;	/* seconds to consider peer dead
+	int	*ksnd_peertimeout;	/* seconds to consider
+					 * peer_ni dead
 					 */
 	int	*ksnd_enable_csum;	/* enable check sum */
 	int	*ksnd_inject_csum_error; /* set non-zero to inject
@@ -185,8 +186,8 @@ struct ksock_nal_data {
 	 */
 	int			ksnd_nnets;	/* # networks set up */
 	struct list_head	ksnd_nets;	/* list of nets */
-	rwlock_t		ksnd_global_lock;	/* stabilize peer/conn
-						 * ops
+	rwlock_t		ksnd_global_lock;	/* stabilize
+						 * peer_ni/conn ops
 						 */
 	struct list_head	*ksnd_peers;	/* hash table of all my
 						 * known peers
@@ -270,7 +271,7 @@ struct ksock_proto; /* forward ref */
 
 struct ksock_tx {			/* transmit packet */
 	struct list_head tx_list;	/* queue on conn for transmission etc */
-	struct list_head tx_zc_list;	/* queue on peer for ZC request */
+	struct list_head tx_zc_list;	/* queue on peer_ni for ZC request */
 	atomic_t	 tx_refcount;	/* tx reference count */
 	int		 tx_nob;	/* # packet bytes */
 	int		 tx_resid;	/* residual bytes */
@@ -311,9 +312,9 @@ struct ksock_tx {			/* transmit packet */
 #define SOCKNAL_RX_SLOP		6	/* skipping body */
 
 struct ksock_conn {
-	struct ksock_peer  *ksnc_peer;	/* owning peer */
+	struct ksock_peer  *ksnc_peer;	/* owning peer_ni */
 	struct ksock_route *ksnc_route;	/* owning route */
-	struct list_head   ksnc_list;	/* stash on peer's conn list */
+	struct list_head   ksnc_list;	/* stash on peer_ni's conn list */
 	struct socket	   *ksnc_sock;	/* actual socket */
 	void		   *ksnc_saved_data_ready; /* socket's original
						    * data_ready() callback
@@ -326,8 +327,8 @@ struct ksock_conn {
 	struct ksock_sched *ksnc_scheduler;	/* who schedules this connection */
 	__u32		   ksnc_myipaddr;	/* my IP */
-	__u32		   ksnc_ipaddr;		/* peer's IP */
-	int		   ksnc_port;		/* peer's port */
+	__u32		   ksnc_ipaddr;		/* peer_ni's IP */
+	int		   ksnc_port;		/* peer_ni's port */
 	signed int	   ksnc_type:3;		/* type of connection, should be
 						 * signed value
 						 */
@@ -382,9 +383,9 @@ struct ksock_conn {
 };
 
 struct ksock_route {
-	struct list_head  ksnr_list;		/* chain on peer route list */
+	struct list_head  ksnr_list;		/* chain on peer_ni route list */
 	struct list_head  ksnr_connd_list;	/* chain on ksnr_connd_routes */
-	struct ksock_peer *ksnr_peer;		/* owning peer */
+	struct ksock_peer *ksnr_peer;		/* owning peer_ni */
 	atomic_t	  ksnr_refcount;	/* # users */
 	time64_t	  ksnr_timeout;		/* when (in secs) reconnection
 						 * can happen next
@@ -400,7 +401,7 @@ struct ksock_route {
 	unsigned int	  ksnr_connected:4;	/* connections established by
 						 * type
 						 */
-	unsigned int	  ksnr_deleted:1;	/* been removed from peer? */
+	unsigned int	  ksnr_deleted:1;	/* been removed from peer_ni? */
 	unsigned int	  ksnr_share_count;	/* created explicitly? */
 	int		  ksnr_conn_count;	/* # conns established by this
						 * route
						 */
@@ -410,7 +411,7 @@ struct ksock_route {
 #define SOCKNAL_KEEPALIVE_PING	1	/* cookie for keepalive ping */
 
 struct ksock_peer {
-	struct list_head	ksnp_list;	/* stash on global peer list */
+	struct list_head	ksnp_list;	/* stash on global peer_ni list */
 	time64_t		ksnp_last_alive; /* when (in seconds) I was last
						  * alive
						  */
@@ -422,9 +423,12 @@ struct ksock_peer {
 						 */
 	int			ksnp_error;	/* errno on closing last conn */
 	__u64			ksnp_zc_next_cookie; /* ZC completion cookie */
-	__u64			ksnp_incarnation; /* latest known peer incarnation
+	__u64			ksnp_incarnation; /* latest known peer_ni
+						   * incarnation
+						   */
+	struct ksock_proto	*ksnp_proto;	/* latest known peer_ni
+						 * protocol
 						 */
-	struct ksock_proto	*ksnp_proto;	/* latest known peer protocol */
 	struct list_head	ksnp_conns;	/* all active connections */
 	struct list_head	ksnp_routes;	/* routes */
 	struct list_head	ksnp_tx_queue;	/* waiting packets */
@@ -606,20 +610,20 @@ ksocknal_route_decref(struct ksock_route *route)
 }
 
 static inline void
-ksocknal_peer_addref(struct ksock_peer *peer)
+ksocknal_peer_addref(struct ksock_peer *peer_ni)
 {
-	LASSERT(atomic_read(&peer->ksnp_refcount) > 0);
-	atomic_inc(&peer->ksnp_refcount);
+	LASSERT(atomic_read(&peer_ni->ksnp_refcount) > 0);
+	atomic_inc(&peer_ni->ksnp_refcount);
 }
 
-void ksocknal_destroy_peer(struct ksock_peer *peer);
+void ksocknal_destroy_peer(struct ksock_peer *peer_ni);
 
 static inline void
-ksocknal_peer_decref(struct ksock_peer *peer)
+ksocknal_peer_decref(struct ksock_peer *peer_ni)
 {
-	LASSERT(atomic_read(&peer->ksnp_refcount) > 0);
-	if (atomic_dec_and_test(&peer->ksnp_refcount))
-		ksocknal_destroy_peer(peer);
+	LASSERT(atomic_read(&peer_ni->ksnp_refcount) > 0);
+	if (atomic_dec_and_test(&peer_ni->ksnp_refcount))
+		ksocknal_destroy_peer(peer_ni);
}
 
 int ksocknal_startup(struct lnet_ni *ni);
@@ -636,17 +640,17 @@ struct ksock_peer *ksocknal_find_peer_locked(struct lnet_ni *ni,
 					     struct lnet_process_id id);
 struct ksock_peer *ksocknal_find_peer(struct lnet_ni *ni,
 				      struct lnet_process_id id);
-void ksocknal_peer_failed(struct ksock_peer *peer);
+void ksocknal_peer_failed(struct ksock_peer *peer_ni);
 int ksocknal_create_conn(struct lnet_ni *ni, struct ksock_route *route,
 			 struct socket *sock, int type);
 void ksocknal_close_conn_locked(struct ksock_conn *conn, int why);
 void ksocknal_terminate_conn(struct ksock_conn *conn);
 void ksocknal_destroy_conn(struct ksock_conn *conn);
-int ksocknal_close_peer_conns_locked(struct ksock_peer *peer,
+int ksocknal_close_peer_conns_locked(struct ksock_peer *peer_ni,
 				     __u32 ipaddr, int why);
 int ksocknal_close_conn_and_siblings(struct ksock_conn *conn, int why);
 int ksocknal_close_matching_conns(struct lnet_process_id id, __u32 ipaddr);
-struct ksock_conn *ksocknal_find_conn_locked(struct ksock_peer *peer,
+struct ksock_conn *ksocknal_find_conn_locked(struct ksock_peer *peer_ni,
 					     struct ksock_tx *tx, int nonblk);
 int ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
@@ -661,9 +665,11 @@ void ksocknal_notify(struct lnet_ni *ni, lnet_nid_t gw_nid, int alive);
 void ksocknal_query(struct lnet_ni *ni, lnet_nid_t nid, time64_t *when);
 int ksocknal_thread_start(int (*fn)(void *arg), void *arg, char *name);
 void ksocknal_thread_fini(void);
-void ksocknal_launch_all_connections_locked(struct ksock_peer *peer);
-struct ksock_route *ksocknal_find_connectable_route_locked(struct ksock_peer *peer);
-struct ksock_route *ksocknal_find_connecting_route_locked(struct ksock_peer *peer);
+void ksocknal_launch_all_connections_locked(struct ksock_peer *peer_ni);
+struct ksock_route *ksocknal_find_connectable_route_locked(
	struct ksock_peer *peer_ni);
+struct ksock_route *ksocknal_find_connecting_route_locked(
	struct ksock_peer *peer_ni);
 int ksocknal_new_packet(struct ksock_conn *conn, int skip);
 int ksocknal_scheduler(void *arg);
 int ksocknal_connd(void *arg);
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index 32b76727f400..1bf0170503ed 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -375,12 +375,12 @@ static void
 ksocknal_check_zc_req(struct ksock_tx *tx)
 {
 	struct ksock_conn *conn = tx->tx_conn;
-	struct ksock_peer *peer = conn->ksnc_peer;
+	struct ksock_peer *peer_ni = conn->ksnc_peer;
 
 	/*
 	 * Set tx_msg.ksm_zc_cookies[0] to a unique non-zero cookie and add tx
 	 * to ksnp_zc_req_list if some fragment of this message should be sent
-	 * zero-copy. Our peer will send an ACK containing this cookie when
+	 * zero-copy. Our peer_ni will send an ACK containing this cookie when
 	 * she has received this message to tell us we can signal completion.
 	 * tx_msg.ksm_zc_cookies[0] remains non-zero while tx is on
 	 * ksnp_zc_req_list.
@@ -400,46 +400,46 @@ ksocknal_check_zc_req(struct ksock_tx *tx)
 	 */
 	ksocknal_tx_addref(tx);
 
-	spin_lock(&peer->ksnp_lock);
+	spin_lock(&peer_ni->ksnp_lock);
 
-	/* ZC_REQ is going to be pinned to the peer */
+	/* ZC_REQ is going to be pinned to the peer_ni */
 	tx->tx_deadline = ktime_get_seconds() +
 			  *ksocknal_tunables.ksnd_timeout;
 
 	LASSERT(!tx->tx_msg.ksm_zc_cookies[0]);
 
-	tx->tx_msg.ksm_zc_cookies[0] = peer->ksnp_zc_next_cookie++;
+	tx->tx_msg.ksm_zc_cookies[0] = peer_ni->ksnp_zc_next_cookie++;
 
-	if (!peer->ksnp_zc_next_cookie)
-		peer->ksnp_zc_next_cookie = SOCKNAL_KEEPALIVE_PING + 1;
+	if (!peer_ni->ksnp_zc_next_cookie)
+		peer_ni->ksnp_zc_next_cookie = SOCKNAL_KEEPALIVE_PING + 1;
 
-	list_add_tail(&tx->tx_zc_list, &peer->ksnp_zc_req_list);
+	list_add_tail(&tx->tx_zc_list, &peer_ni->ksnp_zc_req_list);
 
-	spin_unlock(&peer->ksnp_lock);
+	spin_unlock(&peer_ni->ksnp_lock);
 }
 
 static void
 ksocknal_uncheck_zc_req(struct ksock_tx *tx)
 {
-	struct ksock_peer *peer = tx->tx_conn->ksnc_peer;
+	struct ksock_peer *peer_ni = tx->tx_conn->ksnc_peer;
 
 	LASSERT(tx->tx_msg.ksm_type != KSOCK_MSG_NOOP);
 	LASSERT(tx->tx_zc_capable);
 	tx->tx_zc_checked = 0;
 
-	spin_lock(&peer->ksnp_lock);
+	spin_lock(&peer_ni->ksnp_lock);
 
 	if (!tx->tx_msg.ksm_zc_cookies[0]) {
 		/* Not waiting for an ACK */
-		spin_unlock(&peer->ksnp_lock);
+		spin_unlock(&peer_ni->ksnp_lock);
 		return;
 	}
 
 	tx->tx_msg.ksm_zc_cookies[0] = 0;
 	list_del(&tx->tx_zc_list);
 
-	spin_unlock(&peer->ksnp_lock);
+	spin_unlock(&peer_ni->ksnp_lock);
 
 	ksocknal_tx_decref(tx);
 }
@@ -540,14 +540,14 @@ ksocknal_launch_connection_locked(struct ksock_route *route)
 }
 
 void
-ksocknal_launch_all_connections_locked(struct ksock_peer *peer)
+ksocknal_launch_all_connections_locked(struct ksock_peer *peer_ni)
 {
 	struct ksock_route *route;
 
 	/* called holding write lock on ksnd_global_lock */
 	for (;;) {
 		/* launch any/all connections that need it */
-		route = ksocknal_find_connectable_route_locked(peer);
+		route = ksocknal_find_connectable_route_locked(peer_ni);
 		if (!route)
 			return;
 
@@ -556,7 +556,7 @@ ksocknal_launch_all_connections_locked(struct ksock_peer *peer)
 }
 
 struct ksock_conn *
-ksocknal_find_conn_locked(struct ksock_peer *peer, struct ksock_tx *tx,
+ksocknal_find_conn_locked(struct ksock_peer *peer_ni, struct ksock_tx *tx,
 			  int nonblk)
 {
 	struct ksock_conn *c;
@@ -566,7 +566,7 @@ ksocknal_find_conn_locked(struct ksock_peer *peer, struct ksock_tx *tx,
 	int tnob = 0;
 	int fnob = 0;
 
-	list_for_each_entry(c, &peer->ksnp_conns, ksnc_list) {
+	list_for_each_entry(c, &peer_ni->ksnp_conns, ksnc_list) {
 		int nob, rc;
 
 		nob = atomic_read(&c->ksnc_tx_nob) +
@@ -722,12 +722,12 @@ ksocknal_queue_tx_locked(struct ksock_tx *tx, struct ksock_conn *conn)
 }
 
 struct ksock_route *
-ksocknal_find_connectable_route_locked(struct ksock_peer *peer)
+ksocknal_find_connectable_route_locked(struct ksock_peer *peer_ni)
 {
 	time64_t now = ktime_get_seconds();
 	struct ksock_route *route;
 
-	list_for_each_entry(route, &peer->ksnp_routes, ksnr_list) {
+	list_for_each_entry(route, &peer_ni->ksnp_routes, ksnr_list) {
 		LASSERT(!route->ksnr_connecting || route->ksnr_scheduled);
 
 		/* connections being established */
@@ -756,11 +756,11 @@ ksocknal_find_connectable_route_locked(struct ksock_peer *peer)
 }
 
 struct ksock_route *
-ksocknal_find_connecting_route_locked(struct ksock_peer *peer)
+ksocknal_find_connecting_route_locked(struct ksock_peer *peer_ni)
 {
 	struct ksock_route *route;
 
-	list_for_each_entry(route, &peer->ksnp_routes, ksnr_list) {
+	list_for_each_entry(route, &peer_ni->ksnp_routes, ksnr_list) {
 		LASSERT(!route->ksnr_connecting || route->ksnr_scheduled);
 
@@ -775,7 +775,7 @@ int
 ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
 		       struct lnet_process_id id)
 {
-	struct ksock_peer *peer;
+	struct ksock_peer *peer_ni;
 	struct ksock_conn *conn;
 	rwlock_t *g_lock;
 	int retry;
@@ -787,10 +787,11 @@ ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
 	for (retry = 0;; retry = 1) {
 		read_lock(g_lock);
 
-		peer = ksocknal_find_peer_locked(ni, id);
-		if (peer) {
-			if (!ksocknal_find_connectable_route_locked(peer)) {
-				conn = ksocknal_find_conn_locked(peer, tx, tx->tx_nonblk);
+		peer_ni = ksocknal_find_peer_locked(ni, id);
+		if (peer_ni) {
+			if (!ksocknal_find_connectable_route_locked(peer_ni)) {
+				conn = ksocknal_find_conn_locked(peer_ni, tx,
+								 tx->tx_nonblk);
 				if (conn) {
 					/*
 					 * I've got no routes that need to be
@@ -809,8 +810,8 @@ ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
 
 		write_lock_bh(g_lock);
 
-		peer = ksocknal_find_peer_locked(ni, id);
-		if (peer)
+		peer_ni = ksocknal_find_peer_locked(ni, id);
+		if (peer_ni)
 			break;
 
 		write_unlock_bh(g_lock);
@@ -822,7 +823,7 @@ ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
 	}
 
 	if (retry) {
-		CERROR("Can't find peer %s\n", libcfs_id2str(id));
+		CERROR("Can't find peer_ni %s\n", libcfs_id2str(id));
 		return -EHOSTUNREACH;
 	}
 
@@ -830,15 +831,15 @@ ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
 			       LNET_NIDADDR(id.nid), lnet_acceptor_port());
 		if (rc) {
-			CERROR("Can't add peer %s: %d\n",
+			CERROR("Can't add peer_ni %s: %d\n",
 			       libcfs_id2str(id), rc);
 			return rc;
 		}
 	}
 
-	ksocknal_launch_all_connections_locked(peer);
+	ksocknal_launch_all_connections_locked(peer_ni);
 
-	conn = ksocknal_find_conn_locked(peer, tx, tx->tx_nonblk);
+	conn = ksocknal_find_conn_locked(peer_ni, tx, tx->tx_nonblk);
 	if (conn) {
 		/* Connection exists; queue message on it */
 		ksocknal_queue_tx_locked(tx, conn);
@@ -846,14 +847,14 @@ ksocknal_launch_packet(struct lnet_ni *ni, struct ksock_tx *tx,
 		return 0;
 	}
 
-	if (peer->ksnp_accepting > 0 ||
-	    ksocknal_find_connecting_route_locked(peer)) {
-		/* the message is going to be pinned to the peer */
+	if (peer_ni->ksnp_accepting > 0 ||
+	    ksocknal_find_connecting_route_locked(peer_ni)) {
+		/* the message is going to be pinned to the peer_ni */
 		tx->tx_deadline = ktime_get_seconds() +
 				  *ksocknal_tunables.ksnd_timeout;
 
 		/* Queue the message until a connection is established */
-		list_add_tail(&tx->tx_list, &peer->ksnp_tx_queue);
+		list_add_tail(&tx->tx_list, &peer_ni->ksnp_tx_queue);
 		write_unlock_bh(g_lock);
 		return 0;
 	}
@@ -1167,7 +1168,7 @@ ksocknal_process_receive(struct ksock_conn *conn)
 		conn->ksnc_proto->pro_unpack(&conn->ksnc_msg);
 
 		if (conn->ksnc_peer->ksnp_id.pid & LNET_PID_USERFLAG) {
-			/* Userspace peer */
+			/* Userspace peer_ni */
 			lhdr = &conn->ksnc_msg.ksm_u.lnetmsg.ksnm_hdr;
 			id = &conn->ksnc_peer->ksnp_id;
@@ -1667,7 +1668,9 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	proto = ksocknal_parse_proto_version(hello);
 	if (!proto) {
 		if (!active) {
-			/* unknown protocol from peer, tell peer my protocol */
+			/* unknown protocol from peer_ni,
			 * tell peer_ni my protocol
			 */
 			conn->ksnc_proto = &ksocknal_protocol_v3x;
 #if SOCKNAL_VERSION_DEBUG
 			if (*ksocknal_tunables.ksnd_protocol == 2)
@@ -1708,7 +1711,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	if (!active &&
 	    conn->ksnc_port > LNET_ACCEPTOR_MAX_RESERVED_PORT) {
-		/* Userspace NAL assigns peer process ID from socket */
+		/* Userspace NAL assigns peer_ni process ID from socket */
 		recv_id.pid = conn->ksnc_port | LNET_PID_USERFLAG;
 		recv_id.nid = LNET_MKNID(LNET_NIDNET(ni->ni_nid),
 					 conn->ksnc_ipaddr);
@@ -1720,7 +1723,7 @@ ksocknal_recv_hello(struct lnet_ni *ni, struct ksock_conn *conn,
 	if (!active) {
 		*peerid = recv_id;
 
-		/* peer determines type */
+		/* peer_ni determines type */
 		conn->ksnc_type = ksocknal_invert_type(hello->kshm_ctype);
 		if (conn->ksnc_type == SOCKLND_CONN_NONE) {
 			CERROR("Unexpected type %d from %s ip %pI4h\n",
@@ -1760,7 +1763,7 @@ static int
 ksocknal_connect(struct ksock_route *route)
 {
 	LIST_HEAD(zombies);
-	struct ksock_peer *peer = route->ksnr_peer;
+	struct ksock_peer *peer_ni = route->ksnr_peer;
 	int type;
 	int wanted;
 	struct socket *sock;
@@ -1781,21 +1784,21 @@ ksocknal_connect(struct ksock_route *route)
 		wanted = ksocknal_route_mask() & ~route->ksnr_connected;
 
 		/*
-		 * stop connecting if peer/route got closed under me, or
+		 * stop connecting if peer_ni/route got closed under me, or
 		 * route got connected while queued
 		 */
-		if (peer->ksnp_closing || route->ksnr_deleted ||
+		if (peer_ni->ksnp_closing || route->ksnr_deleted ||
 		    !wanted) {
 			retry_later = 0;
 			break;
 		}
 
-		/* reschedule if peer is connecting to me */
-		if (peer->ksnp_accepting > 0) {
+		/* reschedule if peer_ni is connecting to me */
+		if (peer_ni->ksnp_accepting > 0) {
 			CDEBUG(D_NET,
-			       "peer %s(%d) already connecting to me, retry later.\n",
-			       libcfs_nid2str(peer->ksnp_id.nid),
-			       peer->ksnp_accepting);
+			       "peer_ni %s(%d) already connecting to me, retry later.\n",
+			       libcfs_nid2str(peer_ni->ksnp_id.nid),
+			       peer_ni->ksnp_accepting);
 			retry_later = 1;
 		}
 
@@ -1817,21 +1820,21 @@ ksocknal_connect(struct ksock_route *route)
 		if (ktime_get_seconds() >= deadline) {
 			rc = -ETIMEDOUT;
-			lnet_connect_console_error(rc, peer->ksnp_id.nid,
+			lnet_connect_console_error(rc, peer_ni->ksnp_id.nid,
 						   route->ksnr_ipaddr,
 						   route->ksnr_port);
 			goto failed;
 		}
 
-		rc = lnet_connect(&sock, peer->ksnp_id.nid,
+		rc = lnet_connect(&sock, peer_ni->ksnp_id.nid,
 				  route->ksnr_myipaddr,
 				  route->ksnr_ipaddr, route->ksnr_port);
 		if (rc)
 			goto failed;
 
-
rc = ksocknal_create_conn(peer->ksnp_ni, route, sock, type); + rc = ksocknal_create_conn(peer_ni->ksnp_ni, route, sock, type); if (rc < 0) { - lnet_connect_console_error(rc, peer->ksnp_id.nid, + lnet_connect_console_error(rc, peer_ni->ksnp_id.nid, route->ksnr_ipaddr, route->ksnr_port); goto failed; @@ -1843,8 +1846,8 @@ ksocknal_connect(struct ksock_route *route) */ retry_later = (rc); if (retry_later) - CDEBUG(D_NET, "peer %s: conn race, retry later.\n", - libcfs_nid2str(peer->ksnp_id.nid)); + CDEBUG(D_NET, "peer_ni %s: conn race, retry later.\n", + libcfs_nid2str(peer_ni->ksnp_id.nid)); write_lock_bh(&ksocknal_data.ksnd_global_lock); } @@ -1855,10 +1858,10 @@ ksocknal_connect(struct ksock_route *route) if (retry_later) { /* * re-queue for attention; this frees me up to handle - * the peer's incoming connection request + * the peer_ni's incoming connection request */ if (rc == EALREADY || - (!rc && peer->ksnp_accepting > 0)) { + (!rc && peer_ni->ksnp_accepting > 0)) { /* * We want to introduce a delay before next * attempt to connect if we lost conn race, @@ -1895,17 +1898,17 @@ ksocknal_connect(struct ksock_route *route) LASSERT(route->ksnr_retry_interval); route->ksnr_timeout = ktime_get_seconds() + route->ksnr_retry_interval; - if (!list_empty(&peer->ksnp_tx_queue) && - !peer->ksnp_accepting && - !ksocknal_find_connecting_route_locked(peer)) { + if (!list_empty(&peer_ni->ksnp_tx_queue) && + !peer_ni->ksnp_accepting && + !ksocknal_find_connecting_route_locked(peer_ni)) { struct ksock_conn *conn; /* * ksnp_tx_queue is queued on a conn on successful * connection for V1.x and V2.x */ - if (!list_empty(&peer->ksnp_conns)) { - conn = list_entry(peer->ksnp_conns.next, + if (!list_empty(&peer_ni->ksnp_conns)) { + conn = list_entry(peer_ni->ksnp_conns.next, struct ksock_conn, ksnc_list); LASSERT(conn->ksnc_proto == &ksocknal_protocol_v3x); } @@ -1914,13 +1917,13 @@ ksocknal_connect(struct ksock_route *route) * take all the blocked packets while I've got the lock and * 
complete below... */ - list_splice_init(&peer->ksnp_tx_queue, &zombies); + list_splice_init(&peer_ni->ksnp_tx_queue, &zombies); } write_unlock_bh(&ksocknal_data.ksnd_global_lock); - ksocknal_peer_failed(peer); - ksocknal_txlist_done(peer->ksnp_ni, &zombies, 1); + ksocknal_peer_failed(peer_ni); + ksocknal_txlist_done(peer_ni->ksnp_ni, &zombies, 1); return 0; } @@ -2167,12 +2170,12 @@ ksocknal_connd(void *arg) } static struct ksock_conn * -ksocknal_find_timed_out_conn(struct ksock_peer *peer) +ksocknal_find_timed_out_conn(struct ksock_peer *peer_ni) { /* We're called with a shared lock on ksnd_global_lock */ struct ksock_conn *conn; - list_for_each_entry(conn, &peer->ksnp_conns, ksnc_list) { + list_for_each_entry(conn, &peer_ni->ksnp_conns, ksnc_list) { int error; /* Don't need the {get,put}connsock dance to deref ksnc_sock */ @@ -2189,20 +2192,20 @@ ksocknal_find_timed_out_conn(struct ksock_peer *peer) switch (error) { case ECONNRESET: CNETERR("A connection with %s (%pI4h:%d) was reset; it may have rebooted.\n", - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &conn->ksnc_ipaddr, conn->ksnc_port); break; case ETIMEDOUT: CNETERR("A connection with %s (%pI4h:%d) timed out; the network or node may be down.\n", - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &conn->ksnc_ipaddr, conn->ksnc_port); break; default: CNETERR("An unexpected network error %d occurred with %s (%pI4h:%d\n", error, - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &conn->ksnc_ipaddr, conn->ksnc_port); break; @@ -2216,7 +2219,7 @@ ksocknal_find_timed_out_conn(struct ksock_peer *peer) /* Timed out incomplete incoming message */ ksocknal_conn_addref(conn); CNETERR("Timeout receiving from %s (%pI4h:%d), state %d wanted %zd left %d\n", - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &conn->ksnc_ipaddr, conn->ksnc_port, conn->ksnc_rx_state, @@ -2234,7 +2237,7 @@ ksocknal_find_timed_out_conn(struct ksock_peer *peer) */ 
ksocknal_conn_addref(conn); CNETERR("Timeout sending data to %s (%pI4h:%d) the network or that node may be down.\n", - libcfs_id2str(peer->ksnp_id), + libcfs_id2str(peer_ni->ksnp_id), &conn->ksnc_ipaddr, conn->ksnc_port); return conn; @@ -2245,15 +2248,16 @@ ksocknal_find_timed_out_conn(struct ksock_peer *peer) } static inline void -ksocknal_flush_stale_txs(struct ksock_peer *peer) +ksocknal_flush_stale_txs(struct ksock_peer *peer_ni) { struct ksock_tx *tx; LIST_HEAD(stale_txs); write_lock_bh(&ksocknal_data.ksnd_global_lock); - while (!list_empty(&peer->ksnp_tx_queue)) { - tx = list_entry(peer->ksnp_tx_queue.next, struct ksock_tx, tx_list); + while (!list_empty(&peer_ni->ksnp_tx_queue)) { + tx = list_entry(peer_ni->ksnp_tx_queue.next, struct ksock_tx, + tx_list); if (ktime_get_seconds() < tx->tx_deadline) break; @@ -2264,11 +2268,11 @@ ksocknal_flush_stale_txs(struct ksock_peer *peer) write_unlock_bh(&ksocknal_data.ksnd_global_lock); - ksocknal_txlist_done(peer->ksnp_ni, &stale_txs, 1); + ksocknal_txlist_done(peer_ni->ksnp_ni, &stale_txs, 1); } static int -ksocknal_send_keepalive_locked(struct ksock_peer *peer) +ksocknal_send_keepalive_locked(struct ksock_peer *peer_ni) __must_hold(&ksocknal_data.ksnd_global_lock) { struct ksock_sched *sched; @@ -2276,27 +2280,27 @@ ksocknal_send_keepalive_locked(struct ksock_peer *peer) struct ksock_tx *tx; /* last_alive will be updated by create_conn */ - if (list_empty(&peer->ksnp_conns)) + if (list_empty(&peer_ni->ksnp_conns)) return 0; - if (peer->ksnp_proto != &ksocknal_protocol_v3x) + if (peer_ni->ksnp_proto != &ksocknal_protocol_v3x) return 0; if (*ksocknal_tunables.ksnd_keepalive <= 0 || - ktime_get_seconds() < peer->ksnp_last_alive + + ktime_get_seconds() < peer_ni->ksnp_last_alive + *ksocknal_tunables.ksnd_keepalive) return 0; - if (ktime_get_seconds() < peer->ksnp_send_keepalive) + if (ktime_get_seconds() < peer_ni->ksnp_send_keepalive) return 0; /* * retry 10 secs later, so we wouldn't put pressure - * on this peer if 
we failed to send keepalive this time + * on this peer_ni if we failed to send keepalive this time */ - peer->ksnp_send_keepalive = ktime_get_seconds() + 10; + peer_ni->ksnp_send_keepalive = ktime_get_seconds() + 10; - conn = ksocknal_find_conn_locked(peer, NULL, 1); + conn = ksocknal_find_conn_locked(peer_ni, NULL, 1); if (conn) { sched = conn->ksnc_scheduler; @@ -2319,7 +2323,7 @@ ksocknal_send_keepalive_locked(struct ksock_peer *peer) return -ENOMEM; } - if (!ksocknal_launch_packet(peer->ksnp_ni, tx, peer->ksnp_id)) { + if (!ksocknal_launch_packet(peer_ni->ksnp_ni, tx, peer_ni->ksnp_id)) { read_lock(&ksocknal_data.ksnd_global_lock); return 1; } @@ -2334,7 +2338,7 @@ static void ksocknal_check_peer_timeouts(int idx) { struct list_head *peers = &ksocknal_data.ksnd_peers[idx]; - struct ksock_peer *peer; + struct ksock_peer *peer_ni; struct ksock_conn *conn; struct ksock_tx *tx; @@ -2346,18 +2350,18 @@ ksocknal_check_peer_timeouts(int idx) */ read_lock(&ksocknal_data.ksnd_global_lock); - list_for_each_entry(peer, peers, ksnp_list) { + list_for_each_entry(peer_ni, peers, ksnp_list) { struct ksock_tx *tx_stale; time64_t deadline = 0; int resid = 0; int n = 0; - if (ksocknal_send_keepalive_locked(peer)) { + if (ksocknal_send_keepalive_locked(peer_ni)) { read_unlock(&ksocknal_data.ksnd_global_lock); goto again; } - conn = ksocknal_find_timed_out_conn(peer); + conn = ksocknal_find_timed_out_conn(peer_ni); if (conn) { read_unlock(&ksocknal_data.ksnd_global_lock); @@ -2366,7 +2370,7 @@ ksocknal_check_peer_timeouts(int idx) /* * NB we won't find this one again, but we can't - * just proceed with the next peer, since we dropped + * just proceed with the next peer_ni, since we dropped * ksnd_global_lock and it might be dead already! 
*/ ksocknal_conn_decref(conn); @@ -2377,27 +2381,28 @@ ksocknal_check_peer_timeouts(int idx) * we can't process stale txs right here because we're * holding only shared lock */ - if (!list_empty(&peer->ksnp_tx_queue)) { - tx = list_entry(peer->ksnp_tx_queue.next, + if (!list_empty(&peer_ni->ksnp_tx_queue)) { + tx = list_entry(peer_ni->ksnp_tx_queue.next, struct ksock_tx, tx_list); if (ktime_get_seconds() >= tx->tx_deadline) { - ksocknal_peer_addref(peer); + ksocknal_peer_addref(peer_ni); read_unlock(&ksocknal_data.ksnd_global_lock); - ksocknal_flush_stale_txs(peer); + ksocknal_flush_stale_txs(peer_ni); - ksocknal_peer_decref(peer); + ksocknal_peer_decref(peer_ni); goto again; } } - if (list_empty(&peer->ksnp_zc_req_list)) + if (list_empty(&peer_ni->ksnp_zc_req_list)) continue; tx_stale = NULL; - spin_lock(&peer->ksnp_lock); - list_for_each_entry(tx, &peer->ksnp_zc_req_list, tx_zc_list) { + spin_lock(&peer_ni->ksnp_lock); + list_for_each_entry(tx, &peer_ni->ksnp_zc_req_list, + tx_zc_list) { if (ktime_get_seconds() < tx->tx_deadline) break; /* ignore the TX if connection is being closed */ @@ -2409,7 +2414,7 @@ ksocknal_check_peer_timeouts(int idx) } if (!tx_stale) { - spin_unlock(&peer->ksnp_lock); + spin_unlock(&peer_ni->ksnp_lock); continue; } @@ -2418,11 +2423,11 @@ ksocknal_check_peer_timeouts(int idx) conn = tx_stale->tx_conn; ksocknal_conn_addref(conn); - spin_unlock(&peer->ksnp_lock); + spin_unlock(&peer_ni->ksnp_lock); read_unlock(&ksocknal_data.ksnd_global_lock); - CERROR("Total %d stale ZC_REQs for peer %s detected; the oldest(%p) timed out %lld secs ago, resid: %d, wmem: %d\n", - n, libcfs_nid2str(peer->ksnp_id.nid), tx_stale, + CERROR("Total %d stale ZC_REQs for peer_ni %s detected; the oldest(%p) timed out %lld secs ago, resid: %d, wmem: %d\n", + n, libcfs_nid2str(peer_ni->ksnp_id.nid), tx_stale, ktime_get_seconds() - deadline, resid, conn->ksnc_sock->sk->sk_wmem_queued); diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_lib.c 
b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_lib.c index 93a02cd6b6b5..33847b9615ed 100644 --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_lib.c +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_lib.c @@ -44,7 +44,7 @@ ksocknal_lib_get_conn_addrs(struct ksock_conn *conn) LASSERT(!conn->ksnc_closing); if (rc) { - CERROR("Error %d getting sock peer IP\n", rc); + CERROR("Error %d getting sock peer_ni IP\n", rc); return rc; } @@ -157,7 +157,7 @@ ksocknal_lib_eager_ack(struct ksock_conn *conn) * Remind the socket to ACK eagerly. If I don't, the socket might * think I'm about to send something it could piggy-back the ACK * on, introducing delay in completing zero-copy sends in my - * peer. + * peer_ni. */ kernel_setsockopt(sock, SOL_TCP, TCP_QUICKACK, (char *)&opt, sizeof(opt)); diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_proto.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_proto.c index abfaf5701758..8c10eda382b7 100644 --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_proto.c +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_proto.c @@ -367,14 +367,14 @@ ksocknal_match_tx_v3(struct ksock_conn *conn, struct ksock_tx *tx, int nonblk) static int ksocknal_handle_zcreq(struct ksock_conn *c, __u64 cookie, int remote) { - struct ksock_peer *peer = c->ksnc_peer; + struct ksock_peer *peer_ni = c->ksnc_peer; struct ksock_conn *conn; struct ksock_tx *tx; int rc; read_lock(&ksocknal_data.ksnd_global_lock); - conn = ksocknal_find_conn_locked(peer, NULL, !!remote); + conn = ksocknal_find_conn_locked(peer_ni, NULL, !!remote); if (conn) { struct ksock_sched *sched = conn->ksnc_scheduler; @@ -399,7 +399,7 @@ ksocknal_handle_zcreq(struct ksock_conn *c, __u64 cookie, int remote) if (!tx) return -ENOMEM; - rc = ksocknal_launch_packet(peer->ksnp_ni, tx, peer->ksnp_id); + rc = ksocknal_launch_packet(peer_ni->ksnp_ni, tx, peer_ni->ksnp_id); if (!rc) return 0; @@ -411,7 +411,7 @@ ksocknal_handle_zcreq(struct ksock_conn *c, __u64 
cookie, int remote) static int ksocknal_handle_zcack(struct ksock_conn *conn, __u64 cookie1, __u64 cookie2) { - struct ksock_peer *peer = conn->ksnc_peer; + struct ksock_peer *peer_ni = conn->ksnc_peer; struct ksock_tx *tx; struct ksock_tx *tmp; LIST_HEAD(zlist); @@ -428,9 +428,9 @@ ksocknal_handle_zcack(struct ksock_conn *conn, __u64 cookie1, __u64 cookie2) return count == 1 ? 0 : -EPROTO; } - spin_lock(&peer->ksnp_lock); + spin_lock(&peer_ni->ksnp_lock); - list_for_each_entry_safe(tx, tmp, &peer->ksnp_zc_req_list, + list_for_each_entry_safe(tx, tmp, &peer_ni->ksnp_zc_req_list, tx_zc_list) { __u64 c = tx->tx_msg.ksm_zc_cookies[0]; @@ -445,7 +445,7 @@ ksocknal_handle_zcack(struct ksock_conn *conn, __u64 cookie1, __u64 cookie2) } } - spin_unlock(&peer->ksnp_lock); + spin_unlock(&peer_ni->ksnp_lock); while (!list_empty(&zlist)) { tx = list_entry(zlist.next, struct ksock_tx, tx_zc_list);

From patchwork Tue Sep 25 01:07:15 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613183
From: NeilBrown
To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763552.32103.13888609534957205718.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 17/34] LU-7734 lnet: Add peer_ni and NI stats for DLC
Cc: Lustre Development List

From: Doug Oucharek

This patch adds three stats to the peer_ni and NI structures: send_count, recv_count, and drop_count. These stats get printed when you do an "lnetctl net show -v" (for NI) and "lnetctl peer show" (for peer_ni).
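The counter triple added by this patch follows a common kernel pattern: bump atomics on the hot send/receive/drop paths without taking a lock, and copy them into a plain-integer snapshot struct when userspace asks over ioctl. The sketch below is a hypothetical user-space C11 model of that pattern (names like `element_stats` are illustrative; it is not the LNet code, which uses the kernel's `atomic_t` and `atomic_read()`):

```c
#include <stdatomic.h>

/* Toy mirror of struct lnet_element_stats: three counters that can be
 * incremented from concurrent paths without extra locking. */
struct element_stats {
	atomic_int send_count;
	atomic_int recv_count;
	atomic_int drop_count;
};

/* Toy mirror of struct lnet_ioctl_element_stats: plain copies handed
 * out to a tool such as lnetctl. */
struct element_stats_snapshot {
	unsigned int send_count;
	unsigned int recv_count;
	unsigned int drop_count;
};

static void stats_note_send(struct element_stats *s)
{
	atomic_fetch_add(&s->send_count, 1);
}

static void stats_note_recv(struct element_stats *s)
{
	atomic_fetch_add(&s->recv_count, 1);
}

static void stats_note_drop(struct element_stats *s)
{
	atomic_fetch_add(&s->drop_count, 1);
}

/* Read each counter individually, as lnet_get_peer_info() does with
 * atomic_read(); the snapshot is per-counter consistent, not global. */
static struct element_stats_snapshot stats_snapshot(struct element_stats *s)
{
	struct element_stats_snapshot out = {
		.send_count = (unsigned int)atomic_load(&s->send_count),
		.recv_count = (unsigned int)atomic_load(&s->recv_count),
		.drop_count = (unsigned int)atomic_load(&s->drop_count),
	};
	return out;
}
```

Note that, as in the patch, each counter is read independently; a snapshot taken while traffic is flowing may mix counts from slightly different instants, which is acceptable for statistics.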
Signed-off-by: Doug Oucharek Change-Id: Ic41c88cbc68dba677151d87a1fab53a48d36ea29 Reviewed-on: http://review.whamcloud.com/20170 Reviewed-by: Amir Shehata Tested-by: Amir Shehata Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 3 +- .../staging/lustre/include/linux/lnet/lib-types.h | 11 +++++++ .../lustre/include/uapi/linux/lnet/lnet-dlc.h | 6 ++++ drivers/staging/lustre/lnet/lnet/api-ni.c | 32 +++++++++++++++----- drivers/staging/lustre/lnet/lnet/lib-move.c | 4 +++ drivers/staging/lustre/lnet/lnet/lib-msg.c | 8 +++++ drivers/staging/lustre/lnet/lnet/peer.c | 7 ++++ 7 files changed, 61 insertions(+), 10 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 08fc4abad332..53a5ee8632a6 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -664,7 +664,8 @@ bool lnet_peer_is_ni_pref_locked(struct lnet_peer_ni *lpni, int lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid, bool mr); int lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid); int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, - struct lnet_peer_ni_credit_info *peer_ni_info); + struct lnet_peer_ni_credit_info *peer_ni_info, + struct lnet_ioctl_element_stats *peer_ni_stats); int lnet_get_peer_ni_info(__u32 peer_index, __u64 *nid, char alivness[LNET_MAX_STR_LEN], __u32 *cpt_iter, __u32 *refcount, diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index dbcd9b3da914..e17ca716dce1 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -271,6 +271,12 @@ enum lnet_ni_state { LNET_NI_STATE_DELETING }; +struct lnet_element_stats { + atomic_t send_count; + atomic_t recv_count; + atomic_t drop_count; +}; + struct lnet_net { /* chain on the 
ln_nets */ struct list_head net_list; @@ -348,6 +354,9 @@ struct lnet_ni { /* lnd tunables set explicitly */ bool ni_lnd_tunables_set; + /* NI statistics */ + struct lnet_element_stats ni_stats; + /* physical device CPT */ int dev_cpt; @@ -403,6 +412,8 @@ struct lnet_peer_ni { struct list_head lpni_rtrq; /* chain on router list */ struct list_head lpni_rtr_list; + /* statistics kept on each peer NI */ + struct lnet_element_stats lpni_stats; /* # tx credits available */ int lpni_txcredits; struct lnet_peer_net *lpni_peer_net; diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h index 8be322dd4bd2..b31b69c25ef2 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h @@ -142,6 +142,12 @@ struct lnet_ioctl_config_data { char cfg_bulk[0]; }; +struct lnet_ioctl_element_stats { + u32 send_count; + u32 recv_count; + u32 drop_count; +}; + /* * lnet_ioctl_config_ni * This structure describes an NI configuration. 
There are multiple components diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 2d5d657de058..a01858374211 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1881,6 +1881,7 @@ static int lnet_handle_dbg_task(struct lnet_ioctl_dbg *dbg, static void lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_ni *cfg_ni, struct lnet_ioctl_config_lnd_tunables *tun, + struct lnet_ioctl_element_stats *stats, __u32 tun_size) { size_t min_size = 0; @@ -1906,6 +1907,11 @@ lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_ni *cfg_ni, memcpy(&tun->lt_cmn, &ni->ni_net->net_tunables, sizeof(tun->lt_cmn)); + if (stats) { + stats->send_count = atomic_read(&ni->ni_stats.send_count); + stats->recv_count = atomic_read(&ni->ni_stats.recv_count); + } + /* * tun->lt_tun will always be present, but in order to be * backwards compatible, we need to deal with the cases when @@ -2102,13 +2108,14 @@ lnet_get_net_config(struct lnet_ioctl_config_data *config) int lnet_get_ni_config(struct lnet_ioctl_config_ni *cfg_ni, struct lnet_ioctl_config_lnd_tunables *tun, + struct lnet_ioctl_element_stats *stats, __u32 tun_size) { struct lnet_ni *ni; int cpt; int rc = -ENOENT; - if (!cfg_ni || !tun) + if (!cfg_ni || !tun || !stats) return -EINVAL; cpt = lnet_net_lock_current(); @@ -2118,7 +2125,7 @@ lnet_get_ni_config(struct lnet_ioctl_config_ni *cfg_ni, if (ni) { rc = 0; lnet_ni_lock(ni); - lnet_fill_ni_info(ni, cfg_ni, tun, tun_size); + lnet_fill_ni_info(ni, cfg_ni, tun, stats, tun_size); lnet_ni_unlock(ni); } @@ -2583,20 +2590,24 @@ LNetCtl(unsigned int cmd, void *arg) case IOC_LIBCFS_GET_LOCAL_NI: { struct lnet_ioctl_config_ni *cfg_ni; struct lnet_ioctl_config_lnd_tunables *tun = NULL; + struct lnet_ioctl_element_stats *stats; __u32 tun_size; cfg_ni = arg; /* get the tunables if they are available */ if (cfg_ni->lic_cfg_hdr.ioc_len < - sizeof(*cfg_ni) + sizeof(*tun)) + 
sizeof(*cfg_ni) + sizeof(*stats) + sizeof(*tun)) return -EINVAL; + stats = (struct lnet_ioctl_element_stats *) + cfg_ni->lic_bulk; tun = (struct lnet_ioctl_config_lnd_tunables *) - cfg_ni->lic_bulk; + (cfg_ni->lic_bulk + sizeof(*stats)); - tun_size = cfg_ni->lic_cfg_hdr.ioc_len - sizeof(*cfg_ni); + tun_size = cfg_ni->lic_cfg_hdr.ioc_len - sizeof(*cfg_ni) - + sizeof(*stats); - return lnet_get_ni_config(cfg_ni, tun, tun_size); + return lnet_get_ni_config(cfg_ni, tun, stats, tun_size); } case IOC_LIBCFS_GET_NET: { @@ -2724,15 +2735,20 @@ LNetCtl(unsigned int cmd, void *arg) case IOC_LIBCFS_GET_PEER_NI: { struct lnet_ioctl_peer_cfg *cfg = arg; struct lnet_peer_ni_credit_info *lpni_cri; - size_t total = sizeof(*cfg) + sizeof(*lpni_cri); + struct lnet_ioctl_element_stats *lpni_stats; + size_t total = sizeof(*cfg) + sizeof(*lpni_cri) + + sizeof(*lpni_stats); if (cfg->prcfg_hdr.ioc_len < total) return -EINVAL; lpni_cri = (struct lnet_peer_ni_credit_info *)cfg->prcfg_bulk; + lpni_stats = (struct lnet_ioctl_element_stats *) + (cfg->prcfg_bulk + sizeof(*lpni_cri)); return lnet_get_peer_info(cfg->prcfg_idx, &cfg->prcfg_key_nid, - &cfg->prcfg_cfg_nid, lpni_cri); + &cfg->prcfg_cfg_nid, lpni_cri, + lpni_stats); } case IOC_LIBCFS_NOTIFY_ROUTER: { diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 6c5bb953a6d3..3f28f3b87176 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -614,6 +614,10 @@ lnet_post_send_locked(struct lnet_msg *msg, int do_send) the_lnet.ln_counters[cpt]->drop_count++; the_lnet.ln_counters[cpt]->drop_length += msg->msg_len; lnet_net_unlock(cpt); + if (msg->msg_txpeer) + atomic_inc(&msg->msg_txpeer->lpni_stats.drop_count); + if (msg->msg_txni) + atomic_inc(&msg->msg_txni->ni_stats.drop_count); CNETERR("Dropping message for %s: peer not alive\n", libcfs_id2str(msg->msg_target)); diff --git a/drivers/staging/lustre/lnet/lnet/lib-msg.c 
b/drivers/staging/lustre/lnet/lnet/lib-msg.c index 8628899e1631..aa28b6a12f81 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-msg.c +++ b/drivers/staging/lustre/lnet/lnet/lib-msg.c @@ -215,6 +215,10 @@ lnet_msg_decommit_tx(struct lnet_msg *msg, int status) } counters->send_count++; + if (msg->msg_txpeer) + atomic_inc(&msg->msg_txpeer->lpni_stats.send_count); + if (msg->msg_txni) + atomic_inc(&msg->msg_txni->ni_stats.send_count); out: lnet_return_tx_credits_locked(msg); msg->msg_tx_committed = 0; @@ -270,6 +274,10 @@ lnet_msg_decommit_rx(struct lnet_msg *msg, int status) } counters->recv_count++; + if (msg->msg_rxpeer) + atomic_inc(&msg->msg_rxpeer->lpni_stats.recv_count); + if (msg->msg_rxni) + atomic_inc(&msg->msg_rxni->ni_stats.recv_count); if (ev->type == LNET_EVENT_PUT || ev->type == LNET_EVENT_REPLY) counters->recv_length += msg->msg_wanted; diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index ecbd276703f1..f626a3fcf00e 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -973,7 +973,8 @@ lnet_get_peer_ni_info(__u32 peer_index, __u64 *nid, } int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, - struct lnet_peer_ni_credit_info *peer_ni_info) + struct lnet_peer_ni_credit_info *peer_ni_info, + struct lnet_ioctl_element_stats *peer_ni_stats) { struct lnet_peer_ni *lpni = NULL; struct lnet_peer_net *lpn = NULL; @@ -1000,5 +1001,9 @@ int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, peer_ni_info->cr_peer_min_rtr_credits = lpni->lpni_mintxcredits; peer_ni_info->cr_peer_tx_qnob = lpni->lpni_txqnob; + peer_ni_stats->send_count = atomic_read(&lpni->lpni_stats.send_count); + peer_ni_stats->recv_count = atomic_read(&lpni->lpni_stats.recv_count); + peer_ni_stats->drop_count = atomic_read(&lpni->lpni_stats.drop_count); + return 0; }

From patchwork Tue Sep 25 01:07:15 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613185
From: NeilBrown
To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763556.32103.9233364631803474395.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 18/34] LU-7734 lnet: peer/peer_ni handling adjustments
Cc: Lustre Development List

From: Amir Shehata

A peer can be added by specifying a list of NIDs. The first NID shall be used as the primary NID; the rest of the NIDs will be added under the primary NID.

A peer can also be added by explicitly specifying the key NID and then adding a set of other NIDs, all done through one API call.

If a key NID already exists but is not an MR NI, then adding that key NID from DLC shall convert that NI to an MR NI.

If a key NID already exists and is an MR NI, then re-adding the key NID shall have no effect.

If a key NID already exists as part of another peer, then adding that NID as part of another peer, or as a primary NID, shall fail.

If a NID is being added to a peer NI and that NID is non-MR, then that NID is moved under the peer and made MR capable.

If a NID is being added to a peer and that NID is an MR NID that is part of another peer, then the operation shall fail.

If a NID is being added to a peer and it is already part of that peer, then the operation is a no-op.

Moreover, the code is structured with the later addition of Dynamic Discovery in mind.
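The rules in the commit message reduce to a small decision table keyed on who (if anyone) already owns the NID being added, and whether that owner is multi-rail. The following is a hypothetical toy model of that table, not the actual lnet_add_peer_ni_to_peer() implementation; all names (`toy_peer`, `toy_add_nid`) are invented for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: each NID belongs to at most one peer, and a peer is
 * either multi-rail (MR) capable or not. */
struct toy_peer {
	int id;
	bool mr;
};

enum { TOY_OK = 0, TOY_NOOP = 1, TOY_EEXIST = -17 };

/* Decide what adding a NID to `peer` should do. `nid_owner` is the
 * peer that already owns the NID, or NULL if the NID is new. */
static int toy_add_nid(struct toy_peer *peer, struct toy_peer *nid_owner)
{
	if (!nid_owner)
		return TOY_OK;     /* brand-new NID: just attach it */
	if (nid_owner == peer)
		return TOY_NOOP;   /* already part of this peer: no-op */
	if (nid_owner->mr)
		return TOY_EEXIST; /* MR NID owned by another peer: fail */
	/* Non-MR NID owned by another peer: move it under `peer`
	 * and make the result MR capable. */
	peer->mr = true;
	return TOY_OK;
}
```

The real code must additionally juggle per-CPT hash tables, reference counts, and locking, which is where most of the 900-line peer.c churn in this patch comes from.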
Signed-off-by: Amir Shehata Change-Id: I71f740192a31ae00f83014ca3e9e06b61ae4ecd5 Reviewed-on: http://review.whamcloud.com/20531 Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-lnet.h | 9 .../staging/lustre/include/linux/lnet/lib-types.h | 10 drivers/staging/lustre/lnet/lnet/api-ni.c | 77 +- drivers/staging/lustre/lnet/lnet/lib-move.c | 32 - drivers/staging/lustre/lnet/lnet/peer.c | 907 +++++++++++--------- drivers/staging/lustre/lnet/lnet/router.c | 8 6 files changed, 600 insertions(+), 443 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h index 53a5ee8632a6..55bcd17cd4dc 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h @@ -647,13 +647,12 @@ u32 lnet_get_dlc_seq_locked(void); struct lnet_peer_ni *lnet_get_next_peer_ni_locked(struct lnet_peer *peer, struct lnet_peer_net *peer_net, struct lnet_peer_ni *prev); -int lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, - struct lnet_peer **peer); -int lnet_nid2peerni_locked(struct lnet_peer_ni **lpp, lnet_nid_t nid, int cpt); +struct lnet_peer *lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt); +struct lnet_peer_ni *lnet_nid2peerni_locked(lnet_nid_t nid, int cpt); struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid); void lnet_peer_net_added(struct lnet_net *net); lnet_nid_t lnet_peer_primary_nid(lnet_nid_t nid); -void lnet_peer_tables_cleanup(struct lnet_ni *ni); +void lnet_peer_tables_cleanup(struct lnet_net *net); void lnet_peer_uninit(void); int lnet_peer_tables_create(void); void lnet_debug_peer(lnet_nid_t nid); @@ -664,7 +663,7 @@ bool lnet_peer_is_ni_pref_locked(struct lnet_peer_ni *lpni, int lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid, bool mr); int lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid); int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, - 
struct lnet_peer_ni_credit_info *peer_ni_info, + bool *mr, struct lnet_peer_ni_credit_info *peer_ni_info, struct lnet_ioctl_element_stats *peer_ni_stats); int lnet_get_peer_ni_info(__u32 peer_index, __u64 *nid, char alivness[LNET_MAX_STR_LEN], diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index e17ca716dce1..71ec0eaf8200 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -281,9 +281,9 @@ struct lnet_net { /* chain on the ln_nets */ struct list_head net_list; - /* net ID, which is compoed of + /* net ID, which is composed of * (net_type << 16) | net_num. - * net_type can be one of the enumarated types defined in + * net_type can be one of the enumerated types defined in * lnet/include/lnet/nidstr.h */ __u32 net_id; @@ -513,11 +513,13 @@ struct lnet_peer_table { /* /proc validity stamp */ int pt_version; /* # peers extant */ - int pt_number; + atomic_t pt_number; /* # zombies to go to deathrow (and not there yet) */ int pt_zombies; /* zombie peers */ - struct list_head pt_deathrow; + struct list_head pt_zombie_list; + /* protect list and count */ + spinlock_t pt_zombie_lock; /* NID->peer hash */ struct list_head *pt_hash; }; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index a01858374211..d3db4853c690 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1229,9 +1229,6 @@ lnet_shutdown_lndni(struct lnet_ni *ni) for (i = 0; i < the_lnet.ln_nportals; i++) lnet_clear_lazy_portal(ni, i, "Shutting down NI"); - /* Do peer table cleanup for this ni */ - lnet_peer_tables_cleanup(ni); - lnet_net_lock(LNET_LOCK_EX); lnet_clear_zombies_nis_locked(net); lnet_net_unlock(LNET_LOCK_EX); @@ -1254,6 +1251,12 @@ lnet_shutdown_lndnet(struct lnet_net *net) lnet_net_lock(LNET_LOCK_EX); } + lnet_net_unlock(LNET_LOCK_EX); + + /* Do 
peer table cleanup for this net */ + lnet_peer_tables_cleanup(net); + + lnet_net_lock(LNET_LOCK_EX); /* * decrement ref count on lnd only when the entire network goes * away @@ -2580,12 +2583,15 @@ LNetCtl(unsigned int cmd, void *arg) if (config->cfg_hdr.ioc_len < sizeof(*config)) return -EINVAL; - return lnet_get_route(config->cfg_count, - &config->cfg_net, - &config->cfg_config_u.cfg_route.rtr_hop, - &config->cfg_nid, - &config->cfg_config_u.cfg_route.rtr_flags, - &config->cfg_config_u.cfg_route.rtr_priority); + mutex_lock(&the_lnet.ln_api_mutex); + rc = lnet_get_route(config->cfg_count, + &config->cfg_net, + &config->cfg_config_u.cfg_route.rtr_hop, + &config->cfg_nid, + &config->cfg_config_u.cfg_route.rtr_flags, + &config->cfg_config_u.cfg_route.rtr_priority); + mutex_unlock(&the_lnet.ln_api_mutex); + return rc; case IOC_LIBCFS_GET_LOCAL_NI: { struct lnet_ioctl_config_ni *cfg_ni; @@ -2607,7 +2613,10 @@ LNetCtl(unsigned int cmd, void *arg) tun_size = cfg_ni->lic_cfg_hdr.ioc_len - sizeof(*cfg_ni) - sizeof(*stats); - return lnet_get_ni_config(cfg_ni, tun, stats, tun_size); + mutex_lock(&the_lnet.ln_api_mutex); + rc = lnet_get_ni_config(cfg_ni, tun, stats, tun_size); + mutex_unlock(&the_lnet.ln_api_mutex); + return rc; } case IOC_LIBCFS_GET_NET: { @@ -2618,7 +2627,10 @@ LNetCtl(unsigned int cmd, void *arg) if (config->cfg_hdr.ioc_len < total) return -EINVAL; - return lnet_get_net_config(config); + mutex_lock(&the_lnet.ln_api_mutex); + rc = lnet_get_net_config(config); + mutex_unlock(&the_lnet.ln_api_mutex); + return rc; } case IOC_LIBCFS_GET_LNET_STATS: { @@ -2627,7 +2639,9 @@ LNetCtl(unsigned int cmd, void *arg) if (lnet_stats->st_hdr.ioc_len < sizeof(*lnet_stats)) return -EINVAL; + mutex_lock(&the_lnet.ln_api_mutex); lnet_counters_get(&lnet_stats->st_cntrs); + mutex_unlock(&the_lnet.ln_api_mutex); return 0; } @@ -2666,7 +2680,9 @@ LNetCtl(unsigned int cmd, void *arg) numa = arg; if (numa->nr_hdr.ioc_len != sizeof(*numa)) return -EINVAL; + 
mutex_lock(&the_lnet.ln_api_mutex); lnet_numa_range = numa->nr_range; + mutex_unlock(&the_lnet.ln_api_mutex); return 0; } @@ -2690,7 +2706,11 @@ LNetCtl(unsigned int cmd, void *arg) return -EINVAL; pool_cfg = (struct lnet_ioctl_pool_cfg *)config->cfg_bulk; - return lnet_get_rtr_pool_cfg(config->cfg_count, pool_cfg); + + mutex_lock(&the_lnet.ln_api_mutex); + rc = lnet_get_rtr_pool_cfg(config->cfg_count, pool_cfg); + mutex_unlock(&the_lnet.ln_api_mutex); + return rc; } case IOC_LIBCFS_ADD_PEER_NI: { @@ -2699,9 +2719,13 @@ LNetCtl(unsigned int cmd, void *arg) if (cfg->prcfg_hdr.ioc_len < sizeof(*cfg)) return -EINVAL; - return lnet_add_peer_ni_to_peer(cfg->prcfg_key_nid, - cfg->prcfg_cfg_nid, - cfg->prcfg_mr); + mutex_lock(&the_lnet.ln_api_mutex); + lnet_incr_dlc_seq(); + rc = lnet_add_peer_ni_to_peer(cfg->prcfg_key_nid, + cfg->prcfg_cfg_nid, + cfg->prcfg_mr); + mutex_unlock(&the_lnet.ln_api_mutex); + return rc; } case IOC_LIBCFS_DEL_PEER_NI: { @@ -2710,8 +2734,12 @@ LNetCtl(unsigned int cmd, void *arg) if (cfg->prcfg_hdr.ioc_len < sizeof(*cfg)) return -EINVAL; - return lnet_del_peer_ni_from_peer(cfg->prcfg_key_nid, - cfg->prcfg_cfg_nid); + mutex_lock(&the_lnet.ln_api_mutex); + lnet_incr_dlc_seq(); + rc = lnet_del_peer_ni_from_peer(cfg->prcfg_key_nid, + cfg->prcfg_cfg_nid); + mutex_unlock(&the_lnet.ln_api_mutex); + return rc; } case IOC_LIBCFS_GET_PEER_INFO: { @@ -2720,7 +2748,9 @@ LNetCtl(unsigned int cmd, void *arg) if (peer_info->pr_hdr.ioc_len < sizeof(*peer_info)) return -EINVAL; - return lnet_get_peer_ni_info(peer_info->pr_count, + mutex_lock(&the_lnet.ln_api_mutex); + rc = lnet_get_peer_ni_info( + peer_info->pr_count, &peer_info->pr_nid, peer_info->pr_lnd_u.pr_peer_credits.cr_aliveness, &peer_info->pr_lnd_u.pr_peer_credits.cr_ncpt, @@ -2730,6 +2760,8 @@ LNetCtl(unsigned int cmd, void *arg) &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_rtr_credits, &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_min_rtr_credits, &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_tx_qnob); 
+ mutex_unlock(&the_lnet.ln_api_mutex); + return rc; } case IOC_LIBCFS_GET_PEER_NI: { @@ -2746,9 +2778,12 @@ LNetCtl(unsigned int cmd, void *arg) lpni_stats = (struct lnet_ioctl_element_stats *) (cfg->prcfg_bulk + sizeof(*lpni_cri)); - return lnet_get_peer_info(cfg->prcfg_idx, &cfg->prcfg_key_nid, - &cfg->prcfg_cfg_nid, lpni_cri, - lpni_stats); + mutex_lock(&the_lnet.ln_api_mutex); + rc = lnet_get_peer_info(cfg->prcfg_idx, &cfg->prcfg_key_nid, + &cfg->prcfg_cfg_nid, &cfg->prcfg_mr, + lpni_cri, lpni_stats); + mutex_unlock(&the_lnet.ln_api_mutex); + return rc; } case IOC_LIBCFS_NOTIFY_ROUTER: { diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 3f28f3b87176..5d9acce26287 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1156,10 +1156,10 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, lpni = NULL; seq = lnet_get_dlc_seq_locked(); - rc = lnet_find_or_create_peer_locked(dst_nid, cpt, &peer); - if (rc != 0) { + peer = lnet_find_or_create_peer_locked(dst_nid, cpt); + if (IS_ERR(peer)) { lnet_net_unlock(cpt); - return rc; + return PTR_ERR(peer); } /* If peer is not healthy then can not send anything to it */ @@ -1364,13 +1364,6 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, best_credits = ni->ni_tx_queues[cpt]->tq_credits; } } - /* - * Now that we selected the NI to use increment its sequence - * number so the Round Robin algorithm will detect that it has - * been used and pick the next NI. - */ - best_ni->ni_seq++; - /* * if the peer is not MR capable, then we should always send to it * using the first NI in the NET we determined. @@ -1385,6 +1378,13 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, return -EINVAL; } + /* + * Now that we selected the NI to use increment its sequence + * number so the Round Robin algorithm will detect that it has + * been used and pick the next NI. 
+ */ + best_ni->ni_seq++; + if (routing) goto send; @@ -1452,7 +1452,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, } CDEBUG(D_NET, "Best route to %s via %s for %s %d\n", - libcfs_nid2str(lpni->lpni_nid), + libcfs_nid2str(dst_nid), libcfs_nid2str(best_gw->lpni_nid), lnet_msgtyp2str(msg->msg_type), msg->msg_len); @@ -2065,6 +2065,7 @@ lnet_parse(struct lnet_ni *ni, struct lnet_hdr *hdr, lnet_nid_t from_nid, lnet_pid_t dest_pid; lnet_nid_t dest_nid; lnet_nid_t src_nid; + struct lnet_peer_ni *lpni; __u32 payload_length; __u32 type; @@ -2226,18 +2227,19 @@ lnet_parse(struct lnet_ni *ni, struct lnet_hdr *hdr, lnet_nid_t from_nid, msg->msg_initiator = lnet_peer_primary_nid(src_nid); lnet_net_lock(cpt); - rc = lnet_nid2peerni_locked(&msg->msg_rxpeer, from_nid, cpt); - if (rc) { + lpni = lnet_nid2peerni_locked(from_nid, cpt); + if (IS_ERR(lpni)) { lnet_net_unlock(cpt); - CERROR("%s, src %s: Dropping %s (error %d looking up sender)\n", + CERROR("%s, src %s: Dropping %s (error %ld looking up sender)\n", libcfs_nid2str(from_nid), libcfs_nid2str(src_nid), - lnet_msgtyp2str(type), rc); + lnet_msgtyp2str(type), PTR_ERR(lpni)); kfree(msg); - if (rc == -ESHUTDOWN) + if (PTR_ERR(lpni) == -ESHUTDOWN) /* We are shutting down. 
Don't do anything more */ return 0; goto drop; } + msg->msg_rxpeer = lpni; msg->msg_rxni = ni; lnet_ni_addref_locked(ni, cpt); diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index f626a3fcf00e..c2a04526a59a 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -84,6 +84,8 @@ lnet_peer_tables_destroy(void) if (!hash) /* not initialized */ break; + LASSERT(list_empty(&ptable->pt_zombie_list)); + ptable->pt_hash = NULL; for (j = 0; j < LNET_PEER_HASH_SIZE; j++) LASSERT(list_empty(&hash[j])); @@ -95,27 +97,179 @@ lnet_peer_tables_destroy(void) the_lnet.ln_peer_tables = NULL; } -void lnet_peer_uninit(void) +static struct lnet_peer_ni * +lnet_peer_ni_alloc(lnet_nid_t nid) { + struct lnet_peer_ni *lpni; + struct lnet_net *net; int cpt; - struct lnet_peer_ni *lpni, *tmp; - struct lnet_peer_table *ptable = NULL; - /* remove all peer_nis from the remote peer and he hash list */ - list_for_each_entry_safe(lpni, tmp, &the_lnet.ln_remote_peer_ni_list, - lpni_on_remote_peer_ni_list) { - list_del_init(&lpni->lpni_on_remote_peer_ni_list); - lnet_peer_ni_decref_locked(lpni); + cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); + + lpni = kzalloc_cpt(sizeof(*lpni), GFP_KERNEL, cpt); + if (!lpni) + return NULL; - cpt = lnet_cpt_of_nid_locked(lpni->lpni_nid, NULL); - ptable = the_lnet.ln_peer_tables[cpt]; - ptable->pt_zombies++; + INIT_LIST_HEAD(&lpni->lpni_txq); + INIT_LIST_HEAD(&lpni->lpni_rtrq); + INIT_LIST_HEAD(&lpni->lpni_routes); + INIT_LIST_HEAD(&lpni->lpni_hashlist); + INIT_LIST_HEAD(&lpni->lpni_on_peer_net_list); + INIT_LIST_HEAD(&lpni->lpni_on_remote_peer_ni_list); - list_del_init(&lpni->lpni_hashlist); - lnet_peer_ni_decref_locked(lpni); + lpni->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! 
*/ + lpni->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ + lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL; + lpni->lpni_nid = nid; + lpni->lpni_cpt = cpt; + lnet_set_peer_ni_health_locked(lpni, true); + + net = lnet_get_net_locked(LNET_NIDNET(nid)); + lpni->lpni_net = net; + if (net) { + lpni->lpni_txcredits = net->net_tunables.lct_peer_tx_credits; + lpni->lpni_mintxcredits = lpni->lpni_txcredits; + lpni->lpni_rtrcredits = lnet_peer_buffer_credits(net); + lpni->lpni_minrtrcredits = lpni->lpni_rtrcredits; + } else { + /* + * This peer_ni is not on a local network, so we + * cannot add the credits here. In case the net is + * added later, add the peer_ni to the remote peer ni + * list so it can be easily found and revisited. + */ + /* FIXME: per-net implementation instead? */ + atomic_inc(&lpni->lpni_refcount); + list_add_tail(&lpni->lpni_on_remote_peer_ni_list, + &the_lnet.ln_remote_peer_ni_list); } + /* TODO: update flags */ + + return lpni; +} + +static struct lnet_peer_net * +lnet_peer_net_alloc(u32 net_id) +{ + struct lnet_peer_net *lpn; + + lpn = kzalloc_cpt(sizeof(*lpn), GFP_KERNEL, CFS_CPT_ANY); + if (!lpn) + return NULL; + + INIT_LIST_HEAD(&lpn->lpn_on_peer_list); + INIT_LIST_HEAD(&lpn->lpn_peer_nis); + lpn->lpn_net_id = net_id; + + return lpn; +} + +static struct lnet_peer * +lnet_peer_alloc(lnet_nid_t nid) +{ + struct lnet_peer *lp; + + lp = kzalloc_cpt(sizeof(*lp), GFP_KERNEL, CFS_CPT_ANY); + if (!lp) + return NULL; + + INIT_LIST_HEAD(&lp->lp_on_lnet_peer_list); + INIT_LIST_HEAD(&lp->lp_peer_nets); + lp->lp_primary_nid = nid; + + /* TODO: update flags */ + + return lp; +} + +static void +lnet_try_destroy_peer_hierarchy_locked(struct lnet_peer_ni *lpni) +{ + struct lnet_peer_net *peer_net; + struct lnet_peer *peer; + + /* TODO: could the below situation happen? accessing an already + * destroyed peer? 
+ */ + if (!lpni->lpni_peer_net || + !lpni->lpni_peer_net->lpn_peer) + return; + + peer_net = lpni->lpni_peer_net; + peer = lpni->lpni_peer_net->lpn_peer; + + list_del_init(&lpni->lpni_on_peer_net_list); + lpni->lpni_peer_net = NULL; + + /* if peer_net is empty, then remove it from the peer */ + if (list_empty(&peer_net->lpn_peer_nis)) { + list_del_init(&peer_net->lpn_on_peer_list); + peer_net->lpn_peer = NULL; + kfree(peer_net); + + /* If the peer is empty then remove it from the + * the_lnet.ln_peers. + */ + if (list_empty(&peer->lp_peer_nets)) { + list_del_init(&peer->lp_on_lnet_peer_list); + kfree(peer); + } + } +} + +/* called with lnet_net_lock LNET_LOCK_EX held */ +static void +lnet_peer_ni_del_locked(struct lnet_peer_ni *lpni) +{ + struct lnet_peer_table *ptable = NULL; + + lnet_peer_remove_from_remote_list(lpni); + + /* remove peer ni from the hash list. */ + list_del_init(&lpni->lpni_hashlist); + + /* decrement the ref count on the peer table */ + ptable = the_lnet.ln_peer_tables[lpni->lpni_cpt]; + LASSERT(atomic_read(&ptable->pt_number) > 0); + atomic_dec(&ptable->pt_number); + + /* + * The peer_ni can no longer be found with a lookup. But there + * can be current users, so keep track of it on the zombie + * list until the reference count has gone to zero. + * + * The last reference may be lost in a place where the + * lnet_net_lock locks only a single cpt, and that cpt may not + * be lpni->lpni_cpt. So the zombie list of this peer_table + * has its own lock. 
+ */ + spin_lock(&ptable->pt_zombie_lock); + list_add(&lpni->lpni_hashlist, &ptable->pt_zombie_list); + ptable->pt_zombies++; + spin_unlock(&ptable->pt_zombie_lock); + + /* no need to keep this peer on the hierarchy anymore */ + lnet_try_destroy_peer_hierarchy_locked(lpni); + + /* decrement reference on peer */ + lnet_peer_ni_decref_locked(lpni); +} + +void lnet_peer_uninit(void) +{ + struct lnet_peer_ni *lpni, *tmp; + + lnet_net_lock(LNET_LOCK_EX); + + /* remove all peer_nis from the remote peer and the hash list */ + list_for_each_entry_safe(lpni, tmp, &the_lnet.ln_remote_peer_ni_list, + lpni_on_remote_peer_ni_list) + lnet_peer_ni_del_locked(lpni); + lnet_peer_tables_destroy(); + + lnet_net_unlock(LNET_LOCK_EX); } int @@ -142,6 +296,9 @@ lnet_peer_tables_create(void) return -ENOMEM; } + spin_lock_init(&ptable->pt_zombie_lock); + INIT_LIST_HEAD(&ptable->pt_zombie_list); + for (j = 0; j < LNET_PEER_HASH_SIZE; j++) INIT_LIST_HEAD(&hash[j]); ptable->pt_hash = hash; /* sign of initialization */ @@ -151,34 +308,55 @@ lnet_peer_tables_create(void) } static void -lnet_peer_table_cleanup_locked(struct lnet_ni *ni, +lnet_peer_del_locked(struct lnet_peer *peer) +{ + struct lnet_peer_ni *lpni = NULL, *lpni2; + + lpni = lnet_get_next_peer_ni_locked(peer, NULL, lpni); + while (lpni) { + lpni2 = lnet_get_next_peer_ni_locked(peer, NULL, lpni); + lnet_peer_ni_del_locked(lpni); + lpni = lpni2; + } +} + +static void +lnet_peer_table_cleanup_locked(struct lnet_net *net, struct lnet_peer_table *ptable) { int i; - struct lnet_peer_ni *lp; + struct lnet_peer_ni *lpni; struct lnet_peer_ni *tmp; + struct lnet_peer *peer; for (i = 0; i < LNET_PEER_HASH_SIZE; i++) { - list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i], + list_for_each_entry_safe(lpni, tmp, &ptable->pt_hash[i], lpni_hashlist) { - if (ni && ni->ni_net != lp->lpni_net) + if (net && net != lpni->lpni_net) continue; - list_del_init(&lp->lpni_hashlist); - /* Lose hash table's ref */ - ptable->pt_zombies++; - 
lnet_peer_ni_decref_locked(lp); + + /* + * check if by removing this peer ni we should be + * removing the entire peer. + */ + peer = lpni->lpni_peer_net->lpn_peer; + + if (peer->lp_primary_nid == lpni->lpni_nid) + lnet_peer_del_locked(peer); + else + lnet_peer_ni_del_locked(lpni); } } } static void -lnet_peer_table_finalize_wait_locked(struct lnet_peer_table *ptable, - int cpt_locked) +lnet_peer_ni_finalize_wait(struct lnet_peer_table *ptable) { - int i; + int i = 3; - for (i = 3; ptable->pt_zombies; i++) { - lnet_net_unlock(cpt_locked); + spin_lock(&ptable->pt_zombie_lock); + while (ptable->pt_zombies) { + spin_unlock(&ptable->pt_zombie_lock); if (is_power_of_2(i)) { CDEBUG(D_WARNING, @@ -186,14 +364,14 @@ lnet_peer_table_finalize_wait_locked(struct lnet_peer_table *ptable, ptable->pt_zombies); } schedule_timeout_uninterruptible(HZ >> 1); - lnet_net_lock(cpt_locked); + spin_lock(&ptable->pt_zombie_lock); } + spin_unlock(&ptable->pt_zombie_lock); } static void -lnet_peer_table_del_rtrs_locked(struct lnet_ni *ni, - struct lnet_peer_table *ptable, - int cpt_locked) +lnet_peer_table_del_rtrs_locked(struct lnet_net *net, + struct lnet_peer_table *ptable) { struct lnet_peer_ni *lp; struct lnet_peer_ni *tmp; @@ -203,7 +381,7 @@ lnet_peer_table_del_rtrs_locked(struct lnet_ni *ni, for (i = 0; i < LNET_PEER_HASH_SIZE; i++) { list_for_each_entry_safe(lp, tmp, &ptable->pt_hash[i], lpni_hashlist) { - if (ni->ni_net != lp->lpni_net) + if (net != lp->lpni_net) continue; if (!lp->lpni_rtr_refcount) @@ -211,27 +389,27 @@ lnet_peer_table_del_rtrs_locked(struct lnet_ni *ni, lpni_nid = lp->lpni_nid; - lnet_net_unlock(cpt_locked); + lnet_net_unlock(LNET_LOCK_EX); lnet_del_route(LNET_NIDNET(LNET_NID_ANY), lpni_nid); - lnet_net_lock(cpt_locked); + lnet_net_lock(LNET_LOCK_EX); } } } void -lnet_peer_tables_cleanup(struct lnet_ni *ni) +lnet_peer_tables_cleanup(struct lnet_net *net) { struct lnet_peer_table *ptable; int i; - LASSERT(the_lnet.ln_shutdown || ni); + 
LASSERT(the_lnet.ln_shutdown || net); /* * If just deleting the peers for a NI, get rid of any routes these * peers are gateways for. */ cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { lnet_net_lock(LNET_LOCK_EX); - lnet_peer_table_del_rtrs_locked(ni, ptable, i); + lnet_peer_table_del_rtrs_locked(net, ptable); lnet_net_unlock(LNET_LOCK_EX); } @@ -240,16 +418,12 @@ lnet_peer_tables_cleanup(struct lnet_ni *ni) */ cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { lnet_net_lock(LNET_LOCK_EX); - lnet_peer_table_cleanup_locked(ni, ptable); + lnet_peer_table_cleanup_locked(net, ptable); lnet_net_unlock(LNET_LOCK_EX); } - /* Wait until all peers have been destroyed. */ - cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) { - lnet_net_lock(LNET_LOCK_EX); - lnet_peer_table_finalize_wait_locked(ptable, i); - lnet_net_unlock(LNET_LOCK_EX); - } + cfs_percpt_for_each(ptable, i, the_lnet.ln_peer_tables) + lnet_peer_ni_finalize_wait(ptable); } static struct lnet_peer_ni * @@ -286,25 +460,23 @@ lnet_find_peer_ni_locked(lnet_nid_t nid) return lpni; } -int -lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt, - struct lnet_peer **peer) +struct lnet_peer * +lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt) { struct lnet_peer_ni *lpni; + struct lnet_peer *lp; lpni = lnet_find_peer_ni_locked(dst_nid); if (!lpni) { - int rc; - - rc = lnet_nid2peerni_locked(&lpni, dst_nid, cpt); - if (rc != 0) - return rc; + lpni = lnet_nid2peerni_locked(dst_nid, cpt); + if (IS_ERR(lpni)) + return ERR_CAST(lpni); } - *peer = lpni->lpni_peer_net->lpn_peer; + lp = lpni->lpni_peer_net->lpn_peer; lnet_peer_ni_decref_locked(lpni); - return 0; + return lp; } struct lnet_peer_ni * @@ -412,269 +584,318 @@ lnet_peer_primary_nid(lnet_nid_t nid) return primary_nid; } -static void -lnet_try_destroy_peer_hierarchy_locked(struct lnet_peer_ni *lpni) +struct lnet_peer_net * +lnet_peer_get_net_locked(struct lnet_peer *peer, u32 net_id) { struct lnet_peer_net *peer_net; - struct 
lnet_peer *peer; + list_for_each_entry(peer_net, &peer->lp_peer_nets, lpn_on_peer_list) { + if (peer_net->lpn_net_id == net_id) + return peer_net; + } + return NULL; +} - /* TODO: could the below situation happen? accessing an already - * destroyed peer? +static int +lnet_peer_setup_hierarchy(struct lnet_peer *lp, struct lnet_peer_ni + *lpni, + lnet_nid_t nid) +{ + struct lnet_peer_net *lpn = NULL; + struct lnet_peer_table *ptable; + u32 net_id = LNET_NIDNET(nid); + + /* + * Create the peer_ni, peer_net, and peer if they don't exist + * yet. */ - if (!lpni->lpni_peer_net || - !lpni->lpni_peer_net->lpn_peer) - return; + if (lp) { + lpn = lnet_peer_get_net_locked(lp, net_id); + } else { + lp = lnet_peer_alloc(nid); + if (!lp) + goto out_enomem; + } - peer_net = lpni->lpni_peer_net; - peer = lpni->lpni_peer_net->lpn_peer; + if (!lpn) { + lpn = lnet_peer_net_alloc(net_id); + if (!lpn) + goto out_maybe_free_lp; + } - list_del_init(&lpni->lpni_on_peer_net_list); - lpni->lpni_peer_net = NULL; + if (!lpni) { + lpni = lnet_peer_ni_alloc(nid); + if (!lpni) + goto out_maybe_free_lpn; + } - /* if peer_net is empty, then remove it from the peer */ - if (list_empty(&peer_net->lpn_peer_nis)) { - list_del_init(&peer_net->lpn_on_peer_list); - peer_net->lpn_peer = NULL; - kfree(peer_net); + /* Install the new peer_ni */ + lnet_net_lock(LNET_LOCK_EX); + /* Add peer_ni to global peer table hash, if necessary. */ + if (list_empty(&lpni->lpni_hashlist)) { + ptable = the_lnet.ln_peer_tables[lpni->lpni_cpt]; + list_add_tail(&lpni->lpni_hashlist, + &ptable->pt_hash[lnet_nid2peerhash(nid)]); + ptable->pt_version++; + atomic_inc(&ptable->pt_number); + atomic_inc(&lpni->lpni_refcount); + } - /* If the peer is empty then remove it from the - * the_lnet.ln_peers - */ - if (list_empty(&peer->lp_peer_nets)) { - list_del_init(&peer->lp_on_lnet_peer_list); - kfree(peer); - } + /* Detach the peer_ni from an existing peer, if necessary. 
*/ + if (lpni->lpni_peer_net && lpni->lpni_peer_net->lpn_peer != lp) + lnet_try_destroy_peer_hierarchy_locked(lpni); + + /* Add peer_ni to peer_net */ + lpni->lpni_peer_net = lpn; + list_add_tail(&lpni->lpni_on_peer_net_list, &lpn->lpn_peer_nis); + + /* Add peer_net to peer */ + if (!lpn->lpn_peer) { + lpn->lpn_peer = lp; + list_add_tail(&lpn->lpn_on_peer_list, &lp->lp_peer_nets); } + + /* Add peer to global peer list */ + if (list_empty(&lp->lp_on_lnet_peer_list)) + list_add_tail(&lp->lp_on_lnet_peer_list, &the_lnet.ln_peers); + lnet_net_unlock(LNET_LOCK_EX); + + return 0; + +out_maybe_free_lpn: + if (list_empty(&lpn->lpn_on_peer_list)) + kfree(lpn); +out_maybe_free_lp: + if (list_empty(&lp->lp_on_lnet_peer_list)) + kfree(lp); +out_enomem: + return -ENOMEM; } static int -lnet_build_peer_hierarchy(struct lnet_peer_ni *lpni) +lnet_add_prim_lpni(lnet_nid_t nid) { + int rc; struct lnet_peer *peer; - struct lnet_peer_net *peer_net; - __u32 lpni_net = LNET_NIDNET(lpni->lpni_nid); - - peer = NULL; - peer_net = NULL; + struct lnet_peer_ni *lpni; - peer = kzalloc(sizeof(*peer), GFP_KERNEL); - if (!peer) - return -ENOMEM; + LASSERT(nid != LNET_NID_ANY); - peer_net = kzalloc(sizeof(*peer_net), GFP_KERNEL); - if (!peer_net) { - kfree(peer); - return -ENOMEM; + /* + * lookup the NID and its peer + * if the peer doesn't exist, create it. + * if this is a non-MR peer then change its state to MR and exit. + * if this is an MR peer and it's a primary NI: NO-OP. + * if this is an MR peer and it's not a primary NI. Operation not + * allowed. + * + * The adding and deleting of peer nis is being serialized through + * the api_mutex. So we can look up peers with the mutex locked + * safely. 
Only when we need to change the ptable, do we need to + * exclusively lock the lnet_net_lock() + */ + lpni = lnet_find_peer_ni_locked(nid); + if (!lpni) { + rc = lnet_peer_setup_hierarchy(NULL, NULL, nid); + if (rc != 0) + return rc; + lpni = lnet_find_peer_ni_locked(nid); } - INIT_LIST_HEAD(&peer->lp_on_lnet_peer_list); - INIT_LIST_HEAD(&peer->lp_peer_nets); - INIT_LIST_HEAD(&peer_net->lpn_on_peer_list); - INIT_LIST_HEAD(&peer_net->lpn_peer_nis); + LASSERT(lpni); - /* build the hierarchy */ - peer_net->lpn_net_id = lpni_net; - peer_net->lpn_peer = peer; - lpni->lpni_peer_net = peer_net; - peer->lp_primary_nid = lpni->lpni_nid; - peer->lp_multi_rail = false; - list_add_tail(&peer_net->lpn_on_peer_list, &peer->lp_peer_nets); - list_add_tail(&lpni->lpni_on_peer_net_list, &peer_net->lpn_peer_nis); - list_add_tail(&peer->lp_on_lnet_peer_list, &the_lnet.ln_peers); + lnet_peer_ni_decref_locked(lpni); - return 0; -} + peer = lpni->lpni_peer_net->lpn_peer; -struct lnet_peer_net * -lnet_peer_get_net_locked(struct lnet_peer *peer, u32 net_id) -{ - struct lnet_peer_net *peer_net; + /* + * If we found a lpni with the same nid as the NID we're trying to + * create, then we're trying to create an already existing lpni + * that belongs to a different peer + */ + if (peer->lp_primary_nid != nid) + return -EEXIST; - list_for_each_entry(peer_net, &peer->lp_peer_nets, lpn_on_peer_list) { - if (peer_net->lpn_net_id == net_id) - return peer_net; - } - return NULL; + /* + * if we found an lpni that is not a multi-rail, which could occur + * if lpni is already created as a non-mr lpni or we just created + * it, then make sure you indicate that this lpni is a primary mr + * capable peer. + * + * TODO: update flags if necessary + */ + if (!peer->lp_multi_rail && peer->lp_primary_nid == nid) + peer->lp_multi_rail = true; + + return rc; } -/* - * given the key nid find the peer to add the new peer NID to. 
If the key - * nid is NULL, then create a new peer, but first make sure that the NID - * is unique - */ -int -lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid, bool mr) +static int +lnet_add_peer_ni_to_prim_lpni(lnet_nid_t key_nid, lnet_nid_t nid) { - struct lnet_peer_ni *lpni, *lpni2; - struct lnet_peer *peer; - struct lnet_peer_net *peer_net, *pn; - int cpt, cpt2, rc; - struct lnet_peer_table *ptable = NULL; - __u32 net_id = LNET_NIDNET(nid); + struct lnet_peer *peer, *primary_peer; + struct lnet_peer_ni *lpni = NULL, *klpni = NULL; - if (nid == LNET_NID_ANY) - return -EINVAL; + LASSERT(key_nid != LNET_NID_ANY && nid != LNET_NID_ANY); + + /* + * key nid must be created by this point. If not then this + * operation is not permitted + */ + klpni = lnet_find_peer_ni_locked(key_nid); + if (!klpni) + return -ENOENT; + + lnet_peer_ni_decref_locked(klpni); + + primary_peer = klpni->lpni_peer_net->lpn_peer; - /* check that nid is unique */ - cpt = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); - lnet_net_lock(cpt); lpni = lnet_find_peer_ni_locked(nid); if (lpni) { lnet_peer_ni_decref_locked(lpni); - lnet_net_unlock(cpt); - return -EEXIST; - } - lnet_net_unlock(cpt); - if (key_nid != LNET_NID_ANY) { - cpt2 = lnet_nid_cpt_hash(key_nid, LNET_CPT_NUMBER); - lnet_net_lock(cpt2); - lpni = lnet_find_peer_ni_locked(key_nid); - if (!lpni) { - lnet_net_unlock(cpt2); - /* key_nid refers to a non-existent peer_ni.*/ - return -EINVAL; - } peer = lpni->lpni_peer_net->lpn_peer; - peer->lp_multi_rail = mr; - lnet_peer_ni_decref_locked(lpni); - lnet_net_unlock(cpt2); - } else { - lnet_net_lock(LNET_LOCK_EX); - rc = lnet_nid2peerni_locked(&lpni, nid, LNET_LOCK_EX); - if (rc == 0) { - lpni->lpni_peer_net->lpn_peer->lp_multi_rail = mr; - lnet_peer_ni_decref_locked(lpni); + /* + * lpni already exists in the system but it belongs to + * a different peer. 
We can't re-add it + */ + if (peer->lp_primary_nid != key_nid && peer->lp_multi_rail) { + CERROR("Cannot add NID %s owned by peer %s to peer %s\n", + libcfs_nid2str(lpni->lpni_nid), + libcfs_nid2str(peer->lp_primary_nid), + libcfs_nid2str(key_nid)); + return -EEXIST; + } else if (peer->lp_primary_nid == key_nid) { + /* + * found a peer_ni that is already part of the + * peer. This is a no-op operation. + */ + return 0; } - lnet_net_unlock(LNET_LOCK_EX); - return rc; - } - - lpni = kzalloc_cpt(sizeof(*lpni), GFP_KERNEL, cpt); - if (!lpni) - return -ENOMEM; - INIT_LIST_HEAD(&lpni->lpni_txq); - INIT_LIST_HEAD(&lpni->lpni_rtrq); - INIT_LIST_HEAD(&lpni->lpni_routes); - INIT_LIST_HEAD(&lpni->lpni_hashlist); - INIT_LIST_HEAD(&lpni->lpni_on_peer_net_list); - INIT_LIST_HEAD(&lpni->lpni_on_remote_peer_ni_list); + /* + * TODO: else if (peer->lp_primary_nid != key_nid && + * !peer->lp_multi_rail) + * peer is not an MR peer and it will be moved in the next + * step to klpni, so update its flags accordingly. + * lnet_move_peer_ni() + */ - lpni->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! */ - lpni->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ - lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL; - lpni->lpni_nid = nid; - lpni->lpni_cpt = cpt; - lnet_set_peer_ni_health_locked(lpni, true); + /* + * TODO: call lnet_update_peer() from here to update the + * flags. This is the case when the lpni you're trying to + * add is already part of the peer. This could've been + * added by the DD previously, so go ahead and do any + * updates to the state if necessary + */ - /* allocate here in case we need to add a new peer_net */ - peer_net = NULL; - peer_net = kzalloc(sizeof(*peer_net), GFP_KERNEL); - if (!peer_net) { - rc = -ENOMEM; - kfree(lpni); - return rc; }
Or we need to create one and + * add it to the new peer + */ + return lnet_peer_setup_hierarchy(primary_peer, lpni, nid); +} - ptable = the_lnet.ln_peer_tables[cpt]; - ptable->pt_number++; - - lpni2 = lnet_find_peer_ni_locked(nid); - if (lpni2) { - lnet_peer_ni_decref_locked(lpni2); - /* sanity check that lpni2's peer is what we expect */ - if (lpni2->lpni_peer_net->lpn_peer != peer) - rc = -EEXIST; - else - rc = -EINVAL; - - ptable->pt_number--; - /* another thread has already added it */ - lnet_net_unlock(LNET_LOCK_EX); - kfree(peer_net); - return rc; - } +/* + * lpni creation initiated due to traffic either sending or receiving. + */ +static int +lnet_peer_ni_traffic_add(lnet_nid_t nid) +{ + struct lnet_peer_ni *lpni; + int rc = 0; - lpni->lpni_net = lnet_get_net_locked(LNET_NIDNET(lpni->lpni_nid)); - if (lpni->lpni_net) { - lpni->lpni_txcredits = - lpni->lpni_mintxcredits = - lpni->lpni_net->net_tunables.lct_peer_tx_credits; - lpni->lpni_rtrcredits = - lpni->lpni_minrtrcredits = - lnet_peer_buffer_credits(lpni->lpni_net); - } else { + if (nid == LNET_NID_ANY) + return -EINVAL; + + /* lnet_net_lock is not needed here because ln_api_lock is held */ + lpni = lnet_find_peer_ni_locked(nid); + if (lpni) { /* - * if you're adding a peer which is not on a local network - * then we can't assign any of the credits. It won't be - * picked for sending anyway. Eventually a network can be - * added, in this case we need to revisit this peer and - * update its credits. 
+ * TODO: lnet_update_primary_nid() but not all of it + * only indicate if we're converting this to MR capable + * Can happen due to DD */ - - /* increment refcount for remote peer list */ - atomic_inc(&lpni->lpni_refcount); - list_add_tail(&lpni->lpni_on_remote_peer_ni_list, - &the_lnet.ln_remote_peer_ni_list); + lnet_peer_ni_decref_locked(lpni); + } else { + rc = lnet_peer_setup_hierarchy(NULL, NULL, nid); } - /* increment refcount for peer on hash list */ - atomic_inc(&lpni->lpni_refcount); + return rc; +} - list_add_tail(&lpni->lpni_hashlist, - &ptable->pt_hash[lnet_nid2peerhash(nid)]); - ptable->pt_version++; +static int +lnet_peer_ni_add_non_mr(lnet_nid_t nid) +{ + struct lnet_peer_ni *lpni; - /* add the lpni to a net */ - list_for_each_entry(pn, &peer->lp_peer_nets, lpn_on_peer_list) { - if (pn->lpn_net_id == net_id) { - list_add_tail(&lpni->lpni_on_peer_net_list, - &pn->lpn_peer_nis); - lpni->lpni_peer_net = pn; - lnet_net_unlock(LNET_LOCK_EX); - kfree(peer_net); - return 0; - } + lpni = lnet_find_peer_ni_locked(nid); + if (lpni) { + CERROR("Cannot add %s as non-mr when it already exists\n", + libcfs_nid2str(nid)); + lnet_peer_ni_decref_locked(lpni); + return -EEXIST; } - INIT_LIST_HEAD(&peer_net->lpn_on_peer_list); - INIT_LIST_HEAD(&peer_net->lpn_peer_nis); + return lnet_peer_setup_hierarchy(NULL, NULL, nid); +} - /* build the hierarchy */ - peer_net->lpn_net_id = net_id; - peer_net->lpn_peer = peer; - lpni->lpni_peer_net = peer_net; - list_add_tail(&lpni->lpni_on_peer_net_list, &peer_net->lpn_peer_nis); - list_add_tail(&peer_net->lpn_on_peer_list, &peer->lp_peer_nets); +/* + * This API handles the following combinations: + * Create a primary NI if only the key_nid is provided + * Create or add an lpni to a primary NI. Primary NI must've already + * been created + * Create a non-MR peer. 
+ */ +int +lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid, bool mr) +{ + /* + * Caller trying to set up an MR-like peer hierarchy but + * specifying it to be non-MR. This is not allowed. + */ + if (key_nid != LNET_NID_ANY && + nid != LNET_NID_ANY && !mr) + return -EPERM; + + /* Add the primary NID of a peer */ + if (key_nid != LNET_NID_ANY && + nid == LNET_NID_ANY && mr) + return lnet_add_prim_lpni(key_nid); + + /* Add a NID to an existing peer */ + if (key_nid != LNET_NID_ANY && + nid != LNET_NID_ANY && mr) + return lnet_add_peer_ni_to_prim_lpni(key_nid, nid); + + /* Add a non-MR peer NI */ + if (((key_nid != LNET_NID_ANY && + nid == LNET_NID_ANY) || + (key_nid == LNET_NID_ANY && + nid != LNET_NID_ANY)) && !mr) + return lnet_peer_ni_add_non_mr(key_nid != LNET_NID_ANY ? + key_nid : nid); - lnet_net_unlock(LNET_LOCK_EX); return 0; } int lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid) { - int cpt; lnet_nid_t local_nid; struct lnet_peer *peer; - struct lnet_peer_ni *lpni, *lpni2; - struct lnet_peer_table *ptable = NULL; + struct lnet_peer_ni *lpni; if (key_nid == LNET_NID_ANY) return -EINVAL; local_nid = (nid != LNET_NID_ANY) ?
nid : key_nid; - cpt = lnet_nid_cpt_hash(local_nid, LNET_CPT_NUMBER); - lnet_net_lock(LNET_LOCK_EX); lpni = lnet_find_peer_ni_locked(local_nid); - if (!lpni) { - lnet_net_unlock(cpt); + if (!lpni) return -EINVAL; - } lnet_peer_ni_decref_locked(lpni); peer = lpni->lpni_peer_net->lpn_peer; @@ -685,30 +906,15 @@ lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid) * deleting the primary ni is equivalent to deleting the * entire peer */ - lpni = NULL; - lpni = lnet_get_next_peer_ni_locked(peer, NULL, lpni); - while (lpni) { - lpni2 = lnet_get_next_peer_ni_locked(peer, NULL, lpni); - cpt = lnet_nid_cpt_hash(lpni->lpni_nid, - LNET_CPT_NUMBER); - lnet_peer_remove_from_remote_list(lpni); - ptable = the_lnet.ln_peer_tables[cpt]; - ptable->pt_zombies++; - list_del_init(&lpni->lpni_hashlist); - lnet_peer_ni_decref_locked(lpni); - lpni = lpni2; - } + lnet_net_lock(LNET_LOCK_EX); + lnet_peer_del_locked(peer); lnet_net_unlock(LNET_LOCK_EX); return 0; } - lnet_peer_remove_from_remote_list(lpni); - cpt = lnet_nid_cpt_hash(lpni->lpni_nid, LNET_CPT_NUMBER); - ptable = the_lnet.ln_peer_tables[cpt]; - ptable->pt_zombies++; - list_del_init(&lpni->lpni_hashlist); - lnet_peer_ni_decref_locked(lpni); + lnet_net_lock(LNET_LOCK_EX); + lnet_peer_ni_del_locked(lpni); lnet_net_unlock(LNET_LOCK_EX); return 0; @@ -722,159 +928,70 @@ lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lpni) LASSERT(atomic_read(&lpni->lpni_refcount) == 0); LASSERT(lpni->lpni_rtr_refcount == 0); LASSERT(list_empty(&lpni->lpni_txq)); - LASSERT(list_empty(&lpni->lpni_hashlist)); LASSERT(lpni->lpni_txqnob == 0); - LASSERT(lpni->lpni_peer_net); - LASSERT(lpni->lpni_peer_net->lpn_peer); - - ptable = the_lnet.ln_peer_tables[lpni->lpni_cpt]; - LASSERT(ptable->pt_number > 0); - ptable->pt_number--; lpni->lpni_net = NULL; - lnet_try_destroy_peer_hierarchy_locked(lpni); + /* remove the peer ni from the zombie list */ + ptable = the_lnet.ln_peer_tables[lpni->lpni_cpt]; + spin_lock(&ptable->pt_zombie_lock); + 
list_del_init(&lpni->lpni_hashlist); + ptable->pt_zombies--; + spin_unlock(&ptable->pt_zombie_lock); kfree(lpni); - - LASSERT(ptable->pt_zombies > 0); - ptable->pt_zombies--; } -int -lnet_nid2peerni_locked(struct lnet_peer_ni **lpnip, lnet_nid_t nid, int cpt) +struct lnet_peer_ni * +lnet_nid2peerni_locked(lnet_nid_t nid, int cpt) { struct lnet_peer_table *ptable; struct lnet_peer_ni *lpni = NULL; - struct lnet_peer_ni *lpni2; int cpt2; - int rc = 0; + int rc; - *lpnip = NULL; if (the_lnet.ln_shutdown) /* it's shutting down */ - return -ESHUTDOWN; + return ERR_PTR(-ESHUTDOWN); /* * calculate cpt2 with the standard hash function - * This cpt2 becomes the slot where we'll find or create the peer. + * This cpt2 is the slot where we'll find or create the peer. */ cpt2 = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER); - - /* - * Any changes to the peer tables happen under exclusive write - * lock. Any reads to the peer tables can be done via a standard - * CPT read lock. - */ - if (cpt != LNET_LOCK_EX) { - lnet_net_unlock(cpt); - lnet_net_lock(LNET_LOCK_EX); - } - ptable = the_lnet.ln_peer_tables[cpt2]; lpni = lnet_get_peer_ni_locked(ptable, nid); - if (lpni) { - *lpnip = lpni; - if (cpt != LNET_LOCK_EX) { - lnet_net_unlock(LNET_LOCK_EX); - lnet_net_lock(cpt); - } - return 0; - } + if (lpni) + return lpni; + /* Slow path: serialized using the ln_api_mutex. */ + lnet_net_unlock(cpt); + mutex_lock(&the_lnet.ln_api_mutex); /* - * take extra refcount in case another thread has shutdown LNet - * and destroyed locks and peer-table before I finish the allocation + * Shutdown is only set under the ln_api_lock, so a single + * check here is sufficient. + * + * lnet_add_nid_to_peer() also handles the case where we've + * raced and a different thread added the NID.
*/ - ptable->pt_number++; - lnet_net_unlock(LNET_LOCK_EX); - - lpni = kzalloc_cpt(sizeof(*lpni), GFP_KERNEL, cpt2); - if (!lpni) { - rc = -ENOMEM; - lnet_net_lock(cpt); - goto out; - } - - INIT_LIST_HEAD(&lpni->lpni_txq); - INIT_LIST_HEAD(&lpni->lpni_rtrq); - INIT_LIST_HEAD(&lpni->lpni_routes); - INIT_LIST_HEAD(&lpni->lpni_hashlist); - INIT_LIST_HEAD(&lpni->lpni_on_peer_net_list); - INIT_LIST_HEAD(&lpni->lpni_on_remote_peer_ni_list); - - lpni->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! */ - lpni->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ - lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL; - lpni->lpni_nid = nid; - lpni->lpni_cpt = cpt2; - atomic_set(&lpni->lpni_refcount, 2); /* 1 for caller; 1 for hash */ - - rc = lnet_build_peer_hierarchy(lpni); - if (rc != 0) - goto out; - - lnet_net_lock(LNET_LOCK_EX); - if (the_lnet.ln_shutdown) { - rc = -ESHUTDOWN; - goto out; - } - - lpni2 = lnet_get_peer_ni_locked(ptable, nid); - if (lpni2) { - *lpnip = lpni2; - goto out; + lpni = ERR_PTR(-ESHUTDOWN); + goto out_mutex_unlock; } - lpni->lpni_net = lnet_get_net_locked(LNET_NIDNET(lpni->lpni_nid)); - if (lpni->lpni_net) { - lpni->lpni_txcredits = - lpni->lpni_mintxcredits = - lpni->lpni_net->net_tunables.lct_peer_tx_credits; - lpni->lpni_rtrcredits = - lpni->lpni_minrtrcredits = - lnet_peer_buffer_credits(lpni->lpni_net); - } else { - /* - * if you're adding a peer which is not on a local network - * then we can't assign any of the credits. It won't be - * picked for sending anyway. Eventually a network can be - * added, in this case we need to revisit this peer and - * update its credits. 
- */ - - CDEBUG(D_NET, "peer_ni %s is not directly connected\n", - libcfs_nid2str(nid)); - /* increment refcount for remote peer list */ - atomic_inc(&lpni->lpni_refcount); - list_add_tail(&lpni->lpni_on_remote_peer_ni_list, - &the_lnet.ln_remote_peer_ni_list); + rc = lnet_peer_ni_traffic_add(nid); + if (rc) { + lpni = ERR_PTR(rc); + goto out_mutex_unlock; } - lnet_set_peer_ni_health_locked(lpni, true); - - list_add_tail(&lpni->lpni_hashlist, - &ptable->pt_hash[lnet_nid2peerhash(nid)]); - ptable->pt_version++; - *lpnip = lpni; + lpni = lnet_get_peer_ni_locked(ptable, nid); + LASSERT(lpni); - if (cpt != LNET_LOCK_EX) { - lnet_net_unlock(LNET_LOCK_EX); - lnet_net_lock(cpt); - } +out_mutex_unlock: + mutex_unlock(&the_lnet.ln_api_mutex); + lnet_net_lock(cpt); - return 0; -out: - if (lpni) { - lnet_try_destroy_peer_hierarchy_locked(lpni); - kfree(lpni); - } - ptable->pt_number--; - if (cpt != LNET_LOCK_EX) { - lnet_net_unlock(LNET_LOCK_EX); - lnet_net_lock(cpt); - } - return rc; + return lpni; } void @@ -882,14 +999,13 @@ lnet_debug_peer(lnet_nid_t nid) { char *aliveness = "NA"; struct lnet_peer_ni *lp; - int rc; int cpt; cpt = lnet_cpt_of_nid(nid, NULL); lnet_net_lock(cpt); - rc = lnet_nid2peerni_locked(&lp, nid, cpt); - if (rc) { + lp = lnet_nid2peerni_locked(nid, cpt); + if (IS_ERR(lp)) { lnet_net_unlock(cpt); CDEBUG(D_WARNING, "No peer %s\n", libcfs_nid2str(nid)); return; @@ -973,7 +1089,7 @@ lnet_get_peer_ni_info(__u32 peer_index, __u64 *nid, } int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, - struct lnet_peer_ni_credit_info *peer_ni_info, + bool *mr, struct lnet_peer_ni_credit_info *peer_ni_info, struct lnet_ioctl_element_stats *peer_ni_stats) { struct lnet_peer_ni *lpni = NULL; @@ -986,6 +1102,7 @@ int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, return -ENOENT; *primary_nid = lp->lp_primary_nid; + *mr = lp->lp_multi_rail; *nid = lpni->lpni_nid; snprintf(peer_ni_info->cr_aliveness, LNET_MAX_STR_LEN, "NA"); if 
(lnet_isrouter(lpni) || diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 7913914620f3..1c79a19f5a25 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -296,6 +296,7 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, struct lnet_route *route; struct lnet_route *route2; struct lnet_ni *ni; + struct lnet_peer_ni *lpni; int add_route; int rc; @@ -332,13 +333,14 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, lnet_net_lock(LNET_LOCK_EX); - rc = lnet_nid2peerni_locked(&route->lr_gateway, gateway, LNET_LOCK_EX); - if (rc) { + lpni = lnet_nid2peerni_locked(gateway, LNET_LOCK_EX); + if (IS_ERR(lpni)) { lnet_net_unlock(LNET_LOCK_EX); kfree(route); kfree(rnet); + rc = PTR_ERR(lpni); if (rc == -EHOSTUNREACH) /* gateway is not on a local net */ return rc; /* ignore the route entry */ CERROR("Error %d creating route %s %d %s\n", rc, @@ -346,7 +348,7 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway, libcfs_nid2str(gateway)); return rc; } - + route->lr_gateway = lpni; LASSERT(!the_lnet.ln_shutdown); rnet2 = lnet_find_rnet_locked(net);
From patchwork Tue Sep 25 01:07:15 2018 X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613187 From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Tue, 25 Sep 2018 11:07:15 +1000 Message-ID: <153783763561.32103.4423090846180857161.stgit@noble> In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble> References: <153783752960.32103.8394391715843917125.stgit@noble> Subject: [lustre-devel] [PATCH 19/34] LU-7734 lnet: proper cpt locking From: Amir Shehata 1. Add per-NI credits, which are just the total credits assigned on NI creation. 2.
Whenever percpt credits are added or decremented, we mirror that in the NI credits 3. We use the NI credits to determine best NI 4. After we have completed the peer_ni/ni selection we determine the cpt to use for locking: cpt_of_nid(lpni->nid, ni) The lpni_cpt is not enough to protect all the fields in the lnet_peer_ni structure. This is due to the fact that multiple NIs can talk to the same peer, and functions can be called with different cpts locked. To properly protect the fields in the lnet_peer_ni structure, a spin lock is introduced for the purpose. Signed-off-by: Amir Shehata Change-Id: Ief7868c3c8ff7e00ea9e908dd50d8cef77d9f9a4 Reviewed-on: http://review.whamcloud.com/20701 Signed-off-by: NeilBrown --- .../staging/lustre/include/linux/lnet/lib-types.h | 15 +++-- drivers/staging/lustre/lnet/lnet/api-ni.c | 3 + drivers/staging/lustre/lnet/lnet/lib-move.c | 58 ++++++++++++------ drivers/staging/lustre/lnet/lnet/peer.c | 2 + drivers/staging/lustre/lnet/lnet/router.c | 63 +++++++++++++++++++- 5 files changed, 113 insertions(+), 28 deletions(-) diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h index 71ec0eaf8200..90a5c6e40dea 100644 --- a/drivers/staging/lustre/include/linux/lnet/lib-types.h +++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h @@ -330,6 +330,9 @@ struct lnet_ni { /* instance-specific data */ void *ni_data; + /* per ni credits */ + atomic_t ni_tx_credits; + /* percpt TX queues */ struct lnet_tx_queue **ni_tx_queues; @@ -414,6 +417,8 @@ struct lnet_peer_ni { struct list_head lpni_rtr_list; /* statistics kept on each peer NI */ struct lnet_element_stats lpni_stats; + /* spin lock protecting credits and lpni_txq / lpni_rtrq */ + spinlock_t lpni_lock; /* # tx credits available */ int lpni_txcredits; struct lnet_peer_net *lpni_peer_net; @@ -424,13 +429,13 @@ struct lnet_peer_ni { /* low water mark */ int lpni_minrtrcredits; /* alive/dead? 
*/ - unsigned int lpni_alive:1; + bool lpni_alive; /* notification outstanding? */ - unsigned int lpni_notify:1; + bool lpni_notify; /* outstanding notification for LND? */ - unsigned int lpni_notifylnd:1; + bool lpni_notifylnd; /* some thread is handling notification */ - unsigned int lpni_notifying:1; + bool lpni_notifying; /* SEND event outstanding from ping */ unsigned int lpni_ping_notsent; /* # times router went dead<->alive */ @@ -461,7 +466,7 @@ struct lnet_peer_ni { u32 lpni_seq; /* health flag */ bool lpni_healthy; - /* returned RC ping features */ + /* returned RC ping features. Protected with lpni_lock */ unsigned int lpni_ping_feats; /* routers on this peer */ struct list_head lpni_routes; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index d3db4853c690..9807cfb3a0fc 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -1382,6 +1382,9 @@ lnet_startup_lndni(struct lnet_ni *ni, struct lnet_lnd_tunables *tun) seed = LNET_NIDADDR(ni->ni_nid); add_device_randomness(&seed, sizeof(seed)); + atomic_set(&ni->ni_tx_credits, + lnet_ni_tq_credits(ni) * ni->ni_ncpts); + CDEBUG(D_LNI, "Added LNI %s [%d/%d/%d/%d]\n", libcfs_nid2str(ni->ni_nid), ni->ni_net->net_tunables.lct_peer_tx_credits, diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 5d9acce26287..51224a4cb218 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -488,18 +488,26 @@ lnet_ni_eager_recv(struct lnet_ni *ni, struct lnet_msg *msg) return rc; } -/* NB: caller shall hold a ref on 'lp' as I'd drop lnet_net_lock */ +/* + * This function can be called from two paths: + * 1. when sending a message + * 2. 
when decommitting a message (lnet_msg_decommit_tx()) + * In both these cases the peer_ni should have its reference count + * acquired by the caller and therefore it is safe to drop the spin + * lock before calling lnd_query() */ static void lnet_ni_query_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) { time64_t last_alive = 0; + int cpt = lnet_cpt_of_nid_locked(lp->lpni_nid, ni); LASSERT(lnet_peer_aliveness_enabled(lp)); LASSERT(ni->ni_net->net_lnd->lnd_query); - lnet_net_unlock(lp->lpni_cpt); + lnet_net_unlock(cpt); ni->ni_net->net_lnd->lnd_query(ni, lp->lpni_nid, &last_alive); - lnet_net_lock(lp->lpni_cpt); + lnet_net_lock(cpt); lp->lpni_last_query = ktime_get_seconds(); @@ -519,9 +527,12 @@ lnet_peer_is_alive(struct lnet_peer_ni *lp, unsigned long now) /* Trust lnet_notify() if it has more recent aliveness news, but * ignore the initial assumed death (see lnet_peers_start_down()). */ + spin_lock(&lp->lpni_lock); if (!lp->lpni_alive && lp->lpni_alive_count > 0 && - lp->lpni_timestamp >= lp->lpni_last_alive) + lp->lpni_timestamp >= lp->lpni_last_alive) { + spin_unlock(&lp->lpni_lock); return 0; + } deadline = lp->lpni_last_alive + lp->lpni_net->net_tunables.lct_peer_timeout; @@ -532,8 +543,12 @@ lnet_peer_is_alive(struct lnet_peer_ni *lp, unsigned long now) * case, and moreover lpni_last_alive at peer creation is assumed.
*/ if (alive && !lp->lpni_alive && - !(lnet_isrouter(lp) && !lp->lpni_alive_count)) + !(lnet_isrouter(lp) && !lp->lpni_alive_count)) { + spin_unlock(&lp->lpni_lock); lnet_notify_locked(lp, 0, 1, lp->lpni_last_alive); + } else { + spin_unlock(&lp->lpni_lock); + } return alive; } @@ -665,6 +680,7 @@ lnet_post_send_locked(struct lnet_msg *msg, int do_send) msg->msg_txcredit = 1; tq->tq_credits--; + atomic_dec(&ni->ni_tx_credits); if (tq->tq_credits < tq->tq_credits_min) tq->tq_credits_min = tq->tq_credits; @@ -798,6 +814,7 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) !list_empty(&tq->tq_delayed)); tq->tq_credits++; + atomic_inc(&ni->ni_tx_credits); if (tq->tq_credits <= 0) { msg2 = list_entry(tq->tq_delayed.next, struct lnet_msg, msg_list); @@ -1271,9 +1288,13 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, * 3. Round Robin */ while ((ni = lnet_get_next_ni_locked(local_net, ni))) { + int ni_credits; + if (!lnet_is_ni_healthy_locked(ni)) continue; + ni_credits = atomic_read(&ni->ni_tx_credits); + /* * calculate the distance from the cpt on which * the message memory is allocated to the CPT of @@ -1349,11 +1370,9 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, * select using credits followed by Round * Robin. */ - if (ni->ni_tx_queues[cpt]->tq_credits < - best_credits) { + if (ni_credits < best_credits) { continue; - } else if (ni->ni_tx_queues[cpt]->tq_credits == - best_credits) { + } else if (ni_credits == best_credits) { if (best_ni && best_ni->ni_seq <= ni->ni_seq) continue; @@ -1361,7 +1380,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, } set_ni: best_ni = ni; - best_credits = ni->ni_tx_queues[cpt]->tq_credits; + best_credits = ni_credits; } } /* @@ -1539,13 +1558,15 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, send: /* - * determine the cpt to use and if it has changed then - * lock the new cpt and check if the config has changed. 
- * If it has changed then repeat the algorithm since the - * ni or peer list could have changed and the algorithm - * would endup picking a different ni/peer_ni pair. + * Use lnet_cpt_of_nid() to determine the CPT used to commit the + * message. This ensures that we get a CPT that is correct for + * the NI when the NI has been restricted to a subset of all CPTs. + * If the selected CPT differs from the one currently locked, we + * must unlock and relock the lnet_net_lock(), and then check whether + * the configuration has changed. We don't have a hold on the best_ni + * or best_peer_ni yet, and they may have vanished. */ - cpt2 = best_lpni->lpni_cpt; + cpt2 = lnet_cpt_of_nid_locked(best_lpni->lpni_nid, best_ni); if (cpt != cpt2) { lnet_net_unlock(cpt); cpt = cpt2; @@ -1699,7 +1720,7 @@ lnet_parse_put(struct lnet_ni *ni, struct lnet_msg *msg) info.mi_rlength = hdr->payload_length; info.mi_roffset = hdr->msg.put.offset; info.mi_mbits = hdr->msg.put.match_bits; - info.mi_cpt = msg->msg_rxpeer->lpni_cpt; + info.mi_cpt = lnet_cpt_of_nid(msg->msg_rxpeer->lpni_nid, ni); msg->msg_rx_ready_delay = !ni->ni_net->net_lnd->lnd_eager_recv; ready_delay = msg->msg_rx_ready_delay; @@ -2326,8 +2347,7 @@ lnet_drop_delayed_msg_list(struct list_head *head, char *reason) * called lnet_drop_message(), so I just hang onto msg as well * until that's done */ - lnet_drop_message(msg->msg_rxni, - msg->msg_rxpeer->lpni_cpt, + lnet_drop_message(msg->msg_rxni, msg->msg_rx_cpt, msg->msg_private, msg->msg_len); /* * NB: message will not generate event because w/o attached MD, diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index c2a04526a59a..dc4527f86113 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -117,6 +117,8 @@ lnet_peer_ni_alloc(lnet_nid_t nid) INIT_LIST_HEAD(&lpni->lpni_on_peer_net_list); INIT_LIST_HEAD(&lpni->lpni_on_remote_peer_ni_list); + spin_lock_init(&lpni->lpni_lock); + 
lpni->lpni_alive = !lnet_peers_start_down(); /* 1 bit!! */ lpni->lpni_last_alive = ktime_get_seconds(); /* assumes alive */ lpni->lpni_ping_feats = LNET_PING_FEAT_INVAL; diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index 1c79a19f5a25..d3c41f5664a4 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -108,11 +108,20 @@ lnet_notify_locked(struct lnet_peer_ni *lp, int notifylnd, int alive, return; } + /* + * This function can be called with different cpt locks being + * held. lpni_alive_count modification needs to be properly protected. + * Significant reads to lpni_alive_count are also protected with + * the same lock + */ + spin_lock(&lp->lpni_lock); + lp->lpni_timestamp = when; /* update timestamp */ lp->lpni_ping_deadline = 0; /* disable ping timeout */ if (lp->lpni_alive_count && /* got old news */ (!lp->lpni_alive) == (!alive)) { /* new date for old news */ + spin_unlock(&lp->lpni_lock); CDEBUG(D_NET, "Old news\n"); return; } @@ -120,15 +129,20 @@ lnet_notify_locked(struct lnet_peer_ni *lp, int notifylnd, int alive, /* Flag that notification is outstanding */ lp->lpni_alive_count++; - lp->lpni_alive = !(!alive); /* 1 bit! */ + lp->lpni_alive = !!alive; /* 1 bit! */ lp->lpni_notify = 1; - lp->lpni_notifylnd |= notifylnd; + lp->lpni_notifylnd = notifylnd; if (lp->lpni_alive) lp->lpni_ping_feats = LNET_PING_FEAT_INVAL; /* reset */ + spin_unlock(&lp->lpni_lock); + CDEBUG(D_NET, "set %s %d\n", libcfs_nid2str(lp->lpni_nid), alive); } +/* + * This function will always be called with lp->lpni_cpt lock held. 
+ */ static void lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) { @@ -140,11 +154,19 @@ lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) * NB individual events can be missed; the only guarantee is that you * always get the most recent news */ - if (lp->lpni_notifying || !ni) + spin_lock(&lp->lpni_lock); + + if (lp->lpni_notifying || !ni) { + spin_unlock(&lp->lpni_lock); return; + } lp->lpni_notifying = 1; + /* + * lp->lpni_notify needs to be protected because it can be set in + * lnet_notify_locked(). + */ while (lp->lpni_notify) { alive = lp->lpni_alive; notifylnd = lp->lpni_notifylnd; @@ -153,6 +175,7 @@ lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) lp->lpni_notify = 0; if (notifylnd && ni->ni_net->net_lnd->lnd_notify) { + spin_unlock(&lp->lpni_lock); lnet_net_unlock(lp->lpni_cpt); /* @@ -163,10 +186,12 @@ lnet_ni_notify_locked(struct lnet_ni *ni, struct lnet_peer_ni *lp) alive); lnet_net_lock(lp->lpni_cpt); + spin_lock(&lp->lpni_lock); } } lp->lpni_notifying = 0; + spin_unlock(&lp->lpni_lock); } static void @@ -623,6 +648,12 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) if (!gw->lpni_alive) return; + /* + * Protect gw->lpni_ping_feats. 
This can be set from + * lnet_notify_locked with different locks being held + */ + spin_lock(&gw->lpni_lock); + if (info->pi_magic == __swab32(LNET_PROTO_PING_MAGIC)) lnet_swap_pinginfo(info); @@ -631,6 +662,7 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) CDEBUG(D_NET, "%s: Unexpected magic %08x\n", libcfs_nid2str(gw->lpni_nid), info->pi_magic); gw->lpni_ping_feats = LNET_PING_FEAT_INVAL; + spin_unlock(&gw->lpni_lock); return; } @@ -638,11 +670,14 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) if (!(gw->lpni_ping_feats & LNET_PING_FEAT_MASK)) { CDEBUG(D_NET, "%s: Unexpected features 0x%x\n", libcfs_nid2str(gw->lpni_nid), gw->lpni_ping_feats); + spin_unlock(&gw->lpni_lock); return; /* nothing I can understand */ } - if (!(gw->lpni_ping_feats & LNET_PING_FEAT_NI_STATUS)) + if (!(gw->lpni_ping_feats & LNET_PING_FEAT_NI_STATUS)) { + spin_unlock(&gw->lpni_lock); return; /* can't carry NI status info */ + } list_for_each_entry(rte, &gw->lpni_routes, lr_gwlist) { int down = 0; @@ -662,6 +697,7 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) CDEBUG(D_NET, "%s: unexpected LNET_NID_ANY\n", libcfs_nid2str(gw->lpni_nid)); gw->lpni_ping_feats = LNET_PING_FEAT_INVAL; + spin_unlock(&gw->lpni_lock); return; } @@ -684,6 +720,7 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) CDEBUG(D_NET, "%s: Unexpected status 0x%x\n", libcfs_nid2str(gw->lpni_nid), stat->ns_status); gw->lpni_ping_feats = LNET_PING_FEAT_INVAL; + spin_unlock(&gw->lpni_lock); return; } @@ -700,6 +737,8 @@ lnet_parse_rc_info(struct lnet_rc_data *rcd) rte->lr_downis = down; } + + spin_unlock(&gw->lpni_lock); } static void @@ -773,10 +812,14 @@ lnet_wait_known_routerstate(void) all_known = 1; list_for_each_entry(rtr, &the_lnet.ln_routers, lpni_rtr_list) { + spin_lock(&rtr->lpni_lock); + if (!rtr->lpni_alive_count) { all_known = 0; + spin_unlock(&rtr->lpni_lock); break; } + spin_unlock(&rtr->lpni_lock); } lnet_net_unlock(cpt); @@ -1744,6 +1787,18 @@ lnet_notify(struct lnet_ni *ni, lnet_nid_t nid, int alive, time64_t 
when) return 0; } + /* + * It is possible for this function to be called for the same peer + * but with different NIs. We want to synchronize the notification + * between the different calls. So we will use the lpni_cpt to + * grab the net lock. + */ + if (lp->lpni_cpt != cpt) { + lnet_net_unlock(cpt); + cpt = lp->lpni_cpt; + lnet_net_lock(cpt); + } + /* * We can't fully trust LND on reporting exact peer last_alive * if he notifies us about dead peer. For example ksocklnd can
From patchwork Tue Sep 25 01:07:15 2018 X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613189 From: NeilBrown To: Oleg Drokin , Doug Oucharek , James Simmons , Andreas Dilger Date: Tue, 25 Sep 2018 11:07:15 +1000 Message-ID: <153783763565.32103.14024172070634950028.stgit@noble> In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble> References: <153783752960.32103.8394391715843917125.stgit@noble> Subject: [lustre-devel] [PATCH 20/34] LU-7734 lnet: protect peer_ni credits From: Amir Shehata Currently multiple NIs can talk to the same peer_ni. The per-CPT lnet_net_lock therefore no longer protects the lpni against concurrent updates. To resolve this issue a spinlock is added to the lnet_peer_ni, which must be locked when the peer NI credits, delayed message queue, and delayed routed message queue are modified. The lock is not taken when reporting credits.
Signed-off-by: Amir Shehata Signed-off-by: Olaf Weber Change-Id: I52153680a74d43e595314b63487026cc3f6a5a8f Reviewed-on: http://review.whamcloud.com/20702 Signed-off-by: NeilBrown --- drivers/staging/lustre/lnet/lnet/lib-move.c | 40 ++++++++++++++++++++------- drivers/staging/lustre/lnet/lnet/peer.c | 8 ++++- drivers/staging/lustre/lnet/lnet/router.c | 3 +- 3 files changed, 38 insertions(+), 13 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index 51224a4cb218..b4c7c8aa33a7 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -657,6 +657,7 @@ lnet_post_send_locked(struct lnet_msg *msg, int do_send) } if (!msg->msg_peertxcredit) { + spin_lock(&lp->lpni_lock); LASSERT((lp->lpni_txcredits < 0) == !list_empty(&lp->lpni_txq)); @@ -670,8 +671,10 @@ lnet_post_send_locked(struct lnet_msg *msg, int do_send) if (lp->lpni_txcredits < 0) { msg->msg_tx_delayed = 1; list_add_tail(&msg->msg_list, &lp->lpni_txq); + spin_unlock(&lp->lpni_lock); return LNET_CREDIT_WAIT; } + spin_unlock(&lp->lpni_lock); } if (!msg->msg_txcredit) { @@ -744,6 +747,7 @@ lnet_post_routed_recv_locked(struct lnet_msg *msg, int do_recv) LASSERT(!do_recv || msg->msg_rx_delayed); if (!msg->msg_peerrtrcredit) { + spin_lock(&lp->lpni_lock); LASSERT((lp->lpni_rtrcredits < 0) == !list_empty(&lp->lpni_rtrq)); @@ -757,8 +761,10 @@ lnet_post_routed_recv_locked(struct lnet_msg *msg, int do_recv) LASSERT(msg->msg_rx_ready_delay); msg->msg_rx_delayed = 1; list_add_tail(&msg->msg_list, &lp->lpni_rtrq); + spin_unlock(&lp->lpni_lock); return LNET_CREDIT_WAIT; } + spin_unlock(&lp->lpni_lock); } rbp = lnet_msg2bufpool(msg); @@ -822,6 +828,7 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) LASSERT(msg2->msg_txni == ni); LASSERT(msg2->msg_tx_delayed); + LASSERT(msg2->msg_tx_cpt == msg->msg_tx_cpt); (void)lnet_post_send_locked(msg2, 1); } @@ -831,6 +838,7 @@ lnet_return_tx_credits_locked(struct 
lnet_msg *msg) /* give back peer txcredits */ msg->msg_peertxcredit = 0; + spin_lock(&txpeer->lpni_lock); LASSERT((txpeer->lpni_txcredits < 0) == !list_empty(&txpeer->lpni_txq)); @@ -842,11 +850,22 @@ lnet_return_tx_credits_locked(struct lnet_msg *msg) msg2 = list_entry(txpeer->lpni_txq.next, struct lnet_msg, msg_list); list_del(&msg2->msg_list); + spin_unlock(&txpeer->lpni_lock); LASSERT(msg2->msg_txpeer == txpeer); LASSERT(msg2->msg_tx_delayed); + if (msg2->msg_tx_cpt != msg->msg_tx_cpt) { + lnet_net_unlock(msg->msg_tx_cpt); + lnet_net_lock(msg2->msg_tx_cpt); + } (void)lnet_post_send_locked(msg2, 1); + if (msg2->msg_tx_cpt != msg->msg_tx_cpt) { + lnet_net_unlock(msg2->msg_tx_cpt); + lnet_net_lock(msg->msg_tx_cpt); + } + } else { + spin_unlock(&txpeer->lpni_lock); } } @@ -887,17 +906,12 @@ lnet_schedule_blocked_locked(struct lnet_rtrbufpool *rbp) void lnet_drop_routed_msgs_locked(struct list_head *list, int cpt) { - struct list_head drop; struct lnet_msg *msg; - INIT_LIST_HEAD(&drop); - - list_splice_init(list, &drop); - lnet_net_unlock(cpt); - while(!list_empty(&drop)) { - msg = list_first_entry(&drop, struct lnet_msg, msg_list); + while (!list_empty(list)) { + msg = list_first_entry(list, struct lnet_msg, msg_list); lnet_ni_recv(msg->msg_rxni, msg->msg_private, NULL, 0, 0, 0, msg->msg_hdr.payload_length); list_del_init(&msg->msg_list); @@ -968,6 +982,7 @@ lnet_return_rx_credits_locked(struct lnet_msg *msg) /* give back peer router credits */ msg->msg_peerrtrcredit = 0; + spin_lock(&rxpeer->lpni_lock); LASSERT((rxpeer->lpni_rtrcredits < 0) == !list_empty(&rxpeer->lpni_rtrq)); @@ -977,14 +992,19 @@ lnet_return_rx_credits_locked(struct lnet_msg *msg) * peer. 
*/ if (!the_lnet.ln_routing) { - lnet_drop_routed_msgs_locked(&rxpeer->lpni_rtrq, - msg->msg_rx_cpt); + LIST_HEAD(drop); + + list_splice_init(&rxpeer->lpni_rtrq, &drop); + spin_unlock(&rxpeer->lpni_lock); + lnet_drop_routed_msgs_locked(&drop, msg->msg_rx_cpt); } else if (rxpeer->lpni_rtrcredits <= 0) { msg2 = list_entry(rxpeer->lpni_rtrq.next, struct lnet_msg, msg_list); list_del(&msg2->msg_list); - + spin_unlock(&rxpeer->lpni_lock); (void)lnet_post_routed_recv_locked(msg2, 1); + } else { + spin_unlock(&rxpeer->lpni_lock); } } if (rxni) { diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index dc4527f86113..3555e9bd1db1 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -56,12 +56,16 @@ lnet_peer_net_added(struct lnet_net *net) lpni_on_remote_peer_ni_list) { if (LNET_NIDNET(lpni->lpni_nid) == net->net_id) { lpni->lpni_net = net; + + spin_lock(&lpni->lpni_lock); lpni->lpni_txcredits = - lpni->lpni_mintxcredits = lpni->lpni_net->net_tunables.lct_peer_tx_credits; + lpni->lpni_mintxcredits = + lpni->lpni_txcredits; lpni->lpni_rtrcredits = - lpni->lpni_minrtrcredits = lnet_peer_buffer_credits(lpni->lpni_net); + lpni->lpni_minrtrcredits = lpni->lpni_rtrcredits; + spin_unlock(&lpni->lpni_lock); lnet_peer_remove_from_remote_list(lpni); } diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c index d3c41f5664a4..6c50e8cc7833 100644 --- a/drivers/staging/lustre/lnet/lnet/router.c +++ b/drivers/staging/lustre/lnet/lnet/router.c @@ -1358,7 +1358,8 @@ lnet_rtrpool_free_bufs(struct lnet_rtrbufpool *rbp, int cpt) INIT_LIST_HEAD(&tmp); lnet_net_lock(cpt); - lnet_drop_routed_msgs_locked(&rbp->rbp_msgs, cpt); + list_splice_init(&rbp->rbp_msgs, &tmp); + lnet_drop_routed_msgs_locked(&tmp, cpt); list_splice_init(&rbp->rbp_bufs, &tmp); rbp->rbp_req_nbuffers = 0; rbp->rbp_nbuffers = 0; From patchwork Tue Sep 25 01:07:15 2018 Content-Type: text/plain; 
charset="utf-8"
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID:
<153783763568.32103.16087987326485396768.stgit@noble>
Subject: [lustre-devel] [PATCH 21/34] LU-7734 lnet: simplify and fix lnet_select_pathway()

From: Amir Shehata

In lnet_select_pathway() we restart selection if the DLC seq counter changes. Provided we take a hold on the preferred lnet_peer_ni, we only need to restart if an lnet_ni was added or removed. Update the locations where lnet_incr_dlc_seq() is called to take this into account.

A number of local variables must be reset whenever we goto again. Do this immediately after the label for the global variables, and immediately before the block that uses them for the helper variables.

In the loop where NUMA distances are compared, use the NUMA range for distances smaller than the NUMA range, simplifying the subsequent comparisons between distances.

Remove the lo_sent output parameter. Instead do an early return with LNET_CREDIT_OK.

Move the increment of the best_lpni->lpni_seq number after the check that best_lpni isn't NULL.

When routing, the best_gw should be treated as the best_lpni for the purpose of determining the CPT to lock.
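The clamp-then-compare ordering this patch introduces (NUMA distance first, then available credits, then round-robin) can be sketched as below. `struct ni` and `select_ni()` are hypothetical reductions of the real `lnet_ni` bookkeeping, kept only to show the comparison order.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical reduction of the fields lnet_select_pathway() compares
 * per local NI. */
struct ni {
	unsigned int distance;	/* NUMA distance from the send CPT */
	int credits;		/* available tx credits */
	unsigned int seq;	/* round-robin sequence counter */
};

/* Pick the best NI.  Every distance below numa_range is clamped up to
 * numa_range, so all NIs within the range tie on distance and the
 * decision falls through to credits, then to round-robin (lowest seq
 * wins).  The winner's seq is bumped, as the patch does for lpni_seq. */
static struct ni *select_ni(struct ni *nis, int n, unsigned int numa_range)
{
	unsigned int shortest = UINT_MAX;
	int best_credits = INT_MIN;
	struct ni *best = 0;
	int i;

	for (i = 0; i < n; i++) {
		unsigned int distance = nis[i].distance;

		if (distance < numa_range)
			distance = numa_range;

		if (distance > shortest)
			continue;
		if (distance == shortest) {
			if (nis[i].credits < best_credits)
				continue;
			if (nis[i].credits == best_credits &&
			    best && best->seq <= nis[i].seq)
				continue;
		}
		shortest = distance;
		best_credits = nis[i].credits;
		best = &nis[i];
	}
	if (best)
		best->seq++;
	return best;
}
```

With two otherwise equal NIs, a second call returns the other one, which is the round-robin behaviour the commit message describes.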
Signed-off-by: Amir Shehata Signed-off-by: Olaf Weber Change-Id: Ie71eebc2301601cf1c85c6248dbed06951b89274 Reviewed-on: http://review.whamcloud.com/20720 Signed-off-by: NeilBrown --- drivers/staging/lustre/lnet/lnet/api-ni.c | 11 + drivers/staging/lustre/lnet/lnet/lib-move.c | 213 +++++++++++---------------- 2 files changed, 88 insertions(+), 136 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index 9807cfb3a0fc..ac6efcd746c5 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -72,10 +72,10 @@ MODULE_PARM_DESC(lnet_numa_range, /* * This sequence number keeps track of how many times DLC was used to - * update the configuration. It is incremented on any DLC update and - * checked when sending a message to determine if there is a need to - * re-run the selection algorithm to handle configuration change. - * Look at lnet_select_pathway() for more details on its usage. + * update the local NIs. It is incremented when a NI is added or + * removed and checked when sending a message to determine if there is + * a need to re-run the selection algorithm. See lnet_select_pathway() + * for more details on its usage. 
*/ static atomic_t lnet_dlc_seq_no = ATOMIC_INIT(0); @@ -1223,6 +1223,7 @@ lnet_shutdown_lndni(struct lnet_ni *ni) lnet_net_lock(LNET_LOCK_EX); ni->ni_state = LNET_NI_STATE_DELETING; lnet_ni_unlink_locked(ni); + lnet_incr_dlc_seq(); lnet_net_unlock(LNET_LOCK_EX); /* clear messages for this NI on the lazy portal */ @@ -2723,7 +2724,6 @@ LNetCtl(unsigned int cmd, void *arg) return -EINVAL; mutex_lock(&the_lnet.ln_api_mutex); - lnet_incr_dlc_seq(); rc = lnet_add_peer_ni_to_peer(cfg->prcfg_key_nid, cfg->prcfg_cfg_nid, cfg->prcfg_mr); @@ -2738,7 +2738,6 @@ LNetCtl(unsigned int cmd, void *arg) return -EINVAL; mutex_lock(&the_lnet.ln_api_mutex); - lnet_incr_dlc_seq(); rc = lnet_del_peer_ni_from_peer(cfg->prcfg_key_nid, cfg->prcfg_cfg_nid); mutex_unlock(&the_lnet.ln_api_mutex); diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c index b4c7c8aa33a7..8019c59cc64e 100644 --- a/drivers/staging/lustre/lnet/lnet/lib-move.c +++ b/drivers/staging/lustre/lnet/lnet/lib-move.c @@ -1132,48 +1132,44 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target, static int lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, - struct lnet_msg *msg, lnet_nid_t rtr_nid, bool *lo_sent) + struct lnet_msg *msg, lnet_nid_t rtr_nid) { struct lnet_ni *best_ni = NULL; struct lnet_peer_ni *best_lpni = NULL; - struct lnet_peer_ni *net_gw = NULL; struct lnet_peer_ni *best_gw = NULL; struct lnet_peer_ni *lpni; - struct lnet_peer *peer = NULL; + struct lnet_peer *peer; struct lnet_peer_net *peer_net; struct lnet_net *local_net; - struct lnet_ni *ni = NULL; + struct lnet_ni *ni; + __u32 seq; int cpt, cpt2, rc; - bool routing = false; - bool ni_is_pref = false; - bool preferred = false; - int best_credits = 0; - u32 seq, seq2; - int best_lpni_credits = INT_MIN; - int md_cpt = 0; - unsigned int shortest_distance = UINT_MAX; - unsigned int distance = 0; - bool found_ir = false; + bool routing; + bool ni_is_pref; + bool preferred; + int 
best_credits; + int best_lpni_credits; + int md_cpt; + unsigned int shortest_distance; -again: /* * get an initial CPT to use for locking. The idea here is not to * serialize the calls to select_pathway, so that as many * operations can run concurrently as possible. To do that we use * the CPT where this call is being executed. Later on when we * determine the CPT to use in lnet_message_commit, we switch the - * lock and check if there was any configuration changes, if none, - * then we proceed, if there is, then we'll need to update the cpt - * and redo the operation. + * lock and check if there was any configuration change. If none, + * then we proceed, if there is, then we restart the operation. */ cpt = lnet_net_lock_current(); - +again: + best_ni = NULL; + best_lpni = NULL; best_gw = NULL; - routing = false; local_net = NULL; - best_ni = NULL; - shortest_distance = UINT_MAX; - found_ir = false; + routing = false; + + seq = lnet_get_dlc_seq_locked(); if (the_lnet.ln_shutdown) { lnet_net_unlock(cpt); @@ -1186,13 +1182,6 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, else md_cpt = CFS_CPT_ANY; - /* - * initialize the variables which could be reused if we go to - * again - */ - lpni = NULL; - seq = lnet_get_dlc_seq_locked(); - peer = lnet_find_or_create_peer_locked(dst_nid, cpt); if (IS_ERR(peer)) { lnet_net_unlock(cpt); @@ -1234,10 +1223,8 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, libcfs_nid2str(src_nid)); return -EINVAL; } - } - - if (best_ni) goto pick_peer; + } /* * Decide whether we need to route to peer_ni. @@ -1254,6 +1241,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, local_net = lnet_get_net_locked(peer_net->lpn_net_id); if (!local_net) { + struct lnet_peer_ni *net_gw; /* * go through each peer_ni on that peer_net and * determine the best possible gw to go through @@ -1307,8 +1295,12 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, * 2. NI available credits * 3. 
Round Robin */ + shortest_distance = UINT_MAX; + best_credits = INT_MIN; + ni = NULL; while ((ni = lnet_get_next_ni_locked(local_net, ni))) { int ni_credits; + unsigned int distance; if (!lnet_is_ni_healthy_locked(ni)) continue; @@ -1316,7 +1308,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, ni_credits = atomic_read(&ni->ni_tx_credits); /* - * calculate the distance from the cpt on which + * calculate the distance from the CPT on which * the message memory is allocated to the CPT of * the NI's physical device */ @@ -1325,84 +1317,31 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, ni->dev_cpt); /* - * If we already have a closer NI within the NUMA - * range provided, then there is no need to - * consider the current NI. Move on to the next - * one. + * All distances smaller than the NUMA range + * are treated equally. */ - if (distance > shortest_distance && - distance > lnet_numa_range) - continue; + if (distance < lnet_numa_range) + distance = lnet_numa_range; - if (distance < shortest_distance && - distance > lnet_numa_range) { - /* - * The current NI is the closest one that we - * have found, even though it's not in the - * NUMA range specified. This occurs if - * the NUMA range is less than the least - * of the distances in the system. - * In effect NUMA range consideration is - * turned off. - */ + /* + * Select on shorter distance, then available + * credits, then round-robin. + */ + if (distance > shortest_distance) { + continue; + } else if (distance < shortest_distance) { shortest_distance = distance; - } else if ((distance <= shortest_distance && - distance < lnet_numa_range) || - distance == shortest_distance) { - /* - * This NI is either within range or it's - * equidistant. In both of these cases we - * would want to select the NI based on - * its available credits first, and then - * via Round Robin. 
- */ - if (distance <= shortest_distance && - distance < lnet_numa_range) { - /* - * If this is the first NI that's - * within range, then set the - * shortest distance to the range - * specified by the user. In - * effect we're saying that all - * NIs that fall within this NUMA - * range shall be dealt with as - * having equal NUMA weight. Which - * will mean that we should select - * through that set by their - * available credits first - * followed by Round Robin. - * - * And since this is the first NI - * in the range, let's just set it - * as our best_ni for now. The - * following NIs found in the - * range will be dealt with as - * mentioned previously. - */ - shortest_distance = lnet_numa_range; - if (!found_ir) { - found_ir = true; - goto set_ni; - } - } - /* - * This NI is NUMA equidistant let's - * select using credits followed by Round - * Robin. - */ - if (ni_credits < best_credits) { + } else if (ni_credits < best_credits) { + continue; + } else if (ni_credits == best_credits) { + if (best_ni && best_ni->ni_seq <= ni->ni_seq) continue; - } else if (ni_credits == best_credits) { - if (best_ni && - best_ni->ni_seq <= ni->ni_seq) - continue; - } } -set_ni: best_ni = ni; best_credits = ni_credits; } } + /* * if the peer is not MR capable, then we should always send to it * using the first NI in the NET we determined. 
@@ -1436,17 +1375,12 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, msg->msg_hdr.src_nid = cpu_to_le64(best_ni->ni_nid); msg->msg_target.nid = best_ni->ni_nid; lnet_msg_commit(msg, cpt); - - lnet_net_unlock(cpt); msg->msg_txni = best_ni; - lnet_ni_send(best_ni, msg); + lnet_net_unlock(cpt); - *lo_sent = true; - return 0; + return LNET_CREDIT_OK; } - lpni = NULL; - if (msg->msg_type == LNET_MSG_REPLY || msg->msg_type == LNET_MSG_ACK) { /* @@ -1509,14 +1443,22 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, */ u32 net_id = peer_net->lpn_net_id; - lnet_net_unlock(cpt); - if (!best_lpni) - LCONSOLE_WARN("peer net %s unhealthy\n", - libcfs_net2str(net_id)); + LCONSOLE_WARN("peer net %s unhealthy\n", + libcfs_net2str(net_id)); goto again; } - best_lpni = NULL; + /* + * Look at the peer NIs for the destination peer that connect + * to the chosen net. If a peer_ni is preferred when using the + * best_ni to communicate, we use that one. If there is no + * preferred peer_ni, or there are multiple preferred peer_ni, + * the available transmit credits are used. If the transmit + * credits are equal, we round-robin over the peer_ni. + */ + lpni = NULL; + best_lpni_credits = INT_MIN; + preferred = false; while ((lpni = lnet_get_next_peer_ni_locked(peer, peer_net, lpni))) { /* * if this peer ni is not healthy just skip it, no point in @@ -1559,12 +1501,6 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, best_lpni_credits = lpni->lpni_txcredits; } - /* - * Increment sequence number of the peer selected so that we can - * pick the next one in Round Robin. - */ - best_lpni->lpni_seq++; - /* if we still can't find a peer ni then we can't reach it */ if (!best_lpni) { u32 net_id = peer_net ? peer_net->lpn_net_id : @@ -1577,6 +1513,25 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, } send: + /* + * Increment sequence number of the peer selected so that we + * pick the next one in Round Robin. 
+ */ + best_lpni->lpni_seq++; + + /* + * When routing the best gateway found acts as the best peer + * NI to send to. + */ + if (routing) + best_lpni = best_gw; + + /* + * grab a reference on the peer_ni so it sticks around even if + * we need to drop and relock the lnet_net_lock below. + */ + lnet_peer_ni_addref_locked(best_lpni); + /* * Use lnet_cpt_of_nid() to determine the CPT used to commit the * message. This ensures that we get a CPT that is correct for @@ -1584,16 +1539,15 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, * If the selected CPT differs from the one currently locked, we * must unlock and relock the lnet_net_lock(), and then check whether * the configuration has changed. We don't have a hold on the best_ni - * or best_peer_ni yet, and they may have vanished. + * yet, and it may have vanished. */ cpt2 = lnet_cpt_of_nid_locked(best_lpni->lpni_nid, best_ni); if (cpt != cpt2) { lnet_net_unlock(cpt); cpt = cpt2; lnet_net_lock(cpt); - seq2 = lnet_get_dlc_seq_locked(); - if (seq2 != seq) { - lnet_net_unlock(cpt); + if (seq != lnet_get_dlc_seq_locked()) { + lnet_peer_ni_decref_locked(best_lpni); goto again; } } @@ -1602,15 +1556,15 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, * store the best_lpni in the message right away to avoid having * to do the same operation under different conditions */ - msg->msg_txpeer = (routing) ? best_gw : best_lpni; + msg->msg_txpeer = best_lpni; msg->msg_txni = best_ni; + /* * grab a reference for the best_ni since now it's in use in this * send. 
the reference will need to be dropped when the message is * finished in lnet_finalize() */ lnet_ni_addref_locked(msg->msg_txni, cpt); - lnet_peer_ni_addref_locked(msg->msg_txpeer); /* * set the destination nid in the message here because it's @@ -1659,7 +1613,6 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) { lnet_nid_t dst_nid = msg->msg_target.nid; int rc; - bool lo_sent = false; /* * NB: rtr_nid is set to LNET_NID_ANY for all current use-cases, @@ -1676,8 +1629,8 @@ lnet_send(lnet_nid_t src_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) LASSERT(!msg->msg_tx_committed); - rc = lnet_select_pathway(src_nid, dst_nid, msg, rtr_nid, &lo_sent); - if (rc < 0 || lo_sent) + rc = lnet_select_pathway(src_nid, dst_nid, msg, rtr_nid); + if (rc < 0) return rc; if (rc == LNET_CREDIT_OK)

From patchwork Tue Sep 25 01:07:15 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763572.32103.17043305407995526499.stgit@noble>
Subject: [lustre-devel] [PATCH 22/34] LU-7734 lnet: fix lnet_peer_table_cleanup_locked()

From: Olaf Weber

In lnet_peer_table_cleanup_locked() we delete the entire peer if the lnet_peer_ni for the primary NID of the peer is deleted. If the next lnet_peer_ni in the list belongs to the peer being deleted, then the next pointer kept by list_for_each_entry_safe() ends up pointing at freed memory. Add a list_for_each_entry_from() loop to advance next to a peer_ni that does not belong to the peer being deleted and will therefore remain present in the list.
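The use-after-free being fixed, and the fix of advancing `next` past every entry of the peer about to be deleted, can be sketched with a plain singly linked list. `entry`, `push()`, `del_peer()`, and `cleanup()` are invented stand-ins for the hash-chain and peer structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented stand-ins: each entry in a hash chain belongs to a peer, and
 * deleting a peer frees every entry of that peer (as
 * lnet_peer_del_locked() does).  Like the patch, the sketch assumes the
 * entries of one peer sit adjacently in the chain. */
struct entry {
	int peer;
	struct entry *next;
};

static struct entry *push(struct entry *head, int peer)
{
	struct entry *e = malloc(sizeof(*e));

	e->peer = peer;
	e->next = head;
	return e;
}

/* Free every entry of 'peer', returning the new chain head. */
static struct entry *del_peer(struct entry *head, int peer)
{
	struct entry **pp = &head;

	while (*pp) {
		if ((*pp)->peer == peer) {
			struct entry *dead = *pp;

			*pp = dead->next;
			free(dead);
		} else {
			pp = &(*pp)->next;
		}
	}
	return head;
}

/* Walk the chain deleting whole peers.  The saved 'next' pointer is
 * first advanced past every entry of the peer about to be deleted, so
 * it can never point at freed memory, which is the fix the commit
 * describes.  Returns the number of peers deleted. */
static int cleanup(struct entry **head)
{
	struct entry *lpni = *head;
	int deleted = 0;

	while (lpni) {
		int peer = lpni->peer;
		struct entry *next = lpni->next;

		while (next && next->peer == peer)
			next = next->next;

		*head = del_peer(*head, peer);
		deleted++;
		lpni = next;
	}
	return deleted;
}
```

Without the inner skip loop, `next` would be left pointing at a freed entry whenever a peer owns two adjacent entries, exactly the bug described above.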
Signed-off-by: Olaf Weber Change-Id: I92bf219dc93a79f7d90035ccfbb38cd251138c04 Reviewed-on: http://review.whamcloud.com/20824 Reviewed-by: Amir Shehata Tested-by: Amir Shehata Signed-off-by: NeilBrown --- drivers/staging/lustre/lnet/lnet/peer.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 3555e9bd1db1..11edf3632405 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -331,26 +331,32 @@ lnet_peer_table_cleanup_locked(struct lnet_net *net, struct lnet_peer_table *ptable) { int i; + struct lnet_peer_ni *next; struct lnet_peer_ni *lpni; - struct lnet_peer_ni *tmp; struct lnet_peer *peer; for (i = 0; i < LNET_PEER_HASH_SIZE; i++) { - list_for_each_entry_safe(lpni, tmp, &ptable->pt_hash[i], + list_for_each_entry_safe(lpni, next, &ptable->pt_hash[i], lpni_hashlist) { if (net && net != lpni->lpni_net) continue; - /* - * check if by removing this peer ni we should be - * removing the entire peer. - */ peer = lpni->lpni_peer_net->lpn_peer; - - if (peer->lp_primary_nid == lpni->lpni_nid) - lnet_peer_del_locked(peer); - else + if (peer->lp_primary_nid != lpni->lpni_nid) { lnet_peer_ni_del_locked(lpni); + continue; + } + /* + * Removing the primary NID implies removing + * the entire peer. Advance next beyond any + * peer_ni that belongs to the same peer. 
+ */ list_for_each_entry_from(next, &ptable->pt_hash[i], lpni_hashlist) { if (next->lpni_peer_net->lpn_peer != peer) break; } lnet_peer_del_locked(peer); } } }

From patchwork Tue Sep 25 01:07:15 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763576.32103.3176502972388297285.stgit@noble>
Subject: [lustre-devel] [PATCH 23/34] LU-7734 lnet: configuration fixes

From: Amir Shehata

Fix cpt configuration from DLC to configure the proper list of cpts in LNet. Check in LNet that no CPTs are outside the available CPTs in the system. Fix peer_rtr_credits name to peer_tx_credits to reflect the actual value.
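The CPT check this patch adds to lnet_dyn_add_ni() amounts to a bounds check before any allocation happens. In this standalone sketch the CPT count is a made-up constant and the function name is invented:

```c
#include <assert.h>
#include <errno.h>

#define LNET_CPT_NUMBER 4	/* assumed CPT count for this sketch */

/* Reject any configured CPT outside [0, LNET_CPT_NUMBER), as the patch
 * now does in lnet_dyn_add_ni() before the NI is allocated. */
static int validate_cpts(const unsigned int *cpts, int ncpts)
{
	int i;

	for (i = 0; i < ncpts; i++) {
		if (cpts[i] >= LNET_CPT_NUMBER)
			return -EINVAL;
	}
	return 0;
}
```

Doing the check before lnet_ni_alloc_w_cpt_array() keeps the error path trivial: nothing has been allocated yet, so -EINVAL can be returned directly.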
Signed-off-by: Amir Shehata Change-Id: Ic4a3985a470ed901be6166df4079205677921817 Reviewed-on: http://review.whamcloud.com/20862 Signed-off-by: NeilBrown --- .../lustre/include/uapi/linux/lnet/lnet-dlc.h | 4 +++- drivers/staging/lustre/lnet/lnet/api-ni.c | 9 +++++++-- drivers/staging/lustre/lnet/lnet/peer.c | 3 ++- 3 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h index b31b69c25ef2..a5e94619d3f1 100644 --- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h +++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h @@ -169,6 +169,7 @@ struct lnet_ioctl_config_ni { __u32 lic_tcp_bonding; __u32 lic_idx; __s32 lic_dev_cpt; + char pad[4]; char lic_bulk[0]; }; @@ -177,9 +178,10 @@ struct lnet_peer_ni_credit_info { __u32 cr_refcount; __s32 cr_ni_peer_tx_credits; __s32 cr_peer_tx_credits; + __s32 cr_peer_min_tx_credits; + __u32 cr_peer_tx_qnob; __s32 cr_peer_rtr_credits; __s32 cr_peer_min_rtr_credits; - __u32 cr_peer_tx_qnob; __u32 cr_ncpt; }; diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c index ac6efcd746c5..60176d05d34a 100644 --- a/drivers/staging/lustre/lnet/lnet/api-ni.c +++ b/drivers/staging/lustre/lnet/lnet/api-ni.c @@ -2283,7 +2283,7 @@ int lnet_dyn_add_ni(struct lnet_ioctl_config_ni *conf) struct lnet_net *net; struct lnet_ni *ni; struct lnet_ioctl_config_lnd_tunables *tun = NULL; - int rc; + int rc, i; u32 net_id; /* get the tunables if they are available */ @@ -2303,6 +2303,11 @@ int lnet_dyn_add_ni(struct lnet_ioctl_config_ni *conf) if (!net) return -ENOMEM; + for (i = 0; i < conf->lic_ncpts; i++) { + if (conf->lic_cpts[i] >= LNET_CPT_NUMBER) + return -EINVAL; + } + ni = lnet_ni_alloc_w_cpt_array(net, conf->lic_cpts, conf->lic_ncpts, conf->lic_ni_intf[0]); if (!ni) @@ -2760,7 +2765,7 @@ LNetCtl(unsigned int cmd, void *arg) 
&peer_info->pr_lnd_u.pr_peer_credits.cr_ni_peer_tx_credits, &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_tx_credits, &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_rtr_credits, - &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_min_rtr_credits, + &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_min_tx_credits, &peer_info->pr_lnd_u.pr_peer_credits.cr_peer_tx_qnob); mutex_unlock(&the_lnet.ln_api_mutex); return rc; diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c index 11edf3632405..6f6039189456 100644 --- a/drivers/staging/lustre/lnet/lnet/peer.c +++ b/drivers/staging/lustre/lnet/lnet/peer.c @@ -1127,7 +1127,8 @@ int lnet_get_peer_info(__u32 idx, lnet_nid_t *primary_nid, lnet_nid_t *nid, lpni->lpni_net->net_tunables.lct_peer_tx_credits : 0; peer_ni_info->cr_peer_tx_credits = lpni->lpni_txcredits; peer_ni_info->cr_peer_rtr_credits = lpni->lpni_rtrcredits; - peer_ni_info->cr_peer_min_rtr_credits = lpni->lpni_mintxcredits; + peer_ni_info->cr_peer_min_rtr_credits = lpni->lpni_minrtrcredits; + peer_ni_info->cr_peer_min_tx_credits = lpni->lpni_mintxcredits; peer_ni_info->cr_peer_tx_qnob = lpni->lpni_txqnob; peer_ni_stats->send_count = atomic_read(&lpni->lpni_stats.send_count); From patchwork Tue Sep 25 01:07:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 10613197 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 29B711390 for ; Tue, 25 Sep 2018 01:12:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2E2BE2A059 for ; Tue, 25 Sep 2018 01:12:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 227BB2A05E; Tue, 25 Sep 2018 01:12:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on 
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763579.32103.5954776193692454631.stgit@noble>
Subject: [lustre-devel] [PATCH 24/34] LU-7734 lnet: fix lnet_select_pathway()
From: Amir Shehata

Fixed the selection algorithm to work properly with more than one local
network. The behavior now is to iterate through all interfaces on all
networks. Also removed the health variable from struct lnet_peer_net,
since it is never used.

Signed-off-by: Amir Shehata
Change-Id: Ib91748e80446585b6a9e1bc0f3af6894599d8aaa
Reviewed-on: http://review.whamcloud.com/20890
Reviewed-by: Doug Oucharek
Signed-off-by: NeilBrown
---
 .../staging/lustre/include/linux/lnet/lib-types.h |    3 ---
 drivers/staging/lustre/lnet/lnet/lib-move.c       |    4 ++--
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index 90a5c6e40dea..0761fd533f8d 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -504,9 +504,6 @@ struct lnet_peer_net {

 	/* Net ID */
 	__u32 lpn_net_id;
-
-	/* health flag */
-	bool lpn_healthy;
 };

 /* peer hash size */
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 8019c59cc64e..2f30ba0d89fb 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1235,6 +1235,8 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	 * b. Iterate through each of the peer_nets/nis to decide
 	 * the best peer/local_ni pair to use
 	 */
+	shortest_distance = UINT_MAX;
+	best_credits = INT_MIN;
 	list_for_each_entry(peer_net, &peer->lp_peer_nets, lpn_on_peer_list) {
 		if (!lnet_is_peer_net_healthy_locked(peer_net))
 			continue;
@@ -1295,8 +1297,6 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	 * 2. NI available credits
 	 * 3. Round Robin
 	 */
-	shortest_distance = UINT_MAX;
-	best_credits = INT_MIN;
 	ni = NULL;
 	while ((ni = lnet_get_next_ni_locked(local_net, ni))) {
 		int ni_credits;
From patchwork Tue Sep 25 01:07:15 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763583.32103.2609293106875022233.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 25/34] LU-7734 lnet: Routing fixes part 1
From: Amir Shehata

This is the first part of a routing fix.

- Fix a crash in lnet_parse_get().
- Resolve a deadlock when adding a route.
- Fix an issue with dynamically turning on routing.
- Set the final destination NID properly when routing a message.

Signed-off-by: Amir Shehata
Change-Id: I68d0e4d52192aa96e37c77952a1ebe75c1b770c5
Reviewed-on: http://review.whamcloud.com/21166
Signed-off-by: NeilBrown
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h |    1 
 drivers/staging/lustre/lnet/lnet/lib-move.c      |   23 +++++--
 drivers/staging/lustre/lnet/lnet/peer.c          |   62 ++++++++++++++++----
 drivers/staging/lustre/lnet/lnet/router.c        |    2 -
 4 files changed, 67 insertions(+), 21 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 55bcd17cd4dc..3a53d54b711d 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -649,6 +649,7 @@ struct lnet_peer_ni *lnet_get_next_peer_ni_locked(struct lnet_peer *peer,
 						  struct lnet_peer_ni *prev);
 struct lnet_peer *lnet_find_or_create_peer_locked(lnet_nid_t dst_nid, int cpt);
 struct lnet_peer_ni *lnet_nid2peerni_locked(lnet_nid_t nid, int cpt);
+struct lnet_peer_ni *lnet_nid2peerni_ex(lnet_nid_t nid, int cpt);
 struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid);
 void lnet_peer_net_added(struct lnet_net *net);
 lnet_nid_t lnet_peer_primary_nid(lnet_nid_t nid);
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 2f30ba0d89fb..58521b014ef3 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1566,17 +1566,10 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	 */
 	lnet_ni_addref_locked(msg->msg_txni, cpt);

-	/*
-	 * set the destination nid in the message here because it's
-	 * possible that we'd be sending to a different nid than the one
-	 * originaly given.
-	 */
-	msg->msg_hdr.dest_nid = cpu_to_le64(msg->msg_txpeer->lpni_nid);
-
 	/*
 	 * Always set the target.nid to the best peer picked. Either the
 	 * nid will be one of the preconfigured NIDs, or the same NID as
-	 * what was originaly set in the target or it will be the NID of
+	 * what was originally set in the target or it will be the NID of
 	 * a router if this message should be routed
 	 */
 	msg->msg_target.nid = msg->msg_txpeer->lpni_nid;
@@ -1599,6 +1592,19 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	if (routing) {
 		msg->msg_target_is_router = 1;
 		msg->msg_target.pid = LNET_PID_LUSTRE;
+		/*
+		 * since we're routing we want to ensure that the
+		 * msg_hdr.dest_nid is set to the final destination. When
+		 * the router receives this message it knows how to route
+		 * it.
+		 */
+		msg->msg_hdr.dest_nid = cpu_to_le64(dst_nid);
+	} else {
+		/*
+		 * if we're not routing set the dest_nid to the best peer
+		 * ni that we picked earlier in the algorithm.
+		 */
+		msg->msg_hdr.dest_nid = cpu_to_le64(msg->msg_txpeer->lpni_nid);
 	}

 	rc = lnet_post_send_locked(msg, 0);
@@ -1757,6 +1763,7 @@ lnet_parse_get(struct lnet_ni *ni, struct lnet_msg *msg, int rdma_get)
 	info.mi_rlength = hdr->msg.get.sink_length;
 	info.mi_roffset = hdr->msg.get.src_offset;
 	info.mi_mbits = hdr->msg.get.match_bits;
+	info.mi_cpt = lnet_cpt_of_nid(msg->msg_rxpeer->lpni_nid, ni);

 	rc = lnet_ptl_match_md(&info, msg);
 	if (rc == LNET_MATCHMD_DROP) {
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index 6f6039189456..9cecfb49db87 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -954,36 +954,74 @@ lnet_destroy_peer_ni_locked(struct lnet_peer_ni *lpni)
 	kfree(lpni);
 }

+struct lnet_peer_ni *
+lnet_nid2peerni_ex(lnet_nid_t nid, int cpt)
+{
+	struct lnet_peer_ni *lpni = NULL;
+	int rc;
+
+	if (the_lnet.ln_shutdown) /* it's shutting down */
+		return ERR_PTR(-ESHUTDOWN);
+
+	/*
+	 * find if a peer_ni already exists.
+	 * If so then just return that.
+	 */
+	lpni = lnet_find_peer_ni_locked(nid);
+	if (lpni)
+		return lpni;
+
+	lnet_net_unlock(cpt);
+
+	rc = lnet_peer_ni_traffic_add(nid);
+	if (rc) {
+		lpni = ERR_PTR(rc);
+		goto out_net_relock;
+	}
+
+	lpni = lnet_find_peer_ni_locked(nid);
+	LASSERT(lpni);
+
+out_net_relock:
+	lnet_net_lock(cpt);
+
+	return lpni;
+}
+
 struct lnet_peer_ni *
 lnet_nid2peerni_locked(lnet_nid_t nid, int cpt)
 {
-	struct lnet_peer_table *ptable;
 	struct lnet_peer_ni *lpni = NULL;
-	int cpt2;
 	int rc;

 	if (the_lnet.ln_shutdown) /* it's shutting down */
 		return ERR_PTR(-ESHUTDOWN);

 	/*
-	 * calculate cpt2 with the standard hash function
-	 * This cpt2 is the slot where we'll find or create the peer.
+	 * find if a peer_ni already exists.
+	 * If so then just return that.
 	 */
-	cpt2 = lnet_nid_cpt_hash(nid, LNET_CPT_NUMBER);
-	ptable = the_lnet.ln_peer_tables[cpt2];
-	lpni = lnet_get_peer_ni_locked(ptable, nid);
+	lpni = lnet_find_peer_ni_locked(nid);
 	if (lpni)
 		return lpni;

-	/* Slow path: serialized using the ln_api_mutex. */
+	/*
+	 * Slow path:
+	 * use the lnet_api_mutex to serialize the creation of the peer_ni
+	 * and the creation/deletion of the local ni/net. When a local ni is
+	 * created, if there exists a set of peer_nis on that network,
+	 * they need to be traversed and updated. When a local NI is
+	 * deleted, which could result in a network being deleted, then
+	 * all peer nis on that network need to be removed as well.
+	 *
+	 * Creation through traffic should also be serialized with
+	 * creation through DLC.
+	 */
 	lnet_net_unlock(cpt);
 	mutex_lock(&the_lnet.ln_api_mutex);

 	/*
 	 * Shutdown is only set under the ln_api_lock, so a single
 	 * check here is sufficent.
-	 *
-	 * lnet_add_nid_to_peer() also handles the case where we've
-	 * raced and a different thread added the NID.
 	 */
 	if (the_lnet.ln_shutdown) {
 		lpni = ERR_PTR(-ESHUTDOWN);
@@ -996,7 +1034,7 @@ lnet_nid2peerni_locked(lnet_nid_t nid, int cpt)
 		goto out_mutex_unlock;
 	}

-	lpni = lnet_get_peer_ni_locked(ptable, nid);
+	lpni = lnet_find_peer_ni_locked(nid);
 	LASSERT(lpni);

 out_mutex_unlock:
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 6c50e8cc7833..a0483f970bd5 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -358,7 +358,7 @@ lnet_add_route(__u32 net, __u32 hops, lnet_nid_t gateway,

 	lnet_net_lock(LNET_LOCK_EX);

-	lpni = lnet_nid2peerni_locked(gateway, LNET_LOCK_EX);
+	lpni = lnet_nid2peerni_ex(gateway, LNET_LOCK_EX);
 	if (IS_ERR(lpni)) {
 		lnet_net_unlock(LNET_LOCK_EX);
From patchwork Tue Sep 25 01:07:15 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763587.32103.5037367646271689437.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 26/34] LU-7734 lnet: Routing fixes part 2
From: Amir Shehata

Fix lnet_select_pathway() to handle the routing cases correctly.
The following general cases are handled:
. Non-MR directly connected
. Non-MR not directly connected
. MR directly connected
. MR not directly connected
. No gateway
. Gateway is non-MR
. Gateway is MR

Signed-off-by: Amir Shehata
Change-Id: If2d16b797b94421e78a9f2a254a250a440f8b244
Reviewed-on: http://review.whamcloud.com/21167
Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lnet/lnet/lib-move.c |  214 ++++++++++++++++++---------
 drivers/staging/lustre/lnet/lnet/peer.c     |   29 +++-
 2 files changed, 167 insertions(+), 76 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 58521b014ef3..12bc80d060e9 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1145,6 +1145,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	__u32 seq;
 	int cpt, cpt2, rc;
 	bool routing;
+	bool routing2;
 	bool ni_is_pref;
 	bool preferred;
 	int best_credits;
@@ -1168,6 +1169,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	best_gw = NULL;
 	local_net = NULL;
 	routing = false;
+	routing2 = false;

 	seq = lnet_get_dlc_seq_locked();

@@ -1201,7 +1203,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	}

 	/*
-	 * STEP 1: first jab at determineing best_ni
+	 * STEP 1: first jab at determining best_ni
 	 * if src_nid is explicitly specified, then best_ni is already
 	 * pre-determiend for us. Otherwise we need to select the best
 	 * one to use later on
@@ -1215,17 +1217,122 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 			       libcfs_nid2str(src_nid));
 			return -EINVAL;
 		}
+	}
+
+	if (msg->msg_type == LNET_MSG_REPLY ||
+	    msg->msg_type == LNET_MSG_ACK ||
+	    !peer->lp_multi_rail) {
+		/*
+		 * for replies we want to respond on the same peer_ni we
+		 * received the message on if possible. If not, then pick
+		 * a peer_ni to send to
+		 *
+		 * if the peer is non-multi-rail then you want to send to
+		 * the dst_nid provided as well.
+		 *
+		 * It is expected to find the lpni using dst_nid, since we
+		 * created it earlier.
+		 */
+		best_lpni = lnet_find_peer_ni_locked(dst_nid);
+		if (best_lpni)
+			lnet_peer_ni_decref_locked(best_lpni);

-		if (best_ni->ni_net->net_id != LNET_NIDNET(dst_nid)) {
+		if (best_lpni && !lnet_get_net_locked(LNET_NIDNET(dst_nid))) {
+			/*
+			 * this lpni is not on a local network so we need
+			 * to route this reply.
+			 */
+			best_gw = lnet_find_route_locked(NULL,
+							 best_lpni->lpni_nid,
+							 rtr_nid);
+			if (best_gw) {
+				/*
+				 * RULE: Each node considers only the next-hop
+				 *
+				 * We're going to route the message,
+				 * so change the peer to the router.
+				 */
+				LASSERT(best_gw->lpni_peer_net);
+				LASSERT(best_gw->lpni_peer_net->lpn_peer);
+				peer = best_gw->lpni_peer_net->lpn_peer;
+
+				/*
+				 * if the router is not multi-rail
+				 * then use the best_gw found to send
+				 * the message to
+				 */
+				if (!peer->lp_multi_rail)
+					best_lpni = best_gw;
+				else
+					best_lpni = NULL;
+
+				routing = true;
+			} else {
+				best_lpni = NULL;
+			}
+		} else if (!best_lpni) {
 			lnet_net_unlock(cpt);
-			LCONSOLE_WARN("No route to %s via from %s\n",
-				      libcfs_nid2str(dst_nid),
-				      libcfs_nid2str(src_nid));
+			CERROR("unable to send msg_type %d to originating %s. Destination NID not in DB\n",
+			       msg->msg_type, libcfs_nid2str(dst_nid));
 			return -EINVAL;
 		}
-		goto pick_peer;
 	}

+	/*
+	 * if the peer is not MR capable, then we should always send to it
+	 * using the first NI in the NET we determined.
+	 */
+	if (!peer->lp_multi_rail) {
+		if (!best_lpni) {
+			lnet_net_unlock(cpt);
+			CERROR("no route to %s\n",
+			       libcfs_nid2str(dst_nid));
+			return -EHOSTUNREACH;
+		}
+
+		/* best ni could be set because src_nid was provided */
+		if (!best_ni) {
+			best_ni = lnet_net2ni_locked(
+					best_lpni->lpni_net->net_id, cpt);
+			if (!best_ni) {
+				lnet_net_unlock(cpt);
+				CERROR("no path to %s from net %s\n",
+				       libcfs_nid2str(best_lpni->lpni_nid),
+				       libcfs_net2str(best_lpni->lpni_net->net_id));
+				return -EHOSTUNREACH;
+			}
+		}
+	}
+
+	if (best_ni == the_lnet.ln_loni) {
+		/* No send credit hassles with LOLND */
+		lnet_ni_addref_locked(best_ni, cpt);
+		msg->msg_hdr.dest_nid = cpu_to_le64(best_ni->ni_nid);
+		if (!msg->msg_routing)
+			msg->msg_hdr.src_nid = cpu_to_le64(best_ni->ni_nid);
+		msg->msg_target.nid = best_ni->ni_nid;
+		lnet_msg_commit(msg, cpt);
+		msg->msg_txni = best_ni;
+		lnet_net_unlock(cpt);
+
+		return LNET_CREDIT_OK;
+	}
+
+	/*
+	 * if we already found a best_ni because src_nid is specified and
+	 * best_lpni because we are replying to a message then just send
+	 * the message
+	 */
+	if (best_ni && best_lpni)
+		goto send;
+
+	/*
+	 * If we already found a best_ni because src_nid is specified then
+	 * pick the peer then send the message
+	 */
+	if (best_ni)
+		goto pick_peer;
+
 	/*
 	 * Decide whether we need to route to peer_ni.
 	 * Get the local net that I need to be on to be able to directly
@@ -1242,7 +1349,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 			continue;

 		local_net = lnet_get_net_locked(peer_net->lpn_net_id);
-		if (!local_net) {
+		if (!local_net && !routing) {
 			struct lnet_peer_ni *net_gw;

 			/*
 			 * go through each peer_ni on that peer_net and
@@ -1263,14 +1370,11 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,

 				if (!best_gw) {
 					best_gw = net_gw;
-					best_lpni = lpni;
 				} else {
 					rc = lnet_compare_peers(net_gw, best_gw);
-					if (rc > 0) {
+					if (rc > 0)
 						best_gw = net_gw;
-						best_lpni = lpni;
-					}
 				}
 			}

@@ -1279,9 +1383,9 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 			local_net = lnet_get_net_locked
 					(LNET_NIDNET(best_gw->lpni_nid));
-			routing = true;
+			routing2 = true;
 		} else {
-			routing = false;
+			routing2 = false;
 			best_gw = NULL;
 		}

@@ -1342,12 +1446,17 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 		}
 	}

-	/*
-	 * if the peer is not MR capable, then we should always send to it
-	 * using the first NI in the NET we determined.
-	 */
-	if (!peer->lp_multi_rail && local_net)
-		best_ni = lnet_net2ni_locked(local_net->net_id, cpt);
+	if (routing2) {
+		/*
+		 * RULE: Each node considers only the next-hop
+		 *
+		 * We're going to route the message, so change the peer to
+		 * the router.
+		 */
+		LASSERT(best_gw->lpni_peer_net);
+		LASSERT(best_gw->lpni_peer_net->lpn_peer);
+		peer = best_gw->lpni_peer_net->lpn_peer;
+	}

 	if (!best_ni) {
 		lnet_net_unlock(cpt);
@@ -1363,43 +1472,11 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	 */
 	best_ni->ni_seq++;

-	if (routing)
-		goto send;
-
 pick_peer:
-	if (best_ni == the_lnet.ln_loni) {
-		/* No send credit hassles with LOLND */
-		lnet_ni_addref_locked(best_ni, cpt);
-		msg->msg_hdr.dest_nid = cpu_to_le64(best_ni->ni_nid);
-		if (!msg->msg_routing)
-			msg->msg_hdr.src_nid = cpu_to_le64(best_ni->ni_nid);
-		msg->msg_target.nid = best_ni->ni_nid;
-		lnet_msg_commit(msg, cpt);
-		msg->msg_txni = best_ni;
-		lnet_net_unlock(cpt);
-
-		return LNET_CREDIT_OK;
-	}
-
-	if (msg->msg_type == LNET_MSG_REPLY ||
-	    msg->msg_type == LNET_MSG_ACK) {
-		/*
-		 * for replies we want to respond on the same peer_ni we
-		 * received the message on if possible. If not, then pick
-		 * a peer_ni to send to
-		 */
-		best_lpni = lnet_find_peer_ni_locked(dst_nid);
-		if (best_lpni) {
-			lnet_peer_ni_decref_locked(best_lpni);
-			goto send;
-		} else {
-			CDEBUG(D_NET,
-			       "unable to send msg_type %d to originating %s\n",
-			       msg->msg_type,
-			       libcfs_nid2str(dst_nid));
-		}
-	}
-
+	/*
+	 * At this point the best_ni is on a local network on which
+	 * the peer has a peer_ni as well
+	 */
 	peer_net = lnet_peer_get_net_locked(peer, best_ni->ni_net->net_id);

 	/*
@@ -1429,13 +1506,16 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 		       libcfs_nid2str(best_gw->lpni_nid),
 		       lnet_msgtyp2str(msg->msg_type), msg->msg_len);

-		best_lpni = lnet_find_peer_ni_locked(dst_nid);
-		LASSERT(best_lpni);
-		lnet_peer_ni_decref_locked(best_lpni);
-
-		routing = true;
-
-		goto send;
+		routing2 = true;
+		/*
+		 * RULE: Each node considers only the next-hop
+		 *
+		 * We're going to route the message, so change the peer to
+		 * the router.
+		 */
+		LASSERT(best_gw->lpni_peer_net);
+		LASSERT(best_gw->lpni_peer_net->lpn_peer);
+		peer = best_gw->lpni_peer_net->lpn_peer;
 	} else if (!lnet_is_peer_net_healthy_locked(peer_net)) {
 		/*
 		 * this peer_net is unhealthy but we still have an opportunity
@@ -1459,6 +1539,7 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	lpni = NULL;
 	best_lpni_credits = INT_MIN;
 	preferred = false;
+	best_lpni = NULL;
 	while ((lpni = lnet_get_next_peer_ni_locked(peer, peer_net, lpni))) {
 		/*
 		 * if this peer ni is not healthy just skip it, no point in
@@ -1513,19 +1594,14 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	}

 send:
+	routing = routing || routing2;
+
 	/*
 	 * Increment sequence number of the peer selected so that we
 	 * pick the next one in Round Robin.
 	 */
 	best_lpni->lpni_seq++;

-	/*
-	 * When routing the best gateway found acts as the best peer
-	 * NI to send to.
-	 */
-	if (routing)
-		best_lpni = best_gw;
-
 	/*
 	 * grab a reference on the peer_ni so it sticks around even if
 	 * we need to drop and relock the lnet_net_lock below.
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index 9cecfb49db87..d757f4df1f39 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -225,11 +225,18 @@ lnet_try_destroy_peer_hierarchy_locked(struct lnet_peer_ni *lpni)
 }

 /* called with lnet_net_lock LNET_LOCK_EX held */
-static void
+static int
 lnet_peer_ni_del_locked(struct lnet_peer_ni *lpni)
 {
 	struct lnet_peer_table *ptable = NULL;

+	/* don't remove a peer_ni if it's also a gateway */
+	if (lpni->lpni_rtr_refcount > 0) {
+		CERROR("Peer NI %s is a gateway. Can not delete it\n",
+		       libcfs_nid2str(lpni->lpni_nid));
+		return -EBUSY;
+	}
+
 	lnet_peer_remove_from_remote_list(lpni);

 	/* remove peer ni from the hash list. */
@@ -260,6 +267,8 @@ lnet_peer_ni_del_locked(struct lnet_peer_ni *lpni)

 	/* decrement reference on peer */
 	lnet_peer_ni_decref_locked(lpni);
+
+	return 0;
 }

 void lnet_peer_uninit(void)
@@ -313,17 +322,22 @@ lnet_peer_tables_create(void)
 	return 0;
 }

-static void
+static int
 lnet_peer_del_locked(struct lnet_peer *peer)
 {
 	struct lnet_peer_ni *lpni = NULL, *lpni2;
+	int rc = 0, rc2 = 0;

 	lpni = lnet_get_next_peer_ni_locked(peer, NULL, lpni);
 	while (lpni) {
 		lpni2 = lnet_get_next_peer_ni_locked(peer, NULL, lpni);
-		lnet_peer_ni_del_locked(lpni);
+		rc = lnet_peer_ni_del_locked(lpni);
+		if (rc != 0)
+			rc2 = rc;
 		lpni = lpni2;
 	}
+
+	return rc2;
 }

 static void
@@ -899,6 +913,7 @@ lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid)
 	lnet_nid_t local_nid;
 	struct lnet_peer *peer;
 	struct lnet_peer_ni *lpni;
+	int rc;

 	if (key_nid == LNET_NID_ANY)
 		return -EINVAL;
@@ -919,17 +934,17 @@ lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid)
 		 * entire peer
 		 */
 		lnet_net_lock(LNET_LOCK_EX);
-		lnet_peer_del_locked(peer);
+		rc = lnet_peer_del_locked(peer);
 		lnet_net_unlock(LNET_LOCK_EX);

-		return 0;
+		return rc;
 	}

 	lnet_net_lock(LNET_LOCK_EX);
-	lnet_peer_ni_del_locked(lpni);
+	rc = lnet_peer_ni_del_locked(lpni);
 	lnet_net_unlock(LNET_LOCK_EX);

-	return 0;
+	return rc;
 }

 void
From patchwork Tue Sep 25 01:07:15 2018
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763590.32103.13916552051734764199.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 27/34] LU-7734 lnet: fix routing selection
From: Amir Shehata

Always prefer locally connected networks over routed networks. If
there are multiple routed networks and no connected networks, pick the
best gateway to use. If all gateways are equal, then round robin
through them.

Renamed dev_cpt to ni_dev_cpt to maintain the naming convention.

Signed-off-by: Amir Shehata
Change-Id: Ie6a3aaa7a9ec4f5474baf5e1ec0258d481418cb1
Reviewed-on: http://review.whamcloud.com/21326
Signed-off-by: NeilBrown
---
 .../staging/lustre/include/linux/lnet/lib-types.h |    4 
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c   |    2 
 .../staging/lustre/lnet/klnds/socklnd/socklnd.c   |    4 
 drivers/staging/lustre/lnet/lnet/api-ni.c         |    2 
 drivers/staging/lustre/lnet/lnet/lib-move.c       |  217 +++++++++++---------
 5 files changed, 131 insertions(+), 98 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index 0761fd533f8d..2d73aa1a121c 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -361,7 +361,7 @@ struct lnet_ni {
 	struct lnet_element_stats ni_stats;

 	/* physical device CPT */
-	int dev_cpt;
+	int ni_dev_cpt;

 	/* sequence number used to round robin over nis within a net */
 	u32 ni_seq;
@@ -464,6 +464,8 @@ struct lnet_peer_ni {
 	int lpni_rtr_refcount;
 	/* sequence number used to round robin over peer nis within a net */
 	u32 lpni_seq;
+	/* sequence number used to round robin over gateways */
+	__u32 lpni_gw_seq;
 	/* health flag */
 	bool lpni_healthy;
 	/* returned RC ping features. Protected with lpni_lock */
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index 71256500f245..0ed29177819a 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -2891,7 +2891,7 @@ static int kiblnd_startup(struct lnet_ni *ni)
 		goto failed;

 	node_id = dev_to_node(ibdev->ibd_hdev->ibh_ibdev->dma_device);
-	ni->dev_cpt = cfs_cpt_of_node(lnet_cpt_table(), node_id);
+	ni->ni_dev_cpt = cfs_cpt_of_node(lnet_cpt_table(), node_id);

 	net->ibn_dev = ibdev;
 	ni->ni_nid = LNET_MKNID(LNET_NIDNET(ni->ni_nid), ibdev->ibd_ifip);
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
index c14711804d7b..2ec84a73c522 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
@@ -2798,10 +2798,10 @@ ksocknal_startup(struct lnet_ni *ni)
 				      net->ksnn_interfaces[0].ksni_name);
 	if (net_dev) {
 		node_id = dev_to_node(&net_dev->dev);
-		ni->dev_cpt = cfs_cpt_of_node(lnet_cpt_table(), node_id);
+		ni->ni_dev_cpt = cfs_cpt_of_node(lnet_cpt_table(), node_id);
 		dev_put(net_dev);
 	} else {
-		ni->dev_cpt = CFS_CPT_ANY;
+		ni->ni_dev_cpt = CFS_CPT_ANY;
 	}

 	/* call it before add it to ksocknal_data.ksnd_nets */
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 60176d05d34a..f57200eab746 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -1910,7 +1910,7 @@ lnet_fill_ni_info(struct lnet_ni *ni, struct lnet_ioctl_config_ni *cfg_ni,
 	cfg_ni->lic_nid = ni->ni_nid;
 	cfg_ni->lic_status = ni->ni_status->ns_status;
 	cfg_ni->lic_tcp_bonding = use_tcp_bonding;
-	cfg_ni->lic_dev_cpt = ni->dev_cpt;
+	cfg_ni->lic_dev_cpt = ni->ni_dev_cpt;

 	memcpy(&tun->lt_cmn, &ni->ni_net->net_tunables, sizeof(tun->lt_cmn));
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 12bc80d060e9..141983f0ef83 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1130,6 +1130,69 @@ lnet_find_route_locked(struct lnet_net *net, lnet_nid_t target,
 	return lpni_best;
 }

+static struct lnet_ni *
+lnet_get_best_ni(struct lnet_net *local_net, struct lnet_ni *cur_ni,
+		 int md_cpt)
+{
+	struct lnet_ni *ni = NULL, *best_ni = cur_ni;
+	unsigned int shortest_distance;
+	int best_credits;
+
+	if (!best_ni) {
+		shortest_distance = UINT_MAX;
+		best_credits = INT_MIN;
+	} else {
+		shortest_distance = cfs_cpt_distance(lnet_cpt_table(), md_cpt,
+						     best_ni->ni_dev_cpt);
+		best_credits = atomic_read(&best_ni->ni_tx_credits);
+	}
+
+	while ((ni = lnet_get_next_ni_locked(local_net, ni))) {
+		unsigned int distance;
+		int ni_credits;
+
+		if (!lnet_is_ni_healthy_locked(ni))
+			continue;
+
+		ni_credits = atomic_read(&ni->ni_tx_credits);
+
+		/*
+		 * calculate the distance from the CPT on which
+		 * the message memory is allocated to the CPT of
+		 * the NI's physical device
+		 */
+		distance = cfs_cpt_distance(lnet_cpt_table(),
+					    md_cpt,
+					    ni->ni_dev_cpt);
+
+		/*
+		 * All distances smaller than the NUMA range
+		 * are treated equally.
+		 */
+		if (distance < lnet_numa_range)
+			distance = lnet_numa_range;
+
+		/*
+		 * Select on shorter distance, then available
+		 * credits, then round-robin.
+ */ + if (distance > shortest_distance) { + continue; + } else if (distance < shortest_distance) { + shortest_distance = distance; + } else if (ni_credits < best_credits) { + continue; + } else if (ni_credits == best_credits) { + if (best_ni && (best_ni)->ni_seq <= ni->ni_seq) + continue; + } + best_ni = ni; + best_credits = ni_credits; + } + + return best_ni; +} + static int lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, struct lnet_msg *msg, lnet_nid_t rtr_nid) @@ -1138,20 +1201,19 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, struct lnet_peer_ni *best_lpni = NULL; struct lnet_peer_ni *best_gw = NULL; struct lnet_peer_ni *lpni; + struct lnet_peer_ni *final_dst; struct lnet_peer *peer; struct lnet_peer_net *peer_net; struct lnet_net *local_net; - struct lnet_ni *ni; __u32 seq; int cpt, cpt2, rc; bool routing; bool routing2; bool ni_is_pref; bool preferred; - int best_credits; + bool local_found; int best_lpni_credits; int md_cpt; - unsigned int shortest_distance; /* * get an initial CPT to use for locking. The idea here is not to @@ -1167,9 +1229,11 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, best_ni = NULL; best_lpni = NULL; best_gw = NULL; + final_dst = NULL; local_net = NULL; routing = false; routing2 = false; + local_found = false; seq = lnet_get_dlc_seq_locked(); @@ -1334,62 +1398,68 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, goto pick_peer; /* - * Decide whether we need to route to peer_ni. - * Get the local net that I need to be on to be able to directly - * send to that peer. + * pick the best_ni by going through all the possible networks of + * that peer and see which local NI is best suited to talk to that + * peer. * - * a. Find the peer which the dst_nid belongs to. - * b. Iterate through each of the peer_nets/nis to decide - * the best peer/local_ni pair to use + * Locally connected networks will always be preferred over + * a routed network. 
If there are only routed paths to the peer, + * then the best route is chosen. If all routes are equal then + * they are used in round robin. */ - shortest_distance = UINT_MAX; - best_credits = INT_MIN; list_for_each_entry(peer_net, &peer->lp_peer_nets, lpn_on_peer_list) { if (!lnet_is_peer_net_healthy_locked(peer_net)) continue; local_net = lnet_get_net_locked(peer_net->lpn_net_id); - if (!local_net && !routing) { + if (!local_net && !routing && !local_found) { struct lnet_peer_ni *net_gw; - /* - * go through each peer_ni on that peer_net and - * determine the best possible gw to go through - */ - list_for_each_entry(lpni, &peer_net->lpn_peer_nis, - lpni_on_peer_net_list) { - net_gw = lnet_find_route_locked(NULL, - lpni->lpni_nid, - rtr_nid); + lpni = list_entry(peer_net->lpn_peer_nis.next, + struct lnet_peer_ni, + lpni_on_peer_net_list); + + net_gw = lnet_find_route_locked(NULL, + lpni->lpni_nid, + rtr_nid); + if (!net_gw) + continue; + + if (best_gw) { /* - * if no route is found for that network then - * move onto the next peer_ni in the peer + * lnet_find_route_locked() call + * will return the best_Gw on the + * lpni->lpni_nid network. + * However, best_gw and net_gw can + * be on different networks. + * Therefore need to compare them + * to pick the better of either. */ - if (!net_gw) + if (lnet_compare_peers(best_gw, net_gw) > 0) + continue; + if (best_gw->lpni_gw_seq <= net_gw->lpni_gw_seq) continue; - - if (!best_gw) { - best_gw = net_gw; - } else { - rc = lnet_compare_peers(net_gw, - best_gw); - if (rc > 0) - best_gw = net_gw; - } } + best_gw = net_gw; + final_dst = lpni; - if (!best_gw) - continue; - - local_net = lnet_get_net_locked - (LNET_NIDNET(best_gw->lpni_nid)); routing2 = true; } else { - routing2 = false; best_gw = NULL; + final_dst = NULL; + routing2 = false; + local_found = true; } - /* no routable net found go on to a different net */ + /* + * a gw on this network is found, but there could be + * other better gateways on other networks. 
So don't pick + * the best_ni until we determine the best_gw. + */ + if (best_gw) + continue; + + /* if no local_net found continue */ if (!local_net) continue; @@ -1401,70 +1471,30 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid, * 2. NI available credits * 3. Round Robin */ - ni = NULL; - while ((ni = lnet_get_next_ni_locked(local_net, ni))) { - int ni_credits; - unsigned int distance; - - if (!lnet_is_ni_healthy_locked(ni)) - continue; - - ni_credits = atomic_read(&ni->ni_tx_credits); - - /* - * calculate the distance from the CPT on which - * the message memory is allocated to the CPT of - * the NI's physical device - */ - distance = cfs_cpt_distance(lnet_cpt_table(), - md_cpt, - ni->dev_cpt); - - /* - * All distances smaller than the NUMA range - * are treated equally. - */ - if (distance < lnet_numa_range) - distance = lnet_numa_range; + best_ni = lnet_get_best_ni(local_net, best_ni, md_cpt); + } - /* - * Select on shorter distance, then available - * credits, then round-robin. - */ - if (distance > shortest_distance) { - continue; - } else if (distance < shortest_distance) { - shortest_distance = distance; - } else if (ni_credits < best_credits) { - continue; - } else if (ni_credits == best_credits) { - if (best_ni && best_ni->ni_seq <= ni->ni_seq) - continue; - } - best_ni = ni; - best_credits = ni_credits; - } + if (!best_ni && !best_gw) { + lnet_net_unlock(cpt); + LCONSOLE_WARN("No local ni found to send from to %s\n", + libcfs_nid2str(dst_nid)); + return -EINVAL; } - if (routing2) { + if (!best_ni) { + best_ni = lnet_get_best_ni(best_gw->lpni_net, best_ni, md_cpt); + LASSERT(best_gw && best_ni); + /* - * RULE: Each node considers only the next-hop - * * We're going to route the message, so change the peer to * the router. 
		 */
		LASSERT(best_gw->lpni_peer_net);
		LASSERT(best_gw->lpni_peer_net->lpn_peer);
+		best_gw->lpni_gw_seq++;
		peer = best_gw->lpni_peer_net->lpn_peer;
	}

-	if (!best_ni) {
-		lnet_net_unlock(cpt);
-		LCONSOLE_WARN("No local ni found to send from to %s\n",
-			      libcfs_nid2str(dst_nid));
-		return -EINVAL;
-	}
-
	/*
	 * Now that we selected the NI to use increment its sequence
	 * number so the Round Robin algorithm will detect that it has
@@ -1674,7 +1704,8 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
		 * the router receives this message it knows how to route
		 * it.
		 */
-		msg->msg_hdr.dest_nid = cpu_to_le64(dst_nid);
+		msg->msg_hdr.dest_nid =
+			cpu_to_le64(final_dst ? final_dst->lpni_nid : dst_nid);
	} else {
		/*
		 * if we're not routing set the dest_nid to the best peer

From patchwork Tue Sep 25 01:07:15 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613205
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:15 +1000
Message-ID: <153783763594.32103.9029692452686853485.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 28/34] LU-7734 lnet: Fix crash in router_proc.c

From: Amir Shehata

Fixed NULL access in the case when a peer is a remote peer. In that
case lpni_net is NULL.
Signed-off-by: Amir Shehata
Change-Id: Ida234ff016b2bdc305acf74df0f99600d2555e27
Reviewed-on: http://review.whamcloud.com/21327
Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lnet/lnet/router_proc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/lnet/router_proc.c b/drivers/staging/lustre/lnet/lnet/router_proc.c
index 977a937f261c..a887ca446d42 100644
--- a/drivers/staging/lustre/lnet/lnet/router_proc.c
+++ b/drivers/staging/lustre/lnet/lnet/router_proc.c
@@ -492,7 +492,8 @@ static int proc_lnet_peers(struct ctl_table *table, int write,
 			int nrefs = atomic_read(&peer->lpni_refcount);
 			time64_t lastalive = -1;
 			char *aliveness = "NA";
-			int maxcr = peer->lpni_net->net_tunables.lct_peer_tx_credits;
+			int maxcr = peer->lpni_net ?
+				peer->lpni_net->net_tunables.lct_peer_tx_credits : 0;
 			int txcr = peer->lpni_txcredits;
 			int mintxcr = peer->lpni_mintxcredits;
 			int rtrcr = peer->lpni_rtrcredits;

From patchwork Tue Sep 25 01:07:16 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613207
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:16 +1000
Message-ID: <153783763597.32103.14845230051684461634.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 29/34] LU-7734 lnet: double free in lnet_add_net_common()

From: Olaf Weber

lnet_startup_lndnet() always consumes its net parameter, so we should
not free net after the function has been called. This fixes a double
free triggered by adding a network twice.

Eliminate the netl local variable.
Signed-off-by: Olaf Weber
Change-Id: I1cfc3494eada4660b792f6a1ebd96b5dc80d9945
Reviewed-on: http://review.whamcloud.com/21446
Reviewed-by: Amir Shehata
Tested-by: Amir Shehata
Signed-off-by: NeilBrown
---
 drivers/staging/lustre/lnet/lnet/api-ni.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index f57200eab746..ea27d38f78c5 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -2143,7 +2143,6 @@ lnet_get_ni_config(struct lnet_ioctl_config_ni *cfg_ni,
 static int lnet_add_net_common(struct lnet_net *net,
			       struct lnet_ioctl_config_lnd_tunables *tun)
 {
-	struct lnet_net *netl = NULL;
 	u32 net_id;
 	struct lnet_ping_info *pinfo;
 	struct lnet_handle_md md_handle;
@@ -2162,8 +2161,8 @@ static int lnet_add_net_common(struct lnet_net *net,
 	if (rnet) {
 		CERROR("Adding net %s will invalidate routing configuration\n",
 		       libcfs_net2str(net->net_id));
-		rc = -EUSERS;
-		goto failed1;
+		lnet_net_free(net);
+		return -EUSERS;
 	}
 
 	/*
@@ -2180,8 +2179,11 @@ static int lnet_add_net_common(struct lnet_net *net,
 	rc = lnet_ping_info_setup(&pinfo, &md_handle,
				  net_ni_count + lnet_get_ni_count(),
				  false);
-	if (rc < 0)
-		goto failed1;
+	if (rc < 0) {
+		lnet_net_free(net);
+		return rc;
+	}
+
 	if (tun)
 		memcpy(&net->net_tunables,
 		       &tun->lt_cmn, sizeof(net->net_tunables));
@@ -2204,17 +2206,16 @@ static int lnet_add_net_common(struct lnet_net *net,
 		goto failed;
 
 	lnet_net_lock(LNET_LOCK_EX);
-	netl = lnet_get_net_locked(net_id);
+	net = lnet_get_net_locked(net_id);
 	lnet_net_unlock(LNET_LOCK_EX);
-	LASSERT(netl);
+	LASSERT(net);
 
 	/*
	 * Start the acceptor thread if this is the first network
	 * being added that requires the thread.
	 */
-	if (netl->net_lnd->lnd_accept &&
-	    num_acceptor_nets == 0) {
+	if (net->net_lnd->lnd_accept && num_acceptor_nets == 0) {
		rc = lnet_acceptor_start();
		if (rc < 0) {
			/* shutdown the net that we just started */
@@ -2225,7 +2226,7 @@ static int lnet_add_net_common(struct lnet_net *net,
	}
 
	lnet_net_lock(LNET_LOCK_EX);
-	lnet_peer_net_added(netl);
+	lnet_peer_net_added(net);
	lnet_net_unlock(LNET_LOCK_EX);
 
	lnet_ping_target_update(pinfo, md_handle);
@@ -2235,8 +2236,6 @@ static int lnet_add_net_common(struct lnet_net *net,
 failed:
	lnet_ping_md_unlink(pinfo, &md_handle);
	lnet_ping_info_free(pinfo);
-failed1:
-	lnet_net_free(net);
	return rc;
 }

From patchwork Tue Sep 25 01:07:16 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613209
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:16 +1000
Message-ID: <153783763601.32103.9360668949112667310.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 30/34] LU-7734 lnet: set primary NID in ptlrpc_connection_get()

From: Olaf Weber

Set the NID in ptlrpc_connection::c_peer to the primary NID of a peer.
This ensures that regardless of the NID used to start a connection, we
consistently use the same NID (the primary NID) to identify a peer. It
also means that PtlRPC will not create multiple connections to a peer.

The primary NID is obtained by calling LNetPrimaryNID(), an addition
to the exported symbols of the LNet module. The name was chosen to
match the existing naming pattern.
Test-Parameters: trivial
Signed-off-by: Olaf Weber
Change-Id: Idc0605d17a58678b634db246221028cf81ad2407
Reviewed-on: http://review.whamcloud.com/21710
Tested-by: Maloo
Reviewed-by: Amir Shehata
Tested-by: Amir Shehata
Signed-off-by: NeilBrown
---
 drivers/staging/lustre/include/linux/lnet/api.h   |  1 +
 drivers/staging/lustre/lnet/lnet/peer.c           | 25 +++++++++++++++++++++
 drivers/staging/lustre/lustre/ptlrpc/connection.c |  1 +
 3 files changed, 27 insertions(+)

diff --git a/drivers/staging/lustre/include/linux/lnet/api.h b/drivers/staging/lustre/include/linux/lnet/api.h
index 22429213c023..70c991990474 100644
--- a/drivers/staging/lustre/include/linux/lnet/api.h
+++ b/drivers/staging/lustre/include/linux/lnet/api.h
@@ -77,6 +77,7 @@ int LNetNIFini(void);
  */
 int LNetGetId(unsigned int index, struct lnet_process_id *id);
 int LNetDist(lnet_nid_t nid, lnet_nid_t *srcnid, __u32 *order);
+lnet_nid_t LNetPrimaryNID(lnet_nid_t nid);
 
 /** @} lnet_addr */
 
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index d757f4df1f39..747a4fc8d39f 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -610,6 +610,31 @@ lnet_peer_primary_nid(lnet_nid_t nid)
 	return primary_nid;
 }
 
+lnet_nid_t
+LNetPrimaryNID(lnet_nid_t nid)
+{
+	struct lnet_peer_ni *lpni;
+	lnet_nid_t primary_nid = nid;
+	int rc = 0;
+	int cpt;
+
+	cpt = lnet_net_lock_current();
+	lpni = lnet_nid2peerni_locked(nid, cpt);
+	if (IS_ERR(lpni)) {
+		rc = PTR_ERR(lpni);
+		goto out_unlock;
+	}
+	primary_nid = lpni->lpni_peer_net->lpn_peer->lp_primary_nid;
+	lnet_peer_ni_decref_locked(lpni);
+out_unlock:
+	lnet_net_unlock(cpt);
+
+	CDEBUG(D_NET, "NID %s primary NID %s rc %d\n", libcfs_nid2str(nid),
+	       libcfs_nid2str(primary_nid), rc);
+	return primary_nid;
+}
+EXPORT_SYMBOL(LNetPrimaryNID);
+
 struct lnet_peer_net *
 lnet_peer_get_net_locked(struct lnet_peer *peer, u32 net_id)
 {
diff --git a/drivers/staging/lustre/lustre/ptlrpc/connection.c
b/drivers/staging/lustre/lustre/ptlrpc/connection.c
index fb35a89ca6c6..ca9b4caf44b3 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/connection.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/connection.c
@@ -80,6 +80,7 @@ ptlrpc_connection_get(struct lnet_process_id peer, lnet_nid_t self,
 {
	struct ptlrpc_connection *conn, *conn2;
 
+	peer.nid = LNetPrimaryNID(peer.nid);
	conn = rhashtable_lookup_fast(&conn_hash, &peer, conn_hash_params);
	if (conn) {
		ptlrpc_connection_addref(conn);

From patchwork Tue Sep 25 01:07:16 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613211
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:16 +1000
Message-ID: <153783763604.32103.1694406227193499062.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 31/34] LU-7734 lnet: fix NULL access in lnet_peer_aliveness_enabled

From: Amir Shehata

When a peer is not on a local network, lpni->lpni_net is NULL. The
lpni_net is accessed in lnet_peer_aliveness_enabled() without checking
if it's NULL. Fixed.
Test-Parameters: trivial
Signed-off-by: Amir Shehata
Change-Id: If328728e2bda2a19b273140a20c04b22bdda6bc4
Reviewed-on: http://review.whamcloud.com/22183
Tested-by: Maloo
Reviewed-by: Olaf Weber
Signed-off-by: NeilBrown
---
 .../staging/lustre/include/linux/lnet/lib-types.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index 2d73aa1a121c..f811f125dfcb 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -534,6 +534,7 @@ struct lnet_peer_table {
  */
 #define lnet_peer_aliveness_enabled(lp) \
	(the_lnet.ln_routing && \
+	 (lp)->lpni_net && \
	 (lp)->lpni_net->net_tunables.lct_peer_timeout > 0)
 
 struct lnet_route {

From patchwork Tue Sep 25 01:07:16 2018
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613213
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Cc: Lustre Development List
Date: Tue, 25 Sep 2018 11:07:16 +1000
Message-ID: <153783763608.32103.13444702792250303421.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
Subject: [lustre-devel] [PATCH 32/34] LU-7734 lnet: rename peer key_nid to prim_nid

From: Amir Shehata

To make the interface clear, renamed key_nid to prim_nid to indicate
that this parameter refers to the peer's primary nid.
Signed-off-by: Amir Shehata
Change-Id: I74bd17cdd55ba8d2c52bc28557db149d23ecbfb5
Reviewed-on: http://review.whamcloud.com/23460
Tested-by: Maloo
Reviewed-by: Olaf Weber
Reviewed-by: Doug Oucharek
Signed-off-by: NeilBrown
---
 .../lustre/include/uapi/linux/lnet/lnet-dlc.h |  2 -
 drivers/staging/lustre/lnet/lnet/api-ni.c     |  6 +--
 drivers/staging/lustre/lnet/lnet/peer.c       | 42 ++++++++++----------
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h
index a5e94619d3f1..d1e2911ab342 100644
--- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h
+++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-dlc.h
@@ -216,7 +216,7 @@ struct lnet_ioctl_dbg {
 struct lnet_ioctl_peer_cfg {
 	struct libcfs_ioctl_hdr prcfg_hdr;
-	lnet_nid_t prcfg_key_nid;
+	lnet_nid_t prcfg_prim_nid;
 	lnet_nid_t prcfg_cfg_nid;
 	__u32 prcfg_idx;
 	bool prcfg_mr;
diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index ea27d38f78c5..9a09927c346a 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -2728,7 +2728,7 @@ LNetCtl(unsigned int cmd, void *arg)
 			return -EINVAL;
 
 		mutex_lock(&the_lnet.ln_api_mutex);
-		rc = lnet_add_peer_ni_to_peer(cfg->prcfg_key_nid,
+		rc = lnet_add_peer_ni_to_peer(cfg->prcfg_prim_nid,
					      cfg->prcfg_cfg_nid,
					      cfg->prcfg_mr);
 		mutex_unlock(&the_lnet.ln_api_mutex);
@@ -2742,7 +2742,7 @@ LNetCtl(unsigned int cmd, void *arg)
 			return -EINVAL;
 
 		mutex_lock(&the_lnet.ln_api_mutex);
-		rc = lnet_del_peer_ni_from_peer(cfg->prcfg_key_nid,
+		rc = lnet_del_peer_ni_from_peer(cfg->prcfg_prim_nid,
						cfg->prcfg_cfg_nid);
 		mutex_unlock(&the_lnet.ln_api_mutex);
 		return rc;
@@ -2785,7 +2785,7 @@ LNetCtl(unsigned int cmd, void *arg)
 			(cfg->prcfg_bulk + sizeof(*lpni_cri));
 
 		mutex_lock(&the_lnet.ln_api_mutex);
-		rc = lnet_get_peer_info(cfg->prcfg_idx, &cfg->prcfg_key_nid,
+		rc = lnet_get_peer_info(cfg->prcfg_idx, &cfg->prcfg_prim_nid,
					&cfg->prcfg_cfg_nid, &cfg->prcfg_mr,
					lpni_cri, lpni_stats);
 		mutex_unlock(&the_lnet.ln_api_mutex);
diff --git a/drivers/staging/lustre/lnet/lnet/peer.c b/drivers/staging/lustre/lnet/lnet/peer.c
index 747a4fc8d39f..3a5f9dbf5c96 100644
--- a/drivers/staging/lustre/lnet/lnet/peer.c
+++ b/drivers/staging/lustre/lnet/lnet/peer.c
@@ -781,18 +781,18 @@ lnet_add_prim_lpni(lnet_nid_t nid)
 }
 
 static int
-lnet_add_peer_ni_to_prim_lpni(lnet_nid_t key_nid, lnet_nid_t nid)
+lnet_add_peer_ni_to_prim_lpni(lnet_nid_t prim_nid, lnet_nid_t nid)
 {
 	struct lnet_peer *peer, *primary_peer;
 	struct lnet_peer_ni *lpni = NULL, *klpni = NULL;
 
-	LASSERT(key_nid != LNET_NID_ANY && nid != LNET_NID_ANY);
+	LASSERT(prim_nid != LNET_NID_ANY && nid != LNET_NID_ANY);
 
 	/*
	 * key nid must be created by this point. If not then this
	 * operation is not permitted
	 */
-	klpni = lnet_find_peer_ni_locked(key_nid);
+	klpni = lnet_find_peer_ni_locked(prim_nid);
 	if (!klpni)
 		return -ENOENT;
 
@@ -809,13 +809,13 @@ lnet_add_peer_ni_to_prim_lpni(lnet_nid_t key_nid, lnet_nid_t nid)
	 * lpni already exists in the system but it belongs to
	 * a different peer. We can't re-added it
	 */
-	if (peer->lp_primary_nid != key_nid && peer->lp_multi_rail) {
+	if (peer->lp_primary_nid != prim_nid && peer->lp_multi_rail) {
 		CERROR("Cannot add NID %s owned by peer %s to peer %s\n",
 		       libcfs_nid2str(lpni->lpni_nid),
 		       libcfs_nid2str(peer->lp_primary_nid),
-		       libcfs_nid2str(key_nid));
+		       libcfs_nid2str(prim_nid));
 		return -EEXIST;
-	} else if (peer->lp_primary_nid == key_nid) {
+	} else if (peer->lp_primary_nid == prim_nid) {
 		/*
		 * found a peer_ni that is already part of the
		 * peer. This is a no-op operation.
		 */
@@ -824,7 +824,7 @@ lnet_add_peer_ni_to_prim_lpni(lnet_nid_t key_nid, lnet_nid_t nid)
 	}
 
 	/*
-	 * TODO: else if (peer->lp_primary_nid != key_nid &&
+	 * TODO: else if (peer->lp_primary_nid != prim_nid &&
	 *	 !peer->lp_multi_rail)
	 * peer is not an MR peer and it will be moved in the next
	 * step to klpni, so update its flags accordingly.
@@ -895,55 +895,55 @@ lnet_peer_ni_add_non_mr(lnet_nid_t nid)
 
 /*
  * This API handles the following combinations:
- *	Create a primary NI if only the key_nid is provided
+ *	Create a primary NI if only the prim_nid is provided
 *	Create or add an lpni to a primary NI. Primary NI must've already
 *	been created
 *	Create a non-MR peer.
 */
 int
-lnet_add_peer_ni_to_peer(lnet_nid_t key_nid, lnet_nid_t nid, bool mr)
+lnet_add_peer_ni_to_peer(lnet_nid_t prim_nid, lnet_nid_t nid, bool mr)
 {
 	/*
	 * Caller trying to setup an MR like peer hierarchy but
	 * specifying it to be non-MR. This is not allowed.
	 */
-	if (key_nid != LNET_NID_ANY &&
+	if (prim_nid != LNET_NID_ANY &&
	    nid != LNET_NID_ANY && !mr)
 		return -EPERM;
 
 	/* Add the primary NID of a peer */
-	if (key_nid != LNET_NID_ANY &&
+	if (prim_nid != LNET_NID_ANY &&
	    nid == LNET_NID_ANY && mr)
-		return lnet_add_prim_lpni(key_nid);
+		return lnet_add_prim_lpni(prim_nid);
 
 	/* Add a NID to an existing peer */
-	if (key_nid != LNET_NID_ANY &&
+	if (prim_nid != LNET_NID_ANY &&
	    nid != LNET_NID_ANY && mr)
-		return lnet_add_peer_ni_to_prim_lpni(key_nid, nid);
+		return lnet_add_peer_ni_to_prim_lpni(prim_nid, nid);
 
 	/* Add a non-MR peer NI */
-	if (((key_nid != LNET_NID_ANY &&
+	if (((prim_nid != LNET_NID_ANY &&
	      nid == LNET_NID_ANY) ||
-	     (key_nid == LNET_NID_ANY &&
+	     (prim_nid == LNET_NID_ANY &&
	      nid != LNET_NID_ANY)) && !mr)
-		return lnet_peer_ni_add_non_mr(key_nid != LNET_NID_ANY ?
-					       key_nid : nid);
+		return lnet_peer_ni_add_non_mr(prim_nid != LNET_NID_ANY ?
+					       prim_nid : nid);
 
 	return 0;
 }
 
 int
-lnet_del_peer_ni_from_peer(lnet_nid_t key_nid, lnet_nid_t nid)
+lnet_del_peer_ni_from_peer(lnet_nid_t prim_nid, lnet_nid_t nid)
 {
 	lnet_nid_t local_nid;
 	struct lnet_peer *peer;
 	struct lnet_peer_ni *lpni;
 	int rc;
 
-	if (key_nid == LNET_NID_ANY)
+	if (prim_nid == LNET_NID_ANY)
 		return -EINVAL;
 
-	local_nid = (nid != LNET_NID_ANY) ? nid : key_nid;
+	local_nid = (nid != LNET_NID_ANY) ? nid : prim_nid;
 
 	lpni = lnet_find_peer_ni_locked(local_nid);
 	if (!lpni)

From patchwork Tue Sep 25 01:07:16 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613215
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:16 +1000
Message-ID: <153783763611.32103.14845537928418369718.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
User-Agent: StGit/0.17.1-dirty
Subject: [lustre-devel] [PATCH 33/34] lnet: use BIT() macro for LNET_MD_* flags
Cc: Lustre Development List

As these are bit flags, it aids clarity to use the BIT() macro.

Signed-off-by: NeilBrown
---
 .../lustre/include/uapi/linux/lnet/lnet-types.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h
index e80ef4182e5d..62f062c0d1bf 100644
--- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h
+++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h
@@ -483,22 +483,22 @@ struct lnet_md {
 /**
  * Options for the MD structure. See lnet_md::options.
  */
-#define LNET_MD_OP_PUT (1 << 0)
+#define LNET_MD_OP_PUT BIT(0)
 /** See lnet_md::options. */
-#define LNET_MD_OP_GET (1 << 1)
+#define LNET_MD_OP_GET BIT(1)
 /** See lnet_md::options. */
-#define LNET_MD_MANAGE_REMOTE (1 << 2)
-/* unused (1 << 3) */
+#define LNET_MD_MANAGE_REMOTE BIT(2)
+/* unused BIT(3) */
 /** See lnet_md::options. */
-#define LNET_MD_TRUNCATE (1 << 4)
+#define LNET_MD_TRUNCATE BIT(4)
 /** See lnet_md::options. */
-#define LNET_MD_ACK_DISABLE (1 << 5)
+#define LNET_MD_ACK_DISABLE BIT(5)
 /** See lnet_md::options. */
-#define LNET_MD_IOVEC (1 << 6)
+#define LNET_MD_IOVEC BIT(6)
 /** See lnet_md::options. */
-#define LNET_MD_MAX_SIZE (1 << 7)
+#define LNET_MD_MAX_SIZE BIT(7)
 /** See lnet_md::options. */
-#define LNET_MD_KIOV (1 << 8)
+#define LNET_MD_KIOV BIT(8)
 
 /* For compatibility with Cray Portals */
 #define LNET_MD_PHYS 0

From patchwork Tue Sep 25 01:07:16 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 10613217
From: NeilBrown
To: Oleg Drokin, Doug Oucharek, James Simmons, Andreas Dilger
Date: Tue, 25 Sep 2018 11:07:16 +1000
Message-ID: <153783763614.32103.8503784699321046646.stgit@noble>
In-Reply-To: <153783752960.32103.8394391715843917125.stgit@noble>
References: <153783752960.32103.8394391715843917125.stgit@noble>
User-Agent: StGit/0.17.1-dirty
Subject: [lustre-devel] [PATCH 34/34] LU-7734 lnet: cpt locking
Cc: Lustre Development List

From: Amir Shehata

When source nid is specified it is necessary to also use the destination
nid. Otherwise bulk transfer will end up on a different interface than
the nearest interface to the memory. This has significant performance
impact on NUMA systems such as the SGI UV.

The CPT which the MD describing the bulk buffers belongs to is not the
same CPT of the actual pages of memory. Therefore, it is necessary to
communicate the CPT of the pages to LNet, in order for LNet to select
the nearest interface.

The MD which describes the pages of memory gets attached to an ME, to
be matched later on. The MD which describes the message to be sent is
different and this patch adds the handle of the bulk MD into the MD
which ends up being accessible by lnet_select_pathway(). In that
function a new API, lnet_cpt_of_md_page(), is called which returns the
CPT of the buffers used for the bulk transfer. lnet_select_pathway()
proceeds to use this CPT to select the nearest interface.

Signed-off-by: Amir Shehata
Change-Id: I4117ef912835f16dcdcaafb70703f92d74053b9b
Reviewed-on: https://review.whamcloud.com/24085
Signed-off-by: NeilBrown
---
 .../staging/lustre/include/linux/lnet/lib-lnet.h   |  1 +
 .../staging/lustre/include/linux/lnet/lib-types.h  |  1 +
 .../lustre/include/uapi/linux/lnet/lnet-types.h    | 12 ++++++++
 drivers/staging/lustre/lnet/lnet/lib-md.c          | 31 ++++++++++++++++++++
 drivers/staging/lustre/lnet/lnet/lib-move.c        | 20 ++++++++-----
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c      | 26 +++++++++++++----
 6 files changed, 78 insertions(+), 13 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 3a53d54b711d..aedc88c69977 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -595,6 +595,7 @@ void lnet_me_unlink(struct lnet_me *me);
 void lnet_md_unlink(struct lnet_libmd *md);
 void lnet_md_deconstruct(struct lnet_libmd *lmd, struct lnet_md *umd);
+int lnet_cpt_of_md(struct lnet_libmd *md);
 
 void lnet_register_lnd(struct lnet_lnd *lnd);
 void lnet_unregister_lnd(struct lnet_lnd *lnd);
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index f811f125dfcb..18e2665ad74d 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -161,6 +161,7 @@ struct lnet_libmd {
 	void *md_user_ptr;
 	struct lnet_eq *md_eq;
 	unsigned int md_niov; /* # frags */
+	struct lnet_handle_md md_bulk_handle;
 	union {
 		struct kvec iov[LNET_MAX_IOV];
 		struct bio_vec kiov[LNET_MAX_IOV];
diff --git a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h
index 62f062c0d1bf..837e5fe25ac1 100644
--- a/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h
+++ b/drivers/staging/lustre/include/uapi/linux/lnet/lnet-types.h
@@ -444,6 +444,7 @@ struct lnet_md {
  * - LNET_MD_IOVEC: The start and length fields specify an array of
  *   struct iovec.
  * - LNET_MD_MAX_SIZE: The max_size field is valid.
+ * - LNET_MD_BULK_HANDLE: The bulk_handle field is valid.
  *
  * Note:
  * - LNET_MD_KIOV or LNET_MD_IOVEC allows for a scatter/gather
@@ -467,6 +468,15 @@ struct lnet_md {
 	 * descriptor are not logged.
 	 */
 	struct lnet_handle_eq eq_handle;
+	/**
+	 * The bulk MD handle which was registered to describe the buffers
+	 * either to be used to transfer data to the peer or receive data
+	 * from the peer. This allows LNet to properly determine the NUMA
+	 * node on which the memory was allocated and use that to select the
+	 * nearest local network interface. This value is only used
+	 * if the LNET_MD_BULK_HANDLE option is set.
+	 */
+	struct lnet_handle_md bulk_handle;
 };
 
 /*
@@ -499,6 +509,8 @@ struct lnet_md {
 #define LNET_MD_MAX_SIZE BIT(7)
 /** See lnet_md::options. */
 #define LNET_MD_KIOV BIT(8)
+/** See lnet_md::options. */
+#define LNET_MD_BULK_HANDLE BIT(9)
 
 /* For compatibility with Cray Portals */
 #define LNET_MD_PHYS 0
diff --git a/drivers/staging/lustre/lnet/lnet/lib-md.c b/drivers/staging/lustre/lnet/lnet/lib-md.c
index 8a22514aaf71..9e26911cd319 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-md.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-md.c
@@ -84,6 +84,36 @@ lnet_md_unlink(struct lnet_libmd *md)
 	kfree(md);
 }
 
+int
+lnet_cpt_of_md(struct lnet_libmd *md)
+{
+	int cpt = CFS_CPT_ANY;
+
+	if (!md)
+		return CFS_CPT_ANY;
+
+	if ((md->md_options & LNET_MD_BULK_HANDLE) != 0 &&
+	    md->md_bulk_handle.cookie != LNET_WIRE_HANDLE_COOKIE_NONE) {
+		md = lnet_handle2md(&md->md_bulk_handle);
+
+		if (!md)
+			return CFS_CPT_ANY;
+	}
+
+	if ((md->md_options & LNET_MD_KIOV) != 0) {
+		if (md->md_iov.kiov[0].bv_page)
+			cpt = cfs_cpt_of_node(
+				lnet_cpt_table(),
+				page_to_nid(md->md_iov.kiov[0].bv_page));
+	} else if (md->md_iov.iov[0].iov_base) {
+		cpt = cfs_cpt_of_node(
+			lnet_cpt_table(),
+			page_to_nid(virt_to_page(md->md_iov.iov[0].iov_base)));
+	}
+
+	return cpt;
+}
+
 static int
 lnet_md_build(struct lnet_libmd *lmd, struct lnet_md *umd, int unlink)
 {
@@ -101,6 +131,7 @@ lnet_md_build(struct lnet_libmd *lmd, struct lnet_md *umd, int unlink)
 	lmd->md_threshold = umd->threshold;
 	lmd->md_refcount = 0;
 	lmd->md_flags = (unlink == LNET_UNLINK) ? LNET_MD_FLAG_AUTO_UNLINK : 0;
+	lmd->md_bulk_handle = umd->bulk_handle;
 
 	if (umd->options & LNET_MD_IOVEC) {
 		if (umd->options & LNET_MD_KIOV) /* Can't specify both */
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index 141983f0ef83..d39331fcf932 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1225,6 +1225,11 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 	 * then we proceed, if there is, then we restart the operation.
 	 */
 	cpt = lnet_net_lock_current();
+
+	md_cpt = lnet_cpt_of_md(msg->msg_md);
+	if (md_cpt == CFS_CPT_ANY)
+		md_cpt = cpt;
+
 again:
 	best_ni = NULL;
 	best_lpni = NULL;
@@ -1242,12 +1247,6 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 		return -ESHUTDOWN;
 	}
 
-	if (msg->msg_md)
-		/* get the cpt of the MD, used during NUMA based selection */
-		md_cpt = lnet_cpt_of_cookie(msg->msg_md->md_lh.lh_cookie);
-	else
-		md_cpt = CFS_CPT_ANY;
-
 	peer = lnet_find_or_create_peer_locked(dst_nid, cpt);
 	if (IS_ERR(peer)) {
 		lnet_net_unlock(cpt);
@@ -1285,7 +1284,8 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 
 	if (msg->msg_type == LNET_MSG_REPLY ||
 	    msg->msg_type == LNET_MSG_ACK ||
-	    !peer->lp_multi_rail) {
+	    !peer->lp_multi_rail ||
+	    best_ni) {
 		/*
 		 * for replies we want to respond on the same peer_ni we
 		 * received the message on if possible. If not, then pick
@@ -1294,6 +1294,12 @@ lnet_select_pathway(lnet_nid_t src_nid, lnet_nid_t dst_nid,
 		 * if the peer is non-multi-rail then you want to send to
 		 * the dst_nid provided as well.
 		 *
+		 * If the best_ni has already been determined, IE the
+		 * src_nid has been specified, then use the
+		 * destination_nid provided as well, since we're
+		 * continuing a series of related messages for the same
+		 * RPC.
+		 *
 		 * It is expected to find the lpni using dst_nid, since we
 		 * created it earlier.
 		 */
diff --git a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
index d0bcd8827f8a..415450d3c8c1 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/niobuf.c
@@ -48,7 +48,8 @@ static int ptl_send_buf(struct lnet_handle_md *mdh, void *base, int len,
 			enum lnet_ack_req ack, struct ptlrpc_cb_id *cbid,
 			lnet_nid_t self, struct lnet_process_id peer_id,
-			int portal, __u64 xid, unsigned int offset)
+			int portal, __u64 xid, unsigned int offset,
+			struct lnet_handle_md *bulk_cookie)
 {
 	int rc;
 	struct lnet_md md;
@@ -61,13 +62,17 @@ static int ptl_send_buf(struct lnet_handle_md *mdh, void *base, int len,
 	md.options = PTLRPC_MD_OPTIONS;
 	md.user_ptr = cbid;
 	md.eq_handle = ptlrpc_eq_h;
+	md.bulk_handle.cookie = LNET_WIRE_HANDLE_COOKIE_NONE;
+
+	if (bulk_cookie) {
+		md.bulk_handle = *bulk_cookie;
+		md.options |= LNET_MD_BULK_HANDLE;
+	}
 
 	if (unlikely(ack == LNET_ACK_REQ &&
-		     OBD_FAIL_CHECK_ORSET(OBD_FAIL_PTLRPC_ACK,
-					  OBD_FAIL_ONCE))) {
+		     OBD_FAIL_CHECK_ORSET(OBD_FAIL_PTLRPC_ACK, OBD_FAIL_ONCE)))
 		/* don't ask for the ack to simulate failing client */
 		ack = LNET_NOACK_REQ;
-	}
 
 	rc = LNetMDBind(md, LNET_UNLINK, mdh);
 	if (unlikely(rc != 0)) {
@@ -417,7 +422,7 @@ int ptlrpc_send_reply(struct ptlrpc_request *req, int flags)
 			  LNET_ACK_REQ : LNET_NOACK_REQ,
 			  &rs->rs_cb_id, req->rq_self, req->rq_source,
 			  ptlrpc_req2svc(req)->srv_rep_portal,
-			  req->rq_xid, req->rq_reply_off);
+			  req->rq_xid, req->rq_reply_off, NULL);
 out:
 	if (unlikely(rc != 0))
 		ptlrpc_req_drop_rs(req);
@@ -474,12 +479,15 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 	int rc;
 	int rc2;
 	unsigned int mpflag = 0;
+	struct lnet_handle_md bulk_cookie;
 	struct ptlrpc_connection *connection;
 	struct lnet_handle_me reply_me_h;
 	struct lnet_md reply_md;
 	struct obd_import *imp = request->rq_import;
 	struct obd_device *obd = imp->imp_obd;
 
+	bulk_cookie.cookie = LNET_WIRE_HANDLE_COOKIE_NONE;
+
 	if (OBD_FAIL_CHECK(OBD_FAIL_PTLRPC_DROP_RPC))
 		return 0;
@@ -577,6 +585,12 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 		rc = ptlrpc_register_bulk(request);
 		if (rc != 0)
 			goto out;
+		/*
+		 * All the mds in the request will have the same cpt
+		 * encoded in the cookie. So we can just get the first
+		 * one.
+		 */
+		bulk_cookie = request->rq_bulk->bd_mds[0];
 	}
 
 	if (!noreply) {
@@ -685,7 +699,7 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
 			  LNET_NOACK_REQ, &request->rq_req_cbid,
 			  LNET_NID_ANY, connection->c_peer,
 			  request->rq_request_portal,
-			  request->rq_xid, 0);
+			  request->rq_xid, 0, &bulk_cookie);
 	if (likely(rc == 0))
 		goto out;
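[Editor's note] Stripped of kernel types, the CPT resolution that patch 34 adds in lib-md.c can be modelled in a few lines of plain C. This is a hedged sketch, not the real LNet API: the handle table, the `toy_md` layout, the identity node-to-CPT mapping, and the `first_page_node` field are all invented stand-ins; only the control flow (follow the bulk handle when `LNET_MD_BULK_HANDLE` is set, otherwise fall back to `CFS_CPT_ANY`) mirrors `lnet_cpt_of_md()`:

```c
#include <assert.h>
#include <stddef.h>

#define CPT_ANY        (-1)     /* stand-in for CFS_CPT_ANY */
#define MD_BULK_HANDLE (1 << 9) /* same bit as LNET_MD_BULK_HANDLE */
#define COOKIE_NONE    (-1L)    /* stand-in for LNET_WIRE_HANDLE_COOKIE_NONE */

struct toy_md {
	unsigned options;
	long bulk_cookie;     /* index into md_table, or COOKIE_NONE */
	int first_page_node;  /* NUMA node of the first buffer page; -1 if none */
};

static struct toy_md *md_table[8]; /* toy handle -> MD lookup */

/* Identity map in this sketch; the real code consults the cpt table. */
static int node_to_cpt(int node)
{
	return node;
}

/*
 * Mirror of the patch's lnet_cpt_of_md() flow: if the message MD carries
 * a bulk handle, resolve it and use the bulk buffers' first page to pick
 * the CPT; otherwise fall back to CPT_ANY so the caller uses the current
 * CPT (as lnet_select_pathway() does after this patch).
 */
static int cpt_of_md(struct toy_md *md)
{
	if (!md)
		return CPT_ANY;

	if ((md->options & MD_BULK_HANDLE) && md->bulk_cookie != COOKIE_NONE) {
		md = md_table[md->bulk_cookie];
		if (!md)
			return CPT_ANY;
	}

	if (md->first_page_node < 0)
		return CPT_ANY;

	return node_to_cpt(md->first_page_node);
}
```

The point of the indirection is visible in the model: the message MD (`first_page_node` on node 0, say) is not where the bulk pages live; only by chasing the registered bulk MD's handle does the selector see the node that actually holds the buffers.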