From patchwork Thu Jul 23 19:26:32 2015
X-Patchwork-Submitter: "Wan, Kaike"
X-Patchwork-Id: 6855841
From: "Wan, Kaike"
To: Jason Gunthorpe
Cc: "linux-rdma@vger.kernel.org", "Fleck, John", "Weiny, Ira"
Subject: RE: [PATCH v8 4/4] IB/sa: Route SA pathrecord query through netlink
Date: Thu, 23 Jul 2015 19:26:32 +0000
Message-ID: <3F128C9216C9B84BB6ED23EF16290AFB0CB55290@CRSMSX101.amr.corp.intel.com>
References: <1436463268-32365-1-git-send-email-kaike.wan@intel.com>
 <1436463268-32365-5-git-send-email-kaike.wan@intel.com>
 <20150722210918.GB20815@obsidianresearch.com>
In-Reply-To: <20150722210918.GB20815@obsidianresearch.com>
List-ID: linux-rdma@vger.kernel.org

> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Jason Gunthorpe
> Sent: Wednesday, July 22, 2015 5:09 PM
> To: Wan, Kaike
> Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira
> Subject: Re: [PATCH v8 4/4] IB/sa: Route SA pathrecord query through netlink
>
> On Thu, Jul 09, 2015 at 01:34:28PM -0400, kaike.wan@intel.com wrote:
> > From: Kaike Wan
> >
> > This patch routes an SA pathrecord query to netlink first and processes
> > the response appropriately. If a failure is returned, the request will
> > be sent through IB. The decision whether to route the request to
> > netlink first is determined by the presence of a listener for the
> > local service netlink multicast group. If the user-space local service
> > netlink multicast group listener is not present, the request will be
> > sent through IB, just as is currently done.
> So, I tested it again, and the UAPI looks pretty reasonable now.
>
> The netlink validation still needs to be fixed.
>
> > Here is where we might have a problem: nla_parse() allows only one
> > entry for each attribute type in the returned array tb[]. If we have
> > multiple (say 6) pathrecords returned, each having different flags
> > (for different path use), a later one will replace an earlier one in
> > tb[].
>
> The parse is OK; continue to use the for loop. nla_parse() does more than
> just extract into tb[]: it validates that all the sizes are correct.
> Ignore tb.
>
> My testing shows it seems to get into some kind of fast repeating query
> loop:
>
> [ 4904.969188] Return answer 0
> [ 4904.969483] Return answer 0
> [ 4904.969703] Return answer 0
> [ 4904.969824] Return answer 0
> [ 4904.969943] Return answer 0
> [ 4904.970058] Return answer 0
> [ 4904.970175] Return answer 0
> [ 4904.970289] Return answer 0
>
> diff --git a/drivers/infiniband/core/sa_query.c
> b/drivers/infiniband/core/sa_query.c
> index 63ea36d05072..e6b0181e7076 100644
> --- a/drivers/infiniband/core/sa_query.c
> +++ b/drivers/infiniband/core/sa_query.c
> @@ -640,6 +640,8 @@ static void ib_nl_process_good_resolve_rsp(struct
> ib_sa_query *query,
>  			}
>  		}
>  	}
> +
> +	printk("Return answer %u\n", status);
>  	query->callback(query, status, mad);
>  }
>
> You'll have to figure that out. I'm just doing ping X while running a
> responder on the netlink socket, v7 didn't do this, IIRC.

I did not see such a repeated query loop. With a modified version of your
debugging code, I ran rping a few times and never saw such behavior:

[root@phifs011 ~]# rping -c -C 1 -a 10.228.216.26
Request 2 returns answer 0
client DISCONNECT EVENT...
[root@phifs011 ~]# rping -c -C 1 -a 10.228.216.26
Request 3 returns answer 0
client DISCONNECT EVENT...
[root@phifs011 ~]# rping -c -C 1 -a 10.228.216.26
Request 4 returns answer 0
client DISCONNECT EVENT...
[root@phifs011 ~]# rping -c -C 1 -a 10.228.216.26
Request 5 returns answer 0
client DISCONNECT EVENT...
[root@phifs011 ~]#

The nl sequence number increased sequentially. What's the difference? In my
debugging patch I also have other print statements, and I have never seen
such a loop at all.

> I think it has something to do with the timer, as the first request works
> fine, then a pause, then a storm.
>
> Actually, looks like nothing removes nl queries from the timeout loop when
> they successfully complete. (ie this series doesn't even have a hope of
> working properly)

That is incorrect. The first thing ib_nl_handle_resolve_resp() does is find
the request and remove it from the ib_nl_request_list. The timeout routine
removes only the entries that have timed out from the nl request list.

> Please actually test this patch completely and thoroughly before sending
> another version.

Generally, I tested the patches (along with a debugging patch) with rping,
ping, SRP, and another performance path (no debugging patch). I also tested
another ibacm patch for setting the timeout. When making major changes, I
also ran tests for the timeout and error response cases.

> I'm broadly happy with this, so get your whole team to look it over again.
>
> > +	if (query->callback) {
> > +		head = (const struct nlattr *) nlmsg_data(nlh);
> > +		len = nlmsg_len(nlh);
> > +		switch (query->path_use) {
> > +		case LS_RESOLVE_PATH_USE_UNIDIRECTIONAL:
> > +			mask = IB_PATH_PRIMARY | IB_PATH_OUTBOUND;
> > +			break;
> > +
> > +		case LS_RESOLVE_PATH_USE_ALL:
> > +		case LS_RESOLVE_PATH_USE_GMP:
> > +		default:
> > +			mask = IB_PATH_PRIMARY | IB_PATH_GMP |
> > +				IB_PATH_BIDIRECTIONAL;
> > +			break;
> > +		}
> > +		nla_for_each_attr(curr, head, len, rem) {
> > +			if (curr->nla_type == LS_NLA_TYPE_PATH_RECORD) {
> > +				rec = nla_data(curr);
> > +				/*
> > +				 * Get the first one. In the future, we may
> > +				 * need to get up to 6 pathrecords.
> > +				 */
> > +				if (rec->flags & mask) {
>
> (rec->flags & mask) == mask
>
> All requested bits must be set, not just any requested bit.
>
> An IB_PATH_PRIMARY | IB_PATH_OUTBOUND result does not satisfy the
> requirement for LS_RESOLVE_PATH_USE_GMP.
>
> The goal is to locate the one path of many that satisfies the requirement.
> Future kernels will direct 6-path answers.

Got it.

> > +static int ib_nl_handle_set_timeout(struct sk_buff *skb,
> > +				    struct netlink_callback *cb)
> > +{
> > +	const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
> > +	int timeout, delta, abs_delta;
> > +	const struct nlattr *attr;
> > +	unsigned long flags;
> > +	struct ib_sa_query *query;
> > +	long delay = 0;
> > +	struct nlattr *tb[LS_NLA_TYPE_MAX + 1];
> > +	int ret;
> > +
> > +	ret = nla_parse(tb, LS_NLA_TYPE_MAX, nlmsg_data(nlh),
> > +			nlmsg_len(nlh), NULL);
> > +	attr = (const struct nlattr *)tb[LS_NLA_TYPE_TIMEOUT];
> > +	if (ret || !attr || nla_len(attr) != sizeof(u32))
> > +		goto settimeout_out;
>
> This probably doesn't need the nested attribute, but I'm indifferent.

It's not a nested attribute, only a u32 attribute payload. I will use a
policy to validate the entire response (for the pathrecord as well).

> Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the
body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -653,6 +653,7 @@ static void ib_nl_process_good_resolve_rsp(struct ib_sa_query *query,
 			}
 		}
 	}
+	printk("Request %u returns answer %u\n", nlh->nlmsg_seq, status);
 	query->callback(query, status, mad);
 }