diff mbox

[3/3] ibacm/ibacmp: fix a crash when SM restarts

Message ID 1416408407-6774-4-git-send-email-kaike.wan@intel.com (mailing list archive)
State Rejected
Headers show

Commit Message

Wan, Kaike Nov. 19, 2014, 2:46 p.m. UTC
From: Kaike Wan <kaike.wan@intel.com>

Ibacm may cause segfault when the SM restarts: when the SM restarts, ibacm will
receive P_Key change event and instruct ibacmp to close all endpoints. However,
ibacmp only resets the core endpoint pointer in its ep structure and keeps the ep
in the port's ep_list. Afterwards, the ibacm core will ask ibacmp to create
an ep for each pkey enumerated from the local port. The ep will be found
from the port's ep_list if it exists. However, if an old pkey is not present
in the new SM configuration, the old ep will still be linked in the port's
ep_list with the ep->endpoint being set to NULL. When the ibacm core forwards
the client reregistration event to ibacmp, ibacmp will enumerate the ep_list and
try to join multicast group for each ep, including any one with ep->endpoint
set to NULL. In this case, it will cause segfault in acm_send_sa_mad().
Additional check should be able to avoid the crash.

Signed-off-by: Kaike Wan <kaike.wan@intel.com>
---
 prov/acmp/src/acmp.c |    4 ++++
 src/acm.c            |    4 ++++
 2 files changed, 8 insertions(+), 0 deletions(-)

Comments

Hefty, Sean Dec. 3, 2014, 7:42 p.m. UTC | #1
> Ibacm may cause segfault when the SM restarts: when the SM restarts, ibacm
> will
> receive P_Key change event and instruct ibacmp to close all endpoints.
> However,
> ibacmp only resets the core endpoint pointer in its ep structure and keeps
> the ep
> in the port's ep_list. Afterwards, the ibacm core will ask ibacmp to
> create
> an ep for each pkey enumerated from the local port. The ep will be found
> from the port's ep_list if it exists. However, if an old pkey is not
> present
> in the new SM configuration, the old ep will still be linked in the port's
> ep_list with the ep->endpoint being set to NULL. When the ibacm core
> forwards
> the client reregistration event to ibacmp, ibacmp will enumerate the
> ep_list and
> try to join multicast group for each ep, including any one with ep-
> >endpoint
> set to NULL. In this case, it will cause segfault in acm_send_sa_mad().
> Additional check should be able to avoid the crash.

I'm having trouble following this.  Is the problem that the provider is not cleaning up properly/completely when being told to close its endpoint?


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wan, Kaike Dec. 4, 2014, 12:22 p.m. UTC | #2
> > Ibacm may cause segfault when the SM restarts: when the SM restarts,
> > ibacm will receive P_Key change event and instruct ibacmp to close all
> > endpoints.
> > However,
> > ibacmp only resets the core endpoint pointer in its ep structure and
> > keeps the ep in the port's ep_list. Afterwards, the ibacm core will
> > ask ibacmp to create an ep for each pkey enumerated from the local
> > port. The ep will be found from the port's ep_list if it exists.
> > However, if an old pkey is not present in the new SM configuration,
> > the old ep will still be linked in the port's ep_list with the
> > ep->endpoint being set to NULL. When the ibacm core forwards the
> > client reregistration event to ibacmp, ibacmp will enumerate the
> > ep_list and try to join multicast group for each ep, including any one
> > with ep-
> > >endpoint
> > set to NULL. In this case, it will cause segfault in acm_send_sa_mad().
> > Additional check should be able to avoid the crash.
> 
> I'm having trouble following this.  Is the problem that the provider is not
> cleaning up properly/completely when being told to close its endpoint?
> 
[Wan, Kaike] Yes, it's a problem for the current implementation of the default provider ibacmp. If the provider can clean up properly, it will not have such a problem. Nevertheless, the ibacm core should validate incoming parameters for a SA request from any provider.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/prov/acmp/src/acmp.c b/prov/acmp/src/acmp.c
index 7568b9c..2b85958 100644
--- a/prov/acmp/src/acmp.c
+++ b/prov/acmp/src/acmp.c
@@ -1446,6 +1446,10 @@  static int acmp_port_join(void *port_context)
 	for (ep_entry = port->ep_list.Next; ep_entry != &port->ep_list;
 		 ep_entry = ep_entry->Next) {
 		ep = container_of(ep_entry, struct acmp_ep, entry);
+		if (!ep->endpoint) {
+			/* Stale endpoint */
+			continue;
+		}
 		acmp_ep_join(ep);
 	}
 	acm_log(1, "joins for device %s port %d complete\n",
diff --git a/src/acm.c b/src/acm.c
index d807c73..2d0d2e1 100644
--- a/src/acm.c
+++ b/src/acm.c
@@ -2352,6 +2352,10 @@  acm_alloc_sa_mad(const struct acm_endpoint *endpoint, void *context,
 {
 	struct acmc_sa_req *req;
 
+	if (!endpoint) {
+		acm_log(0, "Error: NULL endpoint\n");
+		return NULL;
+	}
 	req = calloc(1, sizeof (*req));
 	if (!req) {
 		acm_log(0, "Error: failed to allocate sa request\n");