
[v2,for-rc] RDMA/srpt: Add a check for valid 'mad_agent' pointer

Message ID 20230406042549.507328-1-saravanan.vajravel@broadcom.com (mailing list archive)
State Accepted

Commit Message

Saravanan Vajravel April 6, 2023, 4:25 a.m. UTC

When unregistering the MAD agent, the srpt module only checks that the
'mad_agent' pointer is non-NULL before invoking ib_unregister_mad_agent().
This check also passes if 'mad_agent' holds an error value.
'mad_agent' can hold an error value for a short window when
srpt_add_one() and srpt_remove_one() are executed simultaneously.

In the srpt module, add a valid pointer check for 'sport->mad_agent'
before unregistering the MAD agent.

This issue can be hit when the RoCE driver unregisters the ib_device.
Stack Trace:
------------
BUG: kernel NULL pointer dereference, address: 000000000000004d
PGD 145003067 P4D 145003067 PUD 2324fe067 PMD 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 10 PID: 4459 Comm: kworker/u80:0 Kdump: loaded Tainted: P
Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS 2.5.4 01/13/2020
Workqueue: bnxt_re bnxt_re_task [bnxt_re]
RIP: 0010:_raw_spin_lock_irqsave+0x19/0x40
Call Trace:
  ib_unregister_mad_agent+0x46/0x2f0 [ib_core]
  IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
  ? __schedule+0x20b/0x560
  srpt_unregister_mad_agent+0x93/0xd0 [ib_srpt]
  srpt_remove_one+0x20/0x150 [ib_srpt]
  remove_client_context+0x88/0xd0 [ib_core]
  bond0: (slave p2p1): link status definitely up, 100000 Mbps full duplex
  disable_device+0x8a/0x160 [ib_core]
  bond0: active interface up!
  ? kernfs_name_hash+0x12/0x80
 (NULL device *): Bonding Info Received: rdev: 000000006c0b8247
  __ib_unregister_device+0x42/0xb0 [ib_core]
 (NULL device *):         Master: mode: 4 num_slaves:2
  ib_unregister_device+0x22/0x30 [ib_core]
 (NULL device *):         Slave: id: 105069936 name:p2p1 link:0 state:0
  bnxt_re_stopqps_and_ib_uninit+0x83/0x90 [bnxt_re]
  bnxt_re_alloc_lag+0x12e/0x4e0 [bnxt_re]

Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
---
v1->v2:
   - Return pointer of mad_agent is stored in local pointer to verify
     if it is successfully registered

 drivers/infiniband/ulp/srpt/ib_srpt.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

Comments

Bart Van Assche April 6, 2023, 5:08 p.m. UTC | #1
On 4/5/23 21:25, Saravanan Vajravel wrote:
> +		if (IS_ERR(mad_agent)) {
>   			pr_err("%s-%d: MAD agent registration failed (%ld). Note: this is expected if SR-IOV is enabled.\n",
>   			       dev_name(&sport->sdev->device->dev), sport->port,
> -			       PTR_ERR(sport->mad_agent));
> +			       PTR_ERR(mad_agent));
>   			sport->mad_agent = NULL;
>   			memset(&port_modify, 0, sizeof(port_modify));
>   			port_modify.clr_port_cap_mask = IB_PORT_DEVICE_MGMT_SUP;
>   			ib_modify_port(sport->sdev->device, sport->port, 0,
>   				       &port_modify);
> -
> +		} else {
> +			sport->mad_agent = mad_agent;
>   		}
>   	}
>   

With an early return the 'else' clause wouldn't have been necessary. Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
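
For reference, the early-return shape mentioned above could look roughly like the following userspace sketch. It is only an illustration under stated assumptions: struct demo_port, demo_register_agent() and demo_refresh_port() are hypothetical stand-ins for struct srpt_port, ib_register_mad_agent() and srpt_refresh_port(), and ERR_PTR()/IS_ERR() are simplified re-implementations of the kernel helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel helpers (assumption). */
#define MAX_ERRNO 4095

static void *ERR_PTR(long error) { return (void *)(intptr_t)error; }

static bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct srpt_port; only the relevant field. */
struct demo_port {
	void *mad_agent;
};

/* Hypothetical stand-in for ib_register_mad_agent(); fails on request. */
static void *demo_register_agent(bool fail)
{
	static int agent;	/* any valid object to point at */

	return fail ? ERR_PTR(-EINVAL) : (void *)&agent;
}

/* The result is held in a local and published only once it is known good. */
int demo_refresh_port(struct demo_port *port, bool fail)
{
	void *mad_agent = demo_register_agent(fail);

	if (IS_ERR(mad_agent)) {
		/* Error handling (logging, clearing port caps) would go here. */
		port->mad_agent = NULL;
		return 0;	/* early return, so no 'else' is needed */
	}

	port->mad_agent = mad_agent;
	return 0;
}

/* The port field never holds an error pointer, whichever path is taken. */
void demo_run(void)
{
	struct demo_port port = { .mad_agent = NULL };

	demo_refresh_port(&port, true);
	assert(port.mad_agent == NULL);

	demo_refresh_port(&port, false);
	assert(port.mad_agent != NULL && !IS_ERR(port.mad_agent));
}
```

In this sketch the port field is only ever NULL or a valid pointer, so a non-NULL check on the unregister side stays sufficient.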

Leon Romanovsky April 9, 2023, 10:06 a.m. UTC | #2
On Wed, 05 Apr 2023 21:25:49 -0700, Saravanan Vajravel wrote:
> When unregistering MAD agent, srpt module has a non-null check
> for 'mad_agent' pointer before invoking ib_unregister_mad_agent().
> This check can pass if 'mad_agent' variable holds an error value.
> The 'mad_agent' can have an error value for a short window when
> srpt_add_one() and srpt_remove_one() is executed simultaneously.
> 
> In srpt module, added a valid pointer check for 'sport->mad_agent'
> before unregistering MAD agent.
> 
> [...]

Applied, thanks!

[1/1] RDMA/srpt: Add a check for valid 'mad_agent' pointer
      https://git.kernel.org/rdma/rdma/c/eca5cd9474cd26

Best regards,

Leon Romanovsky April 9, 2023, 10:07 a.m. UTC | #3
On Thu, Apr 06, 2023 at 10:08:18AM -0700, Bart Van Assche wrote:
> On 4/5/23 21:25, Saravanan Vajravel wrote:
> > +		if (IS_ERR(mad_agent)) {
> >   			pr_err("%s-%d: MAD agent registration failed (%ld). Note: this is expected if SR-IOV is enabled.\n",
> >   			       dev_name(&sport->sdev->device->dev), sport->port,
> > -			       PTR_ERR(sport->mad_agent));
> > +			       PTR_ERR(mad_agent));
> >   			sport->mad_agent = NULL;
> >   			memset(&port_modify, 0, sizeof(port_modify));
> >   			port_modify.clr_port_cap_mask = IB_PORT_DEVICE_MGMT_SUP;
> >   			ib_modify_port(sport->sdev->device, sport->port, 0,
> >   				       &port_modify);
> > -
> > +		} else {
> > +			sport->mad_agent = mad_agent;
> >   		}
> >   	}
> 
> With an early return the 'else' clause wouldn't have been necessary. Anyway:
> 
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Thanks, I fixed it locally and applied.

Patch

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 3c3fae738c3e..b4cb88563bb2 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -549,6 +549,7 @@  static int srpt_format_guid(char *buf, unsigned int size, const __be64 *guid)
  */
 static int srpt_refresh_port(struct srpt_port *sport)
 {
+	struct ib_mad_agent *mad_agent;
 	struct ib_mad_reg_req reg_req;
 	struct ib_port_modify port_modify;
 	struct ib_port_attr port_attr;
@@ -593,23 +594,24 @@  static int srpt_refresh_port(struct srpt_port *sport)
 		set_bit(IB_MGMT_METHOD_GET, reg_req.method_mask);
 		set_bit(IB_MGMT_METHOD_SET, reg_req.method_mask);
 
-		sport->mad_agent = ib_register_mad_agent(sport->sdev->device,
-							 sport->port,
-							 IB_QPT_GSI,
-							 &reg_req, 0,
-							 srpt_mad_send_handler,
-							 srpt_mad_recv_handler,
-							 sport, 0);
-		if (IS_ERR(sport->mad_agent)) {
+		mad_agent = ib_register_mad_agent(sport->sdev->device,
+						  sport->port,
+						  IB_QPT_GSI,
+						  &reg_req, 0,
+						  srpt_mad_send_handler,
+						  srpt_mad_recv_handler,
+						  sport, 0);
+		if (IS_ERR(mad_agent)) {
 			pr_err("%s-%d: MAD agent registration failed (%ld). Note: this is expected if SR-IOV is enabled.\n",
 			       dev_name(&sport->sdev->device->dev), sport->port,
-			       PTR_ERR(sport->mad_agent));
+			       PTR_ERR(mad_agent));
 			sport->mad_agent = NULL;
 			memset(&port_modify, 0, sizeof(port_modify));
 			port_modify.clr_port_cap_mask = IB_PORT_DEVICE_MGMT_SUP;
 			ib_modify_port(sport->sdev->device, sport->port, 0,
 				       &port_modify);
-
+		} else {
+			sport->mad_agent = mad_agent;
 		}
 	}