[for-next] RDMA/efa: Reset device on probe failure

Message ID 20241225131548.15155-1-mrgolin@amazon.com (mailing list archive)
State Accepted
Series [for-next] RDMA/efa: Reset device on probe failure

Commit Message

Michael Margolin Dec. 25, 2024, 1:15 p.m. UTC
Make sure the device is being reset on driver exit whatever the reason
is, to keep the device aligned and allow it to close shared resources
(e.g. admin queue).

Reviewed-by: Firas Jahjah <firasj@amazon.com>
Reviewed-by: Yonatan Nachum <ynachum@amazon.com>
Signed-off-by: Michael Margolin <mrgolin@amazon.com>
---
 drivers/infiniband/hw/efa/efa_main.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Gal Pressman Dec. 30, 2024, 6:46 a.m. UTC | #1
On 25/12/2024 15:15, Michael Margolin wrote:
> Make sure the device is being reset on driver exit whatever the reason
> is, to keep the device aligned and allow it to close shared resources
> (e.g. admin queue).
> 
> Reviewed-by: Firas Jahjah <firasj@amazon.com>
> Reviewed-by: Yonatan Nachum <ynachum@amazon.com>
> Signed-off-by: Michael Margolin <mrgolin@amazon.com>
> ---
> @@ -685,7 +685,7 @@ static void efa_remove(struct pci_dev *pdev)
>  	struct efa_dev *dev = pci_get_drvdata(pdev);
>  
>  	efa_ib_device_remove(dev);

This already calls efa_com_dev_reset(), you now perform double reset in
the normal remove flow.

> -	efa_remove_device(pdev);
> +	efa_remove_device(pdev, false);
>  }
Gal Pressman Dec. 30, 2024, 6:52 a.m. UTC | #2
On 30/12/2024 8:46, Gal Pressman wrote:
> On 25/12/2024 15:15, Michael Margolin wrote:
>> Make sure the device is being reset on driver exit whatever the reason
>> is, to keep the device aligned and allow it to close shared resources
>> (e.g. admin queue).
>>
>> Reviewed-by: Firas Jahjah <firasj@amazon.com>
>> Reviewed-by: Yonatan Nachum <ynachum@amazon.com>
>> Signed-off-by: Michael Margolin <mrgolin@amazon.com>
>> ---
>> @@ -685,7 +685,7 @@ static void efa_remove(struct pci_dev *pdev)
>>  	struct efa_dev *dev = pci_get_drvdata(pdev);
>>  
>>  	efa_ib_device_remove(dev);
> 
> This already calls efa_com_dev_reset(), you now perform double reset in
> the normal remove flow.

Sorry, I obviously missed the part that removed it from
efa_ib_device_remove().

Reviewed-by: Gal Pressman <gal.pressman@linux.dev>

> 
>> -	efa_remove_device(pdev);
>> +	efa_remove_device(pdev, false);
>>  }
Leon Romanovsky Dec. 30, 2024, 6:41 p.m. UTC | #3
On Wed, 25 Dec 2024 13:15:48 +0000, Michael Margolin wrote:
> Make sure the device is being reset on driver exit whatever the reason
> is, to keep the device aligned and allow it to close shared resources
> (e.g. admin queue).
> 
> 

Applied, thanks!

[1/1] RDMA/efa: Reset device on probe failure
      https://git.kernel.org/rdma/rdma/c/123c13f10ed362

Best regards,
Patch

diff --git a/drivers/infiniband/hw/efa/efa_main.c b/drivers/infiniband/hw/efa/efa_main.c
index ad225823e6f2..0b102089e0ab 100644
--- a/drivers/infiniband/hw/efa/efa_main.c
+++ b/drivers/infiniband/hw/efa/efa_main.c
@@ -470,7 +470,6 @@  static void efa_ib_device_remove(struct efa_dev *dev)
 	ibdev_info(&dev->ibdev, "Unregister ib device\n");
 	ib_unregister_device(&dev->ibdev);
 	efa_destroy_eqs(dev);
-	efa_com_dev_reset(&dev->edev, EFA_REGS_RESET_NORMAL);
 	efa_release_doorbell_bar(dev);
 }
 
@@ -643,12 +642,13 @@  static struct efa_dev *efa_probe_device(struct pci_dev *pdev)
 	return ERR_PTR(err);
 }
 
-static void efa_remove_device(struct pci_dev *pdev)
+static void efa_remove_device(struct pci_dev *pdev, bool on_error)
 {
 	struct efa_dev *dev = pci_get_drvdata(pdev);
 	struct efa_com_dev *edev;
 
 	edev = &dev->edev;
+	efa_com_dev_reset(edev, on_error ? EFA_REGS_RESET_INIT_ERR : EFA_REGS_RESET_NORMAL);
 	efa_com_admin_destroy(edev);
 	efa_free_irq(dev, &dev->admin_irq);
 	efa_disable_msix(dev);
@@ -676,7 +676,7 @@  static int efa_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	return 0;
 
 err_remove_device:
-	efa_remove_device(pdev);
+	efa_remove_device(pdev, true);
 	return err;
 }
 
@@ -685,7 +685,7 @@  static void efa_remove(struct pci_dev *pdev)
 	struct efa_dev *dev = pci_get_drvdata(pdev);
 
 	efa_ib_device_remove(dev);
-	efa_remove_device(pdev);
+	efa_remove_device(pdev, false);
 }
 
 static void efa_shutdown(struct pci_dev *pdev)
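
The control flow after this patch can be modelled in user space to check the point raised in review: each exit path should reset the device exactly once, with a reason that matches the path. The following is only a rough sketch under stated assumptions, not driver code. Every `mock_*` name, the enum values, and the `admin_queue_alive` flag are hypothetical stand-ins invented for illustration; only the ordering (reset first, then admin queue teardown) and the choice of reason from `on_error` are taken from the diff above.

```c
/* Hypothetical user-space model of the teardown flow; not kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the reset reasons named in the patch; values are made up. */
enum mock_reset_reason {
	MOCK_RESET_NONE = 0,
	MOCK_RESET_NORMAL,
	MOCK_RESET_INIT_ERR,
};

/* Records what the mock "device" observed during teardown. */
struct mock_dev {
	int reset_count;
	enum mock_reset_reason last_reason;
	bool admin_queue_alive;
};

void mock_dev_init(struct mock_dev *dev)
{
	dev->reset_count = 0;
	dev->last_reason = MOCK_RESET_NONE;
	dev->admin_queue_alive = true;
}

/* Models efa_com_dev_reset(): count the reset and remember its reason. */
void mock_dev_reset(struct mock_dev *dev, enum mock_reset_reason reason)
{
	/* The patch issues the reset before the admin queue is destroyed. */
	assert(dev->admin_queue_alive);
	dev->reset_count++;
	dev->last_reason = reason;
}

/* Models efa_ib_device_remove(): after the patch it no longer resets. */
void mock_ib_device_remove(struct mock_dev *dev)
{
	(void)dev;
}

/* Models efa_remove_device(): the single place where the reset happens. */
void mock_remove_device(struct mock_dev *dev, bool on_error)
{
	mock_dev_reset(dev, on_error ? MOCK_RESET_INIT_ERR : MOCK_RESET_NORMAL);
	dev->admin_queue_alive = false; /* models efa_com_admin_destroy() */
}

/* Models the err_remove_device label in efa_probe(). */
void mock_probe_failure(struct mock_dev *dev)
{
	mock_remove_device(dev, true);
}

/* Models efa_remove(). */
void mock_remove(struct mock_dev *dev)
{
	mock_ib_device_remove(dev);
	mock_remove_device(dev, false);
}
```

Calling `mock_probe_failure()` on a freshly initialised `struct mock_dev` leaves `reset_count` at 1 with `MOCK_RESET_INIT_ERR`, and `mock_remove()` leaves it at 1 with `MOCK_RESET_NORMAL`. In this model, moving the reset out of the IB-level teardown is what avoids the double reset questioned in comment #1.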