[resend,v3,rdma-next,1/1] RDMA/hns: Add the process of AEQ overflow for hip08

Message ID 1547879009-39383-1-git-send-email-tanxiaofei@huawei.com (mailing list archive)
State Mainlined
Commit 2b9acb9a97fe9b4101ca020643760c4a090b4cb4
Delegated to: Jason Gunthorpe

Commit Message

Xiaofei Tan Jan. 19, 2019, 6:23 a.m. UTC
The hardware reports an AEQ overflow when too many asynchronous
events occur and are not handled in time. An AEQ overflow is rare
under normal conditions, but once it happens, a physical function
(PF) reset is required to recover. The PF reset is done in two
steps: first, set the reset level with
ae_dev->ops->set_default_reset_request; second, run the reset
with ae_dev->ops->reset_event.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
---
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
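
The two-step reset described above can be sketched outside the kernel as follows. This is a minimal illustration only, not the driver code: every mock_* name is a hypothetical stand-in for the hnae3 types, and the mock reset_event takes the device directly instead of a struct pci_dev pointer plus a handle, as the real callback does in the patch. It shows the pattern the patch relies on: both callbacks are optional, so each is checked for NULL before the call, and the reset level is set before the reset is run.

```c
#include <stddef.h>

/* Hypothetical reset levels; the patch itself uses HNAE3_FUNC_RESET. */
enum mock_reset_type { MOCK_NONE_RESET, MOCK_FUNC_RESET };

struct mock_ae_dev;

/* Hypothetical ops table; either callback may be left NULL. */
struct mock_ae_ops {
	void (*set_default_reset_request)(struct mock_ae_dev *ae_dev,
					  enum mock_reset_type type);
	void (*reset_event)(struct mock_ae_dev *ae_dev);
};

struct mock_ae_dev {
	const struct mock_ae_ops *ops;
	enum mock_reset_type pending;	/* level chosen in step 1 */
	int resets_run;			/* number of resets performed */
};

/* Step 1: record which reset level the next reset_event should use. */
static void mock_set_default_reset_request(struct mock_ae_dev *ae_dev,
					   enum mock_reset_type type)
{
	ae_dev->pending = type;
}

/* Step 2: run the pending reset, if any, and clear the request. */
static void mock_reset_event(struct mock_ae_dev *ae_dev)
{
	if (ae_dev->pending == MOCK_NONE_RESET)
		return;
	ae_dev->resets_run++;
	ae_dev->pending = MOCK_NONE_RESET;
}

static const struct mock_ae_ops mock_ops = {
	.set_default_reset_request = mock_set_default_reset_request,
	.reset_event = mock_reset_event,
};

/* Mirrors the order used in the patch: set the level, then reset. */
static void handle_aeq_overflow(struct mock_ae_dev *ae_dev)
{
	const struct mock_ae_ops *ops = ae_dev->ops;

	if (ops->set_default_reset_request)
		ops->set_default_reset_request(ae_dev, MOCK_FUNC_RESET);
	if (ops->reset_event)
		ops->reset_event(ae_dev);
}
```

In this sketch, calling handle_aeq_overflow() on a device whose ops table leaves both callbacks NULL does nothing, which is the reason for the NULL checks.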

Comments

Jason Gunthorpe Jan. 21, 2019, 11:48 p.m. UTC | #1
On Sat, Jan 19, 2019 at 02:23:29PM +0800, Xiaofei Tan wrote:
> The hardware reports an AEQ overflow when too many asynchronous
> events occur and are not handled in time. An AEQ overflow is rare
> under normal conditions, but once it happens, a physical function
> (PF) reset is required to recover. The PF reset is done in two
> steps: first, set the reset level with
> ae_dev->ops->set_default_reset_request; second, run the reset
> with ae_dev->ops->reset_event.
> 
> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
> Signed-off-by: Yixian Liu <liuyixian@huawei.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)

Applied to for-next

Thanks,
Jason
Patch

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index d778457..fb990ff 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -4702,11 +4702,22 @@  static irqreturn_t hns_roce_v2_msix_interrupt_abn(int irq, void *dev_id)
 	int_en = roce_read(hr_dev, ROCEE_VF_ABN_INT_EN_REG);
 
 	if (roce_get_bit(int_st, HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S)) {
+		struct pci_dev *pdev = hr_dev->pci_dev;
+		struct hnae3_ae_dev *ae_dev = pci_get_drvdata(pdev);
+		const struct hnae3_ae_ops *ops = ae_dev->ops;
+
 		dev_err(dev, "AEQ overflow!\n");
 
 		roce_set_bit(int_st, HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S, 1);
 		roce_write(hr_dev, ROCEE_VF_ABN_INT_ST_REG, int_st);
 
+		/* Set reset level for reset_event() */
+		if (ops->set_default_reset_request)
+			ops->set_default_reset_request(ae_dev,
+						       HNAE3_FUNC_RESET);
+		if (ops->reset_event)
+			ops->reset_event(pdev, NULL);
+
 		roce_set_bit(int_en, HNS_ROCE_V2_VF_ABN_INT_EN_S, 1);
 		roce_write(hr_dev, ROCEE_VF_ABN_INT_EN_REG, int_en);