Message ID | 1542959168-30503-3-git-send-email-liuyixian@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Jason Gunthorpe |
Series | RDMA/hns: Some updates for hip08 |
On Fri, Nov 23, 2018 at 03:46:08PM +0800, Yixian Liu wrote:
> From: Xiaofei Tan <tanxiaofei@huawei.com>
>
> AEQ overflow is reported by hardware when too many
> asynchronous events occur and are not handled in time.
> Normally, an AEQ overflow error is unlikely to occur. Once
> it happens, we have to do a physical function reset to recover.
> PF reset is implemented in two steps. Firstly, set the reset
> level with ae_dev->ops->set_default_reset_request.
> Secondly, run the reset with ae_dev->ops->reset_event.
>
> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
> Signed-off-by: Yixian Liu <liuyixian@huawei.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> index 3beb152..d02fe04 100644
> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
> @@ -4565,11 +4565,22 @@ static irqreturn_t hns_roce_v2_msix_interrupt_abn(int irq, void *dev_id)
>  	int_en = roce_read(hr_dev, ROCEE_VF_ABN_INT_EN_REG);
>
>  	if (roce_get_bit(int_st, HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S)) {
> +		struct pci_dev *pdev = hr_dev->pci_dev;
> +		struct hnae3_ae_dev *ae_dev = pci_get_drvdata(pdev);
> +		const struct hnae3_ae_ops *ops = ae_dev->ops;
> +
>  		dev_err(dev, "AEQ overflow!\n");
>
>  		roce_set_bit(int_st, HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S, 1);
>  		roce_write(hr_dev, ROCEE_VF_ABN_INT_ST_REG, int_st);
>
> +		/* Set reset level for the following reset_event() call */
> +		if (ops->set_default_reset_request)
> +			ops->set_default_reset_request(ae_dev,
> +						       HNAE3_FUNC_RESET);

This doesn't compile:

drivers/infiniband/hw/hns/hns_roce_hw_v2.c: In function ‘hns_roce_v2_msix_interrupt_abn’:
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4578:10: error: ‘const struct hnae3_ae_ops’ has no member named ‘set_default_reset_request’
   if (ops->set_default_reset_request)
          ^~
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4579:7: error: ‘const struct hnae3_ae_ops’ has no member named ‘set_default_reset_request’
    ops->set_default_reset_request(ae_dev,
       ^~

You can't send patches to -rc that don't compile on -rc...

Jason
Hi Jason

On 2018/11/24 4:18, Jason Gunthorpe wrote:
> On Fri, Nov 23, 2018 at 03:46:08PM +0800, Yixian Liu wrote:
>> [...]
>> +		/* Set reset level for the following reset_event() call */
>> +		if (ops->set_default_reset_request)
>> +			ops->set_default_reset_request(ae_dev,
>> +						       HNAE3_FUNC_RESET);
>
> This doesn't compile:
>
> drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4578:10: error: ‘const struct hnae3_ae_ops’ has no member named ‘set_default_reset_request’
> [...]
>
> You can't send patches to -rc that don't compile on -rc...
>

Sorry, I left the patch dependency information out of the commit message.
This patch depends on the new interface in the hns NIC driver,
which has been accepted by David. The related commit is below:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
commit 720bd5837e3721f553a896a00da4a99ea12f0551
On Sat, Nov 24, 2018 at 11:21:50AM +0800, tanxiaofei wrote:
> On 2018/11/24 4:18, Jason Gunthorpe wrote:
> > [...]
> > You can't send patches to -rc that don't compile on -rc...
> >
>
> Sorry, I left the patch dependency information out of the commit message.
> This patch depends on the new interface in the hns NIC driver,
> which has been accepted by David. The related commit is below:
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git

You can't send -rc code which depends on -next, only vice versa is possible.

> commit 720bd5837e3721f553a896a00da4a99ea12f0551
Hi Leon

On 2018/11/25 16:46, Leon Romanovsky wrote:
> On Sat, Nov 24, 2018 at 11:21:50AM +0800, tanxiaofei wrote:
>> [...]
>> This patch depends on the new interface in the hns NIC driver,
>> which has been accepted by David. The related commit is below:
>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
>
> You can't send -rc code which depends on -next, only vice versa is possible.
>

OK. I will remake the patch based on the for-next branch. Thanks.

>> commit 720bd5837e3721f553a896a00da4a99ea12f0551
On Mon, Nov 26, 2018 at 03:02:08PM +0800, tanxiaofei wrote:
> Hi Leon
>
> On 2018/11/25 16:46, Leon Romanovsky wrote:
> > [...]
> > You can't send -rc code which depends on -next, only vice versa is possible.
> >
>
> OK. I will remake the patch based on the for-next branch. Thanks.

Use for-rc if you are sending it to -rc

Jason
Hi Jason,

On 2018/11/27 1:42, Jason Gunthorpe wrote:
> On Mon, Nov 26, 2018 at 03:02:08PM +0800, tanxiaofei wrote:
>> [...]
>> OK. I will remake the patch based on the for-next branch. Thanks.
>
> Use for-rc if you are sending it to -rc
>

OK. I will send the patch to the for-next branch, as the dependent patch
accepted by David is also in the -next branch. Thanks.
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 3beb152..d02fe04 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -4565,11 +4565,22 @@ static irqreturn_t hns_roce_v2_msix_interrupt_abn(int irq, void *dev_id)
 	int_en = roce_read(hr_dev, ROCEE_VF_ABN_INT_EN_REG);
 
 	if (roce_get_bit(int_st, HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S)) {
+		struct pci_dev *pdev = hr_dev->pci_dev;
+		struct hnae3_ae_dev *ae_dev = pci_get_drvdata(pdev);
+		const struct hnae3_ae_ops *ops = ae_dev->ops;
+
 		dev_err(dev, "AEQ overflow!\n");
 
 		roce_set_bit(int_st, HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S, 1);
 		roce_write(hr_dev, ROCEE_VF_ABN_INT_ST_REG, int_st);
 
+		/* Set reset level for the following reset_event() call */
+		if (ops->set_default_reset_request)
+			ops->set_default_reset_request(ae_dev,
+						       HNAE3_FUNC_RESET);
+		if (ops->reset_event)
+			ops->reset_event(pdev, NULL);
+
 		roce_set_bit(int_en, HNS_ROCE_V2_VF_ABN_INT_EN_S, 1);
 		roce_write(hr_dev, ROCEE_VF_ABN_INT_EN_REG, int_en);
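
For readers who want to try the two-step pattern from the patch outside the kernel, the following is a minimal user-space sketch. It is not the hns or hnae3 code: every `demo_*` name, the simplified callback signatures, and the `pending`/`resets_run` bookkeeping are assumptions made up for illustration. It only shows the shape of the logic, namely that each optional callback in the ops table is NULL-checked before use, and that the reset level is recorded before the reset is triggered.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the hnae3 types; NOT the kernel definitions. */
enum demo_reset_level { DEMO_RESET_NONE, DEMO_RESET_FUNC };

struct demo_ae_dev;

struct demo_ae_ops {
	/* Either callback may be NULL when the provider does not implement it. */
	void (*set_default_reset_request)(struct demo_ae_dev *dev,
					  enum demo_reset_level level);
	void (*reset_event)(struct demo_ae_dev *dev);
};

struct demo_ae_dev {
	const struct demo_ae_ops *ops;
	enum demo_reset_level pending;	/* level recorded by step 1 */
	int resets_run;			/* resets started by step 2 */
};

/* Step 1 (illustrative): remember which reset level was requested. */
static void demo_set_level(struct demo_ae_dev *dev, enum demo_reset_level level)
{
	dev->pending = level;
}

/* Step 2 (illustrative): start a reset only if a level was requested. */
static void demo_reset_event(struct demo_ae_dev *dev)
{
	if (dev->pending == DEMO_RESET_NONE)
		return;
	dev->resets_run++;
	dev->pending = DEMO_RESET_NONE;
}

/* A provider that implements both callbacks. */
static const struct demo_ae_ops demo_full_ops = {
	.set_default_reset_request = demo_set_level,
	.reset_event = demo_reset_event,
};

/* A provider that implements neither callback. */
static const struct demo_ae_ops demo_empty_ops = { NULL, NULL };

/* Mirrors the shape of the AEQ overflow branch: set the level, then trigger. */
static void demo_handle_aeq_overflow(struct demo_ae_dev *dev)
{
	const struct demo_ae_ops *ops = dev->ops;

	if (ops->set_default_reset_request)
		ops->set_default_reset_request(dev, DEMO_RESET_FUNC);
	if (ops->reset_event)
		ops->reset_event(dev);
}
```

Calling `demo_handle_aeq_overflow()` on a device that uses `demo_full_ops` runs one reset, while a device that uses `demo_empty_ops` is left untouched. Note that the NULL checks only guard against a callback being unset at run time. They cannot help when the struct member does not exist in the tree being built, which is the compile failure reported in this thread.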