Message ID | 20211123084809.37318-1-liangwenpeng@huawei.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Jason Gunthorpe |
Series | [for-rc] RDMA/hns: Fix the problem of mailbox being blocked in the reset scene |

On Tue, Nov 23, 2021 at 04:48:09PM +0800, Wenpeng Liang wrote:
> From: Yangyang Li <liyangyang20@huawei.com>
>
> is_reset is used to indicate whether the hardware starts to reset. When
> hns_roce_hw_v2_reset_notify_down() is called, the hardware has not yet
> started to reset. If is_reset is set at this time, all mailbox operations
> of resource destroy actions will be intercepted by driver. When the driver
> cleans up resources, but the hardware is still accessed, the following
> errors will appear:
>
> [382663.191495] arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
> [382663.336320] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000350100000010
> [382663.349860] arm-smmu-v3 arm-smmu-v3.2.auto: 0x000002088000003f
> [382663.362217] arm-smmu-v3 arm-smmu-v3.2.auto: 0x00000000a50e0800
> [382663.370690] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000000000000000
> [382663.385557] arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
> [382663.487465] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000350100000010
> [382663.534555] arm-smmu-v3 arm-smmu-v3.2.auto: 0x000002088000043e
> [382663.546569] arm-smmu-v3 arm-smmu-v3.2.auto: 0x00000000a50a0800
> [382663.554642] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000000000000000
> [382663.565023] arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
> [382663.575860] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000350100000010
> [382663.585248] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000020880000436
> [382663.595860] arm-smmu-v3 arm-smmu-v3.2.auto: 0x00000000a50a0880
> [382663.804870] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000000000000000
> [382663.942132] arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
> [382663.962770] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000350100000010
> [382664.100535] arm-smmu-v3 arm-smmu-v3.2.auto: 0x000002088000043a
> [382664.178632] arm-smmu-v3 arm-smmu-v3.2.auto: 0x00000000a50e0840
> [382664.218997] hns3 0000:35:00.0: INT status: CMDQ(0x0) HW errors(0x0) other(0x0)
> [382664.223572] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000000000000000
> [382664.257988] hns3 0000:35:00.0: received unknown or unhandled event of vector0
> [382664.271027] arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
> [382664.546592] arm-smmu-v3 arm-smmu-v3.2.auto: 0x0000350100000010
> [382664.555942] {34}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 7
>
> is_reset will be set correctly in check_aedev_reset_status(), so the
> setting in hns_roce_hw_v2_reset_notify_down() should be deleted.
>
> Fixes: 726be12f5ca0 ("RDMA/hns: Set reset flag when hw resetting")
> Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 2 --
>  1 file changed, 2 deletions(-)

Applied to for-rc, thanks

Jason

diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 9bfbaddd1763..ae14329c619c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -6387,10 +6387,8 @@ static int hns_roce_hw_v2_reset_notify_down(struct hnae3_handle *handle)
 	if (!hr_dev)
 		return 0;
 
-	hr_dev->is_reset = true;
 	hr_dev->active = false;
 	hr_dev->dis_db = true;
-
 	hr_dev->state = HNS_ROCE_DEVICE_STATE_RST_DOWN;
 
 	return 0;
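
The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not driver code: `struct model_dev`, `model_destroy_mbox()`, `model_notify_down()` and `model_teardown()` are hypothetical names invented here, and the "mailbox" is reduced to a counter of contexts the hardware still holds. The only behaviour taken from the patch description is that destroy mailbox commands are intercepted once `is_reset` is set, and that the old notify-down path set it before the hardware had actually started to reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the driver's device state. */
struct model_dev {
	bool is_reset;    /* driver believes the hardware reset has started */
	int hw_contexts;  /* contexts the (modelled) hardware still holds */
};

/*
 * Model of a "destroy" mailbox command. As in the commit message, the
 * command is intercepted (never reaches hardware) once is_reset is set.
 * Returns true if the command reached the modelled hardware.
 */
static bool model_destroy_mbox(struct model_dev *dev)
{
	assert(dev != NULL);

	if (dev->is_reset)
		return false;           /* intercepted by the driver */

	if (dev->hw_contexts > 0)
		dev->hw_contexts--;     /* hardware drops one context */
	return true;
}

/*
 * Model of the notify-down step. set_is_reset_early == true mimics the
 * behaviour before the patch; false mimics the behaviour after it.
 */
static void model_notify_down(struct model_dev *dev, bool set_is_reset_early)
{
	assert(dev != NULL);

	if (set_is_reset_early)
		dev->is_reset = true;
}

/*
 * Tear down three contexts and report how many the hardware still holds
 * when the driver is about to free the backing memory. Any non-zero
 * result stands for memory the hardware may still access after the free.
 */
static int model_teardown(bool set_is_reset_early)
{
	struct model_dev dev = { .is_reset = false, .hw_contexts = 3 };

	model_notify_down(&dev, set_is_reset_early);

	while (dev.hw_contexts > 0 && model_destroy_mbox(&dev))
		;

	return dev.hw_contexts;
}
```

In this model, `model_teardown(true)` leaves all three contexts with the hardware, while `model_teardown(false)` leaves none. That matches the reasoning in the commit message: leaving `is_reset` to be set later lets the destroy commands reach the hardware before the resources are freed.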