[02/12] accel/habanalabs: update pending reset flags with new reset requests

Message ID 20230608133849.2739411-2-ogabbay@kernel.org (mailing list archive)
State New, archived
Series [01/12] accel/habanalabs: prevent immediate hard reset due to 2 adjacent H/W events

Commit Message

Oded Gabbay June 8, 2023, 1:38 p.m. UTC
From: Tomer Tayar <ttayar@habana.ai>

If hl_device_cond_reset() is called while a reset is already pending but
hasn't started, the new reset request is dropped.
If the flags of the new request are more severe, e.g. a hard reset while
the pending reset is a compute reset, the eventual reset won't be
suitable for the device status.

To prevent such cases, update the pending reset flags with the new
request's flags before the request is dropped.
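
For illustration only, the merge behaviour described above can be sketched
as a small standalone model. This is not the driver code: the struct, the
function and the two flag bits (sketch_reset_state, sketch_cond_reset,
SKETCH_RESET_COMPUTE, SKETCH_RESET_HARD) are hypothetical names, and the
sketch assumes that arming a pending reset is what sets watchdog_active.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the driver's real values. */
#define SKETCH_RESET_COMPUTE (1u << 0)
#define SKETCH_RESET_HARD    (1u << 1)

/* Simplified stand-in for the reset bookkeeping kept in struct hl_device. */
struct sketch_reset_state {
	bool watchdog_active;   /* a reset is pending but has not started yet */
	uint32_t pending_flags; /* flags the eventual reset will be run with */
};

/*
 * Model of the conditional reset request path.
 * Returns true if this call armed a new pending reset, false if a reset
 * was already pending and the request was only merged into it.
 */
static bool sketch_cond_reset(struct sketch_reset_state *st, uint32_t flags)
{
	if (st->watchdog_active) {
		/* OR in the new flags so a more severe request is not lost. */
		st->pending_flags |= flags;
		return false;
	}

	/* No reset pending: record the flags and arm the pending reset. */
	st->pending_flags = flags;
	st->watchdog_active = true; /* assumption of this sketch */
	return true;
}
```

With this model, a compute reset request followed by a hard reset request
leaves both bits set in pending_flags, so the eventual reset can honour the
more severe request. With plain assignment skipped on the second call (the
behaviour before this patch), only the compute bit would remain.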

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
 drivers/accel/habanalabs/common/device.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Patch

diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
index 1e61e79c42e5..993305871292 100644
--- a/drivers/accel/habanalabs/common/device.c
+++ b/drivers/accel/habanalabs/common/device.c
@@ -1937,8 +1937,10 @@  int hl_device_cond_reset(struct hl_device *hdev, u32 flags, u64 event_mask)
 		goto device_reset;
 	}
 
-	if (hdev->reset_info.watchdog_active)
+	if (hdev->reset_info.watchdog_active) {
+		hdev->device_release_watchdog_work.flags |= flags;
 		goto out;
+	}
 
 	hdev->device_release_watchdog_work.flags = flags;
 	dev_dbg(hdev->dev, "Device is going to be hard-reset in %u sec unless being released\n",