diff mbox series

[vfio,2/5] vfio/mlx5: Add support for tracker object events

Message ID 20240130170227.153464-3-yishaih@nvidia.com (mailing list archive)
State New, archived
Headers show
Series Improve mlx5 driver to better handle some error cases | expand

Commit Message

Yishai Hadas Jan. 30, 2024, 5:02 p.m. UTC
Add support for tracker object events by referring to its
MLX5_EVENT_TYPE_OBJECT_CHANGE event when occurs.

This lets the driver recognize whether the firmware moved the tracker
object to an error state.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/mlx5/cmd.c | 48 +++++++++++++++++++++++++++++++++++++
 drivers/vfio/pci/mlx5/cmd.h |  1 +
 2 files changed, 49 insertions(+)

Comments

Tian, Kevin Feb. 5, 2024, 8:10 a.m. UTC | #1
> From: Yishai Hadas <yishaih@nvidia.com>
> Sent: Wednesday, January 31, 2024 1:02 AM
> 
> +static void set_tracker_event(struct mlx5vf_pci_core_device *mvdev)
> +{
> +	mvdev->tracker.event_occur = true;
> +	complete(&mvdev->tracker_comp);

it's slightly clearer to call it 'object_changed'.

> @@ -1634,6 +1671,11 @@ int mlx5vf_tracker_read_and_clear(struct
> vfio_device *vdev, unsigned long iova,
>  		goto end;
>  	}
> 
> +	if (tracker->is_err) {
> +		err = -EIO;
> +		goto end;
> +	}
> +

this sounds like a separate improvement? i.e. if the tracker is already
in an error state then exit early. if yes better put it in a separate patch.

> @@ -1652,6 +1694,12 @@ int mlx5vf_tracker_read_and_clear(struct
> vfio_device *vdev, unsigned long iova,
>  						      dirty, &tracker->status);
>  			if (poll_err == CQ_EMPTY) {
>  				wait_for_completion(&mvdev-
> >tracker_comp);
> +				if (tracker->event_occur) {
> +					tracker->event_occur = false;
> +					err =
> mlx5vf_cmd_query_tracker(mdev, tracker);
> +					if (err)
> +						goto end;
> +				}

this implies that the error notified by tracker event cannot be queried
by mlx5vf_cq_poll_one() otherwise the next iteration will get the error
state anyway. possibly add a comment to clarify.

and why not setting state->is_err too?
Yishai Hadas Feb. 5, 2024, 9:20 a.m. UTC | #2
On 05/02/2024 10:10, Tian, Kevin wrote:
>> From: Yishai Hadas <yishaih@nvidia.com>
>> Sent: Wednesday, January 31, 2024 1:02 AM
>>
>> +static void set_tracker_event(struct mlx5vf_pci_core_device *mvdev)
>> +{
>> +	mvdev->tracker.event_occur = true;
>> +	complete(&mvdev->tracker_comp);
> 
> it's slightly clearer to call it 'object_changed'.

Do you refer to the 'event_occur' field ? I can rename it, if you think 
that it's clearer.

> 
>> @@ -1634,6 +1671,11 @@ int mlx5vf_tracker_read_and_clear(struct
>> vfio_device *vdev, unsigned long iova,
>>   		goto end;
>>   	}
>>
>> +	if (tracker->is_err) {
>> +		err = -EIO;
>> +		goto end;
>> +	}
>> +
> 
> this sounds like a separate improvement? i.e. if the tracker is already
> in an error state then exit early. if yes better put it in a separate patch.
> 

As it's just an early exit, I don't think that it worth a separate patch.

However, I can add one statement about that in the commit log.

>> @@ -1652,6 +1694,12 @@ int mlx5vf_tracker_read_and_clear(struct
>> vfio_device *vdev, unsigned long iova,
>>   						      dirty, &tracker->status);
>>   			if (poll_err == CQ_EMPTY) {
>>   				wait_for_completion(&mvdev-
>>> tracker_comp);
>> +				if (tracker->event_occur) {
>> +					tracker->event_occur = false;
>> +					err =
>> mlx5vf_cmd_query_tracker(mdev, tracker);
>> +					if (err)
>> +						goto end;
>> +				}
> 
> this implies that the error notified by tracker event cannot be queried
> by mlx5vf_cq_poll_one() otherwise the next iteration will get the error
> state anyway. 

Right, in that case, no CQE will be delivered, so, this is the way to 
detect that firmware moved the object to an error state, following the 
device specification.

> possibly add a comment to clarify.

OK, I'll add a note in the commit log.

> 
> and why not setting state->is_err too?

No need for an extra code here.

Upon mlx5vf_cmd_query_tracker() the tracker->status field will be 
updated to an error, the while loop will detect that, and do the job.

Yishai
Tian, Kevin Feb. 6, 2024, 7:33 a.m. UTC | #3
> From: Yishai Hadas <yishaih@nvidia.com>
> Sent: Monday, February 5, 2024 5:21 PM
> 
> On 05/02/2024 10:10, Tian, Kevin wrote:
> >> From: Yishai Hadas <yishaih@nvidia.com>
> >> Sent: Wednesday, January 31, 2024 1:02 AM
> >>
> >> +static void set_tracker_event(struct mlx5vf_pci_core_device *mvdev)
> >> +{
> >> +	mvdev->tracker.event_occur = true;
> >> +	complete(&mvdev->tracker_comp);
> >
> > it's slightly clearer to call it 'object_changed'.
> 
> Do you refer to the 'event_occur' field ? I can rename it, if you think
> that it's clearer.

yes

> 
> >
> > and why not setting state->is_err too?
> 
> No need for an extra code here.
> 
> Upon mlx5vf_cmd_query_tracker() the tracker->status field will be
> updated to an error, the while loop will detect that, and do the job.
> 

except below where tracker->status is not updated:

+	err = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out));
+	if (err)
+		return err;
Yishai Hadas Feb. 6, 2024, 8:01 a.m. UTC | #4
On 06/02/2024 9:33, Tian, Kevin wrote:
>> From: Yishai Hadas <yishaih@nvidia.com>
>> Sent: Monday, February 5, 2024 5:21 PM
>>
>> On 05/02/2024 10:10, Tian, Kevin wrote:
>>>> From: Yishai Hadas <yishaih@nvidia.com>
>>>> Sent: Wednesday, January 31, 2024 1:02 AM
>>>>
>>>> +static void set_tracker_event(struct mlx5vf_pci_core_device *mvdev)
>>>> +{
>>>> +	mvdev->tracker.event_occur = true;
>>>> +	complete(&mvdev->tracker_comp);
>>>
>>> it's slightly clearer to call it 'object_changed'.
>>
>> Do you refer to the 'event_occur' field ? I can rename it, if you think
>> that it's clearer.
> 
> yes
> 
>>
>>>
>>> and why not setting state->is_err too?
>>
>> No need for an extra code here.
>>
>> Upon mlx5vf_cmd_query_tracker() the tracker->status field will be
>> updated to an error, the while loop will detect that, and do the job.
>>
> 
> except below where tracker->status is not updated:

This is the expected behavior in that case, see below.

> 
> +	err = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out));
> +	if (err)
> +		return err;

We can't set unconditionally the tracker status to an error without 
getting that information from the firmware.

Upon an error here, we may just go out from the while loop in the caller 
and userspace will get the returned error.

Any further call to mlx5vf_tracker_read_and_clear() if will come, may 
have the chance to start the regular flow.

In case the tracker can't be moved to MLX5_PAGE_TRACK_STATE_REPORTING 
(e.g. as of some previous error) the call will simply fail as expected.

So, it looks OK.

Yishai
Tian, Kevin Feb. 6, 2024, 8:05 a.m. UTC | #5
> From: Yishai Hadas <yishaih@nvidia.com>
> Sent: Tuesday, February 6, 2024 4:02 PM
> 
> On 06/02/2024 9:33, Tian, Kevin wrote:
> >> From: Yishai Hadas <yishaih@nvidia.com>
> >> Sent: Monday, February 5, 2024 5:21 PM
> >>
> >> On 05/02/2024 10:10, Tian, Kevin wrote:
> >>>> From: Yishai Hadas <yishaih@nvidia.com>
> >>>> Sent: Wednesday, January 31, 2024 1:02 AM
> >>>>
> >>>> +static void set_tracker_event(struct mlx5vf_pci_core_device *mvdev)
> >>>> +{
> >>>> +	mvdev->tracker.event_occur = true;
> >>>> +	complete(&mvdev->tracker_comp);
> >>>
> >>> it's slightly clearer to call it 'object_changed'.
> >>
> >> Do you refer to the 'event_occur' field ? I can rename it, if you think
> >> that it's clearer.
> >
> > yes
> >
> >>
> >>>
> >>> and why not setting state->is_err too?
> >>
> >> No need for an extra code here.
> >>
> >> Upon mlx5vf_cmd_query_tracker() the tracker->status field will be
> >> updated to an error, the while loop will detect that, and do the job.
> >>
> >
> > except below where tracker->status is not updated:
> 
> This is the expected behavior in that case, see below.
> 
> >
> > +	err = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out));
> > +	if (err)
> > +		return err;
> 
> We can't set unconditionally the tracker status to an error without
> getting that information from the firmware.
> 
> Upon an error here, we may just go out from the while loop in the caller
> and userspace will get the returned error.
> 
> Any further call to mlx5vf_tracker_read_and_clear() if will come, may
> have the chance to start the regular flow.
> 
> In case the tracker can't be moved to MLX5_PAGE_TRACK_STATE_REPORTING
> (e.g. as of some previous error) the call will simply fail as expected.
> 
> So, it looks OK.
> 

ok then fine to me.
diff mbox series

Patch

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index efd1d252cdc9..55ba02c70093 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -149,6 +149,12 @@  int mlx5vf_cmd_query_vhca_migration_state(struct mlx5vf_pci_core_device *mvdev,
 	return 0;
 }
 
+static void set_tracker_event(struct mlx5vf_pci_core_device *mvdev)
+{
+	mvdev->tracker.event_occur = true;
+	complete(&mvdev->tracker_comp);
+}
+
 static void set_tracker_error(struct mlx5vf_pci_core_device *mvdev)
 {
 	/* Mark the tracker under an error and wake it up if it's running */
@@ -900,6 +906,29 @@  static int mlx5vf_cmd_modify_tracker(struct mlx5_core_dev *mdev,
 	return mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out));
 }
 
+static int mlx5vf_cmd_query_tracker(struct mlx5_core_dev *mdev,
+				    struct mlx5_vhca_page_tracker *tracker)
+{
+	u32 out[MLX5_ST_SZ_DW(query_page_track_obj_out)] = {};
+	u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
+	void *obj_context;
+	void *cmd_hdr;
+	int err;
+
+	cmd_hdr = MLX5_ADDR_OF(modify_page_track_obj_in, in, general_obj_in_cmd_hdr);
+	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, opcode, MLX5_CMD_OP_QUERY_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_type, MLX5_OBJ_TYPE_PAGE_TRACK);
+	MLX5_SET(general_obj_in_cmd_hdr, cmd_hdr, obj_id, tracker->id);
+
+	err = mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out));
+	if (err)
+		return err;
+
+	obj_context = MLX5_ADDR_OF(query_page_track_obj_out, out, obj_context);
+	tracker->status = MLX5_GET(page_track, obj_context, state);
+	return 0;
+}
+
 static int alloc_cq_frag_buf(struct mlx5_core_dev *mdev,
 			     struct mlx5_vhca_cq_buf *buf, int nent,
 			     int cqe_size)
@@ -957,9 +986,11 @@  static int mlx5vf_event_notifier(struct notifier_block *nb, unsigned long type,
 		mlx5_nb_cof(nb, struct mlx5_vhca_page_tracker, nb);
 	struct mlx5vf_pci_core_device *mvdev = container_of(
 		tracker, struct mlx5vf_pci_core_device, tracker);
+	struct mlx5_eqe_obj_change *object;
 	struct mlx5_eqe *eqe = data;
 	u8 event_type = (u8)type;
 	u8 queue_type;
+	u32 obj_id;
 	int qp_num;
 
 	switch (event_type) {
@@ -975,6 +1006,12 @@  static int mlx5vf_event_notifier(struct notifier_block *nb, unsigned long type,
 			break;
 		set_tracker_error(mvdev);
 		break;
+	case MLX5_EVENT_TYPE_OBJECT_CHANGE:
+		object = &eqe->data.obj_change;
+		obj_id = be32_to_cpu(object->obj_id);
+		if (obj_id == tracker->id)
+			set_tracker_event(mvdev);
+		break;
 	default:
 		break;
 	}
@@ -1634,6 +1671,11 @@  int mlx5vf_tracker_read_and_clear(struct vfio_device *vdev, unsigned long iova,
 		goto end;
 	}
 
+	if (tracker->is_err) {
+		err = -EIO;
+		goto end;
+	}
+
 	mdev = mvdev->mdev;
 	err = mlx5vf_cmd_modify_tracker(mdev, tracker->id, iova, length,
 					MLX5_PAGE_TRACK_STATE_REPORTING);
@@ -1652,6 +1694,12 @@  int mlx5vf_tracker_read_and_clear(struct vfio_device *vdev, unsigned long iova,
 						      dirty, &tracker->status);
 			if (poll_err == CQ_EMPTY) {
 				wait_for_completion(&mvdev->tracker_comp);
+				if (tracker->event_occur) {
+					tracker->event_occur = false;
+					err = mlx5vf_cmd_query_tracker(mdev, tracker);
+					if (err)
+						goto end;
+				}
 				continue;
 			}
 		}
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index f2c7227fa683..09d0ed6e2345 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -162,6 +162,7 @@  struct mlx5_vhca_page_tracker {
 	u32 id;
 	u32 pdn;
 	u8 is_err:1;
+	u8 event_occur:1;
 	struct mlx5_uars_page *uar;
 	struct mlx5_vhca_cq cq;
 	struct mlx5_vhca_qp *host_qp;