drm/dp_mst: Clear MSG_RDY flag before sending new message

Message ID	20230418060905.4078976-1-Wayne.Lin@amd.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB03.amd.com; pr=C From: Wayne Lin <Wayne.Lin@amd.com> To: <dri-devel@lists.freedesktop.org>, <amd-gfx@lists.freedesktop.org> Subject: [PATCH] drm/dp_mst: Clear MSG_RDY flag before sending new message Date: Tue, 18 Apr 2023 14:09:05 +0800 Message-ID: <20230418060905.4078976-1-Wayne.Lin@amd.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain Precedence: list Cc: jani.nikula@intel.com, stable@vger.kernel.org, jerry.zuo@amd.com, Wayne Lin <Wayne.Lin@amd.com> Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
Series	drm/dp_mst: Clear MSG_RDY flag before sending new message \| expand drm/dp_mst: Clear MSG_RDY flag before sending new message

Lin, Wayne April 18, 2023, 6:09 a.m. UTC

[Why & How]
The sequence for collecting down_reply/up_request from source
perspective should be:

Request_n->repeat (get partial reply of Request_n->clear message ready
flag to ack DPRX that the message is received) till all partial
replies for Request_n are received->new Request_n+1.

While assembling partial reply packets, reading out DPCD DOWN_REP
Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be
wrapped up as a complete operation for reading out a reply packet.
Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might
be risky. e.g. If the reply of the new request has overwritten the
DPRX DOWN_REP Sideband MSG buffer before source writing ack to clear
DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply
for the new request. Should handle the up request in the same way.

In drm_dp_mst_hpd_irq(), we don't clear MSG_RDY flag before caliing
drm_dp_mst_kick_tx(). Fix that.

Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Cc: stable@vger.kernel.org
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 ++
 drivers/gpu/drm/display/drm_dp_mst_topology.c | 22 +++++++++++++++++++
 drivers/gpu/drm/i915/display/intel_dp.c       |  3 +++
 drivers/gpu/drm/nouveau/dispnv50/disp.c       |  2 ++
 4 files changed, 29 insertions(+)

Jani Nikula April 18, 2023, 8:52 a.m. UTC | #1

On Tue, 18 Apr 2023, Wayne Lin <Wayne.Lin@amd.com> wrote:
> [Why & How]
> The sequence for collecting down_reply/up_request from source
> perspective should be:
>
> Request_n->repeat (get partial reply of Request_n->clear message ready
> flag to ack DPRX that the message is received) till all partial
> replies for Request_n are received->new Request_n+1.
>
> While assembling partial reply packets, reading out DPCD DOWN_REP
> Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be
> wrapped up as a complete operation for reading out a reply packet.
> Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might
> be risky. e.g. If the reply of the new request has overwritten the
> DPRX DOWN_REP Sideband MSG buffer before source writing ack to clear
> DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply
> for the new request. Should handle the up request in the same way.
>
> In drm_dp_mst_hpd_irq(), we don't clear MSG_RDY flag before caliing
> drm_dp_mst_kick_tx(). Fix that.
>
> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
> Cc: stable@vger.kernel.org
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 ++
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 22 +++++++++++++++++++
>  drivers/gpu/drm/i915/display/intel_dp.c       |  3 +++
>  drivers/gpu/drm/nouveau/dispnv50/disp.c       |  2 ++
>  4 files changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 77277d90b6e2..5313a5656598 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3166,6 +3166,8 @@ static void dm_handle_mst_sideband_msg(struct amdgpu_dm_connector *aconnector)
>  			for (retry = 0; retry < 3; retry++) {
>  				uint8_t wret;
>  
> +				/* MSG_RDY ack is done in drm*/
> +				esi[1] &= ~(DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);

Why do the masking within the retry loop?

>  				wret = drm_dp_dpcd_write(
>  					&aconnector->dm_dp_aux.aux,
>  					dpcd_addr + 1,
> diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> index 51a46689cda7..02aad713c67c 100644
> --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> @@ -4054,6 +4054,9 @@ int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl
>  {
>  	int ret = 0;
>  	int sc;
> +	const int tosend = 1;
> +	int retries = 0;
> +	u8 buf = 0;

All of these should be in tighter scope.

>  	*handled = false;
>  	sc = DP_GET_SINK_COUNT(esi[0]);
>  
> @@ -4072,6 +4075,25 @@ int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl
>  		*handled = true;
>  	}
>  
> +	if (*handled) {

That should check for DP_DOWN_REP_MSG_RDY and DP_UP_REQ_MSG_RDY only,
right? If those are not set, we didn't do anything with them, and should
not ack.

> +		buf = esi[1] & (DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);
> +		do {
> +			ret = drm_dp_dpcd_write(mgr->aux,
> +						DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0,
> +						&buf,
> +						tosend);

We should probably have a helper function to do the acking, similar to
intel_dp_ack_sink_irq_esi(), which could be used both by this function
and the drivers.

> +
> +			if (ret == tosend)
> +				break;
> +
> +			retries++;
> +		} while (retries < 5);

Please don't use a do-while when a for loop is sufficient.

	for (tries = 0; tries < 5; tries++)

and it's obvious at a glance how many times at most this runs. Not so
with a do-while where you count *re-tries*. Again, would be nice to
abstract this away in a helper function.

> +
> +		if (ret != tosend)
> +			drm_dbg_kms(mgr->dev, "failed to write dpcd 0x%x\n",
> +				    DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0);
> +	}
> +
>  	drm_dp_mst_kick_tx(mgr);
>  	return ret;
>  }
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> index bf80f296a8fd..abec3de38b66 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -3939,6 +3939,9 @@ intel_dp_check_mst_status(struct intel_dp *intel_dp)
>  		if (!memchr_inv(ack, 0, sizeof(ack)))
>  			break;
>  
> +		/* MSG_RDY ack is done in drm*/
> +		ack[1] &= ~(DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);

Above we check if there's anything to ack and bail out, and now this
clears the bits but writes them anyway.

I think the handled parameter was problematic before, but now it's even
more convoluted. What does it indicate? It used to mean you need to ack
if it's set, but now it's something different. This function is getting
very difficult to use correctly.

BR,
Jani.



> +
>  		if (!intel_dp_ack_sink_irq_esi(intel_dp, ack))
>  			drm_dbg_kms(&i915->drm, "Failed to ack ESI\n");
>  	}
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> index edcb2529b402..e905987104ed 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> @@ -1336,6 +1336,8 @@ nv50_mstm_service(struct nouveau_drm *drm,
>  		if (!handled)
>  			break;
>  
> +		/* MSG_RDY ack is done in drm*/
> +		esi[1] &= ~(DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);
>  		rc = drm_dp_dpcd_write(aux, DP_SINK_COUNT_ESI + 1, &esi[1],
>  				       3);

Same here, this acks even if it's already been acked.

>  		if (rc != 3) {

Lin, Wayne April 18, 2023, 9:42 a.m. UTC | #2

[Public]

Hi Jani Nikula,

Appreciate your time and feedback! Will adjust the patch.
Some comments inline.

> -----Original Message-----
> From: Jani Nikula <jani.nikula@intel.com>
> Sent: Tuesday, April 18, 2023 4:53 PM
> To: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
> amd-gfx@lists.freedesktop.org
> Cc: lyude@redhat.com; imre.deak@intel.com; ville.syrjala@linux.intel.com;
> Wentland, Harry <Harry.Wentland@amd.com>; Zuo, Jerry
> <Jerry.Zuo@amd.com>; Lin, Wayne <Wayne.Lin@amd.com>;
> stable@vger.kernel.org
> Subject: Re: [PATCH] drm/dp_mst: Clear MSG_RDY flag before sending new
> message
> 
> On Tue, 18 Apr 2023, Wayne Lin <Wayne.Lin@amd.com> wrote:
> > [Why & How]
> > The sequence for collecting down_reply/up_request from source
> > perspective should be:
> >
> > Request_n->repeat (get partial reply of Request_n->clear message ready
> > flag to ack DPRX that the message is received) till all partial
> > replies for Request_n are received->new Request_n+1.
> >
> > While assembling partial reply packets, reading out DPCD DOWN_REP
> > Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be
> wrapped
> > up as a complete operation for reading out a reply packet.
> > Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might
> > be risky. e.g. If the reply of the new request has overwritten the
> > DPRX DOWN_REP Sideband MSG buffer before source writing ack to clear
> > DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply
> > for the new request. Should handle the up request in the same way.
> >
> > In drm_dp_mst_hpd_irq(), we don't clear MSG_RDY flag before caliing
> > drm_dp_mst_kick_tx(). Fix that.
> >
> > Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 ++
> > drivers/gpu/drm/display/drm_dp_mst_topology.c | 22
> +++++++++++++++++++
> >  drivers/gpu/drm/i915/display/intel_dp.c       |  3 +++
> >  drivers/gpu/drm/nouveau/dispnv50/disp.c       |  2 ++
> >  4 files changed, 29 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 77277d90b6e2..5313a5656598 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -3166,6 +3166,8 @@ static void dm_handle_mst_sideband_msg(struct
> amdgpu_dm_connector *aconnector)
> >  			for (retry = 0; retry < 3; retry++) {
> >  				uint8_t wret;
> >
> > +				/* MSG_RDY ack is done in drm*/
> > +				esi[1] &= ~(DP_DOWN_REP_MSG_RDY |
> DP_UP_REQ_MSG_RDY);
> 
> Why do the masking within the retry loop?
> 
> >  				wret = drm_dp_dpcd_write(
> >  					&aconnector->dm_dp_aux.aux,
> >  					dpcd_addr + 1,
> > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > index 51a46689cda7..02aad713c67c 100644
> > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > @@ -4054,6 +4054,9 @@ int drm_dp_mst_hpd_irq(struct
> > drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl  {
> >  	int ret = 0;
> >  	int sc;
> > +	const int tosend = 1;
> > +	int retries = 0;
> > +	u8 buf = 0;
> 
> All of these should be in tighter scope.
> 
> >  	*handled = false;
> >  	sc = DP_GET_SINK_COUNT(esi[0]);
> >
> > @@ -4072,6 +4075,25 @@ int drm_dp_mst_hpd_irq(struct
> drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl
> >  		*handled = true;
> >  	}
> >
> > +	if (*handled) {
> 
> That should check for DP_DOWN_REP_MSG_RDY and
> DP_UP_REQ_MSG_RDY only, right? If those are not set, we didn't do
> anything with them, and should not ack.

Right. I was thinking the sink count change will accompany the CSN
up request message. I'll change it to be more clear. Thanks.
> 
> > +		buf = esi[1] & (DP_DOWN_REP_MSG_RDY |
> DP_UP_REQ_MSG_RDY);
> > +		do {
> > +			ret = drm_dp_dpcd_write(mgr->aux,
> > +
> 	DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0,
> > +						&buf,
> > +						tosend);
> 
> We should probably have a helper function to do the acking, similar to
> intel_dp_ack_sink_irq_esi(), which could be used both by this function and
> the drivers.
> 
> > +
> > +			if (ret == tosend)
> > +				break;
> > +
> > +			retries++;
> > +		} while (retries < 5);
> 
> Please don't use a do-while when a for loop is sufficient.
> 
> 	for (tries = 0; tries < 5; tries++)
> 
> and it's obvious at a glance how many times at most this runs. Not so with a
> do-while where you count *re-tries*. Again, would be nice to abstract this
> away in a helper function.
> 
> > +
> > +		if (ret != tosend)
> > +			drm_dbg_kms(mgr->dev, "failed to write dpcd
> 0x%x\n",
> > +				    DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0);
> > +	}
> > +
> >  	drm_dp_mst_kick_tx(mgr);
> >  	return ret;
> >  }
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
> > b/drivers/gpu/drm/i915/display/intel_dp.c
> > index bf80f296a8fd..abec3de38b66 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -3939,6 +3939,9 @@ intel_dp_check_mst_status(struct intel_dp
> *intel_dp)
> >  		if (!memchr_inv(ack, 0, sizeof(ack)))
> >  			break;
> >
> > +		/* MSG_RDY ack is done in drm*/
> > +		ack[1] &= ~(DP_DOWN_REP_MSG_RDY |
> DP_UP_REQ_MSG_RDY);
> 
> Above we check if there's anything to ack and bail out, and now this clears
> the bits but writes them anyway.
> 
> I think the handled parameter was problematic before, but now it's even
> more convoluted. What does it indicate? It used to mean you need to ack if
> it's set, but now it's something different. This function is getting very difficult
> to use correctly.

My plan was to ack message events within drm_dp_mst_hpd_irq() since the
events are handled there. There are still CP_IRQ and LINK_STATUS_CHANGED
events above get handled in intel_dp_check_mst_status(), so I intended to
mask DP_DOWN_REP_MSG_RDY/DP_UP_REQ_MSG_RDY, and ack
CP_IRQ/LINK_STATUS_CHANGED here.
> 
> BR,
> Jani.
> 
> 
> 
> > +
> >  		if (!intel_dp_ack_sink_irq_esi(intel_dp, ack))
> >  			drm_dbg_kms(&i915->drm, "Failed to ack ESI\n");
> >  	}
> > diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > index edcb2529b402..e905987104ed 100644
> > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > @@ -1336,6 +1336,8 @@ nv50_mstm_service(struct nouveau_drm *drm,
> >  		if (!handled)
> >  			break;
> >
> > +		/* MSG_RDY ack is done in drm*/
> > +		esi[1] &= ~(DP_DOWN_REP_MSG_RDY |
> DP_UP_REQ_MSG_RDY);
> >  		rc = drm_dp_dpcd_write(aux, DP_SINK_COUNT_ESI + 1,
> &esi[1],
> >  				       3);
> 
> Same here, this acks even if it's already been acked.
> 
> >  		if (rc != 3) {
> 
> --
> Jani Nikula, Intel Open Source Graphics Center

--
Regards,
Wayne Lin

Jani Nikula April 18, 2023, 11:58 a.m. UTC | #3

On Tue, 18 Apr 2023, "Lin, Wayne" <Wayne.Lin@amd.com> wrote:
> [Public]
>
> Hi Jani Nikula,
>
> Appreciate your time and feedback! Will adjust the patch.
> Some comments inline.
>
>> -----Original Message-----
>> From: Jani Nikula <jani.nikula@intel.com>
>> Sent: Tuesday, April 18, 2023 4:53 PM
>> To: Lin, Wayne <Wayne.Lin@amd.com>; dri-devel@lists.freedesktop.org;
>> amd-gfx@lists.freedesktop.org
>> Cc: lyude@redhat.com; imre.deak@intel.com; ville.syrjala@linux.intel.com;
>> Wentland, Harry <Harry.Wentland@amd.com>; Zuo, Jerry
>> <Jerry.Zuo@amd.com>; Lin, Wayne <Wayne.Lin@amd.com>;
>> stable@vger.kernel.org
>> Subject: Re: [PATCH] drm/dp_mst: Clear MSG_RDY flag before sending new
>> message
>> 
>> On Tue, 18 Apr 2023, Wayne Lin <Wayne.Lin@amd.com> wrote:
>> > [Why & How]
>> > The sequence for collecting down_reply/up_request from source
>> > perspective should be:
>> >
>> > Request_n->repeat (get partial reply of Request_n->clear message ready
>> > flag to ack DPRX that the message is received) till all partial
>> > replies for Request_n are received->new Request_n+1.
>> >
>> > While assembling partial reply packets, reading out DPCD DOWN_REP
>> > Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be
>> wrapped
>> > up as a complete operation for reading out a reply packet.
>> > Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might
>> > be risky. e.g. If the reply of the new request has overwritten the
>> > DPRX DOWN_REP Sideband MSG buffer before source writing ack to clear
>> > DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply
>> > for the new request. Should handle the up request in the same way.
>> >
>> > In drm_dp_mst_hpd_irq(), we don't clear MSG_RDY flag before caliing
>> > drm_dp_mst_kick_tx(). Fix that.
>> >
>> > Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
>> > Cc: stable@vger.kernel.org
>> > ---
>> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 ++
>> > drivers/gpu/drm/display/drm_dp_mst_topology.c | 22
>> +++++++++++++++++++
>> >  drivers/gpu/drm/i915/display/intel_dp.c       |  3 +++
>> >  drivers/gpu/drm/nouveau/dispnv50/disp.c       |  2 ++
>> >  4 files changed, 29 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> > index 77277d90b6e2..5313a5656598 100644
>> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> > @@ -3166,6 +3166,8 @@ static void dm_handle_mst_sideband_msg(struct
>> amdgpu_dm_connector *aconnector)
>> >  			for (retry = 0; retry < 3; retry++) {
>> >  				uint8_t wret;
>> >
>> > +				/* MSG_RDY ack is done in drm*/
>> > +				esi[1] &= ~(DP_DOWN_REP_MSG_RDY |
>> DP_UP_REQ_MSG_RDY);
>> 
>> Why do the masking within the retry loop?
>> 
>> >  				wret = drm_dp_dpcd_write(
>> >  					&aconnector->dm_dp_aux.aux,
>> >  					dpcd_addr + 1,
>> > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > b/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > index 51a46689cda7..02aad713c67c 100644
>> > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > @@ -4054,6 +4054,9 @@ int drm_dp_mst_hpd_irq(struct
>> > drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl  {
>> >  	int ret = 0;
>> >  	int sc;
>> > +	const int tosend = 1;
>> > +	int retries = 0;
>> > +	u8 buf = 0;
>> 
>> All of these should be in tighter scope.
>> 
>> >  	*handled = false;
>> >  	sc = DP_GET_SINK_COUNT(esi[0]);
>> >
>> > @@ -4072,6 +4075,25 @@ int drm_dp_mst_hpd_irq(struct
>> drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl
>> >  		*handled = true;
>> >  	}
>> >
>> > +	if (*handled) {
>> 
>> That should check for DP_DOWN_REP_MSG_RDY and
>> DP_UP_REQ_MSG_RDY only, right? If those are not set, we didn't do
>> anything with them, and should not ack.
>
> Right. I was thinking the sink count change will accompany the CSN
> up request message. I'll change it to be more clear. Thanks.
>> 
>> > +		buf = esi[1] & (DP_DOWN_REP_MSG_RDY |
>> DP_UP_REQ_MSG_RDY);
>> > +		do {
>> > +			ret = drm_dp_dpcd_write(mgr->aux,
>> > +
>> 	DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0,
>> > +						&buf,
>> > +						tosend);
>> 
>> We should probably have a helper function to do the acking, similar to
>> intel_dp_ack_sink_irq_esi(), which could be used both by this function and
>> the drivers.
>> 
>> > +
>> > +			if (ret == tosend)
>> > +				break;
>> > +
>> > +			retries++;
>> > +		} while (retries < 5);
>> 
>> Please don't use a do-while when a for loop is sufficient.
>> 
>> 	for (tries = 0; tries < 5; tries++)
>> 
>> and it's obvious at a glance how many times at most this runs. Not so with a
>> do-while where you count *re-tries*. Again, would be nice to abstract this
>> away in a helper function.
>> 
>> > +
>> > +		if (ret != tosend)
>> > +			drm_dbg_kms(mgr->dev, "failed to write dpcd
>> 0x%x\n",
>> > +				    DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0);
>> > +	}
>> > +
>> >  	drm_dp_mst_kick_tx(mgr);
>> >  	return ret;
>> >  }
>> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
>> > b/drivers/gpu/drm/i915/display/intel_dp.c
>> > index bf80f296a8fd..abec3de38b66 100644
>> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
>> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
>> > @@ -3939,6 +3939,9 @@ intel_dp_check_mst_status(struct intel_dp
>> *intel_dp)
>> >  		if (!memchr_inv(ack, 0, sizeof(ack)))
>> >  			break;
>> >
>> > +		/* MSG_RDY ack is done in drm*/
>> > +		ack[1] &= ~(DP_DOWN_REP_MSG_RDY |
>> DP_UP_REQ_MSG_RDY);
>> 
>> Above we check if there's anything to ack and bail out, and now this clears
>> the bits but writes them anyway.
>> 
>> I think the handled parameter was problematic before, but now it's even
>> more convoluted. What does it indicate? It used to mean you need to ack if
>> it's set, but now it's something different. This function is getting very difficult
>> to use correctly.
>
> My plan was to ack message events within drm_dp_mst_hpd_irq() since the
> events are handled there. There are still CP_IRQ and LINK_STATUS_CHANGED
> events above get handled in intel_dp_check_mst_status(), so I intended to
> mask DP_DOWN_REP_MSG_RDY/DP_UP_REQ_MSG_RDY, and ack
> CP_IRQ/LINK_STATUS_CHANGED here.

I get it, but if DP_DOWN_REP_MSG_RDY or DP_UP_REQ_MSG_RDY were the only
events to ack, and they were already acked in drm_dp_mst_hpd_irq(), we
should not do an extra "nop" ack.

The caller of drm_dp_mst_hpd_irq() needs to be able to conveniently
figure out what to ack, and what to not ack. And without duplicating the
logic within drm_dp_mst_hpd_irq().

BR,
Jani.



>> 
>> BR,
>> Jani.
>> 
>> 
>> 
>> > +
>> >  		if (!intel_dp_ack_sink_irq_esi(intel_dp, ack))
>> >  			drm_dbg_kms(&i915->drm, "Failed to ack ESI\n");
>> >  	}
>> > diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> > b/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> > index edcb2529b402..e905987104ed 100644
>> > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
>> > @@ -1336,6 +1336,8 @@ nv50_mstm_service(struct nouveau_drm *drm,
>> >  		if (!handled)
>> >  			break;
>> >
>> > +		/* MSG_RDY ack is done in drm*/
>> > +		esi[1] &= ~(DP_DOWN_REP_MSG_RDY |
>> DP_UP_REQ_MSG_RDY);
>> >  		rc = drm_dp_dpcd_write(aux, DP_SINK_COUNT_ESI + 1,
>> &esi[1],
>> >  				       3);
>> 
>> Same here, this acks even if it's already been acked.
>> 
>> >  		if (rc != 3) {
>> 
>> --
>> Jani Nikula, Intel Open Source Graphics Center
>
> --
> Regards,
> Wayne Lin

Ville Syrjälä April 18, 2023, 2:01 p.m. UTC | #4

On Tue, Apr 18, 2023 at 02:09:05PM +0800, Wayne Lin wrote:
> [Why & How]
> The sequence for collecting down_reply/up_request from source
> perspective should be:
> 
> Request_n->repeat (get partial reply of Request_n->clear message ready
> flag to ack DPRX that the message is received) till all partial
> replies for Request_n are received->new Request_n+1.
> 
> While assembling partial reply packets, reading out DPCD DOWN_REP
> Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be
> wrapped up as a complete operation for reading out a reply packet.
> Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might
> be risky. e.g. If the reply of the new request has overwritten the
> DPRX DOWN_REP Sideband MSG buffer before source writing ack to clear
> DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply
> for the new request. Should handle the up request in the same way.
> 
> In drm_dp_mst_hpd_irq(), we don't clear MSG_RDY flag before caliing
> drm_dp_mst_kick_tx(). Fix that.
> 
> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
> Cc: stable@vger.kernel.org
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 ++
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 22 +++++++++++++++++++
>  drivers/gpu/drm/i915/display/intel_dp.c       |  3 +++
>  drivers/gpu/drm/nouveau/dispnv50/disp.c       |  2 ++
>  4 files changed, 29 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 77277d90b6e2..5313a5656598 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3166,6 +3166,8 @@ static void dm_handle_mst_sideband_msg(struct amdgpu_dm_connector *aconnector)
>  			for (retry = 0; retry < 3; retry++) {
>  				uint8_t wret;
>  
> +				/* MSG_RDY ack is done in drm*/
> +				esi[1] &= ~(DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);
>  				wret = drm_dp_dpcd_write(
>  					&aconnector->dm_dp_aux.aux,
>  					dpcd_addr + 1,
> diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> index 51a46689cda7..02aad713c67c 100644
> --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> @@ -4054,6 +4054,9 @@ int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl
>  {
>  	int ret = 0;
>  	int sc;
> +	const int tosend = 1;
> +	int retries = 0;
> +	u8 buf = 0;
>  	*handled = false;
>  	sc = DP_GET_SINK_COUNT(esi[0]);
>  
> @@ -4072,6 +4075,25 @@ int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl
>  		*handled = true;
>  	}
>  
> +	if (*handled) {
> +		buf = esi[1] & (DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);
> +		do {
> +			ret = drm_dp_dpcd_write(mgr->aux,
> +						DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0,
> +						&buf,
> +						tosend);
> +
> +			if (ret == tosend)
> +				break;
> +
> +			retries++;
> +		} while (retries < 5);

What's with this magic retry loop?

Not sure I like the whole thing though. Splitting the irq ack
semi-randomly between driver vs. multiple helpers doesn't feel
great to me.

As a whole the HPD_IRQ handling is a total mess atm. At some point
I was trying to sketch something a bit better for it. The approach
I was thinking was something along the lines of:

 u8 vector[...];
 drm_dp_read_irq_vector(vector);
 ... handle all irqs/etc., calling suitable helpers as needed
 drm_dp_clear_irq_vector(vector);

And I was also thinking that this drm_dp_*_irq_vector() stuff
would always use the ESI layout, converting as needed from/to
the old layout for pre-1.2 (or whatever the cutoff was) devices.
That way drivers would just need the one codepath.

> +
> +		if (ret != tosend)
> +			drm_dbg_kms(mgr->dev, "failed to write dpcd 0x%x\n",
> +				    DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0);
> +	}
> +
>  	drm_dp_mst_kick_tx(mgr);
>  	return ret;
>  }
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
> index bf80f296a8fd..abec3de38b66 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -3939,6 +3939,9 @@ intel_dp_check_mst_status(struct intel_dp *intel_dp)
>  		if (!memchr_inv(ack, 0, sizeof(ack)))
>  			break;
>  
> +		/* MSG_RDY ack is done in drm*/
> +		ack[1] &= ~(DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);
> +
>  		if (!intel_dp_ack_sink_irq_esi(intel_dp, ack))
>  			drm_dbg_kms(&i915->drm, "Failed to ack ESI\n");
>  	}
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> index edcb2529b402..e905987104ed 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> @@ -1336,6 +1336,8 @@ nv50_mstm_service(struct nouveau_drm *drm,
>  		if (!handled)
>  			break;
>  
> +		/* MSG_RDY ack is done in drm*/
> +		esi[1] &= ~(DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);
>  		rc = drm_dp_dpcd_write(aux, DP_SINK_COUNT_ESI + 1, &esi[1],
>  				       3);
>  		if (rc != 3) {
> -- 
> 2.37.3

Lin, Wayne April 21, 2023, 1:13 p.m. UTC | #5

[Public]

Much appreciated, Ville and Jani!

To tackle this MST message ack event now, probably I could just pull out the 
drm_dp_mst_kick_tx() out of drm_dp_mst_hpd_irq() and make it the second 
step function to handle mst hpd irq? Would like to know your thoughts : )

Again, thanks for your time!

Regards,
Wayne Lin

> -----Original Message-----
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Sent: Tuesday, April 18, 2023 10:01 PM
> To: Lin, Wayne <Wayne.Lin@amd.com>
> Cc: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org;
> lyude@redhat.com; imre.deak@intel.com; jani.nikula@intel.com; Wentland,
> Harry <Harry.Wentland@amd.com>; Zuo, Jerry <Jerry.Zuo@amd.com>;
> stable@vger.kernel.org
> Subject: Re: [PATCH] drm/dp_mst: Clear MSG_RDY flag before sending new
> message
> 
> On Tue, Apr 18, 2023 at 02:09:05PM +0800, Wayne Lin wrote:
> > [Why & How]
> > The sequence for collecting down_reply/up_request from source
> > perspective should be:
> >
> > Request_n->repeat (get partial reply of Request_n->clear message ready
> > flag to ack DPRX that the message is received) till all partial
> > replies for Request_n are received->new Request_n+1.
> >
> > While assembling partial reply packets, reading out DPCD DOWN_REP
> > Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be
> wrapped
> > up as a complete operation for reading out a reply packet.
> > Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might
> > be risky. e.g. If the reply of the new request has overwritten the
> > DPRX DOWN_REP Sideband MSG buffer before source writing ack to clear
> > DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply
> > for the new request. Should handle the up request in the same way.
> >
> > In drm_dp_mst_hpd_irq(), we don't clear MSG_RDY flag before caliing
> > drm_dp_mst_kick_tx(). Fix that.
> >
> > Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  2 ++
> > drivers/gpu/drm/display/drm_dp_mst_topology.c | 22
> +++++++++++++++++++
> >  drivers/gpu/drm/i915/display/intel_dp.c       |  3 +++
> >  drivers/gpu/drm/nouveau/dispnv50/disp.c       |  2 ++
> >  4 files changed, 29 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > index 77277d90b6e2..5313a5656598 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> > @@ -3166,6 +3166,8 @@ static void dm_handle_mst_sideband_msg(struct
> amdgpu_dm_connector *aconnector)
> >  			for (retry = 0; retry < 3; retry++) {
> >  				uint8_t wret;
> >
> > +				/* MSG_RDY ack is done in drm*/
> > +				esi[1] &= ~(DP_DOWN_REP_MSG_RDY |
> DP_UP_REQ_MSG_RDY);
> >  				wret = drm_dp_dpcd_write(
> >  					&aconnector->dm_dp_aux.aux,
> >  					dpcd_addr + 1,
> > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > index 51a46689cda7..02aad713c67c 100644
> > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > @@ -4054,6 +4054,9 @@ int drm_dp_mst_hpd_irq(struct
> > drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl  {
> >  	int ret = 0;
> >  	int sc;
> > +	const int tosend = 1;
> > +	int retries = 0;
> > +	u8 buf = 0;
> >  	*handled = false;
> >  	sc = DP_GET_SINK_COUNT(esi[0]);
> >
> > @@ -4072,6 +4075,25 @@ int drm_dp_mst_hpd_irq(struct
> drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl
> >  		*handled = true;
> >  	}
> >
> > +	if (*handled) {
> > +		buf = esi[1] & (DP_DOWN_REP_MSG_RDY |
> DP_UP_REQ_MSG_RDY);
> > +		do {
> > +			ret = drm_dp_dpcd_write(mgr->aux,
> > +
> 	DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0,
> > +						&buf,
> > +						tosend);
> > +
> > +			if (ret == tosend)
> > +				break;
> > +
> > +			retries++;
> > +		} while (retries < 5);
> 
> What's with this magic retry loop?
> 
> Not sure I like the whole thing though. Splitting the irq ack semi-randomly
> between driver vs. multiple helpers doesn't feel great to me.
> 
> As a whole the HPD_IRQ handling is a total mess atm. At some point I was
> trying to sketch something a bit better for it. The approach I was thinking was
> something along the lines of:
> 
>  u8 vector[...];
>  drm_dp_read_irq_vector(vector);
>  ... handle all irqs/etc., calling suitable helpers as needed
> drm_dp_clear_irq_vector(vector);
> 
> And I was also thinking that this drm_dp_*_irq_vector() stuff would always
> use the ESI layout, converting as needed from/to the old layout for pre-1.2
> (or whatever the cutoff was) devices.
> That way drivers would just need the one codepath.
> 
> > +
> > +		if (ret != tosend)
> > +			drm_dbg_kms(mgr->dev, "failed to write dpcd
> 0x%x\n",
> > +				    DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0);
> > +	}
> > +
> >  	drm_dp_mst_kick_tx(mgr);
> >  	return ret;
> >  }
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
> > b/drivers/gpu/drm/i915/display/intel_dp.c
> > index bf80f296a8fd..abec3de38b66 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -3939,6 +3939,9 @@ intel_dp_check_mst_status(struct intel_dp
> *intel_dp)
> >  		if (!memchr_inv(ack, 0, sizeof(ack)))
> >  			break;
> >
> > +		/* MSG_RDY ack is done in drm*/
> > +		ack[1] &= ~(DP_DOWN_REP_MSG_RDY |
> DP_UP_REQ_MSG_RDY);
> > +
> >  		if (!intel_dp_ack_sink_irq_esi(intel_dp, ack))
> >  			drm_dbg_kms(&i915->drm, "Failed to ack ESI\n");
> >  	}
> > diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > index edcb2529b402..e905987104ed 100644
> > --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
> > @@ -1336,6 +1336,8 @@ nv50_mstm_service(struct nouveau_drm *drm,
> >  		if (!handled)
> >  			break;
> >
> > +		/* MSG_RDY ack is done in drm*/
> > +		esi[1] &= ~(DP_DOWN_REP_MSG_RDY |
> DP_UP_REQ_MSG_RDY);
> >  		rc = drm_dp_dpcd_write(aux, DP_SINK_COUNT_ESI + 1,
> &esi[1],
> >  				       3);
> >  		if (rc != 3) {
> > --
> > 2.37.3
> 
> --
> Ville Syrjälä
> Intel

drm/dp_mst: Clear MSG_RDY flag before sending new message

Commit Message

Comments

Patch