diff mbox series

drm/dp: Don't attempt AUX transfers when eDP panels are not powered

Message ID 20240202141109.1.I24277520ac754ea538c9b14578edc94e1df11b48@changeid (mailing list archive)
State New, archived
Headers show
Series drm/dp: Don't attempt AUX transfers when eDP panels are not powered | expand

Commit Message

Doug Anderson Feb. 2, 2024, 10:11 p.m. UTC
If an eDP panel is not powered on then any attempts to talk to it over
the DP AUX channel will timeout. Unfortunately these attempts may be
quite slow. Userspace can initiate these attempts either via a
/dev/drm_dp_auxN device or via the created i2c device.

Making the DP AUX drivers timeout faster is a difficult proposition.
In theory we could just poll the panel's HPD line in the AUX transfer
function and immediately return an error there. However, this is
easier said than done. For one thing, there's no hard requirement to
hook the HPD line up for eDP panels and it's OK to just delay a fixed
amount. For another thing, the HPD line may not be fast to probe. On
parade-ps8640 we need to wait for the bridge chip's firmware to boot
before we can get the HPD line and this is a slow process.

The fact that the transfers are taking so long to timeout is causing
real problems. The open source fwupd daemon sometimes scans DP busses
looking for devices whose firmware need updating. If it happens to
scan while a panel is turned off this scan can take a long time. The
fwupd daemon could try to be smarter and only scan when eDP panels are
turned on, but we can also improve the behavior in the kernel.

Let's let eDP panels drivers specify that a panel is turned off and
then modify the common AUX transfer code not to attempt a transfer in
this case.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/gpu/drm/display/drm_dp_helper.c       | 35 +++++++++++++++++++
 drivers/gpu/drm/panel/panel-edp.c             |  3 ++
 .../gpu/drm/panel/panel-samsung-atna33xc20.c  |  2 ++
 include/drm/display/drm_dp_helper.h           |  6 ++++
 4 files changed, 46 insertions(+)

Comments

Jani Nikula Feb. 5, 2024, 11:17 a.m. UTC | #1
On Fri, 02 Feb 2024, Douglas Anderson <dianders@chromium.org> wrote:
> If an eDP panel is not powered on then any attempts to talk to it over
> the DP AUX channel will timeout. Unfortunately these attempts may be
> quite slow. Userspace can initiate these attempts either via a
> /dev/drm_dp_auxN device or via the created i2c device.
>
> Making the DP AUX drivers timeout faster is a difficult proposition.
> In theory we could just poll the panel's HPD line in the AUX transfer
> function and immediately return an error there. However, this is
> easier said than done. For one thing, there's no hard requirement to
> hook the HPD line up for eDP panels and it's OK to just delay a fixed
> amount. For another thing, the HPD line may not be fast to probe. On
> parade-ps8640 we need to wait for the bridge chip's firmware to boot
> before we can get the HPD line and this is a slow process.
>
> The fact that the transfers are taking so long to timeout is causing
> real problems. The open source fwupd daemon sometimes scans DP busses
> looking for devices whose firmware need updating. If it happens to
> scan while a panel is turned off this scan can take a long time. The
> fwupd daemon could try to be smarter and only scan when eDP panels are
> turned on, but we can also improve the behavior in the kernel.
>
> Let's let eDP panels drivers specify that a panel is turned off and
> then modify the common AUX transfer code not to attempt a transfer in
> this case.

I guess my question is, why not make the aux->transfer function handle
the powered down case and return the appropriate error?

For example, the transfer hook in i915 checks whether the display is
connected and bails out early if not.

Having to track and set the state all over the place seems more
complicated to me than dynamically checking where needed i.e. in the
transfer hook.


BR,
Jani.

>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
>  drivers/gpu/drm/display/drm_dp_helper.c       | 35 +++++++++++++++++++
>  drivers/gpu/drm/panel/panel-edp.c             |  3 ++
>  .../gpu/drm/panel/panel-samsung-atna33xc20.c  |  2 ++
>  include/drm/display/drm_dp_helper.h           |  6 ++++
>  4 files changed, 46 insertions(+)
>
> diff --git a/drivers/gpu/drm/display/drm_dp_helper.c b/drivers/gpu/drm/display/drm_dp_helper.c
> index b1ca3a1100da..6fa705d82c8f 100644
> --- a/drivers/gpu/drm/display/drm_dp_helper.c
> +++ b/drivers/gpu/drm/display/drm_dp_helper.c
> @@ -532,6 +532,15 @@ static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request,
>  
>  	mutex_lock(&aux->hw_mutex);
>  
> +	/*
> +	 * If the device attached to the aux bus is powered down then there's
> +	 * no reason to attempt a transfer. Error out immediately.
> +	 */
> +	if (aux->powered_down) {
> +		ret = -EBUSY;
> +		goto unlock;
> +	}
> +
>  	/*
>  	 * The specification doesn't give any recommendation on how often to
>  	 * retry native transactions. We used to retry 7 times like for
> @@ -599,6 +608,29 @@ int drm_dp_dpcd_probe(struct drm_dp_aux *aux, unsigned int offset)
>  }
>  EXPORT_SYMBOL(drm_dp_dpcd_probe);
>  
> +/**
> + * drm_dp_dpcd_set_powered() - Set whether the DP device is powered
> + * @aux: DisplayPort AUX channel; for convenience it's OK to pass NULL here
> + *       and the function will be a no-op.
> + * @powered: true if powered; false if not
> + *
> + * If the endpoint device on the DP AUX bus is known to be powered down
> + * then this function can be called to make future transfers fail immediately
> + * instead of needing to time out.
> + *
> + * If this function is never called then a device defaults to being powered.
> + */
> +void drm_dp_dpcd_set_powered(struct drm_dp_aux *aux, bool powered)
> +{
> +	if (!aux)
> +		return;
> +
> +	mutex_lock(&aux->hw_mutex);
> +	aux->powered_down = !powered;
> +	mutex_unlock(&aux->hw_mutex);
> +}
> +EXPORT_SYMBOL(drm_dp_dpcd_set_powered);
> +
>  /**
>   * drm_dp_dpcd_read() - read a series of bytes from the DPCD
>   * @aux: DisplayPort AUX channel (SST or MST)
> @@ -1858,6 +1890,9 @@ static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
>  	struct drm_dp_aux_msg msg;
>  	int err = 0;
>  
> +	if (aux->powered_down)
> +		return -EBUSY;
> +
>  	dp_aux_i2c_transfer_size = clamp(dp_aux_i2c_transfer_size, 1, DP_AUX_MAX_PAYLOAD_BYTES);
>  
>  	memset(&msg, 0, sizeof(msg));
> diff --git a/drivers/gpu/drm/panel/panel-edp.c b/drivers/gpu/drm/panel/panel-edp.c
> index 7d556b1bfa82..d2a4e514d8fd 100644
> --- a/drivers/gpu/drm/panel/panel-edp.c
> +++ b/drivers/gpu/drm/panel/panel-edp.c
> @@ -413,6 +413,7 @@ static int panel_edp_suspend(struct device *dev)
>  {
>  	struct panel_edp *p = dev_get_drvdata(dev);
>  
> +	drm_dp_dpcd_set_powered(p->aux, false);
>  	gpiod_set_value_cansleep(p->enable_gpio, 0);
>  	regulator_disable(p->supply);
>  	p->unprepared_time = ktime_get_boottime();
> @@ -469,6 +470,7 @@ static int panel_edp_prepare_once(struct panel_edp *p)
>  	}
>  
>  	gpiod_set_value_cansleep(p->enable_gpio, 1);
> +	drm_dp_dpcd_set_powered(p->aux, true);
>  
>  	p->powered_on_time = ktime_get_boottime();
>  
> @@ -507,6 +509,7 @@ static int panel_edp_prepare_once(struct panel_edp *p)
>  	return 0;
>  
>  error:
> +	drm_dp_dpcd_set_powered(p->aux, false);
>  	gpiod_set_value_cansleep(p->enable_gpio, 0);
>  	regulator_disable(p->supply);
>  	p->unprepared_time = ktime_get_boottime();
> diff --git a/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c b/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
> index 5703f4712d96..76c2a8f6718c 100644
> --- a/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
> +++ b/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
> @@ -72,6 +72,7 @@ static int atana33xc20_suspend(struct device *dev)
>  	if (p->el3_was_on)
>  		atana33xc20_wait(p->el_on3_off_time, 150);
>  
> +	drm_dp_dpcd_set_powered(p->aux, false);
>  	ret = regulator_disable(p->supply);
>  	if (ret)
>  		return ret;
> @@ -93,6 +94,7 @@ static int atana33xc20_resume(struct device *dev)
>  	ret = regulator_enable(p->supply);
>  	if (ret)
>  		return ret;
> +	drm_dp_dpcd_set_powered(p->aux, true);
>  	p->powered_on_time = ktime_get_boottime();
>  
>  	if (p->no_hpd) {
> diff --git a/include/drm/display/drm_dp_helper.h b/include/drm/display/drm_dp_helper.h
> index 863b2e7add29..472359a9d675 100644
> --- a/include/drm/display/drm_dp_helper.h
> +++ b/include/drm/display/drm_dp_helper.h
> @@ -463,9 +463,15 @@ struct drm_dp_aux {
>  	 * @is_remote: Is this AUX CH actually using sideband messaging.
>  	 */
>  	bool is_remote;
> +
> +	/**
> +	 * @powered_down: If true then the remote endpoint is powered down.
> +	 */
> +	bool powered_down;
>  };
>  
>  int drm_dp_dpcd_probe(struct drm_dp_aux *aux, unsigned int offset);
> +void drm_dp_dpcd_set_powered(struct drm_dp_aux *aux, bool powered);
>  ssize_t drm_dp_dpcd_read(struct drm_dp_aux *aux, unsigned int offset,
>  			 void *buffer, size_t size);
>  ssize_t drm_dp_dpcd_write(struct drm_dp_aux *aux, unsigned int offset,
Doug Anderson Feb. 5, 2024, 4:23 p.m. UTC | #2
Hi,

On Mon, Feb 5, 2024 at 3:17 AM Jani Nikula <jani.nikula@intel.com> wrote:
>
> On Fri, 02 Feb 2024, Douglas Anderson <dianders@chromium.org> wrote:
> > If an eDP panel is not powered on then any attempts to talk to it over
> > the DP AUX channel will timeout. Unfortunately these attempts may be
> > quite slow. Userspace can initiate these attempts either via a
> > /dev/drm_dp_auxN device or via the created i2c device.
> >
> > Making the DP AUX drivers timeout faster is a difficult proposition.
> > In theory we could just poll the panel's HPD line in the AUX transfer
> > function and immediately return an error there. However, this is
> > easier said than done. For one thing, there's no hard requirement to
> > hook the HPD line up for eDP panels and it's OK to just delay a fixed
> > amount. For another thing, the HPD line may not be fast to probe. On
> > parade-ps8640 we need to wait for the bridge chip's firmware to boot
> > before we can get the HPD line and this is a slow process.
> >
> > The fact that the transfers are taking so long to timeout is causing
> > real problems. The open source fwupd daemon sometimes scans DP busses
> > looking for devices whose firmware need updating. If it happens to
> > scan while a panel is turned off this scan can take a long time. The
> > fwupd daemon could try to be smarter and only scan when eDP panels are
> > turned on, but we can also improve the behavior in the kernel.
> >
> > Let's let eDP panels drivers specify that a panel is turned off and
> > then modify the common AUX transfer code not to attempt a transfer in
> > this case.
>
> I guess my question is, why not make the aux->transfer function handle
> the powered down case and return the appropriate error?

The basic problem is that the aux->transfer() function doesn't have
knowledge of the power state of the eDP panel and that's the component
whose power state matters here. The aux->transfer() function is
implemented by the DP controller driver and can't access the internal
state of the eDP panel driver.

The traditional solution here would be to use the "HPD" pin to know if
the DP device is powered and ready to accept AUX commands. That works
fine for external DP devices where HPD is mandatory, but it has issues
for eDP as talked about in my commit description. If nothing else, the
eDP spec lists "HPD" as optional. In addition to that, however, we
have additional difficulties for eDP because of the "connected but not
powered on" state that eDP panels can be in. For DP if you see HPD you
know that the device is connected+powered on. If you don't see HPD
then the device is disconnected and/or powered off. For eDP you may
power off components (like the controller / eDP panel) when the device
is still physically connected and that adds complexities.

Another possible solution someone could come up with would be to use
the DRM state of the DP controller driver and fail all AUX commands if
the DP controller isn't powered. Unfortunately this doesn't work
either. Specifically at panel probe time we need to do AUX transfers
even though the full DRM bridge isn't powered. We had many discussions
about this on the mailing lists when coming up with the generic eDP
panel driver and this is fairly well documented in the kernel-docs of
the transfer() function in "struct drm_dp_aux". As documented, an eDP
controller driver needs to return an error for transfer() if the panel
isn't powered, but nothing says that it needs to do it quickly. The
slowness is what we're trying to solve here.


> For example, the transfer hook in i915 checks whether the display is
> connected and bails out early if not.

I'm not massively familiar with the i915 code, so I'd love a pointer
to the exact code you're referring to. I took a quick look and found a
Type-C specific check in intel_dp_aux_xfer(). That doesn't help here
since we're not Type-C, though the comments do back up my argument
that the long timeouts are something worth avoiding. After that I
don't see anything obvious so I'd love a pointer.


> Having to track and set the state all over the place seems more
> complicated to me than dynamically checking where needed i.e. in the
> transfer hook.

Huh. I was actually surprised by how simple/straightforward my patch
was compared to how ugly I thought it was going to be. I guess it's
just a different perspective? Specifically it can be noted that there
aren't many distinct eDP panel drivers out there since all but one of
them was able to use the generic "panel-edp.c". However, there are
quite a number of eDP controller drivers, especially considering all
the bridge chips out there. That means that this short patch means we
don't need to add weird logic/hacks to all of the eDP controller
drivers...

-Doug
Hsin-Yi Wang Feb. 14, 2024, 6:24 a.m. UTC | #3
On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> If an eDP panel is not powered on then any attempts to talk to it over
> the DP AUX channel will timeout. Unfortunately these attempts may be
> quite slow. Userspace can initiate these attempts either via a
> /dev/drm_dp_auxN device or via the created i2c device.
>
> Making the DP AUX drivers timeout faster is a difficult proposition.
> In theory we could just poll the panel's HPD line in the AUX transfer
> function and immediately return an error there. However, this is
> easier said than done. For one thing, there's no hard requirement to
> hook the HPD line up for eDP panels and it's OK to just delay a fixed
> amount. For another thing, the HPD line may not be fast to probe. On
> parade-ps8640 we need to wait for the bridge chip's firmware to boot
> before we can get the HPD line and this is a slow process.
>
> The fact that the transfers are taking so long to timeout is causing
> real problems. The open source fwupd daemon sometimes scans DP busses
> looking for devices whose firmware need updating. If it happens to
> scan while a panel is turned off this scan can take a long time. The
> fwupd daemon could try to be smarter and only scan when eDP panels are
> turned on, but we can also improve the behavior in the kernel.
>
> Let's let eDP panels drivers specify that a panel is turned off and
> then modify the common AUX transfer code not to attempt a transfer in
> this case.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---

Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>

>
>  drivers/gpu/drm/display/drm_dp_helper.c       | 35 +++++++++++++++++++
>  drivers/gpu/drm/panel/panel-edp.c             |  3 ++
>  .../gpu/drm/panel/panel-samsung-atna33xc20.c  |  2 ++
>  include/drm/display/drm_dp_helper.h           |  6 ++++
>  4 files changed, 46 insertions(+)
>
> diff --git a/drivers/gpu/drm/display/drm_dp_helper.c b/drivers/gpu/drm/display/drm_dp_helper.c
> index b1ca3a1100da..6fa705d82c8f 100644
> --- a/drivers/gpu/drm/display/drm_dp_helper.c
> +++ b/drivers/gpu/drm/display/drm_dp_helper.c
> @@ -532,6 +532,15 @@ static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request,
>
>         mutex_lock(&aux->hw_mutex);
>
> +       /*
> +        * If the device attached to the aux bus is powered down then there's
> +        * no reason to attempt a transfer. Error out immediately.
> +        */
> +       if (aux->powered_down) {
> +               ret = -EBUSY;
> +               goto unlock;
> +       }
> +
>         /*
>          * The specification doesn't give any recommendation on how often to
>          * retry native transactions. We used to retry 7 times like for
> @@ -599,6 +608,29 @@ int drm_dp_dpcd_probe(struct drm_dp_aux *aux, unsigned int offset)
>  }
>  EXPORT_SYMBOL(drm_dp_dpcd_probe);
>
> +/**
> + * drm_dp_dpcd_set_powered() - Set whether the DP device is powered
> + * @aux: DisplayPort AUX channel; for convenience it's OK to pass NULL here
> + *       and the function will be a no-op.
> + * @powered: true if powered; false if not
> + *
> + * If the endpoint device on the DP AUX bus is known to be powered down
> + * then this function can be called to make future transfers fail immediately
> + * instead of needing to time out.
> + *
> + * If this function is never called then a device defaults to being powered.
> + */
> +void drm_dp_dpcd_set_powered(struct drm_dp_aux *aux, bool powered)
> +{
> +       if (!aux)
> +               return;
> +
> +       mutex_lock(&aux->hw_mutex);
> +       aux->powered_down = !powered;
> +       mutex_unlock(&aux->hw_mutex);
> +}
> +EXPORT_SYMBOL(drm_dp_dpcd_set_powered);
> +
>  /**
>   * drm_dp_dpcd_read() - read a series of bytes from the DPCD
>   * @aux: DisplayPort AUX channel (SST or MST)
> @@ -1858,6 +1890,9 @@ static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
>         struct drm_dp_aux_msg msg;
>         int err = 0;
>
> +       if (aux->powered_down)
> +               return -EBUSY;
> +
>         dp_aux_i2c_transfer_size = clamp(dp_aux_i2c_transfer_size, 1, DP_AUX_MAX_PAYLOAD_BYTES);
>
>         memset(&msg, 0, sizeof(msg));
> diff --git a/drivers/gpu/drm/panel/panel-edp.c b/drivers/gpu/drm/panel/panel-edp.c
> index 7d556b1bfa82..d2a4e514d8fd 100644
> --- a/drivers/gpu/drm/panel/panel-edp.c
> +++ b/drivers/gpu/drm/panel/panel-edp.c
> @@ -413,6 +413,7 @@ static int panel_edp_suspend(struct device *dev)
>  {
>         struct panel_edp *p = dev_get_drvdata(dev);
>
> +       drm_dp_dpcd_set_powered(p->aux, false);
>         gpiod_set_value_cansleep(p->enable_gpio, 0);
>         regulator_disable(p->supply);
>         p->unprepared_time = ktime_get_boottime();
> @@ -469,6 +470,7 @@ static int panel_edp_prepare_once(struct panel_edp *p)
>         }
>
>         gpiod_set_value_cansleep(p->enable_gpio, 1);
> +       drm_dp_dpcd_set_powered(p->aux, true);
>
>         p->powered_on_time = ktime_get_boottime();
>
> @@ -507,6 +509,7 @@ static int panel_edp_prepare_once(struct panel_edp *p)
>         return 0;
>
>  error:
> +       drm_dp_dpcd_set_powered(p->aux, false);
>         gpiod_set_value_cansleep(p->enable_gpio, 0);
>         regulator_disable(p->supply);
>         p->unprepared_time = ktime_get_boottime();
> diff --git a/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c b/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
> index 5703f4712d96..76c2a8f6718c 100644
> --- a/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
> +++ b/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
> @@ -72,6 +72,7 @@ static int atana33xc20_suspend(struct device *dev)
>         if (p->el3_was_on)
>                 atana33xc20_wait(p->el_on3_off_time, 150);
>
> +       drm_dp_dpcd_set_powered(p->aux, false);
>         ret = regulator_disable(p->supply);
>         if (ret)
>                 return ret;
> @@ -93,6 +94,7 @@ static int atana33xc20_resume(struct device *dev)
>         ret = regulator_enable(p->supply);
>         if (ret)
>                 return ret;
> +       drm_dp_dpcd_set_powered(p->aux, true);
>         p->powered_on_time = ktime_get_boottime();
>
>         if (p->no_hpd) {
> diff --git a/include/drm/display/drm_dp_helper.h b/include/drm/display/drm_dp_helper.h
> index 863b2e7add29..472359a9d675 100644
> --- a/include/drm/display/drm_dp_helper.h
> +++ b/include/drm/display/drm_dp_helper.h
> @@ -463,9 +463,15 @@ struct drm_dp_aux {
>          * @is_remote: Is this AUX CH actually using sideband messaging.
>          */
>         bool is_remote;
> +
> +       /**
> +        * @powered_down: If true then the remote endpoint is powered down.
> +        */
> +       bool powered_down;
>  };
>
>  int drm_dp_dpcd_probe(struct drm_dp_aux *aux, unsigned int offset);
> +void drm_dp_dpcd_set_powered(struct drm_dp_aux *aux, bool powered);
>  ssize_t drm_dp_dpcd_read(struct drm_dp_aux *aux, unsigned int offset,
>                          void *buffer, size_t size);
>  ssize_t drm_dp_dpcd_write(struct drm_dp_aux *aux, unsigned int offset,
> --
> 2.43.0.594.gd9cf4e227d-goog
>
Doug Anderson Feb. 14, 2024, 11:07 p.m. UTC | #4
Hi,

On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
>
> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
> >
> > If an eDP panel is not powered on then any attempts to talk to it over
> > the DP AUX channel will timeout. Unfortunately these attempts may be
> > quite slow. Userspace can initiate these attempts either via a
> > /dev/drm_dp_auxN device or via the created i2c device.
> >
> > Making the DP AUX drivers timeout faster is a difficult proposition.
> > In theory we could just poll the panel's HPD line in the AUX transfer
> > function and immediately return an error there. However, this is
> > easier said than done. For one thing, there's no hard requirement to
> > hook the HPD line up for eDP panels and it's OK to just delay a fixed
> > amount. For another thing, the HPD line may not be fast to probe. On
> > parade-ps8640 we need to wait for the bridge chip's firmware to boot
> > before we can get the HPD line and this is a slow process.
> >
> > The fact that the transfers are taking so long to timeout is causing
> > real problems. The open source fwupd daemon sometimes scans DP busses
> > looking for devices whose firmware need updating. If it happens to
> > scan while a panel is turned off this scan can take a long time. The
> > fwupd daemon could try to be smarter and only scan when eDP panels are
> > turned on, but we can also improve the behavior in the kernel.
> >
> > Let's let eDP panels drivers specify that a panel is turned off and
> > then modify the common AUX transfer code not to attempt a transfer in
> > this case.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
>
> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>

Thanks for the review!

Given that this touches core DRM code and that I never got
confirmation that Jani's concerns were addressed with my previous
response, I'm still going to wait a little while before applying. I'm
on vacation for most of next week, but if there are no further replies
between now and then I'll plan to apply this to "drm-misc-next" the
week of Feb 26th. If someone else wants to apply this before I do then
I certainly won't object. Jani: if you feel this needs more discussion
or otherwise object to this patch landing then please yell. Likewise
if anyone else in the community wants to throw in their opinion, feel
free.

-Doug
Jani Nikula Feb. 15, 2024, 10:24 a.m. UTC | #5
On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
> Hi,
>
> On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
>>
>> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
>> >
>> > If an eDP panel is not powered on then any attempts to talk to it over
>> > the DP AUX channel will timeout. Unfortunately these attempts may be
>> > quite slow. Userspace can initiate these attempts either via a
>> > /dev/drm_dp_auxN device or via the created i2c device.
>> >
>> > Making the DP AUX drivers timeout faster is a difficult proposition.
>> > In theory we could just poll the panel's HPD line in the AUX transfer
>> > function and immediately return an error there. However, this is
>> > easier said than done. For one thing, there's no hard requirement to
>> > hook the HPD line up for eDP panels and it's OK to just delay a fixed
>> > amount. For another thing, the HPD line may not be fast to probe. On
>> > parade-ps8640 we need to wait for the bridge chip's firmware to boot
>> > before we can get the HPD line and this is a slow process.
>> >
>> > The fact that the transfers are taking so long to timeout is causing
>> > real problems. The open source fwupd daemon sometimes scans DP busses
>> > looking for devices whose firmware need updating. If it happens to
>> > scan while a panel is turned off this scan can take a long time. The
>> > fwupd daemon could try to be smarter and only scan when eDP panels are
>> > turned on, but we can also improve the behavior in the kernel.
>> >
>> > Let's let eDP panels drivers specify that a panel is turned off and
>> > then modify the common AUX transfer code not to attempt a transfer in
>> > this case.
>> >
>> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
>> > ---
>>
>> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
>
> Thanks for the review!
>
> Given that this touches core DRM code and that I never got
> confirmation that Jani's concerns were addressed with my previous
> response, I'm still going to wait a little while before applying. I'm
> on vacation for most of next week, but if there are no further replies
> between now and then I'll plan to apply this to "drm-misc-next" the
> week of Feb 26th. If someone else wants to apply this before I do then
> I certainly won't object. Jani: if you feel this needs more discussion
> or otherwise object to this patch landing then please yell. Likewise
> if anyone else in the community wants to throw in their opinion, feel
> free.

Sorry for dropping the ball after my initial response. I simply have not
had the time to look into this.

It would be great to get, say, drm-misc maintainer ack on this before
merging. It's not fair for me to stall this any longer, I'll trust their
judgement.

Reasonable?


BR,
Jani.
Doug Anderson Feb. 15, 2024, 3:08 p.m. UTC | #6
Hi,

On Thu, Feb 15, 2024 at 2:24 AM Jani Nikula <jani.nikula@intel.com> wrote:
>
> On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
> > Hi,
> >
> > On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
> >>
> >> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
> >> >
> >> > If an eDP panel is not powered on then any attempts to talk to it over
> >> > the DP AUX channel will timeout. Unfortunately these attempts may be
> >> > quite slow. Userspace can initiate these attempts either via a
> >> > /dev/drm_dp_auxN device or via the created i2c device.
> >> >
> >> > Making the DP AUX drivers timeout faster is a difficult proposition.
> >> > In theory we could just poll the panel's HPD line in the AUX transfer
> >> > function and immediately return an error there. However, this is
> >> > easier said than done. For one thing, there's no hard requirement to
> >> > hook the HPD line up for eDP panels and it's OK to just delay a fixed
> >> > amount. For another thing, the HPD line may not be fast to probe. On
> >> > parade-ps8640 we need to wait for the bridge chip's firmware to boot
> >> > before we can get the HPD line and this is a slow process.
> >> >
> >> > The fact that the transfers are taking so long to timeout is causing
> >> > real problems. The open source fwupd daemon sometimes scans DP busses
> >> > looking for devices whose firmware need updating. If it happens to
> >> > scan while a panel is turned off this scan can take a long time. The
> >> > fwupd daemon could try to be smarter and only scan when eDP panels are
> >> > turned on, but we can also improve the behavior in the kernel.
> >> >
> >> > Let's let eDP panels drivers specify that a panel is turned off and
> >> > then modify the common AUX transfer code not to attempt a transfer in
> >> > this case.
> >> >
> >> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> >> > ---
> >>
> >> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
> >
> > Thanks for the review!
> >
> > Given that this touches core DRM code and that I never got
> > confirmation that Jani's concerns were addressed with my previous
> > response, I'm still going to wait a little while before applying. I'm
> > on vacation for most of next week, but if there are no further replies
> > between now and then I'll plan to apply this to "drm-misc-next" the
> > week of Feb 26th. If someone else wants to apply this before I do then
> > I certainly won't object. Jani: if you feel this needs more discussion
> > or otherwise object to this patch landing then please yell. Likewise
> > if anyone else in the community wants to throw in their opinion, feel
> > free.
>
> Sorry for dropping the ball after my initial response. I simply have not
> had the time to look into this.
>
> It would be great to get, say, drm-misc maintainer ack on this before
> merging. It's not fair for me to stall this any longer, I'll trust their
> judgement.
>
> Reasonable?

I'd be more than happy for one of the drm-misc maintainers to Ack.
I'll move Maxime, Thomas, and Maarten to the "To:" line to see if that
helps get through their filters.

-Doug
Neil Armstrong Feb. 15, 2024, 4:53 p.m. UTC | #7
Hi Doug,

On 15/02/2024 16:08, Doug Anderson wrote:
> Hi,
> 
> On Thu, Feb 15, 2024 at 2:24 AM Jani Nikula <jani.nikula@intel.com> wrote:
>>
>> On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
>>> Hi,
>>>
>>> On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
>>>>
>>>> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
>>>>>
>>>>> If an eDP panel is not powered on then any attempts to talk to it over
>>>>> the DP AUX channel will timeout. Unfortunately these attempts may be
>>>>> quite slow. Userspace can initiate these attempts either via a
>>>>> /dev/drm_dp_auxN device or via the created i2c device.
>>>>>
>>>>> Making the DP AUX drivers timeout faster is a difficult proposition.
>>>>> In theory we could just poll the panel's HPD line in the AUX transfer
>>>>> function and immediately return an error there. However, this is
>>>>> easier said than done. For one thing, there's no hard requirement to
>>>>> hook the HPD line up for eDP panels and it's OK to just delay a fixed
>>>>> amount. For another thing, the HPD line may not be fast to probe. On
>>>>> parade-ps8640 we need to wait for the bridge chip's firmware to boot
>>>>> before we can get the HPD line and this is a slow process.
>>>>>
>>>>> The fact that the transfers are taking so long to timeout is causing
>>>>> real problems. The open source fwupd daemon sometimes scans DP busses
>>>>> looking for devices whose firmware need updating. If it happens to
>>>>> scan while a panel is turned off this scan can take a long time. The
>>>>> fwupd daemon could try to be smarter and only scan when eDP panels are
>>>>> turned on, but we can also improve the behavior in the kernel.
>>>>>
>>>>> Let's let eDP panels drivers specify that a panel is turned off and
>>>>> then modify the common AUX transfer code not to attempt a transfer in
>>>>> this case.
>>>>>
>>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>>>>> ---
>>>>
>>>> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
>>>
>>> Thanks for the review!
>>>
>>> Given that this touches core DRM code and that I never got
>>> confirmation that Jani's concerns were addressed with my previous
>>> response, I'm still going to wait a little while before applying. I'm
>>> on vacation for most of next week, but if there are no further replies
>>> between now and then I'll plan to apply this to "drm-misc-next" the
>>> week of Feb 26th. If someone else wants to apply this before I do then
>>> I certainly won't object. Jani: if you feel this needs more discussion
>>> or otherwise object to this patch landing then please yell. Likewise
>>> if anyone else in the community wants to throw in their opinion, feel
>>> free.
>>
>> Sorry for dropping the ball after my initial response. I simply have not
>> had the time to look into this.
>>
>> It would be great to get, say, drm-misc maintainer ack on this before
>> merging. It's not fair for me to stall this any longer, I'll trust their
>> judgement.
>>
>> Reasonable?
> 
> I'd be more than happy for one of the drm-misc maintainers to Ack.
> I'll move Maxime, Thomas, and Maarten to the "To:" line to see if that
> helps get through their filters.

I'll like some test reports to be sure it doesn't break anything,
then I'll be happy to give my ack !

Thanks,
Neil

> 
> -Doug
Doug Anderson Feb. 15, 2024, 5:08 p.m. UTC | #8
Hi,

On Thu, Feb 15, 2024 at 8:53 AM Neil Armstrong
<neil.armstrong@linaro.org> wrote:
>
> Hi Doug,
>
> On 15/02/2024 16:08, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Feb 15, 2024 at 2:24 AM Jani Nikula <jani.nikula@intel.com> wrote:
> >>
> >> On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
> >>> Hi,
> >>>
> >>> On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
> >>>>
> >>>> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
> >>>>>
> >>>>> If an eDP panel is not powered on then any attempts to talk to it over
> >>>>> the DP AUX channel will timeout. Unfortunately these attempts may be
> >>>>> quite slow. Userspace can initiate these attempts either via a
> >>>>> /dev/drm_dp_auxN device or via the created i2c device.
> >>>>>
> >>>>> Making the DP AUX drivers timeout faster is a difficult proposition.
> >>>>> In theory we could just poll the panel's HPD line in the AUX transfer
> >>>>> function and immediately return an error there. However, this is
> >>>>> easier said than done. For one thing, there's no hard requirement to
> >>>>> hook the HPD line up for eDP panels and it's OK to just delay a fixed
> >>>>> amount. For another thing, the HPD line may not be fast to probe. On
> >>>>> parade-ps8640 we need to wait for the bridge chip's firmware to boot
> >>>>> before we can get the HPD line and this is a slow process.
> >>>>>
> >>>>> The fact that the transfers are taking so long to timeout is causing
> >>>>> real problems. The open source fwupd daemon sometimes scans DP busses
> >>>>> looking for devices whose firmware need updating. If it happens to
> >>>>> scan while a panel is turned off this scan can take a long time. The
> >>>>> fwupd daemon could try to be smarter and only scan when eDP panels are
> >>>>> turned on, but we can also improve the behavior in the kernel.
> >>>>>
> >>>>> Let's let eDP panels drivers specify that a panel is turned off and
> >>>>> then modify the common AUX transfer code not to attempt a transfer in
> >>>>> this case.
> >>>>>
> >>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> >>>>> ---
> >>>>
> >>>> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
> >>>
> >>> Thanks for the review!
> >>>
> >>> Given that this touches core DRM code and that I never got
> >>> confirmation that Jani's concerns were addressed with my previous
> >>> response, I'm still going to wait a little while before applying. I'm
> >>> on vacation for most of next week, but if there are no further replies
> >>> between now and then I'll plan to apply this to "drm-misc-next" the
> >>> week of Feb 26th. If someone else wants to apply this before I do then
> >>> I certainly won't object. Jani: if you feel this needs more discussion
> >>> or otherwise object to this patch landing then please yell. Likewise
> >>> if anyone else in the community wants to throw in their opinion, feel
> >>> free.
> >>
> >> Sorry for dropping the ball after my initial response. I simply have not
> >> had the time to look into this.
> >>
> >> It would be great to get, say, drm-misc maintainer ack on this before
> >> merging. It's not fair for me to stall this any longer, I'll trust their
> >> judgement.
> >>
> >> Reasonable?
> >
> > I'd be more than happy for one of the drm-misc maintainers to Ack.
> > I'll move Maxime, Thomas, and Maarten to the "To:" line to see if that
> > helps get through their filters.
>
> I'll like some test reports to be sure it doesn't break anything,
> then I'll be happy to give my ack !

Sounds good. I know Eizan (CCed, but also a ChromeOS person) was going
to poke at it a bit and seemed willing to provide a Tested-by. I'll
let him chime in.

...but perhaps some better evidence that it's not breaking hardware is
that we actually landed this into one (but not all) ChromeOS kernel
trees [1] and it's rolled out to at least "beta" channel without
anyone screaming. Normally we like things to land upstream before
picking, but in this case we needed to land another patch to gather
in-field data [2] that would have made the problem even worse.

The kernel tree we landed on was the v5.15 tree, which is currently
serving all Qualcomm sc7180-based Chromebooks, all Mediatek 8173
Chromebooks and all Mediatek 8186 Chromebooks. There are also a pile
of x86 Chromebooks running our v5.15 kernel. This code shouldn't
affect them because (unless I'm mistaken) they don't use the two
affected panel drivers. In any case, I haven't heard any screams from
them either. Given my landing plans of "the week of the 26th", this
still gives another 1.5 weeks for any screams to reach my ears.

...or are you looking for non-ChromeOS test reports? I'm not sure how
to encourage those. I suppose sometimes folks at Red Hat end up
stumbling over similar panel problems to those of us in ChromeOS.
Maybe +Javier would be interested in providing a Tested-by?

[1] https://crrev.com/c/5277322
[2] https://crrev.com/c/5277736
Javier Martinez Canillas Feb. 16, 2024, 8:21 a.m. UTC | #9
Doug Anderson <dianders@chromium.org> writes:

Hello Doug,

> Hi,
>
> On Thu, Feb 15, 2024 at 8:53 AM Neil Armstrong
> <neil.armstrong@linaro.org> wrote:
>>
>> Hi Doug,
>>
>> On 15/02/2024 16:08, Doug Anderson wrote:
>> > Hi,
>> >
>> > On Thu, Feb 15, 2024 at 2:24 AM Jani Nikula <jani.nikula@intel.com> wrote:
>> >>
>> >> On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
>> >>> Hi,
>> >>>
>> >>> On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
>> >>>>
>> >>>> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
>> >>>>>
>> >>>>> If an eDP panel is not powered on then any attempts to talk to it over
>> >>>>> the DP AUX channel will timeout. Unfortunately these attempts may be
>> >>>>> quite slow. Userspace can initiate these attempts either via a
>> >>>>> /dev/drm_dp_auxN device or via the created i2c device.
>> >>>>>
>> >>>>> Making the DP AUX drivers timeout faster is a difficult proposition.
>> >>>>> In theory we could just poll the panel's HPD line in the AUX transfer
>> >>>>> function and immediately return an error there. However, this is
>> >>>>> easier said than done. For one thing, there's no hard requirement to
>> >>>>> hook the HPD line up for eDP panels and it's OK to just delay a fixed
>> >>>>> amount. For another thing, the HPD line may not be fast to probe. On
>> >>>>> parade-ps8640 we need to wait for the bridge chip's firmware to boot
>> >>>>> before we can get the HPD line and this is a slow process.
>> >>>>>

[...]

>
> The kernel tree we landed on was the v5.15 tree, which is currently
> serving all Qualcomm sc7180-based Chromebooks, all Mediatek 8173
> Chromebooks and all Mediatek 8186 Chromebooks. There are also a pile
> of x86 Chromebooks running our v5.15 kernel. This code shouldn't
> affect them because (unless I'm mistaken) they don't use the two
> affected panel drivers. In any case, I haven't heard any screams from
> them either. Given my landing plans of "the week of the 26th", this
> still gives another 1.5 weeks for any screams to reach my ears.
>
> ...or are you looking for non-ChromeOS test reports? I'm not sure how
> to encourage those. I suppose sometimes folks at Red Hat end up
> stumbling over similar panel problems to those of us in ChromeOS.
> Maybe +Javier would be interested in providing a Tested-by?
>

I do have a SC7180 based HP X2 Chromebook and could test your patch (not
today but I could do it early next week), although I haven't followed so
if you could please let me know what exactly do you want me to verify.

AFAIU the problem is that the fwupd daemon tries to scan DP busses even if
the panel is turned off and that's what takes a lot of time (due retries),
and your patch makes the driver to bail out immediately ? If that's the
case, I guess that just starting fwupd daemon with an without your patch
while the panel is turned off could be a good test ?

> [1] https://crrev.com/c/5277322
> [2] https://crrev.com/c/5277736
>
Neil Armstrong Feb. 16, 2024, 8:38 a.m. UTC | #10
+ Bjorn
+ Konrad
+ Johan

On 15/02/2024 18:08, Doug Anderson wrote:
> Hi,
> 
> On Thu, Feb 15, 2024 at 8:53 AM Neil Armstrong
> <neil.armstrong@linaro.org> wrote:
>>
>> Hi Doug,
>>
>> On 15/02/2024 16:08, Doug Anderson wrote:
>>> Hi,
>>>
>>> On Thu, Feb 15, 2024 at 2:24 AM Jani Nikula <jani.nikula@intel.com> wrote:
>>>>
>>>> On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
>>>>> Hi,
>>>>>
>>>>> On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
>>>>>>
>>>>>> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
>>>>>>>
>>>>>>> If an eDP panel is not powered on then any attempts to talk to it over
>>>>>>> the DP AUX channel will timeout. Unfortunately these attempts may be
>>>>>>> quite slow. Userspace can initiate these attempts either via a
>>>>>>> /dev/drm_dp_auxN device or via the created i2c device.
>>>>>>>
>>>>>>> Making the DP AUX drivers timeout faster is a difficult proposition.
>>>>>>> In theory we could just poll the panel's HPD line in the AUX transfer
>>>>>>> function and immediately return an error there. However, this is
>>>>>>> easier said than done. For one thing, there's no hard requirement to
>>>>>>> hook the HPD line up for eDP panels and it's OK to just delay a fixed
>>>>>>> amount. For another thing, the HPD line may not be fast to probe. On
>>>>>>> parade-ps8640 we need to wait for the bridge chip's firmware to boot
>>>>>>> before we can get the HPD line and this is a slow process.
>>>>>>>
>>>>>>> The fact that the transfers are taking so long to timeout is causing
>>>>>>> real problems. The open source fwupd daemon sometimes scans DP busses
>>>>>>> looking for devices whose firmware need updating. If it happens to
>>>>>>> scan while a panel is turned off this scan can take a long time. The
>>>>>>> fwupd daemon could try to be smarter and only scan when eDP panels are
>>>>>>> turned on, but we can also improve the behavior in the kernel.
>>>>>>>
>>>>>>> Let's let eDP panels drivers specify that a panel is turned off and
>>>>>>> then modify the common AUX transfer code not to attempt a transfer in
>>>>>>> this case.
>>>>>>>
>>>>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>>>>>>> ---
>>>>>>
>>>>>> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
>>>>>
>>>>> Thanks for the review!
>>>>>
>>>>> Given that this touches core DRM code and that I never got
>>>>> confirmation that Jani's concerns were addressed with my previous
>>>>> response, I'm still going to wait a little while before applying. I'm
>>>>> on vacation for most of next week, but if there are no further replies
>>>>> between now and then I'll plan to apply this to "drm-misc-next" the
>>>>> week of Feb 26th. If someone else wants to apply this before I do then
>>>>> I certainly won't object. Jani: if you feel this needs more discussion
>>>>> or otherwise object to this patch landing then please yell. Likewise
>>>>> if anyone else in the community wants to throw in their opinion, feel
>>>>> free.
>>>>
>>>> Sorry for dropping the ball after my initial response. I simply have not
>>>> had the time to look into this.
>>>>
>>>> It would be great to get, say, drm-misc maintainer ack on this before
>>>> merging. It's not fair for me to stall this any longer, I'll trust their
>>>> judgement.
>>>>
>>>> Reasonable?
>>>
>>> I'd be more than happy for one of the drm-misc maintainers to Ack.
>>> I'll move Maxime, Thomas, and Maarten to the "To:" line to see if that
>>> helps get through their filters.
>>
>> I'll like some test reports to be sure it doesn't break anything,
>> then I'll be happy to give my ack !
> 
> Sounds good. I know Eizan (CCed, but also a ChromeOS person) was going
> to poke at it a bit and seemed willing to provide a Tested-by. I'll
> let him chime in.
> 
> ...but perhaps some better evidence that it's not breaking hardware is
> that we actually landed this into one (but not all) ChromeOS kernel
> trees [1] and it's rolled out to at least "beta" channel without
> anyone screaming. Normally we like things to land upstream before
> picking, but in this case we needed to land another patch to gather
> in-field data [2] that would have made the problem even worse.
> 
> The kernel tree we landed on was the v5.15 tree, which is currently
> serving all Qualcomm sc7180-based Chromebooks, all Mediatek 8173
> Chromebooks and all Mediatek 8186 Chromebooks. There are also a pile
> of x86 Chromebooks running our v5.15 kernel. This code shouldn't
> affect them because (unless I'm mistaken) they don't use the two
> affected panel drivers. In any case, I haven't heard any screams from
> them either. Given my landing plans of "the week of the 26th", this
> still gives another 1.5 weeks for any screams to reach my ears.
> 
> ...or are you looking for non-ChromeOS test reports? I'm not sure how
> to encourage those. I suppose sometimes folks at Red Hat end up
> stumbling over similar panel problems to those of us in ChromeOS.
> Maybe +Javier would be interested in providing a Tested-by?

Bjorn, Konrad, Johan,

Could one of you somehow try this on the X13s or other Laptop using eDP ?

Thanks,
Neil

> 
> [1] https://crrev.com/c/5277322
> [2] https://crrev.com/c/5277736
Doug Anderson Feb. 16, 2024, 3:30 p.m. UTC | #11
Hi,

On Fri, Feb 16, 2024 at 12:21 AM Javier Martinez Canillas
<javierm@redhat.com> wrote:
>
> > The kernel tree we landed on was the v5.15 tree, which is currently
> > serving all Qualcomm sc7180-based Chromebooks, all Mediatek 8173
> > Chromebooks and all Mediatek 8186 Chromebooks. There are also a pile
> > of x86 Chromebooks running our v5.15 kernel. This code shouldn't
> > affect them because (unless I'm mistaken) they don't use the two
> > affected panel drivers. In any case, I haven't heard any screams from
> > them either. Given my landing plans of "the week of the 26th", this
> > still gives another 1.5 weeks for any screams to reach my ears.
> >
> > ...or are you looking for non-ChromeOS test reports? I'm not sure how
> > to encourage those. I suppose sometimes folks at Red Hat end up
> > stumbling over similar panel problems to those of us in ChromeOS.
> > Maybe +Javier would be interested in providing a Tested-by?
> >
>
> I do have a SC7180 based HP X2 Chromebook and could test your patch (not
> today but I could do it early next week), although I haven't followed so
> if you could please let me know what exactly do you want me to verify.
>
> AFAIU the problem is that the fwupd daemon tries to scan DP busses even if
> the panel is turned off and that's what takes a lot of time (due retries),
> and your patch makes the driver to bail out immediately ? If that's the
> case, I guess that just starting fwupd daemon with an without your patch
> while the panel is turned off could be a good test ?

Sorry! I wasn't trying to sign you up for extra work. I'm not
convinced that any extra verification on a Chromebook is all that
valuable since that's pretty covered by the fact that we've already
pushed this patch out to real users on Chromebooks. I think Neil was
hoping for some confirmation that my patch didn't break someone else's
hardware. I think maybe good enough is if you have some type of
hardware that uses eDP and that you could verify that my patch does
break anything about it?

I'm not aware of anyone with extensive DP AUX character device usage.
I guess I thought of Javier because I remembered him at least also
using fwupd and some of the fwupd plugins try to talk to DP things
over the DP AUX character device.

If someone is really in a testing mood and wanted to stress the char
device, I guess something simple like "hexdump -C /dev/drm_dp_aux*"
for whatever eDP AUX is associated with eDP would at least do some
reading. You could stress turning the display on and off while doing
that with and without my patch. Presumably it will be better (error
out more quickly) with my patch.

If you wanted to stress the i2c path, you could do something like this
(the grep assumes you're using ti-sn65dsi86 as your eDP bridge chip,
but hopefully easy to adjust):

bus=$(i2cdetect -l | grep sn65 | sed 's/i2c-\([0-9]*\).*$/\1/')
i2cdump ${bus} 0x50 i

That should dump your EDID. Again it should error out quickly when the
panel is off after my patch but should start working again when the
panel is on.


Hmmm, thinking about all the above, I guess there is one case that
_could_ be broken by my patch. I really hope not, though. If someone
has a panel that's on an always-on rail or on a shared rail with some
other device (like the touchscreen) that's keeping the panel power on
then without my patch it would be possible to do DP AUX transactions
even when the panel was "off" from Linux's point of view. It would
have worked mostly due to luck, but now luck will run out and it will
stop working. I really hope nobody has userspace that is relying on
this, but I suppose it's always possible that somewhere, someone's
userspace is. If you are or know of someone who is then please shout.

-Doug
Eizan Miyamoto Feb. 19, 2024, 4:45 a.m. UTC | #12
Hiya,

On Fri, Feb 16, 2024 at 4:09 AM Doug Anderson <dianders@chromium.org> wrote:
[...snip...]
> Sounds good. I know Eizan (CCed, but also a ChromeOS person) was going
> to poke at it a bit and seemed willing to provide a Tested-by. I'll
> let him chime in.

Yup, I've tested this like so:

1. I started by slightly modifying a recent chromeos-5.15 kernel checked
   out to _right before_ the patch we are discussing to emit some
   tracing info to dmesg at entry/exit of auxdev_read_iter(). I
   installed it on a "tentacool" Corsola device.
2. I then rebooted the device and ran these commands:

   # dmesg -w &
   # while /bin/true; do echo -n "dpms: "; cat
/sys/devices/platform/soc/14000000.syscon/mediatek-drm.7.auto/drm/card0/card0-eDP-1/dpms;
dd if=/dev/drm_dp_aux1  count=1 of=/dev/null; sleep 30; done

And after a few minutes, I saw the following output:

dpms: On
[  435.603257] auxdev_open by pid 6327 inode num 108 dev 256901121
[  435.603369] start auxdev_read_iter by pid 6327
1+0 records in
1+0 records out
[  435.756547] finish auxdev_read_iter by pid 6327 status 512
[  435.756632] auxdev_release by pid 6327 inode num 108 dev 256901121
512 bytes copied, 0.153862 s, 3.3 kB/s
[  455.418637] [drm] mtk_crtc_ddp_hw_fini 459 event 0x0000000000000000
0xffffff80c0277080 0xffffff80c0277080
dpms: Off
[  465.775104] auxdev_open by pid 6399 inode num 108 dev 256901121
[  465.775218] start auxdev_read_iter by pid 6399
dd: error reading '/dev/drm_dp_aux1': Connection timed out
0+0 records in
0+0 records out
0 bytes copied, 16.6631 s, 0.0 kB/s
[  482.437762] finish auxdev_read_iter by pid 6399 status -110
[  482.438200] auxdev_release by pid 6399 inode num 108 dev 256901121

(OK, so what to look for in the above is the ETIMEDOUT returned by
auxdev_read_iter after about 17s when dpms was turned off.)

I then checked out the repo to the patch we are discussing and did the
same thing, and after a few minutes, I saw:

dpms: On
[  441.892692] auxdev_open by pid 6317 inode num 108 dev 256901121
[  441.892786] start auxdev_read_iter by pid 6317
1+0 records in
1+0 records out
512 bytes copied, 0.148004 s, 3.5 kB/s
[  442.040597] finish auxdev_read_iter by pid 6317 status 512
[  442.040652] auxdev_release by pid 6317 inode num 108 dev 256901121
[  455.395549] [drm] mtk_crtc_ddp_hw_fini 459 event 0x0000000000000000
0xffffff80c3993080 0xffffff80c3993080
dpms: Off
dd: error reading '/dev/drm_dp_aux1': Device or resource busy
0+0 records in
0+0 records out
0 bytes copied, 0.000241 s, 0.0 kB/s
[  472.055296] auxdev_open by pid 6439 inode num 108 dev 256901121
[  472.055388] start auxdev_read_iter by pid 6439
[  472.055421] finish auxdev_read_iter by pid 6439 status -16
[  472.055571] auxdev_release by pid 6439 inode num 108 dev 256901121

Tested-by: Eizan Miyamoto <eizan@chromium.org>
Steev Klimaszewski Feb. 26, 2024, 9:49 p.m. UTC | #13
On Fri, Feb 16, 2024 at 9:30 AM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Fri, Feb 16, 2024 at 12:21 AM Javier Martinez Canillas
> <javierm@redhat.com> wrote:
> >
> > > The kernel tree we landed on was the v5.15 tree, which is currently
> > > serving all Qualcomm sc7180-based Chromebooks, all Mediatek 8173
> > > Chromebooks and all Mediatek 8186 Chromebooks. There are also a pile
> > > of x86 Chromebooks running our v5.15 kernel. This code shouldn't
> > > affect them because (unless I'm mistaken) they don't use the two
> > > affected panel drivers. In any case, I haven't heard any screams from
> > > them either. Given my landing plans of "the week of the 26th", this
> > > still gives another 1.5 weeks for any screams to reach my ears.
> > >
> > > ...or are you looking for non-ChromeOS test reports? I'm not sure how
> > > to encourage those. I suppose sometimes folks at Red Hat end up
> > > stumbling over similar panel problems to those of us in ChromeOS.
> > > Maybe +Javier would be interested in providing a Tested-by?
> > >
> >
> > I do have a SC7180 based HP X2 Chromebook and could test your patch (not
> > today but I could do it early next week), although I haven't followed so
> > if you could please let me know what exactly do you want me to verify.
> >
> > AFAIU the problem is that the fwupd daemon tries to scan DP busses even if
> > the panel is turned off and that's what takes a lot of time (due retries),
> > and your patch makes the driver to bail out immediately ? If that's the
> > case, I guess that just starting fwupd daemon with an without your patch
> > while the panel is turned off could be a good test ?
>
> Sorry! I wasn't trying to sign you up for extra work. I'm not
> convinced that any extra verification on a Chromebook is all that
> valuable since that's pretty covered by the fact that we've already
> pushed this patch out to real users on Chromebooks. I think Neil was
> hoping for some confirmation that my patch didn't break someone else's
> hardware. I think maybe good enough is if you have some type of
> hardware that uses eDP and that you could verify that my patch does
> break anything about it?
>
> I'm not aware of anyone with extensive DP AUX character device usage.
> I guess I thought of Javier because I remembered him at least also
> using fwupd and some of the fwupd plugins try to talk to DP things
> over the DP AUX character device.
>
> If someone is really in a testing mood and wanted to stress the char
> device, I guess something simple like "hexdump -C /dev/drm_dp_aux*"
> for whatever eDP AUX is associated with eDP would at least do some
> reading. You could stress turning the display on and off while doing
> that with and without my patch. Presumably it will be better (error
> out more quickly) with my patch.
>
> If you wanted to stress the i2c path, you could do something like this
> (the grep assumes you're using ti-sn65dsi86 as your eDP bridge chip,
> but hopefully easy to adjust):
>
> bus=$(i2cdetect -l | grep sn65 | sed 's/i2c-\([0-9]*\).*$/\1/')
> i2cdump ${bus} 0x50 i
>
> That should dump your EDID. Again it should error out quickly when the
> panel is off after my patch but should start working again when the
> panel is on.
>
>
> Hmmm, thinking about all the above, I guess there is one case that
> _could_ be broken by my patch. I really hope not, though. If someone
> has a panel that's on an always-on rail or on a shared rail with some
> other device (like the touchscreen) that's keeping the panel power on
> then without my patch it would be possible to do DP AUX transactions
> even when the panel was "off" from Linux's point of view. It would
> have worked mostly due to luck, but now luck will run out and it will
> stop working. I really hope nobody has userspace that is relying on
> this, but I suppose it's always possible that somewhere, someone's
> userspace is. If you are or know of someone who is then please shout.
>
> -Doug

Tested on my Thinkpad X13s, with display on, I get the did when
hexdumping /dev/drm_dp_aux2, with display off I get device/resource
busy.
Tested-by: Steev Klimaszewski <steev@kali.org>
Doug Anderson Feb. 28, 2024, 4:40 p.m. UTC | #14
Neil,

On Thu, Feb 15, 2024 at 8:53 AM Neil Armstrong
<neil.armstrong@linaro.org> wrote:
>
> Hi Doug,
>
> On 15/02/2024 16:08, Doug Anderson wrote:
> > Hi,
> >
> > On Thu, Feb 15, 2024 at 2:24 AM Jani Nikula <jani.nikula@intel.com> wrote:
> >>
> >> On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
> >>> Hi,
> >>>
> >>> On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
> >>>>
> >>>> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
> >>>>>
> >>>>> If an eDP panel is not powered on then any attempts to talk to it over
> >>>>> the DP AUX channel will timeout. Unfortunately these attempts may be
> >>>>> quite slow. Userspace can initiate these attempts either via a
> >>>>> /dev/drm_dp_auxN device or via the created i2c device.
> >>>>>
> >>>>> Making the DP AUX drivers timeout faster is a difficult proposition.
> >>>>> In theory we could just poll the panel's HPD line in the AUX transfer
> >>>>> function and immediately return an error there. However, this is
> >>>>> easier said than done. For one thing, there's no hard requirement to
> >>>>> hook the HPD line up for eDP panels and it's OK to just delay a fixed
> >>>>> amount. For another thing, the HPD line may not be fast to probe. On
> >>>>> parade-ps8640 we need to wait for the bridge chip's firmware to boot
> >>>>> before we can get the HPD line and this is a slow process.
> >>>>>
> >>>>> The fact that the transfers are taking so long to timeout is causing
> >>>>> real problems. The open source fwupd daemon sometimes scans DP busses
> >>>>> looking for devices whose firmware need updating. If it happens to
> >>>>> scan while a panel is turned off this scan can take a long time. The
> >>>>> fwupd daemon could try to be smarter and only scan when eDP panels are
> >>>>> turned on, but we can also improve the behavior in the kernel.
> >>>>>
> >>>>> Let's let eDP panels drivers specify that a panel is turned off and
> >>>>> then modify the common AUX transfer code not to attempt a transfer in
> >>>>> this case.
> >>>>>
> >>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> >>>>> ---
> >>>>
> >>>> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
> >>>
> >>> Thanks for the review!
> >>>
> >>> Given that this touches core DRM code and that I never got
> >>> confirmation that Jani's concerns were addressed with my previous
> >>> response, I'm still going to wait a little while before applying. I'm
> >>> on vacation for most of next week, but if there are no further replies
> >>> between now and then I'll plan to apply this to "drm-misc-next" the
> >>> week of Feb 26th. If someone else wants to apply this before I do then
> >>> I certainly won't object. Jani: if you feel this needs more discussion
> >>> or otherwise object to this patch landing then please yell. Likewise
> >>> if anyone else in the community wants to throw in their opinion, feel
> >>> free.
> >>
> >> Sorry for dropping the ball after my initial response. I simply have not
> >> had the time to look into this.
> >>
> >> It would be great to get, say, drm-misc maintainer ack on this before
> >> merging. It's not fair for me to stall this any longer, I'll trust their
> >> judgement.
> >>
> >> Reasonable?
> >
> > I'd be more than happy for one of the drm-misc maintainers to Ack.
> > I'll move Maxime, Thomas, and Maarten to the "To:" line to see if that
> > helps get through their filters.
>
> I'll like some test reports to be sure it doesn't break anything,
> then I'll be happy to give my ack !

Are you looking for any more test reports at this point? Eizan did
some testing and provided a tag, though this was also on ChromeOS.
Steev also tested on two non-ChromeOS environments and provided his
tag. It's also been another two weeks of this being rolled out to some
Chromebook users and I haven't heard any reports of problems. If
somehow something was missed, I'm happy to follow-up and provide
additional fixes if some report comes in later.

Thanks!

-Doug
Neil Armstrong Feb. 28, 2024, 4:52 p.m. UTC | #15
On 28/02/2024 17:40, Doug Anderson wrote:
> Neil,
> 
> On Thu, Feb 15, 2024 at 8:53 AM Neil Armstrong
> <neil.armstrong@linaro.org> wrote:
>>
>> Hi Doug,
>>
>> On 15/02/2024 16:08, Doug Anderson wrote:
>>> Hi,
>>>
>>> On Thu, Feb 15, 2024 at 2:24 AM Jani Nikula <jani.nikula@intel.com> wrote:
>>>>
>>>> On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
>>>>> Hi,
>>>>>
>>>>> On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
>>>>>>
>>>>>> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
>>>>>>>
>>>>>>> If an eDP panel is not powered on then any attempts to talk to it over
>>>>>>> the DP AUX channel will timeout. Unfortunately these attempts may be
>>>>>>> quite slow. Userspace can initiate these attempts either via a
>>>>>>> /dev/drm_dp_auxN device or via the created i2c device.
>>>>>>>
>>>>>>> Making the DP AUX drivers timeout faster is a difficult proposition.
>>>>>>> In theory we could just poll the panel's HPD line in the AUX transfer
>>>>>>> function and immediately return an error there. However, this is
>>>>>>> easier said than done. For one thing, there's no hard requirement to
>>>>>>> hook the HPD line up for eDP panels and it's OK to just delay a fixed
>>>>>>> amount. For another thing, the HPD line may not be fast to probe. On
>>>>>>> parade-ps8640 we need to wait for the bridge chip's firmware to boot
>>>>>>> before we can get the HPD line and this is a slow process.
>>>>>>>
>>>>>>> The fact that the transfers are taking so long to timeout is causing
>>>>>>> real problems. The open source fwupd daemon sometimes scans DP busses
>>>>>>> looking for devices whose firmware need updating. If it happens to
>>>>>>> scan while a panel is turned off this scan can take a long time. The
>>>>>>> fwupd daemon could try to be smarter and only scan when eDP panels are
>>>>>>> turned on, but we can also improve the behavior in the kernel.
>>>>>>>
>>>>>>> Let's let eDP panels drivers specify that a panel is turned off and
>>>>>>> then modify the common AUX transfer code not to attempt a transfer in
>>>>>>> this case.
>>>>>>>
>>>>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>>>>>>> ---
>>>>>>
>>>>>> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
>>>>>
>>>>> Thanks for the review!
>>>>>
>>>>> Given that this touches core DRM code and that I never got
>>>>> confirmation that Jani's concerns were addressed with my previous
>>>>> response, I'm still going to wait a little while before applying. I'm
>>>>> on vacation for most of next week, but if there are no further replies
>>>>> between now and then I'll plan to apply this to "drm-misc-next" the
>>>>> week of Feb 26th. If someone else wants to apply this before I do then
>>>>> I certainly won't object. Jani: if you feel this needs more discussion
>>>>> or otherwise object to this patch landing then please yell. Likewise
>>>>> if anyone else in the community wants to throw in their opinion, feel
>>>>> free.
>>>>
>>>> Sorry for dropping the ball after my initial response. I simply have not
>>>> had the time to look into this.
>>>>
>>>> It would be great to get, say, drm-misc maintainer ack on this before
>>>> merging. It's not fair for me to stall this any longer, I'll trust their
>>>> judgement.
>>>>
>>>> Reasonable?
>>>
>>> I'd be more than happy for one of the drm-misc maintainers to Ack.
>>> I'll move Maxime, Thomas, and Maarten to the "To:" line to see if that
>>> helps get through their filters.
>>
>> I'll like some test reports to be sure it doesn't break anything,
>> then I'll be happy to give my ack !
> 
> Are you looking for any more test reports at this point? Eizan did
> some testing and provided a tag, though this was also on ChromeOS.
> Steev also tested on two non-ChromeOS environments and provided his
> tag. It's also been another two weeks of this being rolled out to some
> Chromebook users and I haven't heard any reports of problems. If
> somehow something was missed, I'm happy to follow-up and provide
> additional fixes if some report comes in later.

Sure, thx I think you can apply it now

Acked-by: Neil Armstrong <neil.armstrong@linaro.org>

Thanks,
Neil

> 
> Thanks!
> 
> -Doug
Doug Anderson Feb. 28, 2024, 8:48 p.m. UTC | #16
Hi,

On Wed, Feb 28, 2024 at 8:52 AM <neil.armstrong@linaro.org> wrote:
>
> On 28/02/2024 17:40, Doug Anderson wrote:
> > Neil,
> >
> > On Thu, Feb 15, 2024 at 8:53 AM Neil Armstrong
> > <neil.armstrong@linaro.org> wrote:
> >>
> >> Hi Doug,
> >>
> >> On 15/02/2024 16:08, Doug Anderson wrote:
> >>> Hi,
> >>>
> >>> On Thu, Feb 15, 2024 at 2:24 AM Jani Nikula <jani.nikula@intel.com> wrote:
> >>>>
> >>>> On Wed, 14 Feb 2024, Doug Anderson <dianders@chromium.org> wrote:
> >>>>> Hi,
> >>>>>
> >>>>> On Tue, Feb 13, 2024 at 10:25 PM Hsin-Yi Wang <hsinyi@chromium.org> wrote:
> >>>>>>
> >>>>>> On Wed, Feb 14, 2024 at 2:23 PM Douglas Anderson <dianders@chromium.org> wrote:
> >>>>>>>
> >>>>>>> If an eDP panel is not powered on then any attempts to talk to it over
> >>>>>>> the DP AUX channel will timeout. Unfortunately these attempts may be
> >>>>>>> quite slow. Userspace can initiate these attempts either via a
> >>>>>>> /dev/drm_dp_auxN device or via the created i2c device.
> >>>>>>>
> >>>>>>> Making the DP AUX drivers timeout faster is a difficult proposition.
> >>>>>>> In theory we could just poll the panel's HPD line in the AUX transfer
> >>>>>>> function and immediately return an error there. However, this is
> >>>>>>> easier said than done. For one thing, there's no hard requirement to
> >>>>>>> hook the HPD line up for eDP panels and it's OK to just delay a fixed
> >>>>>>> amount. For another thing, the HPD line may not be fast to probe. On
> >>>>>>> parade-ps8640 we need to wait for the bridge chip's firmware to boot
> >>>>>>> before we can get the HPD line and this is a slow process.
> >>>>>>>
> >>>>>>> The fact that the transfers are taking so long to timeout is causing
> >>>>>>> real problems. The open source fwupd daemon sometimes scans DP busses
> >>>>>>> looking for devices whose firmware need updating. If it happens to
> >>>>>>> scan while a panel is turned off this scan can take a long time. The
> >>>>>>> fwupd daemon could try to be smarter and only scan when eDP panels are
> >>>>>>> turned on, but we can also improve the behavior in the kernel.
> >>>>>>>
> >>>>>>> Let's let eDP panels drivers specify that a panel is turned off and
> >>>>>>> then modify the common AUX transfer code not to attempt a transfer in
> >>>>>>> this case.
> >>>>>>>
> >>>>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> >>>>>>> ---
> >>>>>>
> >>>>>> Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
> >>>>>
> >>>>> Thanks for the review!
> >>>>>
> >>>>> Given that this touches core DRM code and that I never got
> >>>>> confirmation that Jani's concerns were addressed with my previous
> >>>>> response, I'm still going to wait a little while before applying. I'm
> >>>>> on vacation for most of next week, but if there are no further replies
> >>>>> between now and then I'll plan to apply this to "drm-misc-next" the
> >>>>> week of Feb 26th. If someone else wants to apply this before I do then
> >>>>> I certainly won't object. Jani: if you feel this needs more discussion
> >>>>> or otherwise object to this patch landing then please yell. Likewise
> >>>>> if anyone else in the community wants to throw in their opinion, feel
> >>>>> free.
> >>>>
> >>>> Sorry for dropping the ball after my initial response. I simply have not
> >>>> had the time to look into this.
> >>>>
> >>>> It would be great to get, say, drm-misc maintainer ack on this before
> >>>> merging. It's not fair for me to stall this any longer, I'll trust their
> >>>> judgement.
> >>>>
> >>>> Reasonable?
> >>>
> >>> I'd be more than happy for one of the drm-misc maintainers to Ack.
> >>> I'll move Maxime, Thomas, and Maarten to the "To:" line to see if that
> >>> helps get through their filters.
> >>
> >> I'll like some test reports to be sure it doesn't break anything,
> >> then I'll be happy to give my ack !
> >
> > Are you looking for any more test reports at this point? Eizan did
> > some testing and provided a tag, though this was also on ChromeOS.
> > Steev also tested on two non-ChromeOS environments and provided his
> > tag. It's also been another two weeks of this being rolled out to some
> > Chromebook users and I haven't heard any reports of problems. If
> > somehow something was missed, I'm happy to follow-up and provide
> > additional fixes if some report comes in later.
>
> Sure, thx I think you can apply it now
>
> Acked-by: Neil Armstrong <neil.armstrong@linaro.org>

Pushed to drm-misc-next.

8df1ddb5bf11 drm/dp: Don't attempt AUX transfers when eDP panels are not powered
diff mbox series

Patch

diff --git a/drivers/gpu/drm/display/drm_dp_helper.c b/drivers/gpu/drm/display/drm_dp_helper.c
index b1ca3a1100da..6fa705d82c8f 100644
--- a/drivers/gpu/drm/display/drm_dp_helper.c
+++ b/drivers/gpu/drm/display/drm_dp_helper.c
@@ -532,6 +532,15 @@  static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request,
 
 	mutex_lock(&aux->hw_mutex);
 
+	/*
+	 * If the device attached to the aux bus is powered down then there's
+	 * no reason to attempt a transfer. Error out immediately.
+	 */
+	if (aux->powered_down) {
+		ret = -EBUSY;
+		goto unlock;
+	}
+
 	/*
 	 * The specification doesn't give any recommendation on how often to
 	 * retry native transactions. We used to retry 7 times like for
@@ -599,6 +608,29 @@  int drm_dp_dpcd_probe(struct drm_dp_aux *aux, unsigned int offset)
 }
 EXPORT_SYMBOL(drm_dp_dpcd_probe);
 
+/**
+ * drm_dp_dpcd_set_powered() - Set whether the DP device is powered
+ * @aux: DisplayPort AUX channel; for convenience it's OK to pass NULL here
+ *       and the function will be a no-op.
+ * @powered: true if powered; false if not
+ *
+ * If the endpoint device on the DP AUX bus is known to be powered down
+ * then this function can be called to make future transfers fail immediately
+ * instead of needing to time out.
+ *
+ * If this function is never called then a device defaults to being powered.
+ */
+void drm_dp_dpcd_set_powered(struct drm_dp_aux *aux, bool powered)
+{
+	if (!aux)
+		return;
+
+	mutex_lock(&aux->hw_mutex);
+	aux->powered_down = !powered;
+	mutex_unlock(&aux->hw_mutex);
+}
+EXPORT_SYMBOL(drm_dp_dpcd_set_powered);
+
 /**
  * drm_dp_dpcd_read() - read a series of bytes from the DPCD
  * @aux: DisplayPort AUX channel (SST or MST)
@@ -1858,6 +1890,9 @@  static int drm_dp_i2c_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs,
 	struct drm_dp_aux_msg msg;
 	int err = 0;
 
+	if (aux->powered_down)
+		return -EBUSY;
+
 	dp_aux_i2c_transfer_size = clamp(dp_aux_i2c_transfer_size, 1, DP_AUX_MAX_PAYLOAD_BYTES);
 
 	memset(&msg, 0, sizeof(msg));
diff --git a/drivers/gpu/drm/panel/panel-edp.c b/drivers/gpu/drm/panel/panel-edp.c
index 7d556b1bfa82..d2a4e514d8fd 100644
--- a/drivers/gpu/drm/panel/panel-edp.c
+++ b/drivers/gpu/drm/panel/panel-edp.c
@@ -413,6 +413,7 @@  static int panel_edp_suspend(struct device *dev)
 {
 	struct panel_edp *p = dev_get_drvdata(dev);
 
+	drm_dp_dpcd_set_powered(p->aux, false);
 	gpiod_set_value_cansleep(p->enable_gpio, 0);
 	regulator_disable(p->supply);
 	p->unprepared_time = ktime_get_boottime();
@@ -469,6 +470,7 @@  static int panel_edp_prepare_once(struct panel_edp *p)
 	}
 
 	gpiod_set_value_cansleep(p->enable_gpio, 1);
+	drm_dp_dpcd_set_powered(p->aux, true);
 
 	p->powered_on_time = ktime_get_boottime();
 
@@ -507,6 +509,7 @@  static int panel_edp_prepare_once(struct panel_edp *p)
 	return 0;
 
 error:
+	drm_dp_dpcd_set_powered(p->aux, false);
 	gpiod_set_value_cansleep(p->enable_gpio, 0);
 	regulator_disable(p->supply);
 	p->unprepared_time = ktime_get_boottime();
diff --git a/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c b/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
index 5703f4712d96..76c2a8f6718c 100644
--- a/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
+++ b/drivers/gpu/drm/panel/panel-samsung-atna33xc20.c
@@ -72,6 +72,7 @@  static int atana33xc20_suspend(struct device *dev)
 	if (p->el3_was_on)
 		atana33xc20_wait(p->el_on3_off_time, 150);
 
+	drm_dp_dpcd_set_powered(p->aux, false);
 	ret = regulator_disable(p->supply);
 	if (ret)
 		return ret;
@@ -93,6 +94,7 @@  static int atana33xc20_resume(struct device *dev)
 	ret = regulator_enable(p->supply);
 	if (ret)
 		return ret;
+	drm_dp_dpcd_set_powered(p->aux, true);
 	p->powered_on_time = ktime_get_boottime();
 
 	if (p->no_hpd) {
diff --git a/include/drm/display/drm_dp_helper.h b/include/drm/display/drm_dp_helper.h
index 863b2e7add29..472359a9d675 100644
--- a/include/drm/display/drm_dp_helper.h
+++ b/include/drm/display/drm_dp_helper.h
@@ -463,9 +463,15 @@  struct drm_dp_aux {
 	 * @is_remote: Is this AUX CH actually using sideband messaging.
 	 */
 	bool is_remote;
+
+	/**
+	 * @powered_down: If true then the remote endpoint is powered down.
+	 */
+	bool powered_down;
 };
 
 int drm_dp_dpcd_probe(struct drm_dp_aux *aux, unsigned int offset);
+void drm_dp_dpcd_set_powered(struct drm_dp_aux *aux, bool powered);
 ssize_t drm_dp_dpcd_read(struct drm_dp_aux *aux, unsigned int offset,
 			 void *buffer, size_t size);
 ssize_t drm_dp_dpcd_write(struct drm_dp_aux *aux, unsigned int offset,