mbox series

[0/3] Define a final failure state when link training fails

Message ID 20230818170156.2194015-1-gildekel@chromium.org (mailing list archive)
Headers show
Series Define a final failure state when link training fails | expand

Message

Gil Dekel Aug. 18, 2023, 4:59 p.m. UTC
Currently, when link training fails after all fallback values have been
exhausted, the i915 driver seizes to send uevents to userspace. This leave
userspace thinking that the last passing atomic commit was successful, and that
all connectors (displays) are connected and operational, when in fact, the last
link failed to train and the displays remain dark. This manifests as "zombie"
displays in userspace, in which users observe the displays appear in their
display settings page, but they are dark and unresponsive.

Since, at the time of writing, MST link training fallback is not implemented,
failing MST link training is a significantly more common case then a complete
SST link training failure. And with users using MST hubs than ever to connect
multiple displays via their USB-C ports we observe this case often.

This patchset series suggest a solution, in which a final failure state is
defined. In this final state, the connector's bit rate capabilities, namely
max_link_rate and max_link_lane_count, are set to 0. This effectively set the
connector's bandwidth to 0Gbps, thus causing all its modes to be pruned in the
following connector probing.

Next, with this state defined, we emit a link-status=Bad uevent. The next time
userspace probes the connector, it should recognize that the connector has no
modes and ignore it since it is in a bad state.

I am aware that always sending a uevent and never stopping may result in some
userspaces having their expectations broken and enter an infinite loop of
modesets and link-training attempts. However, per DRM link-status spec:
```
 * link-status:
 *      Connector link-status property to indicate the status of link. The
 *      default value of link-status is "GOOD". If something fails during or
 *      after modeset, the kernel driver may set this to "BAD" and issue a
 *      hotplug uevent. Drivers should update this value using
 *      drm_connector_set_link_status_property().
 *
 *      When user-space receives the hotplug uevent and detects a "BAD"
 *      link-status, the sink doesn't receive pixels anymore (e.g. the screen
 *      becomes completely black). The list of available modes may have
 *      changed. User-space is expected to pick a new mode if the current one
 *      has disappeared and perform a new modeset with link-status set to
 *      "GOOD" to re-enable the connector.
```
(form drivers/gpu/drm/drm_connector.c - DOC: standard connector properties)

it seems reasonable to assume that the suggested state is an extension of the
spec's guidelines, in which the next new mode userspace picks for a connector
with no modes is - none, thus breaking the cycle of failed link-training
attempts.

I suspect that, maybe, zeroing out the bit rate capabilities is not the right
way to go, and perhaps marking the connector as disconnected instead may be a
better solution. However, if marking a connector disconnected is the way to go,
We will have to iterate over all MST ports in the MST case and mark the spawned
connectors as disconnected as well.

As a final note I should add that this approach was tested with ChromeOS as
userspace, and we observed that the zombie displays stop showing up once the
connectors are pruned of all their modes and are ignored by userspace.

For your consideration and guidance.
Thanks,

Gil Dekel (3):
  drm/i915/dp_link_training: Add a final failing state to link training
    fallback
  drm/i915/dp_link_training: Add a final failing state to link training
    fallback for MST
  drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
    property

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Manasi Navare <navaremanasi@chromium.org>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Gil Dekel <gildekel@chromium.org>
---
 drivers/gpu/drm/i915/display/intel_dp.c       | 50 +++++++++++++------
 drivers/gpu/drm/i915/display/intel_dp.h       |  4 +-
 .../drm/i915/display/intel_dp_link_training.c |  8 +--
 3 files changed, 41 insertions(+), 21 deletions(-)

--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics

Comments

Gil Dekel Aug. 24, 2023, 4:31 a.m. UTC | #1
Next version of https://patchwork.freedesktop.org/series/122643/

Reorganize into:
1) Add for final failure state for SST and MST link training fallback.
2) Add a DRM helper for setting downstream MST ports' link-status state.
3) Make handling SST and MST connectors simpler via intel_dp.
4) Update link-status for downstream MST ports.
5) Emit a uevent with the "link-status" trigger property.

Gil Dekel (6):
  drm/i915/dp_link_training: Add a final failing state to link training
    fallback
  drm/i915/dp_link_training: Add a final failing state to link training
    fallback for MST
  drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
  drm/i915: Move DP modeset_retry_work into intel_dp
  drm/i915/dp_link_training: Set all downstream MST ports to BAD before
    retrying
  drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
    property

 drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
 drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
 .../drm/i915/display/intel_display_types.h    |  6 +-
 drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
 drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
 .../drm/i915/display/intel_dp_link_training.c | 11 ++-
 include/drm/display/drm_dp_mst_helper.h       |  3 +
 7 files changed, 110 insertions(+), 40 deletions(-)

--
Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics
Jani Nikula Aug. 24, 2023, 9:56 a.m. UTC | #2
On Thu, 24 Aug 2023, Gil Dekel <gildekel@chromium.org> wrote:
> Next version of https://patchwork.freedesktop.org/series/122643/
>
> Reorganize into:
> 1) Add for final failure state for SST and MST link training fallback.
> 2) Add a DRM helper for setting downstream MST ports' link-status state.
> 3) Make handling SST and MST connectors simpler via intel_dp.
> 4) Update link-status for downstream MST ports.
> 5) Emit a uevent with the "link-status" trigger property.

Please don't send series in-reply-to other series. It'll confuse the
patchwork/CI doohickey trying to apply the patches for testing.

Thanks,
Jani.

>
> Gil Dekel (6):
>   drm/i915/dp_link_training: Add a final failing state to link training
>     fallback
>   drm/i915/dp_link_training: Add a final failing state to link training
>     fallback for MST
>   drm/dp_mst: Add drm_dp_set_mst_topology_link_status()
>   drm/i915: Move DP modeset_retry_work into intel_dp
>   drm/i915/dp_link_training: Set all downstream MST ports to BAD before
>     retrying
>   drm/i915/dp_link_training: Emit a link-status=Bad uevent with trigger
>     property
>
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 39 ++++++++++
>  drivers/gpu/drm/i915/display/intel_display.c  | 14 +++-
>  .../drm/i915/display/intel_display_types.h    |  6 +-
>  drivers/gpu/drm/i915/display/intel_dp.c       | 75 ++++++++++++-------
>  drivers/gpu/drm/i915/display/intel_dp.h       |  2 +-
>  .../drm/i915/display/intel_dp_link_training.c | 11 ++-
>  include/drm/display/drm_dp_mst_helper.h       |  3 +
>  7 files changed, 110 insertions(+), 40 deletions(-)
>
> --
> Gil Dekel, Software Engineer, Google / ChromeOS Display and Graphics