Message ID | 20230612220106.1884039-1-quic_bjorande@quicinc.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/msm/dp: Drop aux devices together with DP controller |
On 13/06/2023 01:01, Bjorn Andersson wrote:
> Using devres to depopulate the aux bus made sure that upon a probe
> deferral the EDP panel device would be destroyed and recreated upon next
> attempt.
>
> But the struct device which the devres is tied to is the DPUs
> (drm_dev->dev), which may be happen after the DP controller is torn
> down.
>
> Indications of this can be seen in the commonly seen EDID-hexdump full
> of zeros in the log, or the occasional/rare KASAN fault where the
> panel's attempt to read the EDID information causes a use after free on
> DP resources.
>
> It's tempting to move the devres to the DP controller's struct device,
> but the resources used by the device(s) on the aux bus are explicitly
> torn down in the error path.

I hoped that proper usage of of_dp_aux_populate_bus(), with the callback
function being non-NULL would have solved at least this part. But it
seems I'll never see this patch.

> The KASAN-reported use-after-free also
> remains, as the DP aux "module" explicitly frees its devres-allocated
> memory in this code path.
>
> As such, explicitly depopulate the aux bus in the error path, and in the
> component unbind path, to avoid these issues.
>
> Fixes: 2b57f726611e ("drm/msm/dp: fix aux-bus EP lifetime")
> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

> ---
>  drivers/gpu/drm/msm/dp/dp_display.c | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 3d8fa2e73583..bbb0550a022b 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -322,6 +322,8 @@ static void dp_display_unbind(struct device *dev, struct device *master,
>
>  	kthread_stop(dp->ev_tsk);
>
> +	of_dp_aux_depopulate_bus(dp->aux);
> +
>  	dp_power_client_deinit(dp->power);
>  	dp_unregister_audio_driver(dev, dp->audio);
>  	dp_aux_unregister(dp->aux);
> @@ -1521,11 +1523,6 @@ void msm_dp_debugfs_init(struct msm_dp *dp_display, struct drm_minor *minor)
>  	}
>  }
>
> -static void of_dp_aux_depopulate_bus_void(void *data)
> -{
> -	of_dp_aux_depopulate_bus(data);
> -}
> -
>  static int dp_display_get_next_bridge(struct msm_dp *dp)
>  {
>  	int rc;
> @@ -1554,12 +1551,6 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
>  		of_node_put(aux_bus);
>  		if (rc)
>  			goto error;
> -
> -		rc = devm_add_action_or_reset(dp->drm_dev->dev,
> -						of_dp_aux_depopulate_bus_void,
> -						dp_priv->aux);
> -		if (rc)
> -			goto error;
>  	} else if (dp->is_edp) {
>  		DRM_ERROR("eDP aux_bus not found\n");
>  		return -ENODEV;
> @@ -1583,6 +1574,7 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
>
>  error:
>  	if (dp->is_edp) {
> +		of_dp_aux_depopulate_bus(dp_priv->aux);
>  		disable_irq(dp_priv->irq);
>  		dp_display_host_phy_exit(dp_priv);
>  		dp_display_host_deinit(dp_priv);
Hi,

On Mon, Jun 12, 2023 at 3:40 PM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On 13/06/2023 01:01, Bjorn Andersson wrote:
> > Using devres to depopulate the aux bus made sure that upon a probe
> > deferral the EDP panel device would be destroyed and recreated upon next
> > attempt.
> >
> > But the struct device which the devres is tied to is the DPUs
> > (drm_dev->dev), which may be happen after the DP controller is torn
> > down.
> >
> > Indications of this can be seen in the commonly seen EDID-hexdump full
> > of zeros in the log, or the occasional/rare KASAN fault where the
> > panel's attempt to read the EDID information causes a use after free on
> > DP resources.
> >
> > It's tempting to move the devres to the DP controller's struct device,
> > but the resources used by the device(s) on the aux bus are explicitly
> > torn down in the error path.
>
> I hoped that proper usage of of_dp_aux_populate_bus(), with the callback
> function being non-NULL would have solved at least this part. But it
> seems I'll never see this patch.

Agreed. This has been pending for > 1 year now with no significant
progress. Abhinav: Is there anything that can be done about this? Not
following up on agreed-to cleanups in a timely manner doesn't set a
good precedent. Next time the Qualcomm display wants to land something
and promises to land a followup people will be less likely to believe
them...

> > The KASAN-reported use-after-free also
> > remains, as the DP aux "module" explicitly frees its devres-allocated
> > memory in this code path.
> >
> > As such, explicitly depopulate the aux bus in the error path, and in the
> > component unbind path, to avoid these issues.
> >
> > Fixes: 2b57f726611e ("drm/msm/dp: fix aux-bus EP lifetime")
> > Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
>
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Hi Doug

On 6/13/2023 12:33 PM, Doug Anderson wrote:
> Hi,
>
> On Mon, Jun 12, 2023 at 3:40 PM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
>>
>> On 13/06/2023 01:01, Bjorn Andersson wrote:
>>> Using devres to depopulate the aux bus made sure that upon a probe
>>> deferral the EDP panel device would be destroyed and recreated upon next
>>> attempt.
>>>
>>> But the struct device which the devres is tied to is the DPUs
>>> (drm_dev->dev), which may be happen after the DP controller is torn
>>> down.
>>>
>>> Indications of this can be seen in the commonly seen EDID-hexdump full
>>> of zeros in the log, or the occasional/rare KASAN fault where the
>>> panel's attempt to read the EDID information causes a use after free on
>>> DP resources.
>>>
>>> It's tempting to move the devres to the DP controller's struct device,
>>> but the resources used by the device(s) on the aux bus are explicitly
>>> torn down in the error path.
>>
>> I hoped that proper usage of of_dp_aux_populate_bus(), with the callback
>> function being non-NULL would have solved at least this part. But it
>> seems I'll never see this patch.
>
> Agreed. This has been pending for > 1 year now with no significant
> progress. Abhinav: Is there anything that can be done about this? Not
> following up on agreed-to cleanups in a timely manner doesn't set a
> good precedent. Next time the Qualcomm display wants to land something
> and promises to land a followup people will be less likely to believe
> them...
>

Both QC and Google know there were other factors which delayed this last
3-4 months.

But, I do not have any concrete justification to give you for the delays
before that apart from perhaps other higher priority chrome and upstream
bugs which kept cropping up.

Hence, all I can offer is my apologies for the delay.

After seeing this patch on the list, we have revived this effort now and
re-assigned this within our team to take over from where that was left
off.

It will need some time to transition but this will see the end of the
tunnel soon.

Thanks

Abhinav
On Mon, 12 Jun 2023 15:01:06 -0700, Bjorn Andersson wrote:
> Using devres to depopulate the aux bus made sure that upon a probe
> deferral the EDP panel device would be destroyed and recreated upon next
> attempt.
>
> But the struct device which the devres is tied to is the DPUs
> (drm_dev->dev), which may be happen after the DP controller is torn
> down.
>
> [...]

Applied, thanks!

[1/1] drm/msm/dp: Drop aux devices together with DP controller
      https://gitlab.freedesktop.org/lumag/msm/-/commit/a7bfb2ad2184

Best regards,
On Mon, Jun 12, 2023 at 03:01:06PM -0700, Bjorn Andersson wrote:
> Using devres to depopulate the aux bus made sure that upon a probe
> deferral the EDP panel device would be destroyed and recreated upon next
> attempt.
>
> But the struct device which the devres is tied to is the DPUs
> (drm_dev->dev), which may be happen after the DP controller is torn
> down.

There appears to be some words missing in this sentence.

> Indications of this can be seen in the commonly seen EDID-hexdump full
> of zeros in the log,

This could happen also when the aux bus lifetime was tied to DP
controller and is mostly benign as dp_aux_deinit() set the "initted"
flag to false.

> or the occasional/rare KASAN fault where the
> panel's attempt to read the EDID information causes a use after free on
> DP resources.

But this is clearly a bug as there's a small window where the aux bus
struct holding the above flag may also have been released...

> It's tempting to move the devres to the DP controller's struct device,
> but the resources used by the device(s) on the aux bus are explicitly
> torn down in the error path. The KASAN-reported use-after-free also
> remains, as the DP aux "module" explicitly frees its devres-allocated
> memory in this code path.

Right, and this would also not work as the aux bus could remain
populated for the next bind attempt which would then fail (as described
in the commit message of the offending commit).

> As such, explicitly depopulate the aux bus in the error path, and in the
> component unbind path, to avoid these issues.

Sounds good.

> Fixes: 2b57f726611e ("drm/msm/dp: fix aux-bus EP lifetime")

This one should also have a stable tag:

Cc: stable@vger.kernel.org	# 5.19

> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> ---
>  drivers/gpu/drm/msm/dp/dp_display.c | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index 3d8fa2e73583..bbb0550a022b 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -322,6 +322,8 @@ static void dp_display_unbind(struct device *dev, struct device *master,
>
>  	kthread_stop(dp->ev_tsk);
>
> +	of_dp_aux_depopulate_bus(dp->aux);

This may now be called without first having populated the bus, but looks
like that still works.

> +
>  	dp_power_client_deinit(dp->power);
>  	dp_unregister_audio_driver(dev, dp->audio);
>  	dp_aux_unregister(dp->aux);

I know this one was merged while I was out-of-office last week, but for
the record:

Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>

Johan
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 3d8fa2e73583..bbb0550a022b 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -322,6 +322,8 @@ static void dp_display_unbind(struct device *dev, struct device *master,
 
 	kthread_stop(dp->ev_tsk);
 
+	of_dp_aux_depopulate_bus(dp->aux);
+
 	dp_power_client_deinit(dp->power);
 	dp_unregister_audio_driver(dev, dp->audio);
 	dp_aux_unregister(dp->aux);
@@ -1521,11 +1523,6 @@ void msm_dp_debugfs_init(struct msm_dp *dp_display, struct drm_minor *minor)
 	}
 }
 
-static void of_dp_aux_depopulate_bus_void(void *data)
-{
-	of_dp_aux_depopulate_bus(data);
-}
-
 static int dp_display_get_next_bridge(struct msm_dp *dp)
 {
 	int rc;
@@ -1554,12 +1551,6 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
 		of_node_put(aux_bus);
 		if (rc)
 			goto error;
-
-		rc = devm_add_action_or_reset(dp->drm_dev->dev,
-						of_dp_aux_depopulate_bus_void,
-						dp_priv->aux);
-		if (rc)
-			goto error;
 	} else if (dp->is_edp) {
 		DRM_ERROR("eDP aux_bus not found\n");
 		return -ENODEV;
@@ -1583,6 +1574,7 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
 
 error:
 	if (dp->is_edp) {
+		of_dp_aux_depopulate_bus(dp_priv->aux);
 		disable_irq(dp_priv->irq);
 		dp_display_host_phy_exit(dp_priv);
 		dp_display_host_deinit(dp_priv);
Using devres to depopulate the aux bus made sure that upon a probe
deferral the EDP panel device would be destroyed and recreated upon next
attempt.

But the struct device which the devres is tied to is the DPUs
(drm_dev->dev), which may be happen after the DP controller is torn
down.

Indications of this can be seen in the commonly seen EDID-hexdump full
of zeros in the log, or the occasional/rare KASAN fault where the
panel's attempt to read the EDID information causes a use after free on
DP resources.

It's tempting to move the devres to the DP controller's struct device,
but the resources used by the device(s) on the aux bus are explicitly
torn down in the error path. The KASAN-reported use-after-free also
remains, as the DP aux "module" explicitly frees its devres-allocated
memory in this code path.

As such, explicitly depopulate the aux bus in the error path, and in the
component unbind path, to avoid these issues.

Fixes: 2b57f726611e ("drm/msm/dp: fix aux-bus EP lifetime")
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
---
 drivers/gpu/drm/msm/dp/dp_display.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)
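
For readers who want the ordering problem from the commit message in
isolation, here is a rough userspace sketch. It is not driver code: every
toy_* name is hypothetical, the "controller" is a plain struct, and the
use-after-free is only counted, not triggered. It assumes the failure mode
is simply "the panel on the aux bus is still present when the controller
goes away", which is how the commit message describes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the DP controller; not a kernel type. */
struct toy_controller {
	bool alive;		/* false once the controller is torn down */
	bool bus_populated;	/* true while a panel sits on the aux bus */
	int stale_accesses;	/* panel accesses made after teardown */
};

/* The panel reads EDID through the controller's aux channel. */
static void toy_panel_read_edid(struct toy_controller *c)
{
	if (c->bus_populated && !c->alive)
		c->stale_accesses++;	/* would be a use-after-free for real */
}

/* Removing the panel; harmless if the bus was never populated. */
static void toy_depopulate(struct toy_controller *c)
{
	c->bus_populated = false;
}

static void toy_controller_teardown(struct toy_controller *c)
{
	c->alive = false;
}

/*
 * Old ordering: depopulation is deferred to the parent device's
 * release, so it can run after the controller is already gone.
 */
int toy_deferred_order(void)
{
	struct toy_controller c = { .alive = true, .bus_populated = true };

	toy_controller_teardown(&c);
	toy_panel_read_edid(&c);	/* panel still on the bus */
	toy_depopulate(&c);		/* parent's cleanup, too late */
	return c.stale_accesses;
}

/* New ordering: depopulate explicitly before tearing down. */
int toy_explicit_order(void)
{
	struct toy_controller c = { .alive = true, .bus_populated = true };

	toy_depopulate(&c);
	toy_controller_teardown(&c);
	toy_panel_read_edid(&c);	/* no panel left to touch the controller */
	return c.stale_accesses;
}
```

Calling toy_deferred_order() counts one stale access, while
toy_explicit_order() counts none. The real patch gets the second ordering
by calling of_dp_aux_depopulate_bus() directly in the unbind and error
paths rather than through a devres action on drm_dev->dev.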