Message ID | 20230822085949.816844-1-victor.liu@nxp.com (mailing list archive) |
---|---|
Headers | show |
Series | drm/imx: Introduce i.MX8qm/qxp DPU DRM | expand |
Hi, Aside from the discussion on the binding and the general architecture, I have some comments there. On Tue, Aug 22, 2023 at 04:59:48PM +0800, Liu Ying wrote: > +int dpu_cf_init(struct dpu_soc *dpu, unsigned int index, > + unsigned int id, enum dpu_unit_type type, > + unsigned long pec_base, unsigned long base) > +{ > + struct dpu_constframe *cf; > + > + cf = devm_kzalloc(dpu->dev, sizeof(*cf), GFP_KERNEL); > + if (!cf) > + return -ENOMEM; > + > + dpu->cf_priv[index] = cf; You can't store structures related to KMS in a device managed structure. The DRM KMS device will stick around (and be accessible from userspace) after the device has been removed until the last application closed its file descriptor to the device. This can be checked by enabling KASAN and manually unbinding the driver through sysfs. > + cf->pec_base = devm_ioremap(dpu->dev, pec_base, SZ_16); > + if (!cf->pec_base) > + return -ENOMEM; > + > + cf->base = devm_ioremap(dpu->dev, base, SZ_32); > + if (!cf->base) > + return -ENOMEM; For the same reason, you need to protect any access to a device managed resource (so clocks, registers, regulators, etc.) by a call to drm_dev_enter/drm_dev_exit and you need to call drm_dev_unplug instead of drm_dev_unregister. > +static int dpu_crtc_pm_runtime_get_sync(struct dpu_crtc *dpu_crtc) > +{ > + int ret; > + > + ret = pm_runtime_get_sync(dpu_crtc->dev->parent); > + if (ret < 0) { > + pm_runtime_put_noidle(dpu_crtc->dev->parent); > + dpu_crtc_err(&dpu_crtc->base, > + "failed to get parent device RPM sync: %d\n", ret); > + } > + > + return ret; > +} That's pm_runtime_resume_and_get. > +static int dpu_crtc_pm_runtime_put(struct dpu_crtc *dpu_crtc) > +{ > + int ret; > + > + ret = pm_runtime_put(dpu_crtc->dev->parent); > + if (ret < 0) > + dpu_crtc_err(&dpu_crtc->base, > + "failed to put parent device RPM: %d\n", ret); > + > + return ret; > +} > + > +static void dpu_crtc_mode_set_nofb(struct drm_crtc *crtc) > +{ > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > + struct drm_display_mode *adj = &crtc->state->adjusted_mode; > + enum dpu_link_id cf_link; > + > + dpu_crtc_dbg(crtc, "mode " DRM_MODE_FMT "\n", DRM_MODE_ARG(adj)); > + > + /* request power-on when we start to set mode for CRTC */ > + dpu_crtc_pm_runtime_get_sync(dpu_crtc); From the drm_crtc_helper_funcs documentation: """ * Note that the display pipe is completely off when this function is * called. Atomic drivers which need hardware to be running before they * program the new display mode (e.g. because they implement runtime PM) * should not use this hook. This is because the helper library calls * this hook only once per mode change and not every time the display * pipeline is suspended using either DPMS or the new "ACTIVE" property. * Which means register values set in this callback might get reset when * the CRTC is suspended, but not restored. Such drivers should instead * move all their CRTC setup into the @atomic_enable callback. """ > +static void dpu_crtc_atomic_enable(struct drm_crtc *crtc, > + struct drm_atomic_state *state) > +{ > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > + unsigned long flags; > + > + drm_crtc_vblank_on(crtc); > + > + enable_irq(dpu_crtc->dec_shdld_irq); > + enable_irq(dpu_crtc->ed_cont_shdld_irq); > + enable_irq(dpu_crtc->ed_safe_shdld_irq); > + > + dpu_fg_enable_clock(dpu_crtc->fg); > + dpu_ed_pec_sync_trigger(dpu_crtc->ed_cont); > + dpu_ed_pec_sync_trigger(dpu_crtc->ed_safe); > + if (crtc->state->gamma_lut) > + dpu_crtc_set_gammacor(dpu_crtc); > + else > + dpu_crtc_disable_gammacor(dpu_crtc); > + dpu_fg_shdtokgen(dpu_crtc->fg); > + > + /* don't relinquish CPU until TCON is set to operation mode */ > + local_irq_save(flags); > + preempt_disable(); > + dpu_fg_enable(dpu_crtc->fg); That's super fishy. You shouldn't need that, at all. What is going on there? > + > + /* > + * TKT320590: Those are NXP internal references as far as as I can tell. They shouldn't be here. > + * Turn TCON into operation mode as soon as the first dumb > + * frame is generated by DPU(we don't relinquish CPU to ensure > + * this). This makes DPR/PRG be able to evade the frame. > + */ > + DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc->fg); > + dpu_tcon_set_operation_mode(dpu_crtc->tcon); > + local_irq_restore(flags); > + preempt_enable(); > + > + DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(ed_safe_shdld_done); > + DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(ed_cont_shdld_done); > + DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(dec_shdld_done); > + > + disable_irq(dpu_crtc->ed_safe_shdld_irq); > + disable_irq(dpu_crtc->ed_cont_shdld_irq); > + disable_irq(dpu_crtc->dec_shdld_irq); > + > + DPU_CRTC_WAIT_FOR_FRAMEGEN_SECONDARY_SYNCUP(dpu_crtc->fg); The dance around the interrupts doesn't look great either. This need a proper description of the problem this was trying to solve. Also, what happens if any of those interrupts fail to trigger before you timeout? > + DPU_CRTC_CHECK_FRAMEGEN_FIFO(dpu_crtc->fg); > + > + dpu_crtc_queue_state_event(crtc); > +} > + > +static void dpu_crtc_atomic_disable(struct drm_crtc *crtc, > + struct drm_atomic_state *state) > +{ > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > + struct drm_plane *plane; > + struct drm_plane_state *old_plane_state; > + struct dpu_plane_state *old_dpstate; > + struct dpu_fetchunit *fu; > + struct dpu_dprc *dprc; > + const struct dpu_fetchunit_ops *fu_ops; > + unsigned long flags; > + int i; > + > + enable_irq(dpu_crtc->dec_seq_complete_irq); > + > + /* don't relinquish CPU until DPRC repeat_en is disabled */ > + local_irq_save(flags); > + preempt_disable(); > + /* > + * Sync to FrameGen frame counter moving so that > + * FrameGen can be disabled in the next frame. > + */ > + DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc->fg); > + dpu_fg_disable(dpu_crtc->fg); > + /* > + * There is one frame leftover after FrameGen disablement. > + * Sync to FrameGen frame counter moving so that > + * DPRC repeat_en can be disabled in the next frame. > + */ > + DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc->fg); > + > + for_each_old_plane_in_state(state, plane, old_plane_state, i) { > + old_dpstate = to_dpu_plane_state(old_plane_state); > + > + if (!old_plane_state->fb) > + continue; > + > + if (old_plane_state->crtc != crtc) > + continue; > + > + fu = old_dpstate->source; > + > + fu_ops = dpu_fu_get_ops(fu); > + > + dprc = fu_ops->get_dprc(fu); > + dpu_dprc_disable_repeat_en(dprc); > + } > + > + local_irq_restore(flags); > + preempt_enable(); > + > + DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(dec_seq_complete_done); > + > + disable_irq(dpu_crtc->dec_seq_complete_irq); > + > + dpu_fg_disable_clock(dpu_crtc->fg); > + > + drm_crtc_vblank_off(crtc); > + > + spin_lock_irq(&crtc->dev->event_lock); > + if (crtc->state->event && !crtc->state->active) { > + drm_crtc_send_vblank_event(crtc, crtc->state->event); > + crtc->state->event = NULL; > + } > + spin_unlock_irq(&crtc->dev->event_lock); > + > + /* request power-off when CRTC is disabled */ > + dpu_crtc_pm_runtime_put(dpu_crtc); > +} Same story than in atomic_enable here. > +static int legacyfb_depth = 32; > +module_param(legacyfb_depth, uint, 0444); No custom module parameter > +static void dpu_atomic_put_plane_state(struct drm_atomic_state *state, > + struct drm_plane *plane) > +{ > + int index = drm_plane_index(plane); > + > + plane->funcs->atomic_destroy_state(plane, state->planes[index].state); > + state->planes[index].ptr = NULL; > + state->planes[index].state = NULL; > + state->planes[index].old_state = NULL; > + state->planes[index].new_state = NULL; > + > + drm_modeset_unlock(&plane->mutex); > + > + dpu_plane_dbg(plane, "put state\n"); > +} > + > +static void dpu_atomic_put_crtc_state(struct drm_atomic_state *state, > + struct drm_crtc *crtc) > +{ > + int index = drm_crtc_index(crtc); > + > + crtc->funcs->atomic_destroy_state(crtc, state->crtcs[index].state); > + state->crtcs[index].ptr = NULL; > + state->crtcs[index].state = NULL; > + state->crtcs[index].old_state = NULL; > + state->crtcs[index].new_state = NULL; > + > + drm_modeset_unlock(&crtc->mutex); > + > + dpu_crtc_dbg(crtc, "put state\n"); > +} > + > +static void > +dpu_atomic_put_possible_states_per_crtc(struct drm_crtc_state *crtc_state) > +{ > + struct drm_atomic_state *state = crtc_state->state; > + struct drm_crtc *crtc = crtc_state->crtc; > + struct drm_plane *plane; > + struct drm_plane_state *old_plane_state, *new_plane_state; > + struct dpu_plane_state *old_dpstate, *new_dpstate; > + > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) { > + old_plane_state = drm_atomic_get_old_plane_state(state, plane); > + new_plane_state = drm_atomic_get_new_plane_state(state, plane); > + > + old_dpstate = to_dpu_plane_state(old_plane_state); > + new_dpstate = to_dpu_plane_state(new_plane_state); > + > + /* Should be enough to check the below HW plane resources. */ > + if (old_dpstate->stage.ptr != new_dpstate->stage.ptr || > + old_dpstate->source != new_dpstate->source || > + old_dpstate->blend != new_dpstate->blend) > + return; > + } > + > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) > + dpu_atomic_put_plane_state(state, plane); > + > + dpu_atomic_put_crtc_state(state, crtc); > +} That's super suspicious too. Are you really going around and dropping and destroying plane and crtc states in a global state? At the very least, you need to describe what this addresses and why you think it's a good solution. I kind of skimmed over the last part of the driver, but we should really address these first comments. There's a larger discussion on the fact that this driver does much more that it should and needs to (especially in atomic_check, but not only), and this applies to the rest of patch. Maxime
On Tuesday, August 22, 2023 8:59 PM, Maxime Ripard <mripard@kernel.org> wrote: > > Hi, Hi, > > Aside from the discussion on the binding and the general architecture, I > have some comments there. Thanks for your comments. > > On Tue, Aug 22, 2023 at 04:59:48PM +0800, Liu Ying wrote: > > +int dpu_cf_init(struct dpu_soc *dpu, unsigned int index, > > + unsigned int id, enum dpu_unit_type type, > > + unsigned long pec_base, unsigned long base) > > +{ > > + struct dpu_constframe *cf; > > + > > + cf = devm_kzalloc(dpu->dev, sizeof(*cf), GFP_KERNEL); > > + if (!cf) > > + return -ENOMEM; > > + > > + dpu->cf_priv[index] = cf; > > You can't store structures related to KMS in a device managed structure. > The DRM KMS device will stick around (and be accessible from userspace) > after the device has been removed until the last application closed its > file descriptor to the device. The DRM device is registered after component_bind_all() is called in dpu_drm_bind(). The CRTC components' platform devices are created in the dpu_core_probe() where the device managed resources are created. So, it looks those resources are safe because the DRM device will be unregistered before those resources are freed. > > This can be checked by enabling KASAN and manually unbinding the driver > through sysfs. I enabled KASAN and manually unbound the dpu-core driver with command: echo 56180000.dpu > /sys/bus/platform/drivers/dpu-core/56180000.dpu/driver/unbind KASAN didin't report memory issue regarding those device managed resources. However, it did report another issue in dpu_drm_unbind(), where drm_device should be got from drv_data->drm_dev instead of dev_get_drvdata(dev). I'll fix that in next version. BTW, the dpu-core driver was successfully bound again after unbinding with command: echo 56180000.dpu > /sys/bus/platform/drivers/dpu-core/bind > > > + cf->pec_base = devm_ioremap(dpu->dev, pec_base, SZ_16); > > + if (!cf->pec_base) > > + return -ENOMEM; > > + > > + cf->base = devm_ioremap(dpu->dev, base, SZ_32); > > + if (!cf->base) > > + return -ENOMEM; > > For the same reason, you need to protect any access to a device managed > resource (so clocks, registers, regulators, etc.) by a call to > drm_dev_enter/drm_dev_exit and you need to call drm_dev_unplug instead > of drm_dev_unregister. That's a good point. I've tried to do that, but it turns out that the display controller cannot be enabled again after binding the dpu-core driver manually again. It seems that the display controller requires a proper disablement procedure, but the "driver instance overview " kdoc mentions the shortcoming of no proper disablement if drm_dev_unplug() is used: """ * Drivers that want to support device unplugging (USB, DT overlay unload) should * use drm_dev_unplug() instead of drm_dev_unregister(). The driver must protect * regions that is accessing device resources to prevent use after they're * released. This is done using drm_dev_enter() and drm_dev_exit(). There is one * shortcoming however, drm_dev_unplug() marks the drm_device as unplugged before * drm_atomic_helper_shutdown() is called. This means that if the disable code * paths are protected, they will not run on regular driver module unload, * possibly leaving the hardware enabled. """ A DPU reset in dpu_core() might be helpful, but I'm not sure if there is any reset line provided by the embodying system. Even if the reset works, the 2nd DPU instance in i.MX8qm would be a problem, because it won't be reset or properly disabled if the 1st DPU instance is unbound. Although the two DPU instances could be wrapped by two DRM devices, I tend not to do that because downstream bridges in future SoCs might be able to mux to different DPU instances at runtime. Due to the disablement issue, can we set drm_dev_enter/exit/unplug aside first? > > > +static int dpu_crtc_pm_runtime_get_sync(struct dpu_crtc *dpu_crtc) > > +{ > > + int ret; > > + > > + ret = pm_runtime_get_sync(dpu_crtc->dev->parent); > > + if (ret < 0) { > > + pm_runtime_put_noidle(dpu_crtc->dev->parent); > > + dpu_crtc_err(&dpu_crtc->base, > > + "failed to get parent device RPM sync: %d\n", ret); > > + } > > + > > + return ret; > > +} > > That's pm_runtime_resume_and_get. Ok, will use it. > > > +static int dpu_crtc_pm_runtime_put(struct dpu_crtc *dpu_crtc) > > +{ > > + int ret; > > + > > + ret = pm_runtime_put(dpu_crtc->dev->parent); > > + if (ret < 0) > > + dpu_crtc_err(&dpu_crtc->base, > > + "failed to put parent device RPM: %d\n", ret); > > + > > + return ret; > > +} > > + > > +static void dpu_crtc_mode_set_nofb(struct drm_crtc *crtc) > > +{ > > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > > + struct drm_display_mode *adj = &crtc->state->adjusted_mode; > > + enum dpu_link_id cf_link; > > + > > + dpu_crtc_dbg(crtc, "mode " DRM_MODE_FMT "\n", > DRM_MODE_ARG(adj)); > > + > > + /* request power-on when we start to set mode for CRTC */ > > + dpu_crtc_pm_runtime_get_sync(dpu_crtc); > > From the drm_crtc_helper_funcs documentation: > > """ > * Note that the display pipe is completely off when this function is > * called. Atomic drivers which need hardware to be running before > they > * program the new display mode (e.g. because they implement > runtime PM) > * should not use this hook. This is because the helper library calls > * this hook only once per mode change and not every time the display > * pipeline is suspended using either DPMS or the new "ACTIVE" > property. > * Which means register values set in this callback might get reset > when > * the CRTC is suspended, but not restored. Such drivers should > instead > * move all their CRTC setup into the @atomic_enable callback. > """ I can use drm_atomic_helper_commit_tail() but not drm_atomic_helper_commit_tail_rpm() because the planes need to be updated prior to modeset_enables(where trigger shadow registers in plane's HW resources to take effect). Using the former one means that RPM needs to be get/put in drm_atomic_helper_commit_planes(), which doesn't seem good because in some cases the power of the display controller might be lost after RPM put and I'm not sure if the registers set there will be lost or not. So, to avoid the issue the documentation mentions, crtc_state->mode_changed is forced to be true in dpu_crtc_atomic_check() if the CRTC is changed to active. Is this acceptable? > > > +static void dpu_crtc_atomic_enable(struct drm_crtc *crtc, > > + struct drm_atomic_state *state) > > +{ > > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > > + unsigned long flags; > > + > > + drm_crtc_vblank_on(crtc); > > + > > + enable_irq(dpu_crtc->dec_shdld_irq); > > + enable_irq(dpu_crtc->ed_cont_shdld_irq); > > + enable_irq(dpu_crtc->ed_safe_shdld_irq); > > + > > + dpu_fg_enable_clock(dpu_crtc->fg); > > + dpu_ed_pec_sync_trigger(dpu_crtc->ed_cont); > > + dpu_ed_pec_sync_trigger(dpu_crtc->ed_safe); > > + if (crtc->state->gamma_lut) > > + dpu_crtc_set_gammacor(dpu_crtc); > > + else > > + dpu_crtc_disable_gammacor(dpu_crtc); > > + dpu_fg_shdtokgen(dpu_crtc->fg); > > + > > + /* don't relinquish CPU until TCON is set to operation mode */ > > + local_irq_save(flags); > > + preempt_disable(); > > + dpu_fg_enable(dpu_crtc->fg); > > That's super fishy. You shouldn't need that, at all. What is going on > there? This aims to fully workaround the below errata recently released by NXP. ERR010910: DC: Display Subsystem First Frame Programming Restriction Link: https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf In short, to make sure the display controller can be enabled properly with prefetch engine(DPRC + PRG), the TCON must be switch from bypass mode to operation mode after FrameGen(FG) generates the first frame. Timing is critical here, so local irq and preemption are disabled during the switch procedure. BTW, the driver always use prefetch engines for KMS, although they can be bypassed. > > > + > > + /* > > + * TKT320590: > > Those are NXP internal references as far as as I can tell. They > shouldn't be here. Ok, will change it to be ERR010910. > > > + * Turn TCON into operation mode as soon as the first dumb > > + * frame is generated by DPU(we don't relinquish CPU to ensure > > + * this). This makes DPR/PRG be able to evade the frame. > > + */ > > + DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc- > >fg); > > + dpu_tcon_set_operation_mode(dpu_crtc->tcon); > > + local_irq_restore(flags); > > + preempt_enable(); > > + > > + DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(ed_safe_shdld_done); > > + DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(ed_cont_shdld_done); > > + DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(dec_shdld_done); > > + > > + disable_irq(dpu_crtc->ed_safe_shdld_irq); > > + disable_irq(dpu_crtc->ed_cont_shdld_irq); > > + disable_irq(dpu_crtc->dec_shdld_irq); > > + > > + DPU_CRTC_WAIT_FOR_FRAMEGEN_SECONDARY_SYNCUP(dpu_crtc- > >fg); > > The dance around the interrupts doesn't look great either. This need a > proper description of the problem this was trying to solve. Also, what > happens if any of those interrupts fail to trigger before you timeout? Hmm, this is just following the display controller spec which provides detail steps to enable the display pipeline which include waiting for the interrupts and the FrameGen primary or secondary channel syncup. If the interrupts fail to trigger and we timeout, then there must be something wrong either caused by DPU HW itself or driver bug. Here, warnings are generated only. > > > + DPU_CRTC_CHECK_FRAMEGEN_FIFO(dpu_crtc->fg); > > + > > + dpu_crtc_queue_state_event(crtc); > > +} > > + > > +static void dpu_crtc_atomic_disable(struct drm_crtc *crtc, > > + struct drm_atomic_state *state) > > +{ > > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > > + struct drm_plane *plane; > > + struct drm_plane_state *old_plane_state; > > + struct dpu_plane_state *old_dpstate; > > + struct dpu_fetchunit *fu; > > + struct dpu_dprc *dprc; > > + const struct dpu_fetchunit_ops *fu_ops; > > + unsigned long flags; > > + int i; > > + > > + enable_irq(dpu_crtc->dec_seq_complete_irq); > > + > > + /* don't relinquish CPU until DPRC repeat_en is disabled */ > > + local_irq_save(flags); > > + preempt_disable(); > > + /* > > + * Sync to FrameGen frame counter moving so that > > + * FrameGen can be disabled in the next frame. > > + */ > > + DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc- > >fg); > > + dpu_fg_disable(dpu_crtc->fg); > > + /* > > + * There is one frame leftover after FrameGen disablement. > > + * Sync to FrameGen frame counter moving so that > > + * DPRC repeat_en can be disabled in the next frame. > > + */ > > + DPU_CRTC_WAIT_FOR_FRAMEGEN_FRAME_CNT_MOVING(dpu_crtc- > >fg); > > + > > + for_each_old_plane_in_state(state, plane, old_plane_state, i) { > > + old_dpstate = to_dpu_plane_state(old_plane_state); > > + > > + if (!old_plane_state->fb) > > + continue; > > + > > + if (old_plane_state->crtc != crtc) > > + continue; > > + > > + fu = old_dpstate->source; > > + > > + fu_ops = dpu_fu_get_ops(fu); > > + > > + dprc = fu_ops->get_dprc(fu); > > + dpu_dprc_disable_repeat_en(dprc); > > + } > > + > > + local_irq_restore(flags); > > + preempt_enable(); > > + > > + > DPU_CRTC_WAIT_FOR_COMPLETION_TIMEOUT(dec_seq_complete_d > one); > > + > > + disable_irq(dpu_crtc->dec_seq_complete_irq); > > + > > + dpu_fg_disable_clock(dpu_crtc->fg); > > + > > + drm_crtc_vblank_off(crtc); > > + > > + spin_lock_irq(&crtc->dev->event_lock); > > + if (crtc->state->event && !crtc->state->active) { > > + drm_crtc_send_vblank_event(crtc, crtc->state->event); > > + crtc->state->event = NULL; > > + } > > + spin_unlock_irq(&crtc->dev->event_lock); > > + > > + /* request power-off when CRTC is disabled */ > > + dpu_crtc_pm_runtime_put(dpu_crtc); > > +} > > Same story than in atomic_enable here. Similar to atomic_enable, strict procedure needs to be followed to disable the CRTC properly, including disabling FrameGen by syncing to frame counter moving and disabling DPRC repeat_en as soon as possible. The critical timing is achieved by disabling local irq and preemption during the procedure. > > > > +static int legacyfb_depth = 32; > > +module_param(legacyfb_depth, uint, 0444); > > No custom module parameter Ok, will remove them. > > > +static void dpu_atomic_put_plane_state(struct drm_atomic_state *state, > > + struct drm_plane *plane) > > +{ > > + int index = drm_plane_index(plane); > > + > > + plane->funcs->atomic_destroy_state(plane, state->planes[index].state); > > + state->planes[index].ptr = NULL; > > + state->planes[index].state = NULL; > > + state->planes[index].old_state = NULL; > > + state->planes[index].new_state = NULL; > > + > > + drm_modeset_unlock(&plane->mutex); > > + > > + dpu_plane_dbg(plane, "put state\n"); > > +} > > + > > +static void dpu_atomic_put_crtc_state(struct drm_atomic_state *state, > > + struct drm_crtc *crtc) > > +{ > > + int index = drm_crtc_index(crtc); > > + > > + crtc->funcs->atomic_destroy_state(crtc, state->crtcs[index].state); > > + state->crtcs[index].ptr = NULL; > > + state->crtcs[index].state = NULL; > > + state->crtcs[index].old_state = NULL; > > + state->crtcs[index].new_state = NULL; > > + > > + drm_modeset_unlock(&crtc->mutex); > > + > > + dpu_crtc_dbg(crtc, "put state\n"); > > +} > > + > > +static void > > +dpu_atomic_put_possible_states_per_crtc(struct drm_crtc_state > *crtc_state) > > +{ > > + struct drm_atomic_state *state = crtc_state->state; > > + struct drm_crtc *crtc = crtc_state->crtc; > > + struct drm_plane *plane; > > + struct drm_plane_state *old_plane_state, *new_plane_state; > > + struct dpu_plane_state *old_dpstate, *new_dpstate; > > + > > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) { > > + old_plane_state = drm_atomic_get_old_plane_state(state, > plane); > > + new_plane_state = drm_atomic_get_new_plane_state(state, > plane); > > + > > + old_dpstate = to_dpu_plane_state(old_plane_state); > > + new_dpstate = to_dpu_plane_state(new_plane_state); > > + > > + /* Should be enough to check the below HW plane resources. > */ > > + if (old_dpstate->stage.ptr != new_dpstate->stage.ptr || > > + old_dpstate->source != new_dpstate->source || > > + old_dpstate->blend != new_dpstate->blend) > > + return; > > + } > > + > > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) > > + dpu_atomic_put_plane_state(state, plane); > > + > > + dpu_atomic_put_crtc_state(state, crtc); > > +} > > That's super suspicious too. Are you really going around and dropping > and destroying plane and crtc states in a global state? Yes. > > At the very least, you need to describe what this addresses and why you > think it's a good solution. This is the solution to assign HW resources of a plane group to the two CRTCs in one DPU or one CRTC group _dynamically_ at runtime. Dpu.h has some comments which hint this: """ /* * fetchunit/scaler/layerblend resources of a plane group are * shared by the two CRTCs in a CRTC group */ """ I can add a DPU display controller block diagram in dpu_kms.c to tell the HW architecture and some SW architecture to clarify this more. Regards, Liu Ying > > I kind of skimmed over the last part of the driver, but we should really > address these first comments. There's a larger discussion on the fact > that this driver does much more that it should and needs to (especially in > atomic_check, but not only), and this applies to the rest of patch. > > Maxime
On Tue, Sep 05, 2023 at 03:32:47AM +0000, Ying Liu wrote: > > On Tue, Aug 22, 2023 at 04:59:48PM +0800, Liu Ying wrote: > > > +int dpu_cf_init(struct dpu_soc *dpu, unsigned int index, > > > + unsigned int id, enum dpu_unit_type type, > > > + unsigned long pec_base, unsigned long base) > > > +{ > > > + struct dpu_constframe *cf; > > > + > > > + cf = devm_kzalloc(dpu->dev, sizeof(*cf), GFP_KERNEL); > > > + if (!cf) > > > + return -ENOMEM; > > > + > > > + dpu->cf_priv[index] = cf; > > > > You can't store structures related to KMS in a device managed structure. > > The DRM KMS device will stick around (and be accessible from userspace) > > after the device has been removed until the last application closed its > > file descriptor to the device. > > The DRM device is registered after component_bind_all() is called in > dpu_drm_bind(). The CRTC components' platform devices are created > in the dpu_core_probe() where the device managed resources are > created. So, it looks those resources are safe because the DRM device > will be unregistered before those resources are freed. Not, it's not, because the KMS device isn't freed when devices will be unbound/removed, but when the last application closes its fd to it, and so you'll get dangling pointers. The general rule to get it right is to use drmm for anything but device resources (like clocks, regulators, memory mapping, etc.). You can deviate from the rule, of course, but you'll need a long and clear explanation on why it doesn't work, and why your new approach works. Your current approach doesn't though. > > This can be checked by enabling KASAN and manually unbinding the driver > > through sysfs. > > I enabled KASAN and manually unbound the dpu-core driver with command: > > echo 56180000.dpu > /sys/bus/platform/drivers/dpu-core/56180000.dpu/driver/unbind > > KASAN didin't report memory issue regarding those device managed > resources. However, it did report another issue in dpu_drm_unbind(), > where drm_device should be got from drv_data->drm_dev instead of > dev_get_drvdata(dev). I'll fix that in next version. > > BTW, the dpu-core driver was successfully bound again after unbinding with > command: > > echo 56180000.dpu > /sys/bus/platform/drivers/dpu-core/bind Guess you're lucky. That doesn't make it safe or right. > > > + cf->pec_base = devm_ioremap(dpu->dev, pec_base, SZ_16); > > > + if (!cf->pec_base) > > > + return -ENOMEM; > > > + > > > + cf->base = devm_ioremap(dpu->dev, base, SZ_32); > > > + if (!cf->base) > > > + return -ENOMEM; > > > > For the same reason, you need to protect any access to a device managed > > resource (so clocks, registers, regulators, etc.) by a call to > > drm_dev_enter/drm_dev_exit and you need to call drm_dev_unplug instead > > of drm_dev_unregister. > > That's a good point. I've tried to do that, but it turns out that the > display controller cannot be enabled again after binding the dpu-core > driver manually again. It seems that the display controller requires a > proper disablement procedure, but the "driver instance overview " kdoc > mentions the shortcoming of no proper disablement if drm_dev_unplug() > is used: > > """ > * Drivers that want to support device unplugging (USB, DT overlay unload) should > * use drm_dev_unplug() instead of drm_dev_unregister(). The driver must protect > * regions that is accessing device resources to prevent use after they're > * released. This is done using drm_dev_enter() and drm_dev_exit(). There is one > * shortcoming however, drm_dev_unplug() marks the drm_device as unplugged before > * drm_atomic_helper_shutdown() is called. This means that if the disable code > * paths are protected, they will not run on regular driver module unload, > * possibly leaving the hardware enabled. > """ > > A DPU reset in dpu_core() might be helpful, but I'm not sure if there is any > reset line provided by the embodying system. Generally speaking, you shouldn't rely on the device being in any particuliar state before your driver loads. So a reset at probe/bind time is a good idea. > Even if the reset works, the 2nd DPU instance in i.MX8qm would be a > problem, because it won't be reset or properly disabled if the 1st DPU > instance is unbound. Why it wouldn't be reset? > Although the two DPU instances could be wrapped by two DRM devices, I > tend not to do that because downstream bridges in future SoCs might be > able to mux to different DPU instances at runtime. > > Due to the disablement issue, can we set drm_dev_enter/exit/unplug > aside first? I'd rather have that figured out prior to merging. > > > > > +static int dpu_crtc_pm_runtime_put(struct dpu_crtc *dpu_crtc) > > > +{ > > > + int ret; > > > + > > > + ret = pm_runtime_put(dpu_crtc->dev->parent); > > > + if (ret < 0) > > > + dpu_crtc_err(&dpu_crtc->base, > > > + "failed to put parent device RPM: %d\n", ret); > > > + > > > + return ret; > > > +} > > > + > > > +static void dpu_crtc_mode_set_nofb(struct drm_crtc *crtc) > > > +{ > > > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > > > + struct drm_display_mode *adj = &crtc->state->adjusted_mode; > > > + enum dpu_link_id cf_link; > > > + > > > + dpu_crtc_dbg(crtc, "mode " DRM_MODE_FMT "\n", > > DRM_MODE_ARG(adj)); > > > + > > > + /* request power-on when we start to set mode for CRTC */ > > > + dpu_crtc_pm_runtime_get_sync(dpu_crtc); > > > > From the drm_crtc_helper_funcs documentation: > > > > """ > > * Note that the display pipe is completely off when this function is > > * called. Atomic drivers which need hardware to be running before > > they > > * program the new display mode (e.g. because they implement > > runtime PM) > > * should not use this hook. This is because the helper library calls > > * this hook only once per mode change and not every time the display > > * pipeline is suspended using either DPMS or the new "ACTIVE" > > property. > > * Which means register values set in this callback might get reset > > when > > * the CRTC is suspended, but not restored. Such drivers should > > instead > > * move all their CRTC setup into the @atomic_enable callback. > > """ > > I can use drm_atomic_helper_commit_tail() but not > drm_atomic_helper_commit_tail_rpm() because the planes need to be > updated prior to modeset_enables(where trigger shadow registers in > plane's HW resources to take effect). Using the former one means that > RPM needs to be get/put in drm_atomic_helper_commit_planes(), which > doesn't seem good because in some cases the power of the display controller > might be lost after RPM put and I'm not sure if the registers set there will > be lost or not. So, to avoid the issue the documentation mentions, > crtc_state->mode_changed is forced to be true in dpu_crtc_atomic_check() > if the CRTC is changed to active. Is this acceptable? No, just move the crtc setup to atomic_enable like the doc suggests. > > > +static void dpu_crtc_atomic_enable(struct drm_crtc *crtc, > > > + struct drm_atomic_state *state) > > > +{ > > > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > > > + unsigned long flags; > > > + > > > + drm_crtc_vblank_on(crtc); > > > + > > > + enable_irq(dpu_crtc->dec_shdld_irq); > > > + enable_irq(dpu_crtc->ed_cont_shdld_irq); > > > + enable_irq(dpu_crtc->ed_safe_shdld_irq); > > > + > > > + dpu_fg_enable_clock(dpu_crtc->fg); > > > + dpu_ed_pec_sync_trigger(dpu_crtc->ed_cont); > > > + dpu_ed_pec_sync_trigger(dpu_crtc->ed_safe); > > > + if (crtc->state->gamma_lut) > > > + dpu_crtc_set_gammacor(dpu_crtc); > > > + else > > > + dpu_crtc_disable_gammacor(dpu_crtc); > > > + dpu_fg_shdtokgen(dpu_crtc->fg); > > > + > > > + /* don't relinquish CPU until TCON is set to operation mode */ > > > + local_irq_save(flags); > > > + preempt_disable(); > > > + dpu_fg_enable(dpu_crtc->fg); > > > > That's super fishy. You shouldn't need that, at all. What is going on > > there? > > This aims to fully workaround the below errata recently released by > NXP. > > ERR010910: DC: Display Subsystem First Frame Programming Restriction > Link: https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf > > In short, to make sure the display controller can be enabled properly with > prefetch engine(DPRC + PRG), the TCON must be switch from bypass mode > to operation mode after FrameGen(FG) generates the first frame. > > Timing is critical here, so local irq and preemption are disabled during the > switch procedure. > > BTW, the driver always use prefetch engines for KMS, although they can > be bypassed. Ok. So I think it would help your driver getting merged to split that workaround into a separate patch. It's hard to tell what are the implications of that workaround on your driver when it's lost in the middle of, well, the driver :) I guess it would be much easier for everyone if you submitted that driver without the errata handling, or with prefetch bypassed, at first. And then you can submit / rework the hard parts. > > > > > + > > > + /* > > > + * TKT320590: > > > > Those are NXP internal references as far as as I can tell. They > > shouldn't be here. > > Ok, will change it to be ERR010910. A link to the errata description would be a good idea too. > > > +static void dpu_atomic_put_plane_state(struct drm_atomic_state *state, > > > + struct drm_plane *plane) > > > +{ > > > + int index = drm_plane_index(plane); > > > + > > > + plane->funcs->atomic_destroy_state(plane, state->planes[index].state); > > > + state->planes[index].ptr = NULL; > > > + state->planes[index].state = NULL; > > > + state->planes[index].old_state = NULL; > > > + state->planes[index].new_state = NULL; > > > + > > > + drm_modeset_unlock(&plane->mutex); > > > + > > > + dpu_plane_dbg(plane, "put state\n"); > > > +} > > > + > > > +static void dpu_atomic_put_crtc_state(struct drm_atomic_state *state, > > > + struct drm_crtc *crtc) > > > +{ > > > + int index = drm_crtc_index(crtc); > > > + > > > + crtc->funcs->atomic_destroy_state(crtc, state->crtcs[index].state); > > > + state->crtcs[index].ptr = NULL; > > > + state->crtcs[index].state = NULL; > > > + state->crtcs[index].old_state = NULL; > > > + state->crtcs[index].new_state = NULL; > > > + > > > + drm_modeset_unlock(&crtc->mutex); > > > + > > > + dpu_crtc_dbg(crtc, "put state\n"); > > > +} > > > + > > > +static void > > > +dpu_atomic_put_possible_states_per_crtc(struct drm_crtc_state > > *crtc_state) > > > +{ > > > + struct drm_atomic_state *state = crtc_state->state; > > > + struct drm_crtc *crtc = crtc_state->crtc; > > > + struct drm_plane *plane; > > > + struct drm_plane_state *old_plane_state, *new_plane_state; > > > + struct dpu_plane_state *old_dpstate, *new_dpstate; > > > + > > > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) { > > > + old_plane_state = drm_atomic_get_old_plane_state(state, > > plane); > > > + new_plane_state = drm_atomic_get_new_plane_state(state, > > plane); > > > + > > > + old_dpstate = to_dpu_plane_state(old_plane_state); > > > + new_dpstate = to_dpu_plane_state(new_plane_state); > > > + > > > + /* Should be enough to check the below HW plane resources. > > */ > > > + if (old_dpstate->stage.ptr != new_dpstate->stage.ptr || > > > + old_dpstate->source != new_dpstate->source || > > > + old_dpstate->blend != new_dpstate->blend) > > > + return; > > > + } > > > + > > > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) > > > + dpu_atomic_put_plane_state(state, plane); > > > + > > > + dpu_atomic_put_crtc_state(state, crtc); > > > +} > > > > That's super suspicious too. Are you really going around and dropping > > and destroying plane and crtc states in a global state? > > Yes. That's really not a good idea. Adding states are fine, dropping ones aren't. > > > > At the very least, you need to describe what this addresses and why you > > think it's a good solution. > > This is the solution to assign HW resources of a plane group to the two CRTCs > in one DPU or one CRTC group _dynamically_ at runtime. Dpu.h has some > comments which hint this: > > """ > /* > * fetchunit/scaler/layerblend resources of a plane group are > * shared by the two CRTCs in a CRTC group > */ > """ > > I can add a DPU display controller block diagram in dpu_kms.c to tell the HW > architecture and some SW architecture to clarify this more. It's not so much the diagram that I'm looking for, but an accurate description of the problem. What resource is there, why and how does it need to be shared, so we can understand what you are doing there, and possibly suggest other things. Maxime
On Tuesday, September 5, 2023 4:37 PM, Maxime Ripard <mripard@kernel.org> wrote: > On Tue, Sep 05, 2023 at 03:32:47AM +0000, Ying Liu wrote: > > > On Tue, Aug 22, 2023 at 04:59:48PM +0800, Liu Ying wrote: > > > > +int dpu_cf_init(struct dpu_soc *dpu, unsigned int index, > > > > + unsigned int id, enum dpu_unit_type type, > > > > + unsigned long pec_base, unsigned long base) > > > > +{ > > > > + struct dpu_constframe *cf; > > > > + > > > > + cf = devm_kzalloc(dpu->dev, sizeof(*cf), GFP_KERNEL); > > > > + if (!cf) > > > > + return -ENOMEM; > > > > + > > > > + dpu->cf_priv[index] = cf; > > > > > > You can't store structures related to KMS in a device managed structure. > > > The DRM KMS device will stick around (and be accessible from userspace) > > > after the device has been removed until the last application closed its > > > file descriptor to the device. > > > > The DRM device is registered after component_bind_all() is called in > > dpu_drm_bind(). The CRTC components' platform devices are created > > in the dpu_core_probe() where the device managed resources are > > created. So, it looks those resources are safe because the DRM device > > will be unregistered before those resources are freed. > > Not, it's not, because the KMS device isn't freed when devices will be > unbound/removed, but when the last application closes its fd to it, and > so you'll get dangling pointers. > > The general rule to get it right is to use drmm for anything but device > resources (like clocks, regulators, memory mapping, etc.). You can > deviate from the rule, of course, but you'll need a long and clear > explanation on why it doesn't work, and why your new approach works. > Your current approach doesn't though. I'll try to follow the rule in next version. Will call drmm_kzalloc() instead of devm_kzalloc() here. > > > > This can be checked by enabling KASAN and manually unbinding the > driver > > > through sysfs. > > > > I enabled KASAN and manually unbound the dpu-core driver with > command: > > > > echo 56180000.dpu > /sys/bus/platform/drivers/dpu- > core/56180000.dpu/driver/unbind > > > > KASAN didin't report memory issue regarding those device managed > > resources. However, it did report another issue in dpu_drm_unbind(), > > where drm_device should be got from drv_data->drm_dev instead of > > dev_get_drvdata(dev). I'll fix that in next version. > > > > BTW, the dpu-core driver was successfully bound again after unbinding > with > > command: > > > > echo 56180000.dpu > /sys/bus/platform/drivers/dpu-core/bind > > Guess you're lucky. That doesn't make it safe or right. > > > > > + cf->pec_base = devm_ioremap(dpu->dev, pec_base, SZ_16); > > > > + if (!cf->pec_base) > > > > + return -ENOMEM; > > > > + > > > > + cf->base = devm_ioremap(dpu->dev, base, SZ_32); > > > > + if (!cf->base) > > > > + return -ENOMEM; > > > > > > For the same reason, you need to protect any access to a device managed > > > resource (so clocks, registers, regulators, etc.) by a call to > > > drm_dev_enter/drm_dev_exit and you need to call drm_dev_unplug > instead > > > of drm_dev_unregister. > > > > That's a good point. I've tried to do that, but it turns out that the > > display controller cannot be enabled again after binding the dpu-core > > driver manually again. It seems that the display controller requires a > > proper disablement procedure, but the "driver instance overview " kdoc > > mentions the shortcoming of no proper disablement if drm_dev_unplug() > > is used: > > > > """ > > * Drivers that want to support device unplugging (USB, DT overlay unload) > should > > * use drm_dev_unplug() instead of drm_dev_unregister(). The driver must > protect > > * regions that is accessing device resources to prevent use after they're > > * released. This is done using drm_dev_enter() and drm_dev_exit(). There > is one > > * shortcoming however, drm_dev_unplug() marks the drm_device as > unplugged before > > * drm_atomic_helper_shutdown() is called. This means that if the disable > code > > * paths are protected, they will not run on regular driver module unload, > > * possibly leaving the hardware enabled. > > """ > > > > A DPU reset in dpu_core() might be helpful, but I'm not sure if there is any > > reset line provided by the embodying system. > > Generally speaking, you shouldn't rely on the device being in any > particuliar state before your driver loads. So a reset at probe/bind > time is a good idea. Yes. I'll drop the platform device creations for CRTCs from dpu-core.c and drop the aggregation of CRTC components from different DPU instances into one DRM device. This way, there will be only two CRTCs of one DPU in one DRM device. Then, the driver will be simpler and users cannot unbind the driver of one of the two DPU instances, which means drm_dev_unplug() won't be needed any more(?) and the reset issue will be gone. The controller will be shutdown properly through drm_atomic_helper_shutdown() when the driver module is removed. > > > Even if the reset works, the 2nd DPU instance in i.MX8qm would be a > > problem, because it won't be reset or properly disabled if the 1st DPU > > instance is unbound. > > Why it wouldn't be reset? Because dpu_core_remove() is not called for the 2nd DPU instance. Anyway, with the above new design, this doesn't seem to be a problem any more. > > > Although the two DPU instances could be wrapped by two DRM devices, I > > tend not to do that because downstream bridges in future SoCs might be > > able to mux to different DPU instances at runtime. > > > > Due to the disablement issue, can we set drm_dev_enter/exit/unplug > > aside first? > > I'd rather have that figured out prior to merging. I'm assuming that drm_dev_enter/exit/unplug won't be needed with the above new design - one DPU instance wrapped by one DRM device. > > > > > > > +static int dpu_crtc_pm_runtime_put(struct dpu_crtc *dpu_crtc) > > > > +{ > > > > + int ret; > > > > + > > > > + ret = pm_runtime_put(dpu_crtc->dev->parent); > > > > + if (ret < 0) > > > > + dpu_crtc_err(&dpu_crtc->base, > > > > + "failed to put parent device RPM: %d\n", ret); > > > > + > > > > + return ret; > > > > +} > > > > + > > > > +static void dpu_crtc_mode_set_nofb(struct drm_crtc *crtc) > > > > +{ > > > > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > > > > + struct drm_display_mode *adj = &crtc->state->adjusted_mode; > > > > + enum dpu_link_id cf_link; > > > > + > > > > + dpu_crtc_dbg(crtc, "mode " DRM_MODE_FMT "\n", > > > DRM_MODE_ARG(adj)); > > > > + > > > > + /* request power-on when we start to set mode for CRTC */ > > > > + dpu_crtc_pm_runtime_get_sync(dpu_crtc); > > > > > > From the drm_crtc_helper_funcs documentation: > > > > > > """ > > > * Note that the display pipe is completely off when this function is > > > * called. Atomic drivers which need hardware to be running before > > > they > > > * program the new display mode (e.g. because they implement > > > runtime PM) > > > * should not use this hook. This is because the helper library calls > > > * this hook only once per mode change and not every time the > display > > > * pipeline is suspended using either DPMS or the new "ACTIVE" > > > property. > > > * Which means register values set in this callback might get reset > > > when > > > * the CRTC is suspended, but not restored. Such drivers should > > > instead > > > * move all their CRTC setup into the @atomic_enable callback. > > > """ > > > > I can use drm_atomic_helper_commit_tail() but not > > drm_atomic_helper_commit_tail_rpm() because the planes need to be > > updated prior to modeset_enables(where trigger shadow registers in > > plane's HW resources to take effect). Using the former one means that > > RPM needs to be get/put in drm_atomic_helper_commit_planes(), which > > doesn't seem good because in some cases the power of the display > controller > > might be lost after RPM put and I'm not sure if the registers set there will > > be lost or not. So, to avoid the issue the documentation mentions, > > crtc_state->mode_changed is forced to be true in dpu_crtc_atomic_check() > > if the CRTC is changed to active. Is this acceptable? > > No, just move the crtc setup to atomic_enable like the doc suggests. Ok, will do. > > > > > +static void dpu_crtc_atomic_enable(struct drm_crtc *crtc, > > > > + struct drm_atomic_state *state) > > > > +{ > > > > + struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc); > > > > + unsigned long flags; > > > > + > > > > + drm_crtc_vblank_on(crtc); > > > > + > > > > + enable_irq(dpu_crtc->dec_shdld_irq); > > > > + enable_irq(dpu_crtc->ed_cont_shdld_irq); > > > > + enable_irq(dpu_crtc->ed_safe_shdld_irq); > > > > + > > > > + dpu_fg_enable_clock(dpu_crtc->fg); > > > > + dpu_ed_pec_sync_trigger(dpu_crtc->ed_cont); > > > > + dpu_ed_pec_sync_trigger(dpu_crtc->ed_safe); > > > > + if (crtc->state->gamma_lut) > > > > + dpu_crtc_set_gammacor(dpu_crtc); > > > > + else > > > > + dpu_crtc_disable_gammacor(dpu_crtc); > > > > + dpu_fg_shdtokgen(dpu_crtc->fg); > > > > + > > > > + /* don't relinquish CPU until TCON is set to operation mode */ > > > > + local_irq_save(flags); > > > > + preempt_disable(); > > > > + dpu_fg_enable(dpu_crtc->fg); > > > > > > That's super fishy. You shouldn't need that, at all. What is going on > > > there? > > > > This aims to fully workaround the below errata recently released by > > NXP. > > > > ERR010910: DC: Display Subsystem First Frame Programming Restriction > > Link: > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww. > nxp.com%2Fdocs%2Fen%2Ferrata%2FIMX8_1N94W.pdf&data=05%7C01%7C > victor.liu%40nxp.com%7C1250637791414caf559b08dbadeb597e%7C686ea1d > 3bc2b4c6fa92cd99c5c301635%7C0%7C1%7C638294998555649537%7CUnkno > wn%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1h > aWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=mKAtsgN30jqfUiczAFe3 > dXPUZOXhM%2Fnlq9%2F%2B%2F8%2Bjj24%3D&reserved=0 > > > > In short, to make sure the display controller can be enabled properly with > > prefetch engine(DPRC + PRG), the TCON must be switch from bypass mode > > to operation mode after FrameGen(FG) generates the first frame. > > > > Timing is critical here, so local irq and preemption are disabled during the > > switch procedure. > > > > BTW, the driver always use prefetch engines for KMS, although they can > > be bypassed. > > Ok. So I think it would help your driver getting merged to split that > workaround into a separate patch. It's hard to tell what are the > implications of that workaround on your driver when it's lost in the > middle of, well, the driver :) > > I guess it would be much easier for everyone if you submitted that > driver without the errata handling, or with prefetch bypassed, at first. > And then you can submit / rework the hard parts. Ok, I'll drop the prefetch engine support and the errata handling. Also, I'll drop add-on features, like gamma, csc and alpha, to achieve kind of minimal feature set. > > > > > > > > + > > > > + /* > > > > + * TKT320590: > > > > > > Those are NXP internal references as far as as I can tell. They > > > shouldn't be here. > > > > Ok, will change it to be ERR010910. > > A link to the errata description would be a good idea too. Ok, will add a link when errata handling is supported. > > > > > +static void dpu_atomic_put_plane_state(struct drm_atomic_state > *state, > > > > + struct drm_plane *plane) > > > > +{ > > > > + int index = drm_plane_index(plane); > > > > + > > > > + plane->funcs->atomic_destroy_state(plane, state- > >planes[index].state); > > > > + state->planes[index].ptr = NULL; > > > > + state->planes[index].state = NULL; > > > > + state->planes[index].old_state = NULL; > > > > + state->planes[index].new_state = NULL; > > > > + > > > > + drm_modeset_unlock(&plane->mutex); > > > > + > > > > + dpu_plane_dbg(plane, "put state\n"); > > > > +} > > > > + > > > > +static void dpu_atomic_put_crtc_state(struct drm_atomic_state *state, > > > > + struct drm_crtc *crtc) > > > > +{ > > > > + int index = drm_crtc_index(crtc); > > > > + > > > > + crtc->funcs->atomic_destroy_state(crtc, state->crtcs[index].state); > > > > + state->crtcs[index].ptr = NULL; > > > > + state->crtcs[index].state = NULL; > > > > + state->crtcs[index].old_state = NULL; > > > > + state->crtcs[index].new_state = NULL; > > > > + > > > > + drm_modeset_unlock(&crtc->mutex); > > > > + > > > > + dpu_crtc_dbg(crtc, "put state\n"); > > > > +} > > > > + > > > > +static void > > > > +dpu_atomic_put_possible_states_per_crtc(struct drm_crtc_state > > > *crtc_state) > > > > +{ > > > > + struct drm_atomic_state *state = crtc_state->state; > > > > + struct drm_crtc *crtc = crtc_state->crtc; > > > > + struct drm_plane *plane; > > > > + struct drm_plane_state *old_plane_state, *new_plane_state; > > > > + struct dpu_plane_state *old_dpstate, *new_dpstate; > > > > + > > > > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) { > > > > + old_plane_state = drm_atomic_get_old_plane_state(state, > > > plane); > > > > + new_plane_state = drm_atomic_get_new_plane_state(state, > > > plane); > > > > + > > > > + old_dpstate = to_dpu_plane_state(old_plane_state); > > > > + new_dpstate = to_dpu_plane_state(new_plane_state); > > > > + > > > > + /* Should be enough to check the below HW plane resources. > > > */ > > > > + if (old_dpstate->stage.ptr != new_dpstate->stage.ptr || > > > > + old_dpstate->source != new_dpstate->source || > > > > + old_dpstate->blend != new_dpstate->blend) > > > > + return; > > > > + } > > > > + > > > > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) > > > > + dpu_atomic_put_plane_state(state, plane); > > > > + > > > > + dpu_atomic_put_crtc_state(state, crtc); > > > > +} > > > > > > That's super suspicious too. Are you really going around and dropping > > > and destroying plane and crtc states in a global state? > > > > Yes. > > That's really not a good idea. Adding states are fine, dropping ones > aren't. The idea is to add all active planes on two CRTCs into one commit and try to allocate fetchunits for those planes as a whole to best meet user's requirements, because ... > > > > > > > At the very least, you need to describe what this addresses and why you > > > think it's a good solution. > > > > This is the solution to assign HW resources of a plane group to the two > CRTCs > > in one DPU or one CRTC group _dynamically_ at runtime. Dpu.h has some > > comments which hint this: > > > > """ > > /* > > * fetchunit/scaler/layerblend resources of a plane group are > > * shared by the two CRTCs in a CRTC group > > */ > > """ > > > > I can add a DPU display controller block diagram in dpu_kms.c to tell the > HW > > architecture and some SW architecture to clarify this more. > > It's not so much the diagram that I'm looking for, but an accurate > description of the problem. What resource is there, why and how does it > need to be shared, so we can understand what you are doing there, and > possibly suggest other things. ... the problem is that fetchunits(fetchdecode/fetchlayer/fetchwarp) have different capabilities/feature sets and they can be built upon either of the two CRTCs through layerblends. The 4 fetchunits for one DPU display controller are fetchdecode0, fetchdecode1, fetchlayer0 and fetchwarp2. Correspondingly, there are 4 layerblends(0 to 3). Layerblends blend two input sources(primary input and secondary input). The primary input can be, say, constframe or another layerblend's output. The secondary input can be, say, one of those fetchunits. For example, a real use case could be: - CRTC0: Primary plane: Layerblend0: Primary: constframe4 Secondary: fetchlayer0 Overlay plane: Layerblend1: Primary: Layerblend0 output Secondary: fetchdecode1 + vscaler5 + hscaler4 - CRTC1: Primary plane: Layerblend2: Primary: constframe5 Secondary: fetchwarp2 + fetcheco2 Overlay plane: Layerblend3: Primary: Layerblend2 output Secondary: fetchdecode0 + fetcheco0 + vscaler4 Some fetchunit features: - fetchdecode: Horizontoal/vertical downscaling through video processing blocks and YUV fb with two planars(use fetcheco). - fetchwarp: YUV fb with two planars(use fetcheco), up to 8 sub-layers and warp. - fetchlayer: up to 8 sub-layers. All fetchunits support RGB fb. What I do is: - Add all active planes(including primary and overlays) on two CRTCs into one commit even if the user only wants to update the plane(s) on one CRTC. - Those manually added planes/CRTCs are prone to put, because the relevant fetchunits and layerblends are likely/eventually unchanged after the allocation. - Traverse through fetchunit list to try to find an available one to meet a plane's requirements(say CSC? Scalers?). Those prone-to-put planes are handled first to use the current fetchunits if modeset is not allowed, otherwise planes with lower z-positions go first. - Do not allow fetchunit hot-migration between two CRTCs. - Assign layerblend with the lowest possible index to planes. Planes with lower z-positions go first. - Put the prone-to-put planes/CRTC if possible to gain some performance . Regards, Liu Ying > > Maxime
On Tue, Sep 26, 2023 at 03:55:35AM +0000, Ying Liu wrote: > > > > > + cf->pec_base = devm_ioremap(dpu->dev, pec_base, SZ_16); > > > > > + if (!cf->pec_base) > > > > > + return -ENOMEM; > > > > > + > > > > > + cf->base = devm_ioremap(dpu->dev, base, SZ_32); > > > > > + if (!cf->base) > > > > > + return -ENOMEM; > > > > > > > > For the same reason, you need to protect any access to a device managed > > > > resource (so clocks, registers, regulators, etc.) by a call to > > > > drm_dev_enter/drm_dev_exit and you need to call drm_dev_unplug > > instead > > > > of drm_dev_unregister. > > > > > > That's a good point. I've tried to do that, but it turns out that the > > > display controller cannot be enabled again after binding the dpu-core > > > driver manually again. It seems that the display controller requires a > > > proper disablement procedure, but the "driver instance overview " kdoc > > > mentions the shortcoming of no proper disablement if drm_dev_unplug() > > > is used: > > > > > > """ > > > * Drivers that want to support device unplugging (USB, DT overlay unload) > > should > > > * use drm_dev_unplug() instead of drm_dev_unregister(). The driver must > > protect > > > * regions that is accessing device resources to prevent use after they're > > > * released. This is done using drm_dev_enter() and drm_dev_exit(). There > > is one > > > * shortcoming however, drm_dev_unplug() marks the drm_device as > > unplugged before > > > * drm_atomic_helper_shutdown() is called. This means that if the disable > > code > > > * paths are protected, they will not run on regular driver module unload, > > > * possibly leaving the hardware enabled. > > > """ > > > > > > A DPU reset in dpu_core() might be helpful, but I'm not sure if there is any > > > reset line provided by the embodying system. > > > > Generally speaking, you shouldn't rely on the device being in any > > particuliar state before your driver loads. So a reset at probe/bind > > time is a good idea. > > Yes. I'll drop the platform device creations for CRTCs from dpu-core.c > and drop the aggregation of CRTC components from different DPU > instances into one DRM device. This way, there will be only two CRTCs > of one DPU in one DRM device. Ok. > Then, the driver will be simpler and users cannot unbind the driver of > one of the two DPU instances, Uh? They would still be able to do that. > which means drm_dev_unplug() won't be needed any more(?) So this would still be needed > and the reset issue will be gone. The controller will be shutdown > properly through drm_atomic_helper_shutdown() when the driver module > is removed. Again, you shouldn't rely on a particular state at boot. For all you know, you could have been reset by some watchdog or been kexec'd. > > > Even if the reset works, the 2nd DPU instance in i.MX8qm would be a > > > problem, because it won't be reset or properly disabled if the 1st DPU > > > instance is unbound. > > > > Why it wouldn't be reset? > > Because dpu_core_remove() is not called for the 2nd DPU instance. > Anyway, with the above new design, this doesn't seem to be a problem > any more. Ok. > > > > > Although the two DPU instances could be wrapped by two DRM devices, I > > > tend not to do that because downstream bridges in future SoCs might be > > > able to mux to different DPU instances at runtime. > > > > > > Due to the disablement issue, can we set drm_dev_enter/exit/unplug > > > aside first? > > > > I'd rather have that figured out prior to merging. > > I'm assuming that drm_dev_enter/exit/unplug won't be needed with the above > new design - one DPU instance wrapped by one DRM device. I'm not sure why you are making that claim. And again, that's good practice: it does no harm while preventing unsafe behaviour in the future. > > > > > +static void dpu_atomic_put_plane_state(struct drm_atomic_state > > *state, > > > > > + struct drm_plane *plane) > > > > > +{ > > > > > + int index = drm_plane_index(plane); > > > > > + > > > > > + plane->funcs->atomic_destroy_state(plane, state- > > >planes[index].state); > > > > > + state->planes[index].ptr = NULL; > > > > > + state->planes[index].state = NULL; > > > > > + state->planes[index].old_state = NULL; > > > > > + state->planes[index].new_state = NULL; > > > > > + > > > > > + drm_modeset_unlock(&plane->mutex); > > > > > + > > > > > + dpu_plane_dbg(plane, "put state\n"); > > > > > +} > > > > > + > > > > > +static void dpu_atomic_put_crtc_state(struct drm_atomic_state *state, > > > > > + struct drm_crtc *crtc) > > > > > +{ > > > > > + int index = drm_crtc_index(crtc); > > > > > + > > > > > + crtc->funcs->atomic_destroy_state(crtc, state->crtcs[index].state); > > > > > + state->crtcs[index].ptr = NULL; > > > > > + state->crtcs[index].state = NULL; > > > > > + state->crtcs[index].old_state = NULL; > > > > > + state->crtcs[index].new_state = NULL; > > > > > + > > > > > + drm_modeset_unlock(&crtc->mutex); > > > > > + > > > > > + dpu_crtc_dbg(crtc, "put state\n"); > > > > > +} > > > > > + > > > > > +static void > > > > > +dpu_atomic_put_possible_states_per_crtc(struct drm_crtc_state > > > > *crtc_state) > > > > > +{ > > > > > + struct drm_atomic_state *state = crtc_state->state; > > > > > + struct drm_crtc *crtc = crtc_state->crtc; > > > > > + struct drm_plane *plane; > > > > > + struct drm_plane_state *old_plane_state, *new_plane_state; > > > > > + struct dpu_plane_state *old_dpstate, *new_dpstate; > > > > > + > > > > > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) { > > > > > + old_plane_state = drm_atomic_get_old_plane_state(state, > > > > plane); > > > > > + new_plane_state = drm_atomic_get_new_plane_state(state, > > > > plane); > > > > > + > > > > > + old_dpstate = to_dpu_plane_state(old_plane_state); > > > > > + new_dpstate = to_dpu_plane_state(new_plane_state); > > > > > + > > > > > + /* Should be enough to check the below HW plane resources. > > > > */ > > > > > + if (old_dpstate->stage.ptr != new_dpstate->stage.ptr || > > > > > + old_dpstate->source != new_dpstate->source || > > > > > + old_dpstate->blend != new_dpstate->blend) > > > > > + return; > > > > > + } > > > > > + > > > > > + drm_atomic_crtc_state_for_each_plane(plane, crtc_state) > > > > > + dpu_atomic_put_plane_state(state, plane); > > > > > + > > > > > + dpu_atomic_put_crtc_state(state, crtc); > > > > > +} > > > > > > > > That's super suspicious too. Are you really going around and dropping > > > > and destroying plane and crtc states in a global state? > > > > > > Yes. > > > > That's really not a good idea. Adding states are fine, dropping ones > > aren't. > > The idea is to add all active planes on two CRTCs into one commit and > try to allocate fetchunits for those planes as a whole to best meet user's > requirements, because ... > > > > > > > > > > > At the very least, you need to describe what this addresses and why you > > > > think it's a good solution. > > > > > > This is the solution to assign HW resources of a plane group to the two > > CRTCs > > > in one DPU or one CRTC group _dynamically_ at runtime. Dpu.h has some > > > comments which hint this: > > > > > > """ > > > /* > > > * fetchunit/scaler/layerblend resources of a plane group are > > > * shared by the two CRTCs in a CRTC group > > > */ > > > """ > > > > > > I can add a DPU display controller block diagram in dpu_kms.c to tell the > > HW > > > architecture and some SW architecture to clarify this more. > > > > It's not so much the diagram that I'm looking for, but an accurate > > description of the problem. What resource is there, why and how does it > > need to be shared, so we can understand what you are doing there, and > > possibly suggest other things. > > ... the problem is that fetchunits(fetchdecode/fetchlayer/fetchwarp) have > different capabilities/feature sets and they can be built upon either of the > two CRTCs through layerblends. The 4 fetchunits for one DPU display > controller are fetchdecode0, fetchdecode1, fetchlayer0 and fetchwarp2. > Correspondingly, there are 4 layerblends(0 to 3). Layerblends blend two > input sources(primary input and secondary input). The primary input can > be, say, constframe or another layerblend's output. The secondary input > can be, say, one of those fetchunits. For example, a real use case could > be: > - CRTC0: > Primary plane: > Layerblend0: > Primary: constframe4 > Secondary: fetchlayer0 > Overlay plane: > Layerblend1: > Primary: Layerblend0 output > Secondary: fetchdecode1 + vscaler5 + hscaler4 > > - CRTC1: > Primary plane: > Layerblend2: > Primary: constframe5 > Secondary: fetchwarp2 + fetcheco2 > Overlay plane: > Layerblend3: > Primary: Layerblend2 output > Secondary: fetchdecode0 + fetcheco0 + vscaler4 > > Some fetchunit features: > - fetchdecode: Horizontoal/vertical downscaling through video > processing blocks and YUV fb with two planars(use fetcheco). > - fetchwarp: YUV fb with two planars(use fetcheco), up to > 8 sub-layers and warp. > - fetchlayer: up to 8 sub-layers. > > All fetchunits support RGB fb. > > > What I do is: > - Add all active planes(including primary and overlays) on two CRTCs > into one commit even if the user only wants to update the plane(s) > on one CRTC. > - Those manually added planes/CRTCs are prone to put, because > the relevant fetchunits and layerblends are likely/eventually > unchanged after the allocation. > - Traverse through fetchunit list to try to find an available one to > meet a plane's requirements(say CSC? Scalers?). Those prone-to-put > planes are handled first to use the current fetchunits if modeset > is not allowed, otherwise planes with lower z-positions go first. > - Do not allow fetchunit hot-migration between two CRTCs. > - Assign layerblend with the lowest possible index to planes. Planes > with lower z-positions go first. > - Put the prone-to-put planes/CRTC if possible to gain some > performance . So, you shouldn't do it the way you did it so far, but if I got you right, this seems similar to what we have on vc4 where all planes go through another device (called HVS) that we maintain a global state for. That way, any plane change will pull that global state in, and you are getting a global view of what resources are being used in the system. See vc4_pv_muxing_atomic_check() for an example. Maxime