Message ID | 20220511112438.1251024-1-javierm@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | Fix some races between sysfb device registration and drivers probe | expand |
Hi Javier, thanks again for providing the examples. I think I now better get what you're trying to solve. First of all let's merge patch 3, as it seems unrelated. For the other patches, I'd like to take a step back and try to solve the broader problem. IIRC we talked about this briefly already. From my understanding, the problem of the current code is that removal of the firmware device is build around drivers (either DRM or fbdev). Everything works fine if a driver is bound to the firmware device and the native driver can remove it. In other case, we have to manually disable sysfb and force-remove unbound firmware devices. And the patchset doesn't even cover firmware devices that come from DT nodes. But what we really want is to track resource ownership based on devices. When we add a firmware device (via sysfb or DT), we immediately add it to a list of firmware devices. When the native driver arrives, it can remove the firmware device even if no driver has been bound. We can also operate in the other way if the native drivers implicitly reserves the framebuffer range. If sysfb would try to register a firmware device in that range, he can let it fail. No more need to fully disable sysfb on the first arriving native device. We already track the memory ranges in drm aperture helpers. We'd need to move the code to a more prominent location (e.g., <linux/aperture.h>) and change fbdev to use it. Sysfb and DT code needs to insert platform devices upon creation. We can then implement the more fancy stuff, such as tracking native-device memory. (And if we ever get to fix all usage of Linux' request-mem-region, we can even move all the functionality there). Best regards Thomas Am 11.05.22 um 13:24 schrieb Javier Martinez Canillas: > Hello, > > The patches in this series contain mostly changes suggested by Daniel Vetter > Thomas Zimmermann. They aim to fix existing races between the Generic System > Framebuffer (sysfb) infrastructure and the fbdev and DRM device registration. > > For example, it is currently possible for sysfb to register a platform > device after a real DRM driver was registered and requested to remove the > conflicting framebuffers. Or is possible for a simple{fb,drm} to match with > a device previously registered by sysfb, even after a real driver is present. > > A symptom of this issue, was worked around with the commit fb561bf9abde > ("fbdev: Prevent probing generic drivers if a FB is already registered") > but that's really a hack and should be reverted instead. > > This series attempt to fix it more correctly and revert the mentioned hack. > That will also allow to make the num_registered_fb variable not visible to > drivers anymore, since that's internal to fbdev core. > > Pach 1 is just a simple cleanup in preparation for later patches. > > Patch 2 add a sysfb_disable() and sysfb_try_unregister() helpers to allow > disabling sysfb and attempt to unregister registered devices respectively. > > Patch 3 changes how is dealt with conflicting framebuffers unregistering, > rather than having a variable to determine if a lock should be take, it > just drops the lock before unregistering the platform device. > > Patch 4 changes the fbdev core to not attempt to unregister devices that > were registered by sysfb, let the same code doing the registration to also > handle the unregistration. > > Patch 5 fixes the race that exists between sysfb devices registration and > fbdev framebuffer devices registration, by disabling the sysfb when a DRM > or fbdev driver requests to remove conflicting framebuffers. > > Patch 6 is the revert patch that was posted by Daniel before but dropped > from his set and finally patch 7 is the one that makes num_registered_fb > private to fbmem.c, to not allow drivers to use it anymore. > > The patches were tested on a rpi4 with the vc4, simpledrm and simplefb > drivers, using different combinations of built-in and as a module. > > For example, having simpledrm as a module and loading it after the vc4 > driver probed would not register a DRM device, which happens now without > the patches from this series. > > Best regards, > Javier > > Changes in v5: > - Move the sysfb_disable() call at conflicting framebuffers again to > avoid the need of a DRIVER_FIRMWARE capability flag. > - Add Daniel Vetter's Reviewed-by tag again since reverted to the old > patch that he already reviewed in v2. > - Drop patches that added a DRM_FIRMWARE capability and use them > since the case those prevented could be ignored (Daniel Vetter). > > Changes in v4: > - Make sysfb_disable() to also attempt to unregister a device. > - Drop call to sysfb_disable() in fbmem since is done in other places now. > - Add patch to make registered_fb[] private. > - Add patches that introduce the DRM_FIRMWARE capability and usage. > > Changes in v3: > - Add Thomas Zimmermann's Reviewed-by tag to patch #1. > - Call sysfb_disable() when a DRM dev and a fbdev are registered rather > than when conflicting framebuffers are removed (Thomas Zimmermann). > - Call sysfb_disable() when a fbdev framebuffer is registered rather > than when conflicting framebuffers are removed (Thomas Zimmermann). > - Drop Daniel Vetter's Reviewed-by tag since patch changed a lot. > - Rebase on top of latest drm-misc-next branch. > > Changes in v2: > - Rebase on top of latest drm-misc-next and fix conflicts (Daniel Vetter). > - Add kernel-doc comments and include in other_interfaces.rst (Daniel Vetter). > - Explain in the commit message that fbmem has to unregister the device > as fallback if a driver registered the device itself (Daniel Vetter). > - Also explain that fallback in a comment in the code (Daniel Vetter). > - Don't encode in fbmem the assumption that sysfb will always register > platform devices (Daniel Vetter). > - Add a FIXME comment about drivers registering devices (Daniel Vetter). > - Explain in the commit message that fbmem has to unregister the device > as fallback if a driver registered the device itself (Daniel Vetter). > - Also explain that fallback in a comment in the code (Daniel Vetter). > - Don't encode in fbmem the assumption that sysfb will always register > platform devices (Daniel Vetter). > - Add a FIXME comment about drivers registering devices (Daniel Vetter). > - Drop RFC prefix since patches were already reviewed by Daniel Vetter. > - Add Daniel Reviewed-by tags to the patches. > > Daniel Vetter (2): > Revert "fbdev: Prevent probing generic drivers if a FB is already > registered" > fbdev: Make registered_fb[] private to fbmem.c > > Javier Martinez Canillas (5): > firmware: sysfb: Make sysfb_create_simplefb() return a pdev pointer > firmware: sysfb: Add helpers to unregister a pdev and disable > registration > fbdev: Restart conflicting fb removal loop when unregistering devices > fbdev: Make sysfb to unregister its own registered devices > fbdev: Disable sysfb device registration when removing conflicting FBs > > .../driver-api/firmware/other_interfaces.rst | 6 ++ > drivers/firmware/sysfb.c | 91 +++++++++++++++++-- > drivers/firmware/sysfb_simplefb.c | 16 ++-- > drivers/video/fbdev/core/fbmem.c | 67 +++++++++++--- > drivers/video/fbdev/efifb.c | 11 --- > drivers/video/fbdev/simplefb.c | 11 --- > include/linux/fb.h | 8 +- > include/linux/sysfb.h | 29 +++++- > 8 files changed, 178 insertions(+), 61 deletions(-) >
Hello Thomas, Thanks for your feedback and comments. On 5/13/22 12:48, Thomas Zimmermann wrote: > Hi Javier, > > thanks again for providing the examples. I think I now better get what > you're trying to solve. > You are welcome. > First of all let's merge patch 3, as it seems unrelated. > Agreed. I can push it to drm-misc-next. > For the other patches, I'd like to take a step back and try to solve the > broader problem. IIRC we talked about this briefly already. > Yes, that's what we discussed on IRC some time ago. > From my understanding, the problem of the current code is that removal > of the firmware device is build around drivers (either DRM or fbdev). > Everything works fine if a driver is bound to the firmware device and > the native driver can remove it. In other case, we have to manually And this is the common case, since most kernels will have some driver that binds to a platform device registered to provide the firmware FB. > disable sysfb and force-remove unbound firmware devices. And the > patchset doesn't even cover firmware devices that come from DT nodes. > Indeed. > But what we really want is to track resource ownership based on devices. > When we add a firmware device (via sysfb or DT), we immediately add it > to a list of firmware devices. When the native driver arrives, it can > remove the firmware device even if no driver has been bound. > > We can also operate in the other way if the native drivers implicitly > reserves the framebuffer range. If sysfb would try to register a > firmware device in that range, he can let it fail. No more need to fully > disable sysfb on the first arriving native device. > > We already track the memory ranges in drm aperture helpers. We'd need to > move the code to a more prominent location (e.g., <linux/aperture.h>) > and change fbdev to use it. Sysfb and DT code needs to insert platform > devices upon creation. We can then implement the more fancy stuff, such > as tracking native-device memory. (And if we ever get to fix all usage > of Linux' request-mem-region, we can even move all the functionality there). > Agreed. And as mentioned, the race that these patches attempt to fix are for the less common case when a native driver probes but either no generic driver registered a framebuffer yet or the platform device wasn't registered yet. But this can only happen if for example a native driver is built-in but the generic driver is build as a module, which is not the common configuration. What most distros do is the opposite, to have {simple,of,efi,vesa}fb or simpledrm built-in and the native drivers built as modules. So there's no rush to fix this by piling more hacks on top of the ones we already have and instead try to fix it more properly as you suggested.
Hi Javier Am 13.05.22 um 13:10 schrieb Javier Martinez Canillas: ... >> We already track the memory ranges in drm aperture helpers. We'd need to >> move the code to a more prominent location (e.g., <linux/aperture.h>) >> and change fbdev to use it. Sysfb and DT code needs to insert platform >> devices upon creation. We can then implement the more fancy stuff, such >> as tracking native-device memory. (And if we ever get to fix all usage >> of Linux' request-mem-region, we can even move all the functionality there). >> > > Agreed. And as mentioned, the race that these patches attempt to fix are for > the less common case when a native driver probes but either no generic driver > registered a framebuffer yet or the platform device wasn't registered yet. > > But this can only happen if for example a native driver is built-in but the > generic driver is build as a module, which is not the common configuration. > > What most distros do is the opposite, to have {simple,of,efi,vesa}fb or > simpledrm built-in and the native drivers built as modules. > > So there's no rush to fix this by piling more hacks on top of the ones we > already have and instead try to fix it more properly as you suggested. > A first step would be to use DRM's aperture helpers in fbdev. That would be a good idea anyway, as it would simplify both. You already mentioned some API changes to make aperture helpers DRM-independent. The affected fbdev drivers use platform devices, so this should be easy. Aperture helpers have something like the registration_lock. [1] I don't know if we need to recreate patch 3 for this as well. If we absolutely need some special detachment handling for fbdev, we can make devm_aperture_aquire() a public interface. The detach helper is provided by the caller. Best regards Thomas [1] https://elixir.bootlin.com/linux/v5.17.6/source/drivers/gpu/drm/drm_aperture.c#L254 [2] https://elixir.bootlin.com/linux/v5.17.6/source/drivers/gpu/drm/drm_aperture.c#L159