Message ID | 20231108024613.2898921-1-chenhuacai@loongson.cn (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/Makefile: Move tiny drivers before native drivers | expand |
Hi, thanks for the patch. Am 08.11.23 um 03:46 schrieb Huacai Chen: > After commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from > device_initcall to subsys_initcall_sync") some Lenovo laptops get a blank > screen until the display manager starts. > > This regression occurs with such a Kconfig combination: > CONFIG_SYSFB=y > CONFIG_SYSFB_SIMPLEFB=y > CONFIG_DRM_SIMPLEDRM=y > CONFIG_DRM_I915=y # Or other native drivers such as radeon, amdgpu > > If replace CONFIG_DRM_SIMPLEDRM with CONFIG_FB_SIMPLE (they use the same > device), there is no blank screen. The root cause is the initialization > order, and this order depends on the Makefile. > > FB_SIMPLE is before native DRM drivers (e.g. i915, radeon, amdgpu, and > so on), but DRM_SIMPLEDRM is after them. Thus, if we use FB_SIMPLE, I915 > will takeover FB_SIMPLE, then no problem; and if we use DRM_SIMPLEDRM, > DRM_SIMPLEDRM will try to takeover I915, but fails to work. But what exactly is the problem? From the lengthy discussion threat, it looks like you've stumbled across a long-known problem, where the firmware driver probes a device that has already been taken by a native driver. But that should not be possible. As you know, there's a platform device that represents the firmware framebuffer. The firmware drivers, such as simpledrm, bind to it. In i915 and the other native drivers we remove that platform device, so that simpledrm does not run. We call the DRM aperture helpers at [1]. It's implemented at [2]. The function contains a call to sysfb_disable(), [3] which should be invoked for the i915 device and remove the platform device. [1] https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/i915/i915_driver.c#L489 [2] https://elixir.bootlin.com/linux/v6.5/source/drivers/video/aperture.c#L347 [3] https://elixir.bootlin.com/linux/v6.5/source/drivers/firmware/sysfb.c#L63 Can you investigate why this does not work? Is sysfb_disable() not being called? Does it remove the platform device? > > So we can move the "tiny" directory before native DRM drivers to solve > this problem. Relying on linking order is just as unreliable. The usual workaround is to build native drivers as modules. But first, please investigate where the current code fails. Best regards Thomas > > Fixes: 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") > Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t > Reported-by: Jaak Ristioja <jaak@ristioja.ee> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> > --- > drivers/gpu/drm/Makefile | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile > index 8e1bde059170..db0f3d3aff43 100644 > --- a/drivers/gpu/drm/Makefile > +++ b/drivers/gpu/drm/Makefile > @@ -141,6 +141,7 @@ obj-y += arm/ > obj-y += display/ > obj-$(CONFIG_DRM_TTM) += ttm/ > obj-$(CONFIG_DRM_SCHED) += scheduler/ > +obj-y += tiny/ > obj-$(CONFIG_DRM_RADEON)+= radeon/ > obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/ > obj-$(CONFIG_DRM_AMDGPU)+= amd/amdxcp/ > @@ -182,7 +183,6 @@ obj-$(CONFIG_DRM_FSL_DCU) += fsl-dcu/ > obj-$(CONFIG_DRM_ETNAVIV) += etnaviv/ > obj-y += hisilicon/ > obj-y += mxsfb/ > -obj-y += tiny/ > obj-$(CONFIG_DRM_PL111) += pl111/ > obj-$(CONFIG_DRM_TVE200) += tve200/ > obj-$(CONFIG_DRM_XEN) += xen/
Hello, On Wed, Nov 8, 2023 at 9:14 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > Hi, > [...] > > Relying on linking order is just as unreliable. The usual workaround is > to build native drivers as modules. But first, please investigate where > the current code fails. > I fully agree with Thomas here. This is just papering over the issue. I'll read the lengthy thread now to see if I can better understand what's going on here.
Hi, Thomas, On Wed, Nov 8, 2023 at 4:14 PM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > Hi, > > thanks for the patch. > > Am 08.11.23 um 03:46 schrieb Huacai Chen: > > After commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from > > device_initcall to subsys_initcall_sync") some Lenovo laptops get a blank > > screen until the display manager starts. > > > > This regression occurs with such a Kconfig combination: > > CONFIG_SYSFB=y > > CONFIG_SYSFB_SIMPLEFB=y > > CONFIG_DRM_SIMPLEDRM=y > > CONFIG_DRM_I915=y # Or other native drivers such as radeon, amdgpu > > > > If replace CONFIG_DRM_SIMPLEDRM with CONFIG_FB_SIMPLE (they use the same > > device), there is no blank screen. The root cause is the initialization > > order, and this order depends on the Makefile. > > > > FB_SIMPLE is before native DRM drivers (e.g. i915, radeon, amdgpu, and > > so on), but DRM_SIMPLEDRM is after them. Thus, if we use FB_SIMPLE, I915 > > will takeover FB_SIMPLE, then no problem; and if we use DRM_SIMPLEDRM, > > DRM_SIMPLEDRM will try to takeover I915, but fails to work. > > But what exactly is the problem? From the lengthy discussion threat, it > looks like you've stumbled across a long-known problem, where the > firmware driver probes a device that has already been taken by a native > driver. But that should not be possible. Yes, it is a long-known problem but only exposed after commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync"), because the platform device for simpledrm was not created as early as possible in old kernels. > > As you know, there's a platform device that represents the firmware > framebuffer. The firmware drivers, such as simpledrm, bind to it. In > i915 and the other native drivers we remove that platform device, so > that simpledrm does not run. > > We call the DRM aperture helpers at [1]. It's implemented at [2]. The > function contains a call to sysfb_disable(), [3] which should be invoked > for the i915 device and remove the platform device. Yes, this looks a little strange, which I haven't investigated before. Jaak, could you please try to see whether sysfb_disable() is called in bad kernels? > > [1] > https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/i915/i915_driver.c#L489 > [2] > https://elixir.bootlin.com/linux/v6.5/source/drivers/video/aperture.c#L347 > [3] > https://elixir.bootlin.com/linux/v6.5/source/drivers/firmware/sysfb.c#L63 > > Can you investigate why this does not work? Is sysfb_disable() not being > called? Does it remove the platform device? > > > > > So we can move the "tiny" directory before native DRM drivers to solve > > this problem. > > Relying on linking order is just as unreliable. The usual workaround is > to build native drivers as modules. But first, please investigate where > the current code fails. Yes, native drivers are usually built as modules, but Jaak tries to built-in, and then reports this bug. :) Huacai > > Best regards > Thomas > > > > > Fixes: 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") > > Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t > > Reported-by: Jaak Ristioja <jaak@ristioja.ee> > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> > > --- > > drivers/gpu/drm/Makefile | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile > > index 8e1bde059170..db0f3d3aff43 100644 > > --- a/drivers/gpu/drm/Makefile > > +++ b/drivers/gpu/drm/Makefile > > @@ -141,6 +141,7 @@ obj-y += arm/ > > obj-y += display/ > > obj-$(CONFIG_DRM_TTM) += ttm/ > > obj-$(CONFIG_DRM_SCHED) += scheduler/ > > +obj-y += tiny/ > > obj-$(CONFIG_DRM_RADEON)+= radeon/ > > obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/ > > obj-$(CONFIG_DRM_AMDGPU)+= amd/amdxcp/ > > @@ -182,7 +183,6 @@ obj-$(CONFIG_DRM_FSL_DCU) += fsl-dcu/ > > obj-$(CONFIG_DRM_ETNAVIV) += etnaviv/ > > obj-y += hisilicon/ > > obj-y += mxsfb/ > > -obj-y += tiny/ > > obj-$(CONFIG_DRM_PL111) += pl111/ > > obj-$(CONFIG_DRM_TVE200) += tve200/ > > obj-$(CONFIG_DRM_XEN) += xen/ > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Frankenstrasse 146, 90461 Nuernberg, Germany > GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman > HRB 36809 (AG Nuernberg)
Hi, Javier, On Wed, Nov 8, 2023 at 4:24 PM Javier Martinez Canillas <javierm@redhat.com> wrote: > > Hello, > > On Wed, Nov 8, 2023 at 9:14 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > > > Hi, > > > > [...] > > > > > Relying on linking order is just as unreliable. The usual workaround is > > to build native drivers as modules. But first, please investigate where > > the current code fails. > > > > I fully agree with Thomas here. This is just papering over the issue. > > I'll read the lengthy thread now to see if I can better understand > what's going on here. Thank you very much. Huacai > -- > Best regards, > > Javier Martinez Canillas > Core Platforms > Red Hat >
Hi, Javier, On Wed, Nov 8, 2023 at 4:24 PM Javier Martinez Canillas <javierm@redhat.com> wrote: > > Hello, > > On Wed, Nov 8, 2023 at 9:14 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > > > Hi, > > > > [...] > > > > > Relying on linking order is just as unreliable. The usual workaround is > > to build native drivers as modules. But first, please investigate where > > the current code fails. > > > > I fully agree with Thomas here. This is just papering over the issue. > > I'll read the lengthy thread now to see if I can better understand > what's going on here. Have you understood enough now? I really don't want the original patch to be reverted. And Jaak, could you please test with the below patch (but keep the original order in Makefile) and then give me the dmesg output? diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c index 561be8feca96..cc2e39fb98f5 100644 --- a/drivers/video/aperture.c +++ b/drivers/video/aperture.c @@ -350,21 +350,29 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na resource_size_t base, size; int bar, ret = 0; - if (pdev == vga_default_device()) + printk("DEBUG: remove 1\n"); + + if (pdev == vga_default_device()) { + printk("DEBUG: primary = true\n"); primary = true; + } - if (primary) + if (primary) { + printk("DEBUG: disable sysfb\n"); sysfb_disable(); + } for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) continue; + printk("DEBUG: remove 2\n"); base = pci_resource_start(pdev, bar); size = pci_resource_len(pdev, bar); aperture_detach_devices(base, size); } + printk("DEBUG: remove 3\n"); /* * If this is the primary adapter, there could be a VGA device * that consumes the VGA framebuffer I/O range. Remove this [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u > -- > Best regards, > > Javier Martinez Canillas > Core Platforms > Red Hat >
Huacai Chen <chenhuacai@kernel.org> writes: Hello Huacai, > Hi, Javier, > > On Wed, Nov 8, 2023 at 4:24 PM Javier Martinez Canillas > <javierm@redhat.com> wrote: >> >> Hello, >> >> On Wed, Nov 8, 2023 at 9:14 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: >> > >> > Hi, >> > >> >> [...] >> >> > >> > Relying on linking order is just as unreliable. The usual workaround is >> > to build native drivers as modules. But first, please investigate where >> > the current code fails. >> > >> >> I fully agree with Thomas here. This is just papering over the issue. >> >> I'll read the lengthy thread now to see if I can better understand >> what's going on here. > Have you understood enough now? I really don't want the original patch > to be reverted. > I discussed this with Thomas but we didn't fully understand what was going on. In theory, it should work since the native driver should disable sysfb and remove the registered platform device. But it seems that this does not happen for Jaak and others who reported the same issue. Something that we noticed is that PCI fixups happen in fs_initcall_sync() and since the sysfb_init() should happen after the PCI subsystem for EFI quirks, we think that at least should be moved after that initcall level. That means rootfs_initcall() onwards, and that takes it almost at the same before your patch. The safest would be to move sysfb_init() to initcall device_initcall_sync() and make sure that happens after all the native DRM drivers, since module_init() happens at device_initcall(). I think that Thomas meant to send a patch to do that change.
Hi, Javier, On Mon, Dec 11, 2023 at 4:33 PM Javier Martinez Canillas <javierm@redhat.com> wrote: > > Huacai Chen <chenhuacai@kernel.org> writes: > > Hello Huacai, > > > Hi, Javier, > > > > On Wed, Nov 8, 2023 at 4:24 PM Javier Martinez Canillas > > <javierm@redhat.com> wrote: > >> > >> Hello, > >> > >> On Wed, Nov 8, 2023 at 9:14 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > >> > > >> > Hi, > >> > > >> > >> [...] > >> > >> > > >> > Relying on linking order is just as unreliable. The usual workaround is > >> > to build native drivers as modules. But first, please investigate where > >> > the current code fails. > >> > > >> > >> I fully agree with Thomas here. This is just papering over the issue. > >> > >> I'll read the lengthy thread now to see if I can better understand > >> what's going on here. > > Have you understood enough now? I really don't want the original patch > > to be reverted. > > > > I discussed this with Thomas but we didn't fully understand what was going > on. In theory, it should work since the native driver should disable sysfb > and remove the registered platform device. But it seems that this does not > happen for Jaak and others who reported the same issue. > > Something that we noticed is that PCI fixups happen in fs_initcall_sync() > and since the sysfb_init() should happen after the PCI subsystem for EFI > quirks, we think that at least should be moved after that initcall level. > > That means rootfs_initcall() onwards, and that takes it almost at the same > before your patch. The safest would be to move sysfb_init() to initcall > device_initcall_sync() and make sure that happens after all the native DRM > drivers, since module_init() happens at device_initcall(). > > I think that Thomas meant to send a patch to do that change. Thank you very much. I guess things may be like this: i915 init at first, then simpledrm init in parallel and finished before i915 call sysfb_disable(), so in my previous reply I provide a debug patch for Jaak to see what happens. Huacai > > -- > Best regards, > > Javier Martinez Canillas > Core Platforms > Red Hat >
Huacai Chen <chenhuacai@kernel.org> writes: > Hi, Javier, > > On Mon, Dec 11, 2023 at 4:33 PM Javier Martinez Canillas > <javierm@redhat.com> wrote: >> >> Huacai Chen <chenhuacai@kernel.org> writes: >> >> Hello Huacai, >> >> > Hi, Javier, >> > >> > On Wed, Nov 8, 2023 at 4:24 PM Javier Martinez Canillas >> > <javierm@redhat.com> wrote: >> >> >> >> Hello, >> >> >> >> On Wed, Nov 8, 2023 at 9:14 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: >> >> > >> >> > Hi, >> >> > >> >> >> >> [...] >> >> >> >> > >> >> > Relying on linking order is just as unreliable. The usual workaround is >> >> > to build native drivers as modules. But first, please investigate where >> >> > the current code fails. >> >> > >> >> >> >> I fully agree with Thomas here. This is just papering over the issue. >> >> >> >> I'll read the lengthy thread now to see if I can better understand >> >> what's going on here. >> > Have you understood enough now? I really don't want the original patch >> > to be reverted. >> > >> >> I discussed this with Thomas but we didn't fully understand what was going >> on. In theory, it should work since the native driver should disable sysfb >> and remove the registered platform device. But it seems that this does not >> happen for Jaak and others who reported the same issue. >> >> Something that we noticed is that PCI fixups happen in fs_initcall_sync() >> and since the sysfb_init() should happen after the PCI subsystem for EFI >> quirks, we think that at least should be moved after that initcall level. >> >> That means rootfs_initcall() onwards, and that takes it almost at the same >> before your patch. The safest would be to move sysfb_init() to initcall >> device_initcall_sync() and make sure that happens after all the native DRM >> drivers, since module_init() happens at device_initcall(). >> >> I think that Thomas meant to send a patch to do that change. > Thank you very much. I guess things may be like this: > i915 init at first, then simpledrm init in parallel and finished > before i915 call sysfb_disable(), so in my previous reply I provide a > debug patch for Jaak to see what happens. > Yes, specially with async probing although neither i915 nor simpledrm use it right now AFAICT. Is still unclear to me what's going on in this particular case, although moving it to device_initcall_sync() seems to be the correct thing to do regardless of this issue.
Hi, Javier, On Mon, Dec 11, 2023 at 5:17 PM Javier Martinez Canillas <javierm@redhat.com> wrote: > > Huacai Chen <chenhuacai@kernel.org> writes: > > > Hi, Javier, > > > > On Mon, Dec 11, 2023 at 4:33 PM Javier Martinez Canillas > > <javierm@redhat.com> wrote: > >> > >> Huacai Chen <chenhuacai@kernel.org> writes: > >> > >> Hello Huacai, > >> > >> > Hi, Javier, > >> > > >> > On Wed, Nov 8, 2023 at 4:24 PM Javier Martinez Canillas > >> > <javierm@redhat.com> wrote: > >> >> > >> >> Hello, > >> >> > >> >> On Wed, Nov 8, 2023 at 9:14 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > >> >> > > >> >> > Hi, > >> >> > > >> >> > >> >> [...] > >> >> > >> >> > > >> >> > Relying on linking order is just as unreliable. The usual workaround is > >> >> > to build native drivers as modules. But first, please investigate where > >> >> > the current code fails. > >> >> > > >> >> > >> >> I fully agree with Thomas here. This is just papering over the issue. > >> >> > >> >> I'll read the lengthy thread now to see if I can better understand > >> >> what's going on here. > >> > Have you understood enough now? I really don't want the original patch > >> > to be reverted. > >> > > >> > >> I discussed this with Thomas but we didn't fully understand what was going > >> on. In theory, it should work since the native driver should disable sysfb > >> and remove the registered platform device. But it seems that this does not > >> happen for Jaak and others who reported the same issue. > >> > >> Something that we noticed is that PCI fixups happen in fs_initcall_sync() > >> and since the sysfb_init() should happen after the PCI subsystem for EFI > >> quirks, we think that at least should be moved after that initcall level. > >> > >> That means rootfs_initcall() onwards, and that takes it almost at the same > >> before your patch. The safest would be to move sysfb_init() to initcall > >> device_initcall_sync() and make sure that happens after all the native DRM > >> drivers, since module_init() happens at device_initcall(). > >> > >> I think that Thomas meant to send a patch to do that change. > > Thank you very much. I guess things may be like this: > > i915 init at first, then simpledrm init in parallel and finished > > before i915 call sysfb_disable(), so in my previous reply I provide a > > debug patch for Jaak to see what happens. > > > > Yes, specially with async probing although neither i915 nor simpledrm use > it right now AFAICT. > > Is still unclear to me what's going on in this particular case, although > moving it to device_initcall_sync() seems to be the correct thing to do > regardless of this issue. But that cannot "make the screen display as early as possible". Maybe we can wait for Jaak's result. Huacai > > -- > Best regards, > > Javier Martinez Canillas > Core Platforms > Red Hat >
On Wed, 08 Nov 2023, Thomas Zimmermann <tzimmermann@suse.de> wrote: > Hi, > > thanks for the patch. > > Am 08.11.23 um 03:46 schrieb Huacai Chen: >> After commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from >> device_initcall to subsys_initcall_sync") some Lenovo laptops get a blank >> screen until the display manager starts. >> >> This regression occurs with such a Kconfig combination: >> CONFIG_SYSFB=y >> CONFIG_SYSFB_SIMPLEFB=y >> CONFIG_DRM_SIMPLEDRM=y >> CONFIG_DRM_I915=y # Or other native drivers such as radeon, amdgpu >> >> If replace CONFIG_DRM_SIMPLEDRM with CONFIG_FB_SIMPLE (they use the same >> device), there is no blank screen. The root cause is the initialization >> order, and this order depends on the Makefile. >> >> FB_SIMPLE is before native DRM drivers (e.g. i915, radeon, amdgpu, and >> so on), but DRM_SIMPLEDRM is after them. Thus, if we use FB_SIMPLE, I915 >> will takeover FB_SIMPLE, then no problem; and if we use DRM_SIMPLEDRM, >> DRM_SIMPLEDRM will try to takeover I915, but fails to work. > > But what exactly is the problem? From the lengthy discussion threat, it > looks like you've stumbled across a long-known problem, where the > firmware driver probes a device that has already been taken by a native > driver. But that should not be possible. > > As you know, there's a platform device that represents the firmware > framebuffer. The firmware drivers, such as simpledrm, bind to it. In > i915 and the other native drivers we remove that platform device, so > that simpledrm does not run. The problem is still not resolved. Another bug report at [1]. The commit message here points at 60aebc955949 ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") as regressing, and Jaak also bisected it (see Closes:). I agree the patch here is just papering over the issue, but lacking a proper fix, for months, a revert would be in order, no? BR, Jani. [1] https://gitlab.freedesktop.org/drm/intel/-/issues/10133 > > We call the DRM aperture helpers at [1]. It's implemented at [2]. The > function contains a call to sysfb_disable(), [3] which should be invoked > for the i915 device and remove the platform device. > > [1] > https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/i915/i915_driver.c#L489 > [2] > https://elixir.bootlin.com/linux/v6.5/source/drivers/video/aperture.c#L347 > [3] > https://elixir.bootlin.com/linux/v6.5/source/drivers/firmware/sysfb.c#L63 > > Can you investigate why this does not work? Is sysfb_disable() not being > called? Does it remove the platform device? > >> >> So we can move the "tiny" directory before native DRM drivers to solve >> this problem. > > Relying on linking order is just as unreliable. The usual workaround is > to build native drivers as modules. But first, please investigate where > the current code fails. > > Best regards > Thomas > >> >> Fixes: 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") >> Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t >> Reported-by: Jaak Ristioja <jaak@ristioja.ee> >> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> >> --- >> drivers/gpu/drm/Makefile | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile >> index 8e1bde059170..db0f3d3aff43 100644 >> --- a/drivers/gpu/drm/Makefile >> +++ b/drivers/gpu/drm/Makefile >> @@ -141,6 +141,7 @@ obj-y += arm/ >> obj-y += display/ >> obj-$(CONFIG_DRM_TTM) += ttm/ >> obj-$(CONFIG_DRM_SCHED) += scheduler/ >> +obj-y += tiny/ >> obj-$(CONFIG_DRM_RADEON)+= radeon/ >> obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/ >> obj-$(CONFIG_DRM_AMDGPU)+= amd/amdxcp/ >> @@ -182,7 +183,6 @@ obj-$(CONFIG_DRM_FSL_DCU) += fsl-dcu/ >> obj-$(CONFIG_DRM_ETNAVIV) += etnaviv/ >> obj-y += hisilicon/ >> obj-y += mxsfb/ >> -obj-y += tiny/ >> obj-$(CONFIG_DRM_PL111) += pl111/ >> obj-$(CONFIG_DRM_TVE200) += tve200/ >> obj-$(CONFIG_DRM_XEN) += xen/
On 23.01.24 09:53, Jani Nikula wrote: > On Wed, 08 Nov 2023, Thomas Zimmermann <tzimmermann@suse.de> wrote: >> >> thanks for the patch. >> >> Am 08.11.23 um 03:46 schrieb Huacai Chen: >>> After commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from >>> device_initcall to subsys_initcall_sync") some Lenovo laptops get a blank >>> screen until the display manager starts. >>> >>> This regression occurs with such a Kconfig combination: >>> CONFIG_SYSFB=y >>> CONFIG_SYSFB_SIMPLEFB=y >>> CONFIG_DRM_SIMPLEDRM=y >>> CONFIG_DRM_I915=y # Or other native drivers such as radeon, amdgpu >>> >>> If replace CONFIG_DRM_SIMPLEDRM with CONFIG_FB_SIMPLE (they use the same >>> device), there is no blank screen. The root cause is the initialization >>> order, and this order depends on the Makefile. >>> >>> FB_SIMPLE is before native DRM drivers (e.g. i915, radeon, amdgpu, and >>> so on), but DRM_SIMPLEDRM is after them. Thus, if we use FB_SIMPLE, I915 >>> will takeover FB_SIMPLE, then no problem; and if we use DRM_SIMPLEDRM, >>> DRM_SIMPLEDRM will try to takeover I915, but fails to work. >> >> But what exactly is the problem? From the lengthy discussion threat, it >> looks like you've stumbled across a long-known problem, where the >> firmware driver probes a device that has already been taken by a native >> driver. But that should not be possible. >> >> As you know, there's a platform device that represents the firmware >> framebuffer. The firmware drivers, such as simpledrm, bind to it. In >> i915 and the other native drivers we remove that platform device, so >> that simpledrm does not run. > > The problem is still not resolved. Another bug report at [1]. > > The commit message here points at 60aebc955949 ("drivers/firmware: Move > sysfb_init() from device_initcall to subsys_initcall_sync") as > regressing, and Jaak also bisected it (see Closes:). > > I agree the patch here is just papering over the issue, but lacking a > proper fix, for months, a revert would be in order, no? > > [1] https://gitlab.freedesktop.org/drm/intel/-/issues/10133 Interesting. JFYI for those that don't follow this closely: Huacai Chen proposed a fix and asked a earlier reporter to test it, but afaics heard nothing back: https://lore.kernel.org/all/CAAhV-H5eXM7FNzuRCMthAziG_jg75XwQV3grpw=sdyJ-9GXgvA@mail.gmail.com/ That's afaics why this got stuck (and why I didn't request on a escalate this weeks ago). Ciao, Thorsten >> We call the DRM aperture helpers at [1]. It's implemented at [2]. The >> function contains a call to sysfb_disable(), [3] which should be invoked >> for the i915 device and remove the platform device. >> >> [1] >> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/i915/i915_driver.c#L489 >> [2] >> https://elixir.bootlin.com/linux/v6.5/source/drivers/video/aperture.c#L347 >> [3] >> https://elixir.bootlin.com/linux/v6.5/source/drivers/firmware/sysfb.c#L63 >> >> Can you investigate why this does not work? Is sysfb_disable() not being >> called? Does it remove the platform device? >> >>> >>> So we can move the "tiny" directory before native DRM drivers to solve >>> this problem. >> >> Relying on linking order is just as unreliable. The usual workaround is >> to build native drivers as modules. But first, please investigate where >> the current code fails. >> >> Best regards >> Thomas >> >>> >>> Fixes: 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") >>> Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t >>> Reported-by: Jaak Ristioja <jaak@ristioja.ee> >>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> >>> --- >>> drivers/gpu/drm/Makefile | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile >>> index 8e1bde059170..db0f3d3aff43 100644 >>> --- a/drivers/gpu/drm/Makefile >>> +++ b/drivers/gpu/drm/Makefile >>> @@ -141,6 +141,7 @@ obj-y += arm/ >>> obj-y += display/ >>> obj-$(CONFIG_DRM_TTM) += ttm/ >>> obj-$(CONFIG_DRM_SCHED) += scheduler/ >>> +obj-y += tiny/ >>> obj-$(CONFIG_DRM_RADEON)+= radeon/ >>> obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/ >>> obj-$(CONFIG_DRM_AMDGPU)+= amd/amdxcp/ >>> @@ -182,7 +183,6 @@ obj-$(CONFIG_DRM_FSL_DCU) += fsl-dcu/ >>> obj-$(CONFIG_DRM_ETNAVIV) += etnaviv/ >>> obj-y += hisilicon/ >>> obj-y += mxsfb/ >>> -obj-y += tiny/ >>> obj-$(CONFIG_DRM_PL111) += pl111/ >>> obj-$(CONFIG_DRM_TVE200) += tve200/ >>> obj-$(CONFIG_DRM_XEN) += xen/ >
Thorsten Leemhuis <regressions@leemhuis.info> writes: > On 23.01.24 09:53, Jani Nikula wrote: >> On Wed, 08 Nov 2023, Thomas Zimmermann <tzimmermann@suse.de> wrote: [...] > >>> As you know, there's a platform device that represents the firmware >>> framebuffer. The firmware drivers, such as simpledrm, bind to it. In >>> i915 and the other native drivers we remove that platform device, so >>> that simpledrm does not run. >> >> The problem is still not resolved. Another bug report at [1]. >> >> The commit message here points at 60aebc955949 ("drivers/firmware: Move >> sysfb_init() from device_initcall to subsys_initcall_sync") as >> regressing, and Jaak also bisected it (see Closes:). >> >> I agree the patch here is just papering over the issue, but lacking a >> proper fix, for months, a revert would be in order, no? >> Yes, I agree that this patch has to be reverted, since as you said the issue has not been fixed and is causing regressions for multiple users. >> [1] https://gitlab.freedesktop.org/drm/intel/-/issues/10133 > > Interesting. > > JFYI for those that don't follow this closely: Huacai Chen proposed a > fix and asked a earlier reporter to test it, but afaics heard nothing back: > > https://lore.kernel.org/all/CAAhV-H5eXM7FNzuRCMthAziG_jg75XwQV3grpw=sdyJ-9GXgvA@mail.gmail.com/ > It's not a fix but a debug patch for the patch author to get more info ?
On 23.01.24 10:17, Thorsten Leemhuis wrote: > On 23.01.24 09:53, Jani Nikula wrote: >> On Wed, 08 Nov 2023, Thomas Zimmermann <tzimmermann@suse.de> wrote: >>> >>> thanks for the patch. >>> >>> Am 08.11.23 um 03:46 schrieb Huacai Chen: >>>> After commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from >>>> device_initcall to subsys_initcall_sync") some Lenovo laptops get a blank >>>> screen until the display manager starts. >>>> >>>> This regression occurs with such a Kconfig combination: >>>> CONFIG_SYSFB=y >>>> CONFIG_SYSFB_SIMPLEFB=y >>>> CONFIG_DRM_SIMPLEDRM=y >>>> CONFIG_DRM_I915=y # Or other native drivers such as radeon, amdgpu >>>> >>>> If replace CONFIG_DRM_SIMPLEDRM with CONFIG_FB_SIMPLE (they use the same >>>> device), there is no blank screen. The root cause is the initialization >>>> order, and this order depends on the Makefile. >>>> >>>> FB_SIMPLE is before native DRM drivers (e.g. i915, radeon, amdgpu, and >>>> so on), but DRM_SIMPLEDRM is after them. Thus, if we use FB_SIMPLE, I915 >>>> will takeover FB_SIMPLE, then no problem; and if we use DRM_SIMPLEDRM, >>>> DRM_SIMPLEDRM will try to takeover I915, but fails to work. >>> >>> But what exactly is the problem? From the lengthy discussion threat, it >>> looks like you've stumbled across a long-known problem, where the >>> firmware driver probes a device that has already been taken by a native >>> driver. But that should not be possible. >>> >>> As you know, there's a platform device that represents the firmware >>> framebuffer. The firmware drivers, such as simpledrm, bind to it. In >>> i915 and the other native drivers we remove that platform device, so >>> that simpledrm does not run. >> >> The problem is still not resolved. Another bug report at [1]. >> >> The commit message here points at 60aebc955949 ("drivers/firmware: Move >> sysfb_init() from device_initcall to subsys_initcall_sync") as >> regressing, and Jaak also bisected it (see Closes:). >> >> I agree the patch here is just papering over the issue, but lacking a >> proper fix, for months, a revert would be in order, no? >> >> [1] https://gitlab.freedesktop.org/drm/intel/-/issues/10133 > > Interesting. > > JFYI for those that don't follow this closely: Huacai Chen proposed a > fix Sorry, I did not look closely and misremembered: that was not a fix, it was just a test patch for gather more data for debugging. Ciao, Thorsten > and asked a earlier reporter to test it, but afaics heard nothing back: > > https://lore.kernel.org/all/CAAhV-H5eXM7FNzuRCMthAziG_jg75XwQV3grpw=sdyJ-9GXgvA@mail.gmail.com/ > > That's afaics why this got stuck (and why I didn't request on a escalate > this weeks ago). > > Ciao, Thorsten > >>> We call the DRM aperture helpers at [1]. It's implemented at [2]. The >>> function contains a call to sysfb_disable(), [3] which should be invoked >>> for the i915 device and remove the platform device. >>> >>> [1] >>> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/i915/i915_driver.c#L489 >>> [2] >>> https://elixir.bootlin.com/linux/v6.5/source/drivers/video/aperture.c#L347 >>> [3] >>> https://elixir.bootlin.com/linux/v6.5/source/drivers/firmware/sysfb.c#L63 >>> >>> Can you investigate why this does not work? Is sysfb_disable() not being >>> called? Does it remove the platform device? >>> >>>> >>>> So we can move the "tiny" directory before native DRM drivers to solve >>>> this problem. >>> >>> Relying on linking order is just as unreliable. The usual workaround is >>> to build native drivers as modules. But first, please investigate where >>> the current code fails. >>> >>> Best regards >>> Thomas >>> >>>> >>>> Fixes: 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") >>>> Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t >>>> Reported-by: Jaak Ristioja <jaak@ristioja.ee> >>>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> >>>> --- >>>> drivers/gpu/drm/Makefile | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile >>>> index 8e1bde059170..db0f3d3aff43 100644 >>>> --- a/drivers/gpu/drm/Makefile >>>> +++ b/drivers/gpu/drm/Makefile >>>> @@ -141,6 +141,7 @@ obj-y += arm/ >>>> obj-y += display/ >>>> obj-$(CONFIG_DRM_TTM) += ttm/ >>>> obj-$(CONFIG_DRM_SCHED) += scheduler/ >>>> +obj-y += tiny/ >>>> obj-$(CONFIG_DRM_RADEON)+= radeon/ >>>> obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/ >>>> obj-$(CONFIG_DRM_AMDGPU)+= amd/amdxcp/ >>>> @@ -182,7 +183,6 @@ obj-$(CONFIG_DRM_FSL_DCU) += fsl-dcu/ >>>> obj-$(CONFIG_DRM_ETNAVIV) += etnaviv/ >>>> obj-y += hisilicon/ >>>> obj-y += mxsfb/ >>>> -obj-y += tiny/ >>>> obj-$(CONFIG_DRM_PL111) += pl111/ >>>> obj-$(CONFIG_DRM_TVE200) += tve200/ >>>> obj-$(CONFIG_DRM_XEN) += xen/ >>
Hi Am 23.01.24 um 10:41 schrieb Javier Martinez Canillas: > Thorsten Leemhuis <regressions@leemhuis.info> writes: > >> On 23.01.24 09:53, Jani Nikula wrote: >>> On Wed, 08 Nov 2023, Thomas Zimmermann <tzimmermann@suse.de> wrote: > > [...] > >> >>>> As you know, there's a platform device that represents the firmware >>>> framebuffer. The firmware drivers, such as simpledrm, bind to it. In >>>> i915 and the other native drivers we remove that platform device, so >>>> that simpledrm does not run. >>> >>> The problem is still not resolved. Another bug report at [1]. >>> >>> The commit message here points at 60aebc955949 ("drivers/firmware: Move >>> sysfb_init() from device_initcall to subsys_initcall_sync") as >>> regressing, and Jaak also bisected it (see Closes:). >>> >>> I agree the patch here is just papering over the issue, but lacking a >>> proper fix, for months, a revert would be in order, no? >>> > > Yes, I agree that this patch has to be reverted, since as you said the > issue has not been fixed and is causing regressions for multiple users. I wanted to send out a revert anyway. I'll do this in a bit. Best regards Thomas > >>> [1] https://gitlab.freedesktop.org/drm/intel/-/issues/10133 >> >> Interesting. >> >> JFYI for those that don't follow this closely: Huacai Chen proposed a >> fix and asked a earlier reporter to test it, but afaics heard nothing back: >> >> https://lore.kernel.org/all/CAAhV-H5eXM7FNzuRCMthAziG_jg75XwQV3grpw=sdyJ-9GXgvA@mail.gmail.com/ >> > > It's not a fix but a debug patch for the patch author to get more info ? >
Hi, I apologize for not finding the time to test this earlier. On 11.12.23 05:08, Huacai Chen wrote: > And Jaak, could you please test with the below patch (but keep the > original order in Makefile) and then give me the dmesg output? > > diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > index 561be8feca96..cc2e39fb98f5 100644 > --- a/drivers/video/aperture.c > +++ b/drivers/video/aperture.c > @@ -350,21 +350,29 @@ int > aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > char *na > resource_size_t base, size; > int bar, ret = 0; > > - if (pdev == vga_default_device()) > + printk("DEBUG: remove 1\n"); > + > + if (pdev == vga_default_device()) { > + printk("DEBUG: primary = true\n"); > primary = true; > + } > > - if (primary) > + if (primary) { > + printk("DEBUG: disable sysfb\n"); > sysfb_disable(); > + } > > for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > continue; > > + printk("DEBUG: remove 2\n"); > base = pci_resource_start(pdev, bar); > size = pci_resource_len(pdev, bar); > aperture_detach_devices(base, size); > } > > + printk("DEBUG: remove 3\n"); > /* > * If this is the primary adapter, there could be a VGA device > * that consumes the VGA framebuffer I/O range. Remove this > > [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u Copy-pasting this from the e-mail body didn't work well, but I applied the changes manually to a 6.5.9 kernel without any of the other patches. Here's the relevant dmesg output on the Lenovo L570: ... [ 2.953405] ACPI: bus type drm_connector registered [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access [ 2.954018] DEBUG: remove 1 [ 2.954020] DEBUG: remove 2 [ 2.954021] DEBUG: remove 2 [ 2.954023] DEBUG: remove 3 [ 2.954029] resource: resource sanity check: requesting [mem 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB [mem 0xe0000000-0xe012bfff] [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4) ... [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0 [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: no post: no) [ 4.147244] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 [ 4.147420] usbcore: registered new interface driver udl [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 2 [ 4.148643] Console: switching to colour frame buffer device 80x30 [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: simpledrmdrmfb frame buffer device [ 4.154043] loop: module loaded [ 4.156017] ahci 0000:00:17.0: version 3.0 [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device ... J
Hi, Javier and Thomas, On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: > > Hi, > > I apologize for not finding the time to test this earlier. > > On 11.12.23 05:08, Huacai Chen wrote: > > And Jaak, could you please test with the below patch (but keep the > > original order in Makefile) and then give me the dmesg output? > > > > diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > > index 561be8feca96..cc2e39fb98f5 100644 > > --- a/drivers/video/aperture.c > > +++ b/drivers/video/aperture.c > > @@ -350,21 +350,29 @@ int > > aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > > char *na > > resource_size_t base, size; > > int bar, ret = 0; > > > > - if (pdev == vga_default_device()) > > + printk("DEBUG: remove 1\n"); > > + > > + if (pdev == vga_default_device()) { > > + printk("DEBUG: primary = true\n"); > > primary = true; > > + } > > > > - if (primary) > > + if (primary) { > > + printk("DEBUG: disable sysfb\n"); > > sysfb_disable(); > > + } > > > > for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > > if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > > continue; > > > > + printk("DEBUG: remove 2\n"); > > base = pci_resource_start(pdev, bar); > > size = pci_resource_len(pdev, bar); > > aperture_detach_devices(base, size); > > } > > > > + printk("DEBUG: remove 3\n"); > > /* > > * If this is the primary adapter, there could be a VGA device > > * that consumes the VGA framebuffer I/O range. Remove this > > > > [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u > > Copy-pasting this from the e-mail body didn't work well, but I applied > the changes manually to a 6.5.9 kernel without any of the other patches. > Here's the relevant dmesg output on the Lenovo L570: > > ... > [ 2.953405] ACPI: bus type drm_connector registered > [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access > [ 2.954018] DEBUG: remove 1 > [ 2.954020] DEBUG: remove 2 > [ 2.954021] DEBUG: remove 2 > [ 2.954023] DEBUG: remove 3 My tmp patch is as follows: diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c index 561be8feca96..cc2e39fb98f5 100644 --- a/drivers/video/aperture.c +++ b/drivers/video/aperture.c @@ -350,21 +350,29 @@ int aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const char *na resource_size_t base, size; int bar, ret = 0; - if (pdev == vga_default_device()) + printk("DEBUG: remove 1\n"); + + if (pdev == vga_default_device()) { + printk("DEBUG: primary = true\n"); primary = true; + } - if (primary) + if (primary) { + printk("DEBUG: disable sysfb\n"); sysfb_disable(); + } for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) continue; + printk("DEBUG: remove 2\n"); base = pci_resource_start(pdev, bar); size = pci_resource_len(pdev, bar); aperture_detach_devices(base, size); } + printk("DEBUG: remove 3\n"); /* * If this is the primary adapter, there could be a VGA device * that consumes the VGA framebuffer I/O range. Remove this From the Jaak's dmesg, it is obvious that "pdev == vga_default_device()" is false, which causes sysfb_disable() to be not called. And I think the simple-drm device is not provided by the i915 device in this case. So, can we unconditionally call sysfb_disable() here, which is the same as aperture_remove_conflicting_devices()? Huacai > [ 2.954029] resource: resource sanity check: requesting [mem > 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB > [mem 0xe0000000-0xe012bfff] > [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs > [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages > [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin > [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware > i915/kbl_dmc_ver1_04.bin (v1.4) > ... > [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on > minor 0 > [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: > no post: no) > [ 4.147244] input: Video Bus as > /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 > [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 > [ 4.147420] usbcore: registered new interface driver udl > [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for > simple-framebuffer.0 on minor 2 > [ 4.148643] Console: switching to colour frame buffer device 80x30 > [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: > simpledrmdrmfb frame buffer device > [ 4.154043] loop: module loaded > [ 4.156017] ahci 0000:00:17.0: version 3.0 > [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device > ... > > J >
Hi Am 24.01.24 um 04:00 schrieb Huacai Chen: > Hi, Javier and Thomas, > > On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: >> >> Hi, >> >> I apologize for not finding the time to test this earlier. >> >> On 11.12.23 05:08, Huacai Chen wrote: >>> And Jaak, could you please test with the below patch (but keep the >>> original order in Makefile) and then give me the dmesg output? >>> >>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c >>> index 561be8feca96..cc2e39fb98f5 100644 >>> --- a/drivers/video/aperture.c >>> +++ b/drivers/video/aperture.c >>> @@ -350,21 +350,29 @@ int >>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const >>> char *na >>> resource_size_t base, size; >>> int bar, ret = 0; >>> >>> - if (pdev == vga_default_device()) >>> + printk("DEBUG: remove 1\n"); >>> + >>> + if (pdev == vga_default_device()) { >>> + printk("DEBUG: primary = true\n"); >>> primary = true; >>> + } >>> >>> - if (primary) >>> + if (primary) { >>> + printk("DEBUG: disable sysfb\n"); >>> sysfb_disable(); >>> + } >>> >>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { >>> if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) >>> continue; >>> >>> + printk("DEBUG: remove 2\n"); >>> base = pci_resource_start(pdev, bar); >>> size = pci_resource_len(pdev, bar); >>> aperture_detach_devices(base, size); >>> } >>> >>> + printk("DEBUG: remove 3\n"); >>> /* >>> * If this is the primary adapter, there could be a VGA device >>> * that consumes the VGA framebuffer I/O range. Remove this >>> >>> [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u >> >> Copy-pasting this from the e-mail body didn't work well, but I applied >> the changes manually to a 6.5.9 kernel without any of the other patches. >> Here's the relevant dmesg output on the Lenovo L570: >> >> ... >> [ 2.953405] ACPI: bus type drm_connector registered >> [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access >> [ 2.954018] DEBUG: remove 1 >> [ 2.954020] DEBUG: remove 2 >> [ 2.954021] DEBUG: remove 2 >> [ 2.954023] DEBUG: remove 3 > > My tmp patch is as follows: > > diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > index 561be8feca96..cc2e39fb98f5 100644 > --- a/drivers/video/aperture.c > +++ b/drivers/video/aperture.c > @@ -350,21 +350,29 @@ int > aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > char *na > resource_size_t base, size; > int bar, ret = 0; > > - if (pdev == vga_default_device()) > + printk("DEBUG: remove 1\n"); > + > + if (pdev == vga_default_device()) { > + printk("DEBUG: primary = true\n"); > primary = true; > + } > > - if (primary) > + if (primary) { > + printk("DEBUG: disable sysfb\n"); > sysfb_disable(); > + } > > for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > continue; > > + printk("DEBUG: remove 2\n"); > base = pci_resource_start(pdev, bar); > size = pci_resource_len(pdev, bar); > aperture_detach_devices(base, size); > } > > + printk("DEBUG: remove 3\n"); > /* > * If this is the primary adapter, there could be a VGA device > * that consumes the VGA framebuffer I/O range. Remove this > > From the Jaak's dmesg, it is obvious that "pdev == > vga_default_device()" is false, which causes sysfb_disable() to be not > called. And I think the simple-drm device is not provided by the i915 > device in this case. So, can we unconditionally call sysfb_disable() > here, which is the same as aperture_remove_conflicting_devices()? Unfortunately, we cannot call sysfb_disable() unconditionally. Otherwise, we'd possibly remove simpledrm et al on the primary driver even pdev is not the primary device. Both, sysfb and vgaarb, are initialized with subsys_initcall_sync() and the order of initialization is most likely wrong in the broken builds. Hence, reordering the linking mitigates the problem. I've long been thinking about how the graphics init code could be streamlined into something more manageable. But it's a larger effort. Best regards Thomas > > Huacai > >> [ 2.954029] resource: resource sanity check: requesting [mem >> 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB >> [mem 0xe0000000-0xe012bfff] >> [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs >> [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages >> [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin >> [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware >> i915/kbl_dmc_ver1_04.bin (v1.4) >> ... >> [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on >> minor 0 >> [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: >> no post: no) >> [ 4.147244] input: Video Bus as >> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 >> [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 >> [ 4.147420] usbcore: registered new interface driver udl >> [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for >> simple-framebuffer.0 on minor 2 >> [ 4.148643] Console: switching to colour frame buffer device 80x30 >> [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: >> simpledrmdrmfb frame buffer device >> [ 4.154043] loop: module loaded >> [ 4.156017] ahci 0000:00:17.0: version 3.0 >> [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device >> ... >> >> J >>
Hi, Thomas, On Wed, Jan 24, 2024 at 4:16 PM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > Hi > > Am 24.01.24 um 04:00 schrieb Huacai Chen: > > Hi, Javier and Thomas, > > > > On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: > >> > >> Hi, > >> > >> I apologize for not finding the time to test this earlier. > >> > >> On 11.12.23 05:08, Huacai Chen wrote: > >>> And Jaak, could you please test with the below patch (but keep the > >>> original order in Makefile) and then give me the dmesg output? > >>> > >>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > >>> index 561be8feca96..cc2e39fb98f5 100644 > >>> --- a/drivers/video/aperture.c > >>> +++ b/drivers/video/aperture.c > >>> @@ -350,21 +350,29 @@ int > >>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > >>> char *na > >>> resource_size_t base, size; > >>> int bar, ret = 0; > >>> > >>> - if (pdev == vga_default_device()) > >>> + printk("DEBUG: remove 1\n"); > >>> + > >>> + if (pdev == vga_default_device()) { > >>> + printk("DEBUG: primary = true\n"); > >>> primary = true; > >>> + } > >>> > >>> - if (primary) > >>> + if (primary) { > >>> + printk("DEBUG: disable sysfb\n"); > >>> sysfb_disable(); > >>> + } > >>> > >>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > >>> if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > >>> continue; > >>> > >>> + printk("DEBUG: remove 2\n"); > >>> base = pci_resource_start(pdev, bar); > >>> size = pci_resource_len(pdev, bar); > >>> aperture_detach_devices(base, size); > >>> } > >>> > >>> + printk("DEBUG: remove 3\n"); > >>> /* > >>> * If this is the primary adapter, there could be a VGA device > >>> * that consumes the VGA framebuffer I/O range. Remove this > >>> > >>> [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u > >> > >> Copy-pasting this from the e-mail body didn't work well, but I applied > >> the changes manually to a 6.5.9 kernel without any of the other patches. > >> Here's the relevant dmesg output on the Lenovo L570: > >> > >> ... > >> [ 2.953405] ACPI: bus type drm_connector registered > >> [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access > >> [ 2.954018] DEBUG: remove 1 > >> [ 2.954020] DEBUG: remove 2 > >> [ 2.954021] DEBUG: remove 2 > >> [ 2.954023] DEBUG: remove 3 > > > > My tmp patch is as follows: > > > > diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > > index 561be8feca96..cc2e39fb98f5 100644 > > --- a/drivers/video/aperture.c > > +++ b/drivers/video/aperture.c > > @@ -350,21 +350,29 @@ int > > aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > > char *na > > resource_size_t base, size; > > int bar, ret = 0; > > > > - if (pdev == vga_default_device()) > > + printk("DEBUG: remove 1\n"); > > + > > + if (pdev == vga_default_device()) { > > + printk("DEBUG: primary = true\n"); > > primary = true; > > + } > > > > - if (primary) > > + if (primary) { > > + printk("DEBUG: disable sysfb\n"); > > sysfb_disable(); > > + } > > > > for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > > if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > > continue; > > > > + printk("DEBUG: remove 2\n"); > > base = pci_resource_start(pdev, bar); > > size = pci_resource_len(pdev, bar); > > aperture_detach_devices(base, size); > > } > > > > + printk("DEBUG: remove 3\n"); > > /* > > * If this is the primary adapter, there could be a VGA device > > * that consumes the VGA framebuffer I/O range. Remove this > > > > From the Jaak's dmesg, it is obvious that "pdev == > > vga_default_device()" is false, which causes sysfb_disable() to be not > > called. And I think the simple-drm device is not provided by the i915 > > device in this case. So, can we unconditionally call sysfb_disable() > > here, which is the same as aperture_remove_conflicting_devices()? > > Unfortunately, we cannot call sysfb_disable() unconditionally. > Otherwise, we'd possibly remove simpledrm et al on the primary driver > even pdev is not the primary device. > > Both, sysfb and vgaarb, are initialized with subsys_initcall_sync() and > the order of initialization is most likely wrong in the broken builds. > Hence, reordering the linking mitigates the problem. OK, sysfb_init() should be after vgaarb, so I think we can move sysfb_init to be fs_initcall(). Though sysfb has nothing to do with "file system", I found that there are already a lot of init functions using fs_initcall(). Hi, Jaak, could you please make sysfb_initcall from subsys_initcall_sync to be fs_initcall and see if your problem can be fixed? Thank you very much. Huacai > > I've long been thinking about how the graphics init code could be > streamlined into something more manageable. But it's a larger effort. > > Best regards > Thomas > > > > > Huacai > > > >> [ 2.954029] resource: resource sanity check: requesting [mem > >> 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB > >> [mem 0xe0000000-0xe012bfff] > >> [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs > >> [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages > >> [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin > >> [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware > >> i915/kbl_dmc_ver1_04.bin (v1.4) > >> ... > >> [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on > >> minor 0 > >> [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: > >> no post: no) > >> [ 4.147244] input: Video Bus as > >> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 > >> [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 > >> [ 4.147420] usbcore: registered new interface driver udl > >> [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for > >> simple-framebuffer.0 on minor 2 > >> [ 4.148643] Console: switching to colour frame buffer device 80x30 > >> [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: > >> simpledrmdrmfb frame buffer device > >> [ 4.154043] loop: module loaded > >> [ 4.156017] ahci 0000:00:17.0: version 3.0 > >> [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device > >> ... > >> > >> J > >> > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Frankenstrasse 146, 90461 Nuernberg, Germany > GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman > HRB 36809 (AG Nuernberg)
Hi Am 24.01.24 um 10:24 schrieb Huacai Chen: > Hi, Thomas, > > On Wed, Jan 24, 2024 at 4:16 PM Thomas Zimmermann <tzimmermann@suse.de> wrote: >> >> Hi >> >> Am 24.01.24 um 04:00 schrieb Huacai Chen: >>> Hi, Javier and Thomas, >>> >>> On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: >>>> >>>> Hi, >>>> >>>> I apologize for not finding the time to test this earlier. >>>> >>>> On 11.12.23 05:08, Huacai Chen wrote: >>>>> And Jaak, could you please test with the below patch (but keep the >>>>> original order in Makefile) and then give me the dmesg output? >>>>> >>>>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c >>>>> index 561be8feca96..cc2e39fb98f5 100644 >>>>> --- a/drivers/video/aperture.c >>>>> +++ b/drivers/video/aperture.c >>>>> @@ -350,21 +350,29 @@ int >>>>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const >>>>> char *na >>>>> resource_size_t base, size; >>>>> int bar, ret = 0; >>>>> >>>>> - if (pdev == vga_default_device()) >>>>> + printk("DEBUG: remove 1\n"); >>>>> + >>>>> + if (pdev == vga_default_device()) { >>>>> + printk("DEBUG: primary = true\n"); >>>>> primary = true; >>>>> + } >>>>> >>>>> - if (primary) >>>>> + if (primary) { >>>>> + printk("DEBUG: disable sysfb\n"); >>>>> sysfb_disable(); >>>>> + } >>>>> >>>>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { >>>>> if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) >>>>> continue; >>>>> >>>>> + printk("DEBUG: remove 2\n"); >>>>> base = pci_resource_start(pdev, bar); >>>>> size = pci_resource_len(pdev, bar); >>>>> aperture_detach_devices(base, size); >>>>> } >>>>> >>>>> + printk("DEBUG: remove 3\n"); >>>>> /* >>>>> * If this is the primary adapter, there could be a VGA device >>>>> * that consumes the VGA framebuffer I/O range. Remove this >>>>> >>>>> [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u >>>> >>>> Copy-pasting this from the e-mail body didn't work well, but I applied >>>> the changes manually to a 6.5.9 kernel without any of the other patches. >>>> Here's the relevant dmesg output on the Lenovo L570: >>>> >>>> ... >>>> [ 2.953405] ACPI: bus type drm_connector registered >>>> [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access >>>> [ 2.954018] DEBUG: remove 1 >>>> [ 2.954020] DEBUG: remove 2 >>>> [ 2.954021] DEBUG: remove 2 >>>> [ 2.954023] DEBUG: remove 3 >>> >>> My tmp patch is as follows: >>> >>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c >>> index 561be8feca96..cc2e39fb98f5 100644 >>> --- a/drivers/video/aperture.c >>> +++ b/drivers/video/aperture.c >>> @@ -350,21 +350,29 @@ int >>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const >>> char *na >>> resource_size_t base, size; >>> int bar, ret = 0; >>> >>> - if (pdev == vga_default_device()) >>> + printk("DEBUG: remove 1\n"); >>> + >>> + if (pdev == vga_default_device()) { >>> + printk("DEBUG: primary = true\n"); >>> primary = true; >>> + } >>> >>> - if (primary) >>> + if (primary) { >>> + printk("DEBUG: disable sysfb\n"); >>> sysfb_disable(); >>> + } >>> >>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { >>> if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) >>> continue; >>> >>> + printk("DEBUG: remove 2\n"); >>> base = pci_resource_start(pdev, bar); >>> size = pci_resource_len(pdev, bar); >>> aperture_detach_devices(base, size); >>> } >>> >>> + printk("DEBUG: remove 3\n"); >>> /* >>> * If this is the primary adapter, there could be a VGA device >>> * that consumes the VGA framebuffer I/O range. Remove this >>> >>> From the Jaak's dmesg, it is obvious that "pdev == >>> vga_default_device()" is false, which causes sysfb_disable() to be not >>> called. And I think the simple-drm device is not provided by the i915 >>> device in this case. So, can we unconditionally call sysfb_disable() >>> here, which is the same as aperture_remove_conflicting_devices()? >> >> Unfortunately, we cannot call sysfb_disable() unconditionally. >> Otherwise, we'd possibly remove simpledrm et al on the primary driver >> even pdev is not the primary device. >> >> Both, sysfb and vgaarb, are initialized with subsys_initcall_sync() and >> the order of initialization is most likely wrong in the broken builds. >> Hence, reordering the linking mitigates the problem. > OK, sysfb_init() should be after vgaarb, so I think we can move > sysfb_init to be fs_initcall(). Though sysfb has nothing to do with > "file system", I found that there are already a lot of init functions > using fs_initcall(). > > Hi, Jaak, could you please make sysfb_initcall from > subsys_initcall_sync to be fs_initcall and see if your problem can be > fixed? Thank you very much. Thanks for helping. I already supplied a patch to fix sysfb. No further action is required. Best regards Thomas > > Huacai > >> >> I've long been thinking about how the graphics init code could be >> streamlined into something more manageable. But it's a larger effort. >> >> Best regards >> Thomas >> >>> >>> Huacai >>> >>>> [ 2.954029] resource: resource sanity check: requesting [mem >>>> 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB >>>> [mem 0xe0000000-0xe012bfff] >>>> [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs >>>> [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages >>>> [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin >>>> [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware >>>> i915/kbl_dmc_ver1_04.bin (v1.4) >>>> ... >>>> [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on >>>> minor 0 >>>> [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: >>>> no post: no) >>>> [ 4.147244] input: Video Bus as >>>> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 >>>> [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 >>>> [ 4.147420] usbcore: registered new interface driver udl >>>> [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for >>>> simple-framebuffer.0 on minor 2 >>>> [ 4.148643] Console: switching to colour frame buffer device 80x30 >>>> [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: >>>> simpledrmdrmfb frame buffer device >>>> [ 4.154043] loop: module loaded >>>> [ 4.156017] ahci 0000:00:17.0: version 3.0 >>>> [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device >>>> ... >>>> >>>> J >>>> >> >> -- >> Thomas Zimmermann >> Graphics Driver Developer >> SUSE Software Solutions Germany GmbH >> Frankenstrasse 146, 90461 Nuernberg, Germany >> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman >> HRB 36809 (AG Nuernberg)
Hi, Thomas, On Wed, Jan 24, 2024 at 5:44 PM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > Hi > > Am 24.01.24 um 10:24 schrieb Huacai Chen: > > Hi, Thomas, > > > > On Wed, Jan 24, 2024 at 4:16 PM Thomas Zimmermann <tzimmermann@suse.de> wrote: > >> > >> Hi > >> > >> Am 24.01.24 um 04:00 schrieb Huacai Chen: > >>> Hi, Javier and Thomas, > >>> > >>> On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: > >>>> > >>>> Hi, > >>>> > >>>> I apologize for not finding the time to test this earlier. > >>>> > >>>> On 11.12.23 05:08, Huacai Chen wrote: > >>>>> And Jaak, could you please test with the below patch (but keep the > >>>>> original order in Makefile) and then give me the dmesg output? > >>>>> > >>>>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > >>>>> index 561be8feca96..cc2e39fb98f5 100644 > >>>>> --- a/drivers/video/aperture.c > >>>>> +++ b/drivers/video/aperture.c > >>>>> @@ -350,21 +350,29 @@ int > >>>>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > >>>>> char *na > >>>>> resource_size_t base, size; > >>>>> int bar, ret = 0; > >>>>> > >>>>> - if (pdev == vga_default_device()) > >>>>> + printk("DEBUG: remove 1\n"); > >>>>> + > >>>>> + if (pdev == vga_default_device()) { > >>>>> + printk("DEBUG: primary = true\n"); > >>>>> primary = true; > >>>>> + } > >>>>> > >>>>> - if (primary) > >>>>> + if (primary) { > >>>>> + printk("DEBUG: disable sysfb\n"); > >>>>> sysfb_disable(); > >>>>> + } > >>>>> > >>>>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > >>>>> if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > >>>>> continue; > >>>>> > >>>>> + printk("DEBUG: remove 2\n"); > >>>>> base = pci_resource_start(pdev, bar); > >>>>> size = pci_resource_len(pdev, bar); > >>>>> aperture_detach_devices(base, size); > >>>>> } > >>>>> > >>>>> + printk("DEBUG: remove 3\n"); > >>>>> /* > >>>>> * If this is the primary adapter, there could be a VGA device > >>>>> * that consumes the VGA framebuffer I/O range. Remove this > >>>>> > >>>>> [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u > >>>> > >>>> Copy-pasting this from the e-mail body didn't work well, but I applied > >>>> the changes manually to a 6.5.9 kernel without any of the other patches. > >>>> Here's the relevant dmesg output on the Lenovo L570: > >>>> > >>>> ... > >>>> [ 2.953405] ACPI: bus type drm_connector registered > >>>> [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access > >>>> [ 2.954018] DEBUG: remove 1 > >>>> [ 2.954020] DEBUG: remove 2 > >>>> [ 2.954021] DEBUG: remove 2 > >>>> [ 2.954023] DEBUG: remove 3 > >>> > >>> My tmp patch is as follows: > >>> > >>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > >>> index 561be8feca96..cc2e39fb98f5 100644 > >>> --- a/drivers/video/aperture.c > >>> +++ b/drivers/video/aperture.c > >>> @@ -350,21 +350,29 @@ int > >>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > >>> char *na > >>> resource_size_t base, size; > >>> int bar, ret = 0; > >>> > >>> - if (pdev == vga_default_device()) > >>> + printk("DEBUG: remove 1\n"); > >>> + > >>> + if (pdev == vga_default_device()) { > >>> + printk("DEBUG: primary = true\n"); > >>> primary = true; > >>> + } > >>> > >>> - if (primary) > >>> + if (primary) { > >>> + printk("DEBUG: disable sysfb\n"); > >>> sysfb_disable(); > >>> + } > >>> > >>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > >>> if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > >>> continue; > >>> > >>> + printk("DEBUG: remove 2\n"); > >>> base = pci_resource_start(pdev, bar); > >>> size = pci_resource_len(pdev, bar); > >>> aperture_detach_devices(base, size); > >>> } > >>> > >>> + printk("DEBUG: remove 3\n"); > >>> /* > >>> * If this is the primary adapter, there could be a VGA device > >>> * that consumes the VGA framebuffer I/O range. Remove this > >>> > >>> From the Jaak's dmesg, it is obvious that "pdev == > >>> vga_default_device()" is false, which causes sysfb_disable() to be not > >>> called. And I think the simple-drm device is not provided by the i915 > >>> device in this case. So, can we unconditionally call sysfb_disable() > >>> here, which is the same as aperture_remove_conflicting_devices()? > >> > >> Unfortunately, we cannot call sysfb_disable() unconditionally. > >> Otherwise, we'd possibly remove simpledrm et al on the primary driver > >> even pdev is not the primary device. > >> > >> Both, sysfb and vgaarb, are initialized with subsys_initcall_sync() and > >> the order of initialization is most likely wrong in the broken builds. > >> Hence, reordering the linking mitigates the problem. > > OK, sysfb_init() should be after vgaarb, so I think we can move > > sysfb_init to be fs_initcall(). Though sysfb has nothing to do with > > "file system", I found that there are already a lot of init functions > > using fs_initcall(). > > > > Hi, Jaak, could you please make sysfb_initcall from > > subsys_initcall_sync to be fs_initcall and see if your problem can be > > fixed? Thank you very much. > > Thanks for helping. I already supplied a patch to fix sysfb. No further > action is required. Do you mean reverting "drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync"? I still want to keep that, since we can probably solve both the original problem and Jaak's problem now. And if you mean others, please ignore what I said. :) Huacai > > Best regards > Thomas > > > > > Huacai > > > >> > >> I've long been thinking about how the graphics init code could be > >> streamlined into something more manageable. But it's a larger effort. > >> > >> Best regards > >> Thomas > >> > >>> > >>> Huacai > >>> > >>>> [ 2.954029] resource: resource sanity check: requesting [mem > >>>> 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB > >>>> [mem 0xe0000000-0xe012bfff] > >>>> [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs > >>>> [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages > >>>> [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin > >>>> [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware > >>>> i915/kbl_dmc_ver1_04.bin (v1.4) > >>>> ... > >>>> [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on > >>>> minor 0 > >>>> [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: > >>>> no post: no) > >>>> [ 4.147244] input: Video Bus as > >>>> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 > >>>> [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 > >>>> [ 4.147420] usbcore: registered new interface driver udl > >>>> [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for > >>>> simple-framebuffer.0 on minor 2 > >>>> [ 4.148643] Console: switching to colour frame buffer device 80x30 > >>>> [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: > >>>> simpledrmdrmfb frame buffer device > >>>> [ 4.154043] loop: module loaded > >>>> [ 4.156017] ahci 0000:00:17.0: version 3.0 > >>>> [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device > >>>> ... > >>>> > >>>> J > >>>> > >> > >> -- > >> Thomas Zimmermann > >> Graphics Driver Developer > >> SUSE Software Solutions Germany GmbH > >> Frankenstrasse 146, 90461 Nuernberg, Germany > >> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman > >> HRB 36809 (AG Nuernberg) > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Frankenstrasse 146, 90461 Nuernberg, Germany > GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman > HRB 36809 (AG Nuernberg)
Hi, Jaak, I'm still here as a boring man. :) Have you ever tested whether using "fs_initcall(sysfb_init)" works fine on your machine? Huacai On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: > > Hi, > > I apologize for not finding the time to test this earlier. > > On 11.12.23 05:08, Huacai Chen wrote: > > And Jaak, could you please test with the below patch (but keep the > > original order in Makefile) and then give me the dmesg output? > > > > diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > > index 561be8feca96..cc2e39fb98f5 100644 > > --- a/drivers/video/aperture.c > > +++ b/drivers/video/aperture.c > > @@ -350,21 +350,29 @@ int > > aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > > char *na > > resource_size_t base, size; > > int bar, ret = 0; > > > > - if (pdev == vga_default_device()) > > + printk("DEBUG: remove 1\n"); > > + > > + if (pdev == vga_default_device()) { > > + printk("DEBUG: primary = true\n"); > > primary = true; > > + } > > > > - if (primary) > > + if (primary) { > > + printk("DEBUG: disable sysfb\n"); > > sysfb_disable(); > > + } > > > > for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > > if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > > continue; > > > > + printk("DEBUG: remove 2\n"); > > base = pci_resource_start(pdev, bar); > > size = pci_resource_len(pdev, bar); > > aperture_detach_devices(base, size); > > } > > > > + printk("DEBUG: remove 3\n"); > > /* > > * If this is the primary adapter, there could be a VGA device > > * that consumes the VGA framebuffer I/O range. Remove this > > > > [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u > > Copy-pasting this from the e-mail body didn't work well, but I applied > the changes manually to a 6.5.9 kernel without any of the other patches. > Here's the relevant dmesg output on the Lenovo L570: > > ... > [ 2.953405] ACPI: bus type drm_connector registered > [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access > [ 2.954018] DEBUG: remove 1 > [ 2.954020] DEBUG: remove 2 > [ 2.954021] DEBUG: remove 2 > [ 2.954023] DEBUG: remove 3 > [ 2.954029] resource: resource sanity check: requesting [mem > 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB > [mem 0xe0000000-0xe012bfff] > [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs > [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages > [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin > [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware > i915/kbl_dmc_ver1_04.bin (v1.4) > ... > [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on > minor 0 > [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: > no post: no) > [ 4.147244] input: Video Bus as > /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 > [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 > [ 4.147420] usbcore: registered new interface driver udl > [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for > simple-framebuffer.0 on minor 2 > [ 4.148643] Console: switching to colour frame buffer device 80x30 > [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: > simpledrmdrmfb frame buffer device > [ 4.154043] loop: module loaded > [ 4.156017] ahci 0000:00:17.0: version 3.0 > [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device > ... > > J >
Hi Huacai, Uh, no, sorry, I did not get to test such changes. From what Thomas wrote I presumed that this got fixed and no further action would be required. To speed things up I would appreciate it if you provided a patch. As I worked around the problem for the end user by changing the kernel configuration, I have long forgotten the exact details. It would otherwise probably take me a while to understand what and where you propose to change exactly. Also, do you want me to test it on some newer kernel or should I re-download the 6.5.# version sources? Best regards, Jaak On 18.03.24 16:43, Huacai Chen wrote: > Hi, Jaak, > > I'm still here as a boring man. :) > Have you ever tested whether using "fs_initcall(sysfb_init)" works > fine on your machine? > > Huacai > > On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: >> >> Hi, >> >> I apologize for not finding the time to test this earlier. >> >> On 11.12.23 05:08, Huacai Chen wrote: >>> And Jaak, could you please test with the below patch (but keep the >>> original order in Makefile) and then give me the dmesg output? >>> >>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c >>> index 561be8feca96..cc2e39fb98f5 100644 >>> --- a/drivers/video/aperture.c >>> +++ b/drivers/video/aperture.c >>> @@ -350,21 +350,29 @@ int >>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const >>> char *na >>> resource_size_t base, size; >>> int bar, ret = 0; >>> >>> - if (pdev == vga_default_device()) >>> + printk("DEBUG: remove 1\n"); >>> + >>> + if (pdev == vga_default_device()) { >>> + printk("DEBUG: primary = true\n"); >>> primary = true; >>> + } >>> >>> - if (primary) >>> + if (primary) { >>> + printk("DEBUG: disable sysfb\n"); >>> sysfb_disable(); >>> + } >>> >>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { >>> if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) >>> continue; >>> >>> + printk("DEBUG: remove 2\n"); >>> base = pci_resource_start(pdev, bar); >>> size = pci_resource_len(pdev, bar); >>> aperture_detach_devices(base, size); >>> } >>> >>> + printk("DEBUG: remove 3\n"); >>> /* >>> * If this is the primary adapter, there could be a VGA device >>> * that consumes the VGA framebuffer I/O range. Remove this >>> >>> [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u >> >> Copy-pasting this from the e-mail body didn't work well, but I applied >> the changes manually to a 6.5.9 kernel without any of the other patches. >> Here's the relevant dmesg output on the Lenovo L570: >> >> ... >> [ 2.953405] ACPI: bus type drm_connector registered >> [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access >> [ 2.954018] DEBUG: remove 1 >> [ 2.954020] DEBUG: remove 2 >> [ 2.954021] DEBUG: remove 2 >> [ 2.954023] DEBUG: remove 3 >> [ 2.954029] resource: resource sanity check: requesting [mem >> 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB >> [mem 0xe0000000-0xe012bfff] >> [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs >> [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages >> [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin >> [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware >> i915/kbl_dmc_ver1_04.bin (v1.4) >> ... >> [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on >> minor 0 >> [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: >> no post: no) >> [ 4.147244] input: Video Bus as >> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 >> [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 >> [ 4.147420] usbcore: registered new interface driver udl >> [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for >> simple-framebuffer.0 on minor 2 >> [ 4.148643] Console: switching to colour frame buffer device 80x30 >> [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: >> simpledrmdrmfb frame buffer device >> [ 4.154043] loop: module loaded >> [ 4.156017] ahci 0000:00:17.0: version 3.0 >> [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device >> ... >> >> J >>
Hi Am 18.03.24 um 16:42 schrieb Jaak Ristioja: > Hi Huacai, > > Uh, no, sorry, I did not get to test such changes. From what Thomas > wrote I presumed that this got fixed and no further action would be > required. Right. We reverted the problematic patch and the issue should be gone now. Best regards Thomas > > To speed things up I would appreciate it if you provided a patch. As I > worked around the problem for the end user by changing the kernel > configuration, I have long forgotten the exact details. It would > otherwise probably take me a while to understand what and where you > propose to change exactly. > > Also, do you want me to test it on some newer kernel or should I > re-download the 6.5.# version sources? > > > Best regards, > Jaak > > On 18.03.24 16:43, Huacai Chen wrote: >> Hi, Jaak, >> >> I'm still here as a boring man. :) >> Have you ever tested whether using "fs_initcall(sysfb_init)" works >> fine on your machine? >> >> Huacai >> >> On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: >>> >>> Hi, >>> >>> I apologize for not finding the time to test this earlier. >>> >>> On 11.12.23 05:08, Huacai Chen wrote: >>>> And Jaak, could you please test with the below patch (but keep the >>>> original order in Makefile) and then give me the dmesg output? >>>> >>>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c >>>> index 561be8feca96..cc2e39fb98f5 100644 >>>> --- a/drivers/video/aperture.c >>>> +++ b/drivers/video/aperture.c >>>> @@ -350,21 +350,29 @@ int >>>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const >>>> char *na >>>> resource_size_t base, size; >>>> int bar, ret = 0; >>>> >>>> - if (pdev == vga_default_device()) >>>> + printk("DEBUG: remove 1\n"); >>>> + >>>> + if (pdev == vga_default_device()) { >>>> + printk("DEBUG: primary = true\n"); >>>> primary = true; >>>> + } >>>> >>>> - if (primary) >>>> + if (primary) { >>>> + printk("DEBUG: disable sysfb\n"); >>>> sysfb_disable(); >>>> + } >>>> >>>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { >>>> if (!(pci_resource_flags(pdev, bar) & >>>> IORESOURCE_MEM)) >>>> continue; >>>> >>>> + printk("DEBUG: remove 2\n"); >>>> base = pci_resource_start(pdev, bar); >>>> size = pci_resource_len(pdev, bar); >>>> aperture_detach_devices(base, size); >>>> } >>>> >>>> + printk("DEBUG: remove 3\n"); >>>> /* >>>> * If this is the primary adapter, there could be a VGA >>>> device >>>> * that consumes the VGA framebuffer I/O range. Remove this >>>> >>>> [1] >>>> https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u >>> >>> Copy-pasting this from the e-mail body didn't work well, but I applied >>> the changes manually to a 6.5.9 kernel without any of the other >>> patches. >>> Here's the relevant dmesg output on the Lenovo L570: >>> >>> ... >>> [ 2.953405] ACPI: bus type drm_connector registered >>> [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access >>> [ 2.954018] DEBUG: remove 1 >>> [ 2.954020] DEBUG: remove 2 >>> [ 2.954021] DEBUG: remove 2 >>> [ 2.954023] DEBUG: remove 3 >>> [ 2.954029] resource: resource sanity check: requesting [mem >>> 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB >>> [mem 0xe0000000-0xe012bfff] >>> [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple >>> BARs >>> [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages >>> [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin >>> [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware >>> i915/kbl_dmc_ver1_04.bin (v1.4) >>> ... >>> [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for >>> 0000:00:02.0 on >>> minor 0 >>> [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: >>> no post: no) >>> [ 4.147244] input: Video Bus as >>> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 >>> [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on >>> minor 1 >>> [ 4.147420] usbcore: registered new interface driver udl >>> [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for >>> simple-framebuffer.0 on minor 2 >>> [ 4.148643] Console: switching to colour frame buffer device 80x30 >>> [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: >>> simpledrmdrmfb frame buffer device >>> [ 4.154043] loop: module loaded >>> [ 4.156017] ahci 0000:00:17.0: version 3.0 >>> [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer >>> device >>> ... >>> >>> J >>> >
Hi, Jaak, On Mon, Mar 18, 2024 at 11:42 PM Jaak Ristioja <jaak@ristioja.ee> wrote: > > Hi Huacai, > > Uh, no, sorry, I did not get to test such changes. From what Thomas > wrote I presumed that this got fixed and no further action would be > required. > > To speed things up I would appreciate it if you provided a patch. As I > worked around the problem for the end user by changing the kernel > configuration, I have long forgotten the exact details. It would > otherwise probably take me a while to understand what and where you > propose to change exactly. > > Also, do you want me to test it on some newer kernel or should I > re-download the 6.5.# version sources? Yes, it is better to use 6.5, you can simply change the last line of drivers/firmware/sysfb.c to fs_initcall(sysfb_init), So no patch needed. And to Thomas, I'm sorry that reverting 60aebc95594 solve Jaak's problem, but my original problem exist (at least in 6.5), and I want to know the result of the above experiment to understand what happens. Huacai > > > Best regards, > Jaak > > On 18.03.24 16:43, Huacai Chen wrote: > > Hi, Jaak, > > > > I'm still here as a boring man. :) > > Have you ever tested whether using "fs_initcall(sysfb_init)" works > > fine on your machine? > > > > Huacai > > > > On Wed, Jan 24, 2024 at 5:21 AM Jaak Ristioja <jaak@ristioja.ee> wrote: > >> > >> Hi, > >> > >> I apologize for not finding the time to test this earlier. > >> > >> On 11.12.23 05:08, Huacai Chen wrote: > >>> And Jaak, could you please test with the below patch (but keep the > >>> original order in Makefile) and then give me the dmesg output? > >>> > >>> diff --git a/drivers/video/aperture.c b/drivers/video/aperture.c > >>> index 561be8feca96..cc2e39fb98f5 100644 > >>> --- a/drivers/video/aperture.c > >>> +++ b/drivers/video/aperture.c > >>> @@ -350,21 +350,29 @@ int > >>> aperture_remove_conflicting_pci_devices(struct pci_dev *pdev, const > >>> char *na > >>> resource_size_t base, size; > >>> int bar, ret = 0; > >>> > >>> - if (pdev == vga_default_device()) > >>> + printk("DEBUG: remove 1\n"); > >>> + > >>> + if (pdev == vga_default_device()) { > >>> + printk("DEBUG: primary = true\n"); > >>> primary = true; > >>> + } > >>> > >>> - if (primary) > >>> + if (primary) { > >>> + printk("DEBUG: disable sysfb\n"); > >>> sysfb_disable(); > >>> + } > >>> > >>> for (bar = 0; bar < PCI_STD_NUM_BARS; ++bar) { > >>> if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) > >>> continue; > >>> > >>> + printk("DEBUG: remove 2\n"); > >>> base = pci_resource_start(pdev, bar); > >>> size = pci_resource_len(pdev, bar); > >>> aperture_detach_devices(base, size); > >>> } > >>> > >>> + printk("DEBUG: remove 3\n"); > >>> /* > >>> * If this is the primary adapter, there could be a VGA device > >>> * that consumes the VGA framebuffer I/O range. Remove this > >>> > >>> [1] https://lore.kernel.org/lkml/170222766284.86103.11020060769330721008@leemhuis.info/T/#u > >> > >> Copy-pasting this from the e-mail body didn't work well, but I applied > >> the changes manually to a 6.5.9 kernel without any of the other patches. > >> Here's the relevant dmesg output on the Lenovo L570: > >> > >> ... > >> [ 2.953405] ACPI: bus type drm_connector registered > >> [ 2.954014] i915 0000:00:02.0: [drm] VT-d active for gfx access > >> [ 2.954018] DEBUG: remove 1 > >> [ 2.954020] DEBUG: remove 2 > >> [ 2.954021] DEBUG: remove 2 > >> [ 2.954023] DEBUG: remove 3 > >> [ 2.954029] resource: resource sanity check: requesting [mem > >> 0x00000000e0000000-0x00000000efffffff], which spans more than BOOTFB > >> [mem 0xe0000000-0xe012bfff] > >> [ 2.954035] caller i915_ggtt_init_hw+0x88/0x120 mapping multiple BARs > >> [ 2.954061] i915 0000:00:02.0: [drm] Using Transparent Hugepages > >> [ 2.955103] Loading firmware: i915/kbl_dmc_ver1_04.bin > >> [ 2.955384] i915 0000:00:02.0: [drm] Finished loading DMC firmware > >> i915/kbl_dmc_ver1_04.bin (v1.4) > >> ... > >> [ 4.145013] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on > >> minor 0 > >> [ 4.147101] ACPI: video: Video Device [GFX0] (multi-head: yes rom: > >> no post: no) > >> [ 4.147244] input: Video Bus as > >> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4 > >> [ 4.147410] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1 > >> [ 4.147420] usbcore: registered new interface driver udl > >> [ 4.147500] [drm] Initialized simpledrm 1.0.0 20200625 for > >> simple-framebuffer.0 on minor 2 > >> [ 4.148643] Console: switching to colour frame buffer device 80x30 > >> [ 4.153216] simple-framebuffer simple-framebuffer.0: [drm] fb0: > >> simpledrmdrmfb frame buffer device > >> [ 4.154043] loop: module loaded > >> [ 4.156017] ahci 0000:00:17.0: version 3.0 > >> [ 4.157373] i915 0000:00:02.0: [drm] fb1: i915drmfb frame buffer device > >> ... > >> > >> J > >> >
Hi Huacai, On 19.03.24 16:16, Huacai Chen wrote: > Hi, Jaak, > > On Mon, Mar 18, 2024 at 11:42 PM Jaak Ristioja <jaak@ristioja.ee> wrote: >> >> Hi Huacai, >> >> Uh, no, sorry, I did not get to test such changes. From what Thomas >> wrote I presumed that this got fixed and no further action would be >> required. >> >> To speed things up I would appreciate it if you provided a patch. As I >> worked around the problem for the end user by changing the kernel >> configuration, I have long forgotten the exact details. It would >> otherwise probably take me a while to understand what and where you >> propose to change exactly. >> >> Also, do you want me to test it on some newer kernel or should I >> re-download the 6.5.# version sources? > Yes, it is better to use 6.5, you can simply change the last line of > drivers/firmware/sysfb.c to fs_initcall(sysfb_init), So no patch > needed. > > And to Thomas, > > I'm sorry that reverting 60aebc95594 solve Jaak's problem, but my > original problem exist (at least in 6.5), and I want to know the > result of the above experiment to understand what happens. Using the sources for 6.5.9 and fs_initcall(sysfb_init) instead of subsys_initcall_sync(sysfb_init) in drivers/firmware/sysfb.c did not help. The screen still went black during the boot and stayed black until SDDM started. Jaak
On Thu, Mar 21, 2024 at 4:55 AM Jaak Ristioja <jaak@ristioja.ee> wrote: > > Hi Huacai, > > On 19.03.24 16:16, Huacai Chen wrote: > > Hi, Jaak, > > > > On Mon, Mar 18, 2024 at 11:42 PM Jaak Ristioja <jaak@ristioja.ee> wrote: > >> > >> Hi Huacai, > >> > >> Uh, no, sorry, I did not get to test such changes. From what Thomas > >> wrote I presumed that this got fixed and no further action would be > >> required. > >> > >> To speed things up I would appreciate it if you provided a patch. As I > >> worked around the problem for the end user by changing the kernel > >> configuration, I have long forgotten the exact details. It would > >> otherwise probably take me a while to understand what and where you > >> propose to change exactly. > >> > >> Also, do you want me to test it on some newer kernel or should I > >> re-download the 6.5.# version sources? > > Yes, it is better to use 6.5, you can simply change the last line of > > drivers/firmware/sysfb.c to fs_initcall(sysfb_init), So no patch > > needed. > > > > And to Thomas, > > > > I'm sorry that reverting 60aebc95594 solve Jaak's problem, but my > > original problem exist (at least in 6.5), and I want to know the > > result of the above experiment to understand what happens. > > Using the sources for 6.5.9 and fs_initcall(sysfb_init) instead of > subsys_initcall_sync(sysfb_init) in drivers/firmware/sysfb.c did not > help. The screen still went black during the boot and stayed black until > SDDM started. OK, then can you try rootfs_initcall(sysfb_init)? Huacai > > Jaak
Hi, On 22.03.24 16:06, Huacai Chen wrote: > On Thu, Mar 21, 2024 at 4:55 AM Jaak Ristioja <jaak@ristioja.ee> wrote: >> >> Hi Huacai, >> >> On 19.03.24 16:16, Huacai Chen wrote: >>> Hi, Jaak, >>> >>> On Mon, Mar 18, 2024 at 11:42 PM Jaak Ristioja <jaak@ristioja.ee> wrote: >>>> >>>> Hi Huacai, >>>> >>>> Uh, no, sorry, I did not get to test such changes. From what Thomas >>>> wrote I presumed that this got fixed and no further action would be >>>> required. >>>> >>>> To speed things up I would appreciate it if you provided a patch. As I >>>> worked around the problem for the end user by changing the kernel >>>> configuration, I have long forgotten the exact details. It would >>>> otherwise probably take me a while to understand what and where you >>>> propose to change exactly. >>>> >>>> Also, do you want me to test it on some newer kernel or should I >>>> re-download the 6.5.# version sources? >>> Yes, it is better to use 6.5, you can simply change the last line of >>> drivers/firmware/sysfb.c to fs_initcall(sysfb_init), So no patch >>> needed. >>> >>> And to Thomas, >>> >>> I'm sorry that reverting 60aebc95594 solve Jaak's problem, but my >>> original problem exist (at least in 6.5), and I want to know the >>> result of the above experiment to understand what happens. >> >> Using the sources for 6.5.9 and fs_initcall(sysfb_init) instead of >> subsys_initcall_sync(sysfb_init) in drivers/firmware/sysfb.c did not >> help. The screen still went black during the boot and stayed black until >> SDDM started. > OK, then can you try rootfs_initcall(sysfb_init)? Unfortunately, this did not help, I still get the black screen until SDDM starts. Jaak
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 8e1bde059170..db0f3d3aff43 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -141,6 +141,7 @@ obj-y += arm/ obj-y += display/ obj-$(CONFIG_DRM_TTM) += ttm/ obj-$(CONFIG_DRM_SCHED) += scheduler/ +obj-y += tiny/ obj-$(CONFIG_DRM_RADEON)+= radeon/ obj-$(CONFIG_DRM_AMDGPU)+= amd/amdgpu/ obj-$(CONFIG_DRM_AMDGPU)+= amd/amdxcp/ @@ -182,7 +183,6 @@ obj-$(CONFIG_DRM_FSL_DCU) += fsl-dcu/ obj-$(CONFIG_DRM_ETNAVIV) += etnaviv/ obj-y += hisilicon/ obj-y += mxsfb/ -obj-y += tiny/ obj-$(CONFIG_DRM_PL111) += pl111/ obj-$(CONFIG_DRM_TVE200) += tve200/ obj-$(CONFIG_DRM_XEN) += xen/
After commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") some Lenovo laptops get a blank screen until the display manager starts. This regression occurs with such a Kconfig combination: CONFIG_SYSFB=y CONFIG_SYSFB_SIMPLEFB=y CONFIG_DRM_SIMPLEDRM=y CONFIG_DRM_I915=y # Or other native drivers such as radeon, amdgpu If replace CONFIG_DRM_SIMPLEDRM with CONFIG_FB_SIMPLE (they use the same device), there is no blank screen. The root cause is the initialization order, and this order depends on the Makefile. FB_SIMPLE is before native DRM drivers (e.g. i915, radeon, amdgpu, and so on), but DRM_SIMPLEDRM is after them. Thus, if we use FB_SIMPLE, I915 will takeover FB_SIMPLE, then no problem; and if we use DRM_SIMPLEDRM, DRM_SIMPLEDRM will try to takeover I915, but fails to work. So we can move the "tiny" directory before native DRM drivers to solve this problem. Fixes: 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t Reported-by: Jaak Ristioja <jaak@ristioja.ee> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> --- drivers/gpu/drm/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)