diff mbox series

[1/5] fbdev: Hot-unplug firmware fb devices on forced removal

Message ID 20220124123659.4692-2-tzimmermann@suse.de (mailing list archive)
State Superseded
Series sysfb: Fix memory-region management

Commit Message

Thomas Zimmermann Jan. 24, 2022, 12:36 p.m. UTC
Hot-unplug all firmware-framebuffer devices as part of removing
them via remove_conflicting_framebuffers() et al. This releases all
memory regions so that native drivers can acquire them.

Firmware, such as EFI, installs a framebuffer while posting the
computer. After removing the firmware-framebuffer device from fbdev,
a native driver takes over the hardware and the firmware framebuffer
becomes invalid.

Firmware-framebuffer drivers, specifically simplefb, don't release
their device from Linux' device hierarchy. As a result, the device
still owns the firmware framebuffer and blocks the native drivers
from loading. This has been observed in the vmwgfx driver. [1]

Initiating a device removal (i.e., hot unplug) as part of
remove_conflicting_framebuffers() removes the underlying device and
returns the memory range to the system.

[1] https://lore.kernel.org/dri-devel/20220117180359.18114-1-zack@kde.org/

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
CC: stable@vger.kernel.org # v5.11+
---
 drivers/video/fbdev/core/fbmem.c | 29 ++++++++++++++++++++++++++---
 include/linux/fb.h               |  1 +
 2 files changed, 27 insertions(+), 3 deletions(-)
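
The decision this patch adds can be modeled in userspace as a rough sketch. Everything below is a hypothetical stand-in (the `model_*` names, `enum bus_kind`, and the `region_claimed` flag are not kernel APIs); it only illustrates that a platform device is hot-unplugged, which frees the memory range, while any other device is left in place and only the framebuffer is unregistered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; names are illustrative only. */
enum bus_kind { BUS_PLATFORM, BUS_OTHER };

struct model_device {
	enum bus_kind bus;
	bool present;        /* still part of the device hierarchy */
	bool region_claimed; /* still owns the firmware framebuffer range */
};

struct model_fb {
	struct model_device *device; /* parent device, like fb_info->device */
	bool registered;
	bool forced_out;
};

/* Models do_unregister_framebuffer(): only the fbdev entry goes away. */
static void model_unregister_fb(struct model_fb *fb)
{
	fb->registered = false;
}

/*
 * Models platform_device_unregister(): the driver's remove path
 * unregisters the framebuffer, then the device and its memory
 * range are released.
 */
static void model_hot_unplug(struct model_fb *fb)
{
	model_unregister_fb(fb);
	fb->device->region_claimed = false;
	fb->device->present = false;
}

/* Returns true if the device was hot-unplugged and its range freed. */
static bool model_kick_out(struct model_fb *fb)
{
	struct model_device *dev = fb->device;

	if (dev && dev->bus == BUS_PLATFORM) {
		fb->forced_out = true;
		model_hot_unplug(fb);
		return true;
	}

	/* Not a platform device: the range stays claimed. */
	model_unregister_fb(fb);
	return false;
}
```

In this model, a native driver could claim the range only in the first branch; the second branch corresponds to the "cannot remove device" warning in the patch.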

Comments

Javier Martinez Canillas Jan. 24, 2022, 1:52 p.m. UTC | #1
Hello Thomas,

Thanks for the patch.

On 1/24/22 13:36, Thomas Zimmermann wrote:
> Hot-unplug all firmware-framebuffer devices as part of removing
> them via remove_conflicting_framebuffers() et al. Releases all
> memory regions to be acquired by native drivers.
> 
> Firmware, such as EFI, install a framebuffer while posting the
> computer. After removing the firmware-framebuffer device from fbdev,
> a native driver takes over the hardware and the firmware framebuffer
> becomes invalid.
> 
> Firmware-framebuffer drivers, specifically simplefb, don't release
> their device from Linux' device hierarchy. It still owns the firmware
> framebuffer and blocks the native drivers from loading. This has been
> observed in the vmwgfx driver. [1]
> 
> Initiating a device removal (i.e., hot unplug) as part of
> remove_conflicting_framebuffers() removes the underlying device and
> returns the memory range to the system.
> 
> [1] https://lore.kernel.org/dri-devel/20220117180359.18114-1-zack@kde.org/
> 

I would add a Reported-by tag here for Zack.

> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> CC: stable@vger.kernel.org # v5.11+
> ---
>  drivers/video/fbdev/core/fbmem.c | 29 ++++++++++++++++++++++++++---
>  include/linux/fb.h               |  1 +
>  2 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> index 0fa7ede94fa6..f73f8415b8cb 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -25,6 +25,7 @@
>  #include <linux/init.h>
>  #include <linux/linux_logo.h>
>  #include <linux/proc_fs.h>
> +#include <linux/platform_device.h>
>  #include <linux/seq_file.h>
>  #include <linux/console.h>
>  #include <linux/kmod.h>
> @@ -1557,18 +1558,36 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
>  	/* check all firmware fbs and kick off if the base addr overlaps */
>  	for_each_registered_fb(i) {
>  		struct apertures_struct *gen_aper;
> +		struct device *dev;
>  
>  		if (!(registered_fb[i]->flags & FBINFO_MISC_FIRMWARE))
>  			continue;
>  
>  		gen_aper = registered_fb[i]->apertures;
> +		dev = registered_fb[i]->device;
>  		if (fb_do_apertures_overlap(gen_aper, a) ||
>  			(primary && gen_aper && gen_aper->count &&
>  			 gen_aper->ranges[0].base == VGA_FB_PHYS)) {
>  
>  			printk(KERN_INFO "fb%d: switching to %s from %s\n",
>  			       i, name, registered_fb[i]->fix.id);
> -			do_unregister_framebuffer(registered_fb[i]);
> +
> +			/*
> +			 * If we kick-out a firmware driver, we also want to remove
> +			 * the underlying platform device, such as simple-framebuffer,
> +			 * VESA, EFI, etc. A native driver will then be able to
> +			 * allocate the memory range.
> +			 *
> +			 * If it's not a platform device, at least print a warning. A
> +			 * fix would add code to remove the device from the system.
> +			 */
> +			if (dev_is_platform(dev)) {

In do_register_framebuffer(), failing to create the fb%d device is not a fatal
error. It would be safer to do if (!IS_ERR_OR_NULL(dev) && dev_is_platform(dev))
instead here.

https://elixir.bootlin.com/linux/latest/source/drivers/video/fbdev/core/fbmem.c#L1605

> +				registered_fb[i]->forced_out = true;
> +				platform_device_unregister(to_platform_device(dev));
> +			} else {
> +				pr_warn("fb%d: cannot remove device\n", i);
> +				do_unregister_framebuffer(registered_fb[i]);
> +			}
>  		}
>  	}
>  }
> @@ -1898,9 +1917,13 @@ EXPORT_SYMBOL(register_framebuffer);
>  void
>  unregister_framebuffer(struct fb_info *fb_info)
>  {
> -	mutex_lock(&registration_lock);
> +	bool forced_out = fb_info->forced_out;
> +
> +	if (!forced_out)
> +		mutex_lock(&registration_lock);
>  	do_unregister_framebuffer(fb_info);
> -	mutex_unlock(&registration_lock);
> +	if (!forced_out)
> +		mutex_unlock(&registration_lock);
>  }

I'm not sure I follow the logic here. The forced_out bool is set when the
platform device is unregistered in do_remove_conflicting_framebuffers(),
but shouldn't the struct platform_driver .remove callback be executed even
in this case?

That is, the platform_device_unregister() will trigger the call to the
.remove callback that in turn will call unregister_framebuffer().

Shouldn't we always hold the mutex when calling do_unregister_framebuffer() ?

Best regards,
Javier Martinez Canillas Jan. 24, 2022, 1:56 p.m. UTC | #2
On 1/24/22 14:52, Javier Martinez Canillas wrote:

[snip]

>> @@ -1898,9 +1917,13 @@ EXPORT_SYMBOL(register_framebuffer);
>>  void
>>  unregister_framebuffer(struct fb_info *fb_info)
>>  {
>> -	mutex_lock(&registration_lock);
>> +	bool forced_out = fb_info->forced_out;
>> +
>> +	if (!forced_out)
>> +		mutex_lock(&registration_lock);
>>  	do_unregister_framebuffer(fb_info);
>> -	mutex_unlock(&registration_lock);
>> +	if (!forced_out)
>> +		mutex_unlock(&registration_lock);
>>  }
> 
> I'm not sure to follow the logic here. The forced_out bool is set when the
> platform device is unregistered in do_remove_conflicting_framebuffers(),
> but shouldn't the struct platform_driver .remove callback be executed even
> in this case ?
> 
> That is, the platform_device_unregister() will trigger the call to the
> .remove callback that in turn will call unregister_framebuffer().
> 
> Shouldn't we always hold the mutex when calling do_unregister_framebuffer() ?
> 

Scratch that, I got it now. That's exactly the reason why you skip the
mutex_lock(). After adding the check for dev, feel free to add:

Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>

Best regards,
Thomas Zimmermann Jan. 24, 2022, 2:19 p.m. UTC | #3
Hi

Am 24.01.22 um 14:52 schrieb Javier Martinez Canillas:
> Hello Thomas,
> 
> Thanks for the patch.
> 
> On 1/24/22 13:36, Thomas Zimmermann wrote:
>> Hot-unplug all firmware-framebuffer devices as part of removing
>> them via remove_conflicting_framebuffers() et al. Releases all
>> memory regions to be acquired by native drivers.
>>
>> Firmware, such as EFI, install a framebuffer while posting the
>> computer. After removing the firmware-framebuffer device from fbdev,
>> a native driver takes over the hardware and the firmware framebuffer
>> becomes invalid.
>>
>> Firmware-framebuffer drivers, specifically simplefb, don't release
>> their device from Linux' device hierarchy. It still owns the firmware
>> framebuffer and blocks the native drivers from loading. This has been
>> observed in the vmwgfx driver. [1]
>>
>> Initiating a device removal (i.e., hot unplug) as part of
>> remove_conflicting_framebuffers() removes the underlying device and
>> returns the memory range to the system.
>>
>> [1] https://lore.kernel.org/dri-devel/20220117180359.18114-1-zack@kde.org/
>>
> 
> I would add a Reported-by tag here for Zack.

Indeed, I simply forgot about it.

> 
>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>> CC: stable@vger.kernel.org # v5.11+
>> ---
>>   drivers/video/fbdev/core/fbmem.c | 29 ++++++++++++++++++++++++++---
>>   include/linux/fb.h               |  1 +
>>   2 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
>> index 0fa7ede94fa6..f73f8415b8cb 100644
>> --- a/drivers/video/fbdev/core/fbmem.c
>> +++ b/drivers/video/fbdev/core/fbmem.c
>> @@ -25,6 +25,7 @@
>>   #include <linux/init.h>
>>   #include <linux/linux_logo.h>
>>   #include <linux/proc_fs.h>
>> +#include <linux/platform_device.h>
>>   #include <linux/seq_file.h>
>>   #include <linux/console.h>
>>   #include <linux/kmod.h>
>> @@ -1557,18 +1558,36 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
>>   	/* check all firmware fbs and kick off if the base addr overlaps */
>>   	for_each_registered_fb(i) {
>>   		struct apertures_struct *gen_aper;
>> +		struct device *dev;
>>   
>>   		if (!(registered_fb[i]->flags & FBINFO_MISC_FIRMWARE))
>>   			continue;
>>   
>>   		gen_aper = registered_fb[i]->apertures;
>> +		dev = registered_fb[i]->device;
>>   		if (fb_do_apertures_overlap(gen_aper, a) ||
>>   			(primary && gen_aper && gen_aper->count &&
>>   			 gen_aper->ranges[0].base == VGA_FB_PHYS)) {
>>   
>>   			printk(KERN_INFO "fb%d: switching to %s from %s\n",
>>   			       i, name, registered_fb[i]->fix.id);
>> -			do_unregister_framebuffer(registered_fb[i]);
>> +
>> +			/*
>> +			 * If we kick-out a firmware driver, we also want to remove
>> +			 * the underlying platform device, such as simple-framebuffer,
>> +			 * VESA, EFI, etc. A native driver will then be able to
>> +			 * allocate the memory range.
>> +			 *
>> +			 * If it's not a platform device, at least print a warning. A
>> +			 * fix would add code to remove the device from the system.
>> +			 */
>> +			if (dev_is_platform(dev)) {
> 
> In do_register_framebuffer() creating the fb%d is not a fatal error. It would
> be safer to do if (!IS_ERR_OR_NULL(dev) && dev_is_platform(dev)) instead here.
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/video/fbdev/core/fbmem.c#L1605

'dev' here refers to 'fb_info->device', which is the underlying device 
created by the sysfb code.  fb_info->dev is something different.
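
To make the distinction concrete, here is a small userspace sketch. The `model_*` types and functions are hypothetical and only mirror the two pointers in struct fb_info: `device` is the parent (the firmware device), while `dev` is the fbN class device that registration creates and that may be missing without registration failing.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_device {
	const char *name;
};

struct model_fb_info {
	struct model_device *device; /* parent: the firmware platform device */
	struct model_device *dev;    /* the fbN class device, may be NULL */
};

/*
 * Models registration: the parent is set by the driver beforehand.
 * A failure to create the class device is tolerated, so 'dev' can
 * end up NULL while 'device' stays valid.
 */
static bool model_register(struct model_fb_info *info,
			   struct model_device *class_dev)
{
	info->dev = class_dev; /* NULL models a failed device creation */
	return true;
}

/* The hot-unplug acts on the parent device, never on the class device. */
static struct model_device *
model_removal_target(const struct model_fb_info *info)
{
	return info->device;
}
```

Under these assumptions, the removal target is still valid even when the class device could not be created, which is why the extra check on the class device does not apply to this code path.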

> 
>> +				registered_fb[i]->forced_out = true;
>> +				platform_device_unregister(to_platform_device(dev));
>> +			} else {
>> +				pr_warn("fb%d: cannot remove device\n", i);
>> +				do_unregister_framebuffer(registered_fb[i]);
>> +			}
>>   		}
>>   	}
>>   }
>> @@ -1898,9 +1917,13 @@ EXPORT_SYMBOL(register_framebuffer);
>>   void
>>   unregister_framebuffer(struct fb_info *fb_info)
>>   {
>> -	mutex_lock(&registration_lock);
>> +	bool forced_out = fb_info->forced_out;
>> +
>> +	if (!forced_out)
>> +		mutex_lock(&registration_lock);
>>   	do_unregister_framebuffer(fb_info);
>> -	mutex_unlock(&registration_lock);
>> +	if (!forced_out)
>> +		mutex_unlock(&registration_lock);
>>   }
> 
> I'm not sure to follow the logic here. The forced_out bool is set when the
> platform device is unregistered in do_remove_conflicting_framebuffers(),
> but shouldn't the struct platform_driver .remove callback be executed even
> in this case ?
> 
> That is, the platform_device_unregister() will trigger the call to the
> .remove callback that in turn will call unregister_framebuffer().
> 
> Shouldn't we always hold the mutex when calling do_unregister_framebuffer() ?

Doing the hot-unplug will end up in unregister_framebuffer(), but we 
already hold the lock from the do_remove_conflicting_framebuffers() code.
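
The re-entrancy can be illustrated with a small userspace model. This is only a sketch: `struct model_lock` is a hypothetical non-recursive lock that reports a second acquisition instead of blocking, and the `model_*` functions stand in for the kernel functions of similar name.

```c
#include <stdbool.h>

/* Hypothetical non-recursive lock: a second acquire fails instead of blocking. */
struct model_lock {
	bool held;
};

static bool model_lock_acquire(struct model_lock *l)
{
	if (l->held)
		return false; /* a real mutex would deadlock here */
	l->held = true;
	return true;
}

static void model_lock_release(struct model_lock *l)
{
	l->held = false;
}

struct model_fb {
	bool registered;
	bool forced_out;
};

static struct model_lock registration_lock;

/* Models unregister_framebuffer(); returns false on a would-be deadlock. */
static bool model_unregister_framebuffer(struct model_fb *fb)
{
	bool forced_out = fb->forced_out;

	if (!forced_out && !model_lock_acquire(&registration_lock))
		return false;
	fb->registered = false;
	if (!forced_out)
		model_lock_release(&registration_lock);
	return true;
}

/*
 * Models the removal path: the lock is taken first, then the
 * hot-unplug reaches model_unregister_framebuffer() through the
 * driver's remove callback.
 */
static bool model_remove_conflicting(struct model_fb *fb, bool set_flag)
{
	bool ok;

	if (!model_lock_acquire(&registration_lock))
		return false;
	fb->forced_out = set_flag;
	ok = model_unregister_framebuffer(fb);
	model_lock_release(&registration_lock);
	return ok;
}
```

With `set_flag` true the nested call skips the lock and succeeds; with it false the model reports the double acquisition that would hang a real mutex. A direct call to `model_unregister_framebuffer()` from a driver still takes and releases the lock as before.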

Best regards
Thomas

> 
> Best regards,
Javier Martinez Canillas Jan. 24, 2022, 2:31 p.m. UTC | #4
On 1/24/22 15:19, Thomas Zimmermann wrote:

[snip]

>>> +			if (dev_is_platform(dev)) {
>>
>> In do_register_framebuffer() creating the fb%d is not a fatal error. It would
>> be safer to do if (!IS_ERR_OR_NULL(dev) && dev_is_platform(dev)) instead here.
>>
>> https://elixir.bootlin.com/linux/latest/source/drivers/video/fbdev/core/fbmem.c#L1605
> 
> 'dev' here refers to 'fb_info->device', which is the underlying device 
> created by the sysfb code.  fb_info->dev is something different.
>

Oh, indeed. I conflated the two.

Maybe the local variable could be renamed to 'device' just to avoid confusion?

[snip]

>> I'm not sure to follow the logic here. The forced_out bool is set when the
>> platform device is unregistered in do_remove_conflicting_framebuffers(),
>> but shouldn't the struct platform_driver .remove callback be executed even
>> in this case ?
>>
>> That is, the platform_device_unregister() will trigger the call to the
>> .remove callback that in turn will call unregister_framebuffer().
>>
>> Shouldn't we always hold the mutex when calling do_unregister_framebuffer() ?
> 
> Doing the hot-unplug will end up in unregister_framebuffer(), but we 
> already hold the lock from the do_remove_conflicting_framebuffer() code.
>

Yes, I realized that just after sending the first email. Sorry for the noise.
 
Best regards,
Zack Rusin Jan. 24, 2022, 3:59 p.m. UTC | #5
On Mon, 2022-01-24 at 13:36 +0100, Thomas Zimmermann wrote:
> Hot-unplug all firmware-framebuffer devices as part of removing
> them via remove_conflicting_framebuffers() et al. Releases all
> memory regions to be acquired by native drivers.
> 
> Firmware, such as EFI, install a framebuffer while posting the
> computer. After removing the firmware-framebuffer device from fbdev,
> a native driver takes over the hardware and the firmware framebuffer
> becomes invalid.
> 
> Firmware-framebuffer drivers, specifically simplefb, don't release
> their device from Linux' device hierarchy. It still owns the firmware
> framebuffer and blocks the native drivers from loading. This has been
> observed in the vmwgfx driver. [1]
> 
> Initiating a device removal (i.e., hot unplug) as part of
> remove_conflicting_framebuffers() removes the underlying device and
> returns the memory range to the system.
> 
> [1]
> https://lore.kernel.org/dri-devel/20220117180359.18114-1-zack@kde.org/
> 
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> CC: stable@vger.kernel.org # v5.11+

Looks great, thanks!

Reviewed-by: Zack Rusin <zackr@vmware.com>

Patch

diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 0fa7ede94fa6..f73f8415b8cb 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -25,6 +25,7 @@ 
 #include <linux/init.h>
 #include <linux/linux_logo.h>
 #include <linux/proc_fs.h>
+#include <linux/platform_device.h>
 #include <linux/seq_file.h>
 #include <linux/console.h>
 #include <linux/kmod.h>
@@ -1557,18 +1558,36 @@  static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
 	/* check all firmware fbs and kick off if the base addr overlaps */
 	for_each_registered_fb(i) {
 		struct apertures_struct *gen_aper;
+		struct device *dev;
 
 		if (!(registered_fb[i]->flags & FBINFO_MISC_FIRMWARE))
 			continue;
 
 		gen_aper = registered_fb[i]->apertures;
+		dev = registered_fb[i]->device;
 		if (fb_do_apertures_overlap(gen_aper, a) ||
 			(primary && gen_aper && gen_aper->count &&
 			 gen_aper->ranges[0].base == VGA_FB_PHYS)) {
 
 			printk(KERN_INFO "fb%d: switching to %s from %s\n",
 			       i, name, registered_fb[i]->fix.id);
-			do_unregister_framebuffer(registered_fb[i]);
+
+			/*
+			 * If we kick-out a firmware driver, we also want to remove
+			 * the underlying platform device, such as simple-framebuffer,
+			 * VESA, EFI, etc. A native driver will then be able to
+			 * allocate the memory range.
+			 *
+			 * If it's not a platform device, at least print a warning. A
+			 * fix would add code to remove the device from the system.
+			 */
+			if (dev_is_platform(dev)) {
+				registered_fb[i]->forced_out = true;
+				platform_device_unregister(to_platform_device(dev));
+			} else {
+				pr_warn("fb%d: cannot remove device\n", i);
+				do_unregister_framebuffer(registered_fb[i]);
+			}
 		}
 	}
 }
@@ -1898,9 +1917,13 @@  EXPORT_SYMBOL(register_framebuffer);
 void
 unregister_framebuffer(struct fb_info *fb_info)
 {
-	mutex_lock(&registration_lock);
+	bool forced_out = fb_info->forced_out;
+
+	if (!forced_out)
+		mutex_lock(&registration_lock);
 	do_unregister_framebuffer(fb_info);
-	mutex_unlock(&registration_lock);
+	if (!forced_out)
+		mutex_unlock(&registration_lock);
 }
 EXPORT_SYMBOL(unregister_framebuffer);
 
diff --git a/include/linux/fb.h b/include/linux/fb.h
index 3da95842b207..9a14f3f8a329 100644
--- a/include/linux/fb.h
+++ b/include/linux/fb.h
@@ -502,6 +502,7 @@  struct fb_info {
 	} *apertures;
 
 	bool skip_vt_switch; /* no VT switch on suspend/resume required */
+	bool forced_out; /* set when being removed by another driver */
 };
 
 static inline struct apertures_struct *alloc_apertures(unsigned int max_num) {