
[v7,7/8] drm/amdgpu: Implement the is_boot_device callback function

Message ID 20230613030151.216625-8-15330273260@189.cn (mailing list archive)
State Not Applicable
Delegated to: Bjorn Helgaas
Series PCI/VGA: Introduce is_boot_device function callback to vga_client_register

Commit Message

Sui Jingfeng June 13, 2023, 3:01 a.m. UTC
From: Sui Jingfeng <suijingfeng@loongson.cn>

[why]

vga_is_firmware_default(), defined in drivers/pci/vgaarb.c, is
arch-dependent; it is currently a dummy on non-x86 architectures.
As a result, VGAARB loses an important condition for arbitration.
Removing the #ifdef and #endif guards would not be enough, because the
result could still be wrong when the PCI BAR moves (resource
re-allocation).

[how]

The device that owns the firmware framebuffer should be the default boot
device. This patch adds an arch-independent function to enforce this rule.
The vgaarb subsystem calls back into amdgpu_is_boot_device() once
drm/amdgpu has successfully bound to an AMDGPU device.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Konig <christian.koenig@amd.com>
Cc: Pan Xinhui <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: YiPeng Chai <YiPeng.Chai@amd.com>
Cc: Bokun Zhang <Bokun.Zhang@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Sui Jingfeng June 15, 2023, 6:51 a.m. UTC | #1
Hi,

Does anyone have the bandwidth to review this?

Here is some additional information; I hope it helps.

On a non-x86 multi-GPU platform, the discrete AMDGPU fails to override
the integrated one because PCI BAR 0 of the AMDGPU gets moved. Relying
on screen_info is therefore not always reliable.

Below is the output of 'dmesg | grep vgaarb':


[    0.361928] pci 0000:00:06.1: vgaarb: setting as boot VGA device
[    0.361932] pci 0000:00:06.1: vgaarb: bridge control possible
[    0.361933] pci 0000:00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.361940] pci 0000:05:00.0: vgaarb: bridge control possible
[    0.361941] pci 0000:05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.361943] vgaarb: loaded
[   11.352087] amdgpu 0000:05:00.0: vgaarb: Set as boot device (dictated by driver)
[   11.575505] loongson 0000:00:06.1: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[   11.585100] amdgpu 0000:05:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none


dmesg | grep efifb:


[    0.356355] pci 0000:05:00.0: BAR 0: assigned to efifb
[    0.375793] efifb: probing for efifb
[    0.375795] pci 0000:05:00.0: BAR has moved, updating efifb address
[    0.375803] efifb: framebuffer at 0xe0030000000, using 976k, total 975k
[    0.375805] efifb: mode is 800x600x16, linelength=1664, pages=1
[    0.375806] efifb: scrolling: redraw
[    0.375808] efifb: Truecolor: size=0:5:6:5, shift=0:11:5:0


The efifb log also shows that the BAR has moved.


From dmesg | grep "pci 0000:05:00.0":


[    0.356286] pci 0000:05:00.0: [1002:699f] type 00 class 0x030000
[    0.356303] pci 0000:05:00.0: reg 0x10: [mem 0xe0020000000-0xe002fffffff 64bit pref]
[    0.356315] pci 0000:05:00.0: reg 0x18: [mem 0xe0030000000-0xe00301fffff 64bit pref]
[    0.356323] pci 0000:05:00.0: reg 0x20: [io  0x40000-0x400ff]
[    0.356331] pci 0000:05:00.0: reg 0x24: [mem 0xe0053100000-0xe005313ffff]
[    0.356339] pci 0000:05:00.0: reg 0x30: [mem 0xfffe0000-0xffffffff pref]
[    0.356346] pci 0000:05:00.0: enabling Extended Tags
[    0.356355] pci 0000:05:00.0: BAR 0: assigned to efifb
[    0.356421] pci 0000:05:00.0: supports D1 D2
[    0.356422] pci 0000:05:00.0: PME# supported from D1 D2 D3hot D3cold
[    0.356858] pci 0000:05:00.0: BAR 0: assigned [mem 0xe0030000000-0xe003fffffff 64bit pref]
[    0.356866] pci 0000:05:00.0: BAR 2: assigned [mem 0xe0040000000-0xe00401fffff 64bit pref]
[    0.356875] pci 0000:05:00.0: BAR 5: assigned [mem 0xe0049000000-0xe004903ffff]
[    0.356878] pci 0000:05:00.0: BAR 6: assigned [mem 0xe0049040000-0xe004905ffff pref]
[    0.356889] pci 0000:05:00.0: BAR 4: assigned [io 0x4000-0x40ff]
[    0.361940] pci 0000:05:00.0: vgaarb: bridge control possible
[    0.361941] pci 0000:05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[    0.375795] pci 0000:05:00.0: BAR has moved, updating efifb address

We can see that BAR 0 of the AMDGPU moved from
'0xe0020000000-0xe002fffffff' to '0xe0030000000-0xe003fffffff',
while the framebuffer location recorded in screen_info still falls
within '0xe0020000000-0xe002fffffff'.
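
To make the mismatch concrete, here is a minimal user-space sketch of the
range check, not kernel code. The BAR ranges are taken from the log above.
The stale framebuffer base (start of the old BAR) and its size
(linelength 1664 * 600 rows) are assumptions inferred from the efifb
output; struct addr_range and range_contains() are hypothetical names
used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* A physical address range [start, start + size). */
struct addr_range {
	uint64_t start;
	uint64_t size;
};

/* Return true if 'inner' lies entirely within 'outer'. */
static bool range_contains(struct addr_range outer, struct addr_range inner)
{
	if (inner.size == 0 || inner.size > outer.size)
		return false;
	if (inner.start < outer.start)
		return false;
	/* Compare offsets instead of end addresses to avoid overflow. */
	return inner.start - outer.start <= outer.size - inner.size;
}

/* BAR 0 as programmed by firmware (reg 0x10 in the log), 256 MiB. */
static const struct addr_range bar0_firmware = { 0xe0020000000ULL, 0x10000000ULL };

/* BAR 0 after the PCI core reassigned resources. */
static const struct addr_range bar0_kernel = { 0xe0030000000ULL, 0x10000000ULL };

/* Assumption: the address screen_info still records (start of the old BAR). */
static const struct addr_range fb_stale = { 0xe0020000000ULL, 1664ULL * 600ULL };

/* The address efifb reports after its "BAR has moved" fixup. */
static const struct addr_range fb_updated = { 0xe0030000000ULL, 1664ULL * 600ULL };
```

Under these assumptions, range_contains(bar0_kernel, fb_stale) is false
while range_contains(bar0_kernel, fb_updated) is true, which is why a
check against the stale screen_info address no longer matches the
AMDGPU after the move.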


I suspect this is also why video/aperture does not rely on the
information provided by screen_info. Instead, it requires the firmware
framebuffer driver (efifb) to call devm_aperture_acquire_from_firmware();
only in this way can video/aperture record correct information about
the aperture being used by the firmware framebuffer.


vgaarb, however, is loaded too early, even before efifb, so we can
only rely on the PCI notifier to call back to us.
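
The ordering problem can be illustrated with a small user-space mock.
This is a sketch of the idea only and does not reproduce the vgaarb
implementation: every name below (mock_dev, mock_client_register,
mock_pick_boot_device and so on) is hypothetical. It assumes the arbiter
makes an early guess at enumeration time and later prefers a device
whose driver, once bound, claims to own the firmware framebuffer.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_dev;

/* Driver-supplied predicate, analogous in spirit to is_boot_device. */
typedef bool (*is_boot_device_fn)(const struct mock_dev *dev);

struct mock_dev {
	const char *name;
	bool early_default;               /* chosen at early enumeration */
	bool owns_firmware_fb;            /* known only to the bound driver */
	is_boot_device_fn is_boot_device; /* NULL until a driver registers */
};

/* Called by a driver when it binds; stores the callback for later use. */
static void mock_client_register(struct mock_dev *dev, is_boot_device_fn cb)
{
	dev->is_boot_device = cb;
}

/* Example driver callback: report ownership of the firmware framebuffer. */
static bool mock_driver_is_boot_device(const struct mock_dev *dev)
{
	return dev->owns_firmware_fb;
}

/*
 * Pick the boot device: a claim made by a bound driver wins, otherwise
 * fall back to the early guess made before any driver was loaded.
 */
static const struct mock_dev *
mock_pick_boot_device(const struct mock_dev *devs, size_t n)
{
	const struct mock_dev *fallback = NULL;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].is_boot_device && devs[i].is_boot_device(&devs[i]))
			return &devs[i];
		if (!fallback && devs[i].early_default)
			fallback = &devs[i];
	}
	return fallback;
}
```

With two devices modelled on the log (0000:00:06.1 as the early default,
0000:05:00.0 owning the firmware framebuffer), the mock returns the
integrated device until the callback is registered for 0000:05:00.0, and
the discrete device afterwards. This mirrors the "Set as boot device
(dictated by driver)" line in the vgaarb log.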


On 2023/6/13 11:01, Sui Jingfeng wrote:
> From: Sui Jingfeng <suijingfeng@loongson.cn>
>
> [why]
>
> The vga_is_firmware_default() defined in drivers/pci/vgaarb.c is
> arch-dependent, it's a dummy on non-x86 architectures currently.
> This made VGAARB lost an important condition for the arbitration.
> It could still be wrong even if we remove the #ifdef and #endif guards.
> because the PCI bar will move (resource re-allocation).
>
> [how]
>
> The device that owns the firmware framebuffer should be the default boot
> device. This patch adds an arch-independent function to enforce this rule

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a096f2d5c16..77624e8062d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3560,6 +3560,15 @@  static const struct attribute *amdgpu_dev_attributes[] = {
 	NULL
 };
 
+static bool amdgpu_is_boot_device(struct pci_dev *pdev)
+{
+	struct drm_device *dev = pci_get_drvdata(pdev);
+	struct amdgpu_device *adev = drm_to_adev(dev);
+	struct amdgpu_gmc *gmc = &adev->gmc;
+
+	return drm_aperture_contain_firmware_fb(gmc->aper_base, gmc->aper_size);
+}
+
 /**
  * amdgpu_device_init - initialize the driver
  *
@@ -3960,7 +3969,8 @@  int amdgpu_device_init(struct amdgpu_device *adev,
 	/* this will fail for cards that aren't VGA class devices, just
 	 * ignore it */
 	if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
-		vga_client_register(adev->pdev, amdgpu_device_vga_set_decode, NULL);
+		vga_client_register(adev->pdev, amdgpu_device_vga_set_decode,
+				    amdgpu_is_boot_device);
 
 	px = amdgpu_device_supports_px(ddev);