Message ID | 20230613030151.216625-8-15330273260@189.cn (mailing list archive) |
---|---|
State | Handled Elsewhere |
Series | PCI/VGA: Introduce is_boot_device function callback to vga_client_register |
Hi,

Does anyone have the bandwidth to review this? I am providing some additional information here in the hope that it helps.

On a non-x86 platform with multiple GPUs, the discrete AMDGPU fails to override the integrated one because PCI BAR 0 of the AMDGPU gets moved. So relying on screen_info is not always reliable.

Below is the log of 'dmesg | grep vgaarb':

[ 0.361928] pci 0000:00:06.1: vgaarb: setting as boot VGA device
[ 0.361932] pci 0000:00:06.1: vgaarb: bridge control possible
[ 0.361933] pci 0000:00:06.1: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[ 0.361940] pci 0000:05:00.0: vgaarb: bridge control possible
[ 0.361941] pci 0000:05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 0.361943] vgaarb: loaded
[ 11.352087] amdgpu 0000:05:00.0: vgaarb: Set as boot device (dictated by driver)
[ 11.575505] loongson 0000:00:06.1: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[ 11.585100] amdgpu 0000:05:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none

dmesg | grep efifb:

[ 0.356355] pci 0000:05:00.0: BAR 0: assigned to efifb
[ 0.375793] efifb: probing for efifb
[ 0.375795] pci 0000:05:00.0: BAR has moved, updating efifb address
[ 0.375803] efifb: framebuffer at 0xe0030000000, using 976k, total 975k
[ 0.375805] efifb: mode is 800x600x16, linelength=1664, pages=1
[ 0.375806] efifb: scrolling: redraw
[ 0.375808] efifb: Truecolor: size=0:5:6:5, shift=0:11:5:0

The efifb output also confirms that the BAR has moved.

From dmesg | grep "pci 0000:05:00.0":

[ 0.356286] pci 0000:05:00.0: [1002:699f] type 00 class 0x030000
[ 0.356303] pci 0000:05:00.0: reg 0x10: [mem 0xe0020000000-0xe002fffffff 64bit pref]
[ 0.356315] pci 0000:05:00.0: reg 0x18: [mem 0xe0030000000-0xe00301fffff 64bit pref]
[ 0.356323] pci 0000:05:00.0: reg 0x20: [io 0x40000-0x400ff]
[ 0.356331] pci 0000:05:00.0: reg 0x24: [mem 0xe0053100000-0xe005313ffff]
[ 0.356339] pci 0000:05:00.0: reg 0x30: [mem 0xfffe0000-0xffffffff pref]
[ 0.356346] pci 0000:05:00.0: enabling Extended Tags
[ 0.356355] pci 0000:05:00.0: BAR 0: assigned to efifb
[ 0.356421] pci 0000:05:00.0: supports D1 D2
[ 0.356422] pci 0000:05:00.0: PME# supported from D1 D2 D3hot D3cold
[ 0.356858] pci 0000:05:00.0: BAR 0: assigned [mem 0xe0030000000-0xe003fffffff 64bit pref]
[ 0.356866] pci 0000:05:00.0: BAR 2: assigned [mem 0xe0040000000-0xe00401fffff 64bit pref]
[ 0.356875] pci 0000:05:00.0: BAR 5: assigned [mem 0xe0049000000-0xe004903ffff]
[ 0.356878] pci 0000:05:00.0: BAR 6: assigned [mem 0xe0049040000-0xe004905ffff pref]
[ 0.356889] pci 0000:05:00.0: BAR 4: assigned [io 0x4000-0x40ff]
[ 0.361940] pci 0000:05:00.0: vgaarb: bridge control possible
[ 0.361941] pci 0000:05:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
[ 0.375795] pci 0000:05:00.0: BAR has moved, updating efifb address

We can see that BAR 0 of the AMDGPU moved from '0xe0020000000-0xe002fffffff' to '0xe0030000000-0xe003fffffff', while the framebuffer location recorded in screen_info still belongs to '0xe0020000000-0xe002fffffff'.

I suspect this is also the reason why video/aperture does not rely on the information provided by screen_info. Instead, it requires the firmware framebuffer driver (efifb) to call the devm_aperture_acquire_from_firmware() function. Only in this way can video/aperture record the correct information about the aperture being used by the firmware framebuffer.

vgaarb, however, is loaded too early, even before efifb, so we can only rely on the pci_notifier to call back to us.

On 2023/6/13 11:01, Sui Jingfeng wrote:
> From: Sui Jingfeng <suijingfeng@loongson.cn>
>
> [why]
>
> The vga_is_firmware_default() defined in drivers/pci/vgaarb.c is
> arch-dependent, it's a dummy on non-x86 architectures currently.
> This made VGAARB lost an important condition for the arbitration.
> It could still be wrong even if we remove the #ifdef and #endif guards.
> because the PCI bar will move (resource re-allocation).
>
> [how]
>
> The device that owns the firmware framebuffer should be the default boot
> device. This patch adds an arch-independent function to enforce this rule
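The stale-address problem described in the reply can be illustrated with plain range arithmetic. The following is a minimal userspace sketch, not kernel code: `struct addr_range`, `range_contains()`, and the constants are hypothetical names introduced for illustration. The two BAR 0 ranges are taken from the log above, and the framebuffer size is derived from the efifb mode line (linelength 1664 times 600 rows, i.e. 975 KiB). The exact base recorded in screen_info is not shown in the log, so using the start of the old BAR 0 range is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type (not a kernel structure): inclusive address range. */
struct addr_range {
	uint64_t start;
	uint64_t end; /* inclusive, as printed by the PCI core */
};

/* BAR 0 of 0000:05:00.0 before and after resource re-allocation (from the log). */
static const struct addr_range old_bar0 = { 0xe0020000000ULL, 0xe002fffffffULL };
static const struct addr_range new_bar0 = { 0xe0030000000ULL, 0xe003fffffffULL };

/* 800x600x16 with linelength=1664: 1664 * 600 bytes = 975 KiB. */
#define FW_FB_SIZE (1664u * 600u)

/* ASSUMPTION: the firmware framebuffer started at offset 0 of the old BAR 0. */
#define STALE_FB_BASE 0xe0020000000ULL

/* Return true if [base, base + size) lies entirely inside r. */
static bool range_contains(struct addr_range r, uint64_t base, uint64_t size)
{
	assert(size > 0);
	uint64_t last = base + size - 1;

	/* last >= base rejects wrap-around of the 64-bit address. */
	return last >= base && base >= r.start && last <= r.end;
}
```

With these values, `range_contains(old_bar0, STALE_FB_BASE, FW_FB_SIZE)` holds, while the same check against `new_bar0` fails. That is the situation in which a comparison based on the address recorded at boot would no longer match the device's current resources.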
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a096f2d5c16..77624e8062d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3560,6 +3560,15 @@ static const struct attribute *amdgpu_dev_attributes[] = {
 	NULL
 };
 
+static bool amdgpu_is_boot_device(struct pci_dev *pdev)
+{
+	struct drm_device *dev = pci_get_drvdata(pdev);
+	struct amdgpu_device *adev = drm_to_adev(dev);
+	struct amdgpu_gmc *gmc = &adev->gmc;
+
+	return drm_aperture_contain_firmware_fb(gmc->aper_base, gmc->aper_size);
+}
+
 /**
  * amdgpu_device_init - initialize the driver
  *
@@ -3960,7 +3969,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	/* this will fail for cards that aren't VGA class devices, just
 	 * ignore it */
 	if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
-		vga_client_register(adev->pdev, amdgpu_device_vga_set_decode, NULL);
+		vga_client_register(adev->pdev, amdgpu_device_vga_set_decode,
+				    amdgpu_is_boot_device);
 
 	px = amdgpu_device_supports_px(ddev);
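The idea behind the callback in the diff above can be sketched as a small userspace mock. This is only an illustration under stated assumptions, not the vgaarb implementation: `struct mock_dev`, `mock_fw_fb_base`, `mock_fw_fb_size`, `mock_aperture_contains_fw_fb()`, and `mock_pick_boot_device()` are all hypothetical names. The sketch assumes the arbiter starts with the first registered device as the default and lets a driver override that choice when its callback reports that the device's current aperture contains the firmware framebuffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VGA client; not a kernel structure. */
struct mock_dev {
	const char *name;
	uint64_t aper_base; /* current aperture base, after any BAR move */
	uint64_t aper_size;
	/* Optional driver callback; NULL means "no opinion". */
	bool (*is_boot_device)(const struct mock_dev *dev);
};

/* Hypothetical record of the firmware framebuffer, as a driver like efifb
 * would report it after updating its address. A size of 0 means "none". */
static uint64_t mock_fw_fb_base;
static uint64_t mock_fw_fb_size;

/* True if the firmware framebuffer lies inside the device's aperture. */
static bool mock_aperture_contains_fw_fb(const struct mock_dev *dev)
{
	uint64_t fb_end = mock_fw_fb_base + mock_fw_fb_size;
	uint64_t ap_end = dev->aper_base + dev->aper_size;

	return mock_fw_fb_size != 0 &&
	       mock_fw_fb_base >= dev->aper_base && fb_end <= ap_end;
}

/* Pick the boot device: the first device is the fallback, and any device
 * whose callback returns true overrides it. */
static const struct mock_dev *
mock_pick_boot_device(const struct mock_dev *devs, size_t n)
{
	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (devs[i].is_boot_device && devs[i].is_boot_device(&devs[i]))
			return &devs[i];

	return n ? &devs[0] : NULL;
}
```

Because the callback reads the device's current aperture rather than an address captured before resource re-allocation, the decision stays correct even after a BAR has moved. A caller would fill an array of `struct mock_dev`, set the two `mock_fw_fb_*` variables, and call `mock_pick_boot_device()`.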