Message ID | 20230904195724.633404-4-sui.jingfeng@linux.dev (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | PCI/VGA: Allowing the user to select the primary video adapter at boot time | expand |
Am 04.09.23 um 21:57 schrieb Sui Jingfeng: > From: Sui Jingfeng <suijingfeng@loongson.cn> > > On a machine with multiple GPUs, a Linux user has no control over which one > is primary at boot time. Question is why is that useful? Should we give users the ability to control that? I don't see an use case for this. Regards, Christian. > This patch tries to solve the mentioned problem by > implementing the .be_primary() callback. Pass radeon.modeset=10 on the > kernel cmd line if you really want the device bound by radeon to be the > primary video adapter, no matter what VGAARB say. > > Cc: Alex Deucher <alexander.deucher@amd.com> > Cc: Christian Koenig <christian.koenig@amd.com> > Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> > --- > drivers/gpu/drm/radeon/radeon_device.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c > index 71f2ff39d6a1..b661cd3a8dc2 100644 > --- a/drivers/gpu/drm/radeon/radeon_device.c > +++ b/drivers/gpu/drm/radeon/radeon_device.c > @@ -1263,6 +1263,14 @@ static const struct vga_switcheroo_client_ops radeon_switcheroo_ops = { > .can_switch = radeon_switcheroo_can_switch, > }; > > +static bool radeon_want_to_be_primary(struct pci_dev *pdev) > +{ > + if (radeon_modeset == 10) > + return true; > + > + return false; > +} > + > /** > * radeon_device_init - initialize the driver > * > @@ -1425,7 +1433,7 @@ int radeon_device_init(struct radeon_device *rdev, > /* if we have > 1 VGA cards, then disable the radeon VGA resources */ > /* this will fail for cards that aren't VGA class devices, just > * ignore it */ > - vga_client_register(rdev->pdev, radeon_vga_set_decode, NULL); > + vga_client_register(rdev->pdev, radeon_vga_set_decode, radeon_want_to_be_primary); > > if (rdev->flags & RADEON_IS_PX) > runtime = true;
Hi, On 2023/9/5 13:50, Christian König wrote: > Am 04.09.23 um 21:57 schrieb Sui Jingfeng: >> From: Sui Jingfeng <suijingfeng@loongson.cn> >> >> On a machine with multiple GPUs, a Linux user has no control over >> which one >> is primary at boot time. > > Question is why is that useful? Should we give users the ability to > control that? > > I don't see an use case for this. > On a specific machine with multiple GPUs mounted, only the primary graphics get POST-ed (initialized) by the firmware. Therefore the DRM drivers for the rest video cards have to work without the prerequisite setups done by firmware, This is called as POST. One of the use cases is to test if a specific DRM driver would works properly, under the circumstance of not being POST-ed, The ast drm driver is the first one which refused to work if not being POST-ed by the firmware. Before apply this series, I was unable make drm/ast as the primary video card easily. The problem is that on a multiple video card configuration, the monitor connected with my AST2400 card not light up. While confusing, a naive programmer may suspect the PRIME is not working. After applied this series and passing ast.modeset=10 on the kernel cmd line, I found that the monitor connected with my ast2400 video card still black, It doesn't display and It doesn't show image to me. While in the process of study drm/ast, I know that drm/ast driver has the POST code shipped, See the ast_post_gpu() function. Then, I was wondering why this function doesn't works. After a short-time (hasty) debugging, I found that the ast_post_gpu() function didn't get run. Because it have something to do with the ast->config_mode. Without thinking too much, I hardcoded the ast->config_mode as ast_use_p2a, the key point is to force the ast_post_gpu() function to run. ``` --- a/drivers/gpu/drm/ast/ast_main.c +++ b/drivers/gpu/drm/ast/ast_main.c @@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device *ast) } } + ast->config_mode = ast_use_p2a; + switch (ast->config_mode) { case ast_use_defaults: drm_info(dev, "Using default configuration\n"); ``` Then, the monitor light up, it display the Ubuntu greeter to me. Therefore my patch is useful, at least for the Linux drm driver tester and developer. It allow programmers to test the specific part of a specific driver without changing a line of the source code and without the need of sudo authority. It improves the efficiency of the testing and patch verification. I know the PrimaryGPU option of Xorg conf, but this approach will remember the setup have been made, you need modify it with root authority each time you want to switch the primary. But on the process of rapid developing and/or testing for multiple video drivers, with only one computer hardware resource available. What we really want is a one-shot command, as provided by this series. So, this is the first use case. The second use case is that sometime the firmware is not reliable. While there are thousands of ARM64, PowerPC and Mips servers machine, Most of them don't have a good UEFI firmware support. I haven't test the drm/amdgpu and drm/radeon at my ARM64 server yet. Because this ARM64 server always use the platform(BMC) integrated display controller as primary. The UEFI firmware of it does not provide options menu to tune. So, for the first time, the discrete card because useless, despite more powerful. I will take time to carry on the testing, so I will be able to tell more in the future. Even on X86, when select the PEG as primary on the UEFI BIOS menu. There is no way to tell the bios which one of my three discrete video be the primary. Not to mention some old UEFI firmware, which doesn't provide a setting at all. While the benefit of my approach is the flexibility. Yes the i915, amdgpu and radeon are good quality, but there may have programmers want to try nouveau. The third use case is that VGAARB is also not reliable, It will select a wrong device as primary. Especially on Arm64, Loongarch and mips arch etc. And the X server will use this wrong device as primary and completely crash there. Either because of lacking a driver or the driver has a bug which can not bear the graphic environment up. VGAARB is firmware dependent. My patch provide a temporary method to rescue. The forth is probably the PRIME and reverse PRIME development and developing driver for new video cards.
On Tue, Sep 5, 2023 at 1:25 PM suijingfeng <suijingfeng@loongson.cn> wrote: > > Hi, > > > On 2023/9/5 13:50, Christian König wrote: > > Am 04.09.23 um 21:57 schrieb Sui Jingfeng: > >> From: Sui Jingfeng <suijingfeng@loongson.cn> > >> > >> On a machine with multiple GPUs, a Linux user has no control over > >> which one > >> is primary at boot time. > > > > Question is why is that useful? Should we give users the ability to > > control that? > > > > I don't see an use case for this. > > > > On a specific machine with multiple GPUs mounted, only the > primary graphics get POST-ed (initialized) by the firmware. > Therefore the DRM drivers for the rest video cards have to > work without the prerequisite setups done by firmware, This > is called as POST. I think that should be regarded as a bug in the driver that should be fixed and this would not help with that case. If a driver can't initialize a device without aid from the pre-OS environment, that should be fixed in the driver. This solution also doesn't fix which device is selected as the primary by the pre-OS environment. That can only be fixed in the pre-OS environment code. > > One of the use cases is to test if a specific DRM driver > would works properly, under the circumstance of not being > POST-ed, The ast drm driver is the first one which refused > to work if not being POST-ed by the firmware. > > Before apply this series, I was unable make drm/ast as the > primary video card easily. The problem is that on a multiple > video card configuration, the monitor connected with my > AST2400 card not light up. While confusing, a naive programmer > may suspect the PRIME is not working. > > After applied this series and passing ast.modeset=10 on the > kernel cmd line, I found that the monitor connected with my > ast2400 video card still black, It doesn't display and It > doesn't show image to me. The problem with adding modeset=10 is that it only helps when you have one GPU driven by that driver in the system. If you have multiple GPUs driven by that driver, which one would that apply to? E.g., what if you have 2 AMD GPUs in the system. > > While in the process of study drm/ast, I know that drm/ast > driver has the POST code shipped, See the ast_post_gpu() function. > Then, I was wondering why this function doesn't works. > > After a short-time (hasty) debugging, I found that the ast_post_gpu() > function didn't get run. Because it have something to do with the > ast->config_mode. Without thinking too much, I hardcoded the > ast->config_mode as ast_use_p2a, the key point is to force the > ast_post_gpu() function to run. > > > ``` > > --- a/drivers/gpu/drm/ast/ast_main.c > +++ b/drivers/gpu/drm/ast/ast_main.c > @@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device > *ast) > } > } > > + ast->config_mode = ast_use_p2a; > + > switch (ast->config_mode) { > case ast_use_defaults: > drm_info(dev, "Using default configuration\n"); > > ``` > > Then, the monitor light up, it display the Ubuntu greeter to me. Therefore > my patch is useful, at least for the Linux drm driver tester and developer. > It allow programmers to test the specific part of a specific driver without > changing a line of the source code and without the need of sudo authority. > > It improves the efficiency of the testing and patch verification. I know > the PrimaryGPU option of Xorg conf, but this approach will remember the > setup have been made, you need modify it with root authority each time > you want to switch the primary. But on the process of rapid developing > and/or testing for multiple video drivers, with only one computer hardware > resource available. What we really want is a one-shot command, as provided > by this series. So, this is the first use case. > > > The second use case is that sometime the firmware is not reliable. > While there are thousands of ARM64, PowerPC and Mips servers machine, > Most of them don't have a good UEFI firmware support. I haven't test the > drm/amdgpu and drm/radeon at my ARM64 server yet. Because this ARM64 > server always use the platform(BMC) integrated display controller as primary. > The UEFI firmware of it does not provide options menu to tune. > So, for the first time, the discrete card because useless, despite more powerful. > I will take time to carry on the testing, so I will be able to tell more > in the future. > > > Even on X86, when select the PEG as primary on the UEFI BIOS menu. > There is no way to tell the bios which one of my three > discrete video be the primary. Not to mention some old UEFI > firmware, which doesn't provide a setting at all. > While the benefit of my approach is the flexibility. > Yes the i915, amdgpu and radeon are good quality, > but there may have programmers want to try nouveau. > > > The third use case is that VGAARB is also not reliable, It will > select a wrong device as primary. Especially on Arm64, Loongarch > and mips arch etc. And the X server will use this wrong device > as primary and completely crash there. Either because of lacking > a driver or the driver has a bug which can not bear the graphic > environment up. VGAARB is firmware dependent. > My patch provide a temporary method to rescue. > It sounds like we need a general purpose "primary" selector. I think it's sort of orthogonal to VGA. VGAARB is just for managing VGA routing on PCI. It's not really directly related to which GPU you want to be the primary when the OS loads. Maybe some new kernel parameter, E.g., primary_display=<string> where the string would be the specific device you wanted to be the primary, E.g., you could use the PCI BDF on PCI devices, primary_display=0000:0a:00.0 or some other device string for non-PCI devices. Alex
On 2023/9/7 00:00, Alex Deucher wrote: > On Tue, Sep 5, 2023 at 1:25 PM suijingfeng <suijingfeng@loongson.cn> wrote: >> Hi, >> >> >> On 2023/9/5 13:50, Christian König wrote: >>> Am 04.09.23 um 21:57 schrieb Sui Jingfeng: >>>> From: Sui Jingfeng <suijingfeng@loongson.cn> >>>> >>>> On a machine with multiple GPUs, a Linux user has no control over >>>> which one >>>> is primary at boot time. >>> Question is why is that useful? Should we give users the ability to >>> control that? >>> >>> I don't see an use case for this. >>> >> On a specific machine with multiple GPUs mounted, only the >> primary graphics get POST-ed (initialized) by the firmware. >> Therefore the DRM drivers for the rest video cards have to >> work without the prerequisite setups done by firmware, This >> is called as POST. > I think that should be regarded as a bug in the driver that should be > fixed and this would not help with that case. If a driver can't > initialize a device without aid from the pre-OS environment, that > should be fixed in the driver. This solution also doesn't fix which > device is selected as the primary by the pre-OS environment. That can > only be fixed in the pre-OS environment code. > >> One of the use cases is to test if a specific DRM driver >> would works properly, under the circumstance of not being >> POST-ed, The ast drm driver is the first one which refused >> to work if not being POST-ed by the firmware. >> >> Before apply this series, I was unable make drm/ast as the >> primary video card easily. The problem is that on a multiple >> video card configuration, the monitor connected with my >> AST2400 card not light up. While confusing, a naive programmer >> may suspect the PRIME is not working. >> >> After applied this series and passing ast.modeset=10 on the >> kernel cmd line, I found that the monitor connected with my >> ast2400 video card still black, It doesn't display and It >> doesn't show image to me. > The problem with adding modeset=10 is that it only helps when you have > one GPU driven by that driver in the system. If you have multiple > GPUs driven by that driver, which one would that apply to? E.g., what > if you have 2 AMD GPUs in the system. > >> While in the process of study drm/ast, I know that drm/ast >> driver has the POST code shipped, See the ast_post_gpu() function. >> Then, I was wondering why this function doesn't works. >> >> After a short-time (hasty) debugging, I found that the ast_post_gpu() >> function didn't get run. Because it have something to do with the >> ast->config_mode. Without thinking too much, I hardcoded the >> ast->config_mode as ast_use_p2a, the key point is to force the >> ast_post_gpu() function to run. >> >> >> ``` >> >> --- a/drivers/gpu/drm/ast/ast_main.c >> +++ b/drivers/gpu/drm/ast/ast_main.c >> @@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device >> *ast) >> } >> } >> >> + ast->config_mode = ast_use_p2a; >> + >> switch (ast->config_mode) { >> case ast_use_defaults: >> drm_info(dev, "Using default configuration\n"); >> >> ``` >> >> Then, the monitor light up, it display the Ubuntu greeter to me. Therefore >> my patch is useful, at least for the Linux drm driver tester and developer. >> It allow programmers to test the specific part of a specific driver without >> changing a line of the source code and without the need of sudo authority. >> >> It improves the efficiency of the testing and patch verification. I know >> the PrimaryGPU option of Xorg conf, but this approach will remember the >> setup have been made, you need modify it with root authority each time >> you want to switch the primary. But on the process of rapid developing >> and/or testing for multiple video drivers, with only one computer hardware >> resource available. What we really want is a one-shot command, as provided >> by this series. So, this is the first use case. >> >> >> The second use case is that sometime the firmware is not reliable. >> While there are thousands of ARM64, PowerPC and Mips servers machine, >> Most of them don't have a good UEFI firmware support. I haven't test the >> drm/amdgpu and drm/radeon at my ARM64 server yet. Because this ARM64 >> server always use the platform(BMC) integrated display controller as primary. >> The UEFI firmware of it does not provide options menu to tune. >> So, for the first time, the discrete card because useless, despite more powerful. >> I will take time to carry on the testing, so I will be able to tell more >> in the future. >> >> >> Even on X86, when select the PEG as primary on the UEFI BIOS menu. >> There is no way to tell the bios which one of my three >> discrete video be the primary. Not to mention some old UEFI >> firmware, which doesn't provide a setting at all. >> While the benefit of my approach is the flexibility. >> Yes the i915, amdgpu and radeon are good quality, >> but there may have programmers want to try nouveau. >> >> >> The third use case is that VGAARB is also not reliable, It will >> select a wrong device as primary. Especially on Arm64, Loongarch >> and mips arch etc. And the X server will use this wrong device >> as primary and completely crash there. Either because of lacking >> a driver or the driver has a bug which can not bear the graphic >> environment up. VGAARB is firmware dependent. >> My patch provide a temporary method to rescue. >> > It sounds like we need a general purpose "primary" selector. I think > it's sort of orthogonal to VGA. VGAARB is just for managing VGA > routing on PCI. It's not really directly related to which GPU you > want to be the primary when the OS loads. Maybe some new kernel > parameter, E.g., primary_display=<string> where the string would be > the specific device you wanted to be the primary, E.g., you could use > the PCI BDF on PCI devices, primary_display=0000:0a:00.0 or some other > device string for non-PCI devices. Indeed, thanks for you give the right direction. Bjorn and Willianson also told me something similar, but I didn't realized until today. historically, VGAARB is only for managing VGA compatible device. For legacy BIOS environment, the routing related to this need to handled. I didn't realized those technique point by the time this series is sending. Beside this, It not uncommon that a specific machine ship multiple identical GPUs models Or GPUs from the same company (likely a integrated one and a discrete one) My naive approach in v4 cannot solve complex problems like this. I will try to solve this problems and concerns at the next version. Thanks. > Alex
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 71f2ff39d6a1..b661cd3a8dc2 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1263,6 +1263,14 @@ static const struct vga_switcheroo_client_ops radeon_switcheroo_ops = { .can_switch = radeon_switcheroo_can_switch, }; +static bool radeon_want_to_be_primary(struct pci_dev *pdev) +{ + if (radeon_modeset == 10) + return true; + + return false; +} + /** * radeon_device_init - initialize the driver * @@ -1425,7 +1433,7 @@ int radeon_device_init(struct radeon_device *rdev, /* if we have > 1 VGA cards, then disable the radeon VGA resources */ /* this will fail for cards that aren't VGA class devices, just * ignore it */ - vga_client_register(rdev->pdev, radeon_vga_set_decode, NULL); + vga_client_register(rdev->pdev, radeon_vga_set_decode, radeon_want_to_be_primary); if (rdev->flags & RADEON_IS_PX) runtime = true;