Message ID | 20180211194154.GB22869@wunner.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi I've not been able to reproduce the original problem you're trying to solve on amdgpu thats with or without your patch set and the above "trigger" too Is anything else required to trigger it, I started multiple DRI_PRIME glxgears, in parallel, serial waiting the 12 seconds and serial within the 12 seconds and I couldn't reproduce it Regards Mike
On Mon, Feb 12, 2018 at 12:35:51AM +0000, Mike Lothian wrote: > I've not been able to reproduce the original problem you're trying to > solve on amdgpu thats with or without your patch set and the above > "trigger" too > > Is anything else required to trigger it, I started multiple DRI_PRIME > glxgears, in parallel, serial waiting the 12 seconds and serial within > the 12 seconds and I couldn't reproduce it The discrete GPU needs to runtime suspend, that's the trigger, so no DRI_PRIME executables should be running. Just let it autosuspend after boot. Do you see "waiting 12 sec" messages in dmesg? If not it's not autosuspending. Thanks, Lukas
On 12 February 2018 at 03:39, Lukas Wunner <lukas@wunner.de> wrote: > On Mon, Feb 12, 2018 at 12:35:51AM +0000, Mike Lothian wrote: >> I've not been able to reproduce the original problem you're trying to >> solve on amdgpu thats with or without your patch set and the above >> "trigger" too >> >> Is anything else required to trigger it, I started multiple DRI_PRIME >> glxgears, in parallel, serial waiting the 12 seconds and serial within >> the 12 seconds and I couldn't reproduce it > > The discrete GPU needs to runtime suspend, that's the trigger, > so no DRI_PRIME executables should be running. Just let it > autosuspend after boot. Do you see "waiting 12 sec" messages > in dmesg? If not it's not autosuspending. > > Thanks, > > Lukas Hi Yes I'm seeing those messages, I'm just not seeing the hangs I've attached the dmesg in case you're interested Regards Mike
On Mon, Feb 12, 2018 at 09:03:26AM +0000, Mike Lothian wrote: > On 12 February 2018 at 03:39, Lukas Wunner <lukas@wunner.de> wrote: > > On Mon, Feb 12, 2018 at 12:35:51AM +0000, Mike Lothian wrote: > > > I've not been able to reproduce the original problem you're trying to > > > solve on amdgpu thats with or without your patch set and the above > > > "trigger" too > > > > > > Is anything else required to trigger it, I started multiple DRI_PRIME > > > glxgears, in parallel, serial waiting the 12 seconds and serial within > > > the 12 seconds and I couldn't reproduce it > > > > The discrete GPU needs to runtime suspend, that's the trigger, > > so no DRI_PRIME executables should be running. Just let it > > autosuspend after boot. Do you see "waiting 12 sec" messages > > in dmesg? If not it's not autosuspending. > > Yes I'm seeing those messages, I'm just not seeing the hangs > > I've attached the dmesg in case you're interested Okay the reason you're not seeing deadlocks is because the output poll worker is not enabled. And the output poll worker is not enabled because your discrete GPU doesn't have any outputs: [ 0.265568] [drm:dc_create] *ERROR* DC: Number of connectors is zero! The outputs are only polled if there are connectors which have the DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT flag set. And that only ever seems to be the case for VGA and DVI. We know based on bugzilla reports that hybrid graphics laptops do exist which poll outputs with radeon and nouveau. If there are no laptops supported by amdgpu whose discrete GPU has polled connectors, then patch [5/5] would be unnecessary. That is for Alex to decide. However that is very good to know, so thanks a lot for your testing efforts, much appreciated! Kind regards, Lukas
On Mon, Feb 12, 2018 at 4:45 AM, Lukas Wunner <lukas@wunner.de> wrote: > On Mon, Feb 12, 2018 at 09:03:26AM +0000, Mike Lothian wrote: >> On 12 February 2018 at 03:39, Lukas Wunner <lukas@wunner.de> wrote: >> > On Mon, Feb 12, 2018 at 12:35:51AM +0000, Mike Lothian wrote: >> > > I've not been able to reproduce the original problem you're trying to >> > > solve on amdgpu thats with or without your patch set and the above >> > > "trigger" too >> > > >> > > Is anything else required to trigger it, I started multiple DRI_PRIME >> > > glxgears, in parallel, serial waiting the 12 seconds and serial within >> > > the 12 seconds and I couldn't reproduce it >> > >> > The discrete GPU needs to runtime suspend, that's the trigger, >> > so no DRI_PRIME executables should be running. Just let it >> > autosuspend after boot. Do you see "waiting 12 sec" messages >> > in dmesg? If not it's not autosuspending. >> >> Yes I'm seeing those messages, I'm just not seeing the hangs >> >> I've attached the dmesg in case you're interested > > Okay the reason you're not seeing deadlocks is because the output poll > worker is not enabled. And the output poll worker is not enabled > because your discrete GPU doesn't have any outputs: > > [ 0.265568] [drm:dc_create] *ERROR* DC: Number of connectors is zero! > > The outputs are only polled if there are connectors which have the > DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT flag set. > And that only ever seems to be the case for VGA and DVI. > > We know based on bugzilla reports that hybrid graphics laptops do exist > which poll outputs with radeon and nouveau. If there are no laptops > supported by amdgpu whose discrete GPU has polled connectors, then > patch [5/5] would be unnecessary. That is for Alex to decide. Most hybrid laptops don't have display connectors on the dGPU and we only use polling on analog connectors, so you are not likely to run into this on recent laptops. That said, I don't know if there is some OEM system out there with a VGA port on the dGPU in a hybrid laptop. I guess another option is to just disable polling on hybrid laptops. Alex > > However that is very good to know, so thanks a lot for your testing > efforts, much appreciated! > > Kind regards, > > Lukas > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Mon, Feb 12, 2018 at 01:58:32PM -0500, Alex Deucher wrote: > On Mon, Feb 12, 2018 at 4:45 AM, Lukas Wunner <lukas@wunner.de> wrote: > > On Mon, Feb 12, 2018 at 09:03:26AM +0000, Mike Lothian wrote: > >> On 12 February 2018 at 03:39, Lukas Wunner <lukas@wunner.de> wrote: > >> > On Mon, Feb 12, 2018 at 12:35:51AM +0000, Mike Lothian wrote: > >> > > I've not been able to reproduce the original problem you're trying to > >> > > solve on amdgpu thats with or without your patch set and the above > >> > > "trigger" too > > > > Okay the reason you're not seeing deadlocks is because the output poll > > worker is not enabled. And the output poll worker is not enabled > > because your discrete GPU doesn't have any outputs: > > > > [ 0.265568] [drm:dc_create] *ERROR* DC: Number of connectors is zero! > > > > The outputs are only polled if there are connectors which have the > > DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT flag set. > > And that only ever seems to be the case for VGA and DVI. > > > > We know based on bugzilla reports that hybrid graphics laptops do exist > > which poll outputs with radeon and nouveau. If there are no laptops > > supported by amdgpu whose discrete GPU has polled connectors, then > > patch [5/5] would be unnecessary. That is for Alex to decide. > > Most hybrid laptops don't have display connectors on the dGPU and we > only use polling on analog connectors, so you are not likely to run > into this on recent laptops. That said, I don't know if there is some > OEM system out there with a VGA port on the dGPU in a hybrid laptop. > I guess another option is to just disable polling on hybrid laptops. If we don't know for sure, applying patch [5/5] would seem to be the safest approach. (Assuming it doesn't break anything else.) Right now runtime PM is only used on hybrid graphics dGPUs by nouveau, radeon and amdgpu. Would it be conceivable that its use is expanded beyond that in the future? E.g. on a desktop machine, if DPMS is off on all screens, why keep the GPU in D0? If that is conceivable, chances that analog connectors are present are higher, and then the patch would be necessary again. (Of course this would mean that analog screens wouldn't light up automatically if they're attached while the GPU is in D3hot, but the user may forbid runtime PM via sysfs if that is unwanted.) Thanks, Lukas
On Tue, Feb 13, 2018 at 3:17 AM, Lukas Wunner <lukas@wunner.de> wrote: > On Mon, Feb 12, 2018 at 01:58:32PM -0500, Alex Deucher wrote: >> On Mon, Feb 12, 2018 at 4:45 AM, Lukas Wunner <lukas@wunner.de> wrote: >> > On Mon, Feb 12, 2018 at 09:03:26AM +0000, Mike Lothian wrote: >> >> On 12 February 2018 at 03:39, Lukas Wunner <lukas@wunner.de> wrote: >> >> > On Mon, Feb 12, 2018 at 12:35:51AM +0000, Mike Lothian wrote: >> >> > > I've not been able to reproduce the original problem you're trying to >> >> > > solve on amdgpu thats with or without your patch set and the above >> >> > > "trigger" too >> > >> > Okay the reason you're not seeing deadlocks is because the output poll >> > worker is not enabled. And the output poll worker is not enabled >> > because your discrete GPU doesn't have any outputs: >> > >> > [ 0.265568] [drm:dc_create] *ERROR* DC: Number of connectors is zero! >> > >> > The outputs are only polled if there are connectors which have the >> > DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT flag set. >> > And that only ever seems to be the case for VGA and DVI. >> > >> > We know based on bugzilla reports that hybrid graphics laptops do exist >> > which poll outputs with radeon and nouveau. If there are no laptops >> > supported by amdgpu whose discrete GPU has polled connectors, then >> > patch [5/5] would be unnecessary. That is for Alex to decide. >> >> Most hybrid laptops don't have display connectors on the dGPU and we >> only use polling on analog connectors, so you are not likely to run >> into this on recent laptops. That said, I don't know if there is some >> OEM system out there with a VGA port on the dGPU in a hybrid laptop. >> I guess another option is to just disable polling on hybrid laptops. > > If we don't know for sure, applying patch [5/5] would seem to be the > safest approach. (Assuming it doesn't break anything else.) I don't have any objections. I see no reason to leave out the amdgpu changes. Alex
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 50afcf6..beaaf2c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -36,6 +36,7 @@ #include <drm/drm_pciids.h> #include <linux/console.h> +#include <linux/delay.h> #include <linux/module.h> #include <linux/pm_runtime.h> #include <linux/vga_switcheroo.h> @@ -718,6 +719,9 @@ static int amdgpu_pmops_runtime_suspend(struct device *dev) return -EBUSY; } + printk("waiting 12 sec\n"); + msleep(12*1000); + printk("done waiting 12 sec\n"); drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING; drm_kms_helper_poll_disable(drm_dev); vga_switcheroo_set_dynamic_switch(pdev, VGA_SWITCHEROO_OFF); diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c index 555fbe5..ee7cf0d 100644 --- a/drivers/gpu/drm/drm_probe_helper.c +++ b/drivers/gpu/drm/drm_probe_helper.c @@ -586,6 +586,7 @@ static void output_poll_execute(struct work_struct *work) repoll = true; goto out; } + dev_info(&dev->pdev->dev, "begin poll\n"); drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) { @@ -651,6 +652,7 @@ static void output_poll_execute(struct work_struct *work) if (repoll) schedule_delayed_work(delayed_work, DRM_OUTPUT_POLL_PERIOD); + dev_info(&dev->pdev->dev, "end poll\n"); } /** diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index 3e29302..f9da5bc 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -855,6 +855,9 @@ static int nouveau_drm_probe(struct pci_dev *pdev, return -EBUSY; } + printk("waiting 12 sec\n"); + msleep(12*1000); + printk("done waiting 12 sec\n"); drm_kms_helper_poll_disable(drm_dev); vga_switcheroo_set_dynamic_switch(pdev, VGA_SWITCHEROO_OFF); nouveau_switcheroo_optimus_dsm(); diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 31dd04f..2b4e7e0 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -35,6 +35,7 @@ #include <drm/drm_pciids.h> #include <linux/console.h> +#include <linux/delay.h> #include <linux/module.h> #include <linux/pm_runtime.h> #include <linux/vga_switcheroo.h> @@ -413,6 +414,9 @@ static int radeon_pmops_runtime_suspend(struct device *dev) return -EBUSY; } + printk("waiting 12 sec\n"); + msleep(12*1000); + printk("done waiting 12 sec\n"); drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING; drm_kms_helper_poll_disable(drm_dev); vga_switcheroo_set_dynamic_switch(pdev, VGA_SWITCHEROO_OFF);