Message ID | nycvar.YFH.7.76.2106241319430.18969@cbobk.fhfr.pm (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/amdgpu: Fix resource leak on probe error path | expand |
On Thu, 24 Jun 2021, Jiri Kosina wrote: > From: Jiri Kosina <jkosina@suse.cz> > > This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980. > > It is not true (as stated in the reverted commit changelog) that we never > unmap the BAR on failure; it actually does happen properly on > amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> > amdgpu_device_fini() error path. > > What's worse, this commit actually completely breaks resource freeing on > probe failure (like e.g. failure to load microcode), as > amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too > early, leaving all the resources that'd normally be freed in > amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading > to all sorts of oopses when someone tries to, for example, access the > sysfs and procfs resources which are still around while the driver is > gone. > > Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure") > Reported-by: Vojtech Pavlik <vojtech@ucw.cz> > Signed-off-by: Jiri Kosina <jkosina@suse.cz> Friendly ping on this one (as it's easily triggerable in the wild by just missing proper firwmare). Thanks. > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------ > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 57ec108b5972..0f1c0e17a587 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev, > r = amdgpu_device_get_job_timeout_settings(adev); > if (r) { > dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n"); > - goto failed_unmap; > + return r; > } > > /* early init functions */ > r = amdgpu_device_ip_early_init(adev); > if (r) > - goto failed_unmap; > + return r; > > /* doorbell bar mapping and doorbell index init*/ > amdgpu_device_doorbell_init(adev); > @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, > failed: > amdgpu_vf_error_trans_all(adev); > > -failed_unmap: > - iounmap(adev->rmmio); > - adev->rmmio = NULL; > - > return r; > } > > -- > 2.12.3 > >
Applied. Thanks! Alex On Thu, Jul 1, 2021 at 4:32 AM Jiri Kosina <jikos@kernel.org> wrote: > > On Thu, 24 Jun 2021, Jiri Kosina wrote: > > > From: Jiri Kosina <jkosina@suse.cz> > > > > This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980. > > > > It is not true (as stated in the reverted commit changelog) that we never > > unmap the BAR on failure; it actually does happen properly on > > amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> > > amdgpu_device_fini() error path. > > > > What's worse, this commit actually completely breaks resource freeing on > > probe failure (like e.g. failure to load microcode), as > > amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too > > early, leaving all the resources that'd normally be freed in > > amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading > > to all sorts of oopses when someone tries to, for example, access the > > sysfs and procfs resources which are still around while the driver is > > gone. > > > > Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure") > > Reported-by: Vojtech Pavlik <vojtech@ucw.cz> > > Signed-off-by: Jiri Kosina <jkosina@suse.cz> > > Friendly ping on this one (as it's easily triggerable in the wild by just > missing proper firwmare). > > Thanks. > > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------ > > 1 file changed, 2 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > index 57ec108b5972..0f1c0e17a587 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev, > > r = amdgpu_device_get_job_timeout_settings(adev); > > if (r) { > > dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n"); > > - goto failed_unmap; > > + return r; > > } > > > > /* early init functions */ > > r = amdgpu_device_ip_early_init(adev); > > if (r) > > - goto failed_unmap; > > + return r; > > > > /* doorbell bar mapping and doorbell index init*/ > > amdgpu_device_doorbell_init(adev); > > @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, > > failed: > > amdgpu_vf_error_trans_all(adev); > > > > -failed_unmap: > > - iounmap(adev->rmmio); > > - adev->rmmio = NULL; > > - > > return r; > > } > > > > -- > > 2.12.3 > > > > > > -- > Jiri Kosina > SUSE Labs >
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 57ec108b5972..0f1c0e17a587 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev, r = amdgpu_device_get_job_timeout_settings(adev); if (r) { dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n"); - goto failed_unmap; + return r; } /* early init functions */ r = amdgpu_device_ip_early_init(adev); if (r) - goto failed_unmap; + return r; /* doorbell bar mapping and doorbell index init*/ amdgpu_device_doorbell_init(adev); @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, failed: amdgpu_vf_error_trans_all(adev); -failed_unmap: - iounmap(adev->rmmio); - adev->rmmio = NULL; - return r; }