Message ID | 20240213064650.45051-1-umesh.nerlige.ramappa@intel.com (mailing list archive) |
---|---|
Series | Fix crash due to open pmu events during unbind |
Resending to include patch 2/2. Please ignore this series.

On Mon, Feb 12, 2024 at 10:46:48PM -0800, Umesh Nerlige Ramappa wrote:
>Once a user opens an fd for a perf event, if the driver undergoes a
>function level reset (FLR), the resources are not cleaned up as
>expected. For this discussion FLR is defined as a PCI unbind followed by
>a bind. perf_pmu_unregister() would cleanup everything, but when the
>user closes the perf fd much later, perf_release() is called and we
>encounter null pointer dereferences and/or list corruption in that path
>which require a reboot to recover.
>
>The only approach that worked to resolve this was to close the file
>associated with the event such that the relevant cleanup happens w.r.t.
>the open file. To do so, use the event->owner task and find the file
>relevant to the event and close it. This relies on the
>file->private_data matching the event object.
>
>Test-with: 20240213062948.32735-1-umesh.nerlige.ramappa@intel.com
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>
>Umesh Nerlige Ramappa (2):
>  i915/pmu: Add pmu_teardown helper
>  INTEL_DII: i915/pmu: Cleanup pending events on unbind
>
> drivers/gpu/drm/i915/i915_pmu.c | 192 ++++++++++++++++++++++++--------
> drivers/gpu/drm/i915/i915_pmu.h |  15 +++
> 2 files changed, 161 insertions(+), 46 deletions(-)
>
>--
>2.34.1
>
Once a user opens an fd for a perf event, if the driver undergoes a function level reset (FLR), the resources are not cleaned up as expected. For this discussion, FLR is defined as a PCI unbind followed by a bind. perf_pmu_unregister() would clean up everything, but when the user closes the perf fd much later, perf_release() is called and we encounter null pointer dereferences and/or list corruption in that path, which require a reboot to recover.

The only approach that worked to resolve this was to close the file associated with the event, so that the relevant cleanup happens w.r.t. the open file. To do so, use the event->owner task to find the file relevant to the event and close it. This relies on the file->private_data matching the event object.

Test-with: 20240213062948.32735-1-umesh.nerlige.ramappa@intel.com
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

Umesh Nerlige Ramappa (2):
  i915/pmu: Add pmu_teardown helper
  INTEL_DII: i915/pmu: Cleanup pending events on unbind

 drivers/gpu/drm/i915/i915_pmu.c | 192 ++++++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_pmu.h |  15 +++
 2 files changed, 161 insertions(+), 46 deletions(-)
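
For readers who want the lookup spelled out, the following is a minimal userspace model of the idea described above: walk the owner task's open files, pick the one whose private_data points at the event, and close it so the release path runs while the backing state still exists. It is not the code from this series and not kernel code. Every name in it (model_event, model_file, model_task, find_event_fd, close_event_file, model_perf_release) is hypothetical, and the fixed-size fd table is an assumption made only to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FDS 8

/* Hypothetical stand-in for a perf event object. */
struct model_event {
    int id;
    int released; /* set to 1 once the release callback has run */
};

/* Hypothetical stand-in for an open file. */
struct model_file {
    void *private_data;                       /* for a perf fd: the event */
    void (*release)(struct model_file *file); /* called when the file is closed */
};

/* Hypothetical stand-in for the task that opened the event (event->owner). */
struct model_task {
    struct model_file *fd_table[MAX_FDS];
};

/* Plays the role that perf_release() plays in the description above. */
static void model_perf_release(struct model_file *file)
{
    struct model_event *event = file->private_data;

    event->released = 1;
    file->private_data = NULL;
}

/* Return the fd whose file->private_data matches the event, or -1. */
static int find_event_fd(const struct model_task *owner,
                         const struct model_event *event)
{
    assert(owner != NULL && event != NULL);

    for (int fd = 0; fd < MAX_FDS; fd++) {
        const struct model_file *file = owner->fd_table[fd];

        if (file != NULL && file->private_data == event)
            return fd;
    }
    return -1;
}

/* Close the file that backs the event; return the closed fd, or -1. */
static int close_event_file(struct model_task *owner,
                            struct model_event *event)
{
    int fd = find_event_fd(owner, event);

    if (fd < 0)
        return -1;

    struct model_file *file = owner->fd_table[fd];

    owner->fd_table[fd] = NULL; /* the slot is free once the file is closed */
    if (file->release != NULL)
        file->release(file);
    return fd;
}
```

In this model, a teardown path would call close_event_file() once per open event before the PMU state is freed, so a later close from userspace finds no file left to release. In the kernel, the real counterparts involve reference counting and locking that this sketch leaves out.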