Message ID | 1594891165-8228-1-git-send-email-liuqi115@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling | expand |
On Thu, Jul 16, 2020 at 05:19:25PM +0800, Qi Liu wrote: > Kernel panic will also happen when users try to unbind PMU drivers with > device. This unbind issue could be solved by another patch latter. > > drivers/perf/arm_smmuv3_pmu.c | 1 + > drivers/perf/fsl_imx8_ddr_perf.c | 1 + > drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 1 + > drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 1 + > drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 1 + > 5 files changed, 5 insertions(+) > > diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c > index 48e28ef..90caba56 100644 > --- a/drivers/perf/arm_smmuv3_pmu.c > +++ b/drivers/perf/arm_smmuv3_pmu.c > @@ -742,6 +742,7 @@ static int smmu_pmu_probe(struct platform_device *pdev) > platform_set_drvdata(pdev, smmu_pmu); > > smmu_pmu->pmu = (struct pmu) { > + .module = THIS_MODULE, I thought platform_driver_register() did this automatically? Will
On 16/07/2020 10:41, Will Deacon wrote: > On Thu, Jul 16, 2020 at 05:19:25PM +0800, Qi Liu wrote: >> Kernel panic will also happen when users try to unbind PMU drivers with >> device. This unbind issue could be solved by another patch latter. >> >> drivers/perf/arm_smmuv3_pmu.c | 1 + >> drivers/perf/fsl_imx8_ddr_perf.c | 1 + >> drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 1 + >> drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 1 + >> drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 1 + >> 5 files changed, 5 insertions(+) >> >> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c >> index 48e28ef..90caba56 100644 >> --- a/drivers/perf/arm_smmuv3_pmu.c >> +++ b/drivers/perf/arm_smmuv3_pmu.c >> @@ -742,6 +742,7 @@ static int smmu_pmu_probe(struct platform_device *pdev) >> platform_set_drvdata(pdev, smmu_pmu); >> >> smmu_pmu->pmu = (struct pmu) { >> + .module = THIS_MODULE, > > I thought platform_driver_register() did this automatically? > Isn't that something different? The perf framework knows nothing of the platform_device/driver really, and just knows the event_source device which it creates. And so we also need to tell the perf framework about the module backing this pmu. I think some relevant code is perf_try_init_event() -> try_module_get(pmu->module). Thanks, John
On 2020-07-16 10:41, Will Deacon wrote: > On Thu, Jul 16, 2020 at 05:19:25PM +0800, Qi Liu wrote: >> Kernel panic will also happen when users try to unbind PMU drivers with >> device. This unbind issue could be solved by another patch latter. >> >> drivers/perf/arm_smmuv3_pmu.c | 1 + >> drivers/perf/fsl_imx8_ddr_perf.c | 1 + >> drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 1 + >> drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 1 + >> drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 1 + >> 5 files changed, 5 insertions(+) >> >> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c >> index 48e28ef..90caba56 100644 >> --- a/drivers/perf/arm_smmuv3_pmu.c >> +++ b/drivers/perf/arm_smmuv3_pmu.c >> @@ -742,6 +742,7 @@ static int smmu_pmu_probe(struct platform_device *pdev) >> platform_set_drvdata(pdev, smmu_pmu); >> >> smmu_pmu->pmu = (struct pmu) { >> + .module = THIS_MODULE, > > I thought platform_driver_register() did this automatically? For the platform device itself, yes, but this is for the PMU device - perf needs to take a reference to the module, otherwise the platform device can still be pulled out from under its feet. I can't remember if we ever discussed making perf_pmu_register() do the same trick as platform_device_register() and friends, but obviously it's a possibility. Robin.
On Thu, Jul 16, 2020 at 11:26:25AM +0100, Robin Murphy wrote: > On 2020-07-16 10:41, Will Deacon wrote: > > On Thu, Jul 16, 2020 at 05:19:25PM +0800, Qi Liu wrote: > > > Kernel panic will also happen when users try to unbind PMU drivers with > > > device. This unbind issue could be solved by another patch latter. > > > > > > drivers/perf/arm_smmuv3_pmu.c | 1 + > > > drivers/perf/fsl_imx8_ddr_perf.c | 1 + > > > drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 1 + > > > drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 1 + > > > drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 1 + > > > 5 files changed, 5 insertions(+) > > > > > > diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c > > > index 48e28ef..90caba56 100644 > > > --- a/drivers/perf/arm_smmuv3_pmu.c > > > +++ b/drivers/perf/arm_smmuv3_pmu.c > > > @@ -742,6 +742,7 @@ static int smmu_pmu_probe(struct platform_device *pdev) > > > platform_set_drvdata(pdev, smmu_pmu); > > > > > > smmu_pmu->pmu = (struct pmu) { > > > + .module = THIS_MODULE, > > > > I thought platform_driver_register() did this automatically? > > For the platform device itself, yes, but this is for the PMU device - perf > needs to take a reference to the module, otherwise the platform device can > still be pulled out from under its feet. Urgh, gross. > I can't remember if we ever discussed making perf_pmu_register() do the same > trick as platform_device_register() and friends, but obviously it's a > possibility. Yeah, but I suppose this patch is the right thing to do for now. I'll queue it as a fix. Will
On 16/07/2020 11:30, Will Deacon wrote: > On Thu, Jul 16, 2020 at 11:26:25AM +0100, Robin Murphy wrote: >> On 2020-07-16 10:41, Will Deacon wrote: >>> On Thu, Jul 16, 2020 at 05:19:25PM +0800, Qi Liu wrote: >>>> Kernel panic will also happen when users try to unbind PMU drivers with >>>> device. This unbind issue could be solved by another patch latter. >>>> >>>> drivers/perf/arm_smmuv3_pmu.c | 1 + >>>> drivers/perf/fsl_imx8_ddr_perf.c | 1 + >>>> drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 1 + >>>> drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 1 + >>>> drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 1 + >>>> 5 files changed, 5 insertions(+) >>>> >>>> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c >>>> index 48e28ef..90caba56 100644 >>>> --- a/drivers/perf/arm_smmuv3_pmu.c >>>> +++ b/drivers/perf/arm_smmuv3_pmu.c >>>> @@ -742,6 +742,7 @@ static int smmu_pmu_probe(struct platform_device *pdev) >>>> platform_set_drvdata(pdev, smmu_pmu); >>>> >>>> smmu_pmu->pmu = (struct pmu) { >>>> + .module = THIS_MODULE, >>> I thought platform_driver_register() did this automatically? >> For the platform device itself, yes, but this is for the PMU device - perf >> needs to take a reference to the module, otherwise the platform device can >> still be pulled out from under its feet. > Urgh, gross. > >> I can't remember if we ever discussed making perf_pmu_register() do the same >> trick as platform_device_register() and friends, but obviously it's a >> possibility. > Yeah, but I suppose this patch is the right thing to do for now. I'll queue > it as a fix. > Please also note what Qi Liu wrote about being able to unbind the driver and cause the same issue. I don't know if you can about that issue also.
On Thu, Jul 16, 2020 at 11:36:10AM +0100, John Garry wrote: > On 16/07/2020 11:30, Will Deacon wrote: > > On Thu, Jul 16, 2020 at 11:26:25AM +0100, Robin Murphy wrote: > > > On 2020-07-16 10:41, Will Deacon wrote: > > > > On Thu, Jul 16, 2020 at 05:19:25PM +0800, Qi Liu wrote: > > > > > Kernel panic will also happen when users try to unbind PMU drivers with > > > > > device. This unbind issue could be solved by another patch latter. > > > > > > > > > > drivers/perf/arm_smmuv3_pmu.c | 1 + > > > > > drivers/perf/fsl_imx8_ddr_perf.c | 1 + > > > > > drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c | 1 + > > > > > drivers/perf/hisilicon/hisi_uncore_hha_pmu.c | 1 + > > > > > drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c | 1 + > > > > > 5 files changed, 5 insertions(+) > > > > > > > > > > diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c > > > > > index 48e28ef..90caba56 100644 > > > > > --- a/drivers/perf/arm_smmuv3_pmu.c > > > > > +++ b/drivers/perf/arm_smmuv3_pmu.c > > > > > @@ -742,6 +742,7 @@ static int smmu_pmu_probe(struct platform_device *pdev) > > > > > platform_set_drvdata(pdev, smmu_pmu); > > > > > > > > > > smmu_pmu->pmu = (struct pmu) { > > > > > + .module = THIS_MODULE, > > > > I thought platform_driver_register() did this automatically? > > > For the platform device itself, yes, but this is for the PMU device - perf > > > needs to take a reference to the module, otherwise the platform device can > > > still be pulled out from under its feet. > > Urgh, gross. > > > > > I can't remember if we ever discussed making perf_pmu_register() do the same > > > trick as platform_device_register() and friends, but obviously it's a > > > possibility. > > Yeah, but I suppose this patch is the right thing to do for now. I'll queue > > it as a fix. > > > > Please also note what Qi Liu wrote about being able to unbind the driver and > cause the same issue. I don't know if you can about that issue also. I guess we have to throw in some '.suppress_bind_attrs = true,' lines as well, then? I'll queue this as-is, but happy to take a follow-up if somebody can test it. Will
On 16/07/2020 11:44, Will Deacon wrote: >> Please also note what Qi Liu wrote about being able to unbind the driver and >> cause the same issue. I don't know if you can about that issue also. > I guess we have to throw in some '.suppress_bind_attrs = true,' lines as > well, then? Yeah, I think it would be a follow up using suppress_bind_attrs. TBH, I don't know why this is an opt out feature, than an opt in. Probably a good reason... I'll queue this as-is, but happy to take a follow-up if somebody > can test it. > Thanks, john
On Thu, 16 Jul 2020 17:19:25 +0800, Qi Liu wrote: > When users try to remove PMU modules during perf sampling, kernel panic > will happen because the pmu->read() is a NULL pointer here. > > INFO on HiSilicon hip08 platform as follow: > pc : hisi_uncore_pmu_event_update+0x30/0xa4 [hisi_uncore_pmu] > lr : hisi_uncore_pmu_read+0x20/0x2c [hisi_uncore_pmu] > sp : ffff800010103e90 > x29: ffff800010103e90 x28: ffff0027db0c0e40 > x27: ffffa29a76f129d8 x26: ffffa29a77ceb000 > x25: ffffa29a773a5000 x24: ffffa29a77392000 > x23: ffffddffe5943f08 x22: ffff002784285960 > x21: ffff002784285800 x20: ffff0027d2e76c80 > x19: ffff0027842859e0 x18: ffff80003498bcc8 > x17: ffffa29a76afe910 x16: ffffa29a7583f530 > x15: 16151a1512061a1e x14: 0000000000000000 > x13: ffffa29a76f1e238 x12: 0000000000000001 > x11: 0000000000000400 x10: 00000000000009f0 > x9 : ffff8000107b3e70 x8 : ffff0027db0c1890 > x7 : ffffa29a773a7000 x6 : 00000007f5131013 > x5 : 00000007f5131013 x4 : 09f257d417c00000 > x3 : 00000002187bd7ce x2 : ffffa29a38f0f0d8 > x1 : ffffa29a38eae268 x0 : ffff0027d2e76c80 > Call trace: > hisi_uncore_pmu_event_update+0x30/0xa4 [hisi_uncore_pmu] > hisi_uncore_pmu_read+0x20/0x2c [hisi_uncore_pmu] > __perf_event_read+0x1a0/0x1f8 > flush_smp_call_function_queue+0xa0/0x160 > generic_smp_call_function_single_interrupt+0x18/0x20 > handle_IPI+0x31c/0x4dc > gic_handle_irq+0x2c8/0x310 > el1_irq+0xcc/0x180 > arch_cpu_idle+0x4c/0x20c > default_idle_call+0x20/0x30 > do_idle+0x1b4/0x270 > cpu_startup_entry+0x28/0x30 > secondary_start_kernel+0x1a4/0x1fc > > [...] Applied to arm64 (for-next/fixes), thanks! [1/1] drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling https://git.kernel.org/arm64/c/bdc5c744c7b6 Cheers,
diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c index 48e28ef..90caba56 100644 --- a/drivers/perf/arm_smmuv3_pmu.c +++ b/drivers/perf/arm_smmuv3_pmu.c @@ -742,6 +742,7 @@ static int smmu_pmu_probe(struct platform_device *pdev) platform_set_drvdata(pdev, smmu_pmu); smmu_pmu->pmu = (struct pmu) { + .module = THIS_MODULE, .task_ctx_nr = perf_invalid_context, .pmu_enable = smmu_pmu_enable, .pmu_disable = smmu_pmu_disable, diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c index 90884d1..2aed2d9 100644 --- a/drivers/perf/fsl_imx8_ddr_perf.c +++ b/drivers/perf/fsl_imx8_ddr_perf.c @@ -512,6 +512,7 @@ static int ddr_perf_init(struct ddr_pmu *pmu, void __iomem *base, { *pmu = (struct ddr_pmu) { .pmu = (struct pmu) { + .module = THIS_MODULE, .capabilities = PERF_PMU_CAP_NO_EXCLUDE, .task_ctx_nr = perf_invalid_context, .attr_groups = attr_groups, diff --git a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c index 15713fa..71587f1 100644 --- a/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c +++ b/drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c @@ -378,6 +378,7 @@ static int hisi_ddrc_pmu_probe(struct platform_device *pdev) ddrc_pmu->sccl_id, ddrc_pmu->index_id); ddrc_pmu->pmu = (struct pmu) { .name = name, + .module = THIS_MODULE, .task_ctx_nr = perf_invalid_context, .event_init = hisi_uncore_pmu_event_init, .pmu_enable = hisi_uncore_pmu_enable, diff --git a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c index dcc5600..c199de7 100644 --- a/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c +++ b/drivers/perf/hisilicon/hisi_uncore_hha_pmu.c @@ -390,6 +390,7 @@ static int hisi_hha_pmu_probe(struct platform_device *pdev) hha_pmu->sccl_id, hha_pmu->index_id); hha_pmu->pmu = (struct pmu) { .name = name, + .module = THIS_MODULE, .task_ctx_nr = perf_invalid_context, .event_init = hisi_uncore_pmu_event_init, .pmu_enable = hisi_uncore_pmu_enable, diff --git a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c index 7719ae4..567d7e6 100644 --- a/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c +++ b/drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c @@ -380,6 +380,7 @@ static int hisi_l3c_pmu_probe(struct platform_device *pdev) l3c_pmu->sccl_id, l3c_pmu->index_id); l3c_pmu->pmu = (struct pmu) { .name = name, + .module = THIS_MODULE, .task_ctx_nr = perf_invalid_context, .event_init = hisi_uncore_pmu_event_init, .pmu_enable = hisi_uncore_pmu_enable,