Message ID | 20210628071546.167088-1-kjain@linux.ibm.com (mailing list archive) |
---|---|
State | New |
Series | [RFC] fpga: dfl: fme: Fix cpu hotplug code |
It's a good fix, you can drop the RFC in commit title. :)

The title could be more specific, like:

  fpga: dfl: fme: Fix cpu hotplug issue in performance reporting

So we know it is for performance reporting feature at first glance.

On Mon, Jun 28, 2021 at 12:45:46PM +0530, Kajol Jain wrote:
> Commit 724142f8c42a ("fpga: dfl: fme: add performance
> reporting support") added performance reporting support
> for FPGA management engine via perf.

May drop this section, it is indicated in the Fixes tag.

>
> It also added cpu hotplug feature but it didn't add

The performance reporting driver added cpu hotplug ...

> pmu migration call in cpu offline function.
> This can create an issue incase the current designated
> cpu being used to collect fme pmu data got offline,
> as based on current code we are not migrating fme pmu to
> new target cpu. Because of that perf will still try to
> fetch data from that offline cpu and hence we will not
> get counter data.
>
> Patch fixed this issue by adding pmu_migrate_context call
> in fme_perf_offline_cpu function.
>
> Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>

Tested-by: Xu Yilun <yilun.xu@intel.com>

Thanks,
Yilun

> ---
>  drivers/fpga/dfl-fme-perf.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> ---
> - This fix patch is not tested (as I don't have required environment).
>   But issue mentioned in the commit msg can be re-created, by starting any
>   fme_perf event and while its still running, offline current designated
>   cpu pointed by cpumask file. Since current code didn't migrating pmu,
>   perf gonna try getting counts from that offlined cpu and hence we will
>   not get event data.
> ---
> diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
> index 4299145ef347..b9a54583e505 100644
> --- a/drivers/fpga/dfl-fme-perf.c
> +++ b/drivers/fpga/dfl-fme-perf.c
> @@ -953,6 +953,10 @@ static int fme_perf_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  		return 0;
>
>  	priv->cpu = target;
> +
> +	/* Migrate fme_perf pmu events to the new target cpu */
> +	perf_pmu_migrate_context(&priv->pmu, cpu, target);
> +
>  	return 0;
>  }
>
> --
> 2.31.1
On 6/28/21 2:31 PM, Xu Yilun wrote:
> It's a good fix, you can drop the RFC in commit title. :)
>
> The title could be more specific, like:
>
>   fpga: dfl: fme: Fix cpu hotplug issue in performance reporting
>
> So we know it is for performance reporting feature at first glance.
>
> On Mon, Jun 28, 2021 at 12:45:46PM +0530, Kajol Jain wrote:
>
>> Commit 724142f8c42a ("fpga: dfl: fme: add performance
>> reporting support") added performance reporting support
>> for FPGA management engine via perf.
>
> May drop this section, it is indicated in the Fixes tag.
>

Hi Yilun,
    Thanks for testing the patch. I will make the mentioned changes and
send a new patch.

Thanks,
Kajol Jain

>>
>> It also added cpu hotplug feature but it didn't add
>
> The performance reporting driver added cpu hotplug ...
>
>> pmu migration call in cpu offline function.
>> This can create an issue incase the current designated
>> cpu being used to collect fme pmu data got offline,
>> as based on current code we are not migrating fme pmu to
>> new target cpu. Because of that perf will still try to
>> fetch data from that offline cpu and hence we will not
>> get counter data.
>>
>> Patch fixed this issue by adding pmu_migrate_context call
>> in fme_perf_offline_cpu function.
>>
>> Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>
> Tested-by: Xu Yilun <yilun.xu@intel.com>
>
> Thanks,
> Yilun
>
>> ---
>>  drivers/fpga/dfl-fme-perf.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> ---
>> - This fix patch is not tested (as I don't have required environment).
>>   But issue mentioned in the commit msg can be re-created, by starting any
>>   fme_perf event and while its still running, offline current designated
>>   cpu pointed by cpumask file. Since current code didn't migrating pmu,
>>   perf gonna try getting counts from that offlined cpu and hence we will
>>   not get event data.
>> ---
>> diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
>> index 4299145ef347..b9a54583e505 100644
>> --- a/drivers/fpga/dfl-fme-perf.c
>> +++ b/drivers/fpga/dfl-fme-perf.c
>> @@ -953,6 +953,10 @@ static int fme_perf_offline_cpu(unsigned int cpu, struct hlist_node *node)
>>  		return 0;
>>
>>  	priv->cpu = target;
>> +
>> +	/* Migrate fme_perf pmu events to the new target cpu */
>> +	perf_pmu_migrate_context(&priv->pmu, cpu, target);
>> +
>>  	return 0;
>>  }
>>
>> --
>> 2.31.1
On Mon, Jun 28, 2021 at 12:45:46PM +0530, Kajol Jain wrote:
> Commit 724142f8c42a ("fpga: dfl: fme: add performance
> reporting support") added performance reporting support
> for FPGA management engine via perf.
>
> It also added cpu hotplug feature but it didn't add
> pmu migration call in cpu offline function.
> This can create an issue incase the current designated
> cpu being used to collect fme pmu data got offline,
> as based on current code we are not migrating fme pmu to
> new target cpu. Because of that perf will still try to
> fetch data from that offline cpu and hence we will not
> get counter data.
>
> Patch fixed this issue by adding pmu_migrate_context call
> in fme_perf_offline_cpu function.
>
> Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>

You might want to Cc: stable@vger.kernel.org if it fixes an actual bug.

> ---
>  drivers/fpga/dfl-fme-perf.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> ---
> - This fix patch is not tested (as I don't have required environment).
>   But issue mentioned in the commit msg can be re-created, by starting any
>   fme_perf event and while its still running, offline current designated
>   cpu pointed by cpumask file. Since current code didn't migrating pmu,
>   perf gonna try getting counts from that offlined cpu and hence we will
>   not get event data.
> ---
> diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
> index 4299145ef347..b9a54583e505 100644
> --- a/drivers/fpga/dfl-fme-perf.c
> +++ b/drivers/fpga/dfl-fme-perf.c
> @@ -953,6 +953,10 @@ static int fme_perf_offline_cpu(unsigned int cpu, struct hlist_node *node)
>  		return 0;
>
>  	priv->cpu = target;
> +
> +	/* Migrate fme_perf pmu events to the new target cpu */
> +	perf_pmu_migrate_context(&priv->pmu, cpu, target);
> +
>  	return 0;
>  }
>
> --
> 2.31.1
>

- Moritz
On 6/29/21 12:10 AM, Moritz Fischer wrote:
> On Mon, Jun 28, 2021 at 12:45:46PM +0530, Kajol Jain wrote:
>> Commit 724142f8c42a ("fpga: dfl: fme: add performance
>> reporting support") added performance reporting support
>> for FPGA management engine via perf.
>>
>> It also added cpu hotplug feature but it didn't add
>> pmu migration call in cpu offline function.
>> This can create an issue incase the current designated
>> cpu being used to collect fme pmu data got offline,
>> as based on current code we are not migrating fme pmu to
>> new target cpu. Because of that perf will still try to
>> fetch data from that offline cpu and hence we will not
>> get counter data.
>>
>> Patch fixed this issue by adding pmu_migrate_context call
>> in fme_perf_offline_cpu function.
>>
>> Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
>> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
>
> You might want to Cc: stable@vger.kernel.org if it fixes an actual bug.

Hi Moritz,
    I already sent the patch out without the RFC tag yesterday.
Link to the patch: https://lkml.org/lkml/2021/6/28/275

I will cc stable@vger.kernel.org there as suggested by you.

Thanks,
Kajol Jain

>> ---
>>  drivers/fpga/dfl-fme-perf.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> ---
>> - This fix patch is not tested (as I don't have required environment).
>>   But issue mentioned in the commit msg can be re-created, by starting any
>>   fme_perf event and while its still running, offline current designated
>>   cpu pointed by cpumask file. Since current code didn't migrating pmu,
>>   perf gonna try getting counts from that offlined cpu and hence we will
>>   not get event data.
>> ---
>> diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
>> index 4299145ef347..b9a54583e505 100644
>> --- a/drivers/fpga/dfl-fme-perf.c
>> +++ b/drivers/fpga/dfl-fme-perf.c
>> @@ -953,6 +953,10 @@ static int fme_perf_offline_cpu(unsigned int cpu, struct hlist_node *node)
>>  		return 0;
>>
>>  	priv->cpu = target;
>> +
>> +	/* Migrate fme_perf pmu events to the new target cpu */
>> +	perf_pmu_migrate_context(&priv->pmu, cpu, target);
>> +
>>  	return 0;
>>  }
>>
>> --
>> 2.31.1
>>
> - Moritz
>
diff --git a/drivers/fpga/dfl-fme-perf.c b/drivers/fpga/dfl-fme-perf.c
index 4299145ef347..b9a54583e505 100644
--- a/drivers/fpga/dfl-fme-perf.c
+++ b/drivers/fpga/dfl-fme-perf.c
@@ -953,6 +953,10 @@ static int fme_perf_offline_cpu(unsigned int cpu, struct hlist_node *node)
 		return 0;

 	priv->cpu = target;
+
+	/* Migrate fme_perf pmu events to the new target cpu */
+	perf_pmu_migrate_context(&priv->pmu, cpu, target);
+
 	return 0;
 }
Commit 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
added performance reporting support for the FPGA management engine via perf.

It also added cpu hotplug support, but it did not add a pmu migration call
in the cpu offline function. This can create an issue in case the current
designated cpu being used to collect fme pmu data goes offline, because the
current code does not migrate the fme pmu to the new target cpu. As a result,
perf will still try to fetch data from the offline cpu and hence we will not
get counter data.

This patch fixes the issue by adding a perf_pmu_migrate_context call in the
fme_perf_offline_cpu function.

Fixes: 724142f8c42a ("fpga: dfl: fme: add performance reporting support")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---
 drivers/fpga/dfl-fme-perf.c | 4 ++++
 1 file changed, 4 insertions(+)

---
- This fix patch is not tested (as I don't have the required environment).
  But the issue mentioned in the commit msg can be re-created by starting any
  fme_perf event and, while it's still running, offlining the current
  designated cpu pointed to by the cpumask file. Since the current code does
  not migrate the pmu, perf will try to get counts from that offlined cpu and
  hence we will not get event data.
---
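
The failure mode described in the commit message can be illustrated with a small user-space model. This is only a sketch, not driver code: every name in it (struct model_pmu, model_offline_cpu, the migrate flag, the four-CPU limit) is hypothetical and exists only for illustration. model_migrate_context stands in for what perf_pmu_migrate_context is assumed to do here, namely rebind events from the outgoing cpu to the new target.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS    4   /* hypothetical machine size */
#define MODEL_MAX_EVENTS 8   /* hypothetical event limit */

/* Hypothetical stand-in for a perf event bound to one CPU. */
struct model_event {
	int cpu;
};

/* Hypothetical stand-in for the driver's private PMU state. */
struct model_pmu {
	int cpu;                              /* designated CPU (what cpumask would report) */
	int nr_events;
	struct model_event events[MODEL_MAX_EVENTS];
	bool online[MODEL_NR_CPUS];
};

/* Start with all CPUs online and nr_events events bound to 'cpu'. */
void model_init(struct model_pmu *pmu, int cpu, int nr_events)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	assert(nr_events >= 0 && nr_events <= MODEL_MAX_EVENTS);

	pmu->cpu = cpu;
	pmu->nr_events = nr_events;
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		pmu->online[i] = true;
	for (int i = 0; i < nr_events; i++)
		pmu->events[i].cpu = cpu;
}

/* Return any online CPU other than 'cpu', or -1 if there is none. */
int model_any_online_but(const struct model_pmu *pmu, int cpu)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		if (i != cpu && pmu->online[i])
			return i;
	return -1;
}

/* Model of event migration: rebind every event on 'src' to 'dst'. */
void model_migrate_context(struct model_pmu *pmu, int src, int dst)
{
	for (int i = 0; i < pmu->nr_events; i++)
		if (pmu->events[i].cpu == src)
			pmu->events[i].cpu = dst;
}

/*
 * Model of the offline callback. With migrate == false only the
 * designated CPU is updated (the behaviour the patch describes as
 * broken); with migrate == true the events follow it.
 */
int model_offline_cpu(struct model_pmu *pmu, int cpu, bool migrate)
{
	int target;

	pmu->online[cpu] = false;

	if (cpu != pmu->cpu)
		return 0;

	target = model_any_online_but(pmu, cpu);
	if (target < 0)
		return 0;

	pmu->cpu = target;
	if (migrate)
		model_migrate_context(pmu, cpu, target);

	return 0;
}

/* Count events whose bound CPU is still online, i.e. still readable. */
int model_readable_events(const struct model_pmu *pmu)
{
	int n = 0;

	for (int i = 0; i < pmu->nr_events; i++)
		if (pmu->online[pmu->events[i].cpu])
			n++;
	return n;
}
```

In this model, offlining the designated cpu with migrate set to false moves pmu->cpu to a new target but leaves every event bound to the offline cpu, so model_readable_events drops to zero. With migrate set to true the events move together with the designated cpu and remain readable.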