
[2/3] perf: arm_pmu: Only show online CPUs in device's "cpus" attribute

Message ID 20240603092812.46616-3-yangyicong@huawei.com (mailing list archive)
State New, archived
Series Perf avoid opening events on offline CPUs

Commit Message

Yicong Yang June 3, 2024, 9:28 a.m. UTC
From: Yicong Yang <yangyicong@hisilicon.com>

When there are CPUs offline after system boot, perf will fail:
[root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/
Error:
The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock).
/bin/dmesg | grep -i perf may provide additional information.

This is because the PMU's "cpus" attribute is not updated and still contains
offline CPUs, so perf will try to open perf events on the offlined CPUs.

Make "cpus" attribute only shows online CPUs and introduced a new
"supported_cpus" where users can get the range of the CPUs this
PMU supported monitoring.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/perf/arm_pmu.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)
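
[For context, a minimal userspace sketch of how the two attributes could be inspected once this patch is applied. The PMU name "armv8_pmuv3_0" and the event_source sysfs path are examples from the report below, not something the patch itself defines.]

#include <stdio.h>

/* Print one sysfs attribute of an arm_pmu device. The PMU name is only an
 * example; arm_pmu devices are exposed under /sys/bus/event_source/devices/. */
static void show(const char *attr)
{
	char path[128], line[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/bus/event_source/devices/armv8_pmuv3_0/%s", attr);
	f = fopen(path, "r");
	if (!f)
		return;
	if (fgets(line, sizeof(line), f))
		printf("%s: %s", attr, line);
	fclose(f);
}

int main(void)
{
	show("cpus");		/* after this patch: supported CPUs that are online */
	show("supported_cpus");	/* new: all CPUs this PMU covers, online or not */
	return 0;
}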

Comments

Ian Rogers June 3, 2024, 4:20 p.m. UTC | #1
On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@huawei.com> wrote:
>
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> When there are CPUs offline after system boot, perf will fail:
> [root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/
> Error:
> The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock).
> /bin/dmesg | grep -i perf may provide additional information.

Thanks for debugging this Yicong! The fact cycles is falling back on
cpu-clock I'm confused by, on ARM the PMU type generally isn't
PERF_TYPE_HARDWARE and so this fallback shouldn't happen:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n2900
I note your command line is for system wide event opening rather than
for a process. It is more strange the fallback is giving "No such
device".

> This is because the PMU's "cpus" attribute is not updated and still contains
> offline CPUs, so perf will try to open perf events on the offlined CPUs.

The perf tool will try to only open online CPUs. Unfortunately the
naming around this is confusing:

- any - an event may follow a task and schedule on "any" CPU. We
encode this with -1.
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n24
- online - the set of online CPU.
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n33
- all - I try to avoid this but it may still be in the code, "all"
usually is another name for online. Hopefully when we use this name it
is clearly defined:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/internal/evlist.h?h=perf-tools-next#n23

> Make "cpus" attribute only shows online CPUs and introduced a new
> "supported_cpus" where users can get the range of the CPUs this
> PMU supported monitoring.

I don't think this should be necessary as by default the CPUs the perf
tool will use will be the "online" CPUs:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40

Could you create a reproduction of the problem you are encountering?
My expectation is for a core PMU to advertise the online and offline
CPUs it is valid for, and for perf to intersect:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n44
those CPUs with the online CPUs so the perf_event_open doesn't fail.
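
[For reference, a minimal sketch of the intersection described above, using the libperf cpumap API from the links (assuming the current perf-tools-next names). The "0-3" list is a stand-in for whatever the PMU's "cpus" file advertises; build with -lperf.]

#include <perf/cpumap.h>
#include <stdio.h>

int main(void)
{
	/* The online CPUs as libperf sees them. */
	struct perf_cpu_map *online = perf_cpu_map__new_online_cpus();
	/* Stand-in for the CPU list read from the PMU's "cpus" sysfs file. */
	struct perf_cpu_map *pmu_cpus = perf_cpu_map__new("0-3");
	/* Only attempt perf_event_open() on CPUs present in both maps. */
	struct perf_cpu_map *to_open = perf_cpu_map__intersect(pmu_cpus, online);
	struct perf_cpu cpu;
	int idx;

	perf_cpu_map__for_each_cpu(cpu, idx, to_open)
		printf("would open on cpu %d\n", cpu.cpu);

	perf_cpu_map__put(to_open);
	perf_cpu_map__put(pmu_cpus);
	perf_cpu_map__put(online);
	return 0;
}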

Thanks,
Ian

> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/perf/arm_pmu.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 8458fe2cebb4..acbb0e1d0414 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -558,13 +558,35 @@ static ssize_t cpus_show(struct device *dev,
>                          struct device_attribute *attr, char *buf)
>  {
>         struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
> -       return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
> +       cpumask_var_t mask;
> +       ssize_t n;
> +
> +       /* If allocation failed then show the supported_cpus */
> +       if (!alloc_cpumask_var(&mask, GFP_KERNEL))
> +               return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
> +
> +       cpumask_and(mask, &armpmu->supported_cpus, cpu_online_mask);
> +       n = cpumap_print_to_pagebuf(true, buf, mask);
> +       free_cpumask_var(mask);
> +
> +       return n;
>  }
>
>  static DEVICE_ATTR_RO(cpus);
>
> +static ssize_t supported_cpus_show(struct device *dev,
> +                                  struct device_attribute *attr, char *buf)
> +{
> +       struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
> +
> +       return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
> +}
> +
> +static DEVICE_ATTR_RO(supported_cpus);
> +
>  static struct attribute *armpmu_common_attrs[] = {
>         &dev_attr_cpus.attr,
> +       &dev_attr_supported_cpus.attr,
>         NULL,
>  };
>
> --
> 2.24.0
>
Yicong Yang June 4, 2024, 7:43 a.m. UTC | #2
Hi Ian,

On 2024/6/4 0:20, Ian Rogers wrote:
> On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@huawei.com> wrote:
>>
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> When there are CPUs offline after system boot, perf will fail:
>> [root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/
>> Error:
>> The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock).
>> /bin/dmesg | grep -i perf may provide additional information.
> 
> Thanks for debugging this Yicong! The fact cycles is falling back on
> cpu-clock I'm confused by, on ARM the PMU type generally isn't
> PERF_TYPE_HARDWARE and so this fallback shouldn't happen:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n2900
> I note your command line is for system wide event opening rather than
> for a process. It is more strange the fallback is giving "No such
> device".
> 
>> This is because the PMU's "cpus" attribute is not updated and still contains
>> offline CPUs, so perf will try to open perf events on the offlined CPUs.
> 
> The perf tool will try to only open online CPUs. Unfortunately the
> naming around this is confusing:
> 
> - any - an event may follow a task and schedule on "any" CPU. We
> encode this with -1.
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n24
> - online - the set of online CPU.
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n33
> - all - I try to avoid this but it may still be in the code, "all"
> usually is another name for online. Hopefully when we use this name it
> is clearly defined:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/internal/evlist.h?h=perf-tools-next#n23
> 
>> Make "cpus" attribute only shows online CPUs and introduced a new
>> "supported_cpus" where users can get the range of the CPUs this
>> PMU supported monitoring.
> 
> I don't think this should be necessary as by default the CPUs the perf
> tool will use will be the "online" CPUs:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40
> 
> Could you create a reproduction of the problem you are encountering?
> My expectation is for a core PMU to advertise the online and offline
> CPUs it is valid for, and for perf to intersect:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n44
> those CPUs with the online CPUs so the perf_event_open doesn't fail.
> 

Thanks for the comments and detailed illustration!

Actually it can be reproduced easily using the armv8_pmuv3's events. Tested on 6.10-rc1 with
perf version 6.10.rc1.g1613e604df0c:
# offline any CPU
[root@localhost tmp]# echo 0 > /sys//devices/system/cpu/cpu1/online
# try any event of armv8_pmuv3, with -a or without
[root@localhost tmp]# ./perf stat -e armv8_pmuv3_0/ll_cache/ -vvv
Control descriptor is not initialized
Opening: armv8_pmuv3_0/ll_cache/
------------------------------------------------------------
perf_event_attr:
  type                             10 (armv8_pmuv3_0)
  size                             136
  config                           0x32 (ll_cache)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
Opening: armv8_pmuv3_0/ll_cache/
------------------------------------------------------------
perf_event_attr:
  type                             10 (armv8_pmuv3_0)
  size                             136
  config                           0x32 (ll_cache)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8
sys_perf_event_open failed, error -19
Error:
The sys_perf_event_open() syscall returned with 19 (No such device) for event (armv8_pmuv3_0/ll_cache/).
/bin/dmesg | grep -i perf may provide additional information.

As you can see, we try to open the event on CPU 0 first (which succeeds) and then on
CPU 1, where it fails. Actually we don't enter any branch that handles evsel->cpus in
__perf_evlist__propagate_maps(), so evsel->cpus stays as is, which should be initialized
from the pmu->cpumask. You referenced [1], but in this case perf_evsel->system_wide == false
as I checked. perf_evsel->system_wide is set to true in perf_evlist__go_system_wide() and it
is perhaps only set for dummy events. Please correct me if I misread here.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40

Thanks.
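
[For reference, the -19 (ENODEV) above can be reproduced at the syscall level without the perf tool. A minimal sketch, assuming CPU 1 is the offlined CPU and using the same cpu-clock software event as the first report; it needs the same perf privileges (root/CAP_PERFMON) as the runs above.]

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

static long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
				int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(int argc, char **argv)
{
	/* CPU to open on; defaults to 1, the CPU offlined in the log above. */
	int cpu = argc > 1 ? atoi(argv[1]) : 1;
	struct perf_event_attr attr;
	long fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_CPU_CLOCK;	/* the cpu-clock event from the first report */
	attr.disabled = 1;

	/* pid == -1, cpu >= 0: count on that CPU for all tasks. */
	fd = sys_perf_event_open(&attr, -1, cpu, -1, 0);
	if (fd < 0)
		printf("cpu %d: %s\n", cpu, strerror(errno));	/* "No such device" if offline */
	else
		printf("cpu %d: opened fd %ld\n", cpu, fd);
	return 0;
}
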
Yicong Yang June 4, 2024, 12:22 p.m. UTC | #3
On 2024/6/4 0:20, Ian Rogers wrote:
> On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@huawei.com> wrote:
>>
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> When there are CPUs offline after system boot, perf will fail:
>> [root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/
>> Error:
>> The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock).
>> /bin/dmesg | grep -i perf may provide additional information.
> 
> Thanks for debugging this Yicong! The fact cycles is falling back on
> cpu-clock I'm confused by, on ARM the PMU type generally isn't
> PERF_TYPE_HARDWARE and so this fallback shouldn't happen:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n2900

It should be caused by config_term_pmu() in [1]. If it's a hardware event term
and not found in the PMU's sysfs, we set attr->type to PERF_TYPE_HARDWARE.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1052

Thanks.
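
[A rough illustration of the sysfs lookup this refers to: the event stays on the named PMU only if a matching alias exists under the PMU's "events" directory, otherwise the tool falls back to the legacy PERF_TYPE_HARDWARE encoding. The helper and PMU names below are examples; the real check in parse-events is more involved.]

#include <stdio.h>
#include <stdbool.h>
#include <unistd.h>

/* Example only: does the PMU expose a sysfs alias for this event name? */
static bool pmu_has_sysfs_event(const char *pmu, const char *event)
{
	char path[128];

	snprintf(path, sizeof(path),
		 "/sys/bus/event_source/devices/%s/events/%s", pmu, event);
	return access(path, F_OK) == 0;
}

int main(void)
{
	const char *ev = "cycles";

	if (pmu_has_sysfs_event("armv8_pmuv3_0", ev))
		printf("%s is a sysfs event of the PMU\n", ev);
	else
		printf("%s not in the PMU's sysfs; tool falls back to PERF_TYPE_HARDWARE\n", ev);
	return 0;
}
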
Namhyung Kim June 7, 2024, 12:17 a.m. UTC | #4
On Tue, Jun 04, 2024 at 03:43:35PM +0800, Yicong Yang wrote:
> Hi Ian,
> 
> On 2024/6/4 0:20, Ian Rogers wrote:
> > On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@huawei.com> wrote:
> >>
> >> From: Yicong Yang <yangyicong@hisilicon.com>
> >>
> >> When there are CPUs offline after system boot, perf will fail:
> >> [root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/
> >> Error:
> >> The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock).
> >> /bin/dmesg | grep -i perf may provide additional information.
> > 
> > Thanks for debugging this Yicong! The fact cycles is falling back on
> > cpu-clock I'm confused by, on ARM the PMU type generally isn't
> > PERF_TYPE_HARDWARE and so this fallback shouldn't happen:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n2900
> > I note your command line is for system wide event opening rather than
> > for a process. It is more strange the fallback is giving "No such
> > device".
> > 
> >> This is because the PMU's "cpus" attribute is not updated and still contains
> >> offline CPUs, so perf will try to open perf events on the offlined CPUs.
> > 
> > The perf tool will try to only open online CPUs. Unfortunately the
> > naming around this is confusing:
> > 
> > - any - an event may follow a task and schedule on "any" CPU. We
> > encode this with -1.
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n24
> > - online - the set of online CPU.
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n33
> > - all - I try to avoid this but it may still be in the code, "all"
> > usually is another name for online. Hopefully when we use this name it
> > is clearly defined:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/internal/evlist.h?h=perf-tools-next#n23
> > 
> >> Make "cpus" attribute only shows online CPUs and introduced a new
> >> "supported_cpus" where users can get the range of the CPUs this
> >> PMU supported monitoring.
> > 
> > I don't think this should be necessary as by default the CPUs the perf
> > tool will use will be the "online" CPUs:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40
> > 
> > Could you create a reproduction of the problem you are encountering?
> > My expectation is for a core PMU to advertise the online and offline
> > CPUs it is valid for, and for perf to intersect:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n44
> > those CPUs with the online CPUs so the perf_event_open doesn't fail.
> > 
> 
> Thanks for the comments and detailed illustration!
> 
> Actually it can be reproduced easily using the armv8_pmuv3's events. Tested on 6.10-rc1 with
> perf version 6.10.rc1.g1613e604df0c:
> # offline any CPU
> [root@localhost tmp]# echo 0 > /sys//devices/system/cpu/cpu1/online
> # try any event of armv8_pmuv3, with -a or without
> [root@localhost tmp]# ./perf stat -e armv8_pmuv3_0/ll_cache/ -vvv
> Control descriptor is not initialized
> Opening: armv8_pmuv3_0/ll_cache/
> ------------------------------------------------------------
> perf_event_attr:
>   type                             10 (armv8_pmuv3_0)
>   size                             136
>   config                           0x32 (ll_cache)
>   sample_type                      IDENTIFIER
>   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>   disabled                         1
>   inherit                          1
>   exclude_guest                    1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
> Opening: armv8_pmuv3_0/ll_cache/
> ------------------------------------------------------------
> perf_event_attr:
>   type                             10 (armv8_pmuv3_0)
>   size                             136
>   config                           0x32 (ll_cache)
>   sample_type                      IDENTIFIER
>   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>   disabled                         1
>   inherit                          1
>   exclude_guest                    1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8
> sys_perf_event_open failed, error -19
> Error:
> The sys_perf_event_open() syscall returned with 19 (No such device) for event (armv8_pmuv3_0/ll_cache/).
> /bin/dmesg | grep -i perf may provide additional information.
> 
> As you can see, we try to open the event on CPU 0 first (which succeeds) and then on
> CPU 1, where it fails. Actually we don't enter any branch that handles evsel->cpus in
> __perf_evlist__propagate_maps(), so evsel->cpus stays as is, which should be initialized
> from the pmu->cpumask. You referenced [1], but in this case perf_evsel->system_wide == false
> as I checked. perf_evsel->system_wide is set to true in perf_evlist__go_system_wide() and it
> is perhaps only set for dummy events. Please correct me if I misread here.

Yes, this is confusing.  IIRC evsel->system_wide is for tracking (dummy)
events to collect side band events (like for Intel-PT) regardless of
command line options (like -a or -C).

Thanks,
Namhyung

> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40
> 
> Thanks.
>

Patch

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 8458fe2cebb4..acbb0e1d0414 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -558,13 +558,35 @@  static ssize_t cpus_show(struct device *dev,
 			 struct device_attribute *attr, char *buf)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
-	return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
+	cpumask_var_t mask;
+	ssize_t n;
+
+	/* If allocation failed then show the supported_cpus */
+	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
+		return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
+
+	cpumask_and(mask, &armpmu->supported_cpus, cpu_online_mask);
+	n = cpumap_print_to_pagebuf(true, buf, mask);
+	free_cpumask_var(mask);
+
+	return n;
 }
 
 static DEVICE_ATTR_RO(cpus);
 
+static ssize_t supported_cpus_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
+}
+
+static DEVICE_ATTR_RO(supported_cpus);
+
 static struct attribute *armpmu_common_attrs[] = {
 	&dev_attr_cpus.attr,
+	&dev_attr_supported_cpus.attr,
 	NULL,
 };