Message ID | 20240603092812.46616-3-yangyicong@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Perf avoid opening events on offline CPUs | expand |
On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@huawei.com> wrote: > > From: Yicong Yang <yangyicong@hisilicon.com> > > When there're CPUs offline after system booting, perf will failed: > [root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/ > Error: > The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock). > /bin/dmesg | grep -i perf may provide additional information. Thanks for debugging this Yicong! The fact cycles is falling back on cpu-clock I'm confused by, on ARM the PMU type generally isn't PERF_TYPE_HARDWARE and so this fallback shouldn't happen: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n2900 I note your command line is for system wide event opening rather than for a process. It is more strange the fallback is giving "No such device". > This is due to PMU's "cpus" is not updated and still contains offline > CPUs and perf will try to open perf event on the offlined CPUs. The perf tool will try to only open online CPUs. Unfortunately the naming around this is confusing: - any - an event may follow a task and schedule on "any" CPU. We encode this with -1. https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n24 - online - the set of online CPU. https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n33 - all - I try to avoid this but it may still be in the code, "all" usually is another name for online. Hopefully when we use this name it is clearly defined: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/internal/evlist.h?h=perf-tools-next#n23 > Make "cpus" attribute only shows online CPUs and introduced a new > "supported_cpus" where users can get the range of the CPUs this > PMU supported monitoring. I don't think this should be necessary as by default the CPUs the perf tool will use will be the "online" CPUs: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40 Could you create a reproduction of the problem you are encountering? My expectation is for a core PMU to advertise the online and offline CPUs it is valid for, and for perf to intersect: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n44 those CPUs with the online CPUs so the perf_event_open doesn't fail. Thanks, Ian > Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> > --- > drivers/perf/arm_pmu.c | 24 +++++++++++++++++++++++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c > index 8458fe2cebb4..acbb0e1d0414 100644 > --- a/drivers/perf/arm_pmu.c > +++ b/drivers/perf/arm_pmu.c > @@ -558,13 +558,35 @@ static ssize_t cpus_show(struct device *dev, > struct device_attribute *attr, char *buf) > { > struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev)); > - return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus); > + cpumask_var_t mask; > + ssize_t n; > + > + /* If allocation failed then show the supported_cpus */ > + if (!alloc_cpumask_var(&mask, GFP_KERNEL)) > + return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus); > + > + cpumask_and(mask, &armpmu->supported_cpus, cpu_online_mask); > + n = cpumap_print_to_pagebuf(true, buf, mask); > + free_cpumask_var(mask); > + > + return n; > } > > static DEVICE_ATTR_RO(cpus); > > +static ssize_t supported_cpus_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev)); > + > + return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus); > +} > + > +static DEVICE_ATTR_RO(supported_cpus); > + > static struct attribute *armpmu_common_attrs[] = { > &dev_attr_cpus.attr, > + &dev_attr_supported_cpus.attr, > NULL, > }; > > -- > 2.24.0 >
Hi Ian, On 2024/6/4 0:20, Ian Rogers wrote: > On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@huawei.com> wrote: >> >> From: Yicong Yang <yangyicong@hisilicon.com> >> >> When there're CPUs offline after system booting, perf will failed: >> [root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/ >> Error: >> The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock). >> /bin/dmesg | grep -i perf may provide additional information. > > Thanks for debugging this Yicong! The fact cycles is falling back on > cpu-clock I'm confused by, on ARM the PMU type generally isn't > PERF_TYPE_HARDWARE and so this fallback shouldn't happen: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n2900 > I note your command line is for system wide event opening rather than > for a process. It is more strange the fallback is giving "No such > device". > >> This is due to PMU's "cpus" is not updated and still contains offline >> CPUs and perf will try to open perf event on the offlined CPUs. > > The perf tool will try to only open online CPUs. Unfortunately the > naming around this is confusing: > > - any - an event may follow a task and schedule on "any" CPU. We > encode this with -1. > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n24 > - online - the set of online CPU. > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n33 > - all - I try to avoid this but it may still be in the code, "all" > usually is another name for online. Hopefully when we use this name it > is clearly defined: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/internal/evlist.h?h=perf-tools-next#n23 > >> Make "cpus" attribute only shows online CPUs and introduced a new >> "supported_cpus" where users can get the range of the CPUs this >> PMU supported monitoring. > > I don't think this should be necessary as by default the CPUs the perf > tool will use will be the "online" CPUs: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40 > > Could you create a reproduction of the problem you are encountering? > My expectation is for a core PMU to advertise the online and offline > CPUs it is valid for, and for perf to intersect: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n44 > those CPUs with the online CPUs so the perf_event_open doesn't fail. > Thanks for the comments and detailed illustration! Actually it can be reproduced easily using the armv8_pmuv3's events. Tested on 6.10-rc1 with perf version 6.10.rc1.g1613e604df0c: # offline any CPU [root@localhost tmp]# echo 0 > /sys//devices/system/cpu/cpu1/online # try any event of armv8_pmuv3, with -a or without [root@localhost tmp]# ./perf stat -e armv8_pmuv3_0/ll_cache/ -vvv Control descriptor is not initialized Opening: armv8_pmuv3_0/ll_cache/ ------------------------------------------------------------ perf_event_attr: type 10 (armv8_pmuv3_0) size 136 config 0x32 (ll_cache) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 Opening: armv8_pmuv3_0/ll_cache/ ------------------------------------------------------------ perf_event_attr: type 10 (armv8_pmuv3_0) size 136 config 0x32 (ll_cache) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 sys_perf_event_open failed, error -19 Error: The sys_perf_event_open() syscall returned with 19 (No such device) for event (armv8_pmuv3_0/ll_cache/). /bin/dmesg | grep -i perf may provide additional information. As you can see, we're trying to open event on CPU 0 first (succeed) and then CPU 1 but failed on CPU1. Actually we don't enter any branch which handle the evsel->cpus in __perf_evlist__propagate_maps() so the evsel->cpus keeps as is which should be initialized from the pmu->cpumask. You referenced to [1] but in this case perf_evsel->system_wide == false as I checked. perf_evsel->system_wide will set to true in perf_evlist__go_system_wide() and it maybe only set for dummy events. Please correct me if I misread here. [1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40 Thanks.
On 2024/6/4 0:20, Ian Rogers wrote: > On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@huawei.com> wrote: >> >> From: Yicong Yang <yangyicong@hisilicon.com> >> >> When there're CPUs offline after system booting, perf will failed: >> [root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/ >> Error: >> The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock). >> /bin/dmesg | grep -i perf may provide additional information. > > Thanks for debugging this Yicong! The fact cycles is falling back on > cpu-clock I'm confused by, on ARM the PMU type generally isn't > PERF_TYPE_HARDWARE and so this fallback shouldn't happen: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n2900 It should be brought by config_term_pmu() in [1]. If it's a hardware event term and not found in the PMU's sysfs, we'll make attr->type to PERF_TYPE_HARDWARE. [1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n1052 Thanks.
On Tue, Jun 04, 2024 at 03:43:35PM +0800, Yicong Yang wrote: > Hi Ian, > > On 2024/6/4 0:20, Ian Rogers wrote: > > On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@huawei.com> wrote: > >> > >> From: Yicong Yang <yangyicong@hisilicon.com> > >> > >> When there're CPUs offline after system booting, perf will failed: > >> [root@localhost ~]# /home/yang/perf stat -a -e armv8_pmuv3_0/cycles/ > >> Error: > >> The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock). > >> /bin/dmesg | grep -i perf may provide additional information. > > > > Thanks for debugging this Yicong! The fact cycles is falling back on > > cpu-clock I'm confused by, on ARM the PMU type generally isn't > > PERF_TYPE_HARDWARE and so this fallback shouldn't happen: > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/evsel.c?h=perf-tools-next#n2900 > > I note your command line is for system wide event opening rather than > > for a process. It is more strange the fallback is giving "No such > > device". > > > >> This is due to PMU's "cpus" is not updated and still contains offline > >> CPUs and perf will try to open perf event on the offlined CPUs. > > > > The perf tool will try to only open online CPUs. Unfortunately the > > naming around this is confusing: > > > > - any - an event may follow a task and schedule on "any" CPU. We > > encode this with -1. > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n24 > > - online - the set of online CPU. > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n33 > > - all - I try to avoid this but it may still be in the code, "all" > > usually is another name for online. Hopefully when we use this name it > > is clearly defined: > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/internal/evlist.h?h=perf-tools-next#n23 > > > >> Make "cpus" attribute only shows online CPUs and introduced a new > >> "supported_cpus" where users can get the range of the CPUs this > >> PMU supported monitoring. > > > > I don't think this should be necessary as by default the CPUs the perf > > tool will use will be the "online" CPUs: > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40 > > > > Could you create a reproduction of the problem you are encountering? > > My expectation is for a core PMU to advertise the online and offline > > CPUs it is valid for, and for perf to intersect: > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/include/perf/cpumap.h?h=perf-tools-next#n44 > > those CPUs with the online CPUs so the perf_event_open doesn't fail. > > > > Thanks for the comments and detailed illustration! > > Actually it can be reproduced easily using the armv8_pmuv3's events. Tested on 6.10-rc1 with > perf version 6.10.rc1.g1613e604df0c: > # offline any CPU > [root@localhost tmp]# echo 0 > /sys//devices/system/cpu/cpu1/online > # try any event of armv8_pmuv3, with -a or without > [root@localhost tmp]# ./perf stat -e armv8_pmuv3_0/ll_cache/ -vvv > Control descriptor is not initialized > Opening: armv8_pmuv3_0/ll_cache/ > ------------------------------------------------------------ > perf_event_attr: > type 10 (armv8_pmuv3_0) > size 136 > config 0x32 (ll_cache) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 > Opening: armv8_pmuv3_0/ll_cache/ > ------------------------------------------------------------ > perf_event_attr: > type 10 (armv8_pmuv3_0) > size 136 > config 0x32 (ll_cache) > sample_type IDENTIFIER > read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING > disabled 1 > inherit 1 > exclude_guest 1 > ------------------------------------------------------------ > sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 > sys_perf_event_open failed, error -19 > Error: > The sys_perf_event_open() syscall returned with 19 (No such device) for event (armv8_pmuv3_0/ll_cache/). > /bin/dmesg | grep -i perf may provide additional information. > > As you can see, we're trying to open event on CPU 0 first (succeed) and then CPU 1 but > failed on CPU1. Actually we don't enter any branch which handle the evsel->cpus in > __perf_evlist__propagate_maps() so the evsel->cpus keeps as is which should be initialized > from the pmu->cpumask. You referenced to [1] but in this case perf_evsel->system_wide == false > as I checked. perf_evsel->system_wide will set to true in perf_evlist__go_system_wide() and it > maybe only set for dummy events. Please correct me if I misread here. Yes, this is confusing. IIRC evsel->system_wide is for tracking (dummy) events to collect side band events (like for Intel-PT) regardless of command line options (like -a or -C). Thanks, Namhyung > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/lib/perf/evlist.c?h=perf-tools-next#n40 > > Thanks. >
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c index 8458fe2cebb4..acbb0e1d0414 100644 --- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -558,13 +558,35 @@ static ssize_t cpus_show(struct device *dev, struct device_attribute *attr, char *buf) { struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev)); - return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus); + cpumask_var_t mask; + ssize_t n; + + /* If allocation failed then show the supported_cpus */ + if (!alloc_cpumask_var(&mask, GFP_KERNEL)) + return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus); + + cpumask_and(mask, &armpmu->supported_cpus, cpu_online_mask); + n = cpumap_print_to_pagebuf(true, buf, mask); + free_cpumask_var(mask); + + return n; } static DEVICE_ATTR_RO(cpus); +static ssize_t supported_cpus_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus); +} + +static DEVICE_ATTR_RO(supported_cpus); + static struct attribute *armpmu_common_attrs[] = { &dev_attr_cpus.attr, + &dev_attr_supported_cpus.attr, NULL, };