Message ID | 20220324183323.31414-5-alisaidi@amazon.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | perf: arm-spe: Decode SPE source and use for perf c2c | expand |
On Thu, Mar 24, 2022 at 06:33:23PM +0000, Ali Saidi wrote: > For loads that hit in a the LLC snoop filter and are fulfilled from a > higher level cache on arm64 Neoverse cores, it's not usually clear what > the true level of the cache the data came from (i.e. a transfer from a > core could come from it's L1 or L2). Instead of making an assumption of > where the line came from, add support for incrementing HITM if the > source is CACHE_ANY. > > Since other architectures don't seem to populate the mem_lvl_num field > here there shouldn't be a change in functionality. > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > Tested-by: German Gomez <german.gomez@arm.com> > Reviewed-by: German Gomez <german.gomez@arm.com> > --- > tools/perf/util/mem-events.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c > index e5e405185498..084977cfebef 100644 > --- a/tools/perf/util/mem-events.c > +++ b/tools/perf/util/mem-events.c > @@ -539,6 +539,15 @@ do { \ > stats->ld_llchit++; > } > > + /* > + * A hit in another cores cache must mean a llc snoop > + * filter hit > + */ > + if (lnum == P(LVLNUM, ANY_CACHE)) { > + if (snoop & P(SNOOP, HITM)) > + HITM_INC(lcl_hitm); > + } This might break the memory profiling result for x86, see file arch/x86/events/intel/ds.c: 97 void __init intel_pmu_pebs_data_source_skl(bool pmem) 98 { 99 u64 pmem_or_l4 = pmem ? LEVEL(PMEM) : LEVEL(L4); ... 105 pebs_data_source[0x0d] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOP, HITM); 106 } Which means that it's possible that it's a remote access and the cache level is ANY_CACHE, it's good to add checking for bit PERF_MEM_REMOTE_REMOTE: u64 remote = data_src->mem_remote; /* * A hit in another cores cache must mean a llc snoop * filter hit */ if (lnum == P(LVLNUM, ANY_CACHE) && remote != P(REMOTE, REMOTE)) { if (snoop & P(SNOOP, HITM)) HITM_INC(lcl_hitm); } Appreciate German's reviewing and testing, and sorry I jumped in very late. Thanks, Leo > + > if (lvl & P(LVL, LOC_RAM) || lnum == P(LVLNUM, RAM)) { > stats->lcl_dram++; > if (snoop & P(SNOOP, HIT)) > -- > 2.32.0 >
Em Sat, Mar 26, 2022 at 02:23:03PM +0800, Leo Yan escreveu: > On Thu, Mar 24, 2022 at 06:33:23PM +0000, Ali Saidi wrote: > > For loads that hit in a the LLC snoop filter and are fulfilled from a > > higher level cache on arm64 Neoverse cores, it's not usually clear what > > the true level of the cache the data came from (i.e. a transfer from a > > core could come from it's L1 or L2). Instead of making an assumption of > > where the line came from, add support for incrementing HITM if the > > source is CACHE_ANY. > > > > Since other architectures don't seem to populate the mem_lvl_num field > > here there shouldn't be a change in functionality. > > > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > > Tested-by: German Gomez <german.gomez@arm.com> > > Reviewed-by: German Gomez <german.gomez@arm.com> > > --- > > tools/perf/util/mem-events.c | 9 +++++++++ > > 1 file changed, 9 insertions(+) > > > > diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c > > index e5e405185498..084977cfebef 100644 > > --- a/tools/perf/util/mem-events.c > > +++ b/tools/perf/util/mem-events.c > > @@ -539,6 +539,15 @@ do { \ > > stats->ld_llchit++; > > } > > > > + /* > > + * A hit in another cores cache must mean a llc snoop > > + * filter hit > > + */ > > + if (lnum == P(LVLNUM, ANY_CACHE)) { > > + if (snoop & P(SNOOP, HITM)) > > + HITM_INC(lcl_hitm); > > + } > > This might break the memory profiling result for x86, see file > arch/x86/events/intel/ds.c: > > 97 void __init intel_pmu_pebs_data_source_skl(bool pmem) > 98 { > 99 u64 pmem_or_l4 = pmem ? LEVEL(PMEM) : LEVEL(L4); > ... > 105 pebs_data_source[0x0d] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOP, HITM); > 106 } > > Which means that it's possible that it's a remote access and the cache > level is ANY_CACHE, it's good to add checking for bit > PERF_MEM_REMOTE_REMOTE: > > u64 remote = data_src->mem_remote; > > /* > * A hit in another cores cache must mean a llc snoop > * filter hit > */ > if (lnum == P(LVLNUM, ANY_CACHE) && remote != P(REMOTE, REMOTE)) { > if (snoop & P(SNOOP, HITM)) > HITM_INC(lcl_hitm); > } > > Appreciate German's reviewing and testing, and sorry I jumped in very > late. I have not published this on perf/core, its just in tmp.perf/core while tests ran, so I'll remove this specific patch and rerun tests, thanks for reviewing. - Arnaldo
On Sat, 26 Mar 2022 22:23:03 +0000, Leo Yan wrote: > On Thu, Mar 24, 2022 at 06:33:23PM +0000, Ali Saidi wrote: > > For loads that hit in a the LLC snoop filter and are fulfilled from a > > higher level cache on arm64 Neoverse cores, it's not usually clear what > > the true level of the cache the data came from (i.e. a transfer from a > > core could come from it's L1 or L2). Instead of making an assumption of > > where the line came from, add support for incrementing HITM if the > > source is CACHE_ANY.A [snip] > > This might break the memory profiling result for x86, see file > arch/x86/events/intel/ds.c: > > 97 void __init intel_pmu_pebs_data_source_skl(bool pmem) > 98 { > 99 u64 pmem_or_l4 = pmem ? LEVEL(PMEM) : LEVEL(L4); > ... > 105 pebs_data_source[0x0d] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOP, HITM); > 106 } > Thanks for catching this Leo, I'll add your fix. Ali
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c index e5e405185498..084977cfebef 100644 --- a/tools/perf/util/mem-events.c +++ b/tools/perf/util/mem-events.c @@ -539,6 +539,15 @@ do { \ stats->ld_llchit++; } + /* + * A hit in another cores cache must mean a llc snoop + * filter hit + */ + if (lnum == P(LVLNUM, ANY_CACHE)) { + if (snoop & P(SNOOP, HITM)) + HITM_INC(lcl_hitm); + } + if (lvl & P(LVL, LOC_RAM) || lnum == P(LVLNUM, RAM)) { stats->lcl_dram++; if (snoop & P(SNOOP, HIT))