Message ID | 20220504184850.24986-3-alisaidi@amazon.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | perf: arm-spe: Decode SPE source and use for perf c2c | expand |
Em Wed, May 04, 2022 at 06:48:47PM +0000, Ali Saidi escreveu: > Add a flag to the perf mem data struct to signal that a request caused a > cache-to-cache transfer of a line from a peer of the requestor and > wasn't sourced from a lower cache level. The line being moved from one > peer cache to another has latency and performance implications. On Arm64 > Neoverse systems the data source can indicate a cache-to-cache transfer > but not if the line is dirty or clean, so instead of overloading HITM > define a new flag that indicates this type of transfer. > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > Reviewed-by: Leo Yan <leo.yan@linaro.org> Was this already merged on the ARM kernel tree? - Arnaldo > --- > tools/include/uapi/linux/perf_event.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h > index d37629dbad72..7b88bfd097dc 100644 > --- a/tools/include/uapi/linux/perf_event.h > +++ b/tools/include/uapi/linux/perf_event.h > @@ -1310,7 +1310,7 @@ union perf_mem_data_src { > #define PERF_MEM_SNOOP_SHIFT 19 > > #define PERF_MEM_SNOOPX_FWD 0x01 /* forward */ > -/* 1 free */ > +#define PERF_MEM_SNOOPX_PEER 0x02 /* xfer from peer */ > #define PERF_MEM_SNOOPX_SHIFT 38 > > /* locked instruction */ > -- > 2.32.0
On Tue, May 10, 2022 at 01:28:38PM -0300, Arnaldo Carvalho de Melo wrote: > Em Wed, May 04, 2022 at 06:48:47PM +0000, Ali Saidi escreveu: > > Add a flag to the perf mem data struct to signal that a request caused a > > cache-to-cache transfer of a line from a peer of the requestor and > > wasn't sourced from a lower cache level. The line being moved from one > > peer cache to another has latency and performance implications. On Arm64 > > Neoverse systems the data source can indicate a cache-to-cache transfer > > but not if the line is dirty or clean, so instead of overloading HITM > > define a new flag that indicates this type of transfer. > > > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > > Reviewed-by: Leo Yan <leo.yan@linaro.org> > > Was this already merged on the ARM kernel tree? No, I don't think this patch has been merged on Arm kernel tree. I searched Arm and Arm64 git repos, none of them has merged this patch. https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?qt=author&q=Ali+Saidi http://git.armlinux.org.uk/cgit/linux-arm.git/log/?qt=author&q=Ali+Saidi P.s. Ali missed to include German's review tag, see: https://lore.kernel.org/lkml/458a2de1-dc93-7e2d-5dc5-fbcd670572b6@arm.com/ Do you want us to resend the patch set for adding tags? Thanks, Leo
On 5/5/22 00:18, Ali Saidi wrote: > Add a flag to the perf mem data struct to signal that a request caused a > cache-to-cache transfer of a line from a peer of the requestor and > wasn't sourced from a lower cache level. The line being moved from one > peer cache to another has latency and performance implications. On Arm64 > Neoverse systems the data source can indicate a cache-to-cache transfer > but not if the line is dirty or clean, so instead of overloading HITM > define a new flag that indicates this type of transfer. > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > Reviewed-by: Leo Yan <leo.yan@linaro.org> > --- > tools/include/uapi/linux/perf_event.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h > index d37629dbad72..7b88bfd097dc 100644 > --- a/tools/include/uapi/linux/perf_event.h > +++ b/tools/include/uapi/linux/perf_event.h > @@ -1310,7 +1310,7 @@ union perf_mem_data_src { > #define PERF_MEM_SNOOP_SHIFT 19 > > #define PERF_MEM_SNOOPX_FWD 0x01 /* forward */ > -/* 1 free */ > +#define PERF_MEM_SNOOPX_PEER 0x02 /* xfer from peer */ > #define PERF_MEM_SNOOPX_SHIFT 38 > > /* locked instruction */ Patch looks good to me. Reviewed-By: Kajol Jain<kjain@linux.ibm.com> Thanks, Kajol Jain
Em Wed, May 11, 2022 at 10:20:04AM +0800, Leo Yan escreveu: > On Tue, May 10, 2022 at 01:28:38PM -0300, Arnaldo Carvalho de Melo wrote: > > Em Wed, May 04, 2022 at 06:48:47PM +0000, Ali Saidi escreveu: > > > Add a flag to the perf mem data struct to signal that a request caused a > > > cache-to-cache transfer of a line from a peer of the requestor and > > > wasn't sourced from a lower cache level. The line being moved from one > > > peer cache to another has latency and performance implications. On Arm64 > > > Neoverse systems the data source can indicate a cache-to-cache transfer > > > but not if the line is dirty or clean, so instead of overloading HITM > > > define a new flag that indicates this type of transfer. > > > > > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > > > Reviewed-by: Leo Yan <leo.yan@linaro.org> > > > > Was this already merged on the ARM kernel tree? > > No, I don't think this patch has been merged on Arm kernel tree. I searched > Arm and Arm64 git repos, none of them has merged this patch. > > https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?qt=author&q=Ali+Saidi > http://git.armlinux.org.uk/cgit/linux-arm.git/log/?qt=author&q=Ali+Saidi > > P.s. Ali missed to include German's review tag, see: > https://lore.kernel.org/lkml/458a2de1-dc93-7e2d-5dc5-fbcd670572b6@arm.com/ > > Do you want us to resend the patch set for adding tags? I use b4 and it should collect Reviewed-by, Acked-by, etc tags, for instance, if I use the message-id in your message: ⬢[acme@toolbox perf]$ b4 am -ctsl --cc-trailers 20220511022004.GA956170@leoy-ThinkPad-X240s Looking up https://lore.kernel.org/r/20220511022004.GA956170%40leoy-ThinkPad-X240s Grabbing thread from lore.kernel.org/all/20220511022004.GA956170%40leoy-ThinkPad-X240s/t.mbox.gz Checking for newer revisions on https://lore.kernel.org/all/ Analyzing 12 messages in the thread Checking attestation on all messages, may take a moment... --- ✓ [PATCH v8 1/5] perf: Add SNOOP_PEER flag to perf mem data struct + Reviewed-By: Kajol Jain<kjain@linux.ibm.com> (✓ DKIM/ibm.com) + Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> + Link: https://lore.kernel.org/r/20220504184850.24986-2-alisaidi@amazon.com + Cc: Nick.Forrington@arm.com + Cc: andrew.kilroy@arm.com + Cc: german.gomez@arm.com + Cc: james.clark@arm.com + Cc: mark.rutland@arm.com + Cc: john.garry@huawei.com + Cc: lihuafei1@huawei.com + Cc: peterz@infradead.org + Cc: benh@kernel.crashing.org + Cc: acme@kernel.org + Cc: jolsa@kernel.org + Cc: namhyung@kernel.org + Cc: will@kernel.org + Cc: mathieu.poirier@linaro.org + Cc: alexander.shishkin@linux.intel.com + Cc: linux-arm-kernel@lists.infradead.org + Cc: mingo@redhat.com + Cc: linux-kernel@vger.kernel.org + Cc: linux-perf-users@vger.kernel.org ✓ [PATCH v8 2/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER + Reviewed-By: Kajol Jain<kjain@linux.ibm.com> (✓ DKIM/ibm.com) + Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> + Link: https://lore.kernel.org/r/20220504184850.24986-3-alisaidi@amazon.com + Cc: Nick.Forrington@arm.com + Cc: andrew.kilroy@arm.com + Cc: german.gomez@arm.com + Cc: james.clark@arm.com + Cc: mark.rutland@arm.com + Cc: john.garry@huawei.com + Cc: lihuafei1@huawei.com + Cc: peterz@infradead.org + Cc: benh@kernel.crashing.org + Cc: acme@kernel.org + Cc: jolsa@kernel.org + Cc: namhyung@kernel.org + Cc: will@kernel.org + Cc: mathieu.poirier@linaro.org + Cc: alexander.shishkin@linux.intel.com + Cc: linux-arm-kernel@lists.infradead.org + Cc: mingo@redhat.com + Cc: linux-kernel@vger.kernel.org + Cc: linux-perf-users@vger.kernel.org ✓ [PATCH v8 3/5] perf mem: Print snoop peer flag + Reviewed-By: Kajol Jain<kjain@linux.ibm.com> (✓ DKIM/ibm.com) + Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> + Link: https://lore.kernel.org/r/20220504184850.24986-4-alisaidi@amazon.com + Cc: Nick.Forrington@arm.com + Cc: andrew.kilroy@arm.com + Cc: german.gomez@arm.com + Cc: james.clark@arm.com + Cc: mark.rutland@arm.com + Cc: john.garry@huawei.com + Cc: lihuafei1@huawei.com + Cc: peterz@infradead.org + Cc: benh@kernel.crashing.org + Cc: acme@kernel.org + Cc: jolsa@kernel.org + Cc: namhyung@kernel.org + Cc: will@kernel.org + Cc: mathieu.poirier@linaro.org + Cc: alexander.shishkin@linux.intel.com + Cc: linux-arm-kernel@lists.infradead.org + Cc: mingo@redhat.com + Cc: linux-kernel@vger.kernel.org + Cc: linux-perf-users@vger.kernel.org ✓ [PATCH v8 4/5] perf arm-spe: Don't set data source if it's not a memory operation + Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> + Link: https://lore.kernel.org/r/20220504184850.24986-5-alisaidi@amazon.com + Cc: Nick.Forrington@arm.com + Cc: andrew.kilroy@arm.com + Cc: james.clark@arm.com + Cc: mark.rutland@arm.com + Cc: john.garry@huawei.com + Cc: lihuafei1@huawei.com + Cc: peterz@infradead.org + Cc: benh@kernel.crashing.org + Cc: acme@kernel.org + Cc: jolsa@kernel.org + Cc: namhyung@kernel.org + Cc: will@kernel.org + Cc: mathieu.poirier@linaro.org + Cc: kjain@linux.ibm.com + Cc: alexander.shishkin@linux.intel.com + Cc: linux-arm-kernel@lists.infradead.org + Cc: mingo@redhat.com + Cc: linux-kernel@vger.kernel.org + Cc: linux-perf-users@vger.kernel.org ✓ [PATCH v8 5/5] perf arm-spe: Use SPE data source for neoverse cores + Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> + Link: https://lore.kernel.org/r/20220504184850.24986-6-alisaidi@amazon.com + Cc: Nick.Forrington@arm.com + Cc: andrew.kilroy@arm.com + Cc: james.clark@arm.com + Cc: mark.rutland@arm.com + Cc: john.garry@huawei.com + Cc: lihuafei1@huawei.com + Cc: peterz@infradead.org + Cc: benh@kernel.crashing.org + Cc: acme@kernel.org + Cc: jolsa@kernel.org + Cc: namhyung@kernel.org + Cc: will@kernel.org + Cc: mathieu.poirier@linaro.org + Cc: kjain@linux.ibm.com + Cc: alexander.shishkin@linux.intel.com + Cc: linux-arm-kernel@lists.infradead.org + Cc: mingo@redhat.com + Cc: linux-kernel@vger.kernel.org + Cc: linux-perf-users@vger.kernel.org --- ✓ Signed: DKIM/amazon.com --- Total patches: 5 --- Cover: ./v8_20220504_alisaidi_perf_arm_spe_decode_spe_source_and_use_for_perf_c2c.cover Link: https://lore.kernel.org/r/20220504184850.24986-1-alisaidi@amazon.com Base: not specified git am ./v8_20220504_alisaidi_perf_arm_spe_decode_spe_source_and_use_for_perf_c2c.mbx ⬢[acme@toolbox perf]$ Somehow it is not being collected... :-\ Not even when I use: > P.s. Ali missed to include German's review tag, see: > https://lore.kernel.org/lkml/458a2de1-dc93-7e2d-5dc5-fbcd670572b6@arm.com/ 458a2de1-dc93-7e2d-5dc5-fbcd670572b6@arm.com Will try updating b4... - Arnaldo
Em Wed, May 11, 2022 at 03:28:00PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Wed, May 11, 2022 at 10:20:04AM +0800, Leo Yan escreveu: > > On Tue, May 10, 2022 at 01:28:38PM -0300, Arnaldo Carvalho de Melo wrote: > > > Em Wed, May 04, 2022 at 06:48:47PM +0000, Ali Saidi escreveu: > > > > Add a flag to the perf mem data struct to signal that a request caused a > > > > cache-to-cache transfer of a line from a peer of the requestor and > > > > wasn't sourced from a lower cache level. The line being moved from one > > > > peer cache to another has latency and performance implications. On Arm64 > > > > Neoverse systems the data source can indicate a cache-to-cache transfer > > > > but not if the line is dirty or clean, so instead of overloading HITM > > > > define a new flag that indicates this type of transfer. > > > > > > > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > > > > Reviewed-by: Leo Yan <leo.yan@linaro.org> > > > > > > Was this already merged on the ARM kernel tree? > > > > No, I don't think this patch has been merged on Arm kernel tree. I searched > > Arm and Arm64 git repos, none of them has merged this patch. > > > > https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?qt=author&q=Ali+Saidi > > http://git.armlinux.org.uk/cgit/linux-arm.git/log/?qt=author&q=Ali+Saidi > > > > P.s. Ali missed to include German's review tag, see: > > https://lore.kernel.org/lkml/458a2de1-dc93-7e2d-5dc5-fbcd670572b6@arm.com/ > > > > Do you want us to resend the patch set for adding tags? > > I use b4 and it should collect Reviewed-by, Acked-by, etc tags, for > instance, if I use the message-id in your message: > > ⬢[acme@toolbox perf]$ b4 am -ctsl --cc-trailers 20220511022004.GA956170@leoy-ThinkPad-X240s > Looking up https://lore.kernel.org/r/20220511022004.GA956170%40leoy-ThinkPad-X240s > Grabbing thread from lore.kernel.org/all/20220511022004.GA956170%40leoy-ThinkPad-X240s/t.mbox.gz > Checking for newer revisions on https://lore.kernel.org/all/ > Analyzing 12 messages in the thread <SNIP> > ✓ [PATCH v8 5/5] perf arm-spe: Use SPE data source for neoverse cores > + Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > + Link: https://lore.kernel.org/r/20220504184850.24986-6-alisaidi@amazon.com > + Cc: Nick.Forrington@arm.com > + Cc: andrew.kilroy@arm.com > + Cc: james.clark@arm.com > + Cc: mark.rutland@arm.com > + Cc: john.garry@huawei.com > + Cc: lihuafei1@huawei.com > + Cc: peterz@infradead.org > + Cc: benh@kernel.crashing.org > + Cc: acme@kernel.org > + Cc: jolsa@kernel.org > + Cc: namhyung@kernel.org > + Cc: will@kernel.org > + Cc: mathieu.poirier@linaro.org > + Cc: kjain@linux.ibm.com > + Cc: alexander.shishkin@linux.intel.com > + Cc: linux-arm-kernel@lists.infradead.org > + Cc: mingo@redhat.com > + Cc: linux-kernel@vger.kernel.org > + Cc: linux-perf-users@vger.kernel.org > --- > ✓ Signed: DKIM/amazon.com > --- > Total patches: 5 > --- > Cover: ./v8_20220504_alisaidi_perf_arm_spe_decode_spe_source_and_use_for_perf_c2c.cover > Link: https://lore.kernel.org/r/20220504184850.24986-1-alisaidi@amazon.com > Base: not specified > git am ./v8_20220504_alisaidi_perf_arm_spe_decode_spe_source_and_use_for_perf_c2c.mbx > ⬢[acme@toolbox perf]$ > > Somehow it is not being collected... :-\ > > Not even when I use: > > > P.s. Ali missed to include German's review tag, see: > > https://lore.kernel.org/lkml/458a2de1-dc93-7e2d-5dc5-fbcd670572b6@arm.com/ > > > 458a2de1-dc93-7e2d-5dc5-fbcd670572b6@arm.com > > Will try updating b4... Didn't help, so please collect the new tags and resubmit. - Arnaldo
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index d37629dbad72..7b88bfd097dc 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -1310,7 +1310,7 @@ union perf_mem_data_src { #define PERF_MEM_SNOOP_SHIFT 19 #define PERF_MEM_SNOOPX_FWD 0x01 /* forward */ -/* 1 free */ +#define PERF_MEM_SNOOPX_PEER 0x02 /* xfer from peer */ #define PERF_MEM_SNOOPX_SHIFT 38 /* locked instruction */