Message ID | 1535535863-1818-1-git-send-email-robert.walker@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] perf: Support for Arm A32/T32 instruction sets in CoreSight trace | expand |
On Wed, 29 Aug 2018 10:44:23 +0100 Robert Walker <robert.walker@arm.com> wrote: > This patch adds support for generating instruction samples from trace of > AArch32 programs using the A32 and T32 instruction sets. > > T32 has variable 2 or 4 byte instruction size, so the conversion between > addresses and instruction counts requires extra information from the trace > decoder, requiring version 0.9.1 of OpenCSD. A check for the new struct > member has been added to the feature check for OpenCSD. > > Signed-off-by: Robert Walker <robert.walker@arm.com> > --- ... > +++ b/tools/build/feature/test-libopencsd.c > @@ -3,6 +3,13 @@ > > int main(void) > { > + /* > + * Requires ocsd_generic_trace_elem.num_instr_range introduced in > + * OpenCSD 0.9 0.9 != 0.9.1 in the above commit text: which is it? > + */ > + ocsd_generic_trace_elem elem; > + (void)elem.num_instr_range; > + This breaks building against older versions of OpenCSD, right? > (void)ocsd_get_version(); Why don't we maintain building against older versions of the library, and use the version information to make the decision on whether to use the new feature being introduced in this patch? Thanks, Kim
Hi Kim, On 29/08/18 14:49, Kim Phillips wrote: > On Wed, 29 Aug 2018 10:44:23 +0100 > Robert Walker <robert.walker@arm.com> wrote: > >> This patch adds support for generating instruction samples from trace of >> AArch32 programs using the A32 and T32 instruction sets. >> >> T32 has variable 2 or 4 byte instruction size, so the conversion between >> addresses and instruction counts requires extra information from the trace >> decoder, requiring version 0.9.1 of OpenCSD. A check for the new struct >> member has been added to the feature check for OpenCSD. >> >> Signed-off-by: Robert Walker <robert.walker@arm.com> >> --- > ... >> +++ b/tools/build/feature/test-libopencsd.c >> @@ -3,6 +3,13 @@ >> >> int main(void) >> { >> + /* >> + * Requires ocsd_generic_trace_elem.num_instr_range introduced in >> + * OpenCSD 0.9 > 0.9 != 0.9.1 in the above commit text: which is it? I'll change it to 0.9.1 if there's another version of the patch (it was introduced in 0.9, but 0.9.1 has a necessary bug fix) >> + */ >> + ocsd_generic_trace_elem elem; >> + (void)elem.num_instr_range; >> + > This breaks building against older versions of OpenCSD, right? > >> (void)ocsd_get_version(); > Why don't we maintain building against older versions of the library, > and use the version information to make the decision on whether to use > the new feature being introduced in this patch? The intention is to fail the feature detection check if the older version is installed - perf will still compile, but without the CoreSight trace support. OpenCSD is still in development, so new features like this are being added and it would add a lot of #ifdef mess to the perf code to continue to support any machines with the old library version installed - there will only be a handful of machines affected and it's trivial to upgrade them (the new Debian packages are available). How long would we continue to support such an older version? I also don't see any precedent for supporting multiple dependent library versions in perf. > > Thanks, > > Kim Regards Rob
On Wed, 29 Aug 2018 15:34:16 +0100 Robert Walker <robert.walker@arm.com> wrote: > Hi Kim, Hi Robert, > On 29/08/18 14:49, Kim Phillips wrote: > > On Wed, 29 Aug 2018 10:44:23 +0100 > > Robert Walker <robert.walker@arm.com> wrote: > > > >> This patch adds support for generating instruction samples from trace of > >> AArch32 programs using the A32 and T32 instruction sets. > >> > >> T32 has variable 2 or 4 byte instruction size, so the conversion between > >> addresses and instruction counts requires extra information from the trace > >> decoder, requiring version 0.9.1 of OpenCSD. A check for the new struct > >> member has been added to the feature check for OpenCSD. > >> > >> Signed-off-by: Robert Walker <robert.walker@arm.com> > >> --- > > ... > >> +++ b/tools/build/feature/test-libopencsd.c > >> @@ -3,6 +3,13 @@ > >> > >> int main(void) > >> { > >> + /* > >> + * Requires ocsd_generic_trace_elem.num_instr_range introduced in > >> + * OpenCSD 0.9 > > 0.9 != 0.9.1 in the above commit text: which is it? > I'll change it to 0.9.1 if there's another version of the patch (it was > introduced in 0.9, but 0.9.1 has a necessary bug fix) > >> + */ > >> + ocsd_generic_trace_elem elem; > >> + (void)elem.num_instr_range; > >> + > > This breaks building against older versions of OpenCSD, right? > > > >> (void)ocsd_get_version(); > > Why don't we maintain building against older versions of the library, > > and use the version information to make the decision on whether to use > > the new feature being introduced in this patch? > The intention is to fail the feature detection check if the older > version is installed - perf will still compile, but without the > CoreSight trace support. It should still compile, and with CoreSight trace support, just not support for A32/T32 instruction sets. The user shouldn't be denied CoreSight trace support if they don't care for A32/T32 ISA support. > OpenCSD is still in development, so new features like this are being > added and it would add a lot of #ifdef mess to the perf code to continue > to support any machines with the old library version installed - there Even adding #ifdefs, that won't survive taking one perf executable built on one machine and running it on another machine with a different version of the OpenCSD library: it'll break inconspicuously, not gracefully! There needs to be a run-time means of working with older versions of the library. Consider checking the sizeof some of the structs? IIRC, some of the structs sizes changed in the library. See e.g., the 'size' field of perf_event_attr: size The size of the perf_event_attr structure for forward/backward compatibility. Set this using sizeof(struct perf_event_attr) to allow the kernel to see the struct size at the time of compilation. or, likely better, the 'version' and 'compat_version' of the perf_event_mmap_page structure: struct perf_event_mmap_page { __u32 version; /* version number of this structure */ __u32 compat_version; /* lowest version this is compat with */ ... > will only be a handful of machines affected and it's trivial to upgrade > them (the new Debian packages are available). This is upstream linux, so I don't know how you know only a 'handful' of machines affected, and I wouldn't assume everyone's using Debian. For one, I'd hate to see a single user affected if it isn't necessary, as is in this case - not everyone wants A32/T32 ISA support, and library compatibility needn't be broken. This 'screw compatibility' mentality needs to be dropped *now* if CoreSight support is to have a successful future. Otherwise, I suggest keeping this feature in downstream trees for the 'handful', until the library and perf code are rewritten in a state where they properly interoperate, and do not break each other. > How long would we > continue to support such an older version? What do you mean such an older version? The project's v0.9.0 commit was on 20 June 2018, the one that's usable - v0.9.1 - has a July 27 2018 commit date! One month is *not* *old*! > I also don't see any > precedent for supporting multiple dependent library versions in perf. That's because perf doesn't have a precedent on depending on libraries that flat-out break their own users compatibility across versions ;) Thanks, Kim
Hi Kim, Generally, I agree with you about breaking backward compatibility, but in this case I don't think there is an actual problem. As I understand it, you're worried that perf will break for people who are using an older version (0.8.x) of the OpenCSD library for CoreSight trace decode and this patch updates the requirement to a newer version (0.9.x) to enable support for trace of 32-bit applications. There are only a few (4/5?) targets around with working support for CoreSight trace (and of these only Juno is the only platform with a devicetree in the mainline kernel), so only a few users of perf for Arm trace decode - most of these are people working those directly involved with Arm & Linaro or will be reading the coresight mailing list. Anyone working with OpenCSD will have got it from github and compiled it themselves, so they can update and build a new version. It's only been packaged for debian so far and testing already has the 0.9.x version (the 0.8.x version was only in debian for 8 days before being replaced by 0.9.x). It would be possible to add conditional compilation flags to support compiling with 0.8.x, but I feel this would add too much mess to the code and I'd need some help in figuring out perf's feature detection system to generate the flags. Given the likely small number of people affected and the easy upgrade path, I don't think this is worth it. On 29/08/18 17:32, Kim Phillips wrote: > On Wed, 29 Aug 2018 15:34:16 +0100 > Robert Walker <robert.walker@arm.com> wrote: > >> Hi Kim, > Hi Robert, > >> On 29/08/18 14:49, Kim Phillips wrote: >>> On Wed, 29 Aug 2018 10:44:23 +0100 >>> Robert Walker <robert.walker@arm.com> wrote: >>> >>>> This patch adds support for generating instruction samples from trace of >>>> AArch32 programs using the A32 and T32 instruction sets. >>>> >>>> T32 has variable 2 or 4 byte instruction size, so the conversion between >>>> addresses and instruction counts requires extra information from the trace >>>> decoder, requiring version 0.9.1 of OpenCSD. A check for the new struct >>>> member has been added to the feature check for OpenCSD. >>>> >>>> Signed-off-by: Robert Walker <robert.walker@arm.com> >>>> --- >>> ... >>>> +++ b/tools/build/feature/test-libopencsd.c >>>> @@ -3,6 +3,13 @@ >>>> >>>> int main(void) >>>> { >>>> + /* >>>> + * Requires ocsd_generic_trace_elem.num_instr_range introduced in >>>> + * OpenCSD 0.9 >>> 0.9 != 0.9.1 in the above commit text: which is it? >> I'll change it to 0.9.1 if there's another version of the patch (it was >> introduced in 0.9, but 0.9.1 has a necessary bug fix) >>>> + */ >>>> + ocsd_generic_trace_elem elem; >>>> + (void)elem.num_instr_range; >>>> + >>> This breaks building against older versions of OpenCSD, right? >>> >>>> (void)ocsd_get_version(); >>> Why don't we maintain building against older versions of the library, >>> and use the version information to make the decision on whether to use >>> the new feature being introduced in this patch? >> The intention is to fail the feature detection check if the older >> version is installed - perf will still compile, but without the >> CoreSight trace support. > It should still compile, and with CoreSight trace support, just > not support for A32/T32 instruction sets. The user shouldn't be denied > CoreSight trace support if they don't care for A32/T32 ISA support. > >> OpenCSD is still in development, so new features like this are being >> added and it would add a lot of #ifdef mess to the perf code to continue >> to support any machines with the old library version installed - there > Even adding #ifdefs, that won't survive taking one perf executable > built on one machine and running it on another machine with a different > version of the OpenCSD library: it'll break inconspicuously, not > gracefully! perf has a lot of other shared library dependencies (ELF , unwind libraries etc), so moving builds between systems is already fragile. > There needs to be a run-time means of working with older versions of > the library. > > Consider checking the sizeof some of the structs? IIRC, some of the > structs sizes changed in the library. See e.g., the 'size' field of > perf_event_attr: > > size > The size of the perf_event_attr structure for forward/backward > compatibility. Set this using sizeof(struct perf_event_attr) > to allow the kernel to see the struct size at the time > of compilation. > > or, likely better, the 'version' and 'compat_version' of the > perf_event_mmap_page structure: > > struct perf_event_mmap_page { > __u32 version; /* version number of this structure */ > __u32 compat_version; /* lowest version this is compat with */ > ... > >> will only be a handful of machines affected and it's trivial to upgrade >> them (the new Debian packages are available). > This is upstream linux, so I don't know how you know only a 'handful' > of machines affected, and I wouldn't assume everyone's using Debian. > > For one, I'd hate to see a single user affected if it isn't necessary, > as is in this case - not everyone wants A32/T32 ISA support, and > library compatibility needn't be broken. > > This 'screw compatibility' mentality needs to be dropped *now* if > CoreSight support is to have a successful future. > > Otherwise, I suggest keeping this feature in downstream trees for the > 'handful', until the library and perf code are rewritten in a state > where they properly interoperate, and do not break each other. > >> How long would we >> continue to support such an older version? > What do you mean such an older version? The project's v0.9.0 commit > was on 20 June 2018, the one that's usable - v0.9.1 - has a July 27 > 2018 commit date! One month is *not* *old*! I mean the 0.8.x version as the old version. >> I also don't see any >> precedent for supporting multiple dependent library versions in perf. > That's because perf doesn't have a precedent on depending on libraries > that flat-out break their own users compatibility across versions ;) This patch picks up a new feature that's been added - I notice the feature detection checks for other libraries check a number of features and emit warnings about required versions. > Thanks, > > Kim Regards Rob
On Fri, 31 Aug 2018 14:42:00 +0100 Robert Walker <robert.walker@arm.com> wrote: > Generally, I agree with you about breaking backward compatibility, but > in this case I don't think there is an actual problem. As I understand I consider it a serious problem. > it, you're worried that perf will break for people who are using an > older version (0.8.x) of the OpenCSD library for CoreSight trace decode > and this patch updates the requirement to a newer version (0.9.x) to > enable support for trace of 32-bit applications. My problem is: every time a new feature is added, it shouldn't force base CoreSight decode functionality to require a new version of the library. My second problem is: this patch implements a build-time requirement, which is insufficient for running on machines with incompatible versions of the library. > There are only a few (4/5?) targets around with working support for > CoreSight trace (and of these only Juno is the only platform with a > devicetree in the mainline kernel), so only a few users of perf for Arm > trace decode - most of these are people working those directly involved > with Arm & Linaro or will be reading the coresight mailing list. Anyone Great, then share this feature with them, but don't send a patch to break upstream, which, in turn, goes back to many things downstream (future distro releases on newer targets, etc.). > working with OpenCSD will have got it from github and compiled it > themselves, so they can update and build a new version. It's only been No. Updating the library - no matter where one gets it from - shouldn't have to be necessary to avoid perf build regressions. > packaged for debian so far and testing already has the 0.9.x version > (the 0.8.x version was only in debian for 8 days before being replaced > by 0.9.x). What Debian does is immaterial to upstream submissions. How is Debian testing the library, btw? Functionality that worked in 0.8 will fail in 0.9 AFAICT. > It would be possible to add conditional compilation flags to support > compiling with 0.8.x, but I feel this would add too much mess to the > code and I'd need some help in figuring out perf's feature detection > system to generate the flags. No, we need run-time compatibility. Build-time compatibility does not satisfy the run-time requirements in this case. > Given the likely small number of people > affected and the easy upgrade path, I don't think this is worth it. This is an upstream submission, and I wouldn't like for a single person to ever have to experience such silently deceitful bugs. For now, share the new feature in a personal git tree, for those that need it. Meanwhile, the library needs to be fixed with a run-time version compatibility API ASAP. Thanks, Kim > On 29/08/18 17:32, Kim Phillips wrote: > > On Wed, 29 Aug 2018 15:34:16 +0100 > > Robert Walker <robert.walker@arm.com> wrote: > > > >> Hi Kim, > > Hi Robert, > > > >> On 29/08/18 14:49, Kim Phillips wrote: > >>> On Wed, 29 Aug 2018 10:44:23 +0100 > >>> Robert Walker <robert.walker@arm.com> wrote: > >>> > >>>> This patch adds support for generating instruction samples from trace of > >>>> AArch32 programs using the A32 and T32 instruction sets. > >>>> > >>>> T32 has variable 2 or 4 byte instruction size, so the conversion between > >>>> addresses and instruction counts requires extra information from the trace > >>>> decoder, requiring version 0.9.1 of OpenCSD. A check for the new struct > >>>> member has been added to the feature check for OpenCSD. > >>>> > >>>> Signed-off-by: Robert Walker <robert.walker@arm.com> > >>>> --- > >>> ... > >>>> +++ b/tools/build/feature/test-libopencsd.c > >>>> @@ -3,6 +3,13 @@ > >>>> > >>>> int main(void) > >>>> { > >>>> + /* > >>>> + * Requires ocsd_generic_trace_elem.num_instr_range introduced in > >>>> + * OpenCSD 0.9 > >>> 0.9 != 0.9.1 in the above commit text: which is it? > >> I'll change it to 0.9.1 if there's another version of the patch (it was > >> introduced in 0.9, but 0.9.1 has a necessary bug fix) > >>>> + */ > >>>> + ocsd_generic_trace_elem elem; > >>>> + (void)elem.num_instr_range; > >>>> + > >>> This breaks building against older versions of OpenCSD, right? > >>> > >>>> (void)ocsd_get_version(); > >>> Why don't we maintain building against older versions of the library, > >>> and use the version information to make the decision on whether to use > >>> the new feature being introduced in this patch? > >> The intention is to fail the feature detection check if the older > >> version is installed - perf will still compile, but without the > >> CoreSight trace support. > > It should still compile, and with CoreSight trace support, just > > not support for A32/T32 instruction sets. The user shouldn't be denied > > CoreSight trace support if they don't care for A32/T32 ISA support. > > > >> OpenCSD is still in development, so new features like this are being > >> added and it would add a lot of #ifdef mess to the perf code to continue > >> to support any machines with the old library version installed - there > > Even adding #ifdefs, that won't survive taking one perf executable > > built on one machine and running it on another machine with a different > > version of the OpenCSD library: it'll break inconspicuously, not > > gracefully! > > perf has a lot of other shared library dependencies (ELF , unwind > libraries etc), so moving builds between systems is already fragile. > > > There needs to be a run-time means of working with older versions of > > the library. > > > > Consider checking the sizeof some of the structs? IIRC, some of the > > structs sizes changed in the library. See e.g., the 'size' field of > > perf_event_attr: > > > > size > > The size of the perf_event_attr structure for forward/backward > > compatibility. Set this using sizeof(struct perf_event_attr) > > to allow the kernel to see the struct size at the time > > of compilation. > > > > or, likely better, the 'version' and 'compat_version' of the > > perf_event_mmap_page structure: > > > > struct perf_event_mmap_page { > > __u32 version; /* version number of this structure */ > > __u32 compat_version; /* lowest version this is compat with */ > > ... > > > >> will only be a handful of machines affected and it's trivial to upgrade > >> them (the new Debian packages are available). > > This is upstream linux, so I don't know how you know only a 'handful' > > of machines affected, and I wouldn't assume everyone's using Debian. > > > > For one, I'd hate to see a single user affected if it isn't necessary, > > as is in this case - not everyone wants A32/T32 ISA support, and > > library compatibility needn't be broken. > > > > This 'screw compatibility' mentality needs to be dropped *now* if > > CoreSight support is to have a successful future. > > > > Otherwise, I suggest keeping this feature in downstream trees for the > > 'handful', until the library and perf code are rewritten in a state > > where they properly interoperate, and do not break each other. > > > >> How long would we > >> continue to support such an older version? > > What do you mean such an older version? The project's v0.9.0 commit > > was on 20 June 2018, the one that's usable - v0.9.1 - has a July 27 > > 2018 commit date! One month is *not* *old*! > I mean the 0.8.x version as the old version. > >> I also don't see any > >> precedent for supporting multiple dependent library versions in perf. > > That's because perf doesn't have a precedent on depending on libraries > > that flat-out break their own users compatibility across versions ;) > This patch picks up a new feature that's been added - I notice the > feature detection checks for other libraries check a number of features > and emit warnings about required versions. > > > Thanks, > > > > Kim > > Regards > > Rob > >
On Fri, 31 Aug 2018 at 08:43, Kim Phillips <kim.phillips@arm.com> wrote: > > On Fri, 31 Aug 2018 14:42:00 +0100 > Robert Walker <robert.walker@arm.com> wrote: > > > Generally, I agree with you about breaking backward compatibility, but > > in this case I don't think there is an actual problem. As I understand > > I consider it a serious problem. > > > it, you're worried that perf will break for people who are using an > > older version (0.8.x) of the OpenCSD library for CoreSight trace decode > > and this patch updates the requirement to a newer version (0.9.x) to > > enable support for trace of 32-bit applications. > > My problem is: every time a new feature is added, it shouldn't > force base CoreSight decode functionality to require a new version of > the library. > > My second problem is: this patch implements a build-time requirement, > which is insufficient for running on machines with incompatible > versions of the library. While it is not realistic to expect eternal backward compatibility for core features, I think in this case we can do better. Looking at the code in the patch it seems possible to implement the new functionality in functions that can be enabled based on the library version we compile against. If the library version is not high enough, the functions simply get stubbed out and the feature isn't available. A prerequisite step would be to move openCSD's "decoder/include/ocsd_if_version.h" under "decoder/include/opencsd/ocsd_if_version.h" in order to make symbols OCSD_VER_{ MAJOR | MINOW | PATH } accessible but that's not a big deal. Please try this approach first - we can consider more drastic measures if that doesn't work. > > > There are only a few (4/5?) targets around with working support for > > CoreSight trace (and of these only Juno is the only platform with a > > devicetree in the mainline kernel), so only a few users of perf for Arm > > trace decode - most of these are people working those directly involved > > with Arm & Linaro or will be reading the coresight mailing list. Anyone > > Great, then share this feature with them, but don't send a patch to > break upstream, which, in turn, goes back to many things downstream > (future distro releases on newer targets, etc.). > > > working with OpenCSD will have got it from github and compiled it > > themselves, so they can update and build a new version. It's only been > > No. Updating the library - no matter where one gets it from - shouldn't > have to be necessary to avoid perf build regressions. > > > packaged for debian so far and testing already has the 0.9.x version > > (the 0.8.x version was only in debian for 8 days before being replaced > > by 0.9.x). > > What Debian does is immaterial to upstream submissions. > > How is Debian testing the library, btw? Functionality that worked in > 0.8 will fail in 0.9 AFAICT. > > > It would be possible to add conditional compilation flags to support > > compiling with 0.8.x, but I feel this would add too much mess to the > > code and I'd need some help in figuring out perf's feature detection > > system to generate the flags. > > No, we need run-time compatibility. Build-time compatibility does not > satisfy the run-time requirements in this case. > > > Given the likely small number of people > > affected and the easy upgrade path, I don't think this is worth it. > > This is an upstream submission, and I wouldn't like for a single person > to ever have to experience such silently deceitful bugs. > > For now, share the new feature in a personal git tree, for those that > need it. > > Meanwhile, the library needs to be fixed with a run-time version > compatibility API ASAP. > > Thanks, > > Kim > > > On 29/08/18 17:32, Kim Phillips wrote: > > > On Wed, 29 Aug 2018 15:34:16 +0100 > > > Robert Walker <robert.walker@arm.com> wrote: > > > > > >> Hi Kim, > > > Hi Robert, > > > > > >> On 29/08/18 14:49, Kim Phillips wrote: > > >>> On Wed, 29 Aug 2018 10:44:23 +0100 > > >>> Robert Walker <robert.walker@arm.com> wrote: > > >>> > > >>>> This patch adds support for generating instruction samples from trace of > > >>>> AArch32 programs using the A32 and T32 instruction sets. > > >>>> > > >>>> T32 has variable 2 or 4 byte instruction size, so the conversion between > > >>>> addresses and instruction counts requires extra information from the trace > > >>>> decoder, requiring version 0.9.1 of OpenCSD. A check for the new struct > > >>>> member has been added to the feature check for OpenCSD. > > >>>> > > >>>> Signed-off-by: Robert Walker <robert.walker@arm.com> > > >>>> --- > > >>> ... > > >>>> +++ b/tools/build/feature/test-libopencsd.c > > >>>> @@ -3,6 +3,13 @@ > > >>>> > > >>>> int main(void) > > >>>> { > > >>>> + /* > > >>>> + * Requires ocsd_generic_trace_elem.num_instr_range introduced in > > >>>> + * OpenCSD 0.9 > > >>> 0.9 != 0.9.1 in the above commit text: which is it? > > >> I'll change it to 0.9.1 if there's another version of the patch (it was > > >> introduced in 0.9, but 0.9.1 has a necessary bug fix) > > >>>> + */ > > >>>> + ocsd_generic_trace_elem elem; > > >>>> + (void)elem.num_instr_range; > > >>>> + > > >>> This breaks building against older versions of OpenCSD, right? > > >>> > > >>>> (void)ocsd_get_version(); > > >>> Why don't we maintain building against older versions of the library, > > >>> and use the version information to make the decision on whether to use > > >>> the new feature being introduced in this patch? > > >> The intention is to fail the feature detection check if the older > > >> version is installed - perf will still compile, but without the > > >> CoreSight trace support. > > > It should still compile, and with CoreSight trace support, just > > > not support for A32/T32 instruction sets. The user shouldn't be denied > > > CoreSight trace support if they don't care for A32/T32 ISA support. > > > > > >> OpenCSD is still in development, so new features like this are being > > >> added and it would add a lot of #ifdef mess to the perf code to continue > > >> to support any machines with the old library version installed - there > > > Even adding #ifdefs, that won't survive taking one perf executable > > > built on one machine and running it on another machine with a different > > > version of the OpenCSD library: it'll break inconspicuously, not > > > gracefully! > > > > perf has a lot of other shared library dependencies (ELF , unwind > > libraries etc), so moving builds between systems is already fragile. > > > > > There needs to be a run-time means of working with older versions of > > > the library. > > > > > > Consider checking the sizeof some of the structs? IIRC, some of the > > > structs sizes changed in the library. See e.g., the 'size' field of > > > perf_event_attr: > > > > > > size > > > The size of the perf_event_attr structure for forward/backward > > > compatibility. Set this using sizeof(struct perf_event_attr) > > > to allow the kernel to see the struct size at the time > > > of compilation. > > > > > > or, likely better, the 'version' and 'compat_version' of the > > > perf_event_mmap_page structure: > > > > > > struct perf_event_mmap_page { > > > __u32 version; /* version number of this structure */ > > > __u32 compat_version; /* lowest version this is compat with */ > > > ... > > > > > >> will only be a handful of machines affected and it's trivial to upgrade > > >> them (the new Debian packages are available). > > > This is upstream linux, so I don't know how you know only a 'handful' > > > of machines affected, and I wouldn't assume everyone's using Debian. > > > > > > For one, I'd hate to see a single user affected if it isn't necessary, > > > as is in this case - not everyone wants A32/T32 ISA support, and > > > library compatibility needn't be broken. > > > > > > This 'screw compatibility' mentality needs to be dropped *now* if > > > CoreSight support is to have a successful future. > > > > > > Otherwise, I suggest keeping this feature in downstream trees for the > > > 'handful', until the library and perf code are rewritten in a state > > > where they properly interoperate, and do not break each other. > > > > > >> How long would we > > >> continue to support such an older version? > > > What do you mean such an older version? The project's v0.9.0 commit > > > was on 20 June 2018, the one that's usable - v0.9.1 - has a July 27 > > > 2018 commit date! One month is *not* *old*! > > I mean the 0.8.x version as the old version. > > >> I also don't see any > > >> precedent for supporting multiple dependent library versions in perf. > > > That's because perf doesn't have a precedent on depending on libraries > > > that flat-out break their own users compatibility across versions ;) > > This patch picks up a new feature that's been added - I notice the > > feature detection checks for other libraries check a number of features > > and emit warnings about required versions. > > > > > Thanks, > > > > > > Kim > > > > Regards > > > > Rob > > > >
diff --git a/tools/build/feature/test-libopencsd.c b/tools/build/feature/test-libopencsd.c index 5ff1246..d96b2df 100644 --- a/tools/build/feature/test-libopencsd.c +++ b/tools/build/feature/test-libopencsd.c @@ -3,6 +3,13 @@ int main(void) { + /* + * Requires ocsd_generic_trace_elem.num_instr_range introduced in + * OpenCSD 0.9 + */ + ocsd_generic_trace_elem elem; + (void)elem.num_instr_range; + (void)ocsd_get_version(); return 0; } diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 938def6..73d8384 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -263,9 +263,12 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) decoder->tail = 0; decoder->packet_count = 0; for (i = 0; i < MAX_BUFFER; i++) { + decoder->packet_buffer[i].isa = CS_ETM_ISA_UNKNOWN; decoder->packet_buffer[i].start_addr = CS_ETM_INVAL_ADDR; decoder->packet_buffer[i].end_addr = CS_ETM_INVAL_ADDR; + decoder->packet_buffer[i].instr_count = 0; decoder->packet_buffer[i].last_instr_taken_branch = false; + decoder->packet_buffer[i].last_instr_size = 0; decoder->packet_buffer[i].exc = false; decoder->packet_buffer[i].exc_ret = false; decoder->packet_buffer[i].cpu = INT_MIN; @@ -294,11 +297,13 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder, decoder->packet_count++; decoder->packet_buffer[et].sample_type = sample_type; + decoder->packet_buffer[et].isa = CS_ETM_ISA_UNKNOWN; decoder->packet_buffer[et].exc = false; decoder->packet_buffer[et].exc_ret = false; decoder->packet_buffer[et].cpu = *((int *)inode->priv); decoder->packet_buffer[et].start_addr = CS_ETM_INVAL_ADDR; decoder->packet_buffer[et].end_addr = CS_ETM_INVAL_ADDR; + decoder->packet_buffer[et].instr_count = 0; if (decoder->packet_count == MAX_BUFFER - 1) return OCSD_RESP_WAIT; @@ -321,8 +326,28 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder, packet = &decoder->packet_buffer[decoder->tail]; + switch (elem->isa) { + case ocsd_isa_aarch64: + packet->isa = CS_ETM_ISA_A64; + break; + case ocsd_isa_arm: + packet->isa = CS_ETM_ISA_A32; + break; + case ocsd_isa_thumb2: + packet->isa = CS_ETM_ISA_T32; + break; + case ocsd_isa_tee: + case ocsd_isa_jazelle: + case ocsd_isa_custom: + case ocsd_isa_unknown: + default: + packet->isa = CS_ETM_ISA_UNKNOWN; + } + packet->start_addr = elem->st_addr; packet->end_addr = elem->en_addr; + packet->instr_count = elem->num_instr_range; + switch (elem->last_i_type) { case OCSD_INSTR_BR: case OCSD_INSTR_BR_INDIRECT: @@ -336,6 +361,8 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder, break; } + packet->last_instr_size = elem->last_instr_sz; + return ret; } diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index 612b575..9351bd1 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -28,11 +28,21 @@ enum cs_etm_sample_type { CS_ETM_TRACE_ON = 1 << 1, }; +enum cs_etm_isa { + CS_ETM_ISA_UNKNOWN, + CS_ETM_ISA_A64, + CS_ETM_ISA_A32, + CS_ETM_ISA_T32, +}; + struct cs_etm_packet { enum cs_etm_sample_type sample_type; + enum cs_etm_isa isa; u64 start_addr; u64 end_addr; + u32 instr_count; u8 last_instr_taken_branch; + u8 last_instr_size; u8 exc; u8 exc_ret; int cpu; diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 2ae6402..fcaa73f 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -31,14 +31,6 @@ #define MAX_TIMESTAMP (~0ULL) -/* - * A64 instructions are always 4 bytes - * - * Only A64 is supported, so can use this constant for converting between - * addresses and instruction counts, calculting offsets etc - */ -#define A64_INSTR_SIZE 4 - struct cs_etm_auxtrace { struct auxtrace auxtrace; struct auxtrace_queues queues; @@ -492,21 +484,16 @@ static inline void cs_etm__reset_last_branch_rb(struct cs_etm_queue *etmq) etmq->last_branch_rb->nr = 0; } -static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet) -{ - /* Returns 0 for the CS_ETM_TRACE_ON packet */ - if (packet->sample_type == CS_ETM_TRACE_ON) - return 0; +static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq, + u64 addr) { + u8 instrBytes[2]; - /* - * The packet records the execution range with an exclusive end address - * - * A64 instructions are constant size, so the last executed - * instruction is A64_INSTR_SIZE before the end address - * Will need to do instruction level decode for T32 instructions as - * they can be variable size (not yet supported). + cs_etm__mem_access(etmq, addr, ARRAY_SIZE(instrBytes), instrBytes); + /* T32 instruction size is indicated by bits[15:11] of the first + * 16-bit word of the instruction: 0b11101, 0b11110 and 0b11111 + * denote a 32-bit instruction. */ - return packet->end_addr - A64_INSTR_SIZE; + return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2; } static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) @@ -518,27 +505,32 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet) return packet->start_addr; } -static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet) +static inline +u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet) { - /* - * Only A64 instructions are currently supported, so can get - * instruction count by dividing. - * Will need to do instruction level decode for T32 instructions as - * they can be variable size (not yet supported). - */ - return (packet->end_addr - packet->start_addr) / A64_INSTR_SIZE; + /* Returns 0 for the CS_ETM_TRACE_ON packet */ + if (packet->sample_type == CS_ETM_TRACE_ON) + return 0; + + return packet->end_addr - packet->last_instr_size; } -static inline u64 cs_etm__instr_addr(const struct cs_etm_packet *packet, +static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq, + const struct cs_etm_packet *packet, u64 offset) { - /* - * Only A64 instructions are currently supported, so can get - * instruction address by muliplying. - * Will need to do instruction level decode for T32 instructions as - * they can be variable size (not yet supported). - */ - return packet->start_addr + offset * A64_INSTR_SIZE; + if (packet->isa == CS_ETM_ISA_T32) { + u64 addr = packet->start_addr; + + while (offset > 0) { + addr += cs_etm__t32_instr_size(etmq, addr); + offset--; + } + return addr; + } + + /* Assume a 4 byte instruction size (A32/A64) */ + return packet->start_addr + offset * 4; } static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq) @@ -867,9 +859,8 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) struct cs_etm_auxtrace *etm = etmq->etm; struct cs_etm_packet *tmp; int ret; - u64 instrs_executed; + u64 instrs_executed = etmq->packet->instr_count; - instrs_executed = cs_etm__instr_count(etmq->packet); etmq->period_instructions += instrs_executed; /* @@ -899,7 +890,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) * executed, but PC has not advanced to next instruction) */ u64 offset = (instrs_executed - instrs_over - 1); - u64 addr = cs_etm__instr_addr(etmq->packet, offset); + u64 addr = cs_etm__instr_addr(etmq, etmq->packet, offset); ret = cs_etm__synth_instruction_sample( etmq, addr, etm->instructions_sample_period);
This patch adds support for generating instruction samples from trace of AArch32 programs using the A32 and T32 instruction sets. T32 has variable 2 or 4 byte instruction size, so the conversion between addresses and instruction counts requires extra information from the trace decoder, requiring version 0.9.1 of OpenCSD. A check for the new struct member has been added to the feature check for OpenCSD. Signed-off-by: Robert Walker <robert.walker@arm.com> --- v2: Minor fixes following review comments from Mathieu Rebased on v4.19-rc1 tools/build/feature/test-libopencsd.c | 7 +++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 27 ++++++++++ tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 10 ++++ tools/perf/util/cs-etm.c | 71 +++++++++++-------------- 4 files changed, 75 insertions(+), 40 deletions(-)