Message ID | 20231017013235.27831-2-xueshuai@linux.alibaba.com (mailing list archive) |
---|---|
State | Superseded |
Series | drivers/perf: add Synopsys DesignWare PCIe PMU driver support |
On Tue, 17 Oct 2023 09:32:32 +0800 Shuai Xue <xueshuai@linux.alibaba.com> wrote: > Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe > controller which implements which implements PMU for performance and > functional debugging to facilitate system maintenance. > > Document it to provide guidance on how to use it. > > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> > Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> A few minor things inline and one question that I'd like a comment on for my understanding at least! (why not multiply the counter by 16 and make the maths simpler?) With those tidied up, Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Thanks, Jonathan > --- > .../admin-guide/perf/dwc_pcie_pmu.rst | 94 +++++++++++++++++++ > Documentation/admin-guide/perf/index.rst | 1 + > 2 files changed, 95 insertions(+) > create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst > > diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst > new file mode 100644 > index 000000000000..eac1b6f36450 > --- /dev/null > +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst > @@ -0,0 +1,94 @@ > +====================================================================== > +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU) > +====================================================================== > + > +DesignWare Cores (DWC) PCIe PMU > +=============================== > + > +The PMU is a PCIe configuration space register block provided by each PCIe Root > +Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error > +injection, and Statistics). > + > +As the name indicates, the RAS DES capability supports system level > +debugging, AER error injection, and collection of statistics. 
To facilitate > +collection of statistics, Synopsys DesignWare Cores PCIe controller > +provides the following two features: > + > +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and > + time spent in each low-power LTSSM state) and > +- one 32-bit counter for Event Counting (error and non-error events for > + a specified lane) > + > +Note: There is no interrupt for counter overflow. > + > +Time Based Analysis > +------------------- > + > +Using this feature you can obtain information regarding RX/TX data > +throughput and time spent in each low-power LTSSM state by the controller. > +The PMU measures data in two categories: > + > +- Group#0: Percentage of time the controller stays in LTSSM states. > +- Group#1: Amount of data processed (Units of 16 bytes). > + > +Lane Event counters > +------------------- > + > +Using this feature you can obtain Error and Non-Error information in > +specific lane by the controller. The PMU event is select by: > + > +- Group i > +- Event j within the Group i > +- and Lane k The and here is a little confusing. I'd rework as The PMU event is selected by all of: - Group i - Event j within the Group i - Lane k > + > +Some of the event only exist for specific configurations. events > + > +DesignWare Cores (DWC) PCIe PMU Driver > +======================================= > + > +This driver adds PMU devices for each PCIe Root Port named based on the BDF of > +the Root Port. For example, > + > + 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) > + > +the PMU device name for this Root Port is dwc_rootport_3018. > + > +The DWC PCIe PMU driver registers a perf PMU driver, which provides > +description of available events and configuration options in sysfs, see > +/sys/bus/event_source/devices/dwc_rootport_{bdf}. > + > +The "format" directory describes format of the config fields of the > +perf_event_attr structure. The "events" directory provides configuration > +templates for all documented events. 
For example, > +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1". > + > +The "perf list" command shall list the available events from sysfs, e.g.:: > + > + $# perf list | grep dwc_rootport > + <...> > + dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] > + <...> > + dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] > + > +Time Based Analysis Event Usage > +------------------------------- > + > +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes):: > + > + $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ > + > +The average RX/TX bandwidth can be calculated using the following formula: > + > + PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window > + PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window Silly question (sorry I didn't raise it earlier) but can we make the interface more intuitive by just multiplying the counter value at point of read by 16? > + > +Lane Event Usage > +------------------------------- > + > +Each lane has the same event set and to avoid generating a list of hundreds > +of events, the user need to specify the lane ID explicitly, e.g.:: > + > + $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/ > + > +The driver does not support sampling, therefore "perf record" will not > +work. Per-task (without "-a") perf sessions are not supported. > diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst > index f60be04e4e33..6bc7739fddb5 100644 > --- a/Documentation/admin-guide/perf/index.rst > +++ b/Documentation/admin-guide/perf/index.rst > @@ -19,6 +19,7 @@ Performance monitor support > arm_dsu_pmu > thunderx2-pmu > alibaba_pmu > + dwc_pcie_pmu > nvidia-pmu > meson-ddr-pmu > cxl
On 2023/10/17 17:16, Jonathan Cameron wrote: > On Tue, 17 Oct 2023 09:32:32 +0800 > Shuai Xue <xueshuai@linux.alibaba.com> wrote: > >> Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe >> controller which implements which implements PMU for performance and >> functional debugging to facilitate system maintenance. >> >> Document it to provide guidance on how to use it. >> >> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> >> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> > > A few minor things inline and one question that I'd like a comment on > for my understanding at least! (why not multiply the counter by 16 and > make the maths simpler?) > > With those tidied up, > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Thank you for providing prompt feedback and valuable comments to me. (please also see my replies inline) Best Regards, Shuai > > >> --- >> .../admin-guide/perf/dwc_pcie_pmu.rst | 94 +++++++++++++++++++ >> Documentation/admin-guide/perf/index.rst | 1 + >> 2 files changed, 95 insertions(+) >> create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst >> >> diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst >> new file mode 100644 >> index 000000000000..eac1b6f36450 >> --- /dev/null >> +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst >> @@ -0,0 +1,94 @@ >> +====================================================================== >> +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU) >> +====================================================================== >> + >> +DesignWare Cores (DWC) PCIe PMU >> +=============================== >> + >> +The PMU is a PCIe configuration space register block provided by each PCIe Root >> +Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error >> +injection, and Statistics). 
>> + >> +As the name indicates, the RAS DES capability supports system level >> +debugging, AER error injection, and collection of statistics. To facilitate >> +collection of statistics, Synopsys DesignWare Cores PCIe controller >> +provides the following two features: >> + >> +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and >> + time spent in each low-power LTSSM state) and >> +- one 32-bit counter for Event Counting (error and non-error events for >> + a specified lane) >> + >> +Note: There is no interrupt for counter overflow. >> + >> +Time Based Analysis >> +------------------- >> + >> +Using this feature you can obtain information regarding RX/TX data >> +throughput and time spent in each low-power LTSSM state by the controller. >> +The PMU measures data in two categories: >> + >> +- Group#0: Percentage of time the controller stays in LTSSM states. >> +- Group#1: Amount of data processed (Units of 16 bytes). >> + >> +Lane Event counters >> +------------------- >> + >> +Using this feature you can obtain Error and Non-Error information in >> +specific lane by the controller. The PMU event is select by: >> + >> +- Group i >> +- Event j within the Group i >> +- and Lane k > The and here is a little confusing. I'd rework as > The PMU event is selected by all of: > - Group i > - Event j within the Group i > - Lane k Will rework it in next version. > >> + >> +Some of the event only exist for specific configurations. > > events Sorry for typo, will fix it. > >> + >> +DesignWare Cores (DWC) PCIe PMU Driver >> +======================================= >> + >> +This driver adds PMU devices for each PCIe Root Port named based on the BDF of >> +the Root Port. For example, >> + >> + 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) >> + >> +the PMU device name for this Root Port is dwc_rootport_3018. 
>> + >> +The DWC PCIe PMU driver registers a perf PMU driver, which provides >> +description of available events and configuration options in sysfs, see >> +/sys/bus/event_source/devices/dwc_rootport_{bdf}. >> + >> +The "format" directory describes format of the config fields of the >> +perf_event_attr structure. The "events" directory provides configuration >> +templates for all documented events. For example, >> +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1". >> + >> +The "perf list" command shall list the available events from sysfs, e.g.:: >> + >> + $# perf list | grep dwc_rootport >> + <...> >> + dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] >> + <...> >> + dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] >> + >> +Time Based Analysis Event Usage >> +------------------------------- >> + >> +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes):: >> + >> + $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ >> + >> +The average RX/TX bandwidth can be calculated using the following formula: >> + >> + PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window >> + PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window > > Silly question (sorry I didn't raise it earlier) but can we make the interface > more intuitive by just multiplying the counter value at point of read by 16? Really a good suggestion, and it is very convenient for end perf users. But the unit of 16 is only applied to group#1 as described in Time Based Analysis section. So I prefer to left the unit part to end users.
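For reference, the bandwidth formula quoted above can be worked through with a minimal Python sketch. It assumes the interface as posted in this version of the patch, where the Group#1 counter is reported raw in units of 16 bytes. The function name and the sample numbers are hypothetical, not measured values.

```python
def pcie_bandwidth_bytes_per_sec(raw_count: int, window_seconds: float) -> float:
    """Average bandwidth from a raw Group#1 counter value.

    Assumes the counter is in units of 16 bytes, as the patch documents:
    Bandwidth = PCIE_DATA * 16B / Measure_Time_Window
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return raw_count * 16 / window_seconds


# Hypothetical numbers: 1,000,000 counter units over a 2-second window.
print(pcie_bandwidth_bytes_per_sec(1_000_000, 2.0))  # 8000000.0
```

The same function applies to RX and TX, since the two formulas differ only in which counter is read.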
On 2023/10/17 9:32, Shuai Xue wrote: > Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe > controller which implements which implements PMU for performance and > functional debugging to facilitate system maintenance. Double "which implements"? > > Document it to provide guidance on how to use it. > > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> > Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Others look good to me. Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> > --- > .../admin-guide/perf/dwc_pcie_pmu.rst | 94 +++++++++++++++++++ > Documentation/admin-guide/perf/index.rst | 1 + > 2 files changed, 95 insertions(+) > create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst > > diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst > new file mode 100644 > index 000000000000..eac1b6f36450 > --- /dev/null > +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst > @@ -0,0 +1,94 @@ > +====================================================================== > +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU) > +====================================================================== > + > +DesignWare Cores (DWC) PCIe PMU > +=============================== > + > +The PMU is a PCIe configuration space register block provided by each PCIe Root > +Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error > +injection, and Statistics). > + > +As the name indicates, the RAS DES capability supports system level > +debugging, AER error injection, and collection of statistics. 
To facilitate > +collection of statistics, Synopsys DesignWare Cores PCIe controller > +provides the following two features: > + > +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and > + time spent in each low-power LTSSM state) and > +- one 32-bit counter for Event Counting (error and non-error events for > + a specified lane) > + > +Note: There is no interrupt for counter overflow. > + > +Time Based Analysis > +------------------- > + > +Using this feature you can obtain information regarding RX/TX data > +throughput and time spent in each low-power LTSSM state by the controller. > +The PMU measures data in two categories: > + > +- Group#0: Percentage of time the controller stays in LTSSM states. > +- Group#1: Amount of data processed (Units of 16 bytes). > + > +Lane Event counters > +------------------- > + > +Using this feature you can obtain Error and Non-Error information in > +specific lane by the controller. The PMU event is select by: > + > +- Group i > +- Event j within the Group i > +- and Lane k > + > +Some of the event only exist for specific configurations. > + > +DesignWare Cores (DWC) PCIe PMU Driver > +======================================= > + > +This driver adds PMU devices for each PCIe Root Port named based on the BDF of > +the Root Port. For example, > + > + 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) > + > +the PMU device name for this Root Port is dwc_rootport_3018. > + > +The DWC PCIe PMU driver registers a perf PMU driver, which provides > +description of available events and configuration options in sysfs, see > +/sys/bus/event_source/devices/dwc_rootport_{bdf}. > + > +The "format" directory describes format of the config fields of the > +perf_event_attr structure. The "events" directory provides configuration > +templates for all documented events. For example, > +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1". 
> + > +The "perf list" command shall list the available events from sysfs, e.g.:: > + > + $# perf list | grep dwc_rootport > + <...> > + dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] > + <...> > + dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] > + > +Time Based Analysis Event Usage > +------------------------------- > + > +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes):: > + > + $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ > + > +The average RX/TX bandwidth can be calculated using the following formula: > + > + PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window > + PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window > + > +Lane Event Usage > +------------------------------- > + > +Each lane has the same event set and to avoid generating a list of hundreds > +of events, the user need to specify the lane ID explicitly, e.g.:: > + > + $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/ > + > +The driver does not support sampling, therefore "perf record" will not > +work. Per-task (without "-a") perf sessions are not supported. > diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst > index f60be04e4e33..6bc7739fddb5 100644 > --- a/Documentation/admin-guide/perf/index.rst > +++ b/Documentation/admin-guide/perf/index.rst > @@ -19,6 +19,7 @@ Performance monitor support > arm_dsu_pmu > thunderx2-pmu > alibaba_pmu > + dwc_pcie_pmu > nvidia-pmu > meson-ddr-pmu > cxl >
On Wed, 18 Oct 2023 09:19:51 +0800 Shuai Xue <xueshuai@linux.alibaba.com> wrote: > On 2023/10/17 17:16, Jonathan Cameron wrote: > > On Tue, 17 Oct 2023 09:32:32 +0800 > > Shuai Xue <xueshuai@linux.alibaba.com> wrote: > > > >> Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe > >> controller which implements which implements PMU for performance and > >> functional debugging to facilitate system maintenance. > >> > >> Document it to provide guidance on how to use it. > >> > >> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> > >> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> > > > > A few minor things inline and one question that I'd like a comment on > > for my understanding at least! (why not multiply the counter by 16 and > > make the maths simpler?) > > > > With those tidied up, > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > Thank you for providing prompt feedback and valuable comments to me. > (please also see my replies inline) > > Best Regards, > Shuai > > > > > > >> --- > >> .../admin-guide/perf/dwc_pcie_pmu.rst | 94 +++++++++++++++++++ > >> Documentation/admin-guide/perf/index.rst | 1 + > >> 2 files changed, 95 insertions(+) > >> create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst > >> > >> diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst > >> new file mode 100644 > >> index 000000000000..eac1b6f36450 > >> --- /dev/null > >> +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst > >> @@ -0,0 +1,94 @@ > >> +====================================================================== > >> +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU) > >> +====================================================================== > >> + > >> +DesignWare Cores (DWC) PCIe PMU > >> +=============================== > >> + > >> +The PMU is a PCIe configuration space register block provided by each PCIe Root > >> +Port in a 
Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error > >> +injection, and Statistics). > >> + > >> +As the name indicates, the RAS DES capability supports system level > >> +debugging, AER error injection, and collection of statistics. To facilitate > >> +collection of statistics, Synopsys DesignWare Cores PCIe controller > >> +provides the following two features: > >> + > >> +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and > >> + time spent in each low-power LTSSM state) and > >> +- one 32-bit counter for Event Counting (error and non-error events for > >> + a specified lane) > >> + > >> +Note: There is no interrupt for counter overflow. > >> + > >> +Time Based Analysis > >> +------------------- > >> + > >> +Using this feature you can obtain information regarding RX/TX data > >> +throughput and time spent in each low-power LTSSM state by the controller. > >> +The PMU measures data in two categories: > >> + > >> +- Group#0: Percentage of time the controller stays in LTSSM states. > >> +- Group#1: Amount of data processed (Units of 16 bytes). > >> + > >> +Lane Event counters > >> +------------------- > >> + > >> +Using this feature you can obtain Error and Non-Error information in > >> +specific lane by the controller. The PMU event is select by: > >> + > >> +- Group i > >> +- Event j within the Group i > >> +- and Lane k > > The and here is a little confusing. I'd rework as > > The PMU event is selected by all of: > > - Group i > > - Event j within the Group i > > - Lane k > > Will rework it in next version. > > > > >> + > >> +Some of the event only exist for specific configurations. > > > > events > > Sorry for typo, will fix it. > > > > >> + > >> +DesignWare Cores (DWC) PCIe PMU Driver > >> +======================================= > >> + > >> +This driver adds PMU devices for each PCIe Root Port named based on the BDF of > >> +the Root Port. 
For example, > >> + > >> + 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) > >> + > >> +the PMU device name for this Root Port is dwc_rootport_3018. > >> + > >> +The DWC PCIe PMU driver registers a perf PMU driver, which provides > >> +description of available events and configuration options in sysfs, see > >> +/sys/bus/event_source/devices/dwc_rootport_{bdf}. > >> + > >> +The "format" directory describes format of the config fields of the > >> +perf_event_attr structure. The "events" directory provides configuration > >> +templates for all documented events. For example, > >> +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1". > >> + > >> +The "perf list" command shall list the available events from sysfs, e.g.:: > >> + > >> + $# perf list | grep dwc_rootport > >> + <...> > >> + dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] > >> + <...> > >> + dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] > >> + > >> +Time Based Analysis Event Usage > >> +------------------------------- > >> + > >> +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes):: > >> + > >> + $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ > >> + > >> +The average RX/TX bandwidth can be calculated using the following formula: > >> + > >> + PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window > >> + PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window > > > > Silly question (sorry I didn't raise it earlier) but can we make the interface > > more intuitive by just multiplying the counter value at point of read by 16? > > Really a good suggestion, and it is very convenient for end perf users. > But the unit of 16 is only applied to group#1 as described in Time Based Analysis > section. How hard would it be to just apply it to those events? Userspace doesn't care what the hardware does underneath - it just wants to get moderately intuitive data back. 
Having the end user deal with this oddity + even the need to document it seems to me to be unnecessary burden given how simple it is (I assume) to remove the oddity. > > So I prefer to left the unit part to end users. >
On 2023/10/19 15:35, Yicong Yang wrote: > On 2023/10/17 9:32, Shuai Xue wrote: >> Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe >> controller which implements which implements PMU for performance and >> functional debugging to facilitate system maintenance. > > Double "which implements"? Sorry for the typo, will fix it. > >> >> Document it to provide guidance on how to use it. >> >> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> >> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> > > Others look good to me. > > Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> > Thank you for valuable comments :) Best Regards Shuai >> --- >> .../admin-guide/perf/dwc_pcie_pmu.rst | 94 +++++++++++++++++++ >> Documentation/admin-guide/perf/index.rst | 1 + >> 2 files changed, 95 insertions(+) >> create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst >> >> diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst >> new file mode 100644 >> index 000000000000..eac1b6f36450 >> --- /dev/null >> +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst >> @@ -0,0 +1,94 @@ >> +====================================================================== >> +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU) >> +====================================================================== >> + >> +DesignWare Cores (DWC) PCIe PMU >> +=============================== >> + >> +The PMU is a PCIe configuration space register block provided by each PCIe Root >> +Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error >> +injection, and Statistics). >> + >> +As the name indicates, the RAS DES capability supports system level >> +debugging, AER error injection, and collection of statistics. 
To facilitate >> +collection of statistics, Synopsys DesignWare Cores PCIe controller >> +provides the following two features: >> + >> +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and >> + time spent in each low-power LTSSM state) and >> +- one 32-bit counter for Event Counting (error and non-error events for >> + a specified lane) >> + >> +Note: There is no interrupt for counter overflow. >> + >> +Time Based Analysis >> +------------------- >> + >> +Using this feature you can obtain information regarding RX/TX data >> +throughput and time spent in each low-power LTSSM state by the controller. >> +The PMU measures data in two categories: >> + >> +- Group#0: Percentage of time the controller stays in LTSSM states. >> +- Group#1: Amount of data processed (Units of 16 bytes). >> + >> +Lane Event counters >> +------------------- >> + >> +Using this feature you can obtain Error and Non-Error information in >> +specific lane by the controller. The PMU event is select by: >> + >> +- Group i >> +- Event j within the Group i >> +- and Lane k >> + >> +Some of the event only exist for specific configurations. >> + >> +DesignWare Cores (DWC) PCIe PMU Driver >> +======================================= >> + >> +This driver adds PMU devices for each PCIe Root Port named based on the BDF of >> +the Root Port. For example, >> + >> + 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) >> + >> +the PMU device name for this Root Port is dwc_rootport_3018. >> + >> +The DWC PCIe PMU driver registers a perf PMU driver, which provides >> +description of available events and configuration options in sysfs, see >> +/sys/bus/event_source/devices/dwc_rootport_{bdf}. >> + >> +The "format" directory describes format of the config fields of the >> +perf_event_attr structure. The "events" directory provides configuration >> +templates for all documented events. For example, >> +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1". 
>> + >> +The "perf list" command shall list the available events from sysfs, e.g.:: >> + >> + $# perf list | grep dwc_rootport >> + <...> >> + dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] >> + <...> >> + dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] >> + >> +Time Based Analysis Event Usage >> +------------------------------- >> + >> +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes):: >> + >> + $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ >> + >> +The average RX/TX bandwidth can be calculated using the following formula: >> + >> + PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window >> + PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window >> + >> +Lane Event Usage >> +------------------------------- >> + >> +Each lane has the same event set and to avoid generating a list of hundreds >> +of events, the user need to specify the lane ID explicitly, e.g.:: >> + >> + $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/ >> + >> +The driver does not support sampling, therefore "perf record" will not >> +work. Per-task (without "-a") perf sessions are not supported. >> diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst >> index f60be04e4e33..6bc7739fddb5 100644 >> --- a/Documentation/admin-guide/perf/index.rst >> +++ b/Documentation/admin-guide/perf/index.rst >> @@ -19,6 +19,7 @@ Performance monitor support >> arm_dsu_pmu >> thunderx2-pmu >> alibaba_pmu >> + dwc_pcie_pmu >> nvidia-pmu >> meson-ddr-pmu >> cxl >>
On 2023/10/19 19:06, Jonathan Cameron wrote:
...
>>>> +
>>>> +The DWC PCIe PMU driver registers a perf PMU driver, which provides
>>>> +description of available events and configuration options in sysfs, see
>>>> +/sys/bus/event_source/devices/dwc_rootport_{bdf}.
>>>> +
>>>> +The "format" directory describes format of the config fields of the
>>>> +perf_event_attr structure. The "events" directory provides configuration
>>>> +templates for all documented events. For example,
>>>> +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1".
>>>> +
>>>> +The "perf list" command shall list the available events from sysfs, e.g.::
>>>> +
>>>> +    $# perf list | grep dwc_rootport
>>>> +    <...>
>>>> +    dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
>>>> +    <...>
>>>> +    dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]
>>>> +
>>>> +Time Based Analysis Event Usage
>>>> +-------------------------------
>>>> +
>>>> +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes)::
>>>> +
>>>> +    $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
>>>> +
>>>> +The average RX/TX bandwidth can be calculated using the following formula:
>>>> +
>>>> +    PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window
>>>> +    PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window
>>>
>>> Silly question (sorry I didn't raise it earlier) but can we make the interface
>>> more intuitive by just multiplying the counter value at point of read by 16?
>>
>> Really a good suggestion, and it is very convenient for end perf users.
>> But the unit of 16 is only applied to group#1 as described in Time Based Analysis
>> section.
>
> How hard would it be to just apply it to those events?
> Userspace doesn't care what the hardware does underneath - it just wants to get
> moderately intuitive data back. Having the end user deal with this oddity + even
> the need to document it seems to me to be unnecessary burden given how simple it
> is (I assume) to remove the oddity.

Ok. Talked me into it :) I will multiply the counter value at point of read by 16
for group#1 events.

Thank you.

Best Regards,
Shuai
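The change agreed on here (scale at the point of read, but only for Group#1 events) can be modelled with a short Python sketch. This is an illustrative model of the idea, not the driver code: the constant and function names are hypothetical, and only the group numbering (Group#0 for time in LTSSM states, Group#1 for data in 16-byte units) comes from the documentation in the patch.

```python
# Group numbering follows the "Time Based Analysis" section of the patch.
TIME_BASED_GROUP0 = 0  # time the controller stays in LTSSM states
TIME_BASED_GROUP1 = 1  # amount of data processed, counted in 16-byte units

BYTES_PER_UNIT = 16


def scale_time_based_count(group: int, raw: int) -> int:
    """Return the value a perf user would see for a time-based event.

    Only Group#1 counters are multiplied by 16, so they read as bytes.
    Counters from any other group are passed through unchanged.
    """
    if group == TIME_BASED_GROUP1:
        return raw * BYTES_PER_UNIT
    return raw


print(scale_time_based_count(TIME_BASED_GROUP1, 1000))  # 16000
print(scale_time_based_count(TIME_BASED_GROUP0, 1000))  # 1000
```

With the scaling done at read, the user-side bandwidth calculation no longer needs the factor of 16 and reduces to the reported value divided by the measurement window.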
diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst new file mode 100644 index 000000000000..eac1b6f36450 --- /dev/null +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst @@ -0,0 +1,94 @@ +====================================================================== +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU) +====================================================================== + +DesignWare Cores (DWC) PCIe PMU +=============================== + +The PMU is a PCIe configuration space register block provided by each PCIe Root +Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error +injection, and Statistics). + +As the name indicates, the RAS DES capability supports system level +debugging, AER error injection, and collection of statistics. To facilitate +collection of statistics, Synopsys DesignWare Cores PCIe controller +provides the following two features: + +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and + time spent in each low-power LTSSM state) and +- one 32-bit counter for Event Counting (error and non-error events for + a specified lane) + +Note: There is no interrupt for counter overflow. + +Time Based Analysis +------------------- + +Using this feature you can obtain information regarding RX/TX data +throughput and time spent in each low-power LTSSM state by the controller. +The PMU measures data in two categories: + +- Group#0: Percentage of time the controller stays in LTSSM states. +- Group#1: Amount of data processed (Units of 16 bytes). + +Lane Event counters +------------------- + +Using this feature you can obtain Error and Non-Error information in +specific lane by the controller. The PMU event is select by: + +- Group i +- Event j within the Group i +- and Lane k + +Some of the event only exist for specific configurations. 
+ +DesignWare Cores (DWC) PCIe PMU Driver +======================================= + +This driver adds PMU devices for each PCIe Root Port named based on the BDF of +the Root Port. For example, + + 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) + +the PMU device name for this Root Port is dwc_rootport_3018. + +The DWC PCIe PMU driver registers a perf PMU driver, which provides +description of available events and configuration options in sysfs, see +/sys/bus/event_source/devices/dwc_rootport_{bdf}. + +The "format" directory describes format of the config fields of the +perf_event_attr structure. The "events" directory provides configuration +templates for all documented events. For example, +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1". + +The "perf list" command shall list the available events from sysfs, e.g.:: + + $# perf list | grep dwc_rootport + <...> + dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] + <...> + dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] + +Time Based Analysis Event Usage +------------------------------- + +Example usage of counting PCIe RX TLP data payload (Units of 16 bytes):: + + $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ + +The average RX/TX bandwidth can be calculated using the following formula: + + PCIe RX Bandwidth = PCIE_RX_DATA * 16B / Measure_Time_Window + PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window + +Lane Event Usage +------------------------------- + +Each lane has the same event set and to avoid generating a list of hundreds +of events, the user need to specify the lane ID explicitly, e.g.:: + + $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/ + +The driver does not support sampling, therefore "perf record" will not +work. Per-task (without "-a") perf sessions are not supported. 
diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst index f60be04e4e33..6bc7739fddb5 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -19,6 +19,7 @@ Performance monitor support arm_dsu_pmu thunderx2-pmu alibaba_pmu + dwc_pcie_pmu nvidia-pmu meson-ddr-pmu cxl
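The documentation in this patch says the PMU device is named after the BDF of the Root Port, and gives 30:03.0 mapping to dwc_rootport_3018 as the example. A minimal Python sketch of one encoding that reproduces that example follows. It assumes the conventional PCI packing of bus, device and function into a 16-bit ID (bus in bits 15:8, device in bits 7:3, function in bits 2:0), printed as lowercase hex. The patch text does not spell the encoding out, so treat this as an assumption; the function name is hypothetical.

```python
def dwc_pmu_name(bdf: str) -> str:
    """Derive a PMU device name from a 'bus:dev.fn' string such as '30:03.0'.

    Assumption: the name suffix is (bus << 8) | (dev << 3) | fn in hex.
    Only the short 'bus:dev.fn' form is handled, not a domain-prefixed one.
    """
    bus_str, devfn_str = bdf.split(":")
    dev_str, fn_str = devfn_str.split(".")
    bus, dev, fn = int(bus_str, 16), int(dev_str, 16), int(fn_str, 16)
    devid = (bus << 8) | (dev << 3) | fn
    return f"dwc_rootport_{devid:x}"


print(dwc_pmu_name("30:03.0"))  # dwc_rootport_3018
```

The resulting name is what would be looked up under /sys/bus/event_source/devices/ and passed to perf stat -e.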