new file mode 100644
@@ -0,0 +1,61 @@
+======================================================================
+Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
+======================================================================
+
+DesignWare Cores (DWC) PCIe PMU
+===============================
+
+To facilitate collection of statistics, Synopsys DesignWare Cores PCIe
+controller provides the following two features:
+
+- Time Based Analysis (RX/TX data throughput and time spent in each
+ low-power LTSSM state)
+- Lane Event counters (Error and Non-Error for lanes)
+
+The PMU is not a PCIe Root Complex integrated End Point (RCiEP) device but
+only register counters provided by each PCIe Root Port.
+
+Time Based Analysis
+-------------------
+
+Using this feature you can obtain information regarding RX/TX data
+throughput and time spent in each low-power LTSSM state by the controller.
+
+The counters are 64-bit width and measure data in two categories,
+
+- percentage of time does the controller stay in LTSSM state in a
+ configurable duration. The measurement range of each Event in Group#0.
+- amount of data processed (Units of 16 bytes). The measurement range of
+ each Event in Group#1.
+
+Lane Event counters
+-------------------
+
+Using this feature you can obtain Error and Non-Error information in
+specific lane by the controller.
+
+The counters are 32-bit width and the measured event is select by:
+
+- Group i
+- Event j within the Group i
+- and Lane k
+
+Some of the event counters only exist for specific configurations.
+
+DesignWare Cores (DWC) PCIe PMU Driver
+=======================================
+
+This driver add PMU devices for each PCIe Root Port. And the PMU device is
+named based the BDF of Root Port. For example,
+
+ 30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
+
+the PMU device name for this Root Port is dwc_rootport_3018.
+
+Example usage of counting PCIe RX TLP data payload (Units of 16 bytes)::
+
+ $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
+
+average RX bandwidth can be calculated like this:
+
+ PCIe TX Bandwidth = PCIE_TX_DATA * 16B / Measure_Time_Window
@@ -19,5 +19,6 @@ Performance monitor support
arm_dsu_pmu
thunderx2-pmu
alibaba_pmu
+ dwc_pcie_pmu
nvidia-pmu
meson-ddr-pmu
Alibaba's T-Head Yitan 710 SoC is built on Synopsys' widely deployed and silicon-proven DesignWare Core PCIe controller which implements PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> --- .../admin-guide/perf/dwc_pcie_pmu.rst | 61 +++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 62 insertions(+) create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst