[V7,0/5] TPH and cache direct injection support

Message ID 20241002165954.128085-1-wei.huang2@amd.com (mailing list archive)

Message

Wei Huang Oct. 2, 2024, 4:59 p.m. UTC
Hi All,

TPH (TLP Processing Hints) is a PCIe feature that allows endpoint
devices to provide optimization hints for requests that target memory
space. These hints, in a format called steering tag (ST), are provided
in the requester's TLP headers and allow the system hardware, including
the Root Complex, to optimize the utilization of platform resources
for the requests.
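
As a rough illustration of how a function advertises its TPH abilities, the sketch below decodes a TPH Requester Capability register value in user space. The bit positions reflect one reading of the PCIe specification's TPH Requester Extended Capability and should be treated as assumptions; the names tph_caps and tph_decode_cap() are hypothetical and are not part of this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the TPH Requester Capability register (PCIe spec). */
#define TPH_CAP_NO_ST		0x00000001u /* No ST Mode Supported */
#define TPH_CAP_INT_VEC		0x00000002u /* Interrupt Vector Mode Supported */
#define TPH_CAP_DEV_SPEC	0x00000004u /* Device Specific Mode Supported */
#define TPH_CAP_EXT_TPH		0x00000100u /* Extended TPH Requester Supported */
#define TPH_CAP_LOC_MASK	0x00000600u /* ST Table Location */
#define TPH_CAP_LOC_SHIFT	9
#define TPH_CAP_SIZE_MASK	0x07ff0000u /* ST Table Size, encoded as N - 1 */
#define TPH_CAP_SIZE_SHIFT	16

/* Where the steering tag table lives, per the ST Table Location field. */
enum tph_st_loc {
	TPH_LOC_NONE = 0,	/* no ST table present */
	TPH_LOC_CAP  = 1,	/* in the TPH capability structure */
	TPH_LOC_MSIX = 2,	/* in the MSI-X table */
};

/* Hypothetical decoded view of the capability register. */
struct tph_caps {
	bool no_st_mode;
	bool int_vec_mode;
	bool dev_spec_mode;
	bool ext_tph;
	enum tph_st_loc st_loc;
	unsigned int st_entries;	/* 0 when no ST table is present */
};

static struct tph_caps tph_decode_cap(uint32_t reg)
{
	struct tph_caps c;

	c.no_st_mode    = (reg & TPH_CAP_NO_ST) != 0;
	c.int_vec_mode  = (reg & TPH_CAP_INT_VEC) != 0;
	c.dev_spec_mode = (reg & TPH_CAP_DEV_SPEC) != 0;
	c.ext_tph       = (reg & TPH_CAP_EXT_TPH) != 0;
	c.st_loc = (enum tph_st_loc)((reg & TPH_CAP_LOC_MASK) >> TPH_CAP_LOC_SHIFT);

	/* The size field holds N - 1; it only matters if a table exists. */
	if (c.st_loc == TPH_LOC_NONE)
		c.st_entries = 0;
	else
		c.st_entries =
			((reg & TPH_CAP_SIZE_MASK) >> TPH_CAP_SIZE_SHIFT) + 1;

	return c;
}
```

Under these assumptions, tph_decode_cap(0x003f0203) reports No ST and Interrupt Vector modes, an ST table located in the capability structure, and 64 table entries.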

Upcoming AMD hardware implements a new Cache Injection feature that
leverages TPH. Cache Injection allows PCIe endpoints to inject I/O
Coherent DMA writes directly into an L2 within the CCX (core complex)
closest to the CPU core that will consume it. This technology is aimed
at applications requiring high performance and low latency, such as
networking and storage applications.

This series introduces generic TPH support in Linux, allowing STs to be
retrieved and used by PCIe endpoint drivers as needed. As a
demonstration, it includes an example usage in the Broadcom BNXT driver.
When running on Broadcom NICs with the appropriate firmware, it shows
substantial memory bandwidth savings and better network bandwidth using
real-world benchmarks. This solution is vendor-neutral and implemented
based on industry standards (PCIe Spec and PCI FW Spec).
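
The flow a driver follows can be pictured with a small user-space mock: look up the steering tag for the CPU that will consume the data, then program it into the ST table entry for the matching interrupt vector. Everything here is a stand-in for illustration only. The demo_* names, the table size, and the CPU-to-tag mapping are invented, and the real kernel interfaces added by this series are not reproduced.

```c
#include <stdint.h>

#define DEMO_ST_TABLE_SIZE	8	/* invented size for the mock table */
#define DEMO_MAX_CPUS		256	/* invented CPU limit for the mock */

/* Mock device: one 16-bit steering tag slot per interrupt vector. */
struct demo_dev {
	uint16_t st_table[DEMO_ST_TABLE_SIZE];
};

/*
 * Hypothetical stand-in for the platform lookup. In the series the tag
 * comes from firmware; the mapping below is made up for the demo.
 */
static int demo_get_cpu_st(unsigned int cpu, uint16_t *tag)
{
	if (cpu >= DEMO_MAX_CPUS)
		return -1;

	*tag = (uint16_t)(0x100 | cpu);
	return 0;
}

/* Hypothetical stand-in for writing one ST table entry. */
static int demo_set_st_entry(struct demo_dev *dev, unsigned int index,
			     uint16_t tag)
{
	if (index >= DEMO_ST_TABLE_SIZE)
		return -1;

	dev->st_table[index] = tag;
	return 0;
}

/* Steer DMA writes associated with vector @vec toward @cpu. */
static int demo_steer_vector(struct demo_dev *dev, unsigned int vec,
			     unsigned int cpu)
{
	uint16_t tag;

	if (demo_get_cpu_st(cpu, &tag))
		return -1;

	return demo_set_st_entry(dev, vec, tag);
}
```

A driver would typically repeat the last step whenever the CPU affinity of a vector changes, so that the tag keeps pointing at the core that actually consumes the data.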

V6->V7:
 * Rebase on top of the latest pci/main (6.12-rc1)
 * Fix compilation warning/error on clang-18 with W=1 (test robot)
 * Revise commit messages for Patch #2, #4, and #5 (Bjorn)
 * Add more _DSM method description for reference in Patch #2 (Bjorn)
 * Remove "default n" in Kconfig (Lukas)

V5->V6:
 * Rebase on top of pci/main (tag: pci-v6.12-changes)
 * Fix spelling errors and FIELD_PREP/bnxt.c compilation errors (Simon)
 * Move tph.c to drivers/pci directory (Lukas)
 * Remove CONFIG_ACPI dependency (Lukas)
 * Slightly re-arrange save/restore sequence (Lukas)

V4->V5:
 * Rebase on top of net-next/main tree (Broadcom)
 * Remove TPH mode query and TPH enabled checking functions (Bjorn)
 * Remove "nostmode" kernel parameter (Bjorn)
 * Add "notph" kernel parameter support (Bjorn)
 * Add back TPH documentation (Bjorn)
 * Change TPH register namings (Bjorn)
 * Squash TPH enable/disable/save/restore funcs as a single patch (Bjorn)
 * Squash ST get_st/set_st funcs as a single patch (Bjorn)
 * Replace nic_open/close with netdev_rx_queue_restart() (Jakub, Broadcom)

V3->V4:
 * Rebase on top of the latest pci/next tree (tag: 6.11-rc1)
 * Add new API functions to query/enable/disable TPH support
 * Make pcie_tph_set_st() completely independent from pcie_tph_get_cpu_st()
 * Rewrite bnxt.c based on new APIs
 * Remove documentation for now due to constantly changing API
 * Remove pci=notph, but keep pci=nostmode with better flow (Bjorn)
 * Lots of code rewrite in tph.c & pci-tph.h with cleaner interface (Bjorn)
 * Add TPH save/restore support (Paul Luse and Lukas Wunner)

V2->V3:
 * Rebase on top of pci/next tree (tag: pci-v6.11-changes)
 * Redefine PCI TPH registers (pci_regs.h) without breaking uapi
 * Fix commit subjects/messages for kernel options (Jonathan and Bjorn)
 * Break API functions into three individual patches for easy review
 * Rewrite lots of code in tph.c/tph.h based on review feedback (Jonathan and Bjorn)

V1->V2:
 * Rebase on top of pci.git/for-linus (6.10-rc1)
 * Address mismatched data types reported by Sparse (Sparse check passed)
 * Add pcie_tph_intr_vec_supported() for checking IRQ mode support
 * Skip bnxt affinity notifier registration if
   pcie_tph_intr_vec_supported()=false
 * Minor fixes in bnxt driver (e.g. warning messages)

Manoj Panicker (1):
  bnxt_en: Add TPH support in BNXT driver

Michael Chan (1):
  bnxt_en: Pass NQ ID to the FW when allocating RX/RX AGG rings

Wei Huang (3):
  PCI: Add TLP Processing Hints (TPH) support
  PCI/TPH: Add Steering Tag support
  PCI/TPH: Add TPH documentation

 Documentation/PCI/index.rst                   |   1 +
 Documentation/PCI/tph.rst                     | 132 +++++
 .../admin-guide/kernel-parameters.txt         |   4 +
 Documentation/driver-api/pci/pci.rst          |   3 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  91 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |   7 +
 drivers/pci/Kconfig                           |   9 +
 drivers/pci/Makefile                          |   1 +
 drivers/pci/pci.c                             |   4 +
 drivers/pci/pci.h                             |  12 +
 drivers/pci/probe.c                           |   1 +
 drivers/pci/tph.c                             | 546 ++++++++++++++++++
 include/linux/pci-tph.h                       |  44 ++
 include/linux/pci.h                           |   7 +
 include/uapi/linux/pci_regs.h                 |  37 +-
 net/core/netdev_rx_queue.c                    |   1 +
 16 files changed, 890 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/PCI/tph.rst
 create mode 100644 drivers/pci/tph.c
 create mode 100644 include/linux/pci-tph.h


base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc

Comments

Bjorn Helgaas Oct. 2, 2024, 9:35 p.m. UTC | #1
On Wed, Oct 02, 2024 at 11:59:49AM -0500, Wei Huang wrote:
> [...]

I tentatively applied this on pci/tph for v6.13.

Not sure what you intend for the bnxt changes, since they depend on
the PCI core changes.  I'm happy to merge them via PCI, given acks
from Michael and an overall network maintainer.

Alternatively they could wait another cycle, or I could make an
immutable branch, although I prefer to preserve the option to update
or remove things until the merge window.

Thanks very much; this looks like nice work!

Bjorn

Michael Chan Oct. 2, 2024, 10:08 p.m. UTC | #2
On Wed, Oct 2, 2024 at 2:35 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> I tentatively applied this on pci/tph for v6.13.
>
> Not sure what you intend for the bnxt changes, since they depend on
> the PCI core changes.  I'm happy to merge them via PCI, given acks
> from Michael and an overall network maintainer.

The bnxt patch can go in through the PCI tree if Jakub agrees.  Thanks.

Paolo Abeni Oct. 8, 2024, 7:32 a.m. UTC | #3
On 10/3/24 00:08, Michael Chan wrote:
> On Wed, Oct 2, 2024 at 2:35 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>> I tentatively applied this on pci/tph for v6.13.
>>
>> Not sure what you intend for the bnxt changes, since they depend on
>> the PCI core changes.  I'm happy to merge them via PCI, given acks
>> from Michael and an overall network maintainer.
> 
> The bnxt patch can go in through the PCI tree if Jakub agrees.  Thanks.

I guess the most critical point is to avoid complex conflicts at merge
window time. My understanding is that the conventional way to avoid such
issues is to share a stable branch containing this change, which both
the netdev and the PCI trees could pull from.

Cheers,

Paolo

Bjorn Helgaas Oct. 16, 2024, 9:31 p.m. UTC | #4
On Wed, Oct 02, 2024 at 04:35:55PM -0500, Bjorn Helgaas wrote:
> On Wed, Oct 02, 2024 at 11:59:49AM -0500, Wei Huang wrote:
> > [...]
> 
> I tentatively applied this on pci/tph for v6.13.
> 
> Not sure what you intend for the bnxt changes, since they depend on
> the PCI core changes.  I'm happy to merge them via PCI, given acks
> from Michael and an overall network maintainer.

Given the ongoing discussion about the bnxt_en patches, I dropped
those, so the PCI tree pci/tph branch now contains only these:

  e045e5c1c706 ("PCI/TPH: Add TPH documentation")
  d2e8a34876ce ("PCI/TPH: Add Steering Tag support")
  f69767a1ada3 ("PCI: Add TLP Processing Hints (TPH) support")

This is headed for v6.13, but the branch should not be considered
immutable, and it may be merged during the merge window either before
or after the netdev tree.

Bjorn