mbox series

[v5,00/10] Support using MSI interrupts in ntb_transport

Message ID 20190523223100.5526-1-logang@deltatee.com (mailing list archive)
Headers show
Series Support using MSI interrupts in ntb_transport | expand

Message

Logan Gunthorpe May 23, 2019, 10:30 p.m. UTC
This is another resend as there has been no feedback since v4.
Seems Jon has been MIA this past cycle so hopefully he appears on the
list soon.

I've addressed the feedback so far and rebased on the latest kernel
and would like this to be considered for merging this cycle.

The only outstanding issue I know of is that it still will not work
with IDT hardware, but ntb_transport doesn't work with IDT hardware
and there is still no sensible common infrastructure to support
ntb_peer_mw_set_trans(). Thus, I decline to consider that complication
in this patchset. However, I'll be happy to review work that adds this
feature in the future.

Also, as the port number and resource index stuff is a bit complicated,
I made a quick out of tree test fixture to ensure it's correct[1]. As
an excerise I also wrote some test code[2] using the upcomming KUnit
feature.

Logan

[1] https://repl.it/repls/ExcitingPresentFile
[2] https://github.com/sbates130272/linux-p2pmem/commits/ntb_kunit

--

Changes in v5:

* Rebased onto v5.2-rc1 (plus the patches in ntb-next)

--

Changes in v4:

* Rebased onto v5.1-rc6 (No changes)

* Numerous grammar and spelling mistakes spotted by Bjorn

--

Changes in v3:

* Rebased onto v5.1-rc1 (Dropped the first two patches as they have
  been merged, and cleaned up some minor conflicts in the PCI tree)

* Added a new patch (#3) to calculate logical port numbers that
  are port numbers from 0 to (number of ports - 1). This is
  then used in ntb_peer_resource_idx() to fix the issues brought
  up by Serge.

* Fixed missing __iomem and iowrite calls (as noticed by Serge)

* Added patch 10 which describes ntb_msi_test in the documentation
  file (as requested by Serge)

* A couple other minor nits and documentation fixes

--

Changes in v2:

* Cleaned up the changes in intel_irq_remapping.c to make them
  less confusing and add a comment. (Per discussion with Jacob and
  Joerg)

* Fixed a nit from Bjorn and collected his Ack

* Added a Kconfig dependancy on CONFIG_PCI_MSI for CONFIG_NTB_MSI
  as the Kbuild robot hit a random config that didn't build
  without it.

* Worked in a callback for when the MSI descriptor changes so that
  the clients can resend the new address and data values to the peer.
  On my test system this was never necessary, but there may be
  other platforms where this can occur. I tested this by hacking
  in a path to rewrite the MSI descriptor when I change the cpu
  affinity of an IRQ. There's a bit of uncertainty over the latency
  of the change, but without hardware this can acctually occur on
  we can't test this. This was the result of a discussion with Dave.

--

This patch series adds optional support for using MSI interrupts instead
of NTB doorbells in ntb_transport. This is desirable seeing doorbells on
current hardware are quite slow and therefore switching to MSI interrupts
provides a significant performance gain. On switchtec hardware, a simple
apples-to-apples comparison shows ntb_netdev/iperf numbers going from
3.88Gb/s to 14.1Gb/s when switching to MSI interrupts.

To do this, a couple changes are required outside of the NTB tree:

1) The IOMMU must know to accept MSI requests from aliased bused numbers
seeing NTB hardware typically sends proxied request IDs through
additional requester IDs. The first patch in this series adds support
for the Intel IOMMU. A quirk to add these aliases for switchtec hardware
was already accepted. See commit ad281ecf1c7d ("PCI: Add DMA alias quirk
for Microsemi Switchtec NTB") for a description of NTB proxy IDs and why
this is necessary.

2) NTB transport (and other clients) may often need more MSI interrupts
than the NTB hardware actually advertises support for. However, seeing
these interrupts will not be triggered by the hardware but through an
NTB memory window, the hardware does not actually need support or need
to know about them. Therefore we add the concept of Virtual MSI
interrupts which are allocated just like any other MSI interrupt but
are not programmed into the hardware's MSI table. This is done in
Patch 2 and then made use of in Patch 3.

The remaining patches in this series add a library for dealing with MSI
interrupts, a test client and finally support in ntb_transport.

The series is based off of v5.1-rc6 plus the patches in ntb-next.
A git repo is available here:

https://github.com/sbates130272/linux-p2pmem/ ntb_transport_msi_v4

Thanks,

Logan

--

Logan Gunthorpe (10):
  PCI/MSI: Support allocating virtual MSI interrupts
  PCI/switchtec: Add module parameter to request more interrupts
  NTB: Introduce helper functions to calculate logical port number
  NTB: Introduce functions to calculate multi-port resource index
  NTB: Rename ntb.c to support multiple source files in the module
  NTB: Introduce MSI library
  NTB: Introduce NTB MSI Test Client
  NTB: Add ntb_msi_test support to ntb_test
  NTB: Add MSI interrupt support to ntb_transport
  NTB: Describe the ntb_msi_test client in the documentation.

 Documentation/ntb.txt                   |  27 ++
 drivers/ntb/Kconfig                     |  11 +
 drivers/ntb/Makefile                    |   3 +
 drivers/ntb/{ntb.c => core.c}           |   0
 drivers/ntb/msi.c                       | 415 +++++++++++++++++++++++
 drivers/ntb/ntb_transport.c             | 169 ++++++++-
 drivers/ntb/test/Kconfig                |   9 +
 drivers/ntb/test/Makefile               |   1 +
 drivers/ntb/test/ntb_msi_test.c         | 433 ++++++++++++++++++++++++
 drivers/pci/msi.c                       |  54 ++-
 drivers/pci/switch/switchtec.c          |  12 +-
 include/linux/msi.h                     |   8 +
 include/linux/ntb.h                     | 196 ++++++++++-
 include/linux/pci.h                     |   9 +
 tools/testing/selftests/ntb/ntb_test.sh |  54 ++-
 15 files changed, 1386 insertions(+), 15 deletions(-)
 rename drivers/ntb/{ntb.c => core.c} (100%)
 create mode 100644 drivers/ntb/msi.c
 create mode 100644 drivers/ntb/test/ntb_msi_test.c

--
2.20.1

Comments

Jon Mason June 13, 2019, 1:30 p.m. UTC | #1
On Thu, May 23, 2019 at 04:30:50PM -0600, Logan Gunthorpe wrote:
> This is another resend as there has been no feedback since v4.
> Seems Jon has been MIA this past cycle so hopefully he appears on the
> list soon.
> 
> I've addressed the feedback so far and rebased on the latest kernel
> and would like this to be considered for merging this cycle.
> 
> The only outstanding issue I know of is that it still will not work
> with IDT hardware, but ntb_transport doesn't work with IDT hardware
> and there is still no sensible common infrastructure to support
> ntb_peer_mw_set_trans(). Thus, I decline to consider that complication
> in this patchset. However, I'll be happy to review work that adds this
> feature in the future.
> 
> Also, as the port number and resource index stuff is a bit complicated,
> I made a quick out of tree test fixture to ensure it's correct[1]. As
> an excerise I also wrote some test code[2] using the upcomming KUnit
> feature.

Sorry for the delay.  The patch is now in the ntb-next branch.  We've
missed window for 5.2, but it will be in the 5.3 pull request (barring
last minute comments).

Thanks,
Jon

> 
> Logan
> 
> [1] https://repl.it/repls/ExcitingPresentFile
> [2] https://github.com/sbates130272/linux-p2pmem/commits/ntb_kunit
> 
> --
> 
> Changes in v5:
> 
> * Rebased onto v5.2-rc1 (plus the patches in ntb-next)
> 
> --
> 
> Changes in v4:
> 
> * Rebased onto v5.1-rc6 (No changes)
> 
> * Numerous grammar and spelling mistakes spotted by Bjorn
> 
> --
> 
> Changes in v3:
> 
> * Rebased onto v5.1-rc1 (Dropped the first two patches as they have
>   been merged, and cleaned up some minor conflicts in the PCI tree)
> 
> * Added a new patch (#3) to calculate logical port numbers that
>   are port numbers from 0 to (number of ports - 1). This is
>   then used in ntb_peer_resource_idx() to fix the issues brought
>   up by Serge.
> 
> * Fixed missing __iomem and iowrite calls (as noticed by Serge)
> 
> * Added patch 10 which describes ntb_msi_test in the documentation
>   file (as requested by Serge)
> 
> * A couple other minor nits and documentation fixes
> 
> --
> 
> Changes in v2:
> 
> * Cleaned up the changes in intel_irq_remapping.c to make them
>   less confusing and add a comment. (Per discussion with Jacob and
>   Joerg)
> 
> * Fixed a nit from Bjorn and collected his Ack
> 
> * Added a Kconfig dependancy on CONFIG_PCI_MSI for CONFIG_NTB_MSI
>   as the Kbuild robot hit a random config that didn't build
>   without it.
> 
> * Worked in a callback for when the MSI descriptor changes so that
>   the clients can resend the new address and data values to the peer.
>   On my test system this was never necessary, but there may be
>   other platforms where this can occur. I tested this by hacking
>   in a path to rewrite the MSI descriptor when I change the cpu
>   affinity of an IRQ. There's a bit of uncertainty over the latency
>   of the change, but without hardware this can acctually occur on
>   we can't test this. This was the result of a discussion with Dave.
> 
> --
> 
> This patch series adds optional support for using MSI interrupts instead
> of NTB doorbells in ntb_transport. This is desirable seeing doorbells on
> current hardware are quite slow and therefore switching to MSI interrupts
> provides a significant performance gain. On switchtec hardware, a simple
> apples-to-apples comparison shows ntb_netdev/iperf numbers going from
> 3.88Gb/s to 14.1Gb/s when switching to MSI interrupts.
> 
> To do this, a couple changes are required outside of the NTB tree:
> 
> 1) The IOMMU must know to accept MSI requests from aliased bused numbers
> seeing NTB hardware typically sends proxied request IDs through
> additional requester IDs. The first patch in this series adds support
> for the Intel IOMMU. A quirk to add these aliases for switchtec hardware
> was already accepted. See commit ad281ecf1c7d ("PCI: Add DMA alias quirk
> for Microsemi Switchtec NTB") for a description of NTB proxy IDs and why
> this is necessary.
> 
> 2) NTB transport (and other clients) may often need more MSI interrupts
> than the NTB hardware actually advertises support for. However, seeing
> these interrupts will not be triggered by the hardware but through an
> NTB memory window, the hardware does not actually need support or need
> to know about them. Therefore we add the concept of Virtual MSI
> interrupts which are allocated just like any other MSI interrupt but
> are not programmed into the hardware's MSI table. This is done in
> Patch 2 and then made use of in Patch 3.
> 
> The remaining patches in this series add a library for dealing with MSI
> interrupts, a test client and finally support in ntb_transport.
> 
> The series is based off of v5.1-rc6 plus the patches in ntb-next.
> A git repo is available here:
> 
> https://github.com/sbates130272/linux-p2pmem/ ntb_transport_msi_v4
> 
> Thanks,
> 
> Logan
> 
> --
> 
> Logan Gunthorpe (10):
>   PCI/MSI: Support allocating virtual MSI interrupts
>   PCI/switchtec: Add module parameter to request more interrupts
>   NTB: Introduce helper functions to calculate logical port number
>   NTB: Introduce functions to calculate multi-port resource index
>   NTB: Rename ntb.c to support multiple source files in the module
>   NTB: Introduce MSI library
>   NTB: Introduce NTB MSI Test Client
>   NTB: Add ntb_msi_test support to ntb_test
>   NTB: Add MSI interrupt support to ntb_transport
>   NTB: Describe the ntb_msi_test client in the documentation.
> 
>  Documentation/ntb.txt                   |  27 ++
>  drivers/ntb/Kconfig                     |  11 +
>  drivers/ntb/Makefile                    |   3 +
>  drivers/ntb/{ntb.c => core.c}           |   0
>  drivers/ntb/msi.c                       | 415 +++++++++++++++++++++++
>  drivers/ntb/ntb_transport.c             | 169 ++++++++-
>  drivers/ntb/test/Kconfig                |   9 +
>  drivers/ntb/test/Makefile               |   1 +
>  drivers/ntb/test/ntb_msi_test.c         | 433 ++++++++++++++++++++++++
>  drivers/pci/msi.c                       |  54 ++-
>  drivers/pci/switch/switchtec.c          |  12 +-
>  include/linux/msi.h                     |   8 +
>  include/linux/ntb.h                     | 196 ++++++++++-
>  include/linux/pci.h                     |   9 +
>  tools/testing/selftests/ntb/ntb_test.sh |  54 ++-
>  15 files changed, 1386 insertions(+), 15 deletions(-)
>  rename drivers/ntb/{ntb.c => core.c} (100%)
>  create mode 100644 drivers/ntb/msi.c
>  create mode 100644 drivers/ntb/test/ntb_msi_test.c
> 
> --
> 2.20.1
Logan Gunthorpe June 13, 2019, 3:46 p.m. UTC | #2
On 2019-06-13 7:30 a.m., Jon Mason wrote:
> On Thu, May 23, 2019 at 04:30:50PM -0600, Logan Gunthorpe wrote:
>> This is another resend as there has been no feedback since v4.
>> Seems Jon has been MIA this past cycle so hopefully he appears on the
>> list soon.
>>
>> I've addressed the feedback so far and rebased on the latest kernel
>> and would like this to be considered for merging this cycle.
>>
>> The only outstanding issue I know of is that it still will not work
>> with IDT hardware, but ntb_transport doesn't work with IDT hardware
>> and there is still no sensible common infrastructure to support
>> ntb_peer_mw_set_trans(). Thus, I decline to consider that complication
>> in this patchset. However, I'll be happy to review work that adds this
>> feature in the future.
>>
>> Also, as the port number and resource index stuff is a bit complicated,
>> I made a quick out of tree test fixture to ensure it's correct[1]. As
>> an excerise I also wrote some test code[2] using the upcomming KUnit
>> feature.
> 
> Sorry for the delay.  The patch is now in the ntb-next branch.  We've
> missed window for 5.2, but it will be in the 5.3 pull request (barring
> last minute comments).

Thanks Jon!

Logan