Message ID | 20240404123717.11857-1-aaptel@nvidia.com |
---|---|
Series | nvme-tcp receive offloads |
Doesn't apply, again, unfortunately.

On Thu, 4 Apr 2024 12:36:57 +0000 Aurelien Aptel wrote:
> Testing
> =======
> This series was tested on ConnectX-7 HW using various configurations
> of IO sizes, queue depths, MTUs, and with both the SPDK and kernel
> NVMe-TCP targets.

About testing, what do you have in terms of a testing setup?
As you said this is similar to the TLS offload:

> Note:
> These offloads are similar in nature to the packet-based NIC TLS
> offloads, which are already upstream (see net/tls/tls_device.c).
> You can read more about TLS offload here:
> https://www.kernel.org/doc/html/latest/networking/tls-offload.html

and our experience trying to maintain and extend the very much used SW
kernel TLS implementation in presence of the device offload is mixed :(
We can't test it, we break it, get CVEs for it :( In all fairness
the inline offload is probably not as bad as the crypto accelerator
path, but still it breaks. So assuming that this gets all the necessary
acks can we expect you to plug some testing into the netdev CI so that
we see whether any incoming change is breaking this offload?
On 06/04/2024 8:45, Jakub Kicinski wrote:
> Doesn't apply, again, unfortunately.
>
> On Thu, 4 Apr 2024 12:36:57 +0000 Aurelien Aptel wrote:
>> Testing
>> =======
>> This series was tested on ConnectX-7 HW using various configurations
>> of IO sizes, queue depths, MTUs, and with both the SPDK and kernel
>> NVMe-TCP targets.
>
> About testing, what do you have in terms of a testing setup?
> As you said this is similar to the TLS offload:
>
>> Note:
>> These offloads are similar in nature to the packet-based NIC TLS
>> offloads, which are already upstream (see net/tls/tls_device.c).
>> You can read more about TLS offload here:
>> https://www.kernel.org/doc/html/latest/networking/tls-offload.html
>
> and our experience trying to maintain and extend the very much used SW
> kernel TLS implementation in presence of the device offload is mixed :(
> We can't test it, we break it, get CVEs for it :( In all fairness

Especially when nvme-tcp can also run on tls, but that is incompatible
with this offload at the moment (I've told the nvidia folks that I do
not expect this incompatibility to be permanent).

> the inline offload is probably not as bad as the crypto accelerator
> path, but still it breaks. So assuming that this gets all the necessary
> acks can we expect you to plug some testing into the netdev CI so that
> we see whether any incoming change is breaking this offload?

Agree, also given that there is an effort to extend blktests to run on
real controllers, perhaps we should add a few tests there as well?
On 4/7/2024 3:21 PM, Sagi Grimberg wrote:
> On 06/04/2024 8:45, Jakub Kicinski wrote:
>> Doesn't apply, again, unfortunately.
>>
>> On Thu, 4 Apr 2024 12:36:57 +0000 Aurelien Aptel wrote:
>>> Testing
>>> =======
>>> This series was tested on ConnectX-7 HW using various configurations
>>> of IO sizes, queue depths, MTUs, and with both the SPDK and kernel
>>> NVMe-TCP targets.
>>
>> About testing, what do you have in terms of a testing setup?
>> As you said this is similar to the TLS offload:
>>
>>> Note:
>>> These offloads are similar in nature to the packet-based NIC TLS
>>> offloads, which are already upstream (see net/tls/tls_device.c).
>>> You can read more about TLS offload here:
>>> https://www.kernel.org/doc/html/latest/networking/tls-offload.html
>>
>> and our experience trying to maintain and extend the very much used SW
>> kernel TLS implementation in presence of the device offload is mixed :(
>> We can't test it, we break it, get CVEs for it :( In all fairness
>
> Especially when nvme-tcp can also run on tls, but that is incompatible
> with this offload at the moment (I've told the nvidia folks that I do
> not expect this incompatibility to be permanent).
>
>> the inline offload is probably not as bad as the crypto accelerator
>> path, but still it breaks. So assuming that this gets all the necessary
>> acks can we expect you to plug some testing into the netdev CI so that
>> we see whether any incoming change is breaking this offload?
>
> Agree, also given that there is an effort to extend blktests to run on
> real controllers, perhaps we should add a few tests there as well?

blktests seems to be the right framework to add all the test cases to
cover the targeted subsystem(s) for this patchset. Daniel from SUSE has
already posted an RFC (see [1]) to add support to blktests so we can
use real controllers for better test coverage. We will be discussing
that in detail at the LSFMM session this year.

With this support in the blktests framework, we can definitely generate
the right test coverage for the TCP offload, which can be run by anyone
who has this H/W. Just like I run NVMe tests on the code going from the
NVMe tree to the block tree for every pull request, we are planning to
run the new NVMe-TCP offload specific tests regularly on the NVMe tree.
We will be happy to provide the H/W to distros who support this feature
in order to make testing easier for others as well.

Hope this answers any questions regarding ongoing testing of this
patchset once the code is merged.

-ck

[1] https://lists.infradead.org/pipermail/linux-nvme/2024-March/046056.html
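[Editorial note: as a rough illustration of the blktests-driven plan described above, here is a minimal sketch of a periodic run against a real controller. It is not part of the posted series or of blktests at the time of this thread: the checkout path, the environment knobs and the `nvme/9xx` test names are placeholders, since real-controller support was still an RFC.]

```python
#!/usr/bin/env python3
"""Sketch of a periodic blktests run covering the NVMe-TCP offload.

Illustrative only: BLKTESTS_DIR, the nvme_target_control knob and the
nvme/9xx test names are assumptions, not real blktests options/tests.
"""
import os
import subprocess

BLKTESTS_DIR = "/opt/blktests"            # assumed checkout location
OFFLOAD_TESTS = ["nvme/901", "nvme/902"]  # hypothetical offload test cases


def run_blktests(tests):
    """Invoke blktests' ./check for the given tests; True on success."""
    env = os.environ.copy()
    # Point the nvme group at a TCP transport; the second knob is a
    # placeholder for whatever the real-controller RFC ends up defining.
    env["nvme_trtype"] = "tcp"
    env["nvme_target_control"] = "remote"   # placeholder, not a real option
    result = subprocess.run(["./check"] + tests, cwd=BLKTESTS_DIR, env=env)
    return result.returncode == 0


if __name__ == "__main__":
    raise SystemExit(0 if run_blktests(OFFLOAD_TESTS) else 1)
```

Such a wrapper could be triggered periodically against the branches under test, feeding the reporting step discussed later in the thread.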
On Tue, 9 Apr 2024 22:35:51 +0000 Chaitanya Kulkarni wrote:
> blktests seems to be the right framework to add all the test cases to
> cover the targeted subsystem(s) for this patchset. Daniel from SUSE has
> already posted an RFC (see [1]) to add support to blktests so we can
> use real controllers for better test coverage. We will be discussing
> that in detail at the LSFMM session this year.

No preference on the framework or where the tests live, FWIW.

> With this support in the blktests framework, we can definitely generate
> the right test coverage for the TCP offload, which can be run by anyone
> who has this H/W. Just like I run NVMe tests on the code going from the
> NVMe tree to the block tree for every pull request, we are planning to
> run the new NVMe-TCP offload specific tests regularly on the NVMe tree.
> We will be happy to provide the H/W to distros who support this feature
> in order to make testing easier for others as well.

You're not sending these patches to the distros, you're sending them
to the upstream Linux kernel. And unfortunately we don't have a test
lab where we could put your HW, so it's on you. To be clear all you
need to do is periodically build and test certain upstream branches
and report results. By "report" all I mean is put a JSON file with the
result somewhere we can HTTP GET. KernelCI has been around for a while,
I don't think this is a crazy ask.
On 4/9/2024 3:59 PM, Jakub Kicinski wrote:
> On Tue, 9 Apr 2024 22:35:51 +0000 Chaitanya Kulkarni wrote:
>> blktests seems to be the right framework to add all the test cases to
>> cover the targeted subsystem(s) for this patchset. Daniel from SUSE has
>> already posted an RFC (see [1]) to add support to blktests so we can
>> use real controllers for better test coverage. We will be discussing
>> that in detail at the LSFMM session this year.
>
> No preference on the framework or where the tests live, FWIW.
>
>> With this support in the blktests framework, we can definitely generate
>> the right test coverage for the TCP offload, which can be run by anyone
>> who has this H/W. Just like I run NVMe tests on the code going from the
>> NVMe tree to the block tree for every pull request, we are planning to
>> run the new NVMe-TCP offload specific tests regularly on the NVMe tree.
>> We will be happy to provide the H/W to distros who support this feature
>> in order to make testing easier for others as well.
>
> You're not sending these patches to the distros, you're sending them
> to the upstream Linux kernel. And unfortunately we don't have a test
> lab where we could put your HW, so it's on you. To be clear all you
> need to do is periodically build and test certain upstream branches
> and report results. By "report" all I mean is put a JSON file with the
> result somewhere we can HTTP GET. KernelCI has been around for a while,
> I don't think this is a crazy ask.

That should be doable, we can run the tests and make the results
available for others to access in JSON format. Just to clarify, what do
you mean by KernelCI? What I understood is that you want our tests to
run and the results to be provided, but I'm still not clear about
KernelCI's involvement in this process. Can you please elaborate?

-ck
On Thu, 18 Apr 2024 08:29:27 +0000 Chaitanya Kulkarni wrote:
>> You're not sending these patches to the distros, you're sending them
>> to the upstream Linux kernel. And unfortunately we don't have a test
>> lab where we could put your HW, so it's on you. To be clear all you
>> need to do is periodically build and test certain upstream branches
>> and report results. By "report" all I mean is put a JSON file with the
>> result somewhere we can HTTP GET. KernelCI has been around for a while,
>> I don't think this is a crazy ask.
>
> That should be doable, we can run the tests and make the results
> available for others to access in JSON format. Just to clarify, what do
> you mean by KernelCI? What I understood is that you want our tests to
> run and the results to be provided, but I'm still not clear about
> KernelCI's involvement in this process. Can you please elaborate?

KernelCI was just an example of CI being a thing upstream. For us you'd
need to generate a much simplified JSON which netdev CI can consume.
Could you reach out to Petr Machata or Ido Schimmel internally at
nVidia? They are involved.
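[Editorial note: to make the "build, test, report" loop concrete, here is a minimal sketch of the reporting side. The wrapper script name and every JSON field below are placeholders; the thread only states that the results must end up in a JSON file reachable via plain HTTP GET and does not define the exact schema netdev CI consumes.]

```python
#!/usr/bin/env python3
"""Sketch of publishing a simplified JSON test report.

Assumptions: run-nvme-tcp-offload-tests.sh is a hypothetical wrapper
around the actual test suite, and the report schema is made up here.
"""
import json
import subprocess
import time
from pathlib import Path


def run_offload_tests():
    """Run the offload test suite and return per-test pass/fail results."""
    # Placeholder command: in practice this would be the blktests run
    # sketched earlier, or whatever suite the offload tests end up in.
    proc = subprocess.run(["./run-nvme-tcp-offload-tests.sh"],
                          capture_output=True, text=True)
    return {"nvme-tcp-offload": "pass" if proc.returncode == 0 else "fail"}


def write_report(results, branch, out="results/nvme-tcp-offload.json"):
    """Write a small JSON summary suitable for serving over HTTP."""
    report = {
        "branch": branch,                       # e.g. the tested net-next sha
        "timestamp": int(time.time()),
        "executor": "nvidia-nvme-tcp-offload",  # placeholder lab name
        "results": results,
    }
    path = Path(out)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(report, indent=2))
    return path


if __name__ == "__main__":
    print(write_report(run_offload_tests(), branch="net-next"))
```

Serving the resulting results/ directory with any static web server would satisfy the "somewhere we can HTTP GET" requirement; the schema itself would need to be agreed with the netdev CI maintainers, e.g. the contacts named above.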