diff mbox series

[v7,10/23] Documentation: add ULP DDP offload documentation

Message ID 20221025135958.6242-11-aaptel@nvidia.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series nvme-tcp receive offloads | expand

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count fail Series longer than 15 patches (and no cover letter)
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 2 maintainers not CCed: corbet@lwn.net linux-doc@vger.kernel.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Aurelien Aptel Oct. 25, 2022, 1:59 p.m. UTC
From: Yoray Zack <yorayz@nvidia.com>

Document the new ULP DDP API and add it under "networking".
Use the NVMe-TCP implementation as an example.

Signed-off-by: Boris Pismenny <borisp@nvidia.com>
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com>
Signed-off-by: Yoray Zack <yorayz@nvidia.com>
Signed-off-by: Shai Malin <smalin@nvidia.com>
Signed-off-by: Aurelien Aptel <aaptel@nvidia.com>
---
 Documentation/networking/index.rst           |   1 +
 Documentation/networking/ulp-ddp-offload.rst | 368 +++++++++++++++++++
 2 files changed, 369 insertions(+)
 create mode 100644 Documentation/networking/ulp-ddp-offload.rst

Comments

kernel test robot Oct. 26, 2022, 10:23 p.m. UTC | #1
Hi Aurelien,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]
[also build test WARNING on next-20221026]
[cannot apply to net/master linus/master v6.1-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Aurelien-Aptel/nvme-tcp-receive-offloads/20221025-221001
patch link:    https://lore.kernel.org/r/20221025135958.6242-11-aaptel%40nvidia.com
patch subject: [PATCH v7 10/23] Documentation: add ULP DDP offload documentation
reproduce:
        # https://github.com/intel-lab-lkp/linux/commit/c0839ff3b217d1ad295c08fc8b3c07d64eefcf4f
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Aurelien-Aptel/nvme-tcp-receive-offloads/20221025-221001
        git checkout c0839ff3b217d1ad295c08fc8b3c07d64eefcf4f
        make menuconfig
        # enable CONFIG_COMPILE_TEST, CONFIG_WARN_MISSING_DOCUMENTS, CONFIG_WARN_ABI_ERRORS
        make htmldocs

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> Documentation/networking/ulp-ddp-offload.rst:239: WARNING: Error in "code-block" directive:
>> Documentation/networking/ulp-ddp-offload.rst:60: WARNING: undefined label: tls_offload (if the link has no caption the label must precede a section header)

vim +239 Documentation/networking/ulp-ddp-offload.rst


Patch

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 4f2d1f682a18..10dbbb6694dc 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -106,6 +106,7 @@  Contents:
    tc-actions-env-rules
    tc-queue-filters
    tcp-thin
+   ulp-ddp-offload
    team
    timestamping
    tipc
diff --git a/Documentation/networking/ulp-ddp-offload.rst b/Documentation/networking/ulp-ddp-offload.rst
new file mode 100644
index 000000000000..3927066938fb
--- /dev/null
+++ b/Documentation/networking/ulp-ddp-offload.rst
@@ -0,0 +1,368 @@ 
+.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+
+=================================
+ULP direct data placement offload
+=================================
+
+Overview
+========
+
+The Linux kernel ULP direct data placement (DDP) offload infrastructure
+provides tagged request-response protocols, such as NVMe-TCP, the ability to
+place response data directly in pre-registered buffers according to header
+tags. DDP is particularly useful for data-intensive pipelined protocols whose
+responses may be reordered.
+
+For example, in NVMe-TCP numerous read requests are sent together and each
+request is tagged using the PDU header CID field. Receiving servers process
+requests as fast as possible and sometimes responses for smaller requests
+bypass responses to larger requests, e.g., 4KB reads bypass 1GB reads.
+Thereafter, clients correlate responses to requests using PDU header CID tags.
+The processing of each response requires copying data from SKBs to read
+request destination buffers; the offload avoids this copy. The offload is
+oblivious to destination buffers, which can reside either in userspace
+(O_DIRECT) or in the kernel pagecache.
+
+Request TCP byte-stream:
+
+.. parsed-literal::
+
+ +---------------+-------+---------------+-------+---------------+-------+
+ | PDU hdr CID=1 | Req 1 | PDU hdr CID=2 | Req 2 | PDU hdr CID=3 | Req 3 |
+ +---------------+-------+---------------+-------+---------------+-------+
+
+Response TCP byte-stream:
+
+.. parsed-literal::
+
+ +---------------+--------+---------------+--------+---------------+--------+
+ | PDU hdr CID=2 | Resp 2 | PDU hdr CID=3 | Resp 3 | PDU hdr CID=1 | Resp 1 |
+ +---------------+--------+---------------+--------+---------------+--------+
+
+The driver builds SKB page fragments that point to destination buffers.
+Consequently, SKBs represent the original data on the wire, which enables
+*transparent* inter-operation with the network stack. To avoid copies between
+SKBs and destination buffers, the layer-5 protocol (L5P) will check
+``if (src == dst)`` for SKB page fragments; a match indicates that the data
+was already placed there by NIC hardware and the copy should be skipped.
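+
+A rough sketch of that check (illustrative only: the helper and its calling
+convention are hypothetical, not part of this API) might look like:
+
+.. code-block:: c
+
+ /* Copy one SKB fragment into the destination page, unless DDP has
+  * already placed the data there.
+  */
+ static void l5p_copy_frag(const skb_frag_t *frag, struct page *dst_page,
+			   size_t dst_off, size_t len)
+ {
+	if (skb_frag_page(frag) == dst_page && skb_frag_off(frag) == dst_off)
+		return;	/* src == dst: the NIC already wrote the payload */
+
+	memcpy(page_address(dst_page) + dst_off, skb_frag_address(frag), len);
+ }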
+
+In addition, the L5P might have a data digest (DDGST), which ensures data
+integrity over the network.  If it is not offloaded, ULP DDP might not be
+efficient, as the L5P will need to traverse the data and compute the digest
+itself, cancelling out the benefit of the DDP copy skip.  ULP DDP supports
+Rx/Tx DDGST offload. On the receive side the NIC will verify the DDGST for
+received PDUs and update the SKB->ulp_ddp and SKB->ulp_crc bits.  If all the
+SKBs making up an L5P PDU have the crc bit set, the L5P will skip calculating
+and verifying the DDGST for the corresponding PDU. On the Tx side, the NIC
+will be responsible for calculating and filling the DDGST fields in the sent
+PDUs.
+
+Offloading does require NIC hardware to track L5P protocol framing, similarly
+to RX TLS offload (see Documentation/networking/tls-offload.rst). NIC hardware
+will parse PDU headers, extract fields such as operation type, length, tag
+identifier, etc. and only offload segments that correspond to tags registered
+with the NIC, see the :ref:`buf_reg` section.
+
+Device configuration
+====================
+
+During driver initialization the device driver sets the ``NETIF_F_HW_ULP_DDP``
+feature and installs its
+:c:type:`struct ulp_ddp_ops <ulp_ddp_ops>`
+pointer in the :c:member:`ulp_ddp_ops` member of the
+:c:type:`struct net_device <net_device>`.
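+
+As a rough illustration (the ``foo_*`` names are hypothetical, and whether the
+feature bit is set directly on ``features`` or advertised via ``hw_features``
+is an assumption of this sketch), a NIC driver could wire this up as follows:
+
+.. code-block:: c
+
+ static struct ulp_ddp_ops foo_ulp_ddp_ops = {
+	.ulp_ddp_limits   = foo_ulp_ddp_limits,
+	.ulp_ddp_sk_add   = foo_ulp_ddp_sk_add,
+	.ulp_ddp_sk_del   = foo_ulp_ddp_sk_del,
+	.ulp_ddp_setup    = foo_ulp_ddp_setup,
+	.ulp_ddp_teardown = foo_ulp_ddp_teardown,
+	.ulp_ddp_resync   = foo_ulp_ddp_resync,
+ };
+
+ static void foo_init_ulp_ddp(struct net_device *netdev)
+ {
+	netdev->features |= NETIF_F_HW_ULP_DDP;
+	netdev->ulp_ddp_ops = &foo_ulp_ddp_ops;
+ }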
+
+Later, after the L5P completes its handshake, the L5P queries the
+device driver for its ULP capabilities (:c:type:`enum ulp_ddp_offload_capabilities`)
+and runtime limitations via the :c:member:`ulp_ddp_limits` callback:
+
+.. code-block:: c
+
+ int (*ulp_ddp_limits)(struct net_device *netdev,
+		      struct ulp_ddp_limits *limits);
+
+The current list of capabilities is:
+
+.. code-block:: c
+
+ enum ulp_ddp_offload_capabilities {
+	ULP_DDP_C_NVME_TCP = 1,
+	ULP_DDP_C_NVME_TCP_DDGST_RX = 2,
+ };
+
+All L5P share a common set of limits and parameters (:c:type:`struct ulp_ddp_limits`):
+
+.. code-block:: c
+
+ /**
+  * struct ulp_ddp_limits - Generic ulp ddp limits: tcp ddp
+  * protocol limits.
+  * Protocol implementations must use this as the first member.
+  * Add new instances of ulp_ddp_limits below (nvme-tcp, etc.).
+  *
+  * @max_ddp_sgl_len:	maximum sgl size supported (zero means no limit)
+  * @io_threshold:	minimum payload size required to offload
+  */
+ struct ulp_ddp_limits {
+	enum ulp_ddp_type	type;
+	u64			offload_capabilities;
+	int			max_ddp_sgl_len;
+	int			io_threshold;
+	unsigned char		buf[];
+ };
+
+But each L5P can also add protocol-specific limits, e.g.:
+
+.. code-block:: c
+
+ /**
+  * struct nvme_tcp_ddp_limits - nvme tcp driver limitations
+  *
+  * @full_ccid_range:	true if the driver supports the full CID range
+  */
+ struct nvme_tcp_ddp_limits {
+	struct ulp_ddp_limits	lmt;
+
+	bool			full_ccid_range;
+ };
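+
+Putting the two together, a minimal L5P-side check might look as follows
+(``nvme_tcp_offload_supported`` is a hypothetical helper; the values of
+``enum ulp_ddp_type`` are not listed in this document):
+
+.. code-block:: c
+
+ static bool nvme_tcp_offload_supported(struct net_device *netdev)
+ {
+	struct nvme_tcp_ddp_limits limits = {};
+
+	/* limits.lmt.type must be set to the protocol's enum ulp_ddp_type
+	 * value (the enumerators are defined by the ULP DDP header and are
+	 * not listed in this document).
+	 */
+	if (!(netdev->features & NETIF_F_HW_ULP_DDP))
+		return false;
+	if (netdev->ulp_ddp_ops->ulp_ddp_limits(netdev, &limits.lmt))
+		return false;
+
+	return limits.lmt.offload_capabilities & ULP_DDP_C_NVME_TCP;
+ }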
+
+Once the L5P has made sure the device is supported, the offload
+operations are installed on the socket.
+
+If offload installation fails, then the connection is handled by software as if
+offload was not attempted.
+
+To request offload for a socket `sk`, the L5P calls :c:member:`ulp_ddp_sk_add`:
+
+.. code-block:: c
+
+ int (*ulp_ddp_sk_add)(struct net_device *netdev,
+		      struct sock *sk,
+		      struct ulp_ddp_config *config);
+
+The function returns 0 on success. In case of failure, L5P software should
+fall back to normal non-offloaded operation.  The `config` parameter indicates
+the L5P type and any metadata relevant for that protocol. For example, in
+NVMe-TCP the following config is used:
+
+.. code-block:: c
+
+ /**
+  * struct nvme_tcp_ddp_config - nvme tcp ddp configuration for an IO queue
+  *
+  * @pfv:        pdu version (e.g., NVME_TCP_PFV_1_0)
+  * @cpda:       controller pdu data alignment (dwords, 0's based)
+  * @dgst:       digest types enabled.
+  *              The netdev will offload crc if L5P data digest is supported.
+  * @queue_size: number of nvme-tcp IO queue elements
+  * @queue_id:   queue identifier
+  * @io_cpu:     cpu core running the IO thread for this queue
+  */
+ struct nvme_tcp_ddp_config {
+	struct ulp_ddp_config   cfg;
+
+	u16			pfv;
+	u8			cpda;
+	u8			dgst;
+	int			queue_size;
+	int			queue_id;
+	int			io_cpu;
+ };
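+
+A hypothetical call site (the helper name and the way the queue parameters are
+obtained are assumptions of this sketch) could be:
+
+.. code-block:: c
+
+ static int nvme_tcp_offload_socket(struct net_device *netdev, struct sock *sk,
+				    int queue_id, int queue_size, int io_cpu)
+ {
+	struct nvme_tcp_ddp_config config = {
+		.pfv        = NVME_TCP_PFV_1_0,
+		.cpda       = 0,
+		.dgst       = 0,	/* data digest disabled in this sketch */
+		.queue_size = queue_size,
+		.queue_id   = queue_id,
+		.io_cpu     = io_cpu,
+	};
+
+	/* On failure the caller keeps using the regular software RX path. */
+	return netdev->ulp_ddp_ops->ulp_ddp_sk_add(netdev, sk, &config.cfg);
+ }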
+
+When the offload is no longer needed, e.g. when the socket is released, the L5P
+calls :c:member:`ulp_ddp_sk_del` to release device contexts:
+
+.. code-block:: c
+
+ void (*ulp_ddp_sk_del)(struct net_device *netdev,
+		        struct sock *sk);
+
+Normal operation
+================
+
+At the very least, the device maintains the following state for each connection:
+
+ * 5-tuple
+ * expected TCP sequence number
+ * mapping between tags and corresponding buffers
+ * current offset within PDU, PDU length, current PDU tag
+
+NICs should not assume any correlation between PDUs and TCP packets.
+If TCP packets arrive in-order, offload will place PDU payloads
+directly inside corresponding registered buffers. NIC offload should
+not delay packets. If offload is not possible, then the packet is
+passed as-is to software. To perform offload on incoming packets
+without buffering them in the NIC, the NIC stores some inter-packet
+state, such as partial PDU headers.
+
+RX data-path
+------------
+
+After the device validates TCP checksums, it can perform DDP offload.  The
+packet is steered to the DDP offload context according to the 5-tuple.
+Thereafter, the expected TCP sequence number is checked against the packet
+TCP sequence number. If there is a match, offload is performed: the PDU payload
+is DMA written to the corresponding destination buffer according to the PDU header
+tag.  The data should be DMAed only once, and the NIC receive ring will only
+store the remaining TCP and PDU headers.
+
+Note that a single TCP packet may have numerous PDUs embedded inside. NICs
+can choose to offload one or more of these PDUs according to various
+trade-offs. Offloading such small PDUs may be of little value, and it may be
+better to leave them to software.
+
+Upon receiving a DDP offloaded packet, the driver reconstructs the original SKB
+using page frags, while pointing to the destination buffers whenever possible.
+This method enables seamless integration with the network stack, which can
+inspect and modify packet fields transparently to the offload.
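+
+A minimal sketch of that reconstruction step (the helper is hypothetical;
+real drivers also have to account for truesize and header placement) is:
+
+.. code-block:: c
+
+ /* Attach the DDP destination page, already written by the NIC, to the
+  * SKB being built, instead of the receive-ring page.
+  */
+ static void foo_add_ddp_frag(struct sk_buff *skb, int i,
+			      struct page *dst_page, unsigned int off,
+			      unsigned int len)
+ {
+	get_page(dst_page);	/* reference held until the SKB is freed */
+	skb_add_rx_frag(skb, i, dst_page, off, len, len);
+ }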
+
+.. _buf_reg:
+
+Destination buffer registration
+-------------------------------
+
+To register the mapping between tags and destination buffers for a socket
+`sk`, the L5P calls :c:member:`ulp_ddp_setup` of :c:type:`struct ulp_ddp_ops
+<ulp_ddp_ops>`:
+
+.. code-block:: c
+
+ int (*ulp_ddp_setup)(struct net_device *netdev,
+		     struct sock *sk,
+		     struct ulp_ddp_io *io);
+
+
+The `io` argument provides the buffer via a scatter-gather list (`sg_table`)
+and the corresponding tag (`command_id`):
+
+.. code-block:: c
+
+ /**
+  * struct ulp_ddp_io - tcp ddp configuration for an IO request.
+  *
+  * @command_id:  identifier on the wire associated with these buffers
+  * @nents:       number of entries in the sg_table
+  * @sg_table:    describing the buffers for this IO request
+  * @first_sgl:   first SGL in sg_table
+  */
+ struct ulp_ddp_io {
+	u32			command_id;
+	int			nents;
+	struct sg_table		sg_table;
+	struct scatterlist	first_sgl[SG_CHUNK_SIZE];
+ };
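+
+A hypothetical caller, assuming the request's destination pages have already
+been loaded into ``io->sg_table``/``io->first_sgl``, might be:
+
+.. code-block:: c
+
+ static int nvme_tcp_map_ddp_buffers(struct net_device *netdev, struct sock *sk,
+				     struct ulp_ddp_io *io, u32 command_id)
+ {
+	io->command_id = command_id;
+	io->nents = io->sg_table.nents;
+
+	/* A non-zero return means the tag is not offloaded; the data for
+	 * this command will simply be copied out of the SKBs as usual.
+	 */
+	return netdev->ulp_ddp_ops->ulp_ddp_setup(netdev, sk, io);
+ }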
+
+After the buffers have been consumed by the L5P, the L5P calls
+:c:member:`ulp_ddp_teardown` of :c:type:`struct ulp_ddp_ops <ulp_ddp_ops>`
+to release the NIC mapping of the buffers:
+
+.. code-block:: c
+
+ int (*ulp_ddp_teardown)(struct net_device *netdev,
+			struct sock *sk,
+			struct ulp_ddp_io *io,
+			void *ddp_ctx);
+
+`ulp_ddp_teardown` receives the same `io` context and an additional opaque
+`ddp_ctx` that is used for asynchronous teardown, see the :ref:`async_release`
+section.
+
+.. _async_release:
+
+Asynchronous teardown
+---------------------
+
+To tear down the association between tags and buffers and allow tag reuse,
+the NIC driver calls into NIC HW during `ulp_ddp_teardown`. This operation may
+be performed either synchronously or asynchronously. In asynchronous teardown,
+`ulp_ddp_teardown` returns immediately without unmapping NIC HW buffers. Later,
+when the unmapping completes in NIC HW, the NIC driver calls up to the L5P
+using :c:member:`ddp_teardown_done` of :c:type:`struct ulp_ddp_ulp_ops`:
+
+.. code-block:: c
+
+ void (*ddp_teardown_done)(void *ddp_ctx);
+
+The `ddp_ctx` parameter passed to `ddp_teardown_done` is the same one provided
+in `ulp_ddp_teardown` and it is used to carry context about the buffers
+and tags that are released.
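+
+A hypothetical L5P implementation, assuming `ddp_ctx` carries the block-layer
+request that was being torn down (this pairing is an assumption of the sketch,
+not mandated by the API), could look like:
+
+.. code-block:: c
+
+ static void nvme_tcp_ddp_teardown_done(void *ddp_ctx)
+ {
+	struct request *rq = ddp_ctx;
+
+	/* The tag and its buffers may now be reused; finish the request. */
+	nvme_complete_rq(rq);
+ }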
+
+Resync handling
+===============
+
+RX
+--
+In the presence of packet drops or network packet reordering, the device may
+lose synchronization between the TCP stream and the L5P framing, and require a
+resync with the kernel's TCP stack. When the device is out of sync, no offload
+takes place, and packets are passed as-is to software. Resync is very similar
+to the process described for TLS offload in
+Documentation/networking/tls-offload.rst.
+
+If only packets with L5P data are lost or reordered, then resynchronization may
+be avoided by NIC HW that keeps track of PDU headers. If, however, PDU headers
+are reordered, then resynchronization is necessary.
+
+To resynchronize hardware during traffic, we use a handshake between hardware
+and software. The NIC HW searches for a sequence of bytes that identifies L5P
+headers (i.e., magic pattern).  For example, in NVMe-TCP, the PDU operation
+type can be used for this purpose.  Using the PDU header length field, the NIC
+HW will continue to find and match magic patterns in subsequent PDU headers. If
+the pattern is missing in an expected position, then searching for the pattern
+starts anew.
+
+The NIC will not resume offload when the magic pattern is first identified.
+Instead, it will request L5P software to confirm that indeed this is a PDU
+header. To request confirmation the NIC driver calls up to L5P using
+:c:member:`*resync_request` of :c:type:`struct ulp_ddp_ulp_ops`:
+
+.. code-block:: c
+
+  bool (*resync_request)(struct sock *sk, u32 seq, u32 flags);
+
+The `seq` parameter contains the TCP sequence of the last byte in the PDU header.
+The `flags` parameter contains a flag (`ULP_DDP_RESYNC_PENDING`) indicating whether
+a request is pending or not.
+L5P software will respond to this request after observing the packet containing
+TCP sequence `seq` in-order. If the PDU header is indeed there, then L5P
+software calls the NIC driver using the :c:member:`ulp_ddp_resync` function of
+the :c:type:`struct ulp_ddp_ops <ulp_ddp_ops>` inside the :c:type:`struct
+net_device <net_device>` while passing the same `seq` to confirm it is a PDU
+header.
+
+.. code-block:: c
+
+ void (*ulp_ddp_resync)(struct net_device *netdev,
+		       struct sock *sk, u32 seq);
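+
+A hypothetical L5P-side flow (the queue fields, the use of ``sk_user_data``,
+and the meaning of the return value of `resync_request` are assumptions of
+this sketch) could be:
+
+.. code-block:: c
+
+ static bool nvme_tcp_resync_request(struct sock *sk, u32 seq, u32 flags)
+ {
+	struct nvme_tcp_queue *queue = sk->sk_user_data;
+
+	if (!(flags & ULP_DDP_RESYNC_PENDING))
+		return false;
+
+	queue->resync_req = seq;	/* checked later on the RX path */
+	return true;
+ }
+
+ /* Called from the software RX path once the byte at TCP sequence `seq`
+  * has been received in order and a PDU header is found there.
+  */
+ static void nvme_tcp_maybe_resync(struct nvme_tcp_queue *queue,
+				   struct sock *sk, u32 seq)
+ {
+	struct net_device *netdev = queue->offload_netdev;
+
+	if (queue->resync_req && queue->resync_req == seq)
+		netdev->ulp_ddp_ops->ulp_ddp_resync(netdev, sk, seq);
+ }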
+
+Statistics
+==========
+
+Per L5P protocol, the NIC driver must report statistics for the above
+netdevice operations and for packets processed by the offload. For example,
+NVMe-TCP offload reports:
+
+ * ``rx_nvmeotcp_sk_add`` - number of NVMe-TCP Rx offload contexts created.
+ * ``rx_nvmeotcp_sk_add_fail`` - number of NVMe-TCP Rx offload context creation
+   failures.
+ * ``rx_nvmeotcp_sk_del`` - number of NVMe-TCP Rx offload contexts destroyed.
+ * ``rx_nvmeotcp_ddp_setup`` - number of DDP buffers mapped.
+ * ``rx_nvmeotcp_ddp_setup_fail`` - number of DDP buffer mappings that failed.
+ * ``rx_nvmeotcp_ddp_teardown`` - number of DDP buffers unmapped.
+ * ``rx_nvmeotcp_drop`` - number of packets dropped in the driver due to fatal
+   errors.
+ * ``rx_nvmeotcp_resync`` - number of packets with resync requests.
+ * ``rx_nvmeotcp_offload_packets`` - number of packets that used offload.
+ * ``rx_nvmeotcp_offload_bytes`` - number of bytes placed in DDP buffers.
+
+NIC requirements
+================
+
+NIC hardware should meet the following requirements to provide this offload:
+
+ * Offload must never buffer TCP packets.
+ * Offload must never modify TCP packet headers.
+ * Offload must never reorder TCP packets within a flow.
+ * Offload must never drop TCP packets.
+ * Offload must not depend on any TCP fields beyond the
+   5-tuple and TCP sequence number.