
[REPOST,libibverbs] Add IP and TCP/UDP TX checksum offload support

Message ID 1434276784-15909-1-git-send-email-ogerlitz@mellanox.com (mailing list archive)
State Changes Requested

Commit Message

Or Gerlitz June 14, 2015, 10:13 a.m. UTC
From: Moshe Lazer <moshel@mellanox.com>

Add a device capability flag IBV_DEVICE_IP_CSUM to denote checksum offload
support. Devices should set this flag if they support insertion of IP, TCP
and UDP checksums on outgoing IP packets sent over IB UD or ETH RAW_PACKET QPs.

In addition add a send flag IBV_SEND_IP_CSUM to utilize this capability.

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---

This is a repost of the code we sent earlier, in January.
I understand that people's focus nowadays is on
the few threads that deal with kernel changes, but
let's get it into patchwork again...

 include/infiniband/verbs.h |    4 +++-
 man/ibv_post_send.3        |    4 ++++
 2 files changed, 7 insertions(+), 1 deletions(-)

Comments

Christoph Lameter (Ampere) June 18, 2015, 3:21 p.m. UTC | #1
We run those patches and we would like to see them upstream.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe June 18, 2015, 4:38 p.m. UTC | #2
On Sun, Jun 14, 2015 at 01:13:04PM +0300, Or Gerlitz wrote:
> From: Moshe Lazer <moshel@mellanox.com>
> 
> Add a device capability flag IBV_DEVICE_IP_CSUM to denote checksum offload
> support. Devices should set this flag if they support insertion of IP, TCP
> and UDP checksums on outgoing IP packets sent over IB UD or ETH RAW_PACKET QPs.

It would be really nice to see the man page discuss exactly what is
done here; there are quite a few different options for checksum offload.

IPoIB maps this to 'NETIF_F_IP_CSUM', so only UDP/TCP checksum in
IPv4.

What value should be placed in the checksum field of the header(s) prior
to issuing the send? It looks like IPoIB uses the CHECKSUM_PARTIAL protocol?

This should all be made clear in the man page, IMHO.

Jason
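For readers following the question of what exactly gets offloaded, here is a minimal sketch of the RFC 1071 Internet checksum as applied to an IPv4 header. It is a software illustration only, not what any particular HCA does. The helper names (fold16, sum_be16, ipv4_hdr_csum) are made up for this example and are not part of libibverbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit one's-complement accumulator down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Add a buffer to the accumulator as big-endian 16-bit words. */
static uint32_t sum_be16(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)	/* odd trailing byte is padded with zero */
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/*
 * IPv4 header checksum. The checksum field (bytes 10-11) must be
 * zero on entry; the return value is what goes into that field.
 */
static uint16_t ipv4_hdr_csum(const uint8_t *hdr, size_t hdr_len)
{
	assert(hdr_len >= 20 && hdr_len % 4 == 0);
	return (uint16_t)~fold16(sum_be16(hdr, hdr_len, 0));
}
```

The TCP and UDP checksums use the same arithmetic but also cover a pseudo-header and the payload, which is why "checksum offload" can mean several different things.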
Christoph Lameter (Ampere) July 1, 2015, 3:17 p.m. UTC | #3
Is there any release schedule and/or upstream repo where I can see changes
for libibverbs and libmlx4?

Doug Ledford July 29, 2015, 4:32 p.m. UTC | #4
On 07/01/2015 11:17 AM, Christoph Lameter wrote:
> Is there any release schedule and/or upstream repo where I can see changes
> for libibverbs and libmlx4?
> 

I'm working on libibverbs right now.  The upstream git repo has not
changed, it's the same as it always was.

Or, Jason asked for some (IMO very much needed) documentation on this
feature via man page additions.  For instance, the changelog calls out
that this is only used for TCP/UDP packets over IB UD QPs or Eth Raw
QPs.  Well, IB UD QPs sending IP packets is IPoIB, which is generally a
kernel protocol.  So, when it comes to user space, is this realistically
a Raw Eth QP only thing?

Also, the man page entry calls out what I would classify as hardware
limitations in the general IB verbs man page.  I would think it would
more appropriately call out that the support is hardware dependent and
then provide some means for the hardware to specify what types of IP
CSUM offloads it supports.  For instance, there's been talk of
making connected mode IPoIB go faster by ignoring the TCP checksum and
relying on the RC EC code instead.
Another alternative would be to add TCP checksum support to your
firmware for connected mode QPs, so it's possible that in the future the
types of IP CSUM support could grow.  This patchset doesn't allow for
that or for distinguishing between hardware that has new or old firmware.
Or Gerlitz Aug. 11, 2015, 8:39 a.m. UTC | #5
On Thu, Jun 18, 2015 at 7:38 PM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Sun, Jun 14, 2015 at 01:13:04PM +0300, Or Gerlitz wrote:
>> From: Moshe Lazer <moshel@mellanox.com>
>>
>> Add a device capability flag IBV_DEVICE_IP_CSUM to denote checksum offload
>> support. Devices should set this flag if they support insertion of IP, TCP
>> and UDP checksums on outgoing IP packets sent over IB UD or ETH RAW_PACKET QPs.
>
> It would be really nice to see the man page discuss exactly what is
> done here, there are quite a few different options for checksum.
>
> IPoIB maps this to 'NETIF_F_IP_CSUM', so only UDP/TCP checksum in
> IPv4.
>
> What value should be placed in the checksum header(s) prior to issuing
> the send? It looks like IPoIB uses the CHECKSUM_PARTIAL protocol ?

Yes, here we can follow how IPoIB uses this device capability. In that case,
AFAIK, and from a quick look at the TX section for CHECKSUM_PARTIAL in
include/linux/skbuff.h, nothing special is assumed to be placed in the
checksum field of the header(s) prior to issuing the send. Agree?

Or.
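As a point of comparison for the CHECKSUM_PARTIAL question, here is a sketch of the Linux TX convention as I understand it from include/linux/skbuff.h: the stack leaves the uncomplemented pseudo-header sum in the L4 checksum field, and the device sums from the start of the L4 header to the end of the packet and stores the complement back into that field. Whether IBV_SEND_IP_CSUM expects the same seed is exactly what this thread leaves open, so treat this as an illustration of the kernel convention only. All function names are hypothetical, and finish_l4_csum is a software stand-in for the hardware step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit one's-complement accumulator down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Add a buffer to the accumulator as big-endian 16-bit words. */
static uint32_t sum_be16(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/* Uncomplemented IPv4 pseudo-header sum: src, dst, protocol, L4 length. */
static uint16_t pseudo_hdr_seed(const uint8_t src[4], const uint8_t dst[4],
				uint8_t proto, uint16_t l4_len)
{
	uint32_t sum = 0;

	sum = sum_be16(src, 4, sum);
	sum = sum_be16(dst, 4, sum);
	sum += proto;
	sum += l4_len;
	return fold16(sum);
}

/*
 * Stand-in for the device: sum the whole L4 segment (the seed is
 * already sitting in the checksum field) and store the complement
 * at csum_off. UDP's "send 0 as 0xffff" rule is left out for brevity.
 */
static void finish_l4_csum(uint8_t *l4, size_t len, size_t csum_off)
{
	uint16_t csum;

	assert(csum_off + 2 <= len);
	csum = (uint16_t)~fold16(sum_be16(l4, len, 0));
	l4[csum_off] = (uint8_t)(csum >> 8);
	l4[csum_off + 1] = (uint8_t)(csum & 0xff);
}
```

Under this convention the sender does place something specific in the checksum field before the send. A device that computes the pseudo-header itself would instead want the field zeroed, which is why the man page should say which of the two applies.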

Patch

diff --git a/include/infiniband/verbs.h b/include/infiniband/verbs.h
index cfa1156..78b5997 100644
--- a/include/infiniband/verbs.h
+++ b/include/infiniband/verbs.h
@@ -115,6 +115,7 @@  enum ibv_device_cap_flags {
 	IBV_DEVICE_RC_RNR_NAK_GEN	= 1 << 12,
 	IBV_DEVICE_SRQ_RESIZE		= 1 << 13,
 	IBV_DEVICE_N_NOTIFY_CQ		= 1 << 14,
+	IBV_DEVICE_IP_CSUM		= 1 << 18,
 	IBV_DEVICE_XRC			= 1 << 20,
 	IBV_DEVICE_MANAGED_FLOW_STEERING = 1 << 29
 };
@@ -627,7 +628,8 @@  enum ibv_send_flags {
 	IBV_SEND_FENCE		= 1 << 0,
 	IBV_SEND_SIGNALED	= 1 << 1,
 	IBV_SEND_SOLICITED	= 1 << 2,
-	IBV_SEND_INLINE		= 1 << 3
+	IBV_SEND_INLINE		= 1 << 3,
+	IBV_SEND_IP_CSUM	= 1 << 4
 };
 
 struct ibv_sge {
diff --git a/man/ibv_post_send.3 b/man/ibv_post_send.3
index 33fbb50..8070996 100644
--- a/man/ibv_post_send.3
+++ b/man/ibv_post_send.3
@@ -98,6 +98,10 @@  The attribute send_flags describes the properties of the \s-1WR\s0. It is either
 .TP
 .B IBV_SEND_INLINE \fR Send data in given gather list as inline data
 in a send WQE.  Valid only for Send and RDMA Write.  The L_Key will not be checked.
+.TP
+.B IBV_SEND_IP_CSUM \fR Offload the IP and TCP/UDP checksum calculation.
+Valid only for QPs with Transport Service Type \fBIBV_QPT_UD\fR or \fBIBV_QPT_RAW_PACKET\fR.
+This feature is supported only when the \fBIBV_DEVICE_IP_CSUM\fR flag is set in the device capability flags.
 .SH "RETURN VALUE"
 .B ibv_post_send()
 returns 0 on success, or the value of errno on failure (which indicates the failure reason).
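
A rough sketch of how an application might consume the two flags this patch adds. To keep it buildable without libibverbs, the constants are local stand-ins: the EX_DEVICE_IP_CSUM and EX_SEND_* values are copied from the hunks above, while enum ex_qp_type and ex_send_flags() are purely hypothetical. In a real program the capability word would come from device_cap_flags in struct ibv_device_attr (filled by ibv_query_device()), and the result would go into send_flags of struct ibv_send_wr.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins; values copied from the patch above. */
enum { EX_DEVICE_IP_CSUM = 1 << 18 };
enum { EX_SEND_SIGNALED = 1 << 1, EX_SEND_IP_CSUM = 1 << 4 };

/* Hypothetical QP type enum; numeric values are NOT those of ibv_qp_type. */
enum ex_qp_type { EX_QPT_RC, EX_QPT_UD, EX_QPT_RAW_PACKET };

/*
 * Return the send flags to use for one work request. The checksum
 * flag is added only when the device advertises the capability and
 * the QP type is one the man page text lists as valid. Otherwise the
 * caller has to compute the checksums in software.
 */
static int ex_send_flags(uint32_t device_cap_flags, enum ex_qp_type qpt,
			 int base_flags)
{
	int qp_ok = (qpt == EX_QPT_UD || qpt == EX_QPT_RAW_PACKET);

	assert(!(base_flags & EX_SEND_IP_CSUM));
	if ((device_cap_flags & EX_DEVICE_IP_CSUM) && qp_ok)
		return base_flags | EX_SEND_IP_CSUM;
	return base_flags;
}
```

Checking the capability once at setup and caching the result avoids a per-send branch on the device attributes.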