Message ID | 20250227041209.2031104-8-almasrymina@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Device memory TCP TX | expand |
On Thu, 27 Feb 2025 04:12:08 +0000 Mina Almasry wrote:
> + if (!skb_frags_readable(skb) && !dev->netmem_tx)
How do you know it's for _this_ device tho?
The driver doesn't seem to check the DMA mapping belongs to it either.
Remind me, how do we prevent the unreadable skbs from getting into the
Tx path today?
On Fri, Feb 28, 2025 at 4:43 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 27 Feb 2025 04:12:08 +0000 Mina Almasry wrote: > > + if (!skb_frags_readable(skb) && !dev->netmem_tx) > > How do you know it's for _this_ device tho? Maybe a noob question, but how do we end up here with an skb that is not targeted for the 'dev' device? We are checking in tcp_sendmsg_locked that we're targeting the appropriate device before creating the skb. Is this about a packet arriving on a dmabuf bound to a device and then being forwarded through another device that doesn't own the mapping, bypassing the check? > The driver doesn't seem to check the DMA mapping belongs to it either. > > Remind me, how do we prevent the unreadable skbs from getting into the > Tx path today? I'm not sure if this is about forwarding, or if there is some other way for unreadable skbs to end up in the XT path that you have in mind. At some point in this thread[1] we had talked about preventing MP bound devices from being lower devices at all to side step this entirely but you mentioned that may not be enough, and we ended up sidestepping only XDP entirely. [1] https://lore.kernel.org/bpf/20240821153049.7dc983db@kernel.org/ -- Thanks, Mina
On Fri, 28 Feb 2025 17:53:24 -0800 Mina Almasry wrote: > On Fri, Feb 28, 2025 at 4:43 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 27 Feb 2025 04:12:08 +0000 Mina Almasry wrote: > > > + if (!skb_frags_readable(skb) && !dev->netmem_tx) > > > > How do you know it's for _this_ device tho? > > Maybe a noob question, but how do we end up here with an skb that is > not targeted for the 'dev' device? We are checking in > tcp_sendmsg_locked that we're targeting the appropriate device before > creating the skb. Is this about a packet arriving on a dmabuf bound to > a device and then being forwarded through another device that doesn't > own the mapping, bypassing the check? Forwarded or just redirected by nft/bpf/tc > > The driver doesn't seem to check the DMA mapping belongs to it either. > > > > Remind me, how do we prevent the unreadable skbs from getting into the > > Tx path today? > > I'm not sure if this is about forwarding, or if there is some other > way for unreadable skbs to end up in the XT path that you have in > mind. At some point in this thread[1] we had talked about preventing > MP bound devices from being lower devices at all to side step this > entirely but you mentioned that may not be enough, and we ended up > sidestepping only XDP entirely. > > [1] https://lore.kernel.org/bpf/20240821153049.7dc983db@kernel.org/ Upper devices and BPF access is covered I think, by the skbuff checks. But I think we missed adding a check in validate_xmit_skb() to protect the xmit paths of HW|virt drivers. You can try to add a TC rule which forwards all traffic from your devmem flow back out to the device and see if it crashes on net-next ?
On Mon, Mar 3, 2025 at 4:29 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Fri, 28 Feb 2025 17:53:24 -0800 Mina Almasry wrote: > > On Fri, Feb 28, 2025 at 4:43 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > On Thu, 27 Feb 2025 04:12:08 +0000 Mina Almasry wrote: > > > > + if (!skb_frags_readable(skb) && !dev->netmem_tx) > > > > > > How do you know it's for _this_ device tho? > > > > Maybe a noob question, but how do we end up here with an skb that is > > not targeted for the 'dev' device? We are checking in > > tcp_sendmsg_locked that we're targeting the appropriate device before > > creating the skb. Is this about a packet arriving on a dmabuf bound to > > a device and then being forwarded through another device that doesn't > > own the mapping, bypassing the check? > > Forwarded or just redirected by nft/bpf/tc > > > > The driver doesn't seem to check the DMA mapping belongs to it either. > > > > > > Remind me, how do we prevent the unreadable skbs from getting into the > > > Tx path today? > > > > I'm not sure if this is about forwarding, or if there is some other > > way for unreadable skbs to end up in the XT path that you have in > > mind. At some point in this thread[1] we had talked about preventing > > MP bound devices from being lower devices at all to side step this > > entirely but you mentioned that may not be enough, and we ended up > > sidestepping only XDP entirely. > > > > [1] https://lore.kernel.org/bpf/20240821153049.7dc983db@kernel.org/ > > Upper devices and BPF access is covered I think, by the skbuff checks. > But I think we missed adding a check in validate_xmit_skb() to protect > the xmit paths of HW|virt drivers. You can try to add a TC rule which > forwards all traffic from your devmem flow back out to the device and > see if it crashes on net-next ? No crash, but by adding debug logs I'm detecting that we're passing unreadable netmem dma-addresses to the dma_unmap_*() APIs, which is known to be unsafe. I just can't reproduce an issue because my platform has the IOMMU disabled. I guess I do need to send the hunk from validate_xmit_skb() as a fix to net and CC stable. Another thing I'm worried about is ip_forward() inserting an unreadable skb into the tx path somewhere higher up the stack which calls more code that isn't expecting unreadable skbs? Specifically worried about skb_frag_ref/unref. Does this sound like a concern as well? Or is it a similar code path to tc? -- Thanks, Mina
diff --git a/net/core/dev.c b/net/core/dev.c index 3f525278a871..80636c569cf4 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3921,6 +3921,9 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device skb = validate_xmit_xfrm(skb, features, again); + if (!skb_frags_readable(skb) && !dev->netmem_tx) + goto out_kfree_skb; + return skb; out_kfree_skb: diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 6e7cd6a5c177..6c5d62df0d65 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -972,6 +972,13 @@ int netdev_nl_bind_tx_doit(struct sk_buff *skb, struct genl_info *info) goto err_unlock; } + if (!netdev->netmem_tx) { + err = -EOPNOTSUPP; + NL_SET_ERR_MSG(info->extack, + "Driver does not support netmem TX"); + goto err_unlock; + } + binding = net_devmem_bind_dmabuf(netdev, DMA_TO_DEVICE, dmabuf_fd, info->extack); if (IS_ERR(binding)) {