Message ID | 20231220014747.1508581-4-dw@davidwei.uk (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | netdevsim: link and forward skbs between ports | expand |
Wed, Dec 20, 2023 at 02:47:45AM CET, dw@davidwei.uk wrote: >Forward skbs sent from one netdevsim port to its connected netdevsim >port using dev_forward_skb, in a spirit similar to veth. > >Add a tx_dropped variable to struct netdevsim, tracking the number of >skbs that could not be forwarded using dev_forward_skb(). > >The xmit() function accessing the peer ptr is protected by an RCU read >critical section. The rcu_read_lock() is functionally redundant as since >v5.0 all softirqs are implicitly RCU read critical sections; but it is >useful for human readers. > >If another CPU is concurrently in nsim_destroy(), then it will first set >the peer ptr to NULL. This does not affect any existing readers that >dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >a synchronize_rcu() before the netdev is actually unregistered and >freed. This ensures that any readers i.e. xmit() that got a non-NULL >peer will complete before the netdev is freed. > >Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >will dereference NULL, making it safe. > >The codepath to nsim_destroy() and nsim_create() takes both the newly >added nsim_dev_list_lock and rtnl_lock. This makes it safe with >concurrent calls to linking two netdevsims together. > >Signed-off-by: David Wei <dw@davidwei.uk> >--- > drivers/net/netdevsim/netdev.c | 25 ++++++++++++++++++++++--- > drivers/net/netdevsim/netdevsim.h | 1 + > 2 files changed, 23 insertions(+), 3 deletions(-) > >diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >index 434322f6a565..00ab3098eb9f 100644 >--- a/drivers/net/netdevsim/netdev.c >+++ b/drivers/net/netdevsim/netdev.c >@@ -29,6 +29,8 @@ > static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) > { > struct netdevsim *ns = netdev_priv(dev); >+ struct netdevsim *peer_ns; >+ int ret = NETDEV_TX_OK; > > if (!nsim_ipsec_tx(ns, skb)) > goto out; >@@ -36,12 +38,29 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) > u64_stats_update_begin(&ns->syncp); > ns->tx_packets++; > ns->tx_bytes += skb->len; >+ >+ rcu_read_lock(); >+ peer_ns = rcu_dereference(ns->peer); >+ if (!peer_ns) >+ goto out_stats; >+ >+ skb_tx_timestamp(skb); >+ if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) { >+ ret = NET_XMIT_DROP; >+ ns->tx_dropped++; Idk, does not look fine to me to be in u64_stats_update section while calling dev_forward_skb() >+ } >+ >+ rcu_read_unlock(); > u64_stats_update_end(&ns->syncp); > >+ return ret; >+ >+out_stats: >+ rcu_read_unlock(); >+ u64_stats_update_end(&ns->syncp); > out: > dev_kfree_skb(skb); >- >- return NETDEV_TX_OK; >+ return ret; > } > > static void nsim_set_rx_mode(struct net_device *dev) >@@ -70,6 +89,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) > start = u64_stats_fetch_begin(&ns->syncp); > stats->tx_bytes = ns->tx_bytes; > stats->tx_packets = ns->tx_packets; >+ stats->tx_dropped = ns->tx_dropped; > } while (u64_stats_fetch_retry(&ns->syncp, start)); > } > >@@ -302,7 +322,6 @@ static void nsim_setup(struct net_device *dev) > eth_hw_addr_random(dev); > > dev->tx_queue_len = 0; >- dev->flags |= IFF_NOARP; > dev->flags &= ~IFF_MULTICAST; > dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | > IFF_NO_QUEUE; >diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >index 24fc3fbda791..083b1ee7a1a2 100644 >--- a/drivers/net/netdevsim/netdevsim.h >+++ b/drivers/net/netdevsim/netdevsim.h >@@ -98,6 +98,7 @@ struct netdevsim { > > u64 tx_packets; > u64 tx_bytes; >+ u64 tx_dropped; > struct u64_stats_sync syncp; > > struct nsim_bus_dev *nsim_bus_dev; >-- >2.39.3 >
On 2023-12-20 01:04, Jiri Pirko wrote: > Wed, Dec 20, 2023 at 02:47:45AM CET, dw@davidwei.uk wrote: >> Forward skbs sent from one netdevsim port to its connected netdevsim >> port using dev_forward_skb, in a spirit similar to veth. >> >> Add a tx_dropped variable to struct netdevsim, tracking the number of >> skbs that could not be forwarded using dev_forward_skb(). >> >> The xmit() function accessing the peer ptr is protected by an RCU read >> critical section. The rcu_read_lock() is functionally redundant as since >> v5.0 all softirqs are implicitly RCU read critical sections; but it is >> useful for human readers. >> >> If another CPU is concurrently in nsim_destroy(), then it will first set >> the peer ptr to NULL. This does not affect any existing readers that >> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >> a synchronize_rcu() before the netdev is actually unregistered and >> freed. This ensures that any readers i.e. xmit() that got a non-NULL >> peer will complete before the netdev is freed. >> >> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >> will dereference NULL, making it safe. >> >> The codepath to nsim_destroy() and nsim_create() takes both the newly >> added nsim_dev_list_lock and rtnl_lock. This makes it safe with >> concurrent calls to linking two netdevsims together. >> >> Signed-off-by: David Wei <dw@davidwei.uk> >> --- >> drivers/net/netdevsim/netdev.c | 25 ++++++++++++++++++++++--- >> drivers/net/netdevsim/netdevsim.h | 1 + >> 2 files changed, 23 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >> index 434322f6a565..00ab3098eb9f 100644 >> --- a/drivers/net/netdevsim/netdev.c >> +++ b/drivers/net/netdevsim/netdev.c >> @@ -29,6 +29,8 @@ >> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) >> { >> struct netdevsim *ns = netdev_priv(dev); >> + struct netdevsim *peer_ns; >> + int ret = NETDEV_TX_OK; >> >> if (!nsim_ipsec_tx(ns, skb)) >> goto out; >> @@ -36,12 +38,29 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) >> u64_stats_update_begin(&ns->syncp); >> ns->tx_packets++; >> ns->tx_bytes += skb->len; >> + >> + rcu_read_lock(); >> + peer_ns = rcu_dereference(ns->peer); >> + if (!peer_ns) >> + goto out_stats; >> + >> + skb_tx_timestamp(skb); >> + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) { >> + ret = NET_XMIT_DROP; >> + ns->tx_dropped++; > > Idk, does not look fine to me to be in u64_stats_update section while > calling dev_forward_skb() I see, it's a type of spinlock. I will fix this. > > >> + } >> + >> + rcu_read_unlock(); >> u64_stats_update_end(&ns->syncp); >> >> + return ret; >> + >> +out_stats: >> + rcu_read_unlock(); >> + u64_stats_update_end(&ns->syncp); >> out: >> dev_kfree_skb(skb); >> - >> - return NETDEV_TX_OK; >> + return ret; >> } >> >> static void nsim_set_rx_mode(struct net_device *dev) >> @@ -70,6 +89,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) >> start = u64_stats_fetch_begin(&ns->syncp); >> stats->tx_bytes = ns->tx_bytes; >> stats->tx_packets = ns->tx_packets; >> + stats->tx_dropped = ns->tx_dropped; >> } while (u64_stats_fetch_retry(&ns->syncp, start)); >> } >> >> @@ -302,7 +322,6 @@ static void nsim_setup(struct net_device *dev) >> eth_hw_addr_random(dev); >> >> dev->tx_queue_len = 0; >> - dev->flags |= IFF_NOARP; >> dev->flags &= ~IFF_MULTICAST; >> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | >> IFF_NO_QUEUE; >> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >> index 24fc3fbda791..083b1ee7a1a2 100644 >> --- a/drivers/net/netdevsim/netdevsim.h >> +++ b/drivers/net/netdevsim/netdevsim.h >> @@ -98,6 +98,7 @@ struct netdevsim { >> >> u64 tx_packets; >> u64 tx_bytes; >> + u64 tx_dropped; >> struct u64_stats_sync syncp; >> >> struct nsim_bus_dev *nsim_bus_dev; >> -- >> 2.39.3 >>
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c index 434322f6a565..00ab3098eb9f 100644 --- a/drivers/net/netdevsim/netdev.c +++ b/drivers/net/netdevsim/netdev.c @@ -29,6 +29,8 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct netdevsim *ns = netdev_priv(dev); + struct netdevsim *peer_ns; + int ret = NETDEV_TX_OK; if (!nsim_ipsec_tx(ns, skb)) goto out; @@ -36,12 +38,29 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) u64_stats_update_begin(&ns->syncp); ns->tx_packets++; ns->tx_bytes += skb->len; + + rcu_read_lock(); + peer_ns = rcu_dereference(ns->peer); + if (!peer_ns) + goto out_stats; + + skb_tx_timestamp(skb); + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) { + ret = NET_XMIT_DROP; + ns->tx_dropped++; + } + + rcu_read_unlock(); u64_stats_update_end(&ns->syncp); + return ret; + +out_stats: + rcu_read_unlock(); + u64_stats_update_end(&ns->syncp); out: dev_kfree_skb(skb); - - return NETDEV_TX_OK; + return ret; } static void nsim_set_rx_mode(struct net_device *dev) @@ -70,6 +89,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) start = u64_stats_fetch_begin(&ns->syncp); stats->tx_bytes = ns->tx_bytes; stats->tx_packets = ns->tx_packets; + stats->tx_dropped = ns->tx_dropped; } while (u64_stats_fetch_retry(&ns->syncp, start)); } @@ -302,7 +322,6 @@ static void nsim_setup(struct net_device *dev) eth_hw_addr_random(dev); dev->tx_queue_len = 0; - dev->flags |= IFF_NOARP; dev->flags &= ~IFF_MULTICAST; dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_NO_QUEUE; diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h index 24fc3fbda791..083b1ee7a1a2 100644 --- a/drivers/net/netdevsim/netdevsim.h +++ b/drivers/net/netdevsim/netdevsim.h @@ -98,6 +98,7 @@ struct netdevsim { u64 tx_packets; u64 tx_bytes; + u64 tx_dropped; struct u64_stats_sync syncp; struct nsim_bus_dev *nsim_bus_dev;
Forward skbs sent from one netdevsim port to its connected netdevsim port using dev_forward_skb, in a spirit similar to veth. Add a tx_dropped variable to struct netdevsim, tracking the number of skbs that could not be forwarded using dev_forward_skb(). The xmit() function accessing the peer ptr is protected by an RCU read critical section. The rcu_read_lock() is functionally redundant as since v5.0 all softirqs are implicitly RCU read critical sections; but it is useful for human readers. If another CPU is concurrently in nsim_destroy(), then it will first set the peer ptr to NULL. This does not affect any existing readers that dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is a synchronize_rcu() before the netdev is actually unregistered and freed. This ensures that any readers i.e. xmit() that got a non-NULL peer will complete before the netdev is freed. Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() will dereference NULL, making it safe. The codepath to nsim_destroy() and nsim_create() takes both the newly added nsim_dev_list_lock and rtnl_lock. This makes it safe with concurrent calls to linking two netdevsims together. Signed-off-by: David Wei <dw@davidwei.uk> --- drivers/net/netdevsim/netdev.c | 25 ++++++++++++++++++++++--- drivers/net/netdevsim/netdevsim.h | 1 + 2 files changed, 23 insertions(+), 3 deletions(-)