Message ID | 20240210003240.847392-3-dw@davidwei.uk (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | netdevsim: link and forward skbs between ports | expand |
On 10/02/2024 01:32, David Wei wrote: > Forward skbs sent from one netdevsim port to its connected netdevsim > port using dev_forward_skb, in a spirit similar to veth. > > Add a tx_dropped variable to struct netdevsim, tracking the number of > skbs that could not be forwarded using dev_forward_skb(). > > The xmit() function accessing the peer ptr is protected by an RCU read > critical section. The rcu_read_lock() is functionally redundant as since > v5.0 all softirqs are implicitly RCU read critical sections; but it is > useful for human readers. > > If another CPU is concurrently in nsim_destroy(), then it will first set > the peer ptr to NULL. This does not affect any existing readers that > dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is > a synchronize_rcu() before the netdev is actually unregistered and > freed. This ensures that any readers i.e. xmit() that got a non-NULL > peer will complete before the netdev is freed. > > Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() > will dereference NULL, making it safe. > > The codepath to nsim_destroy() and nsim_create() takes both the newly > added nsim_dev_list_lock and rtnl_lock. This makes it safe with > concurrent calls to linking two netdevsims together. > > Signed-off-by: David Wei <dw@davidwei.uk> > --- > drivers/net/netdevsim/netdev.c | 28 +++++++++++++++++++++++----- > drivers/net/netdevsim/netdevsim.h | 1 + > 2 files changed, 24 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c > index 9063f4f2971b..13d3e1536451 100644 > --- a/drivers/net/netdevsim/netdev.c > +++ b/drivers/net/netdevsim/netdev.c > @@ -29,19 +29,37 @@ > static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) > { > struct netdevsim *ns = netdev_priv(dev); > + unsigned int len = skb->len; > + struct netdevsim *peer_ns; > + int ret = NETDEV_TX_OK; > > if (!nsim_ipsec_tx(ns, skb)) > goto out; > > + rcu_read_lock(); > + peer_ns = rcu_dereference(ns->peer); > + if (!peer_ns) > + goto out_stats; Change ret to NET_XMIT_DROP to correctly count packets as dropped > + > + skb_tx_timestamp(skb); > + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) > + ret = NET_XMIT_DROP; > + > +out_stats: > + rcu_read_unlock(); > u64_stats_update_begin(&ns->syncp); > - ns->tx_packets++; > - ns->tx_bytes += skb->len; > + if (ret == NET_XMIT_DROP) { > + ns->tx_dropped++; add dev_kfree_skb(skb); Thanks, Maciek > + } else { > + ns->tx_packets++; > + ns->tx_bytes += len; > + } > u64_stats_update_end(&ns->syncp); > + return ret; > > out: > dev_kfree_skb(skb); > - > - return NETDEV_TX_OK; > + return ret; > } > > static void nsim_set_rx_mode(struct net_device *dev) > @@ -70,6 +88,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) > start = u64_stats_fetch_begin(&ns->syncp); > stats->tx_bytes = ns->tx_bytes; > stats->tx_packets = ns->tx_packets; > + stats->tx_dropped = ns->tx_dropped; > } while (u64_stats_fetch_retry(&ns->syncp, start)); > } > > @@ -302,7 +321,6 @@ static void nsim_setup(struct net_device *dev) > eth_hw_addr_random(dev); > > dev->tx_queue_len = 0; > - dev->flags |= IFF_NOARP; > dev->flags &= ~IFF_MULTICAST; > dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | > IFF_NO_QUEUE; > diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h > index c8b45b0d955e..553c4b9b4f63 100644 > --- a/drivers/net/netdevsim/netdevsim.h > +++ b/drivers/net/netdevsim/netdevsim.h > @@ -98,6 +98,7 @@ struct netdevsim { > > u64 tx_packets; > u64 tx_bytes; > + u64 tx_dropped; > struct u64_stats_sync syncp; > > struct nsim_bus_dev *nsim_bus_dev;
On 2024-02-10 13:43, Maciek Machnikowski wrote: > > > On 10/02/2024 01:32, David Wei wrote: >> Forward skbs sent from one netdevsim port to its connected netdevsim >> port using dev_forward_skb, in a spirit similar to veth. >> >> Add a tx_dropped variable to struct netdevsim, tracking the number of >> skbs that could not be forwarded using dev_forward_skb(). >> >> The xmit() function accessing the peer ptr is protected by an RCU read >> critical section. The rcu_read_lock() is functionally redundant as since >> v5.0 all softirqs are implicitly RCU read critical sections; but it is >> useful for human readers. >> >> If another CPU is concurrently in nsim_destroy(), then it will first set >> the peer ptr to NULL. This does not affect any existing readers that >> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >> a synchronize_rcu() before the netdev is actually unregistered and >> freed. This ensures that any readers i.e. xmit() that got a non-NULL >> peer will complete before the netdev is freed. >> >> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >> will dereference NULL, making it safe. >> >> The codepath to nsim_destroy() and nsim_create() takes both the newly >> added nsim_dev_list_lock and rtnl_lock. This makes it safe with >> concurrent calls to linking two netdevsims together. >> >> Signed-off-by: David Wei <dw@davidwei.uk> >> --- >> drivers/net/netdevsim/netdev.c | 28 +++++++++++++++++++++++----- >> drivers/net/netdevsim/netdevsim.h | 1 + >> 2 files changed, 24 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >> index 9063f4f2971b..13d3e1536451 100644 >> --- a/drivers/net/netdevsim/netdev.c >> +++ b/drivers/net/netdevsim/netdev.c >> @@ -29,19 +29,37 @@ >> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) >> { >> struct netdevsim *ns = netdev_priv(dev); >> + unsigned int len = skb->len; >> + struct netdevsim *peer_ns; >> + int ret = NETDEV_TX_OK; >> >> if (!nsim_ipsec_tx(ns, skb)) >> goto out; >> >> + rcu_read_lock(); >> + peer_ns = rcu_dereference(ns->peer); >> + if (!peer_ns) >> + goto out_stats; > Change ret to NET_XMIT_DROP to correctly count packets as dropped Linking and forwarding between two netdevsims is optional, so I don't want to always return NET_XMIT_DROP. > >> + >> + skb_tx_timestamp(skb); >> + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) >> + ret = NET_XMIT_DROP; >> + >> +out_stats: >> + rcu_read_unlock(); >> u64_stats_update_begin(&ns->syncp); >> - ns->tx_packets++; >> - ns->tx_bytes += skb->len; >> + if (ret == NET_XMIT_DROP) { >> + ns->tx_dropped++; > add dev_kfree_skb(skb); dev_forward_skb() frees the skb if dropped already. > > Thanks, > Maciek > >> + } else { >> + ns->tx_packets++; >> + ns->tx_bytes += len; >> + } >> u64_stats_update_end(&ns->syncp); >> + return ret; >> >> out: >> dev_kfree_skb(skb); >> - >> - return NETDEV_TX_OK; >> + return ret; >> } >> >> static void nsim_set_rx_mode(struct net_device *dev) >> @@ -70,6 +88,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) >> start = u64_stats_fetch_begin(&ns->syncp); >> stats->tx_bytes = ns->tx_bytes; >> stats->tx_packets = ns->tx_packets; >> + stats->tx_dropped = ns->tx_dropped; >> } while (u64_stats_fetch_retry(&ns->syncp, start)); >> } >> >> @@ -302,7 +321,6 @@ static void nsim_setup(struct net_device *dev) >> eth_hw_addr_random(dev); >> >> dev->tx_queue_len = 0; >> - dev->flags |= IFF_NOARP; >> dev->flags &= ~IFF_MULTICAST; >> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | >> IFF_NO_QUEUE; >> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >> index c8b45b0d955e..553c4b9b4f63 100644 >> --- a/drivers/net/netdevsim/netdevsim.h >> +++ b/drivers/net/netdevsim/netdevsim.h >> @@ -98,6 +98,7 @@ struct netdevsim { >> >> u64 tx_packets; >> u64 tx_bytes; >> + u64 tx_dropped; >> struct u64_stats_sync syncp; >> >> struct nsim_bus_dev *nsim_bus_dev;
On 12/02/2024 18:32, David Wei wrote: > On 2024-02-10 13:43, Maciek Machnikowski wrote: >> >> >> On 10/02/2024 01:32, David Wei wrote: >>> Forward skbs sent from one netdevsim port to its connected netdevsim >>> port using dev_forward_skb, in a spirit similar to veth. >>> >>> Add a tx_dropped variable to struct netdevsim, tracking the number of >>> skbs that could not be forwarded using dev_forward_skb(). >>> >>> The xmit() function accessing the peer ptr is protected by an RCU read >>> critical section. The rcu_read_lock() is functionally redundant as since >>> v5.0 all softirqs are implicitly RCU read critical sections; but it is >>> useful for human readers. >>> >>> If another CPU is concurrently in nsim_destroy(), then it will first set >>> the peer ptr to NULL. This does not affect any existing readers that >>> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >>> a synchronize_rcu() before the netdev is actually unregistered and >>> freed. This ensures that any readers i.e. xmit() that got a non-NULL >>> peer will complete before the netdev is freed. >>> >>> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >>> will dereference NULL, making it safe. >>> >>> The codepath to nsim_destroy() and nsim_create() takes both the newly >>> added nsim_dev_list_lock and rtnl_lock. This makes it safe with >>> concurrent calls to linking two netdevsims together. >>> >>> Signed-off-by: David Wei <dw@davidwei.uk> >>> --- >>> drivers/net/netdevsim/netdev.c | 28 +++++++++++++++++++++++----- >>> drivers/net/netdevsim/netdevsim.h | 1 + >>> 2 files changed, 24 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >>> index 9063f4f2971b..13d3e1536451 100644 >>> --- a/drivers/net/netdevsim/netdev.c >>> +++ b/drivers/net/netdevsim/netdev.c >>> @@ -29,19 +29,37 @@ >>> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) >>> { >>> struct netdevsim *ns = netdev_priv(dev); >>> + unsigned int len = skb->len; >>> + struct netdevsim *peer_ns; >>> + int ret = NETDEV_TX_OK; >>> >>> if (!nsim_ipsec_tx(ns, skb)) >>> goto out; >>> >>> + rcu_read_lock(); >>> + peer_ns = rcu_dereference(ns->peer); >>> + if (!peer_ns) >>> + goto out_stats; >> Change ret to NET_XMIT_DROP to correctly count packets as dropped > > Linking and forwarding between two netdevsims is optional, so I don't > want to always return NET_XMIT_DROP. > OK - now I see that previously we calculated all packets as TX_OK and transmitted, so this is OK. >> >>> + >>> + skb_tx_timestamp(skb); >>> + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) >>> + ret = NET_XMIT_DROP; >>> + >>> +out_stats: >>> + rcu_read_unlock(); >>> u64_stats_update_begin(&ns->syncp); >>> - ns->tx_packets++; >>> - ns->tx_bytes += skb->len; >>> + if (ret == NET_XMIT_DROP) { >>> + ns->tx_dropped++; >> add dev_kfree_skb(skb); > > dev_forward_skb() frees the skb if dropped already. But the dev_forward_skb() won't be called if there is no peer_ns. In this case we still need to call dev_kfree_skb(). > >> >> Thanks, >> Maciek >> >>> + } else { >>> + ns->tx_packets++; >>> + ns->tx_bytes += len; >>> + } >>> u64_stats_update_end(&ns->syncp); >>> + return ret; >>> >>> out: >>> dev_kfree_skb(skb); >>> - >>> - return NETDEV_TX_OK; >>> + return ret; >>> } >>> >>> static void nsim_set_rx_mode(struct net_device *dev) >>> @@ -70,6 +88,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) >>> start = u64_stats_fetch_begin(&ns->syncp); >>> stats->tx_bytes = ns->tx_bytes; >>> stats->tx_packets = ns->tx_packets; >>> + stats->tx_dropped = ns->tx_dropped; >>> } while (u64_stats_fetch_retry(&ns->syncp, start)); >>> } >>> >>> @@ -302,7 +321,6 @@ static void nsim_setup(struct net_device *dev) >>> eth_hw_addr_random(dev); >>> >>> dev->tx_queue_len = 0; >>> - dev->flags |= IFF_NOARP; >>> dev->flags &= ~IFF_MULTICAST; >>> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | >>> IFF_NO_QUEUE; >>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >>> index c8b45b0d955e..553c4b9b4f63 100644 >>> --- a/drivers/net/netdevsim/netdevsim.h >>> +++ b/drivers/net/netdevsim/netdevsim.h >>> @@ -98,6 +98,7 @@ struct netdevsim { >>> >>> u64 tx_packets; >>> u64 tx_bytes; >>> + u64 tx_dropped; >>> struct u64_stats_sync syncp; >>> >>> struct nsim_bus_dev *nsim_bus_dev;
On 2024-02-12 10:10, Maciek Machnikowski wrote: > > > On 12/02/2024 18:32, David Wei wrote: >> On 2024-02-10 13:43, Maciek Machnikowski wrote: >>> >>> >>> On 10/02/2024 01:32, David Wei wrote: >>>> Forward skbs sent from one netdevsim port to its connected netdevsim >>>> port using dev_forward_skb, in a spirit similar to veth. >>>> >>>> Add a tx_dropped variable to struct netdevsim, tracking the number of >>>> skbs that could not be forwarded using dev_forward_skb(). >>>> >>>> The xmit() function accessing the peer ptr is protected by an RCU read >>>> critical section. The rcu_read_lock() is functionally redundant as since >>>> v5.0 all softirqs are implicitly RCU read critical sections; but it is >>>> useful for human readers. >>>> >>>> If another CPU is concurrently in nsim_destroy(), then it will first set >>>> the peer ptr to NULL. This does not affect any existing readers that >>>> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >>>> a synchronize_rcu() before the netdev is actually unregistered and >>>> freed. This ensures that any readers i.e. xmit() that got a non-NULL >>>> peer will complete before the netdev is freed. >>>> >>>> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >>>> will dereference NULL, making it safe. >>>> >>>> The codepath to nsim_destroy() and nsim_create() takes both the newly >>>> added nsim_dev_list_lock and rtnl_lock. This makes it safe with >>>> concurrent calls to linking two netdevsims together. >>>> >>>> Signed-off-by: David Wei <dw@davidwei.uk> >>>> --- >>>> drivers/net/netdevsim/netdev.c | 28 +++++++++++++++++++++++----- >>>> drivers/net/netdevsim/netdevsim.h | 1 + >>>> 2 files changed, 24 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >>>> index 9063f4f2971b..13d3e1536451 100644 >>>> --- a/drivers/net/netdevsim/netdev.c >>>> +++ b/drivers/net/netdevsim/netdev.c >>>> @@ -29,19 +29,37 @@ >>>> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) >>>> { >>>> struct netdevsim *ns = netdev_priv(dev); >>>> + unsigned int len = skb->len; >>>> + struct netdevsim *peer_ns; >>>> + int ret = NETDEV_TX_OK; >>>> >>>> if (!nsim_ipsec_tx(ns, skb)) >>>> goto out; >>>> >>>> + rcu_read_lock(); >>>> + peer_ns = rcu_dereference(ns->peer); >>>> + if (!peer_ns) >>>> + goto out_stats; >>> Change ret to NET_XMIT_DROP to correctly count packets as dropped >> >> Linking and forwarding between two netdevsims is optional, so I don't >> want to always return NET_XMIT_DROP. >> OK - now I see that previously we calculated all packets as TX_OK and > transmitted, so this is OK. > >>> >>>> + >>>> + skb_tx_timestamp(skb); >>>> + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) >>>> + ret = NET_XMIT_DROP; >>>> + >>>> +out_stats: >>>> + rcu_read_unlock(); >>>> u64_stats_update_begin(&ns->syncp); >>>> - ns->tx_packets++; >>>> - ns->tx_bytes += skb->len; >>>> + if (ret == NET_XMIT_DROP) { >>>> + ns->tx_dropped++; >>> add dev_kfree_skb(skb); >> >> dev_forward_skb() frees the skb if dropped already. > But the dev_forward_skb() won't be called if there is no peer_ns. In > this case we still need to call dev_kfree_skb(). Oh, I see now, thank you. > >> >>> >>> Thanks, >>> Maciek >>> >>>> + } else { >>>> + ns->tx_packets++; >>>> + ns->tx_bytes += len; >>>> + } >>>> u64_stats_update_end(&ns->syncp); >>>> + return ret; >>>> >>>> out: >>>> dev_kfree_skb(skb); >>>> - >>>> - return NETDEV_TX_OK; >>>> + return ret; >>>> } >>>> >>>> static void nsim_set_rx_mode(struct net_device *dev) >>>> @@ -70,6 +88,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) >>>> start = u64_stats_fetch_begin(&ns->syncp); >>>> stats->tx_bytes = ns->tx_bytes; >>>> stats->tx_packets = ns->tx_packets; >>>> + stats->tx_dropped = ns->tx_dropped; >>>> } while (u64_stats_fetch_retry(&ns->syncp, start)); >>>> } >>>> >>>> @@ -302,7 +321,6 @@ static void nsim_setup(struct net_device *dev) >>>> eth_hw_addr_random(dev); >>>> >>>> dev->tx_queue_len = 0; >>>> - dev->flags |= IFF_NOARP; >>>> dev->flags &= ~IFF_MULTICAST; >>>> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | >>>> IFF_NO_QUEUE; >>>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >>>> index c8b45b0d955e..553c4b9b4f63 100644 >>>> --- a/drivers/net/netdevsim/netdevsim.h >>>> +++ b/drivers/net/netdevsim/netdevsim.h >>>> @@ -98,6 +98,7 @@ struct netdevsim { >>>> >>>> u64 tx_packets; >>>> u64 tx_bytes; >>>> + u64 tx_dropped; >>>> struct u64_stats_sync syncp; >>>> >>>> struct nsim_bus_dev *nsim_bus_dev;
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c index 9063f4f2971b..13d3e1536451 100644 --- a/drivers/net/netdevsim/netdev.c +++ b/drivers/net/netdevsim/netdev.c @@ -29,19 +29,37 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct netdevsim *ns = netdev_priv(dev); + unsigned int len = skb->len; + struct netdevsim *peer_ns; + int ret = NETDEV_TX_OK; if (!nsim_ipsec_tx(ns, skb)) goto out; + rcu_read_lock(); + peer_ns = rcu_dereference(ns->peer); + if (!peer_ns) + goto out_stats; + + skb_tx_timestamp(skb); + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) + ret = NET_XMIT_DROP; + +out_stats: + rcu_read_unlock(); u64_stats_update_begin(&ns->syncp); - ns->tx_packets++; - ns->tx_bytes += skb->len; + if (ret == NET_XMIT_DROP) { + ns->tx_dropped++; + } else { + ns->tx_packets++; + ns->tx_bytes += len; + } u64_stats_update_end(&ns->syncp); + return ret; out: dev_kfree_skb(skb); - - return NETDEV_TX_OK; + return ret; } static void nsim_set_rx_mode(struct net_device *dev) @@ -70,6 +88,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) start = u64_stats_fetch_begin(&ns->syncp); stats->tx_bytes = ns->tx_bytes; stats->tx_packets = ns->tx_packets; + stats->tx_dropped = ns->tx_dropped; } while (u64_stats_fetch_retry(&ns->syncp, start)); } @@ -302,7 +321,6 @@ static void nsim_setup(struct net_device *dev) eth_hw_addr_random(dev); dev->tx_queue_len = 0; - dev->flags |= IFF_NOARP; dev->flags &= ~IFF_MULTICAST; dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_NO_QUEUE; diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h index c8b45b0d955e..553c4b9b4f63 100644 --- a/drivers/net/netdevsim/netdevsim.h +++ b/drivers/net/netdevsim/netdevsim.h @@ -98,6 +98,7 @@ struct netdevsim { u64 tx_packets; u64 tx_bytes; + u64 tx_dropped; struct u64_stats_sync syncp; struct nsim_bus_dev *nsim_bus_dev;
Forward skbs sent from one netdevsim port to its connected netdevsim port using dev_forward_skb, in a spirit similar to veth. Add a tx_dropped variable to struct netdevsim, tracking the number of skbs that could not be forwarded using dev_forward_skb(). The xmit() function accessing the peer ptr is protected by an RCU read critical section. The rcu_read_lock() is functionally redundant as since v5.0 all softirqs are implicitly RCU read critical sections; but it is useful for human readers. If another CPU is concurrently in nsim_destroy(), then it will first set the peer ptr to NULL. This does not affect any existing readers that dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is a synchronize_rcu() before the netdev is actually unregistered and freed. This ensures that any readers i.e. xmit() that got a non-NULL peer will complete before the netdev is freed. Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() will dereference NULL, making it safe. The codepath to nsim_destroy() and nsim_create() takes both the newly added nsim_dev_list_lock and rtnl_lock. This makes it safe with concurrent calls to linking two netdevsims together. Signed-off-by: David Wei <dw@davidwei.uk> --- drivers/net/netdevsim/netdev.c | 28 +++++++++++++++++++++++----- drivers/net/netdevsim/netdevsim.h | 1 + 2 files changed, 24 insertions(+), 5 deletions(-)