Message ID | 20231228014633.3256862-4-dw@davidwei.uk (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | netdevsim: link and forward skbs between ports | expand |
Thu, Dec 28, 2023 at 02:46:31AM CET, dw@davidwei.uk wrote: >Forward skbs sent from one netdevsim port to its connected netdevsim >port using dev_forward_skb, in a spirit similar to veth. > >Add a tx_dropped variable to struct netdevsim, tracking the number of >skbs that could not be forwarded using dev_forward_skb(). > >The xmit() function accessing the peer ptr is protected by an RCU read >critical section. The rcu_read_lock() is functionally redundant as since >v5.0 all softirqs are implicitly RCU read critical sections; but it is >useful for human readers. > >If another CPU is concurrently in nsim_destroy(), then it will first set >the peer ptr to NULL. This does not affect any existing readers that >dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >a synchronize_rcu() before the netdev is actually unregistered and >freed. This ensures that any readers i.e. xmit() that got a non-NULL >peer will complete before the netdev is freed. > >Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >will dereference NULL, making it safe. > >The codepath to nsim_destroy() and nsim_create() takes both the newly >added nsim_dev_list_lock and rtnl_lock. This makes it safe with I don't see the rtnl_lock take in those functions. Otherwise, this patch looks fine to me. >concurrent calls to linking two netdevsims together. > >Signed-off-by: David Wei <dw@davidwei.uk> >--- > drivers/net/netdevsim/netdev.c | 21 ++++++++++++++++++--- > drivers/net/netdevsim/netdevsim.h | 1 + > 2 files changed, 19 insertions(+), 3 deletions(-) > >diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >index 434322f6a565..0009d0f1243f 100644 >--- a/drivers/net/netdevsim/netdev.c >+++ b/drivers/net/netdevsim/netdev.c >@@ -29,19 +29,34 @@ > static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) > { > struct netdevsim *ns = netdev_priv(dev); >+ struct netdevsim *peer_ns; >+ int ret = NETDEV_TX_OK; > > if (!nsim_ipsec_tx(ns, skb)) > goto out; > >+ rcu_read_lock(); >+ peer_ns = rcu_dereference(ns->peer); >+ if (!peer_ns) >+ goto out_stats; >+ >+ skb_tx_timestamp(skb); >+ if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) >+ ret = NET_XMIT_DROP; >+ >+out_stats: >+ rcu_read_unlock(); > u64_stats_update_begin(&ns->syncp); > ns->tx_packets++; > ns->tx_bytes += skb->len; >+ if (ret == NET_XMIT_DROP) >+ ns->tx_dropped++; > u64_stats_update_end(&ns->syncp); >+ return ret; > > out: > dev_kfree_skb(skb); >- >- return NETDEV_TX_OK; >+ return ret; > } > > static void nsim_set_rx_mode(struct net_device *dev) >@@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) > start = u64_stats_fetch_begin(&ns->syncp); > stats->tx_bytes = ns->tx_bytes; > stats->tx_packets = ns->tx_packets; >+ stats->tx_dropped = ns->tx_dropped; > } while (u64_stats_fetch_retry(&ns->syncp, start)); > } > >@@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev) > eth_hw_addr_random(dev); > > dev->tx_queue_len = 0; >- dev->flags |= IFF_NOARP; > dev->flags &= ~IFF_MULTICAST; > dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | > IFF_NO_QUEUE; >diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >index 24fc3fbda791..083b1ee7a1a2 100644 >--- a/drivers/net/netdevsim/netdevsim.h >+++ b/drivers/net/netdevsim/netdevsim.h >@@ -98,6 +98,7 @@ struct netdevsim { > > u64 tx_packets; > u64 tx_bytes; >+ u64 tx_dropped; > struct u64_stats_sync syncp; > > struct nsim_bus_dev *nsim_bus_dev; >-- >2.39.3 >
On Thu, Dec 28, 2023 at 2:46 AM David Wei <dw@davidwei.uk> wrote: > > Forward skbs sent from one netdevsim port to its connected netdevsim > port using dev_forward_skb, in a spirit similar to veth. > > Add a tx_dropped variable to struct netdevsim, tracking the number of > skbs that could not be forwarded using dev_forward_skb(). > > The xmit() function accessing the peer ptr is protected by an RCU read > critical section. The rcu_read_lock() is functionally redundant as since > v5.0 all softirqs are implicitly RCU read critical sections; but it is > useful for human readers. > > If another CPU is concurrently in nsim_destroy(), then it will first set > the peer ptr to NULL. This does not affect any existing readers that > dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is > a synchronize_rcu() before the netdev is actually unregistered and > freed. This ensures that any readers i.e. xmit() that got a non-NULL > peer will complete before the netdev is freed. > > Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() > will dereference NULL, making it safe. > > The codepath to nsim_destroy() and nsim_create() takes both the newly > added nsim_dev_list_lock and rtnl_lock. This makes it safe with > concurrent calls to linking two netdevsims together. > > Signed-off-by: David Wei <dw@davidwei.uk> > --- > drivers/net/netdevsim/netdev.c | 21 ++++++++++++++++++--- > drivers/net/netdevsim/netdevsim.h | 1 + > 2 files changed, 19 insertions(+), 3 deletions(-) > > @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev) > eth_hw_addr_random(dev); > > dev->tx_queue_len = 0; > - dev->flags |= IFF_NOARP; This part seems to be unrelated to this patch ? > dev->flags &= ~IFF_MULTICAST; > dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | > IFF_NO_QUEUE;
On 2024-01-02 03:20, Eric Dumazet wrote: > On Thu, Dec 28, 2023 at 2:46 AM David Wei <dw@davidwei.uk> wrote: >> >> Forward skbs sent from one netdevsim port to its connected netdevsim >> port using dev_forward_skb, in a spirit similar to veth. >> >> Add a tx_dropped variable to struct netdevsim, tracking the number of >> skbs that could not be forwarded using dev_forward_skb(). >> >> The xmit() function accessing the peer ptr is protected by an RCU read >> critical section. The rcu_read_lock() is functionally redundant as since >> v5.0 all softirqs are implicitly RCU read critical sections; but it is >> useful for human readers. >> >> If another CPU is concurrently in nsim_destroy(), then it will first set >> the peer ptr to NULL. This does not affect any existing readers that >> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >> a synchronize_rcu() before the netdev is actually unregistered and >> freed. This ensures that any readers i.e. xmit() that got a non-NULL >> peer will complete before the netdev is freed. >> >> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >> will dereference NULL, making it safe. >> >> The codepath to nsim_destroy() and nsim_create() takes both the newly >> added nsim_dev_list_lock and rtnl_lock. This makes it safe with >> concurrent calls to linking two netdevsims together. >> >> Signed-off-by: David Wei <dw@davidwei.uk> >> --- >> drivers/net/netdevsim/netdev.c | 21 ++++++++++++++++++--- >> drivers/net/netdevsim/netdevsim.h | 1 + >> 2 files changed, 19 insertions(+), 3 deletions(-) >> > > >> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev) >> eth_hw_addr_random(dev); >> >> dev->tx_queue_len = 0; >> - dev->flags |= IFF_NOARP; > > This part seems to be unrelated to this patch ? Hi Eric, I found that this change is needed for skb forwarding to work. Would you prefer me splitting this change into its own patch? > >> dev->flags &= ~IFF_MULTICAST; >> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | >> IFF_NO_QUEUE;
On 2024-01-02 03:13, Jiri Pirko wrote: > Thu, Dec 28, 2023 at 02:46:31AM CET, dw@davidwei.uk wrote: >> Forward skbs sent from one netdevsim port to its connected netdevsim >> port using dev_forward_skb, in a spirit similar to veth. >> >> Add a tx_dropped variable to struct netdevsim, tracking the number of >> skbs that could not be forwarded using dev_forward_skb(). >> >> The xmit() function accessing the peer ptr is protected by an RCU read >> critical section. The rcu_read_lock() is functionally redundant as since >> v5.0 all softirqs are implicitly RCU read critical sections; but it is >> useful for human readers. >> >> If another CPU is concurrently in nsim_destroy(), then it will first set >> the peer ptr to NULL. This does not affect any existing readers that >> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >> a synchronize_rcu() before the netdev is actually unregistered and >> freed. This ensures that any readers i.e. xmit() that got a non-NULL >> peer will complete before the netdev is freed. >> >> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >> will dereference NULL, making it safe. >> >> The codepath to nsim_destroy() and nsim_create() takes both the newly >> added nsim_dev_list_lock and rtnl_lock. This makes it safe with > > I don't see the rtnl_lock take in those functions. > > > Otherwise, this patch looks fine to me. For nsim_create(), rtnl_lock is taken in nsim_init_netdevsim(). For nsim_destroy(), rtnl_lock is taken directly in the function. What I mean here is, in the netdevsim device modification paths locks are taken in this order: devl_lock -> rtnl_lock nsim_dev_list_lock is taken outside (not nested) of these. In nsim_dev_peer_write() where two ports are linked, locks are taken in this order: nsim_dev_list_lock -> devl_lock -> rtnl_lock This will not cause deadlocks and ensures that two ports being linked are both valid. > > >> concurrent calls to linking two netdevsims together. >> >> Signed-off-by: David Wei <dw@davidwei.uk> >> --- >> drivers/net/netdevsim/netdev.c | 21 ++++++++++++++++++--- >> drivers/net/netdevsim/netdevsim.h | 1 + >> 2 files changed, 19 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >> index 434322f6a565..0009d0f1243f 100644 >> --- a/drivers/net/netdevsim/netdev.c +++ b/drivers/net/netdevsim/netdev.c >> @@ -29,19 +29,34 @@ >> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) >> { >> struct netdevsim *ns = netdev_priv(dev); >> + struct netdevsim *peer_ns; >> + int ret = NETDEV_TX_OK; >> >> if (!nsim_ipsec_tx(ns, skb)) >> goto out; >> >> + rcu_read_lock(); >> + peer_ns = rcu_dereference(ns->peer); >> + if (!peer_ns) >> + goto out_stats; >> + >> + skb_tx_timestamp(skb); >> + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) >> + ret = NET_XMIT_DROP; >> + >> +out_stats: >> + rcu_read_unlock(); >> u64_stats_update_begin(&ns->syncp); >> ns->tx_packets++; >> ns->tx_bytes += skb->len; >> + if (ret == NET_XMIT_DROP) >> + ns->tx_dropped++; >> u64_stats_update_end(&ns->syncp); >> + return ret; >> >> out: >> dev_kfree_skb(skb); >> - >> - return NETDEV_TX_OK; >> + return ret; >> } >> >> static void nsim_set_rx_mode(struct net_device *dev) >> @@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) >> start = u64_stats_fetch_begin(&ns->syncp); >> stats->tx_bytes = ns->tx_bytes; >> stats->tx_packets = ns->tx_packets; >> + stats->tx_dropped = ns->tx_dropped; >> } while (u64_stats_fetch_retry(&ns->syncp, start)); >> } >> >> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev) >> eth_hw_addr_random(dev); >> >> dev->tx_queue_len = 0; >> - dev->flags |= IFF_NOARP; >> dev->flags &= ~IFF_MULTICAST; >> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | >> IFF_NO_QUEUE; >> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >> index 24fc3fbda791..083b1ee7a1a2 100644 >> --- a/drivers/net/netdevsim/netdevsim.h >> +++ b/drivers/net/netdevsim/netdevsim.h >> @@ -98,6 +98,7 @@ struct netdevsim { >> >> u64 tx_packets; >> u64 tx_bytes; >> + u64 tx_dropped; >> struct u64_stats_sync syncp; >> >> struct nsim_bus_dev *nsim_bus_dev; >> -- >> 2.39.3 >>
Wed, Jan 03, 2024 at 11:36:36PM CET, dw@davidwei.uk wrote: >On 2024-01-02 03:13, Jiri Pirko wrote: >> Thu, Dec 28, 2023 at 02:46:31AM CET, dw@davidwei.uk wrote: >>> Forward skbs sent from one netdevsim port to its connected netdevsim >>> port using dev_forward_skb, in a spirit similar to veth. >>> >>> Add a tx_dropped variable to struct netdevsim, tracking the number of >>> skbs that could not be forwarded using dev_forward_skb(). >>> >>> The xmit() function accessing the peer ptr is protected by an RCU read >>> critical section. The rcu_read_lock() is functionally redundant as since >>> v5.0 all softirqs are implicitly RCU read critical sections; but it is >>> useful for human readers. >>> >>> If another CPU is concurrently in nsim_destroy(), then it will first set >>> the peer ptr to NULL. This does not affect any existing readers that >>> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >>> a synchronize_rcu() before the netdev is actually unregistered and >>> freed. This ensures that any readers i.e. xmit() that got a non-NULL >>> peer will complete before the netdev is freed. >>> >>> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >>> will dereference NULL, making it safe. >>> >>> The codepath to nsim_destroy() and nsim_create() takes both the newly >>> added nsim_dev_list_lock and rtnl_lock. This makes it safe with >> >> I don't see the rtnl_lock take in those functions. >> >> >> Otherwise, this patch looks fine to me. > >For nsim_create(), rtnl_lock is taken in nsim_init_netdevsim(). For >nsim_destroy(), rtnl_lock is taken directly in the function. > >What I mean here is, in the netdevsim device modification paths locks >are taken in this order: > >devl_lock -> rtnl_lock > >nsim_dev_list_lock is taken outside (not nested) of these. > >In nsim_dev_peer_write() where two ports are linked, locks are taken in >this order: > >nsim_dev_list_lock -> devl_lock -> rtnl_lock > >This will not cause deadlocks and ensures that two ports being linked >are both valid. Okay. Perhaps would be good to document this in a comment somewhere in the code? > >> >> >>> concurrent calls to linking two netdevsims together. >>> >>> Signed-off-by: David Wei <dw@davidwei.uk> >>> --- >>> drivers/net/netdevsim/netdev.c | 21 ++++++++++++++++++--- >>> drivers/net/netdevsim/netdevsim.h | 1 + >>> 2 files changed, 19 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >>> index 434322f6a565..0009d0f1243f 100644 >>> --- a/drivers/net/netdevsim/netdev.c > +++ b/drivers/net/netdevsim/netdev.c >>> @@ -29,19 +29,34 @@ >>> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) >>> { >>> struct netdevsim *ns = netdev_priv(dev); >>> + struct netdevsim *peer_ns; >>> + int ret = NETDEV_TX_OK; >>> >>> if (!nsim_ipsec_tx(ns, skb)) >>> goto out; >>> >>> + rcu_read_lock(); >>> + peer_ns = rcu_dereference(ns->peer); >>> + if (!peer_ns) >>> + goto out_stats; >>> + >>> + skb_tx_timestamp(skb); >>> + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) >>> + ret = NET_XMIT_DROP; >>> + >>> +out_stats: >>> + rcu_read_unlock(); >>> u64_stats_update_begin(&ns->syncp); >>> ns->tx_packets++; >>> ns->tx_bytes += skb->len; >>> + if (ret == NET_XMIT_DROP) >>> + ns->tx_dropped++; >>> u64_stats_update_end(&ns->syncp); >>> + return ret; >>> >>> out: >>> dev_kfree_skb(skb); >>> - >>> - return NETDEV_TX_OK; >>> + return ret; >>> } >>> >>> static void nsim_set_rx_mode(struct net_device *dev) >>> @@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) >>> start = u64_stats_fetch_begin(&ns->syncp); >>> stats->tx_bytes = ns->tx_bytes; >>> stats->tx_packets = ns->tx_packets; >>> + stats->tx_dropped = ns->tx_dropped; >>> } while (u64_stats_fetch_retry(&ns->syncp, start)); >>> } >>> >>> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev) >>> eth_hw_addr_random(dev); >>> >>> dev->tx_queue_len = 0; >>> - dev->flags |= IFF_NOARP; >>> dev->flags &= ~IFF_MULTICAST; >>> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | >>> IFF_NO_QUEUE; >>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >>> index 24fc3fbda791..083b1ee7a1a2 100644 >>> --- a/drivers/net/netdevsim/netdevsim.h >>> +++ b/drivers/net/netdevsim/netdevsim.h >>> @@ -98,6 +98,7 @@ struct netdevsim { >>> >>> u64 tx_packets; >>> u64 tx_bytes; >>> + u64 tx_dropped; >>> struct u64_stats_sync syncp; >>> >>> struct nsim_bus_dev *nsim_bus_dev; >>> -- >>> 2.39.3 >>>
On 2024-01-04 01:31, Jiri Pirko wrote: > Wed, Jan 03, 2024 at 11:36:36PM CET, dw@davidwei.uk wrote: >> On 2024-01-02 03:13, Jiri Pirko wrote: >>> Thu, Dec 28, 2023 at 02:46:31AM CET, dw@davidwei.uk wrote: >>>> Forward skbs sent from one netdevsim port to its connected netdevsim >>>> port using dev_forward_skb, in a spirit similar to veth. >>>> >>>> Add a tx_dropped variable to struct netdevsim, tracking the number of >>>> skbs that could not be forwarded using dev_forward_skb(). >>>> >>>> The xmit() function accessing the peer ptr is protected by an RCU read >>>> critical section. The rcu_read_lock() is functionally redundant as since >>>> v5.0 all softirqs are implicitly RCU read critical sections; but it is >>>> useful for human readers. >>>> >>>> If another CPU is concurrently in nsim_destroy(), then it will first set >>>> the peer ptr to NULL. This does not affect any existing readers that >>>> dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is >>>> a synchronize_rcu() before the netdev is actually unregistered and >>>> freed. This ensures that any readers i.e. xmit() that got a non-NULL >>>> peer will complete before the netdev is freed. >>>> >>>> Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() >>>> will dereference NULL, making it safe. >>>> >>>> The codepath to nsim_destroy() and nsim_create() takes both the newly >>>> added nsim_dev_list_lock and rtnl_lock. This makes it safe with >>> >>> I don't see the rtnl_lock take in those functions. >>> >>> >>> Otherwise, this patch looks fine to me. >> >> For nsim_create(), rtnl_lock is taken in nsim_init_netdevsim(). For >> nsim_destroy(), rtnl_lock is taken directly in the function. >> >> What I mean here is, in the netdevsim device modification paths locks >> are taken in this order: >> >> devl_lock -> rtnl_lock >> >> nsim_dev_list_lock is taken outside (not nested) of these. >> >> In nsim_dev_peer_write() where two ports are linked, locks are taken in >> this order: >> >> nsim_dev_list_lock -> devl_lock -> rtnl_lock >> >> This will not cause deadlocks and ensures that two ports being linked >> are both valid. > > Okay. Perhaps would be good to document this in a comment somewhere in > the code? Yep, I'll add this. > > >> >>> >>> >>>> concurrent calls to linking two netdevsims together. >>>> >>>> Signed-off-by: David Wei <dw@davidwei.uk> >>>> --- >>>> drivers/net/netdevsim/netdev.c | 21 ++++++++++++++++++--- >>>> drivers/net/netdevsim/netdevsim.h | 1 + >>>> 2 files changed, 19 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c >>>> index 434322f6a565..0009d0f1243f 100644 >>>> --- a/drivers/net/netdevsim/netdev.c >> +++ b/drivers/net/netdevsim/netdev.c >>>> @@ -29,19 +29,34 @@ >>>> static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) >>>> { >>>> struct netdevsim *ns = netdev_priv(dev); >>>> + struct netdevsim *peer_ns; >>>> + int ret = NETDEV_TX_OK; >>>> >>>> if (!nsim_ipsec_tx(ns, skb)) >>>> goto out; >>>> >>>> + rcu_read_lock(); >>>> + peer_ns = rcu_dereference(ns->peer); >>>> + if (!peer_ns) >>>> + goto out_stats; >>>> + >>>> + skb_tx_timestamp(skb); >>>> + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) >>>> + ret = NET_XMIT_DROP; >>>> + >>>> +out_stats: >>>> + rcu_read_unlock(); >>>> u64_stats_update_begin(&ns->syncp); >>>> ns->tx_packets++; >>>> ns->tx_bytes += skb->len; >>>> + if (ret == NET_XMIT_DROP) >>>> + ns->tx_dropped++; >>>> u64_stats_update_end(&ns->syncp); >>>> + return ret; >>>> >>>> out: >>>> dev_kfree_skb(skb); >>>> - >>>> - return NETDEV_TX_OK; >>>> + return ret; >>>> } >>>> >>>> static void nsim_set_rx_mode(struct net_device *dev) >>>> @@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) >>>> start = u64_stats_fetch_begin(&ns->syncp); >>>> stats->tx_bytes = ns->tx_bytes; >>>> stats->tx_packets = ns->tx_packets; >>>> + stats->tx_dropped = ns->tx_dropped; >>>> } while (u64_stats_fetch_retry(&ns->syncp, start)); >>>> } >>>> >>>> @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev) >>>> eth_hw_addr_random(dev); >>>> >>>> dev->tx_queue_len = 0; >>>> - dev->flags |= IFF_NOARP; >>>> dev->flags &= ~IFF_MULTICAST; >>>> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | >>>> IFF_NO_QUEUE; >>>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h >>>> index 24fc3fbda791..083b1ee7a1a2 100644 >>>> --- a/drivers/net/netdevsim/netdevsim.h >>>> +++ b/drivers/net/netdevsim/netdevsim.h >>>> @@ -98,6 +98,7 @@ struct netdevsim { >>>> >>>> u64 tx_packets; >>>> u64 tx_bytes; >>>> + u64 tx_dropped; >>>> struct u64_stats_sync syncp; >>>> >>>> struct nsim_bus_dev *nsim_bus_dev; >>>> -- >>>> 2.39.3 >>>>
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c index 434322f6a565..0009d0f1243f 100644 --- a/drivers/net/netdevsim/netdev.c +++ b/drivers/net/netdevsim/netdev.c @@ -29,19 +29,34 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct netdevsim *ns = netdev_priv(dev); + struct netdevsim *peer_ns; + int ret = NETDEV_TX_OK; if (!nsim_ipsec_tx(ns, skb)) goto out; + rcu_read_lock(); + peer_ns = rcu_dereference(ns->peer); + if (!peer_ns) + goto out_stats; + + skb_tx_timestamp(skb); + if (unlikely(dev_forward_skb(peer_ns->netdev, skb) == NET_RX_DROP)) + ret = NET_XMIT_DROP; + +out_stats: + rcu_read_unlock(); u64_stats_update_begin(&ns->syncp); ns->tx_packets++; ns->tx_bytes += skb->len; + if (ret == NET_XMIT_DROP) + ns->tx_dropped++; u64_stats_update_end(&ns->syncp); + return ret; out: dev_kfree_skb(skb); - - return NETDEV_TX_OK; + return ret; } static void nsim_set_rx_mode(struct net_device *dev) @@ -70,6 +85,7 @@ nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats) start = u64_stats_fetch_begin(&ns->syncp); stats->tx_bytes = ns->tx_bytes; stats->tx_packets = ns->tx_packets; + stats->tx_dropped = ns->tx_dropped; } while (u64_stats_fetch_retry(&ns->syncp, start)); } @@ -302,7 +318,6 @@ static void nsim_setup(struct net_device *dev) eth_hw_addr_random(dev); dev->tx_queue_len = 0; - dev->flags |= IFF_NOARP; dev->flags &= ~IFF_MULTICAST; dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_NO_QUEUE; diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h index 24fc3fbda791..083b1ee7a1a2 100644 --- a/drivers/net/netdevsim/netdevsim.h +++ b/drivers/net/netdevsim/netdevsim.h @@ -98,6 +98,7 @@ struct netdevsim { u64 tx_packets; u64 tx_bytes; + u64 tx_dropped; struct u64_stats_sync syncp; struct nsim_bus_dev *nsim_bus_dev;
Forward skbs sent from one netdevsim port to its connected netdevsim port using dev_forward_skb, in a spirit similar to veth. Add a tx_dropped variable to struct netdevsim, tracking the number of skbs that could not be forwarded using dev_forward_skb(). The xmit() function accessing the peer ptr is protected by an RCU read critical section. The rcu_read_lock() is functionally redundant as since v5.0 all softirqs are implicitly RCU read critical sections; but it is useful for human readers. If another CPU is concurrently in nsim_destroy(), then it will first set the peer ptr to NULL. This does not affect any existing readers that dereferenced a non-NULL peer. Then, in unregister_netdevice(), there is a synchronize_rcu() before the netdev is actually unregistered and freed. This ensures that any readers i.e. xmit() that got a non-NULL peer will complete before the netdev is freed. Any readers after the RCU_INIT_POINTER() but before synchronize_rcu() will dereference NULL, making it safe. The codepath to nsim_destroy() and nsim_create() takes both the newly added nsim_dev_list_lock and rtnl_lock. This makes it safe with concurrent calls to linking two netdevsims together. Signed-off-by: David Wei <dw@davidwei.uk> --- drivers/net/netdevsim/netdev.c | 21 ++++++++++++++++++--- drivers/net/netdevsim/netdevsim.h | 1 + 2 files changed, 19 insertions(+), 3 deletions(-)