Message ID | 20240411203435.228559-1-nnac123@linux.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net-next] ibmvnic: Return error code on TX scrq flush fail |
On Thu, Apr 11, 2024 at 03:34:35PM -0500, Nick Child wrote:
> In ibmvnic_xmit() if ibmvnic_tx_scrq_flush() returns H_CLOSED then
> it will inform upper level networking functions to disable tx
> queues. H_CLOSED signals that the connection with the vnic server is
> down and a transport event is expected to recover the device.
>
> Previously, ibmvnic_tx_scrq_flush() was hard-coded to return success.
> Therefore, the queues would remain active until ibmvnic_cleanup() is
> called within do_reset().
>
> The problem is that do_reset() depends on the RTNL lock. If several
> ibmvnic devices are resetting then there can be a long wait time until
> the last device can grab the lock. During this time the tx/rx queues
> still appear active to upper level functions.
>
> FYI, we do make a call to netif_carrier_off() outside the RTNL lock but
> its calls to dev_deactivate() are also dependent on the RTNL lock.
>
> As a result, large amounts of retransmissions were observed in a short
> period of time, eventually leading to ETIMEOUT. This was specifically
> seen with HNV devices, likely because of even more RTNL dependencies.
>
> Therefore, ensure the return code of ibmvnic_tx_scrq_flush() is
> propagated to the xmit function to allow for an earlier (and lock-less)
> response to a transport event.
>
> Signed-off-by: Nick Child <nnac123@linux.ibm.com>
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 30c47b8470ad..f5177f370354 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2371,7 +2371,7 @@ static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
>  		ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
>  	else
>  		ind_bufp->index = 0;
> -	return 0;
> +	return rc;
>  }
>
>  static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)

Hi Nick,

I notice that in some, but not all, cases the return value of
ibmvnic_tx_scrq_flush() is not checked. Should that also be addressed?
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 30c47b8470ad..f5177f370354 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2371,7 +2371,7 @@ static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
 		ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
 	else
 		ind_bufp->index = 0;
-	return 0;
+	return rc;
 }

 static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
In ibmvnic_xmit() if ibmvnic_tx_scrq_flush() returns H_CLOSED then
it will inform upper level networking functions to disable tx
queues. H_CLOSED signals that the connection with the vnic server is
down and a transport event is expected to recover the device.

Previously, ibmvnic_tx_scrq_flush() was hard-coded to return success.
Therefore, the queues would remain active until ibmvnic_cleanup() is
called within do_reset().

The problem is that do_reset() depends on the RTNL lock. If several
ibmvnic devices are resetting then there can be a long wait time until
the last device can grab the lock. During this time the tx/rx queues
still appear active to upper level functions.

FYI, we do make a call to netif_carrier_off() outside the RTNL lock but
its calls to dev_deactivate() are also dependent on the RTNL lock.

As a result, large amounts of retransmissions were observed in a short
period of time, eventually leading to ETIMEOUT. This was specifically
seen with HNV devices, likely because of even more RTNL dependencies.

Therefore, ensure the return code of ibmvnic_tx_scrq_flush() is
propagated to the xmit function to allow for an earlier (and lock-less)
response to a transport event.

Signed-off-by: Nick Child <nnac123@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
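
The effect described in the commit message can be illustrated with a small userspace model. This is only a sketch and not driver code: every `mock_*` name, the `MOCK_H_*` values and the `propagate_rc` switch are hypothetical stand-ins invented for the illustration. It assumes the xmit path stops the queue as soon as it sees a closed-connection code, and compares how many transmit attempts happen before the reset runs when the flush hides its return code versus when it passes it up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for hypervisor return codes. The real H_* values
 * come from the powerpc hypervisor call headers and are not reproduced here. */
enum mock_hrc { MOCK_H_SUCCESS = 0, MOCK_H_CLOSED = 2 };

struct mock_queue {
	bool link_up;   /* false models a closed connection to the vnic server */
	bool stopped;   /* set once xmit has asked the stack to stop the queue */
	int pending;    /* descriptors batched but not yet sent */
	int attempts;   /* number of times xmit was entered */
};

/* Models the flush: try to send the batch, then report the result.
 * With propagate_rc == false it behaves like the old "return 0". */
static int mock_flush(struct mock_queue *q, bool propagate_rc)
{
	int rc;

	assert(q != NULL);
	rc = q->link_up ? MOCK_H_SUCCESS : MOCK_H_CLOSED;
	q->pending = 0; /* the batch is emptied whether or not the send worked */
	return propagate_rc ? rc : MOCK_H_SUCCESS;
}

/* Models xmit: queue one descriptor, flush, and stop the queue on "closed". */
static int mock_xmit(struct mock_queue *q, bool propagate_rc)
{
	int rc;

	q->attempts++;
	q->pending++;
	rc = mock_flush(q, propagate_rc);
	if (rc == MOCK_H_CLOSED)
		q->stopped = true; /* lock-less reaction, no reset needed yet */
	return rc;
}

/* Models the stack retrying while the reset is still waiting for its lock:
 * up to `tries` transmit attempts are made unless the queue gets stopped. */
static int mock_attempts_before_reset(bool propagate_rc, int tries)
{
	struct mock_queue q = { .link_up = false };

	for (int i = 0; i < tries && !q.stopped; i++)
		mock_xmit(&q, propagate_rc);
	return q.attempts;
}
```

In this model, `mock_attempts_before_reset(true, 100)` stops after the first attempt, while `mock_attempts_before_reset(false, 100)` keeps transmitting for all 100 tries. That mirrors the retransmission pattern the commit message attributes to the hard-coded success return.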