diff mbox series

iwlwifi: tx-fail reported with zero retry count.

Message ID 6b920700-e9a2-bbeb-98ce-7a2d09e76975@candelatech.com (mailing list archive)
State Not Applicable
Delegated to: Johannes Berg
Headers show
Series iwlwifi: tx-fail reported with zero retry count. | expand

Commit Message

Ben Greear June 6, 2023, 1:36 p.m. UTC
It seems that when tx fails (due to TX_STATUS_FAIL_LONG_LIMIT in this case), the
retry-count is sometimes reported as zero.  I would expect that it actually must have retried
many times (15 it seems?).  At least for fail-long-limit failures, is there a known constant amount of
times the firmware retransmits the frame before giving up?  If so, I can just hack that
value into the retry counter in this case.

Debug patch:

[greearb@ben-dt5 iwlwifi]$ git diff


 From dmesg:

Jun 06 06:29:40 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
Jun 06 06:29:43 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
Jun 06 06:29:48 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
Jun 06 06:29:49 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
Jun 06 06:29:52 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
Jun 06 06:29:57 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 0
Jun 06 06:29:57 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 0
Jun 06 06:29:58 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 0
Jun 06 06:30:01 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
Jun 06 06:30:02 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
Jun 06 06:30:03 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
....


Thanks,
Ben

Comments

Ben Greear June 6, 2023, 1:52 p.m. UTC | #1
On 6/6/23 6:36 AM, Ben Greear wrote:
> It seems that when tx fails (due to TX_STATUS_FAIL_LONG_LIMIT in this case), the
> retry-count is sometimes reported as zero.  I would expect that it actually must have retried
> many times (15 it seems?).  At least for fail-long-limit failures, is there a known constant amount of
> times the firmware retransmits the frame before giving up?  If so, I can just hack that
> value into the retry counter in this case.
> 
> Debug patch:
> 
> [greearb@ben-dt5 iwlwifi]$ git diff
> diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
> index 1cddc65dd51e..6544dabb092b 100644
> --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
> +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
> @@ -1929,8 +1929,14 @@ static void iwl_mvm_rx_tx_cmd_single(struct iwl_mvm *mvm,
>                  mvm->ethtool_stats.tx_mpdu_attempts += info->status.rates[0].count;
>                  mvm->ethtool_stats.tx_mpdu_retry += tx_resp->failure_frame;
>                  if (cb.flags & IWL_TX_CB_TXO_USED) {
> +                       u32 idx = status & TX_STATUS_MSK;
> +
>                          mvm->ethtool_stats.txo_tx_mpdu_attempts += info->status.rates[0].count;
>                          mvm->ethtool_stats.txo_tx_mpdu_retry += tx_resp->failure_frame;
> +                       if (idx != TX_STATUS_SUCCESS) {
> +                               pr_info("txo tx status failed: %d  tx_resp->failure_frame: %d\n",
> +                                       idx, tx_resp->failure_frame);
> +                       }
>                  }
> 
>                  iwl_mvm_hwrate_to_tx_status(mvm, mvm->fw,
> 
> 
>  From dmesg:
> 
> Jun 06 06:29:40 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
> Jun 06 06:29:43 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
> Jun 06 06:29:48 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
> Jun 06 06:29:49 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
> Jun 06 06:29:52 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
> Jun 06 06:29:57 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 0
> Jun 06 06:29:57 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 0
> Jun 06 06:29:58 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 0
> Jun 06 06:30:01 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
> Jun 06 06:30:02 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15
> Jun 06 06:30:03 ct523c-0b29 kernel: txo tx status failed: 131  tx_resp->failure_frame: 15

I think it has some other weirdness too.  I added printout to show any non-zero failure_frame:

+                       if (idx != TX_STATUS_SUCCESS || tx_resp->failure_frame) {
+                               pr_info("txo tx status: %d  tx_resp->failure_frame: %d\n",
+                                       idx, tx_resp->failure_frame);
+                       }


For a while, it showed the status 131 and failure_frame of 0, then it started reporting lots
of smaller numbers of retries (more like what I would expect to see).
And then it seemed to stick in a case where it was reporting
lots of successes with 15 retries:

Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15
Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1  tx_resp->failure_frame: 15


I think it is quite unlikely that it reliably succeeded on the very last retry
so many times in a row.  Maybe the FW has some issues in reporting the failure_frame
count properly?

Thanks,
Ben
diff mbox series

Patch

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
index 1cddc65dd51e..6544dabb092b 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
@@ -1929,8 +1929,14 @@  static void iwl_mvm_rx_tx_cmd_single(struct iwl_mvm *mvm,
                 mvm->ethtool_stats.tx_mpdu_attempts += info->status.rates[0].count;
                 mvm->ethtool_stats.tx_mpdu_retry += tx_resp->failure_frame;
                 if (cb.flags & IWL_TX_CB_TXO_USED) {
+                       u32 idx = status & TX_STATUS_MSK;
+
                         mvm->ethtool_stats.txo_tx_mpdu_attempts += info->status.rates[0].count;
                         mvm->ethtool_stats.txo_tx_mpdu_retry += tx_resp->failure_frame;
+                       if (idx != TX_STATUS_SUCCESS) {
+                               pr_info("txo tx status failed: %d  tx_resp->failure_frame: %d\n",
+                                       idx, tx_resp->failure_frame);
+                       }
                 }

                 iwl_mvm_hwrate_to_tx_status(mvm, mvm->fw,