Message ID | 20221011072408.23731-3-quic_wgong@quicinc.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Kalle Valo |
Series | wifi: ath11k: reduce the timeout value for hw scan |
Wen Gong <quic_wgong@quicinc.com> writes:

> For 11d scan, commit 9dcf6808b253 ("ath11k: add 11d scan offload support")
> increased the timeout from one second to max 10 seconds when 11d scan
> offload enabled and 6 GHz enabled, it is reasonable for the commit, it
> is because the first 11d scan request is sent to firmware before the
> first hw scan request after wlan load, then the hw scan started event
> will reported from firmware after the 11d scan finished, it needs about
> 6 seconds when 6 GHz enabled, so increased it from one second to 10
> seconds in the commit to avoid timed out for hw scan started. Then
> another commit 1f682dc9fb37 ("ath11k: reduce the wait time of 11d scan
> and hw scan while add interface") change the sequence of the first 11d
> scan and hw scan, then ath11k will receive the hw scan started event
> from firmware immediately for the first hw scan, thus ath11k does not
> need set the timeout value to max 10 seconds again, and this is to set
> the timeout value back from 10 seconds to 1 second.
>
> After the 1st hw scan finished, firmware will start 11d scan immediately,
> and firmware need use some seconds to finish 11d scan, if the 2nd hw
> scan is sent from ath11k to firmware before 11d scan finished, the 2nd
> hw scan will started after 11d scan finished, this will lead timeout to
> wait scan started in ath11k. Treat the timeout as a normal situation if
> 11d scan is running and skip report scan fail for this situation.
>
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3
>
> Signed-off-by: Wen Gong <quic_wgong@quicinc.com>

[...]

> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
>
>  	ret = ath11k_start_scan(ar, &arg);
>  	if (ret) {
> -		ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
> +		if (ret == -EBUSY)
> +			ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
> +				   "scan engine is busy 11d state %d\n", ar->state_11d);
> +		else
> +			ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
> +
>  		spin_lock_bh(&ar->data_lock);
>  		ar->scan.state = ATH11K_SCAN_IDLE;
>  		spin_unlock_bh(&ar->data_lock);

This feels like a hack to me, for example will these failed scans now
cause delays in connection establishment? IMHO it's crucial from user's
point of view that we don't delay that in any way.

I would rather fix the root cause, do we know what's causing this?
On 11/8/2022 6:20 PM, Kalle Valo wrote:
> Wen Gong <quic_wgong@quicinc.com> writes:
>
...
> [...]
>
>> @@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,
>>
>>  	ret = ath11k_start_scan(ar, &arg);
>>  	if (ret) {
>> -		ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>> +		if (ret == -EBUSY)
>> +			ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
>> +				   "scan engine is busy 11d state %d\n", ar->state_11d);
>> +		else
>> +			ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
>> +
>>  		spin_lock_bh(&ar->data_lock);
>>  		ar->scan.state = ATH11K_SCAN_IDLE;
>>  		spin_unlock_bh(&ar->data_lock);
> This feels like a hack to me, for example will these failed scans now
> cause delays in connection establishment? IMHO it's crucial from user's
> point of view that we don't delay that in any way.

It will not delay connection.

After wlan load, the 1st hw scan arrives at ath11k first, and the 11d
scan is sent to firmware only after the 1st hw scan. So the hw scan used
for connection runs before the 11d scan, and connection can start
immediately after the 1st hw scan finishes. There is no delay for
connection.

> I would rather fix the root cause, do we know what's causing this?

In firmware, hw scan and 11d scan share the same queue, so they cannot
run in parallel.

When 6 GHz is enabled, the 1st hw scan takes about 7 s, and the 11d scan
then takes the next 7 s. After those 14 s, each hw scan that arrives at
ath11k runs immediately. If the 2nd hw scan arrives before the 11d scan
finishes, for example 7.1 seconds after the 1st hw scan, the 11d scan is
still running in firmware, so the 2nd hw scan will not receive the scan
started event until the 11d scan finishes. Meanwhile, the 2nd hw scan is
holding ar->conf_mutex in ath11k_mac_op_hw_scan(). Holding a lock for
several seconds is not good because ar->conf_mutex is widely used. So
reduce the 10 s to 1 s to avoid holding ar->conf_mutex for a long time.
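The timing in the reply above can be checked with a small userspace model. This is only a sketch under stated assumptions: the two 7-second durations are the approximate figures quoted in the reply, the firmware queue is treated as strictly serial, "7.1 seconds after the 1st hw scan" is read as 7.1 s after that scan was requested, and the names `scan_start_delay_ms` and `start_wait_times_out` are hypothetical, not ath11k or firmware functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Approximate figures from the reply above (6 GHz enabled). */
#define FIRST_HW_SCAN_MS 7000 /* 1st hw scan: about 7 s */
#define SCAN_11D_MS      7000 /* 11d scan that follows: about 7 s */

/*
 * Hypothetical model: firmware runs scans strictly one at a time.
 * arrival_ms is when the 2nd hw scan is requested, measured from the
 * start of the 1st hw scan; it must not precede the end of that scan.
 * Returns how long the 2nd hw scan waits before firmware can report
 * "scan started".
 */
long scan_start_delay_ms(long arrival_ms)
{
	long queue_idle_at = FIRST_HW_SCAN_MS + SCAN_11D_MS;

	assert(arrival_ms >= FIRST_HW_SCAN_MS);
	if (arrival_ms >= queue_idle_at)
		return 0;
	return queue_idle_at - arrival_ms;
}

/* True if the driver's wait for "scan started" would expire first. */
bool start_wait_times_out(long arrival_ms, long timeout_ms)
{
	return scan_start_delay_ms(arrival_ms) > timeout_ms;
}
```

Under these assumptions a request at 7.1 s waits about 6.9 s. A 10-second timeout would not expire, but the caller would sit on `ar->conf_mutex` for that whole time, while a 1-second timeout gives up early. A request after 14 s starts at once.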
On 11/18/2022 6:29 PM, Wen Gong wrote:
> On 11/8/2022 6:20 PM, Kalle Valo wrote:
> [...]

Hi Kalle,

Should I change commit log with above explanation and send v4?
Hi Kalle,

Should I change commit log with below explanation and send v4?

On 11/23/2022 11:41 AM, Wen Gong wrote:
> On 11/18/2022 6:29 PM, Wen Gong wrote:
> [...]
Hi Kalle,

Should I change commit log with below explanation and send v4?

On 12/16/2022 11:08 AM, Wen Gong wrote:
> Hi Kalle,
>
> Should I change commit log with below explanation and send v4?
>
> On 11/23/2022 11:41 AM, Wen Gong wrote:
> [...]
Wen Gong <quic_wgong@quicinc.com> writes:
> Should I change commit log with below explanation and send v4?
Please stop spamming the same question over and over, it's really
annoying. If I don't have time to look at something, spamming me won't
help, quite the opposite. It would be a lot better if you would help
with the other upstream related tasks we have, that way I might have
more time to look at your patches.
To answer your question I need to look at this patchset in detail and I
don't know when I'm able to do that. But at this moment I don't trust
this patchset is the right approach and I'm not willing to take it.
On 1/13/2023 8:14 PM, Kalle Valo wrote:
> Wen Gong <quic_wgong@quicinc.com> writes:
>
>> Should I change commit log with below explanation and send v4?
> Please stop spamming the same question over and over, it's really
> annoying. If I don't have time to look at something, spamming me won't
> help, quite the opposite. It would be a lot better if you would help
> with the other upstream related tasks we have, that way I might have
> more time to look at your patches.
>
> To answer your question I need to look at this patchset in detail and I
> don't know when I'm able to do that. But at this moment I don't trust
> this patchset is the right approach and I'm not willing to take it.

Yes. I will send v4 for only one patch, "[v3,1/2] wifi: ath11k: change
to set 11d state instead of start 11d scan while disconnect". Is that OK?
diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index b0c3cf258d12..666775a1e2a9 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -3560,7 +3560,6 @@ static int ath11k_start_scan(struct ath11k *ar,
 			     struct scan_req_params *arg)
 {
 	int ret;
-	unsigned long timeout = 1 * HZ;

 	lockdep_assert_held(&ar->conf_mutex);

@@ -3571,19 +3570,15 @@ static int ath11k_start_scan(struct ath11k *ar,
 	if (ret)
 		return ret;

-	if (test_bit(WMI_TLV_SERVICE_11D_OFFLOAD, ar->ab->wmi_ab.svc_map)) {
-		timeout = 5 * HZ;
-
-		if (ar->supports_6ghz)
-			timeout += 5 * HZ;
-	}
-
-	ret = wait_for_completion_timeout(&ar->scan.started, timeout);
+	ret = wait_for_completion_timeout(&ar->scan.started, 1 * HZ);
 	if (ret == 0) {
 		ret = ath11k_scan_stop(ar);
 		if (ret)
 			ath11k_warn(ar->ab, "failed to stop scan: %d\n", ret);

+		if (ar->state_11d == ATH11K_11D_RUNNING)
+			return -EBUSY;
+
 		return -ETIMEDOUT;
 	}

@@ -3682,7 +3677,12 @@ static int ath11k_mac_op_hw_scan(struct ieee80211_hw *hw,

 	ret = ath11k_start_scan(ar, &arg);
 	if (ret) {
-		ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
+		if (ret == -EBUSY)
+			ath11k_dbg(ar->ab, ATH11K_DBG_MAC,
+				   "scan engine is busy 11d state %d\n", ar->state_11d);
+		else
+			ath11k_warn(ar->ab, "failed to start hw scan: %d\n", ret);
+
 		spin_lock_bh(&ar->data_lock);
 		ar->scan.state = ATH11K_SCAN_IDLE;
 		spin_unlock_bh(&ar->data_lock);
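The control flow this diff introduces can be exercised outside the kernel with a small model. The sketch below is hypothetical userspace C, not driver code: the `model_*` enums and functions are stand-ins invented for illustration, and only the `-EBUSY` / `-ETIMEDOUT` split and the debug-versus-warn choice are taken from the diff. It covers only the timeout path; other error returns from the real `ath11k_start_scan()` are not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the driver's 11d state; only "running" matters here. */
enum model_11d_state { MODEL_11D_IDLE, MODEL_11D_RUNNING };

enum model_log_level { MODEL_LOG_NONE, MODEL_LOG_DEBUG, MODEL_LOG_WARN };

/*
 * Mirrors the tail of ath11k_start_scan() after the patch: if the
 * "scan started" event arrived in time the result is 0; otherwise a
 * running 11d scan turns the timeout into -EBUSY.
 */
int model_start_scan_result(bool started_in_time, enum model_11d_state state)
{
	if (started_in_time)
		return 0;
	if (state == MODEL_11D_RUNNING)
		return -EBUSY;
	return -ETIMEDOUT;
}

/* Mirrors the caller: -EBUSY is logged at debug level, the rest warn. */
enum model_log_level model_log_level_for(int ret)
{
	assert(ret <= 0);
	if (ret == 0)
		return MODEL_LOG_NONE;
	return ret == -EBUSY ? MODEL_LOG_DEBUG : MODEL_LOG_WARN;
}
```

In both failure cases the caller in the diff still resets `ar->scan.state` to `ATH11K_SCAN_IDLE`; the patch changes only the return code and the log level.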
For 11d scan, commit 9dcf6808b253 ("ath11k: add 11d scan offload support")
increased the hw scan start timeout from one second to a maximum of 10
seconds when 11d scan offload and 6 GHz are both enabled. That was
reasonable at the time: after wlan load, the first 11d scan request was
sent to firmware before the first hw scan request, so firmware reported
the hw scan started event only after the 11d scan finished, which takes
about 6 seconds when 6 GHz is enabled. The longer timeout avoided timing
out while waiting for hw scan started.

Commit 1f682dc9fb37 ("ath11k: reduce the wait time of 11d scan and hw
scan while add interface") then changed the order of the first 11d scan
and hw scan, so ath11k now receives the hw scan started event from
firmware immediately for the first hw scan. The 10 second maximum is no
longer needed, so set the timeout back to 1 second.

After the 1st hw scan finishes, firmware starts the 11d scan immediately
and needs several seconds to finish it. If ath11k sends the 2nd hw scan
to firmware before the 11d scan finishes, the 2nd hw scan starts only
after the 11d scan finishes, and ath11k times out waiting for scan
started. Treat this timeout as a normal situation if the 11d scan is
running, and skip reporting a scan failure in that case.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3

Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
---
 drivers/net/wireless/ath/ath11k/mac.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)
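The timeout history in this commit message can be written out as arithmetic. The sketch below is a userspace illustration, assuming whole seconds instead of jiffies; the function names are hypothetical, and the 6-second figure is the approximate 11d scan duration quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Timeout, in seconds, that the removed code computed (the kernel code
 * expresses the same values in jiffies as multiples of HZ).
 */
int old_scan_start_timeout_s(bool offload_11d, bool supports_6ghz)
{
	int timeout = 1;

	if (offload_11d) {
		timeout = 5;
		if (supports_6ghz)
			timeout += 5;
	}
	return timeout;
}

/* Timeout after the patch: always one second. */
int new_scan_start_timeout_s(void)
{
	return 1;
}

/* A wait succeeds only if the event arrives before the timeout expires. */
bool wait_succeeds(int event_delay_s, int timeout_s)
{
	assert(event_delay_s >= 0 && timeout_s > 0);
	return event_delay_s < timeout_s;
}
```

With the original ordering, a "scan started" event delayed by about 6 seconds fits inside the old 10-second maximum but not inside 1 second. Once the first hw scan is sent ahead of the 11d scan, the event arrives with effectively no delay and 1 second is enough for that first scan.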