[v2,1/2] ath10k: add refcount for ath10k_core_restart

Message ID 20191225120002.11163-2-wgong@codeaurora.org (mailing list archive)
State Changes Requested
Delegated to: Kalle Valo
Series: start recovery process when payload length overflow for sdio

Commit Message

Wen Gong Dec. 25, 2019, noon UTC
When more than one restart_work is queued at the same time, the 2nd
restart_work can very easily break the 1st restart work and make
recovery fail.

Add a ref count so that only one restart work runs until the
device has successfully recovered.

This patch only affects sdio chips.

Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00029.

Signed-off-by: Wen Gong <wgong@codeaurora.org>
---
 drivers/net/wireless/ath/ath10k/core.c | 8 ++++++++
 drivers/net/wireless/ath/ath10k/core.h | 2 ++
 drivers/net/wireless/ath/ath10k/mac.c  | 1 +
 3 files changed, 11 insertions(+)

Comments

Justin Capella Dec. 25, 2019, 3:14 p.m. UTC | #1
This does not only affect SDIO.

Why a semaphore / count? Could the conf_mutex be held earlier, or
perhaps change the state to ATH10K_STATE_RESTARTING first?
ath10k_reconfig_complete is also called in mac.c when channel is changed so

Wen Gong Dec. 31, 2019, 9:37 a.m. UTC | #2
On 2019-12-25 23:14, Justin Capella wrote:
> This does not only affect SDIO.
> 
> Why a semaphore / count? Could the conf_mutex be held earlier, or
> perhaps change the state to ATH10K_STATE_RESTARTING first?
> ath10k_reconfig_complete is also called in mac.c when channel is 
> changed so
patch v2:
https://patchwork.kernel.org/patch/11313853/
https://patchwork.kernel.org/patch/11313859/
Justin Capella Jan. 2, 2020, 3:10 a.m. UTC | #3
Instead of the atomic restart count, can the state be updated to
ATH10K_STATE_RESTARTING while holding
     mutex_unlock(&ar->conf_mutex);
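
One possible reading of this suggestion, as a minimal userspace sketch with pthreads rather than driver code. The names fake_ar, fake_state, restart_try_begin and restart_complete are hypothetical, and the real driver has more states than the two modelled here:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, reduced state set; the driver's enum has more values. */
enum fake_state { FAKE_STATE_ON, FAKE_STATE_RESTARTING };

/* Hypothetical stand-in for struct ath10k. */
struct fake_ar {
	pthread_mutex_t conf_mutex;
	enum fake_state state;	/* protected by conf_mutex */
};

/*
 * Claim the restart by flipping the state while conf_mutex is held.
 * Returns false if a restart has already been claimed.
 */
static bool restart_try_begin(struct fake_ar *ar)
{
	bool started = false;

	pthread_mutex_lock(&ar->conf_mutex);
	if (ar->state == FAKE_STATE_ON) {
		ar->state = FAKE_STATE_RESTARTING;
		started = true;
	}
	pthread_mutex_unlock(&ar->conf_mutex);

	return started;
}

/* Mark the recovery as finished so a later restart can be claimed. */
static void restart_complete(struct fake_ar *ar)
{
	pthread_mutex_lock(&ar->conf_mutex);
	ar->state = FAKE_STATE_ON;
	pthread_mutex_unlock(&ar->conf_mutex);
}
```

In this sketch the state stays RESTARTING after the mutex is dropped, so a second caller is refused until restart_complete() runs.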

I don't understand the bundles, but I wonder about the case when there
are multiple packets (n_rx_pkts) and if pkt_bundle_len might be the
one to check. Also if there needs to be a check that the len > sizeof
HTC HDR.

Wen Gong Jan. 2, 2020, 4:46 a.m. UTC | #4
On 2020-01-01 19:10, Justin Capella wrote:
> Instead of the atomic restart count, can the state be updated to
> ATH10K_STATE_RESTARTING while holding
>      mutex_unlock(&ar->conf_mutex);
> 
The recovery process begins with ath10k_core_restart and ends with
ath10k_reconfig_complete.
I already see that ath10k_core_restart takes mutex_lock(&ar->conf_mutex)
and releases it with mutex_unlock(&ar->conf_mutex),
but that is not enough. For example:
the 1st recovery has finished ath10k_core_restart but has not yet reached
ath10k_reconfig_complete, then the 2nd recovery
enters ath10k_core_restart; it destroys the 1st recovery and
makes the 1st recovery fail.
With this patch applied, after 18000+ recoveries, connect/scan/ping
still succeed.
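
The idea can be modelled in userspace with C11 atomics. This is only a sketch under assumptions: fake_dev, recovery_try_begin and recovery_complete are hypothetical names, and atomic_fetch_add stands in for the kernel's atomic helpers. The guard has to look at the counter value after the increment, not at a "reached zero" test:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's device struct. */
struct fake_dev {
	atomic_int restart_count;	/* 0 = idle, 1 = recovery in flight */
};

/*
 * Called at the start of the restart work. Returns true only for the
 * caller that moves the counter from 0 to 1.
 */
static bool recovery_try_begin(struct fake_dev *dev)
{
	/* atomic_fetch_add returns the old value, so add 1 for the new one. */
	int count = atomic_fetch_add(&dev->restart_count, 1) + 1;

	if (count > 1) {
		/* Another recovery is in flight: undo our increment and bail. */
		atomic_fetch_sub(&dev->restart_count, 1);
		return false;
	}

	return true;
}

/* Called once reconfiguration has completed; releases the guard. */
static void recovery_complete(struct fake_dev *dev)
{
	atomic_fetch_sub(&dev->restart_count, 1);
}
```

A second recovery_try_begin() between the first one and recovery_complete() is refused, which covers the window described above.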

> I don't understand the bundles, but I wonder about the case when there
> are multiple packets (n_rx_pkts) and if pkt_bundle_len might be the
> one to check. Also if there needs to be a check that the len > sizeof
> HTC HDR.
> 
htc_hdr->len is the length of the payload, so it is allowed to be
< sizeof HTC HDR, but not allowed to be > ATH10K_HTC_MBOX_MAX_PAYLOAD_LENGTH.
pkt_bundle is only used when there are many packets on the rx side;
otherwise rx is not bundled.
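
As a sketch of that rule only: the two sizes below are hypothetical placeholders, not the driver's values, and payload_len_ok is an illustrative helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes for illustration; the driver defines its own. */
#define FAKE_HTC_HDR_SIZE		8u
#define FAKE_MAX_PAYLOAD_LENGTH		2048u

/*
 * The length field describes the payload only, so there is no lower
 * bound tied to the header size; only the upper bound is enforced.
 */
static bool payload_len_ok(uint16_t payload_len)
{
	return payload_len <= FAKE_MAX_PAYLOAD_LENGTH;
}
```

A payload shorter than the header passes this check; only a length above the maximum is rejected.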

patch v3:
https://patchwork.kernel.org/patch/11313853/
https://patchwork.kernel.org/patch/11313859/

Patch

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index 91f131b87efc..4e0e8c86bdd4 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -2199,6 +2199,14 @@  static void ath10k_core_restart(struct work_struct *work)
 {
 	struct ath10k *ar = container_of(work, struct ath10k, restart_work);
 	int ret;
+	int restart_count;
+
+	restart_count = atomic_inc_return(&ar->restart_count);
+	if (restart_count > 1) {
+		ath10k_warn(ar, "can not restart, count: %d\n", restart_count);
+		atomic_dec(&ar->restart_count);
+		return;
+	}
 
 	set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
 
diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h
index e57b2e7235e3..810c99f2dc0e 100644
--- a/drivers/net/wireless/ath/ath10k/core.h
+++ b/drivers/net/wireless/ath/ath10k/core.h
@@ -982,6 +982,8 @@  struct ath10k {
 	/* protected by conf_mutex */
 	u8 ps_state_enable;
 
+	atomic_t restart_count;
+
 	bool nlo_enabled;
 	bool p2p;
 
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 3856edba7915..bc1574145e66 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -7208,6 +7208,7 @@  static void ath10k_reconfig_complete(struct ieee80211_hw *hw,
 		ath10k_info(ar, "device successfully recovered\n");
 		ar->state = ATH10K_STATE_ON;
 		ieee80211_wake_queues(ar->hw);
+		atomic_dec(&ar->restart_count);
 	}
 
 	mutex_unlock(&ar->conf_mutex);