Message ID | 20241106090439.3487958-2-ih@simonwunderlich.de (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Toke Høiland-Jørgensen |
Headers | show |
Series | [v2,1/2] wifi: ath9k: work around AR_CFG 0xdeadbeef chip hang | expand |
On Wednesday, 6 November 2024 10:04:39 CET Issam Hamdi wrote: [...] > This patch originally developed by "Simon Wunderlich <simon.wunderlich@open-mesh.com>" > and "Sven Eckelmann <sven.eckelmann@open-mesh.com>" Am I the only person which finds this style of adding information about "Co- authors" weird? [...] > Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> > Signed-off-by: Sven Eckelmann <se@simonwunderlich.de> > Signed-off-by: Issam Hamdi <ih@simonwunderlich.de> > --- > v2: change the "Co-developed-by" to "Signed-off-by", remove the dependency I think Kalle meant that "Co-developed-by" should be followed by a "Signed-off-by" - not that "Co-developed-by" should be removed. I was not part of the delivery path for this version of the patch. But current Signed-off-by seem to suggest this. > on CONFIG_ATH9K_DEBUGFS and add more information in the commit description And please don't reply to the old thread when sending a new patchset - this becomes really unreadable after a while. You can simply use the method which b4 uses and just reference the old thread in your mail. Something like: Changes in v2: - change the "Co-developed-by" to "Signed-off-by" - remove the dependency on CONFIG_ATH9K_DEBUGFS - add more information in the commit description - Link to v1: https://lore.kernel.org/r/20241104171627.3789199-1-ih@simonwunderlich.de [...] > +static bool ath_hw_hang_deaf(struct ath_softc *sc) > +{ > +#ifdef CONFIG_ATH9K_TX99 > + return false; > +#else > + struct ath_common *common = ath9k_hw_common(sc->sc_ah); > + u32 interrupts, interrupt_per_s; > + unsigned int interval; > + > + /* get historic data */ > + interval = jiffies_to_msecs(jiffies - sc->last_check_time); > + if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) > + interrupts = sc->debug.stats.istats.rxlp; > + else > + interrupts = sc->debug.stats.istats.rxok; You can't simply access sc->debug.stats.istats. sc->debug is only available when building with CONFIG_ATH9K_DEBUGFS. See ath9k.c struct ath_softc { [...] #ifdef CONFIG_ATH9K_DEBUGFS struct ath9k_debug debug; #endif [...] } > + /* sanity check, should be 4 seconds */ > + if (interval > 10000 || interval < 1000) Here you have hardcoded values but the actual interval is hidden behind ATH_HANG_WORK_INTERVAL. Two things which now are rather disconnected and might cause problems in the future (when somebody fiddles around with ATH_HANG_WORK_INTERVAL). Overall, the proposal from Toke seems to be a lot better integrated in the HW check style which was introduced by Felix in the beginning of 2017 [1]. At the same time there was a proposal by Felix [2] - which diverged too much from our original patch (and as a result caused too many resets) [3]. I would therefore propose to check Toke's version and test handles the problem correctly. Kind regards, Sven [1] https://github.com/openwrt/openwrt/commit/b94177e10fc72f9309eae7459c3570e5c080e960 [2] https://patchwork.kernel.org/project/linux-wireless/patch/20170125163654.66431-3-nbd@nbd.name/ [3] https://lore.kernel.org/all/2081606.z26xgMiW1A@prime/
Sven Eckelmann <se@simonwunderlich.de> writes: > Overall, the proposal from Toke seems to be a lot better integrated in the HW > check style which was introduced by Felix in the beginning of 2017 [1]. > > At the same time there was a proposal by Felix [2] - which diverged too much > from our original patch (and as a result caused too many resets) [3]. I would > therefore propose to check Toke's version and test handles the problem > correctly. Yes, agreed. I did actually write up a commit message for that, let me send it as a proper patch... -Toke
Sven Eckelmann <se@simonwunderlich.de> writes: > On Wednesday, 6 November 2024 10:04:39 CET Issam Hamdi wrote: > [...] >> This patch originally developed by "Simon Wunderlich <simon.wunderlich@open-mesh.com>" >> and "Sven Eckelmann <sven.eckelmann@open-mesh.com>" > > Am I the only person which finds this style of adding information about "Co- > authors" weird? No, you are not. I also find it weird. > [...] >> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> >> Signed-off-by: Sven Eckelmann <se@simonwunderlich.de> >> Signed-off-by: Issam Hamdi <ih@simonwunderlich.de> >> --- >> v2: change the "Co-developed-by" to "Signed-off-by", remove the dependency > > I think Kalle meant that "Co-developed-by" should be followed by a > "Signed-off-by" - not that "Co-developed-by" should be removed. Correct, to my understanding having both c-d-b and s-o-b is the preferred format. >> on CONFIG_ATH9K_DEBUGFS and add more information in the commit description > > And please don't reply to the old thread when sending a new patchset - this > becomes really unreadable after a while. You can simply use the method which > b4 uses and just reference the old thread in your mail. Something like: > > Changes in v2: > - change the "Co-developed-by" to "Signed-off-by" > - remove the dependency on CONFIG_ATH9K_DEBUGFS > - add more information in the commit description > - Link to v1: https://lore.kernel.org/r/20241104171627.3789199-1-ih@simonwunderlich.de Yes, this style is very much recommended. Having links to older versions helps reviewers.
diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h index c1ce081445a9..2b98c69fa37f 100644 --- a/drivers/net/wireless/ath/ath9k/ath9k.h +++ b/drivers/net/wireless/ath/ath9k/ath9k.h @@ -1026,6 +1026,9 @@ struct ath_softc { short nbcnvifs; unsigned long ps_usecount; + unsigned long last_check_time; + u32 last_check_interrupts; + struct ath_rx rx; struct ath_tx tx; struct ath_beacon beacon; diff --git a/drivers/net/wireless/ath/ath9k/debug.c b/drivers/net/wireless/ath/ath9k/debug.c index 6b2469a01f17..4128cf691166 100644 --- a/drivers/net/wireless/ath/ath9k/debug.c +++ b/drivers/net/wireless/ath/ath9k/debug.c @@ -751,6 +751,7 @@ static int read_file_reset(struct seq_file *file, void *data) [RESET_TX_DMA_ERROR] = "Tx DMA stop error", [RESET_RX_DMA_ERROR] = "Rx DMA stop error", [RESET_TYPE_DEADBEEF] = "deadbeef hang", + [RESET_TYPE_DEAF] = "deaf hang", }; int i; diff --git a/drivers/net/wireless/ath/ath9k/debug.h b/drivers/net/wireless/ath/ath9k/debug.h index 6ebb6053a8c1..76e27860455c 100644 --- a/drivers/net/wireless/ath/ath9k/debug.h +++ b/drivers/net/wireless/ath/ath9k/debug.h @@ -54,6 +54,7 @@ enum ath_reset_type { RESET_TX_DMA_ERROR, RESET_RX_DMA_ERROR, RESET_TYPE_DEADBEEF, + RESET_TYPE_DEAF, __RESET_TYPE_MAX }; diff --git a/drivers/net/wireless/ath/ath9k/link.c b/drivers/net/wireless/ath/ath9k/link.c index 37438960c278..2700598c6336 100644 --- a/drivers/net/wireless/ath/ath9k/link.c +++ b/drivers/net/wireless/ath/ath9k/link.c @@ -162,13 +162,59 @@ static bool ath_hw_hang_deadbeef(struct ath_softc *sc) return true; } +static bool ath_hw_hang_deaf(struct ath_softc *sc) +{ +#ifdef CONFIG_ATH9K_TX99 + return false; +#else + struct ath_common *common = ath9k_hw_common(sc->sc_ah); + u32 interrupts, interrupt_per_s; + unsigned int interval; + + /* get historic data */ + interval = jiffies_to_msecs(jiffies - sc->last_check_time); + if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) + interrupts = sc->debug.stats.istats.rxlp; + else + interrupts = sc->debug.stats.istats.rxok; + + interrupts -= sc->last_check_interrupts; + + /* save current data */ + sc->last_check_time = jiffies; + if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) + sc->last_check_interrupts = sc->debug.stats.istats.rxlp; + else + sc->last_check_interrupts = sc->debug.stats.istats.rxok; + + /* sanity check, should be 4 seconds */ + if (interval > 10000 || interval < 1000) + return false; + + /* should be at least one interrupt per second */ + interrupt_per_s = interrupts / (interval / 1000); + if (interrupt_per_s >= 1) + return false; + + ath_dbg(common, RESET, + "RX deaf hang is detected. Schedule chip reset\n"); + ath9k_queue_reset(sc, RESET_TYPE_DEAF); + + return true; +#endif +} + void ath_hw_hang_work(struct work_struct *work) { struct ath_softc *sc = container_of(work, struct ath_softc, hw_hang_work.work); - ath_hw_hang_deadbeef(sc); + if (ath_hw_hang_deadbeef(sc)) + goto requeue_worker; + ath_hw_hang_deaf(sc); + +requeue_worker: ieee80211_queue_delayed_work(sc->hw, &sc->hw_hang_work, msecs_to_jiffies(ATH_HANG_WORK_INTERVAL)); }