diff mbox series

[4/4] wcn36xx: Put DXE block into reset before freeing memory

Message ID 20211015131741.2455824-5-bryan.odonoghue@linaro.org (mailing list archive)
State Changes Requested
Delegated to: Kalle Valo
Headers show
Series wcn36xx: Fix DMA buffer allocation and free logic | expand

Commit Message

Bryan O'Donoghue Oct. 15, 2021, 1:17 p.m. UTC
When deiniting the DXE hardware we should reset the block to ensure there
is no spurious DMA write transaction from the downstream WCNSS to upstream
MSM at a skbuff address we will have released.

This is actually a pretty serious bug. Immediately after the reset we
release skbs, skbs which are from the perspective of the WCNSS DXE still
valid addresses for DMA.

Without first placing the DXE block into reset, it is possible for an
upstream DMA transaction to write to skbs we have freed.

We have seen some backtraces from usage in testing on 50k+ devices which
indicates an invalid RX of an APs beacon to unmapped memory.

The logical conclusion is that an RX transaction happened to a region of
memory that was previously valid but was subsequently released.

The only time such a window of opportunity exists is when we have
deallocated the skbs attached to the DMA BDs in other words after doing
wcn36xx_stop().

If we free the skbs on the DMA channel, we need to make sure we have
quiesced potential DMA on that channel prior to freeing.

This patch should eliminate that error.

Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
---
 drivers/net/wireless/ath/wcn36xx/dxe.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Bryan O'Donoghue Oct. 17, 2021, 1:28 a.m. UTC | #1
On 15/10/2021 14:17, Bryan O'Donoghue wrote:
> When deiniting the DXE hardware we should reset the block to ensure there
> is no spurious DMA write transaction from the downstream WCNSS to upstream
> MSM at a skbuff address we will have released.
> 
> This is actually a pretty serious bug. Immediately after the reset we
> release skbs, skbs which are from the perspective of the WCNSS DXE still
> valid addresses for DMA.
> 
> Without first placing the DXE block into reset, it is possible for an
> upstream DMA transaction to write to skbs we have freed.
> 
> We have seen some backtraces from usage in testing on 50k+ devices which
> indicates an invalid RX of an APs beacon to unmapped memory.
> 
> The logical conclusion is that an RX transaction happened to a region of
> memory that was previously valid but was subsequently released.
> 
> The only time such a window of opportunity exists is when we have
> deallocated the skbs attached to the DMA BDs in other words after doing
> wcn36xx_stop().
> 
> If we free the skbs on the DMA channel, we need to make sure we have
> quiesced potential DMA on that channel prior to freeing.
> 
> This patch should eliminate that error.
> 
> Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
> ---
>   drivers/net/wireless/ath/wcn36xx/dxe.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c
> index e89002502869a..56f605c23f36c 100644
> --- a/drivers/net/wireless/ath/wcn36xx/dxe.c
> +++ b/drivers/net/wireless/ath/wcn36xx/dxe.c
> @@ -1020,6 +1020,8 @@ int wcn36xx_dxe_init(struct wcn36xx *wcn)
>   
>   void wcn36xx_dxe_deinit(struct wcn36xx *wcn)
>   {
> +	int reg_data = 0;
> +
>   	/* Disable channel interrupts */
>   	wcn36xx_dxe_disable_ch_int(wcn, WCN36XX_INT_MASK_CHAN_RX_H);
>   	wcn36xx_dxe_disable_ch_int(wcn, WCN36XX_INT_MASK_CHAN_RX_L);
> @@ -1035,6 +1037,10 @@ void wcn36xx_dxe_deinit(struct wcn36xx *wcn)
>   		wcn->tx_ack_skb = NULL;
>   	}
>   
> +	/* Put the DXE block into reset before freeing memory */
> +	reg_data = WCN36XX_DXE_REG_RESET;
> +	wcn36xx_dxe_write_register(wcn, WCN36XX_DXE_REG_CSR_RESET, reg_data);
> +
>   	wcn36xx_dxe_ch_free_skbs(wcn, &wcn->dxe_rx_l_ch);
>   	wcn36xx_dxe_ch_free_skbs(wcn, &wcn->dxe_rx_h_ch);
>   
> 

Johannes asked me separately if we need to wait for the quiesence to 
complete.

I don't see that downstream but, that doesn't mean we shouldn't do it.

So I'll investigate that.

Also - now that I look at this code, this being the second usage of the 
CSR_RESET like this, also means the reset can be functionally decomposed 
into a routine.

So - I'll look into the first and definitely do the second as a V2
diff mbox series

Patch

diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c
index e89002502869a..56f605c23f36c 100644
--- a/drivers/net/wireless/ath/wcn36xx/dxe.c
+++ b/drivers/net/wireless/ath/wcn36xx/dxe.c
@@ -1020,6 +1020,8 @@  int wcn36xx_dxe_init(struct wcn36xx *wcn)
 
 void wcn36xx_dxe_deinit(struct wcn36xx *wcn)
 {
+	int reg_data = 0;
+
 	/* Disable channel interrupts */
 	wcn36xx_dxe_disable_ch_int(wcn, WCN36XX_INT_MASK_CHAN_RX_H);
 	wcn36xx_dxe_disable_ch_int(wcn, WCN36XX_INT_MASK_CHAN_RX_L);
@@ -1035,6 +1037,10 @@  void wcn36xx_dxe_deinit(struct wcn36xx *wcn)
 		wcn->tx_ack_skb = NULL;
 	}
 
+	/* Put the DXE block into reset before freeing memory */
+	reg_data = WCN36XX_DXE_REG_RESET;
+	wcn36xx_dxe_write_register(wcn, WCN36XX_DXE_REG_CSR_RESET, reg_data);
+
 	wcn36xx_dxe_ch_free_skbs(wcn, &wcn->dxe_rx_l_ch);
 	wcn36xx_dxe_ch_free_skbs(wcn, &wcn->dxe_rx_h_ch);