Message ID | 1443117270-16790-1-git-send-email-poh@qca.qualcomm.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Peter Oh <poh@qca.qualcomm.com> writes: > ath10k driver is using dma_pool_alloc per packet and dma_pool_free > in coresponding at Tx completion. > Use of pre-allocated DMA buffer in Tx will improve saving CPU resource > by 5% while it consumes about 56KB memory more as trade off. > > Signed-off-by: Peter Oh <poh@qca.qualcomm.com> Applied, thanks.
Peter Oh <poh@qca.qualcomm.com> writes: > ath10k driver is using dma_pool_alloc per packet and dma_pool_free > in coresponding at Tx completion. > Use of pre-allocated DMA buffer in Tx will improve saving CPU resource > by 5% while it consumes about 56KB memory more as trade off. > > Signed-off-by: Peter Oh <poh@qca.qualcomm.com> I see randomly the warning below with ath.git master branch commit 2e88ba7ebe8d1. Can this patch cause that? [ 53.637883] ------------[ cut here ]------------ [ 53.637999] WARNING: CPU: 2 PID: 0 at lib/dma-debug.c:1090 check_unmap+0x815/0x940() [ 53.638070] ath10k_pci 0000:02:00.0: qca99x0 hw2.0 (0x01000000, 0x003801ff bmi 1:1) fw 10.4.1.00007 fwapi 5 bdapi 2 htt-ver 2.2 wmi-op 6 htt-op 4 cal otp max-sta 512 raw 0 hwcrypto 1 features no-p2p [ 53.638077] ath10k_pci 0000:02:00.0: debug 1 debugfs 1 tracing 1 dfs 1 testmode 1 [ 53.638205] ath10k_pci 0000:02:00.0: DMA-API: device driver tries to free an invalid DMA memory address [ 53.638270] Modules linked in: ath10k_pci ath10k_core ath mac80211 cfg80211 [last unloaded: cfg80211] [ 53.638800] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.3.0-rc3-wl-ath+ #1082 [ 53.638880] Hardware name: Hewlett-Packard HP ProBook 6540b/1722, BIOS 68CDD Ver. F.04 01/27/2010 [ 53.638951] 00000000 00000000 f4445de4 c133dd94 00000000 f4445e14 c10562de c1b3e12c [ 53.639551] f4445e40 00000000 c1b3d3b9 00000442 c13693c5 c13693c5 f422e840 f4445eb4 [ 53.640169] 00000000 f4445e2c c10563c3 00000009 f4445e24 c1b3e12c f4445e40 f4445ea4 [ 53.640802] Call Trace: [ 53.640874] [<c133dd94>] dump_stack+0x48/0x64 [ 53.640953] [<c10562de>] warn_slowpath_common+0x8e/0xd0 [ 53.641031] [<c13693c5>] ? check_unmap+0x815/0x940 [ 53.641115] [<c13693c5>] ? check_unmap+0x815/0x940 [ 53.641201] [<c10563c3>] warn_slowpath_fmt+0x33/0x40 [ 53.641277] [<c13693c5>] check_unmap+0x815/0x940 [ 53.641356] [<c1088e35>] ? local_clock+0x25/0x30 [ 53.641432] [<c136974c>] debug_dma_unmap_page+0x8c/0xa0 [ 53.641511] [<fa258aaf>] ath10k_pci_htt_tx_cb+0x8f/0xb0 [ath10k_pci] [ 53.641589] [<fa258af0>] ath10k_pci_htt_rx_cb+0x20/0x30 [ath10k_pci] [ 53.641668] [<fa25ca2c>] ath10k_ce_per_engine_service+0x5c/0xa0 [ath10k_pci] [ 53.641746] [<fa25cae7>] ath10k_ce_per_engine_service_any+0x77/0x90 [ath10k_pci] [ 53.641825] [<fa25aa4b>] ath10k_pci_tasklet+0x1b/0x50 [ath10k_pci] [ 53.642064] [<c105bc4e>] tasklet_action+0x9e/0xb0 [ 53.642142] [<c105b17b>] __do_softirq+0xbb/0x3c0 [ 53.642222] [<c105b0c0>] ? trace_event_raw_event_irq_handler_entry+0xa0/0xa0 [ 53.642300] [<c10059b9>] do_softirq_own_stack+0x29/0x40 [ 53.642375] <IRQ> [<c105b716>] irq_exit+0x86/0xb0 [ 53.642526] [<c18c2310>] do_IRQ+0x60/0x120 [ 53.642604] [<c10a4deb>] ? trace_hardirqs_off+0xb/0x10 [ 53.642680] [<c18c1931>] common_interrupt+0x31/0x38 [ 53.642760] [<c1694fa9>] ? cpuidle_enter_state+0xc9/0x350 [ 53.642835] [<c1696879>] ? menu_select+0x239/0x4b0 [ 53.642925] [<c1695264>] cpuidle_enter+0x14/0x20 [ 53.643063] [<c109becc>] call_cpuidle+0x3c/0x70 [ 53.643140] [<c109c0a9>] cpu_startup_entry+0x1a9/0x390 [ 53.643218] [<c1038b75>] start_secondary+0x105/0x150 [ 53.643294] ---[ end trace a5dc1d40148089eb ]---
Does it happen after you applied copy engine patchset or even without the patchset? I saw ath10k_pci_htt_tx_cb in backtrace which is not merged to master branch yet. If it only happens after the patchset, I'll apply them and look into it if any possibilities are there. On 10/08/2015 04:49 AM, Kalle Valo wrote: > Peter Oh <poh@qca.qualcomm.com> writes: > >> ath10k driver is using dma_pool_alloc per packet and dma_pool_free >> in coresponding at Tx completion. >> Use of pre-allocated DMA buffer in Tx will improve saving CPU resource >> by 5% while it consumes about 56KB memory more as trade off. >> >> Signed-off-by: Peter Oh <poh@qca.qualcomm.com> > I see randomly the warning below with ath.git master branch commit > 2e88ba7ebe8d1. Can this patch cause that? > > [ 53.637883] ------------[ cut here ]------------ > [ 53.637999] WARNING: CPU: 2 PID: 0 at lib/dma-debug.c:1090 > check_unmap+0x815/0x940() > [ 53.638070] ath10k_pci 0000:02:00.0: qca99x0 hw2.0 (0x01000000, > 0x003801ff bmi 1:1) fw 10.4.1.00007 fwapi 5 bdapi 2 htt-ver 2.2 wmi-op 6 > htt-op 4 cal otp max-sta 512 raw 0 hwcrypto 1 features no-p2p > [ 53.638077] ath10k_pci 0000:02:00.0: debug 1 debugfs 1 tracing 1 dfs 1 > testmode 1 > [ 53.638205] ath10k_pci 0000:02:00.0: DMA-API: device driver tries to > free an invalid DMA memory address > [ 53.638270] Modules linked in: ath10k_pci ath10k_core ath mac80211 > cfg80211 [last unloaded: cfg80211] > [ 53.638800] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.3.0-rc3-wl-ath+ > #1082 > [ 53.638880] Hardware name: Hewlett-Packard HP ProBook 6540b/1722, BIOS > 68CDD Ver. F.04 01/27/2010 > [ 53.638951] 00000000 00000000 f4445de4 c133dd94 00000000 f4445e14 > c10562de c1b3e12c > [ 53.639551] f4445e40 00000000 c1b3d3b9 00000442 c13693c5 c13693c5 > f422e840 f4445eb4 > [ 53.640169] 00000000 f4445e2c c10563c3 00000009 f4445e24 c1b3e12c > f4445e40 f4445ea4 > [ 53.640802] Call Trace: > [ 53.640874] [<c133dd94>] dump_stack+0x48/0x64 > [ 53.640953] [<c10562de>] warn_slowpath_common+0x8e/0xd0 > [ 53.641031] [<c13693c5>] ? check_unmap+0x815/0x940 > [ 53.641115] [<c13693c5>] ? check_unmap+0x815/0x940 > [ 53.641201] [<c10563c3>] warn_slowpath_fmt+0x33/0x40 > [ 53.641277] [<c13693c5>] check_unmap+0x815/0x940 > [ 53.641356] [<c1088e35>] ? local_clock+0x25/0x30 > [ 53.641432] [<c136974c>] debug_dma_unmap_page+0x8c/0xa0 > [ 53.641511] [<fa258aaf>] ath10k_pci_htt_tx_cb+0x8f/0xb0 [ath10k_pci] > [ 53.641589] [<fa258af0>] ath10k_pci_htt_rx_cb+0x20/0x30 [ath10k_pci] > [ 53.641668] [<fa25ca2c>] ath10k_ce_per_engine_service+0x5c/0xa0 > [ath10k_pci] > [ 53.641746] [<fa25cae7>] ath10k_ce_per_engine_service_any+0x77/0x90 > [ath10k_pci] > [ 53.641825] [<fa25aa4b>] ath10k_pci_tasklet+0x1b/0x50 [ath10k_pci] > [ 53.642064] [<c105bc4e>] tasklet_action+0x9e/0xb0 > [ 53.642142] [<c105b17b>] __do_softirq+0xbb/0x3c0 > [ 53.642222] [<c105b0c0>] ? > trace_event_raw_event_irq_handler_entry+0xa0/0xa0 > [ 53.642300] [<c10059b9>] do_softirq_own_stack+0x29/0x40 > [ 53.642375] <IRQ> [<c105b716>] irq_exit+0x86/0xb0 > [ 53.642526] [<c18c2310>] do_IRQ+0x60/0x120 > [ 53.642604] [<c10a4deb>] ? trace_hardirqs_off+0xb/0x10 > [ 53.642680] [<c18c1931>] common_interrupt+0x31/0x38 > [ 53.642760] [<c1694fa9>] ? cpuidle_enter_state+0xc9/0x350 > [ 53.642835] [<c1696879>] ? menu_select+0x239/0x4b0 > [ 53.642925] [<c1695264>] cpuidle_enter+0x14/0x20 > [ 53.643063] [<c109becc>] call_cpuidle+0x3c/0x70 > [ 53.643140] [<c109c0a9>] cpu_startup_entry+0x1a9/0x390 > [ 53.643218] [<c1038b75>] start_secondary+0x105/0x150 > [ 53.643294] ---[ end trace a5dc1d40148089eb ]--- >
Peter Oh <poh@codeaurora.org> writes: > Does it happen after you applied copy engine patchset or even without > the patchset? I saw ath10k_pci_htt_tx_cb in backtrace which is not > merged to master branch yet. If it only happens after the patchset, > I'll apply them and look into it if any possibilities are there. Good point. I thought I was able to reproduce the bug with a clean master branch, but I now checked the logs and all three cases of this warning had ath10k_pci_htt_tx_cb in the stack trace. So I must have accidentally tested with a temporary branch where the pending branch was merged with master branch.
I did quick glance on the warning. The dma unmap warning is caused by dma_unmap_single called per every received packet in ath10k_pci_htt_tx_cb() <- ath10k_pci_htt_rx_cb(). The physical address is assigned by rx refill function for rx ring. So it looks like copy engine patchset has a bug. FYI, pre-allocated dma is coherent dma used in Tx only and only freed when ath10k driver stops. On 10/08/2015 10:17 AM, Peter Oh wrote: > Does it happen after you applied copy engine patchset or even without > the patchset? > I saw ath10k_pci_htt_tx_cb in backtrace which is not merged to master > branch yet. > If it only happens after the patchset, I'll apply them and look into > it if any possibilities are there. > > On 10/08/2015 04:49 AM, Kalle Valo wrote: >> Peter Oh <poh@qca.qualcomm.com> writes: >> >>> ath10k driver is using dma_pool_alloc per packet and dma_pool_free >>> in coresponding at Tx completion. >>> Use of pre-allocated DMA buffer in Tx will improve saving CPU resource >>> by 5% while it consumes about 56KB memory more as trade off. >>> >>> Signed-off-by: Peter Oh <poh@qca.qualcomm.com> >> I see randomly the warning below with ath.git master branch commit >> 2e88ba7ebe8d1. Can this patch cause that? >> >> [ 53.637883] ------------[ cut here ]------------ >> [ 53.637999] WARNING: CPU: 2 PID: 0 at lib/dma-debug.c:1090 >> check_unmap+0x815/0x940() >> [ 53.638070] ath10k_pci 0000:02:00.0: qca99x0 hw2.0 (0x01000000, >> 0x003801ff bmi 1:1) fw 10.4.1.00007 fwapi 5 bdapi 2 htt-ver 2.2 wmi-op 6 >> htt-op 4 cal otp max-sta 512 raw 0 hwcrypto 1 features no-p2p >> [ 53.638077] ath10k_pci 0000:02:00.0: debug 1 debugfs 1 tracing 1 dfs > 1 >> testmode 1 >> [ 53.638205] ath10k_pci 0000:02:00.0: DMA-API: device driver tries to >> free an invalid DMA memory address >> [ 53.638270] Modules linked in: ath10k_pci ath10k_core ath mac80211 >> cfg80211 [last unloaded: cfg80211] >> [ 53.638800] CPU: 2 PID: 0 Comm: swapper/2 Not tainted > 4.3.0-rc3-wl-ath+ >> #1082 >> [ 53.638880] Hardware name: Hewlett-Packard HP ProBook 6540b/1722, > BIOS >> 68CDD Ver. F.04 01/27/2010 >> [ 53.638951] 00000000 00000000 f4445de4 c133dd94 00000000 f4445e14 >> c10562de c1b3e12c >> [ 53.639551] f4445e40 00000000 c1b3d3b9 00000442 c13693c5 c13693c5 >> f422e840 f4445eb4 >> [ 53.640169] 00000000 f4445e2c c10563c3 00000009 f4445e24 c1b3e12c >> f4445e40 f4445ea4 >> [ 53.640802] Call Trace: >> [ 53.640874] [<c133dd94>] dump_stack+0x48/0x64 >> [ 53.640953] [<c10562de>] warn_slowpath_common+0x8e/0xd0 >> [ 53.641031] [<c13693c5>] ? check_unmap+0x815/0x940 >> [ 53.641115] [<c13693c5>] ? check_unmap+0x815/0x940 >> [ 53.641201] [<c10563c3>] warn_slowpath_fmt+0x33/0x40 >> [ 53.641277] [<c13693c5>] check_unmap+0x815/0x940 >> [ 53.641356] [<c1088e35>] ? local_clock+0x25/0x30 >> [ 53.641432] [<c136974c>] debug_dma_unmap_page+0x8c/0xa0 >> [ 53.641511] [<fa258aaf>] ath10k_pci_htt_tx_cb+0x8f/0xb0 [ath10k_pci] >> [ 53.641589] [<fa258af0>] ath10k_pci_htt_rx_cb+0x20/0x30 [ath10k_pci] >> [ 53.641668] [<fa25ca2c>] ath10k_ce_per_engine_service+0x5c/0xa0 >> [ath10k_pci] >> [ 53.641746] [<fa25cae7>] ath10k_ce_per_engine_service_any+0x77/0x90 >> [ath10k_pci] >> [ 53.641825] [<fa25aa4b>] ath10k_pci_tasklet+0x1b/0x50 [ath10k_pci] >> [ 53.642064] [<c105bc4e>] tasklet_action+0x9e/0xb0 >> [ 53.642142] [<c105b17b>] __do_softirq+0xbb/0x3c0 >> [ 53.642222] [<c105b0c0>] ? >> trace_event_raw_event_irq_handler_entry+0xa0/0xa0 >> [ 53.642300] [<c10059b9>] do_softirq_own_stack+0x29/0x40 >> [ 53.642375] <IRQ> [<c105b716>] irq_exit+0x86/0xb0 >> [ 53.642526] [<c18c2310>] do_IRQ+0x60/0x120 >> [ 53.642604] [<c10a4deb>] ? trace_hardirqs_off+0xb/0x10 >> [ 53.642680] [<c18c1931>] common_interrupt+0x31/0x38 >> [ 53.642760] [<c1694fa9>] ? cpuidle_enter_state+0xc9/0x350 >> [ 53.642835] [<c1696879>] ? menu_select+0x239/0x4b0 >> [ 53.642925] [<c1695264>] cpuidle_enter+0x14/0x20 >> [ 53.643063] [<c109becc>] call_cpuidle+0x3c/0x70 >> [ 53.643140] [<c109c0a9>] cpu_startup_entry+0x1a9/0x390 >> [ 53.643218] [<c1038b75>] start_secondary+0x105/0x150 >> [ 53.643294] ---[ end trace a5dc1d40148089eb ]--- >> > > > _______________________________________________ > ath10k mailing list > ath10k@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/ath10k
diff --git a/drivers/net/wireless/ath/ath10k/htt.h b/drivers/net/wireless/ath/ath10k/htt.h index 5a8e4ea..db0a99b 100644 --- a/drivers/net/wireless/ath/ath10k/htt.h +++ b/drivers/net/wireless/ath/ath10k/htt.h @@ -1488,7 +1488,6 @@ struct ath10k_htt { int num_pending_mgmt_tx; struct idr pending_tx; wait_queue_head_t empty_tx_wq; - struct dma_pool *tx_pool; /* set if host-fw communication goes haywire * used to avoid further failures */ @@ -1509,6 +1508,11 @@ struct ath10k_htt { dma_addr_t paddr; struct htt_msdu_ext_desc *vaddr; } frag_desc; + + struct { + dma_addr_t paddr; + struct ath10k_htt_txbuf *vaddr; + } txbuf; }; #define RX_HTT_HDR_STATUS_LEN 64 diff --git a/drivers/net/wireless/ath/ath10k/htt_tx.c b/drivers/net/wireless/ath/ath10k/htt_tx.c index eb5ba9b..5bed087 100644 --- a/drivers/net/wireless/ath/ath10k/htt_tx.c +++ b/drivers/net/wireless/ath/ath10k/htt_tx.c @@ -108,9 +108,12 @@ int ath10k_htt_tx_alloc(struct ath10k_htt *htt) spin_lock_init(&htt->tx_lock); idr_init(&htt->pending_tx); - htt->tx_pool = dma_pool_create("ath10k htt tx pool", htt->ar->dev, - sizeof(struct ath10k_htt_txbuf), 4, 0); - if (!htt->tx_pool) { + size = htt->max_num_pending_tx * sizeof(struct ath10k_htt_txbuf); + htt->txbuf.vaddr = dma_alloc_coherent(ar->dev, size, + &htt->txbuf.paddr, + GFP_DMA); + if (!htt->txbuf.vaddr) { + ath10k_err(ar, "failed to alloc tx buffer\n"); ret = -ENOMEM; goto free_idr_pending_tx; } @@ -125,14 +128,17 @@ int ath10k_htt_tx_alloc(struct ath10k_htt *htt) if (!htt->frag_desc.vaddr) { ath10k_warn(ar, "failed to alloc fragment desc memory\n"); ret = -ENOMEM; - goto free_tx_pool; + goto free_txbuf; } skip_frag_desc_alloc: return 0; -free_tx_pool: - dma_pool_destroy(htt->tx_pool); +free_txbuf: + size = htt->max_num_pending_tx * + sizeof(struct ath10k_htt_txbuf); + dma_free_coherent(htt->ar->dev, size, htt->txbuf.vaddr, + htt->txbuf.paddr); free_idr_pending_tx: idr_destroy(&htt->pending_tx); return ret; @@ -160,7 +166,13 @@ void ath10k_htt_tx_free(struct ath10k_htt *htt) idr_for_each(&htt->pending_tx, ath10k_htt_tx_clean_up_pending, htt->ar); idr_destroy(&htt->pending_tx); - dma_pool_destroy(htt->tx_pool); + + if (htt->txbuf.vaddr) { + size = htt->max_num_pending_tx * + sizeof(struct ath10k_htt_txbuf); + dma_free_coherent(htt->ar->dev, size, htt->txbuf.vaddr, + htt->txbuf.paddr); + } if (htt->frag_desc.vaddr) { size = htt->max_num_pending_tx * @@ -521,7 +533,6 @@ int ath10k_htt_tx(struct ath10k_htt *htt, struct sk_buff *msdu) int res; u8 flags0 = 0; u16 msdu_id, flags1 = 0; - dma_addr_t paddr = 0; u32 frags_paddr = 0; struct htt_msdu_ext_desc *ext_desc = NULL; bool limit_mgmt_desc = false; @@ -550,13 +561,9 @@ int ath10k_htt_tx(struct ath10k_htt *htt, struct sk_buff *msdu) prefetch_len = min(htt->prefetch_len, msdu->len); prefetch_len = roundup(prefetch_len, 4); - skb_cb->htt.txbuf = dma_pool_alloc(htt->tx_pool, GFP_ATOMIC, - &paddr); - if (!skb_cb->htt.txbuf) { - res = -ENOMEM; - goto err_free_msdu_id; - } - skb_cb->htt.txbuf_paddr = paddr; + skb_cb->htt.txbuf = &htt->txbuf.vaddr[msdu_id]; + skb_cb->htt.txbuf_paddr = htt->txbuf.paddr + + (sizeof(struct ath10k_htt_txbuf) * msdu_id); if ((ieee80211_is_action(hdr->frame_control) || ieee80211_is_deauth(hdr->frame_control) || @@ -574,7 +581,7 @@ int ath10k_htt_tx(struct ath10k_htt *htt, struct sk_buff *msdu) res = dma_mapping_error(dev, skb_cb->paddr); if (res) { res = -EIO; - goto err_free_txbuf; + goto err_free_msdu_id; } switch (skb_cb->txmode) { @@ -706,10 +713,6 @@ int ath10k_htt_tx(struct ath10k_htt *htt, struct sk_buff *msdu) err_unmap_msdu: dma_unmap_single(dev, skb_cb->paddr, msdu->len, DMA_TO_DEVICE); -err_free_txbuf: - dma_pool_free(htt->tx_pool, - skb_cb->htt.txbuf, - skb_cb->htt.txbuf_paddr); err_free_msdu_id: spin_lock_bh(&htt->tx_lock); ath10k_htt_tx_free_msdu_id(htt, msdu_id); diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c index 7db7d50..6d1105a 100644 --- a/drivers/net/wireless/ath/ath10k/txrx.c +++ b/drivers/net/wireless/ath/ath10k/txrx.c @@ -92,11 +92,6 @@ void ath10k_txrx_tx_unref(struct ath10k_htt *htt, skb_cb = ATH10K_SKB_CB(msdu); dma_unmap_single(dev, skb_cb->paddr, msdu->len, DMA_TO_DEVICE); - if (skb_cb->htt.txbuf) - dma_pool_free(htt->tx_pool, - skb_cb->htt.txbuf, - skb_cb->htt.txbuf_paddr); - ath10k_report_offchan_tx(htt->ar, msdu); info = IEEE80211_SKB_CB(msdu);
ath10k driver is using dma_pool_alloc per packet and dma_pool_free in coresponding at Tx completion. Use of pre-allocated DMA buffer in Tx will improve saving CPU resource by 5% while it consumes about 56KB memory more as trade off. Signed-off-by: Peter Oh <poh@qca.qualcomm.com> --- drivers/net/wireless/ath/ath10k/htt.h | 6 ++++- drivers/net/wireless/ath/ath10k/htt_tx.c | 43 +++++++++++++++++--------------- drivers/net/wireless/ath/ath10k/txrx.c | 5 ---- 3 files changed, 28 insertions(+), 26 deletions(-)