ath10k: change len of trace_ath10k_log_dbg_dump for large buffer size

Kernel panic every time in kernel when running below case:
steps:
1. connect to an AP with good signal strength
2. echo 0x7f > /sys/kernel/debug/ieee80211/phy0/ath10k/pktlog_filter
3. echo 0xffff 0 > /sys/kernel/debug/ieee80211/phy0/ath10k/fw_dbglog
4. echo 0 > /sys/module/ath10k_core/parameters/debug_mask
5. sudo trace-cmd record  -e ath10k
6. run "iperf -c 192.168.1.1 -u -b 100m -i 1 -t 30"
7. kernel panic immeditely

It always crash at trace_event_raw_event_ath10k_xxx, below is 2 sample:

dmesg 1:
[   58.038310] Unable to handle kernel paging request at virtual address 003130393837363d
[   58.038325] Mem abort info:
[   58.038330]   ESR = 0x96000004
[   58.038336]   Exception class = DABT (current EL), IL = 32 bits
[   58.038340]   SET = 0, FnV = 0
[   58.038345]   EA = 0, S1PTW = 0
[   58.038349] Data abort info:
[   58.038353]   ISV = 0, ISS = 0x00000004
[   58.038358]   CM = 0, WnR = 0
[   58.038363] [003130393837363d] address between user and kernel address ranges
[   58.038370] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   58.038377] Modules linked in: rfcomm algif_hash algif_skcipher
[   58.038385] Unable to handle kernel paging request at virtual address 003938373635343b
[   58.038386]  af_alg uinput cros_ec_rpmsg mtk_dip mtk_mdp3
[   58.038395] Mem abort info:
[   58.038403]   ESR = 0x96000004
[   58.038412]   Exception class = DABT (current EL), IL = 32 bits
[   58.038419]   SET = 0, FnV = 0
[   58.038427]   EA = 0, S1PTW = 0
[   58.038434] Data abort info:
[   58.038442]   ISV = 0, ISS = 0x00000004
[   58.038449]   CM = 0, WnR = 0
[   58.038457] [003938373635343b] address between user and kernel address ranges
[   58.038486] Process kworker/u16:1 (pid: 141, stack limit = 0x000000009bd6ba2e)
[   58.038496] CPU: 1 PID: 141 Comm: kworker/u16:1 Not tainted 4.19.139 #162
[   58.038501] Hardware name: MediaTek krane sku176 board (DT)
[   58.038515] Workqueue: ath10k_sdio_wq ath10k_sdio_write_async_work [ath10k_sdio]
[   58.038523] pstate: 80000005 (Nzcv daif -PAN -UAO)
[   58.038536] pc : filter_match_preds+0x20/0x94
[   58.038546] lr : trace_event_buffer_commit+0x150/0x22c
[   58.038550] sp : ffffff80088a39f0
[   58.038555] x29: ffffff80088a3a20 x28: 000000000000000c
[   58.038561] x27: 000000000000000c x26: 0000000000000001
[   58.038567] x25: ffffffdfd96ef008 x24: 0000000000000000
[   58.038573] x23: 0000000000000000 x22: ffffffe03b006c00
[   58.038580] x21: ffffffdfd96ef000 x20: 3231303938373635
[   58.038586] x19: ffffffe02de9b160 x18: 0000000000000020
[   58.038592] x17: 0000000000100000 x16: ffffffa4e67937f0
[   58.038598] x15: 0000000000100000 x14: 0000393837363534
[   58.038604] x13: 3332313039383736 x12: 3534333231303938
[   58.038611] x11: 3736353433323130 x10: 3938373635343332
[   58.038617] x9 : 3130393837363534 x8 : 000000000000040b
[   58.038623] x7 : 3534333231303938 x6 : ffffffdfd96fb2d1
[   58.038629] x5 : 0000000000000000 x4 : 0000000000000000
[   58.038635] x3 : 000000000000c2cd x2 : ffffffffffffffc0
[   58.038641] x1 : ffffffdfd96ef008 x0 : ffffffdfd96f7080
[   58.038648] Call trace:
[   58.038655]  filter_match_preds+0x20/0x94
[   58.038661]  trace_event_buffer_commit+0x150/0x22c
[   58.038698]  trace_event_raw_event_ath10k_log_dbg_dump+0x208/0x240 [ath10k_core]
[   58.038716]  ath10k_dbg_dump+0x1bc/0x21c [ath10k_core]
[   58.038724]  ath10k_sdio_write+0xc8/0xe8 [ath10k_sdio]
[   58.038730]  ath10k_sdio_write_async_work+0x11c/0x1f4 [ath10k_sdio]
[   58.038738]  process_one_work+0x210/0x418
[   58.038744]  worker_thread+0x234/0x3dc
[   58.038751]  kthread+0x120/0x140
[   58.038758]  ret_from_fork+0x10/0x18
[   58.038766] Code: 9100c3fd b40002e0 f9400014 b40002b4 (f9400680)
[   58.038773] ---[ end trace 8b9d2463f441eb7f ]---
[   58.051846] Kernel panic - not syncing: Fatal exception
[   58.051864] SMP: stopping secondary CPUs

dmesg 2:
[  228.692905] Unable to handle kernel paging request at virtual address 0033323130393837
[  228.692913] Mem abort info:
[  228.692920]   ESR = 0x96000004
[  228.692927]   Exception class = DABT (current EL), IL = 32 bits
[  228.692933]   SET = 0, FnV = 0
[  228.692938]   EA = 0, S1PTW = 0
[  228.692943] Data abort info:
[  228.692953]   ISV = 0, ISS = 0x00000004
[  228.692965]   CM = 0, WnR = 0
[  228.692975] [0033323130393837] address between user and kernel address ranges
[  228.692983] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[  228.693078] Process kworker/u16:2 (pid: 144, stack limit = 0x000000009f882493)
[  228.693088] CPU: 6 PID: 144 Comm: kworker/u16:2 Tainted: G        W         4.19.139 #162
[  228.693095] Hardware name: MediaTek krane sku176 board (DT)
[  228.693125] Workqueue: ath10k_tx_complete_wq ath10k_htc_tx_complete_work [ath10k_core]
[  228.693137] pstate: 80000005 (Nzcv daif -PAN -UAO)
[  228.693152] pc : filter_match_preds+0x38/0x94
[  228.693164] lr : trace_event_buffer_commit+0x150/0x22c
[  228.693170] sp : ffffff80088b3b20
[  228.693175] x29: ffffff80088b3b50 x28: 0000000000000402
[  228.693183] x27: fffffff040001f60 x26: ffffff9d4ba34000
[  228.693191] x25: fffffff040af5004 x24: 0000000000000000
[  228.693198] x23: 0000000000000000 x22: fffffff040af7100
[  228.693205] x21: 0000000000000000 x20: fffffff040af7100
[  228.693212] x19: fffffff040af5004 x18: 0000000000000000
[  228.693219] x17: 000000000000003c x16: ffffff9d4da649d8
[  228.693226] x15: 0000000000000006 x14: 3837363534333231
[  228.693233] x13: fffffff01514fc00 x12: 0000000000000000
[  228.693240] x11: 0000000000000000 x10: fffffff040af5004
[  228.693247] x9 : 0000000000000000 x8 : 0000000000000000
[  228.693254] x7 : 646b686064ff6e68 x6 : 0000800000000000
[  228.693261] x5 : 0000000000000000 x4 : 0000000000000000
[  228.693268] x3 : 000000000000002c x2 : 0000000000000000
[  228.693275] x1 : fffffff040af5004 x0 : 3433323130393837
[  228.693282] Call trace:
[  228.693291]  filter_match_preds+0x38/0x94
[  228.693299]  trace_event_buffer_commit+0x150/0x22c
[  228.693321]  trace_event_raw_event_ath10k_txrx_tx_unref+0x164/0x198 [ath10k_core]
[  228.693344]  ath10k_txrx_tx_unref+0x1c0/0x36c [ath10k_core]
[  228.693365]  ath10k_htt_htc_tx_complete+0xe4/0x114 [ath10k_core]
[  228.693386]  ath10k_htc_notify_tx_completion+0xe4/0x124 [ath10k_core]
[  228.693407]  ath10k_htc_tx_complete_work+0xb8/0xf0 [ath10k_core]
[  228.693418]  process_one_work+0x210/0x418
[  228.693425]  worker_thread+0x234/0x3dc
[  228.693433]  kthread+0x120/0x140
[  228.693441]  ret_from_fork+0x10/0x18
[  228.693451] Code: aa0103f3 aa1f03e8 2a1f03f5 aa1403f6 (f9400009)
[  228.693459] ---[ end trace 187b0ae372e7515c ]---
[  228.711074] Kernel panic - not syncing: Fatal exception
[  228.711086] SMP: stopping secondary CPUs
[  228.711254] Kernel Offset: 0x1d45600000 from 0xffffff8008000000
[  228.711259] CPU features: 0x0,2188200c
[  228.711262] Memory Limit: none

The value of prog in filter_match_preds of kernel/trace/trace_events_filter.c
is overwrite to the content of the UDP packets's content like this
0x0039383736353433, it is a invalid address, so crash.

Content of udp tx packets with iperf:
0000   00 03 7f 46 c4 49 40 49 0f 2f f9 09 08 00 45 00   ...F.I@I./....E.
0010   05 da 18 00 40 00 40 11 99 4a c0 a8 01 77 c0 a8   ....@.@..J...w..
0020   01 01 9c 7d 13 89 05 c6 89 a0 00 00 01 8b 5d 69   ...}..........]i
0030   98 9a 00 0a 55 b8 00 00 00 00 00 00 00 01 00 00   ....U...........
0040   13 89 00 00 00 00 00 00 00 00 ff ff f4 48 05 f5   .............H..
0050   e1 00 30 31 32 33 34 35 36 37 38 39 30 31 32 33   ..01234567890123
0060   34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 39   4567890123456789
0070   30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35   0123456789012345
0080   36 37 38 39 30 31 32 33 34 35 36 37 38 39 30 31   6789012345678901
0090   32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37   2345678901234567
00a0   38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33   8901234567890123
00b0   34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 39   4567890123456789
...
05b0   34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 39   4567890123456789
05c0   30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35   0123456789012345
05d0   36 37 38 39 30 31 32 33 34 35 36 37 38 39 30 31   6789012345678901
05e0   32 33 34 35 36 37 38 39                           23456789

ath10k_htc_send_bundle_skbs allocate skb with size 49792(1556*32), it
is bigger than PAGE_SIZE which is normally 4096, then ath10k_sdio_write
call ath10k_dbg_dump to trace the large size skb and corrupt the trace
data of tracing and lead crash. When disable the TX bundle of SDIO, it
does not crash, but TX bundle is for improve throughput, so it is enabled
by default. It is useless to call ath10k_dbg_dump to trace the large
bundled skb, so this patch trace the top part of large bundled skb.

I also did another try with below change, it is move the ath10k_dbg_dump
from ath10k_sdio_write to ath10k_sdio_mbox_rx_fetch_bundle, it also
crash immediately when run iperf UDP RX traffic, RX bundle is enabled
by default for SDIO, so the virt_pkt_len of ath10k_sdio_mbox_rx_fetch_bundle
is large, max value is 57344(1792*32), so it crashed same with upper test.

Change of try:

Message ID	1612839593-2308-1-git-send-email-wgong@codeaurora.org (mailing list archive)
State	Changes Requested
Delegated to:	Kalle Valo
Headers	show Return-Path: <linux-wireless-owner@kernel.org> Date: Subject: Cc: To: From: Sender; bh=xMgBiD+5AW08eVE+1rUZa7JU7MxxxYZe/YWY+zSZT6w=; b=LwZryzQPDvqq87Fz6W+bTg6hMdKoFQoDfiJinmqPlbUzxqfzzzNph/SOO09BnRxsyCKv9/6b I4ESByfRE4DObnYtOVUoeLpJHf4t/IsdTIstmazuGJDS2JlIQmVOwkpmPOh/Lftl+uUZZcjb 4PFAG16btC+RrXRQpz65sQU5H48= Sender: wgong=codeaurora.org@mg.codeaurora.org sender: wgong) by smtp.codeaurora.org (Postfix) with ESMTPSA id 16D7AC433ED; Tue, 9 Feb 2021 02:58:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 16D7AC433ED From: Wen Gong <wgong@codeaurora.org> To: ath10k@lists.infradead.org Cc: linux-wireless@vger.kernel.org, wgong@codeaurora.org Subject: [PATCH] ath10k: change len of trace_ath10k_log_dbg_dump for large buffer size Date: Tue, 9 Feb 2021 10:59:53 +0800 Message-Id: <1612839593-2308-1-git-send-email-wgong@codeaurora.org> Precedence: bulk
Series	ath10k: change len of trace_ath10k_log_dbg_dump for large buffer size \| expand ath10k: change len of trace_ath10k_log_dbg_dump for large buffer size

ath10k: change len of trace_ath10k_log_dbg_dump for large buffer size

Commit Message

Comments

Patch