Message ID | 20240413230030.390563-2-lewis.robbins2@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Ping-Ke Shih |
Headers | show |
Series | wifi: rtw88: reduce failed to flush queue severity | expand |
Lewis Robbins <lewis.robbins2@gmail.com> wrote: > > Reduce the log message severity when we fail to flush device priority > queue. If a system has a lot of traffic, we may fail to flush the queue > in time. This generates a lot of messages in the kernel ring buffer. As > this is a common occurrence, we should use dev_info instead of dev_warn. > > Signed-off-by: Lewis Robbins <lewis.robbins2@gmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> I'd like to know situations of " If a system has a lot of traffic...". Did you scan or do something during traffic?
Ping-Ke Shih <pkshih@realtek.com> writes: > Lewis Robbins <lewis.robbins2@gmail.com> wrote: >> >> Reduce the log message severity when we fail to flush device priority >> queue. If a system has a lot of traffic, we may fail to flush the queue >> in time. This generates a lot of messages in the kernel ring buffer. As >> this is a common occurrence, we should use dev_info instead of dev_warn. >> >> Signed-off-by: Lewis Robbins <lewis.robbins2@gmail.com> > > Acked-by: Ping-Ke Shih <pkshih@realtek.com> > > I'd like to know situations of " If a system has a lot of traffic...". > Did you scan or do something during traffic? The driver shouldn't print any warnings in normal usage, even using info level. If this is expected scenario then maybe change it to debug print? Or if is this an actual bug then it's better try to investigate and fix it.
Ping-Ke Shih <pkshih@realtek.com> writes: > Lewis Robbins <lewis.robbins2@gmail.com> wrote: >> >> Reduce the log message severity when we fail to flush device priority >> queue. If a system has a lot of traffic, we may fail to flush the queue >> in time. This generates a lot of messages in the kernel ring buffer. As >> this is a common occurrence, we should use dev_info instead of dev_warn. >> >> Signed-off-by: Lewis Robbins <lewis.robbins2@gmail.com> > > Acked-by: Ping-Ke Shih <pkshih@realtek.com> > > I'd like to know situations of " If a system has a lot of traffic...". > Did you scan or do something during traffic? So, after digging a bit more, it seems you're right this only happens during a scan. The log message itself is repeated about 5-10x. Kalle Valo <kvalo@kernel.org> writes: > The driver shouldn't print any warnings in normal usage, even using info > level. If this is expected scenario then maybe change it to debug print? > Or if is this an actual bug then it's better try to investigate and fix > it. I have the stack-trace: [23838.633664] rtw_8821ce 0000:02:00.0: timed out to flush queue 2 [23838.633685] CPU: 1 PID: 363059 Comm: kworker/u8:1 Tainted G 6.8.5 [23838.633698] Hardware name: /, BIOS 5.26 09/26/2023 [23838.633704] Workqueue: events_unbound cfg80211_wiphy_work [cfg80211] [23838.633881] Call Trace: [23838.633889] <TASK> [23838.633898] dump_stack_lvl+0x47/0x60 [23838.633918] rtw_mac_flush_queues+0x148/0x190 [rtw88_core 0d7ad2d9d6116c633c0aab4e7bc6016d572d75d4] [23838.633993] rtw_ops_flush+0x5a/0x70 [rtw88_core 0d7ad2d9d6116c633c0aab4e7bc6016d572d75d4] [23838.634056] __ieee80211_flush_queues+0x10b/0x2e0 [mac80211 5d0b446baffe1290bc56d55aa496e941688b7b40] [23838.634309] ieee80211_scan_work+0x3e3/0x520 [mac80211 5d0b446baffe1290bc56d55aa496e941688b7b40] [23838.634494] cfg80211_wiphy_work+0xa7/0xe0 [cfg80211 b36d5437ba649ace42ea92e8f83a3ec499e0d5b7] [23838.634646] process_one_work+0x178/0x350 [23838.634660] worker_thread+0x30f/0x450 [23838.634670] ? __pfx_worker_thread+0x10/0x10 [23838.634678] kthread+0xe5/0x120 [23838.634691] ? __pfx_kthread+0x10/0x10 [23838.634702] ret_from_fork+0x31/0x50 [23838.634714] ? __pfx_kthread+0x10/0x10 [23838.634724] ret_from_fork_asm+0x1b/0x30 [23838.634736] </TASK> I'm not sure as to the cause. If the flush operation takes a long time do we need to release any mutexes etc? And if this is just a hardware issue, then we can do a debug print as you say. BugZilla: https://bugzilla.kernel.org/show_bug.cgi?id=218697
Lewis Robbins <lewis.robbins2@gmail.com> wrote: > > Ping-Ke Shih <pkshih@realtek.com> writes: > > > Lewis Robbins <lewis.robbins2@gmail.com> wrote: > >> > >> Reduce the log message severity when we fail to flush device priority > >> queue. If a system has a lot of traffic, we may fail to flush the queue > >> in time. This generates a lot of messages in the kernel ring buffer. As > >> this is a common occurrence, we should use dev_info instead of dev_warn. > >> > >> Signed-off-by: Lewis Robbins <lewis.robbins2@gmail.com> > > > > Acked-by: Ping-Ke Shih <pkshih@realtek.com> > > > > I'd like to know situations of " If a system has a lot of traffic...". > > Did you scan or do something during traffic? > > So, after digging a bit more, it seems you're right this only happens during a > scan. The log message itself is repeated about 5-10x. That is the same as my test before. > > I'm not sure as to the cause. If the flush operation takes a long time do we > need to release any mutexes etc? And if this is just a hardware issue, then we > can do a debug print as you say. The cause is because packets in hardware TX queue that can't be sent out in time, and flush ops with 'drop = false', so driver throws one warning. I don't have good idea for now. Maybe, we can add a special debug mask to replace this kind of verbose warning with uncertain solution.
> > The cause is because packets in hardware TX queue that can't be sent out in time, > and flush ops with 'drop = false', so driver throws one warning. I don't have > good idea for now. Maybe, we can add a special debug mask to replace this kind of > verbose warning with uncertain solution. I have made a patch [1] as mentioned before. Lewis, please give it a try. [1] https://lore.kernel.org/linux-wireless/20240417014036.11234-1-pkshih@realtek.com/T/#u
diff --git a/drivers/net/wireless/realtek/rtw88/mac.c b/drivers/net/wireless/realtek/rtw88/mac.c index 0c1c1ff31085..800f4eabae13 100644 --- a/drivers/net/wireless/realtek/rtw88/mac.c +++ b/drivers/net/wireless/realtek/rtw88/mac.c @@ -1040,7 +1040,7 @@ static void __rtw_mac_flush_prio_queue(struct rtw_dev *rtwdev, * And it requires like ~2secs to flush the full priority queue. */ if (!drop) - rtw_warn(rtwdev, "timed out to flush queue %d\n", prio_queue); + rtw_info(rtwdev, "timed out to flush queue %d\n", prio_queue); } static void rtw_mac_flush_prio_queues(struct rtw_dev *rtwdev,
Reduce the log message severity when we fail to flush device priority queue. If a system has a lot of traffic, we may fail to flush the queue in time. This generates a lot of messages in the kernel ring buffer. As this is a common occurrence, we should use dev_info instead of dev_warn. Signed-off-by: Lewis Robbins <lewis.robbins2@gmail.com> --- drivers/net/wireless/realtek/rtw88/mac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)