diff mbox series

[net-next,v1] net: wwan: t7xx: Fix FSM command timeout issue

Message ID 20241212105555.10364-1-jinjian.song@fibocom.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next,v1] net: wwan: t7xx: Fix FSM command timeout issue | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 17 of 17 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/checkpatch warning WARNING: The commit message has 'Call Trace:', perhaps it also needs a 'Fixes:' tag?
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest fail net-next-2024-12-12--12-00 (tests: 764)

Commit Message

Jinjian Song Dec. 12, 2024, 10:55 a.m. UTC
When driver processes the internal state change command, it use
asynchronous thread to process the command operation. If the main
thread detects that the task has timed out, the asynchronous thread
will panic when executing te completion notification because the
main thread completion object is released.

BUG: unable to handle page fault for address: fffffffffffffff8
PGD 1f283a067 P4D 1f283a067 PUD 1f283c067 PMD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:complete_all+0x3e/0xa0
[...]
Call Trace:
 <TASK>
 ? __die_body+0x68/0xb0
 ? page_fault_oops+0x379/0x3e0
 ? exc_page_fault+0x69/0xa0
 ? asm_exc_page_fault+0x22/0x30
 ? complete_all+0x3e/0xa0
 fsm_main_thread+0xa3/0x9c0 [mtk_t7xx (HASH:1400 5)]
 ? __pfx_autoremove_wake_function+0x10/0x10
 kthread+0xd8/0x110
 ? __pfx_fsm_main_thread+0x10/0x10 [mtk_t7xx (HASH:1400 5)]
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x38/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>
[...]
CR2: fffffffffffffff8
---[ end trace 0000000000000000 ]---

After the main thread determines that the task has timed out, mark
the completion invalid, and add judgment in the asynchronous task.

Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
---
 drivers/net/wwan/t7xx/t7xx_state_monitor.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Mateusz Polchlopek Dec. 12, 2024, 11:01 a.m. UTC | #1
On 12/12/2024 11:55 AM, Jinjian Song wrote:
> When driver processes the internal state change command, it use
> asynchronous thread to process the command operation. If the main
> thread detects that the task has timed out, the asynchronous thread
> will panic when executing te completion notification because the
> main thread completion object is released.
> 
> BUG: unable to handle page fault for address: fffffffffffffff8
> PGD 1f283a067 P4D 1f283a067 PUD 1f283c067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> RIP: 0010:complete_all+0x3e/0xa0
> [...]
> Call Trace:
>   <TASK>
>   ? __die_body+0x68/0xb0
>   ? page_fault_oops+0x379/0x3e0
>   ? exc_page_fault+0x69/0xa0
>   ? asm_exc_page_fault+0x22/0x30
>   ? complete_all+0x3e/0xa0
>   fsm_main_thread+0xa3/0x9c0 [mtk_t7xx (HASH:1400 5)]
>   ? __pfx_autoremove_wake_function+0x10/0x10
>   kthread+0xd8/0x110
>   ? __pfx_fsm_main_thread+0x10/0x10 [mtk_t7xx (HASH:1400 5)]
>   ? __pfx_kthread+0x10/0x10
>   ret_from_fork+0x38/0x50
>   ? __pfx_kthread+0x10/0x10
>   ret_from_fork_asm+0x1b/0x30
>   </TASK>
> [...]
> CR2: fffffffffffffff8
> ---[ end trace 0000000000000000 ]---
> 
> After the main thread determines that the task has timed out, mark
> the completion invalid, and add judgment in the asynchronous task.
> 
> Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
> ---
>   drivers/net/wwan/t7xx/t7xx_state_monitor.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
> index 3931c7a13f5a..57f1a7730fff 100644
> --- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c
> +++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
> @@ -108,7 +108,8 @@ static void fsm_finish_command(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_command
>   {
>   	if (cmd->flag & FSM_CMD_FLAG_WAIT_FOR_COMPLETION) {
>   		*cmd->ret = result;
> -		complete_all(cmd->done);
> +		if (cmd->done)
> +			complete_all(cmd->done);
>   	}
>   
>   	kfree(cmd);
> @@ -503,8 +504,10 @@ int t7xx_fsm_append_cmd(struct t7xx_fsm_ctl *ctl, enum t7xx_fsm_cmd_state cmd_id
>   
>   		wait_ret = wait_for_completion_timeout(&done,
>   						       msecs_to_jiffies(FSM_CMD_TIMEOUT_MS));
> -		if (!wait_ret)
> +		if (!wait_ret) {
> +			cmd->done = NULL;
>   			return -ETIMEDOUT;
> +		}
>   
>   		return ret;
>   	}

If this is a fix then should be targeted to net and not net-next
and probably should have Fixes: tag.
Bjorn Helgaas Dec. 12, 2024, 4:15 p.m. UTC | #2
On Thu, Dec 12, 2024 at 06:55:55PM +0800, Jinjian Song wrote:
> When driver processes the internal state change command, it use
> asynchronous thread to process the command operation. If the main
> thread detects that the task has timed out, the asynchronous thread
> will panic when executing te completion notification because the
> main thread completion object is released.

s/it use/it uses an/
s/te/the/
s/is released/has been released/
diff mbox series

Patch

diff --git a/drivers/net/wwan/t7xx/t7xx_state_monitor.c b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
index 3931c7a13f5a..57f1a7730fff 100644
--- a/drivers/net/wwan/t7xx/t7xx_state_monitor.c
+++ b/drivers/net/wwan/t7xx/t7xx_state_monitor.c
@@ -108,7 +108,8 @@  static void fsm_finish_command(struct t7xx_fsm_ctl *ctl, struct t7xx_fsm_command
 {
 	if (cmd->flag & FSM_CMD_FLAG_WAIT_FOR_COMPLETION) {
 		*cmd->ret = result;
-		complete_all(cmd->done);
+		if (cmd->done)
+			complete_all(cmd->done);
 	}
 
 	kfree(cmd);
@@ -503,8 +504,10 @@  int t7xx_fsm_append_cmd(struct t7xx_fsm_ctl *ctl, enum t7xx_fsm_cmd_state cmd_id
 
 		wait_ret = wait_for_completion_timeout(&done,
 						       msecs_to_jiffies(FSM_CMD_TIMEOUT_MS));
-		if (!wait_ret)
+		if (!wait_ret) {
+			cmd->done = NULL;
 			return -ETIMEDOUT;
+		}
 
 		return ret;
 	}