Message ID | 20221002014047.23066-1-duoming@zju.edu.cn (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | scsi: target: iscsi: cxgbit: fix sleep-in-atomic-context bug in cxgbit_abort_conn | expand |
On Sun, Oct 02, 2022 at 09:40:47AM +0800, Duoming Zhou wrote: > > The function iscsit_handle_time2retain_timeout() is a timer handler that > runs in an atomic context, but it calls "alloc_skb(0, GFP_KERNEL | ...)" > that may sleep. As a result, the sleep-in-atomic-context bug will happen. > The process is shown below: > > iscsit_handle_time2retain_timeout() > iscsit_close_session() > iscsit_free_connection_recovery_entries() > iscsit_free_cmd() > __iscsit_free_cmd() > cxgbit_unmap_cmd() > cxgbit_abort_conn() > alloc_skb(0, GFP_KERNEL | ...) //may sleep > > This patch changes the gfp_t parameter of alloc_skb() from GFP_KERNEL to > GFP_ATOMIC in order to mitigate the bug. > > Fixes: 1ae01724ae92 ("cxgbit: Abort the TCP connection in case of data out timeout") > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > --- > drivers/target/iscsi/cxgbit/cxgbit_cm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c > index 3336d2b78bf..eb3da6d2c62 100644 > --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c > +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c > @@ -697,7 +697,7 @@ __cxgbit_abort_conn(struct cxgbit_sock *csk, struct sk_buff *skb) > > void cxgbit_abort_conn(struct cxgbit_sock *csk) > { > - struct sk_buff *skb = alloc_skb(0, GFP_KERNEL | __GFP_NOFAIL); > + struct sk_buff *skb = alloc_skb(0, GFP_ATOMIC | __GFP_NOFAIL); > > cxgbit_get_csk(csk); > cxgbit_init_wr_wait(&csk->com.wr_wait); > -- > 2.17.1 > The last line in cxgbit_abort_conn is cxgbit_wait_for_reply() which also should not be called in interrupt context. Anyway this issue is not due to cxgbit, it is common for iSCSI itself: iscsit_close_session() iscsit_free_connection_recovery_entries() iscsit_free_cmd() transport_generic_free_cmd() target_wait_free_cmd() wait_for_completion_timeout() IMHO, there is no reason to call iscsit_close_session in an atomic context. I have two patches relaited Time2Retain timer. I will share them today. BR, Dmitry
Hello, On Mon, 3 Oct 2022 17:46:02 +0300 Dmitry Bogdanov wrote: > On Sun, Oct 02, 2022 at 09:40:47AM +0800, Duoming Zhou wrote: > > > > The function iscsit_handle_time2retain_timeout() is a timer handler that > > runs in an atomic context, but it calls "alloc_skb(0, GFP_KERNEL | ...)" > > that may sleep. As a result, the sleep-in-atomic-context bug will happen. > > The process is shown below: > > > > iscsit_handle_time2retain_timeout() > > iscsit_close_session() > > iscsit_free_connection_recovery_entries() > > iscsit_free_cmd() > > __iscsit_free_cmd() > > cxgbit_unmap_cmd() > > cxgbit_abort_conn() > > alloc_skb(0, GFP_KERNEL | ...) //may sleep > > > > This patch changes the gfp_t parameter of alloc_skb() from GFP_KERNEL to > > GFP_ATOMIC in order to mitigate the bug. > > > > Fixes: 1ae01724ae92 ("cxgbit: Abort the TCP connection in case of data out timeout") > > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > > --- > > drivers/target/iscsi/cxgbit/cxgbit_cm.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c > > index 3336d2b78bf..eb3da6d2c62 100644 > > --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c > > +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c > > @@ -697,7 +697,7 @@ __cxgbit_abort_conn(struct cxgbit_sock *csk, struct sk_buff *skb) > > > > void cxgbit_abort_conn(struct cxgbit_sock *csk) > > { > > - struct sk_buff *skb = alloc_skb(0, GFP_KERNEL | __GFP_NOFAIL); > > + struct sk_buff *skb = alloc_skb(0, GFP_ATOMIC | __GFP_NOFAIL); > > > > cxgbit_get_csk(csk); > > cxgbit_init_wr_wait(&csk->com.wr_wait); > > -- > > 2.17.1 > > > > The last line in cxgbit_abort_conn is cxgbit_wait_for_reply() which > also should not be called in interrupt context. I agree with you. > Anyway this issue is not due to cxgbit, it is common for iSCSI itself: > iscsit_close_session() > iscsit_free_connection_recovery_entries() > iscsit_free_cmd() > transport_generic_free_cmd() > target_wait_free_cmd() > wait_for_completion_timeout() I understand. > IMHO, there is no reason to call iscsit_close_session in an atomic context. > I have two patches relaited Time2Retain timer. I will share them today. That's great, thank you! Best regards, Duoming Zhou
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c index 3336d2b78bf..eb3da6d2c62 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c @@ -697,7 +697,7 @@ __cxgbit_abort_conn(struct cxgbit_sock *csk, struct sk_buff *skb) void cxgbit_abort_conn(struct cxgbit_sock *csk) { - struct sk_buff *skb = alloc_skb(0, GFP_KERNEL | __GFP_NOFAIL); + struct sk_buff *skb = alloc_skb(0, GFP_ATOMIC | __GFP_NOFAIL); cxgbit_get_csk(csk); cxgbit_init_wr_wait(&csk->com.wr_wait);
The function iscsit_handle_time2retain_timeout() is a timer handler that runs in an atomic context, but it calls "alloc_skb(0, GFP_KERNEL | ...)" that may sleep. As a result, the sleep-in-atomic-context bug will happen. The process is shown below: iscsit_handle_time2retain_timeout() iscsit_close_session() iscsit_free_connection_recovery_entries() iscsit_free_cmd() __iscsit_free_cmd() cxgbit_unmap_cmd() cxgbit_abort_conn() alloc_skb(0, GFP_KERNEL | ...) //may sleep This patch changes the gfp_t parameter of alloc_skb() from GFP_KERNEL to GFP_ATOMIC in order to mitigate the bug. Fixes: 1ae01724ae92 ("cxgbit: Abort the TCP connection in case of data out timeout") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> --- drivers/target/iscsi/cxgbit/cxgbit_cm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)