diff mbox

[PATCH-3.18.y,2/2] iscsi-target: Fix iscsi_np reset hung task during parallel delete

Message ID 1510813060-15334-3-git-send-email-nab@linux-iscsi.org (mailing list archive)
State New, archived
Headers show

Commit Message

Nicholas A. Bellinger Nov. 16, 2017, 6:17 a.m. UTC
From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch fixes a bug associated with iscsit_reset_np_thread()
that can occur during parallel configfs rmdir of a single iscsi_np
used across multiple iscsi-target instances, that would result in
hung task(s) similar to below where configfs rmdir process context
was blocked indefinately waiting for iscsi_np->np_restart_comp
to finish:

[ 6726.112076] INFO: task dcp_proxy_node_:15550 blocked for more than 120 seconds.
[ 6726.119440]       Tainted: G        W  O     4.1.26-3321 #2
[ 6726.125045] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6726.132927] dcp_proxy_node_ D ffff8803f202bc88     0 15550      1 0x00000000
[ 6726.140058]  ffff8803f202bc88 ffff88085c64d960 ffff88083b3b1ad0 ffff88087fffeb08
[ 6726.147593]  ffff8803f202c000 7fffffffffffffff ffff88083f459c28 ffff88083b3b1ad0
[ 6726.155132]  ffff88035373c100 ffff8803f202bca8 ffffffff8168ced2 ffff8803f202bcb8
[ 6726.162667] Call Trace:
[ 6726.165150]  [<ffffffff8168ced2>] schedule+0x32/0x80
[ 6726.170156]  [<ffffffff8168f5b4>] schedule_timeout+0x214/0x290
[ 6726.176030]  [<ffffffff810caef2>] ? __send_signal+0x52/0x4a0
[ 6726.181728]  [<ffffffff8168d7d6>] wait_for_completion+0x96/0x100
[ 6726.187774]  [<ffffffff810e7c80>] ? wake_up_state+0x10/0x10
[ 6726.193395]  [<ffffffffa035d6e2>] iscsit_reset_np_thread+0x62/0xe0 [iscsi_target_mod]
[ 6726.201278]  [<ffffffffa0355d86>] iscsit_tpg_disable_portal_group+0x96/0x190 [iscsi_target_mod]
[ 6726.210033]  [<ffffffffa0363f7f>] lio_target_tpg_store_enable+0x4f/0xc0 [iscsi_target_mod]
[ 6726.218351]  [<ffffffff81260c5a>] configfs_write_file+0xaa/0x110
[ 6726.224392]  [<ffffffff811ea364>] vfs_write+0xa4/0x1b0
[ 6726.229576]  [<ffffffff811eb111>] SyS_write+0x41/0xb0
[ 6726.234659]  [<ffffffff8169042e>] system_call_fastpath+0x12/0x71

It would happen because each iscsit_reset_np_thread() sets state
to ISCSI_NP_THREAD_RESET, sends SIGINT, and then blocks waiting
for completion on iscsi_np->np_restart_comp.

However, if iscsi_np was active processing a login request and
more than a single iscsit_reset_np_thread() caller to the same
iscsi_np was blocked on iscsi_np->np_restart_comp, iscsi_np
kthread process context in __iscsi_target_login_thread() would
flush pending signals and only perform a single completion of
np->np_restart_comp before going back to sleep within transport
specific iscsit_transport->iscsi_accept_np code.

To address this bug, add a iscsi_np->np_reset_count and update
__iscsi_target_login_thread() to keep completing np->np_restart_comp
until ->np_reset_count has reached zero.

Reported-by: Gary Guo <ghg@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/target/iscsi/iscsi_target.c       | 1 +
 drivers/target/iscsi/iscsi_target_core.h  | 1 +
 drivers/target/iscsi/iscsi_target_login.c | 7 +++++--
 include/target/iscsi/iscsi_target_core.h  | 1 +
 4 files changed, 8 insertions(+), 2 deletions(-)

Comments

Greg KH Nov. 16, 2017, 4:46 p.m. UTC | #1
On Thu, Nov 16, 2017 at 06:17:40AM +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger <nab@linux-iscsi.org>
> 
> This patch fixes a bug associated with iscsit_reset_np_thread()
> that can occur during parallel configfs rmdir of a single iscsi_np
> used across multiple iscsi-target instances, that would result in
> hung task(s) similar to below where configfs rmdir process context
> was blocked indefinately waiting for iscsi_np->np_restart_comp
> to finish:
> 
> [ 6726.112076] INFO: task dcp_proxy_node_:15550 blocked for more than 120 seconds.
> [ 6726.119440]       Tainted: G        W  O     4.1.26-3321 #2
> [ 6726.125045] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 6726.132927] dcp_proxy_node_ D ffff8803f202bc88     0 15550      1 0x00000000
> [ 6726.140058]  ffff8803f202bc88 ffff88085c64d960 ffff88083b3b1ad0 ffff88087fffeb08
> [ 6726.147593]  ffff8803f202c000 7fffffffffffffff ffff88083f459c28 ffff88083b3b1ad0
> [ 6726.155132]  ffff88035373c100 ffff8803f202bca8 ffffffff8168ced2 ffff8803f202bcb8
> [ 6726.162667] Call Trace:
> [ 6726.165150]  [<ffffffff8168ced2>] schedule+0x32/0x80
> [ 6726.170156]  [<ffffffff8168f5b4>] schedule_timeout+0x214/0x290
> [ 6726.176030]  [<ffffffff810caef2>] ? __send_signal+0x52/0x4a0
> [ 6726.181728]  [<ffffffff8168d7d6>] wait_for_completion+0x96/0x100
> [ 6726.187774]  [<ffffffff810e7c80>] ? wake_up_state+0x10/0x10
> [ 6726.193395]  [<ffffffffa035d6e2>] iscsit_reset_np_thread+0x62/0xe0 [iscsi_target_mod]
> [ 6726.201278]  [<ffffffffa0355d86>] iscsit_tpg_disable_portal_group+0x96/0x190 [iscsi_target_mod]
> [ 6726.210033]  [<ffffffffa0363f7f>] lio_target_tpg_store_enable+0x4f/0xc0 [iscsi_target_mod]
> [ 6726.218351]  [<ffffffff81260c5a>] configfs_write_file+0xaa/0x110
> [ 6726.224392]  [<ffffffff811ea364>] vfs_write+0xa4/0x1b0
> [ 6726.229576]  [<ffffffff811eb111>] SyS_write+0x41/0xb0
> [ 6726.234659]  [<ffffffff8169042e>] system_call_fastpath+0x12/0x71
> 
> It would happen because each iscsit_reset_np_thread() sets state
> to ISCSI_NP_THREAD_RESET, sends SIGINT, and then blocks waiting
> for completion on iscsi_np->np_restart_comp.
> 
> However, if iscsi_np was active processing a login request and
> more than a single iscsit_reset_np_thread() caller to the same
> iscsi_np was blocked on iscsi_np->np_restart_comp, iscsi_np
> kthread process context in __iscsi_target_login_thread() would
> flush pending signals and only perform a single completion of
> np->np_restart_comp before going back to sleep within transport
> specific iscsit_transport->iscsi_accept_np code.
> 
> To address this bug, add a iscsi_np->np_reset_count and update
> __iscsi_target_login_thread() to keep completing np->np_restart_comp
> until ->np_reset_count has reached zero.
> 
> Reported-by: Gary Guo <ghg@datera.io>
> Tested-by: Gary Guo <ghg@datera.io>
> Cc: Mike Christie <mchristi@redhat.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: stable@vger.kernel.org # 3.10+
> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
> ---
>  drivers/target/iscsi/iscsi_target.c       | 1 +
>  drivers/target/iscsi/iscsi_target_core.h  | 1 +
>  drivers/target/iscsi/iscsi_target_login.c | 7 +++++--
>  include/target/iscsi/iscsi_target_core.h  | 1 +
>  4 files changed, 8 insertions(+), 2 deletions(-)

What is the git commit id of this patch in Linus's tree?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mike Christie Nov. 16, 2017, 6:35 p.m. UTC | #2
On 11/16/2017 10:46 AM, Greg-KH wrote:
> 
> What is the git commit id of this patch in Linus's tree?
> 

It's 978d13d60c34818a41fc35962602bdfa5c03f214

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Greg KH Nov. 19, 2017, 10:23 a.m. UTC | #3
On Thu, Nov 16, 2017 at 12:35:32PM -0600, Mike Christie wrote:
> On 11/16/2017 10:46 AM, Greg-KH wrote:
> > 
> > What is the git commit id of this patch in Linus's tree?
> > 
> 
> It's 978d13d60c34818a41fc35962602bdfa5c03f214

Thanks, now queued up.

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 801369f..2fc6b49 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -428,6 +428,7 @@  int iscsit_reset_np_thread(
 		return 0;
 	}
 	np->np_thread_state = ISCSI_NP_THREAD_RESET;
+	atomic_inc(&np->np_reset_count);
 
 	if (np->np_thread) {
 		spin_unlock_bh(&np->np_thread_lock);
diff --git a/drivers/target/iscsi/iscsi_target_core.h b/drivers/target/iscsi/iscsi_target_core.h
index 1863de2..bf3da8e 100644
--- a/drivers/target/iscsi/iscsi_target_core.h
+++ b/drivers/target/iscsi/iscsi_target_core.h
@@ -783,6 +783,7 @@  struct iscsi_np {
 	int			np_sock_type;
 	enum np_thread_state_table np_thread_state;
 	bool                    enabled;
+	atomic_t		np_reset_count;
 	enum iscsi_timer_flags_table np_login_timer_flags;
 	u32			np_exports;
 	enum np_flags_table	np_flags;
diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c
index 4608d8d..540af1b 100644
--- a/drivers/target/iscsi/iscsi_target_login.c
+++ b/drivers/target/iscsi/iscsi_target_login.c
@@ -1275,9 +1275,11 @@  static int __iscsi_target_login_thread(struct iscsi_np *np)
 	flush_signals(current);
 
 	spin_lock_bh(&np->np_thread_lock);
-	if (np->np_thread_state == ISCSI_NP_THREAD_RESET) {
+	if (atomic_dec_if_positive(&np->np_reset_count) >= 0) {
 		np->np_thread_state = ISCSI_NP_THREAD_ACTIVE;
+		spin_unlock_bh(&np->np_thread_lock);
 		complete(&np->np_restart_comp);
+		return 1;
 	} else if (np->np_thread_state == ISCSI_NP_THREAD_SHUTDOWN) {
 		spin_unlock_bh(&np->np_thread_lock);
 		goto exit;
@@ -1310,7 +1312,8 @@  static int __iscsi_target_login_thread(struct iscsi_np *np)
 		goto exit;
 	} else if (rc < 0) {
 		spin_lock_bh(&np->np_thread_lock);
-		if (np->np_thread_state == ISCSI_NP_THREAD_RESET) {
+		if (atomic_dec_if_positive(&np->np_reset_count) >= 0) {
+			np->np_thread_state = ISCSI_NP_THREAD_ACTIVE;
 			spin_unlock_bh(&np->np_thread_lock);
 			complete(&np->np_restart_comp);
 			iscsit_put_transport(conn->conn_transport);
diff --git a/include/target/iscsi/iscsi_target_core.h b/include/target/iscsi/iscsi_target_core.h
index 7bd03f8..e365ed9 100644
--- a/include/target/iscsi/iscsi_target_core.h
+++ b/include/target/iscsi/iscsi_target_core.h
@@ -784,6 +784,7 @@  struct iscsi_np {
 	int			np_sock_type;
 	enum np_thread_state_table np_thread_state;
 	bool                    enabled;
+	atomic_t		np_reset_count;
 	enum iscsi_timer_flags_table np_login_timer_flags;
 	u32			np_exports;
 	enum np_flags_table	np_flags;