Message ID | 20170504225102.8931-17-bart.vanassche@sandisk.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 05/05/2017 12:50 AM, Bart Van Assche wrote: > Fix the following deadlock: > > ====================================================== > [ INFO: possible circular locking dependency detected ] > 4.10.0-rc7-dbg+ #1 Not tainted > ------------------------------------------------------- > rmdir/13321 is trying to acquire lock: > (&sess->cmdsn_mutex){+.+.+.}, at: [<ffffffffa02cb47d>] iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] > > but task is already holding lock: > (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140 > > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > -> #1 (&sb->s_type->i_mutex_key#14){++++++}: > lock_acquire+0x71/0x90 > down_write+0x3f/0x70 > configfs_depend_item+0x3a/0xb0 [configfs] > target_depend_item+0x13/0x20 [target_core_mod] > target_xcopy_locate_se_dev_e4+0xdd/0x1a0 [target_core_mod] > target_do_xcopy+0x34b/0x970 [target_core_mod] > __target_execute_cmd+0x22/0xa0 [target_core_mod] > target_execute_cmd+0x233/0x2c0 [target_core_mod] > iscsit_execute_cmd+0x208/0x270 [iscsi_target_mod] > iscsit_sequence_cmd+0x10b/0x190 [iscsi_target_mod] > iscsit_get_rx_pdu+0x37d/0xcd0 [iscsi_target_mod] > iscsi_target_rx_thread+0x6e/0xa0 [iscsi_target_mod] > kthread+0x102/0x140 > ret_from_fork+0x31/0x40 > > -> #0 (&sess->cmdsn_mutex){+.+.+.}: > __lock_acquire+0x10e6/0x1260 > lock_acquire+0x71/0x90 > mutex_lock_nested+0x5f/0x670 > iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] > iscsit_close_session+0xac/0x200 [iscsi_target_mod] > lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod] > target_shutdown_sessions+0xc3/0xd0 [target_core_mod] > core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod] > target_fabric_nacl_base_release+0x20/0x30 [target_core_mod] > config_item_release+0x5a/0xc0 [configfs] > config_item_put+0x1d/0x1f [configfs] > configfs_rmdir+0x1a6/0x300 [configfs] > vfs_rmdir+0xb7/0x140 > do_rmdir+0x1f4/0x200 > SyS_rmdir+0x11/0x20 > entry_SYSCALL_64_fastpath+0x23/0xc6 > > other info that might help us debug this: > > Possible unsafe locking scenario: > CPU0 CPU1 > ---- ---- > lock(&sb->s_type->i_mutex_key#14); > lock(&sess->cmdsn_mutex); > lock(&sb->s_type->i_mutex_key#14); > lock(&sess->cmdsn_mutex); > > *** DEADLOCK *** > > 3 locks held by rmdir/13321: > #0: (sb_writers#10){.+.+.+}, at: [<ffffffff811e1aff>] mnt_want_write+0x1f/0x50 > #1: (&default_group_class[depth - 1]#2/1){+.+.+.}, at: [<ffffffff811cc8ce>] do_rmdir+0x15e/0x200 > #2: (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140 > > stack backtrace: > CPU: 2 PID: 13321 Comm: rmdir Not tainted 4.10.0-rc7-dbg+ #1 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 > Call Trace: > dump_stack+0x86/0xc3 > print_circular_bug+0x1c7/0x220 > __lock_acquire+0x10e6/0x1260 > lock_acquire+0x71/0x90 > mutex_lock_nested+0x5f/0x670 > iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] > iscsit_close_session+0xac/0x200 [iscsi_target_mod] > lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod] > target_shutdown_sessions+0xc3/0xd0 [target_core_mod] > core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod] > target_fabric_nacl_base_release+0x20/0x30 [target_core_mod] > config_item_release+0x5a/0xc0 [configfs] > config_item_put+0x1d/0x1f [configfs] > configfs_rmdir+0x1a6/0x300 [configfs] > vfs_rmdir+0xb7/0x140 > do_rmdir+0x1f4/0x200 > SyS_rmdir+0x11/0x20 > entry_SYSCALL_64_fastpath+0x23/0xc6 > > Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> > Cc: Hannes Reinecke <hare@suse.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Andy Grover <agrover@redhat.com> > Cc: David Disseldorp <ddiss@suse.de> > --- > drivers/target/iscsi/iscsi_target_erl1.c | 20 +++++++++++++++++++- > include/target/iscsi/iscsi_target_core.h | 1 + > 2 files changed, 20 insertions(+), 1 deletion(-) > Reviewed-by: Hannes Reinecke <hare@suse.com> Cheers, Hannes
On Thu, 2017-05-04 at 15:50 -0700, Bart Van Assche wrote: > Fix the following deadlock: > > ====================================================== > [ INFO: possible circular locking dependency detected ] > 4.10.0-rc7-dbg+ #1 Not tainted > ------------------------------------------------------- > rmdir/13321 is trying to acquire lock: > (&sess->cmdsn_mutex){+.+.+.}, at: [<ffffffffa02cb47d>] iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] > > but task is already holding lock: > (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140 > > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > -> #1 (&sb->s_type->i_mutex_key#14){++++++}: > lock_acquire+0x71/0x90 > down_write+0x3f/0x70 > configfs_depend_item+0x3a/0xb0 [configfs] > target_depend_item+0x13/0x20 [target_core_mod] > target_xcopy_locate_se_dev_e4+0xdd/0x1a0 [target_core_mod] > target_do_xcopy+0x34b/0x970 [target_core_mod] > __target_execute_cmd+0x22/0xa0 [target_core_mod] > target_execute_cmd+0x233/0x2c0 [target_core_mod] > iscsit_execute_cmd+0x208/0x270 [iscsi_target_mod] > iscsit_sequence_cmd+0x10b/0x190 [iscsi_target_mod] > iscsit_get_rx_pdu+0x37d/0xcd0 [iscsi_target_mod] > iscsi_target_rx_thread+0x6e/0xa0 [iscsi_target_mod] > kthread+0x102/0x140 > ret_from_fork+0x31/0x40 > > -> #0 (&sess->cmdsn_mutex){+.+.+.}: > __lock_acquire+0x10e6/0x1260 > lock_acquire+0x71/0x90 > mutex_lock_nested+0x5f/0x670 > iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] > iscsit_close_session+0xac/0x200 [iscsi_target_mod] > lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod] > target_shutdown_sessions+0xc3/0xd0 [target_core_mod] > core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod] > target_fabric_nacl_base_release+0x20/0x30 [target_core_mod] > config_item_release+0x5a/0xc0 [configfs] > config_item_put+0x1d/0x1f [configfs] > configfs_rmdir+0x1a6/0x300 [configfs] > vfs_rmdir+0xb7/0x140 > do_rmdir+0x1f4/0x200 > SyS_rmdir+0x11/0x20 > entry_SYSCALL_64_fastpath+0x23/0xc6 > > other info that might help us debug this: > > Possible unsafe locking scenario: > CPU0 CPU1 > ---- ---- > lock(&sb->s_type->i_mutex_key#14); > lock(&sess->cmdsn_mutex); > lock(&sb->s_type->i_mutex_key#14); > lock(&sess->cmdsn_mutex); > > *** DEADLOCK *** > > 3 locks held by rmdir/13321: > #0: (sb_writers#10){.+.+.+}, at: [<ffffffff811e1aff>] mnt_want_write+0x1f/0x50 > #1: (&default_group_class[depth - 1]#2/1){+.+.+.}, at: [<ffffffff811cc8ce>] do_rmdir+0x15e/0x200 > #2: (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140 > > stack backtrace: > CPU: 2 PID: 13321 Comm: rmdir Not tainted 4.10.0-rc7-dbg+ #1 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 > Call Trace: > dump_stack+0x86/0xc3 > print_circular_bug+0x1c7/0x220 > __lock_acquire+0x10e6/0x1260 > lock_acquire+0x71/0x90 > mutex_lock_nested+0x5f/0x670 > iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] > iscsit_close_session+0xac/0x200 [iscsi_target_mod] > lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod] > target_shutdown_sessions+0xc3/0xd0 [target_core_mod] > core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod] > target_fabric_nacl_base_release+0x20/0x30 [target_core_mod] > config_item_release+0x5a/0xc0 [configfs] > config_item_put+0x1d/0x1f [configfs] > configfs_rmdir+0x1a6/0x300 [configfs] > vfs_rmdir+0xb7/0x140 > do_rmdir+0x1f4/0x200 > SyS_rmdir+0x11/0x20 > entry_SYSCALL_64_fastpath+0x23/0xc6 > > Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> > Cc: Hannes Reinecke <hare@suse.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Andy Grover <agrover@redhat.com> > Cc: David Disseldorp <ddiss@suse.de> > --- > drivers/target/iscsi/iscsi_target_erl1.c | 20 +++++++++++++++++++- > include/target/iscsi/iscsi_target_core.h | 1 + > 2 files changed, 20 insertions(+), 1 deletion(-) > > diff --git a/drivers/target/iscsi/iscsi_target_erl1.c b/drivers/target/iscsi/iscsi_target_erl1.c > index fe9b7f1e44ac..454f493174db 100644 > --- a/drivers/target/iscsi/iscsi_target_erl1.c > +++ b/drivers/target/iscsi/iscsi_target_erl1.c > @@ -912,6 +912,24 @@ int iscsit_execute_ooo_cmdsns(struct iscsi_session *sess) > } > > /* > + * Since iscsit_execute_cmd() can be called with cmdsn_mutex held, execute > + * target_execute_cmd() asynchronously to avoid that target_do_xcopy() > + * deadlocks. > + */ > +static void iscsit_execute_work(struct work_struct *w) > +{ > + struct iscsi_cmd *cmd = container_of(w, typeof(*cmd), work); > + > + target_execute_cmd(&cmd->se_cmd); > +} > + > +static void iscsit_execute_cmd_async(struct iscsi_cmd *cmd) > +{ > + INIT_WORK(&cmd->work, iscsit_execute_work); > + WARN_ON_ONCE(!schedule_work(&cmd->work)); > +} > + > +/* > * Called either: > * > * 1. With sess->cmdsn_mutex held from iscsi_execute_ooo_cmdsns() > @@ -968,7 +986,7 @@ int iscsit_execute_cmd(struct iscsi_cmd *cmd, int ooo) > if (cmd->immediate_data) { > if (cmd->cmd_flags & ICF_GOT_LAST_DATAOUT) { > spin_unlock_bh(&cmd->istate_lock); > - target_execute_cmd(&cmd->se_cmd); > + iscsit_execute_cmd_async(cmd); > return 0; > } > spin_unlock_bh(&cmd->istate_lock); > diff --git a/include/target/iscsi/iscsi_target_core.h b/include/target/iscsi/iscsi_target_core.h > index 275581d483dd..da7bc5097f5d 100644 > --- a/include/target/iscsi/iscsi_target_core.h > +++ b/include/target/iscsi/iscsi_target_core.h > @@ -349,6 +349,7 @@ struct iscsi_r2t { > } ____cacheline_aligned; > > struct iscsi_cmd { > + struct work_struct work; > enum iscsi_timer_flags_table dataout_timer_flags; > /* DataOUT timeout retries */ > u8 dataout_timeout_retries; So every iscsit_execute_work() now has to perform an extra context switch for every WRITE, just to avoid a locking dependency for EXTENDED_COPY. Introducing this performance regression for every WRITE is unacceptable. The proper fix here is to simply move the parsing of segment and target descriptors out of iscsit_execute_cmd() -> target_execute_cmd() -> target_do_xcopy() context, and into target_xcopy_do_work(). -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
====================================================== [ INFO: possible circular locking dependency detected ] 4.10.0-rc7-dbg+ #1 Not tainted ------------------------------------------------------- rmdir/13321 is trying to acquire lock: (&sess->cmdsn_mutex){+.+.+.}, at: [<ffffffffa02cb47d>] iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] but task is already holding lock: (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#14){++++++}: lock_acquire+0x71/0x90 down_write+0x3f/0x70 configfs_depend_item+0x3a/0xb0 [configfs] target_depend_item+0x13/0x20 [target_core_mod] target_xcopy_locate_se_dev_e4+0xdd/0x1a0 [target_core_mod] target_do_xcopy+0x34b/0x970 [target_core_mod] __target_execute_cmd+0x22/0xa0 [target_core_mod] target_execute_cmd+0x233/0x2c0 [target_core_mod] iscsit_execute_cmd+0x208/0x270 [iscsi_target_mod] iscsit_sequence_cmd+0x10b/0x190 [iscsi_target_mod] iscsit_get_rx_pdu+0x37d/0xcd0 [iscsi_target_mod] iscsi_target_rx_thread+0x6e/0xa0 [iscsi_target_mod] kthread+0x102/0x140 ret_from_fork+0x31/0x40 -> #0 (&sess->cmdsn_mutex){+.+.+.}: __lock_acquire+0x10e6/0x1260 lock_acquire+0x71/0x90 mutex_lock_nested+0x5f/0x670 iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] iscsit_close_session+0xac/0x200 [iscsi_target_mod] lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod] target_shutdown_sessions+0xc3/0xd0 [target_core_mod] core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod] target_fabric_nacl_base_release+0x20/0x30 [target_core_mod] config_item_release+0x5a/0xc0 [configfs] config_item_put+0x1d/0x1f [configfs] configfs_rmdir+0x1a6/0x300 [configfs] vfs_rmdir+0xb7/0x140 do_rmdir+0x1f4/0x200 SyS_rmdir+0x11/0x20 entry_SYSCALL_64_fastpath+0x23/0xc6 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#14); lock(&sess->cmdsn_mutex); lock(&sb->s_type->i_mutex_key#14); lock(&sess->cmdsn_mutex); *** DEADLOCK *** 3 locks held by rmdir/13321: #0: (sb_writers#10){.+.+.+}, at: [<ffffffff811e1aff>] mnt_want_write+0x1f/0x50 #1: (&default_group_class[depth - 1]#2/1){+.+.+.}, at: [<ffffffff811cc8ce>] do_rmdir+0x15e/0x200 #2: (&sb->s_type->i_mutex_key#14){++++++}, at: [<ffffffff811c6e20>] vfs_rmdir+0x50/0x140 stack backtrace: CPU: 2 PID: 13321 Comm: rmdir Not tainted 4.10.0-rc7-dbg+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0x86/0xc3 print_circular_bug+0x1c7/0x220 __lock_acquire+0x10e6/0x1260 lock_acquire+0x71/0x90 mutex_lock_nested+0x5f/0x670 iscsit_free_all_ooo_cmdsns+0x2d/0xb0 [iscsi_target_mod] iscsit_close_session+0xac/0x200 [iscsi_target_mod] lio_tpg_close_session+0x9f/0xb0 [iscsi_target_mod] target_shutdown_sessions+0xc3/0xd0 [target_core_mod] core_tpg_del_initiator_node_acl+0x91/0x140 [target_core_mod] target_fabric_nacl_base_release+0x20/0x30 [target_core_mod] config_item_release+0x5a/0xc0 [configfs] config_item_put+0x1d/0x1f [configfs] configfs_rmdir+0x1a6/0x300 [configfs] vfs_rmdir+0xb7/0x140 do_rmdir+0x1f4/0x200 SyS_rmdir+0x11/0x20 entry_SYSCALL_64_fastpath+0x23/0xc6 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> --- drivers/target/iscsi/iscsi_target_erl1.c | 20 +++++++++++++++++++- include/target/iscsi/iscsi_target_core.h | 1 + 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/drivers/target/iscsi/iscsi_target_erl1.c b/drivers/target/iscsi/iscsi_target_erl1.c index fe9b7f1e44ac..454f493174db 100644 --- a/drivers/target/iscsi/iscsi_target_erl1.c +++ b/drivers/target/iscsi/iscsi_target_erl1.c @@ -912,6 +912,24 @@ int iscsit_execute_ooo_cmdsns(struct iscsi_session *sess) } /* + * Since iscsit_execute_cmd() can be called with cmdsn_mutex held, execute + * target_execute_cmd() asynchronously to avoid that target_do_xcopy() + * deadlocks. + */ +static void iscsit_execute_work(struct work_struct *w) +{ + struct iscsi_cmd *cmd = container_of(w, typeof(*cmd), work); + + target_execute_cmd(&cmd->se_cmd); +} + +static void iscsit_execute_cmd_async(struct iscsi_cmd *cmd) +{ + INIT_WORK(&cmd->work, iscsit_execute_work); + WARN_ON_ONCE(!schedule_work(&cmd->work)); +} + +/* * Called either: * * 1. With sess->cmdsn_mutex held from iscsi_execute_ooo_cmdsns() @@ -968,7 +986,7 @@ int iscsit_execute_cmd(struct iscsi_cmd *cmd, int ooo) if (cmd->immediate_data) { if (cmd->cmd_flags & ICF_GOT_LAST_DATAOUT) { spin_unlock_bh(&cmd->istate_lock); - target_execute_cmd(&cmd->se_cmd); + iscsit_execute_cmd_async(cmd); return 0; } spin_unlock_bh(&cmd->istate_lock); diff --git a/include/target/iscsi/iscsi_target_core.h b/include/target/iscsi/iscsi_target_core.h index 275581d483dd..da7bc5097f5d 100644 --- a/include/target/iscsi/iscsi_target_core.h +++ b/include/target/iscsi/iscsi_target_core.h @@ -349,6 +349,7 @@ struct iscsi_r2t { } ____cacheline_aligned; struct iscsi_cmd { + struct work_struct work; enum iscsi_timer_flags_table dataout_timer_flags; /* DataOUT timeout retries */ u8 dataout_timeout_retries;