Message ID | 20170215002612.14566-15-bart.vanassche@sandisk.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Tue, 2017-02-14 at 16:25 -0800, Bart Van Assche wrote: > For several target drivers (e.g. ib_srpt and ib_isert) sleeping inside > transport_generic_free_cmd() causes RDMA completion processing to stall. > Hence only sleep inside this function if the (iSCSI) target driver > requires this. > > This patch avoids that messages similar to the following appear in the > kernel log: > > INFO: task kworker/u25:0:1013 blocked for more than 480 seconds. > Tainted: G W 4.10.0-rc7-dbg+ #3 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > kworker/u25:0 D 0 1013 2 0x00000000 > Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] > Call Trace: > __schedule+0x2da/0xb00 > schedule+0x38/0x90 > schedule_timeout+0x2fe/0x640 > wait_for_completion+0xfe/0x160 > transport_generic_free_cmd+0x2e/0x80 [target_core_mod] > srpt_send_done+0x59/0x9f [ib_srpt] > __ib_process_cq+0x4b/0xd0 [ib_core] > ib_cq_poll_work+0x1b/0x60 [ib_core] > process_one_work+0x208/0x6a0 > worker_thread+0x49/0x4a0 > kthread+0x107/0x140 > ret_from_fork+0x2e/0x40 > > Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> > Cc: <stable@vger.kernel.org> > --- > drivers/target/target_core_transport.c | 9 +++------ > 1 file changed, 3 insertions(+), 6 deletions(-) > > diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c > index 2ed9721a7202..ab1c493a9a16 100644 > --- a/drivers/target/target_core_transport.c > +++ b/drivers/target/target_core_transport.c > @@ -2504,15 +2504,13 @@ int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks) > int ret = 0; > bool aborted = false, tas = false; > > - if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD)) { > - if (wait_for_tasks && (cmd->se_cmd_flags & SCF_SCSI_TMR_CDB)) > - target_wait_free_cmd(cmd, &aborted, &tas); > + if (wait_for_tasks) > + target_wait_free_cmd(cmd, &aborted, &tas); > > + if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD)) { > if (!aborted || tas) > ret = transport_put_cmd(cmd); > } else { > - if (wait_for_tasks) > - target_wait_free_cmd(cmd, &aborted, &tas); > /* > * Handle WRITE failure case where transport_generic_new_cmd() > * has already added se_cmd to state_list, but fabric has > @@ -2535,7 +2533,6 @@ int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks) > */ > if (aborted) { > pr_debug("Detected CMD_T_ABORTED for ITT: %llu\n", cmd->tag); > - wait_for_completion(&cmd->cmd_wait_comp); > cmd->se_tfo->release_cmd(cmd); > ret = 1; > } Removing the aborted wait_for_completion breaks the second order scenario described for iscsi-target previously here: http://www.spinics.net/lists/target-devel/msg14456.html where a concurrent CMD_T_ABORTED occurs while an iscsi connection is shutting down, and transport_generic_free_cmd() must not return until cmd->cmd_kref reaches zero and does the complete. However, looking at transport_generic_free_cmd() in the context of ib_srpt and ib_isert, I don't see how this scenario can ever happen in as-is in upstream code. Namely, all of the transport_generic_free_cmd() callers in ib_srpt and ib_isert pass wait_for_tasks = false. And the only time transport_generic_free_cmd() will block is when wait_for_tasks = true, or when wait_for_tasks = true && abort = true. So as-is in upstream, how can transport_generic_free_cmd() ever block when wait_for_tasks = false..? -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/20/2017 01:38 PM, Nicholas A. Bellinger wrote: > However, looking at transport_generic_free_cmd() in the context of > ib_srpt and ib_isert, I don't see how this scenario can ever happen in > as-is in upstream code. > > Namely, all of the transport_generic_free_cmd() callers in ib_srpt and > ib_isert pass wait_for_tasks = false. And the only time > transport_generic_free_cmd() will block is when wait_for_tasks = true, > or when wait_for_tasks = true && abort = true. > > So as-is in upstream, how can transport_generic_free_cmd() ever block > when wait_for_tasks = false..? I will check the other patches in this series for changes that trigger a call to wait_for_completion() if wait_for_tasks == false. Bart. -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2017-02-21 at 18:58 +0000, Bart Van Assche wrote: > On 02/20/2017 01:38 PM, Nicholas A. Bellinger wrote: > > However, looking at transport_generic_free_cmd() in the context of > > ib_srpt and ib_isert, I don't see how this scenario can ever happen in > > as-is in upstream code. > > > > Namely, all of the transport_generic_free_cmd() callers in ib_srpt and > > ib_isert pass wait_for_tasks = false. And the only time > > transport_generic_free_cmd() will block is when wait_for_tasks = true, > > or when wait_for_tasks = true && abort = true. > > > > So as-is in upstream, how can transport_generic_free_cmd() ever block > > when wait_for_tasks = false..? > > I will check the other patches in this series for changes that trigger a > call to wait_for_completion() if wait_for_tasks == false. > Given there has not been any follow-up to identify a case in upstream where ib_isert or ib_srpt is using target_generic_free_cmd() with wait_for_tasks = true or a case where wait_for_tasks = false blocks, I'll conclude this was a bugfix incorrectly CC'ed to stable for breakage introduced by other changes in this series. To reiterate the importance of having bug-fixes, especially those intended for stable, always be leading other patches.. There is no way for a maintainer to know which bug-fixes are to existing code unless they precede all other patches and not intermixed with various other changes. The bug-fixes to existing upstream code need to be tested on their own without the other changes (especially those that effect the same area) invalidating the tests. -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
T24gV2VkLCAyMDE3LTAzLTAxIGF0IDIxOjIxIC0wODAwLCBOaWNob2xhcyBBLiBCZWxsaW5nZXIg d3JvdGU6DQo+IFRvIHJlaXRlcmF0ZSB0aGUgaW1wb3J0YW5jZSBvZiBoYXZpbmcgYnVnLWZpeGVz LCBlc3BlY2lhbGx5IHRob3NlDQo+IGludGVuZGVkIGZvciBzdGFibGUsIGFsd2F5cyBiZSBsZWFk aW5nIG90aGVyIHBhdGNoZXMuLg0KDQpZb3Ugc2hvdWxkIGtub3cgdGhhdCBJIGNhbid0IGdyb3Vw IGFsbCBidWdmaXhlcyBhdCB0aGUgc3RhcnQgb2YgdGhlIHNlcmllcw0KYmVjYXVzZSBzb21lIG9m IHRoZSBidWdmaXhlcyBkZXBlbmQgb24gcGF0Y2hlcyB0aGF0IGFyZSBub3QgYnVnZml4ZXMuDQoN CkJhcnQu -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2017-03-02 at 05:24 +0000, Bart Van Assche wrote: > On Wed, 2017-03-01 at 21:21 -0800, Nicholas A. Bellinger wrote: > > To reiterate the importance of having bug-fixes, especially those > > intended for stable, always be leading other patches.. > > You should know that I can't group all bugfixes at the start of the series > because some of the bugfixes depend on patches that are not bugfixes. > That makes no sense. Either it's a bug-fix to existing upstream code, or it's not. This patch was not a bug-fix to upstream, because it detailed a scenario that doesn't existing in upstream. That is, a case where ib_isert or ib_srpt calls target_generic_free_cmd() with wait_for_tasks = true or a case where wait_for_tasks = false blocks. -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index 2ed9721a7202..ab1c493a9a16 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -2504,15 +2504,13 @@ int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks) int ret = 0; bool aborted = false, tas = false; - if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD)) { - if (wait_for_tasks && (cmd->se_cmd_flags & SCF_SCSI_TMR_CDB)) - target_wait_free_cmd(cmd, &aborted, &tas); + if (wait_for_tasks) + target_wait_free_cmd(cmd, &aborted, &tas); + if (!(cmd->se_cmd_flags & SCF_SE_LUN_CMD)) { if (!aborted || tas) ret = transport_put_cmd(cmd); } else { - if (wait_for_tasks) - target_wait_free_cmd(cmd, &aborted, &tas); /* * Handle WRITE failure case where transport_generic_new_cmd() * has already added se_cmd to state_list, but fabric has @@ -2535,7 +2533,6 @@ int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks) */ if (aborted) { pr_debug("Detected CMD_T_ABORTED for ITT: %llu\n", cmd->tag); - wait_for_completion(&cmd->cmd_wait_comp); cmd->se_tfo->release_cmd(cmd); ret = 1; }
For several target drivers (e.g. ib_srpt and ib_isert) sleeping inside transport_generic_free_cmd() causes RDMA completion processing to stall. Hence only sleep inside this function if the (iSCSI) target driver requires this. This patch avoids that messages similar to the following appear in the kernel log: INFO: task kworker/u25:0:1013 blocked for more than 480 seconds. Tainted: G W 4.10.0-rc7-dbg+ #3 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u25:0 D 0 1013 2 0x00000000 Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] Call Trace: __schedule+0x2da/0xb00 schedule+0x38/0x90 schedule_timeout+0x2fe/0x640 wait_for_completion+0xfe/0x160 transport_generic_free_cmd+0x2e/0x80 [target_core_mod] srpt_send_done+0x59/0x9f [ib_srpt] __ib_process_cq+0x4b/0xd0 [ib_core] ib_cq_poll_work+0x1b/0x60 [ib_core] process_one_work+0x208/0x6a0 worker_thread+0x49/0x4a0 kthread+0x107/0x140 ret_from_fork+0x2e/0x40 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: <stable@vger.kernel.org> --- drivers/target/target_core_transport.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)