Message ID | 1491580154-30489-1-git-send-email-bryantly@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
On 4/7/17 10:49 AM, Bryant G. Ly wrote: > The driver is sending a response to the aborted task response > along with LIO sending the tmr response. SCSI spec says > that the initiator that sends the abort task TM NEVER gets a > response to the aborted op and with the current code it will > send a response. Thus this fix will remove that response if the > op is aborted. > > Cc: <stable@vger.kernel.org> # v4.8+ > Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> > --- > drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 60 +++++++++++++++++++++----------- > drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h | 1 + > 2 files changed, 40 insertions(+), 21 deletions(-) > > diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c > index 4bb5635..8e2733f 100644 > --- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c > +++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c > @@ -1169,6 +1169,7 @@ static struct ibmvscsis_cmd *ibmvscsis_get_free_cmd(struct scsi_info *vscsi) > cmd = list_first_entry_or_null(&vscsi->free_cmd, > struct ibmvscsis_cmd, list); > if (cmd) { > + cmd->flags &= ~(CMD_ABORTED); > list_del(&cmd->list); > cmd->iue = iue; > cmd->type = UNSET_TYPE; > @@ -1758,33 +1759,41 @@ static void ibmvscsis_send_messages(struct scsi_info *vscsi) > > if (!(vscsi->flags & RESPONSE_Q_DOWN)) { > list_for_each_entry_safe(cmd, nxt, &vscsi->waiting_rsp, list) { > - iue = cmd->iue; > + /* > + * If an Abort flag is set then dont send response > + */ > + if (cmd->flags & CMD_ABORTED) { > + list_del(&cmd->list); > + ibmvscsis_free_cmd_resources(vscsi, cmd); > + } else { > + iue = cmd->iue; > > - crq->valid = VALID_CMD_RESP_EL; > - crq->format = cmd->rsp.format; > + crq->valid = VALID_CMD_RESP_EL; > + crq->format = cmd->rsp.format; > > - if (cmd->flags & CMD_FAST_FAIL) > - crq->status = VIOSRP_ADAPTER_FAIL; > + if (cmd->flags & CMD_FAST_FAIL) > + crq->status = VIOSRP_ADAPTER_FAIL; > > - crq->IU_length = cpu_to_be16(cmd->rsp.len); > + crq->IU_length = cpu_to_be16(cmd->rsp.len); > > - rc = h_send_crq(vscsi->dma_dev->unit_address, > - be64_to_cpu(msg_hi), > - be64_to_cpu(cmd->rsp.tag)); > + rc = h_send_crq(vscsi->dma_dev->unit_address, > + be64_to_cpu(msg_hi), > + be64_to_cpu(cmd->rsp.tag)); > > - pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n", > - cmd, be64_to_cpu(cmd->rsp.tag), rc); > + pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n", > + cmd, be64_to_cpu(cmd->rsp.tag), rc); > > - /* if all ok free up the command element resources */ > - if (rc == H_SUCCESS) { > - /* some movement has occurred */ > - vscsi->rsp_q_timer.timer_pops = 0; > - list_del(&cmd->list); > + /* if all ok free up the command element resources */ > + if (rc == H_SUCCESS) { > + /* some movement has occurred */ > + vscsi->rsp_q_timer.timer_pops = 0; > + list_del(&cmd->list); > > - ibmvscsis_free_cmd_resources(vscsi, cmd); > - } else { > - srp_snd_msg_failed(vscsi, rc); > - break; > + ibmvscsis_free_cmd_resources(vscsi, cmd); > + } else { > + srp_snd_msg_failed(vscsi, rc); > + break; > + } > } > } > > @@ -3581,9 +3590,15 @@ static int ibmvscsis_write_pending(struct se_cmd *se_cmd) > { > struct ibmvscsis_cmd *cmd = container_of(se_cmd, struct ibmvscsis_cmd, > se_cmd); > + struct scsi_info *vscsi = cmd->adapter; > struct iu_entry *iue = cmd->iue; > int rc; > > + if ((vscsi->flags & (CLIENT_FAILED | RESPONSE_Q_DOWN))) { > + pr_err("write_pending failed since: %d\n", vscsi->flags); > + return -EIO; > + } > + > rc = srp_transfer_data(cmd, &vio_iu(iue)->srp.cmd, ibmvscsis_rdma, > 1, 1); > if (rc) { One question I had for you Nick was whether I should just return success if CLIENT_FAILED or RSP Q DOWN. I know LIO wants EAGAIN or ENONEM on write pending return codes, but in this case LIO prob doesn't care. It would produce less noise/warning if I were to just do success, what do you think? -Bryant
On 04/07/2017 08:49 AM, Bryant G. Ly wrote: > The driver is sending a response to the aborted task response ^^^^^^^^ That occurrence of "response" looks extraneous? > along with LIO sending the tmr response. SCSI spec says > that the initiator that sends the abort task TM NEVER gets a > response to the aborted op and with the current code it will > send a response. Thus this fix will remove that response if the > op is aborted. Hello Bryan, Are you sure that a new flag needed to prevent that a command response is sent to the initiator that submitted the ABORT? __target_check_io_state() only sets CMD_T_TAS if the TMF was received from another I_T nexus than the I_T nexus through which the aborted command was received. core_tmr_handle_tas_abort() only triggers a call to .queue_status() if CMD_T_TAS is set. Do you agree with this analysis? Thanks, Bart.
On Fri, 2017-04-07 at 11:12 -0500, Bryant G. Ly wrote: > So from this stack trace it looks like the ibmvscsis driver is sending an > extra response through send_messages even though an abort was issued. > IBMi, reported this issue internally when they were testing the driver, > because their client didn't like getting a response back for the aborted op. > They are only expecting a TMR from the abort request, NOT the aborted op. Hello Bryant, What is the root cause of this behavior? Why is it that the behavior of the ibmvscsi_tgt contradicts what has been implemented in the LIO core? Sorry but the patch at the start of this thread looks to me like an attempt to paper over the problem instead of addressing the root cause. Thanks, Bart.
On 4/7/17 11:36 AM, Bart Van Assche wrote: > On Fri, 2017-04-07 at 11:12 -0500, Bryant G. Ly wrote: >> So from this stack trace it looks like the ibmvscsis driver is sending an >> extra response through send_messages even though an abort was issued. >> IBMi, reported this issue internally when they were testing the driver, >> because their client didn't like getting a response back for the aborted op. >> They are only expecting a TMR from the abort request, NOT the aborted op. > Hello Bryant, > > What is the root cause of this behavior? Why is it that the behavior of > the ibmvscsi_tgt contradicts what has been implemented in the LIO core? > Sorry but the patch at the start of this thread looks to me like an > attempt to paper over the problem instead of addressing the root cause. > > Thanks, > IBMi clients received a response for an aborted operation, so they sent an abort tm request. Afterwards they get a CRQ response to the op that they aborted. That should not happen, because they are only supposed to get a response for the tm request NOT the aborted operation. Looking at the code it looks like it is due to send messages, processing a response without checking to see if it was an aborted op. This patch addresses a bug within the ibmvscsis driver and the fact that it SENT a response to the aborted operation(which is wrong since) without looking at what LIO core had done. The driver isn't supposed to send any response to the aborted operation, BUT only a response to the abort tm request, which LIO core currently does. -Bryant
On 4/7/17 12:01 PM, Bryant G. Ly wrote: > > > On 4/7/17 11:36 AM, Bart Van Assche wrote: >> On Fri, 2017-04-07 at 11:12 -0500, Bryant G. Ly wrote: >>> So from this stack trace it looks like the ibmvscsis driver is >>> sending an >>> extra response through send_messages even though an abort was issued. >>> IBMi, reported this issue internally when they were testing the driver, >>> because their client didn't like getting a response back for the >>> aborted op. >>> They are only expecting a TMR from the abort request, NOT the >>> aborted op. >> Hello Bryant, >> >> What is the root cause of this behavior? Why is it that the behavior of >> the ibmvscsi_tgt contradicts what has been implemented in the LIO core? >> Sorry but the patch at the start of this thread looks to me like an >> attempt to paper over the problem instead of addressing the root cause. >> >> Thanks, >> > IBMi clients received a response for an aborted operation, so they > sent an abort tm request. > Afterwards they get a CRQ response to the op that they aborted. That > should not happen, because they are only supposed to get a response > for the tm request NOT the aborted operation. > Looking at the code it looks like it is due to send messages, > processing a response without checking to see if it was an aborted op. > This patch addresses a bug within the ibmvscsis driver and the fact > that it SENT a response to the aborted operation(which is wrong since) > without looking at what LIO core had done. > The driver isn't supposed to send any response to the aborted > operation, BUT only a response to the abort tm request, which LIO core > currently does. > > -Bryant > I think I can clarify the issue here: ibmvscsis_tgt does not send the response to the client until release_cmd time. The reason for this was because if we did it at queue_status time, then the client would be free to reuse the tag for that command, but we're still using the tag until the command is released at release_cmd time, so we chose to delay sending the response until then. That then caused this issue, because release_cmd is always called, even if queue_status is not. Perhaps it would be cleaner to set some sort of status valid flag during queue_status instead of setting a flag in aborted_task?
On Fri, 2017-04-07 at 16:14 -0500, Michael Cyr wrote: > That then caused this issue, because release_cmd is always called, even > if queue_status is not. Perhaps it would be cleaner to set some sort of > status valid flag during queue_status instead of setting a flag in > aborted_task? Hello Michael, Thanks for the clarification. Have you already checked whether a new flag is really needed or whether checking CMD_T_TAS would be sufficient? Thanks, Bart.
diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c index 4bb5635..8e2733f 100644 --- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c +++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c @@ -1169,6 +1169,7 @@ static struct ibmvscsis_cmd *ibmvscsis_get_free_cmd(struct scsi_info *vscsi) cmd = list_first_entry_or_null(&vscsi->free_cmd, struct ibmvscsis_cmd, list); if (cmd) { + cmd->flags &= ~(CMD_ABORTED); list_del(&cmd->list); cmd->iue = iue; cmd->type = UNSET_TYPE; @@ -1758,33 +1759,41 @@ static void ibmvscsis_send_messages(struct scsi_info *vscsi) if (!(vscsi->flags & RESPONSE_Q_DOWN)) { list_for_each_entry_safe(cmd, nxt, &vscsi->waiting_rsp, list) { - iue = cmd->iue; + /* + * If an Abort flag is set then dont send response + */ + if (cmd->flags & CMD_ABORTED) { + list_del(&cmd->list); + ibmvscsis_free_cmd_resources(vscsi, cmd); + } else { + iue = cmd->iue; - crq->valid = VALID_CMD_RESP_EL; - crq->format = cmd->rsp.format; + crq->valid = VALID_CMD_RESP_EL; + crq->format = cmd->rsp.format; - if (cmd->flags & CMD_FAST_FAIL) - crq->status = VIOSRP_ADAPTER_FAIL; + if (cmd->flags & CMD_FAST_FAIL) + crq->status = VIOSRP_ADAPTER_FAIL; - crq->IU_length = cpu_to_be16(cmd->rsp.len); + crq->IU_length = cpu_to_be16(cmd->rsp.len); - rc = h_send_crq(vscsi->dma_dev->unit_address, - be64_to_cpu(msg_hi), - be64_to_cpu(cmd->rsp.tag)); + rc = h_send_crq(vscsi->dma_dev->unit_address, + be64_to_cpu(msg_hi), + be64_to_cpu(cmd->rsp.tag)); - pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n", - cmd, be64_to_cpu(cmd->rsp.tag), rc); + pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n", + cmd, be64_to_cpu(cmd->rsp.tag), rc); - /* if all ok free up the command element resources */ - if (rc == H_SUCCESS) { - /* some movement has occurred */ - vscsi->rsp_q_timer.timer_pops = 0; - list_del(&cmd->list); + /* if all ok free up the command element resources */ + if (rc == H_SUCCESS) { + /* some movement has occurred */ + vscsi->rsp_q_timer.timer_pops = 0; + list_del(&cmd->list); - ibmvscsis_free_cmd_resources(vscsi, cmd); - } else { - srp_snd_msg_failed(vscsi, rc); - break; + ibmvscsis_free_cmd_resources(vscsi, cmd); + } else { + srp_snd_msg_failed(vscsi, rc); + break; + } } } @@ -3581,9 +3590,15 @@ static int ibmvscsis_write_pending(struct se_cmd *se_cmd) { struct ibmvscsis_cmd *cmd = container_of(se_cmd, struct ibmvscsis_cmd, se_cmd); + struct scsi_info *vscsi = cmd->adapter; struct iu_entry *iue = cmd->iue; int rc; + if ((vscsi->flags & (CLIENT_FAILED | RESPONSE_Q_DOWN))) { + pr_err("write_pending failed since: %d\n", vscsi->flags); + return -EIO; + } + rc = srp_transfer_data(cmd, &vio_iu(iue)->srp.cmd, ibmvscsis_rdma, 1, 1); if (rc) { @@ -3674,7 +3689,10 @@ static void ibmvscsis_queue_tm_rsp(struct se_cmd *se_cmd) static void ibmvscsis_aborted_task(struct se_cmd *se_cmd) { - /* TBD: What (if anything) should we do here? */ + struct ibmvscsis_cmd *cmd = container_of(se_cmd, struct ibmvscsis_cmd, + se_cmd); + + cmd->flags |= CMD_ABORTED; pr_debug("ibmvscsis_aborted_task %p\n", se_cmd); } diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h index 98b0ca7..24db7a9 100644 --- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h +++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h @@ -171,6 +171,7 @@ struct ibmvscsis_cmd { unsigned char sense_buf[TRANSPORT_SENSE_BUFFER]; u64 init_time; #define CMD_FAST_FAIL BIT(0) +#define CMD_ABORTED BIT(1) u32 flags; char type; };
The driver is sending a response to the aborted task response along with LIO sending the tmr response. SCSI spec says that the initiator that sends the abort task TM NEVER gets a response to the aborted op and with the current code it will send a response. Thus this fix will remove that response if the op is aborted. Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> --- drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 60 +++++++++++++++++++++----------- drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h | 1 + 2 files changed, 40 insertions(+), 21 deletions(-)