Message ID | 20180723143751.10843-2-keith.busch@intel.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | [PATCHv2,1/2] blk-mq: export setting request completion state | expand |
On Mon, Jul 23, 2018 at 08:37:51AM -0600, Keith Busch wrote: > The scsi block layer requires requests claimed by the error handling be > completed by the error handler. A previous commit allowed completions > to proceed for blk-mq, breaking that assumption. > > This patch prevents completions that may race with the timeout handler > by marking the state to complete, restoring the previous behavior. > > Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce") > Signed-off-by: Keith Busch <keith.busch@intel.com> Looks good enough (reluctantly) for this merge window: Reviewed-by: Christoph Hellwig <hch@lst.de>
On 07/23/18 07:37, Keith Busch wrote: > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index 8932ae81a15a..2715cdaa669c 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req) > rtn = host->hostt->eh_timed_out(scmd); > > if (rtn == BLK_EH_DONE) { > + /* > + * For blk-mq, we must set the request state to complete now > + * before sending the request to the scsi error handler. This > + * will prevent a use-after-free in the event the LLD manages > + * to complete the request before the error handler finishes > + * processing this timed out request. > + * > + * If the request was already completed, then the LLD beat the > + * time out handler from transferring the request to the scsi > + * error handler. In that case we can return immediately as no > + * further action is required. > + */ > + if (req->q->mq_ops && !blk_mq_mark_complete(req)) > + return rtn; > if (scsi_abort_command(scmd) != SUCCESS) { > set_host_byte(scmd, DID_TIME_OUT); > scsi_eh_scmd_add(scmd); This change looks incomplete to me. I think the following scenario is not handled properly by the above patch: - host->hostt->eh_timed_out() gets called. - The request "req" completes from another context while host->hostt->eh_timed_out() is in progress. - host->hostt->eh_timed_out() calls scsi_finish_command(). I think that scenario will lead to a double completion. Bart.
On Tue, Jul 24, 2018 at 03:46:17PM -0700, Bart Van Assche wrote: > On 07/23/18 07:37, Keith Busch wrote: > > + if (req->q->mq_ops && !blk_mq_mark_complete(req)) > > + return rtn; > > if (scsi_abort_command(scmd) != SUCCESS) { > > set_host_byte(scmd, DID_TIME_OUT); > > scsi_eh_scmd_add(scmd); > > This change looks incomplete to me. I think the following scenario is not > handled properly by the above patch: > - host->hostt->eh_timed_out() gets called. > - The request "req" completes from another context while > host->hostt->eh_timed_out() is in progress. > - host->hostt->eh_timed_out() calls scsi_finish_command(). > > I think that scenario will lead to a double completion. A bit of a moot point, isn't it? Not a single scsi lld directly calls scsi_finish_command() from anywhere, much less through eh_timed_out().
On 2018-07-24 09:15 PM, Keith Busch wrote: > On Tue, Jul 24, 2018 at 03:46:17PM -0700, Bart Van Assche wrote: >> On 07/23/18 07:37, Keith Busch wrote: >>> + if (req->q->mq_ops && !blk_mq_mark_complete(req)) >>> + return rtn; >>> if (scsi_abort_command(scmd) != SUCCESS) { >>> set_host_byte(scmd, DID_TIME_OUT); >>> scsi_eh_scmd_add(scmd); >> >> This change looks incomplete to me. I think the following scenario is not >> handled properly by the above patch: >> - host->hostt->eh_timed_out() gets called. >> - The request "req" completes from another context while >> host->hostt->eh_timed_out() is in progress. >> - host->hostt->eh_timed_out() calls scsi_finish_command(). >> >> I think that scenario will lead to a double completion. > > A bit of a moot point, isn't it? Not a single scsi lld directly calls > scsi_finish_command() from anywhere, much less through eh_timed_out(). " * scsi_finish_command - cleanup and pass command back to upper layer" That is from the comment block above the definition of scsi_finish_command(). To me that means the mid-level has already got the response from the LLD (directly via a queuecommand() return, or asynchronously via the scsi_done() callback) and this function will call the "upper layer" which will be the one of the sd, sr, st, sg, ses, etc drivers that originated the command for that UL-driver's completion. In USB Type C speak, the SCSI mid level has two interfaces, one downward facing (toward LLDs) and one upward facing (toward the drivers listed above). No LLD should be calling scsi_finish_command() IMO. Doug Gilbert
On Tue, Jul 24, 2018 at 09:56:35PM -0400, dgilbert@interlog.com wrote:
> No LLD should be calling scsi_finish_command() IMO.
Agreed. I don't even think scsi's error handler should directly call it
either, but detangling that may take some time, if ever.
On Mon, 2018-07-23 at 08:37 -0600, Keith Busch wrote: > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index 8932ae81a15a..2715cdaa669c 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req) > rtn = host->hostt->eh_timed_out(scmd); > > if (rtn == BLK_EH_DONE) { > + /* > + * For blk-mq, we must set the request state to complete now > + * before sending the request to the scsi error handler. This > + * will prevent a use-after-free in the event the LLD manages > + * to complete the request before the error handler finishes > + * processing this timed out request. > + * > + * If the request was already completed, then the LLD beat the > + * time out handler from transferring the request to the scsi > + * error handler. In that case we can return immediately as no > + * further action is required. > + */ > + if (req->q->mq_ops && !blk_mq_mark_complete(req)) > + return rtn; > if (scsi_abort_command(scmd) != SUCCESS) { > set_host_byte(scmd, DID_TIME_OUT); > scsi_eh_scmd_add(scmd); Hello Keith, What will happen if a completion occurs after scsi_times_out() has started and before or during the host->hostt->eh_timed_out()? Can that cause a use-after-free in .eh_timed_out()? Can that cause .eh_timed_out() to return BLK_EH_RESET_TIMER when it should return BLK_EH_DONE? Can that cause blk_mq_rq_timed_out() to call blk_add_timer() when that function shouldn't be called? Thanks, Bart.
On Wed, Jul 25, 2018 at 03:52:17PM +0000, Bart Van Assche wrote: > On Mon, 2018-07-23 at 08:37 -0600, Keith Busch wrote: > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > > index 8932ae81a15a..2715cdaa669c 100644 > > --- a/drivers/scsi/scsi_error.c > > +++ b/drivers/scsi/scsi_error.c > > @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req) > > rtn = host->hostt->eh_timed_out(scmd); > > > > if (rtn == BLK_EH_DONE) { > > + /* > > + * For blk-mq, we must set the request state to complete now > > + * before sending the request to the scsi error handler. This > > + * will prevent a use-after-free in the event the LLD manages > > + * to complete the request before the error handler finishes > > + * processing this timed out request. > > + * > > + * If the request was already completed, then the LLD beat the > > + * time out handler from transferring the request to the scsi > > + * error handler. In that case we can return immediately as no > > + * further action is required. > > + */ > > + if (req->q->mq_ops && !blk_mq_mark_complete(req)) > > + return rtn; > > if (scsi_abort_command(scmd) != SUCCESS) { > > set_host_byte(scmd, DID_TIME_OUT); > > scsi_eh_scmd_add(scmd); > > Hello Keith, > > What will happen if a completion occurs after scsi_times_out() has started and > before or during the host->hostt->eh_timed_out()? Can that cause a use-after-free > in .eh_timed_out()? Can that cause .eh_timed_out() to return BLK_EH_RESET_TIMER > when it should return BLK_EH_DONE? Can that cause blk_mq_rq_timed_out() to call > blk_add_timer() when that function shouldn't be called? That's what the request's refcount protects. The whole point was that driver returning RESET_TIMER doesn't lose the completion. In the worst case scenario, the blk-mq timeout work spends some CPU cycles re-arming a timer that it didn't need to.
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 8932ae81a15a..2715cdaa669c 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req) rtn = host->hostt->eh_timed_out(scmd); if (rtn == BLK_EH_DONE) { + /* + * For blk-mq, we must set the request state to complete now + * before sending the request to the scsi error handler. This + * will prevent a use-after-free in the event the LLD manages + * to complete the request before the error handler finishes + * processing this timed out request. + * + * If the request was already completed, then the LLD beat the + * time out handler from transferring the request to the scsi + * error handler. In that case we can return immediately as no + * further action is required. + */ + if (req->q->mq_ops && !blk_mq_mark_complete(req)) + return rtn; if (scsi_abort_command(scmd) != SUCCESS) { set_host_byte(scmd, DID_TIME_OUT); scsi_eh_scmd_add(scmd);
The scsi block layer requires requests claimed by the error handling be completed by the error handler. A previous commit allowed completions to proceed for blk-mq, breaking that assumption. This patch prevents completions that may race with the timeout handler by marking the state to complete, restoring the previous behavior. Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce") Signed-off-by: Keith Busch <keith.busch@intel.com> --- v1 -> v2: Document why this is necessary in code comments. Update to API's changed return value drivers/scsi/scsi_error.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)