Message ID | 1529027847-29085-1-git-send-email-jianchao.w.wang@oracle.com (mailing list archive) |
---|---|
State | Changes Requested |
On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang <jianchao.w.wang@oracle.com> wrote:
> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
> to complete the timed out request, however, for blk-legacy, the
> 'complete' is still marked, blk_complete_request will do nothing,
> we export __blk_complete_request for LLDD to complete the request
> in timeout path.
>
> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> ---
>  block/blk-softirq.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
> index 01e2b35..15c1f5e 100644
> --- a/block/blk-softirq.c
> +++ b/block/blk-softirq.c
> @@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
>
>  	local_irq_restore(flags);
>  }
> +EXPORT_SYMBOL(__blk_complete_request);
>
>  /**
>   * blk_complete_request - end I/O on a request
> --
> 2.7.4
>

Looks non-blk-mq timeout code need to convert to ref-counter
based approach too?

Thanks,
Ming Lei
Hi Ming

On 06/15/2018 10:17 AM, Ming Lei wrote:
> On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
> <jianchao.w.wang@oracle.com> wrote:
>> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
>> to complete the timed out request, however, for blk-legacy, the
>> 'complete' is still marked, blk_complete_request will do nothing,
>> we export __blk_complete_request for LLDD to complete the request
>> in timeout path.
>>
>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>> [...]
>
> Looks non-blk-mq timeout code need to convert to ref-counter
> based approach too?

IMO, ref-counter is just to fix the blk-mq req life recycle issue.
It cannot replace the blk_mark_rq_complete which could avoid the race between
timeout and io completion path.

Or maybe my understanding is wrong ...

Thanks
Jianchao
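For readers following the argument, the arbitration attributed to blk_mark_rq_complete above can be modeled in userspace as an atomic test-and-set: whichever of the timeout path and the I/O completion path sets the mark first owns the request, and the other backs off. This is only a sketch under that assumption. `struct model_request` and every `model_*` name are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct request; not the kernel type. */
struct model_request {
	atomic_flag complete;	/* models the legacy 'complete' mark */
	int completions;	/* how many times the request was ended */
};

#define MODEL_REQUEST_INIT { ATOMIC_FLAG_INIT, 0 }

/* Models the mark helper: returns true if the mark was already set. */
static bool model_mark_rq_complete(struct model_request *rq)
{
	return atomic_flag_test_and_set(&rq->complete);
}

/* Completion (IRQ) path: ends the request only if it wins the mark. */
static bool model_irq_completion(struct model_request *rq)
{
	if (model_mark_rq_complete(rq))
		return false;	/* the timeout path already owns the request */
	rq->completions++;
	return true;
}

/* Timeout path: proceeds only if it wins the mark. */
static bool model_timeout(struct model_request *rq)
{
	if (model_mark_rq_complete(rq))
		return false;	/* the completion path already owns the request */
	return true;		/* the driver's timeout handling would run here */
}
```

In this model, once `model_timeout()` has won, a late `model_irq_completion()` returns false and leaves `completions` untouched, which is the property the reply relies on.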
On 06/15/2018 10:22 AM, jianchao.wang wrote:
> Hi Ming
>
> On 06/15/2018 10:17 AM, Ming Lei wrote:
>> On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
>> <jianchao.w.wang@oracle.com> wrote:
>>> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
>>> to complete the timed out request, however, for blk-legacy, the
>>> 'complete' is still marked, blk_complete_request will do nothing,
>>> we export __blk_complete_request for LLDD to complete the request
>>> in timeout path.
>>>
>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>> [...]
>>
>> Looks non-blk-mq timeout code need to convert to ref-counter
>> based approach too?
>
> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
> It cannot replace the blk_mark_rq_complete which could avoid the race between
> timeout and io completion path.

The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
request is still in abort or eh process. What if a completion irq come during that ?

> Or maybe my understanding is wrong ...
>
> Thanks
> Jianchao
On Fri, Jun 15, 2018 at 10:22 AM, jianchao.wang <jianchao.w.wang@oracle.com> wrote:
> Hi Ming
>
> On 06/15/2018 10:17 AM, Ming Lei wrote:
>> On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
>> <jianchao.w.wang@oracle.com> wrote:
>>> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
>>> to complete the timed out request, however, for blk-legacy, the
>>> 'complete' is still marked, blk_complete_request will do nothing,
>>> we export __blk_complete_request for LLDD to complete the request
>>> in timeout path.
>>>
>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>> [...]
>>
>> Looks non-blk-mq timeout code need to convert to ref-counter
>> based approach too?
>
> IMO, ref-counter is just to fix the blk-mq req life recycle issue.

Just thought of that, it is one blk-mq specific issue.

> It cannot replace the blk_mark_rq_complete which could avoid the race between
> timeout and io completion path.
> Or maybe my understanding is wrong ...

I didn't mean that this patch is unnecessary. But the question is that
given driver has to deal with race between timeout and normal completion,
why don't you follow blk-mq's way to move the atomic state change into
__blk_complete_request()?

Thanks,
Ming Lei
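One possible reading of "move the atomic state change into the completion function" is a compare-and-swap on a request state, so that the completion entry point itself decides who ends the request. The sketch below models that idea in userspace C11. The state names and `model_complete_request()` are hypothetical and are not taken from the block layer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request states for this model; the names are illustrative. */
enum model_state { MODEL_IDLE, MODEL_IN_FLIGHT, MODEL_COMPLETE };

struct model_request {
	_Atomic int state;
	int completions;
};

/*
 * Completion entry point with the atomic state change folded in:
 * only the caller that moves IN_FLIGHT -> COMPLETE ends the request.
 */
static bool model_complete_request(struct model_request *rq)
{
	int expected = MODEL_IN_FLIGHT;

	if (!atomic_compare_exchange_strong(&rq->state, &expected,
					    MODEL_COMPLETE))
		return false;	/* already completed, or not in flight */
	rq->completions++;
	return true;
}
```

With this shape, the timeout path and the IRQ path can both call the same function, and at most one call per in-flight request takes effect.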
On Fri, Jun 15, 2018 at 10:44 AM, jianchao.wang <jianchao.w.wang@oracle.com> wrote:
> On 06/15/2018 10:22 AM, jianchao.wang wrote:
>> Hi Ming
>>
>> On 06/15/2018 10:17 AM, Ming Lei wrote:
>>> On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
>>> <jianchao.w.wang@oracle.com> wrote:
>>>> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
>>>> to complete the timed out request, however, for blk-legacy, the
>>>> 'complete' is still marked, blk_complete_request will do nothing,
>>>> we export __blk_complete_request for LLDD to complete the request
>>>> in timeout path.
>>>>
>>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>>> [...]
>>>
>>> Looks non-blk-mq timeout code need to convert to ref-counter
>>> based approach too?
>>
>> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
>> It cannot replace the blk_mark_rq_complete which could avoid the race between
>> timeout and io completion path.
>
> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
> request is still in abort or eh process. What if a completion irq come during that ?

For blk-mq, it is avoided by the atomic state change in
__blk_mq_complete_request(), that is why I mentioned the question in my
last reply.

But what if the timed-out request has been freed by EH? Then seems req's
ref_counter is still needed for non-mq?

Thanks,
Ming Lei
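The lifetime concern raised here (a timed-out request being freed while another path still looks at it) is the kind of problem an increment-unless-zero reference count addresses. A minimal userspace sketch of that pattern follows. `struct model_request`, `model_get_unless_zero()` and `model_put()` are hypothetical names, and the `freed` flag merely stands in for actually releasing the request.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted request. */
struct model_request {
	atomic_int ref;		/* 0 means the request has been released */
	bool freed;		/* set when the last reference is dropped */
};

/* Take a reference unless the count has already dropped to zero. */
static bool model_get_unless_zero(struct model_request *rq)
{
	int old = atomic_load(&rq->ref);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&rq->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put marks the request as freed. */
static void model_put(struct model_request *rq)
{
	if (atomic_fetch_sub(&rq->ref, 1) == 1)
		rq->freed = true;
}
```

A timeout scan that first calls `model_get_unless_zero()` keeps the request alive until its own `model_put()`, and it backs off cleanly if the request is already gone. Note that this protects lifetime only; it does not by itself decide which path completes the request.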
Hi Ming

Thanks for your kind response.

On 06/15/2018 10:56 AM, Ming Lei wrote:
>>> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
>>> It cannot replace the blk_mark_rq_complete which could avoid the race between
>>> timeout and io completion path.
>> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
>> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
>> request is still in abort or eh process. What if a completion irq come during that ?
> For blk-mq, it is avoided by the atomic state change in
> __blk_mq_complete_request(),
> that is why I mentioned the question in my last reply.
>

But blk_mq_check_expired doesn't do that. Am I missing anything?

> But what if the timed-out request has been freed by EH? Then seems
> req's ref_counter

Thanks
Jianchao
diff --git a/block/blk-softirq.c b/block/blk-softirq.c
index 01e2b35..15c1f5e 100644
--- a/block/blk-softirq.c
+++ b/block/blk-softirq.c
@@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
 
 	local_irq_restore(flags);
 }
+EXPORT_SYMBOL(__blk_complete_request);
 
 /**
  * blk_complete_request - end I/O on a request
After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
to complete the timed out request, however, for blk-legacy, the
'complete' is still marked, blk_complete_request will do nothing,
we export __blk_complete_request for LLDD to complete the request
in timeout path.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 block/blk-softirq.c | 1 +
 1 file changed, 1 insertion(+)
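The problem the commit message describes can be illustrated with a small userspace model: if the timer path has already set the 'complete' mark before the driver's timeout handler runs, a completion function that is guarded by the same mark becomes a no-op, and only an unguarded variant can still end the request. All names below are hypothetical stand-ins. `model_raw_complete_request()` plays the role that the commit message assigns to __blk_complete_request.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a legacy request. */
struct model_request {
	atomic_flag complete;	/* models the 'complete' mark */
	int completions;	/* how many times the request was ended */
};

#define MODEL_REQUEST_INIT { ATOMIC_FLAG_INIT, 0 }

/* Unguarded completion: always ends the request. */
static void model_raw_complete_request(struct model_request *rq)
{
	rq->completions++;
}

/* Guarded completion: does nothing if the mark is already set. */
static void model_complete_request(struct model_request *rq)
{
	if (!atomic_flag_test_and_set(&rq->complete))
		model_raw_complete_request(rq);
}

/* Timer side: sets the mark before the driver's timeout handler runs. */
static bool model_timer_claims(struct model_request *rq)
{
	return !atomic_flag_test_and_set(&rq->complete);
}
```

After `model_timer_claims()` succeeds, calling `model_complete_request()` from the timeout handler leaves `completions` at 0, while `model_raw_complete_request()` ends the request, which is why the driver needs access to the unguarded function in this model.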