[Question] Why queue_work unneeded for REQUEUE bio

Message ID	20201103092329.17694-1-jefflexu@linux.alibaba.com (mailing list archive)
State	Rejected, archived
Delegated to:	Mike Snitzer
Headers	show Return-Path: <SRS0=NHnu=EJ=redhat.com=dm-devel-bounces@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DAB9220735 From: Jeffle Xu <jefflexu@linux.alibaba.com> To: snitzer@redhat.com Date: Tue, 3 Nov 2020 17:23:29 +0800 Message-Id: <20201103092329.17694-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 X-Mimecast-Impersonation-Protect: Policy=CLT - Impersonation Protection Definition; Similar Internal Domain=false; Similar Monitored External Domain=false; Custom External Domain=false; Mimecast External Domain=false; Newly Observed Domain=false; Internal User Name=false; Custom Display Name List=false; Reply-to Address Mismatch=false; Targeted Threat Dictionary=false; Mimecast Threat Dictionary=false; Custom Threat Dictionary=false Cc: joseph.qi@linux.alibaba.com, dm-devel@redhat.com Subject: [dm-devel] [Question] Why queue_work unneeded for REQUEUE bio Precedence: junk Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit
Series	[Question] Why queue_work unneeded for REQUEUE bio \| expand [Question] Why queue_work unneeded for REQUEUE bio

Message ID

20201103092329.17694-1-jefflexu@linux.alibaba.com (mailing list archive)

State

Rejected, archived

Delegated to:

Mike Snitzer

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DAB9220735
From: Jeffle Xu <jefflexu@linux.alibaba.com>
To: snitzer@redhat.com
Date: Tue,  3 Nov 2020 17:23:29 +0800
Message-Id: <20201103092329.17694-1-jefflexu@linux.alibaba.com>
MIME-Version: 1.0
Cc: joseph.qi@linux.alibaba.com, dm-devel@redhat.com
Subject: [dm-devel] [Question] Why queue_work unneeded for REQUEUE bio
Precedence: junk
Sender: dm-devel-bounces@redhat.com
Errors-To: dm-devel-bounces@redhat.com
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Series

[Question] Why queue_work unneeded for REQUEUE bio | expand

Comments

Mike Snitzer Nov. 3, 2020, 1:48 p.m. UTC | #1

On Tue, Nov 03 2020 at  4:23am -0500,
Jeffle Xu <jefflexu@linux.alibaba.com> wrote:

> Hi Mike,
> 
> Why queue_work() is unnecessary here for bio with BLK_STS_DM_REQUEUE
> returned?
> 
> Thanks
> Jeffle Xu
> 
> ---
>  drivers/md/dm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index c18fc2548518..ae550daa99b5 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -908,9 +908,11 @@ static void dec_pending(struct dm_io *io, blk_status_t error)
>  			 * Target requested pushing back the I/O.
>  			 */
>  			spin_lock_irqsave(&md->deferred_lock, flags);
> -			if (__noflush_suspending(md))
> +			if (__noflush_suspending(md)) {
>  				/* NOTE early return due to BLK_STS_DM_REQUEUE below */
>  				bio_list_add_head(&md->deferred, io->orig_bio);
> +				queue_work(md->wq, &md->work);
> +			}
>  			else
>  				/* noflush suspend was interrupted. */
>  				io->status = BLK_STS_IOERR;
> -- 
> 2.27.0
> 

For the case you highlighted (BLK_STS_DM_REQUEUE + __noflush_suspending)
I think the missing queue_work is because we're actively dealing with
the fact that we do _not_ want to flush IO.  SO kicking the workqueue
there isn't helpful because it just processes work that cannot be issued
yet -- the workqueue will be kicked upon resume (see __dm_resume's
dm_queue_flush).

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Jingbo Xu Nov. 4, 2020, 1:34 a.m. UTC | #2

On 11/3/20 9:48 PM, Mike Snitzer wrote:
> On Tue, Nov 03 2020 at  4:23am -0500,
> Jeffle Xu <jefflexu@linux.alibaba.com> wrote:
>
>> Hi Mike,
>>
>> Why queue_work() is unnecessary here for bio with BLK_STS_DM_REQUEUE
>> returned?
>>
>> Thanks
>> Jeffle Xu
>>
>> ---
>>   drivers/md/dm.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
>> index c18fc2548518..ae550daa99b5 100644
>> --- a/drivers/md/dm.c
>> +++ b/drivers/md/dm.c
>> @@ -908,9 +908,11 @@ static void dec_pending(struct dm_io *io, blk_status_t error)
>>   			 * Target requested pushing back the I/O.
>>   			 */
>>   			spin_lock_irqsave(&md->deferred_lock, flags);
>> -			if (__noflush_suspending(md))
>> +			if (__noflush_suspending(md)) {
>>   				/* NOTE early return due to BLK_STS_DM_REQUEUE below */
>>   				bio_list_add_head(&md->deferred, io->orig_bio);
>> +				queue_work(md->wq, &md->work);
>> +			}
>>   			else
>>   				/* noflush suspend was interrupted. */
>>   				io->status = BLK_STS_IOERR;
>> -- 
>> 2.27.0
>>
> For the case you highlighted (BLK_STS_DM_REQUEUE + __noflush_suspending)
> I think the missing queue_work is because we're actively dealing with
> the fact that we do _not_ want to flush IO.  SO kicking the workqueue
> there isn't helpful because it just processes work that cannot be issued
> yet -- the workqueue will be kicked upon resume (see __dm_resume's
> dm_queue_flush).


Got it. Thanks.

If we are in process of DMF_NOFLUSH_SUSPENDING, then the BLK_STS_DM_REQUEUE

may be impacted by the suspending, then re-enqueue the bio to @deferred 
list. Or just pop

out BLK_STS_IOERR error.

Jingbo Xu Nov. 6, 2020, 3:49 a.m. UTC | #3

Hi Mike,


I have another question about dm, though it's irrelevant to this original

mail.


Currently abnormal IO will call blk_queue_split() and normal IO will

be split considering @max_sectors/@chunk_sectos (in max_io_len()).


Question 1: Why bio should be split considering queue_limits in dm layer?

After all the underlying device will split themselves by queue_limits if

the dm layer doesn't split by queue_limits.


Then Question 2: Currently only @max_sectors is considered when splitting

normal IO in dm layer. Should we also consider 
@max_segments/@max_segment_size

as blk_queue_split() does?


Thanks,

Jeffle


On 11/3/20 5:23 PM, Jeffle Xu wrote:
> Hi Mike,
>
> Why queue_work() is unnecessary here for bio with BLK_STS_DM_REQUEUE
> returned?
>
> Thanks
> Jeffle Xu
>
> ---
>   drivers/md/dm.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index c18fc2548518..ae550daa99b5 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -908,9 +908,11 @@ static void dec_pending(struct dm_io *io, blk_status_t error)
>   			 * Target requested pushing back the I/O.
>   			 */
>   			spin_lock_irqsave(&md->deferred_lock, flags);
> -			if (__noflush_suspending(md))
> +			if (__noflush_suspending(md)) {
>   				/* NOTE early return due to BLK_STS_DM_REQUEUE below */
>   				bio_list_add_head(&md->deferred, io->orig_bio);
> +				queue_work(md->wq, &md->work);
> +			}
>   			else
>   				/* noflush suspend was interrupted. */
>   				io->status = BLK_STS_IOERR;

Mike Snitzer Nov. 6, 2020, 3:21 p.m. UTC | #4

On Thu, Nov 05 2020 at 10:49pm -0500,
JeffleXu <jefflexu@linux.alibaba.com> wrote:

> Hi Mike,
> 
> 
> I have another question about dm, though it's irrelevant to this original
> 
> mail.
> 
> 
> Currently abnormal IO will call blk_queue_split() and normal IO will
> 
> be split considering @max_sectors/@chunk_sectos (in max_io_len()).
> 
> 
> Question 1: Why bio should be split considering queue_limits in dm layer?
> 
> After all the underlying device will split themselves by queue_limits if
> 
> the dm layer doesn't split by queue_limits.

Some targets have "abnormal IO" constraints in their implementation that
is reflected in the queue_limits -- discards in particular.

> Then Question 2: Currently only @max_sectors is considered when splitting
> 
> normal IO in dm layer. Should we also consider
> @max_segments/@max_segment_size
> 
> as blk_queue_split() does?

Great question, it does appear the one gap in DM's splitting for
"normal" IO. I'm less familiar with @max_segments/@max_segment_size. 

Since commit 5091cdec56fa ("dm: change max_io_len() to use
blk_max_size_offset()") DM is making use of more block core code to
calculate its splits -- the offset based splitting is much more natural 
for DM to perform given that potential for spanning multiple targets,
etc.  But DM targets really don't get involved with concern for
@max_segments/@max_segment_size

dm-crypt.c:crypt_io_hints is the only DM target code that concerns
itself with @max_segment_size -- and it is punitive by setting it to
PAGE_SIZE, please see commit 586b286b110e94e ("dm crypt: constrain crypt
device's max_segment_size to PAGE_SIZE") for more context.

Mike

> On 11/3/20 5:23 PM, Jeffle Xu wrote:
> >Hi Mike,
> >
> >Why queue_work() is unnecessary here for bio with BLK_STS_DM_REQUEUE
> >returned?
> >
> >Thanks
> >Jeffle Xu
> >
> >---
> >  drivers/md/dm.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> >index c18fc2548518..ae550daa99b5 100644
> >--- a/drivers/md/dm.c
> >+++ b/drivers/md/dm.c
> >@@ -908,9 +908,11 @@ static void dec_pending(struct dm_io *io, blk_status_t error)
> >  			 * Target requested pushing back the I/O.
> >  			 */
> >  			spin_lock_irqsave(&md->deferred_lock, flags);
> >-			if (__noflush_suspending(md))
> >+			if (__noflush_suspending(md)) {
> >  				/* NOTE early return due to BLK_STS_DM_REQUEUE below */
> >  				bio_list_add_head(&md->deferred, io->orig_bio);
> >+				queue_work(md->wq, &md->work);
> >+			}
> >  			else
> >  				/* noflush suspend was interrupted. */
> >  				io->status = BLK_STS_IOERR;
> 
> -- 
> Thanks,
> Jeffle
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Jingbo Xu Nov. 7, 2020, 8:12 a.m. UTC | #5

On 11/6/20 11:21 PM, Mike Snitzer wrote:
> On Thu, Nov 05 2020 at 10:49pm -0500,
> JeffleXu <jefflexu@linux.alibaba.com> wrote:
>
>> Hi Mike,
>>
>>
>> I have another question about dm, though it's irrelevant to this original
>>
>> mail.
>>
>>
>> Currently abnormal IO will call blk_queue_split() and normal IO will
>>
>> be split considering @max_sectors/@chunk_sectos (in max_io_len()).
>>
>>
>> Question 1: Why bio should be split considering queue_limits in dm layer?
>>
>> After all the underlying device will split themselves by queue_limits if
>>
>> the dm layer doesn't split by queue_limits.
> Some targets have "abnormal IO" constraints in their implementation that
> is reflected in the queue_limits -- discards in particular.
>
>> Then Question 2: Currently only @max_sectors is considered when splitting
>>
>> normal IO in dm layer. Should we also consider
>> @max_segments/@max_segment_size
>>
>> as blk_queue_split() does?
> Great question, it does appear the one gap in DM's splitting for
> "normal" IO. I'm less familiar with @max_segments/@max_segment_size.
>
> Since commit 5091cdec56fa ("dm: change max_io_len() to use
> blk_max_size_offset()") DM is making use of more block core code to
> calculate its splits -- the offset based splitting is much more natural
> for DM to perform given that potential for spanning multiple targets,
> etc.  But DM targets really don't get involved with concern for
> @max_segments/@max_segment_size
>
> dm-crypt.c:crypt_io_hints is the only DM target code that concerns
> itself with @max_segment_size -- and it is punitive by setting it to
> PAGE_SIZE, please see commit 586b286b110e94e ("dm crypt: constrain crypt
> device's max_segment_size to PAGE_SIZE") for more context.

Thanks. So the principle of handling queue_limits of dm layer is that,  
only when dm-target

specific queue_limits is set, shall we consider queue_limitswhen 
splitting bio in dm, in

which case these queue_limits (from underlying devices) are left for the 
underlying devices

to be handled appropriately.


(In this case there's no dm-target set its own 
@max_segments/@max_segment_size, then

these two parameters are not considered when dm splitting.)


If that's the case, then why bother stacking queue_limits in 
blk_stack_limits()? Such as

```

blk_stack_limits

     t->max_segments = min_not_zero(t->max_segments, b->max_segments);
     t->max_segment_size = min_not_zero(t->max_segment_size,
                        b->max_segment_size);

```


I'm not challenging the implementation. Of course it works fine 
currently. I'm just curious

about the design consideration beneath the implementation. :-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index c18fc2548518..ae550daa99b5 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -908,9 +908,11 @@  static void dec_pending(struct dm_io *io, blk_status_t error)
 			 * Target requested pushing back the I/O.
 			 */
 			spin_lock_irqsave(&md->deferred_lock, flags);
-			if (__noflush_suspending(md))
+			if (__noflush_suspending(md)) {
 				/* NOTE early return due to BLK_STS_DM_REQUEUE below */
 				bio_list_add_head(&md->deferred, io->orig_bio);
+				queue_work(md->wq, &md->work);
+			}
 			else
 				/* noflush suspend was interrupted. */
 				io->status = BLK_STS_IOERR;

[Question] Why queue_work unneeded for REQUEUE bio

Commit Message

Comments

Patch