[RFC,0/3] blk-mq: rework queue freeze and preempt-only

Message ID 1537438703-25217-1-git-send-email-jianchao.w.wang@oracle.com

Message

jianchao.wang Sept. 20, 2018, 10:18 a.m. UTC
Hi all

The current queue freeze, which depends on percpu_ref_kill/reinit, has a
limitation: we have to drain the requests before we can unfreeze the queue.

Let's rework the queue freeze feature as follows:
1. introduce __percpu_ref_get_many.
   It is the same as the original percpu_ref_get_many, but requires the
   callers to provide the sched RCU critical section. We put
   __percpu_ref_get_many and our own condition check under
   rcu_read_lock_sched; with this new helper interface, we save an extra
   rcu_read_lock_sched.
2. rework blk_queue_enter as:

   rcu_read_lock_sched()
   if condition check true
     __percpu_ref_get_many(&q->q_usage_counter, 1)
   else
     goto wait
   rcu_read_unlock_sched()
3. use percpu_ref_switch_to_atomic/percpu to switch modes directly.

Then we could unfreeze the queue w/o draining requests.
In addition, the preempt-only mode code could be simplified.
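
To make step 2 concrete, here is a minimal sketch of what blk_queue_enter
could look like under this scheme. blk_queue_gate_allow() is a hypothetical
stand-in for the series' actual condition checks, and the wait path is
simplified:

int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
{
	for (;;) {
		rcu_read_lock_sched();
		if (blk_queue_gate_allow(q, flags)) {
			/* take the ref inside the same sched-RCU
			 * section as the condition check */
			__percpu_ref_get_many(&q->q_usage_counter, 1);
			rcu_read_unlock_sched();
			return 0;
		}
		rcu_read_unlock_sched();

		if (flags & BLK_MQ_REQ_NOWAIT)
			return -EBUSY;

		/* sleep until the gate may have opened, then recheck
		 * under rcu_read_lock_sched on the next iteration */
		wait_event(q->mq_freeze_wq, blk_queue_gate_allow(q, flags));
	}
}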

Jianchao Wang (3):
percpu_ref: add a new helper interface
blk-core: rework the queue freeze
block, scsi: rework the preempt only mode

 block/blk-core.c                | 58 ++++++++++++++++++++++++++--------------------------------
 block/blk-mq-debugfs.c          |  1 -
 block/blk-mq.c                  |  8 ++++++--
 block/blk.h                     |  5 +++++
 drivers/scsi/scsi_lib.c         | 12 ++++--------
 include/linux/blkdev.h          |  5 ++---
 include/linux/percpu-refcount.h | 23 +++++++++++++++++------
 7 files changed, 60 insertions(+), 52 deletions(-)

Thanks
Jianchao

Comments

Bart Van Assche Sept. 20, 2018, 5:06 p.m. UTC | #1
On Thu, 2018-09-20 at 18:18 +0800, Jianchao Wang wrote:
> The current queue freeze, which depends on percpu_ref_kill/reinit, has a
> limitation: we have to drain the requests before we can unfreeze the queue.
> 
> Let's rework the queue freeze feature as follows:
> 1. introduce __percpu_ref_get_many.
>    It is the same as the original percpu_ref_get_many, but requires the
>    callers to provide the sched RCU critical section. We put
>    __percpu_ref_get_many and our own condition check under
>    rcu_read_lock_sched; with this new helper interface, we save an extra
>    rcu_read_lock_sched.
> 2. rework blk_queue_enter as:
> 
>    rcu_read_lock_sched()
>    if condition check true
>      __percpu_ref_get_many(&q->q_usage_counter, 1)
>    else
>      goto wait
>    rcu_read_unlock_sched()
> 3. use percpu_ref_switch_to_atomic/percpu to switch modes directly.
> 
> Then we could unfreeze the queue w/o draining requests.
> In addition, the preempt-only mode code could be simplified.

Hello Jianchao,

Thanks for having taken a look. However, the approach of this patch series
may be more complicated than necessary. Had you considered something like
the patch below?

Thanks,

Bart.


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 20fdd78b75c7..d384ab700afd 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -199,7 +199,7 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
 	freeze_depth = atomic_dec_return(&q->mq_freeze_depth);
 	WARN_ON_ONCE(freeze_depth < 0);
 	if (!freeze_depth) {
-		percpu_ref_reinit(&q->q_usage_counter);
+		percpu_ref_resurrect(&q->q_usage_counter);
 		wake_up_all(&q->mq_freeze_wq);
 	}
 }
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 009cdf3d65b6..b297cd1cd4f1 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -108,6 +108,7 @@ void percpu_ref_switch_to_atomic_sync(struct percpu_ref *ref);
 void percpu_ref_switch_to_percpu(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
+void percpu_ref_resurrect(struct percpu_ref *ref);
 void percpu_ref_reinit(struct percpu_ref *ref);
 
 /**
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 9f96fa7bc000..054e7650cd35 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -356,11 +356,31 @@ EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
  */
 void percpu_ref_reinit(struct percpu_ref *ref)
 {
+	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
+
+	percpu_ref_resurrect(ref);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_reinit);
+
+/**
+ * percpu_ref_resurrect - modify a percpu refcount from dead to live
+ * @ref: percpu_ref to resurrect
+ *
+ * Modify @ref so that it's in the same state as before percpu_ref_kill() was
+ * called. @ref must be dead but not exited.
+ *
+ * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
+ * this function is in progress.
+ */
+void percpu_ref_resurrect(struct percpu_ref *ref)
+{
+	unsigned long __percpu *percpu_count;
 	unsigned long flags;
 
 	spin_lock_irqsave(&percpu_ref_switch_lock, flags);
 
-	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
+	WARN_ON_ONCE(!(ref->percpu_count_ptr & __PERCPU_REF_DEAD));
+	WARN_ON_ONCE(__ref_is_percpu(ref, &percpu_count));
 
 	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
 	percpu_ref_get(ref);
@@ -368,4 +388,4 @@ void percpu_ref_reinit(struct percpu_ref *ref)
 
 	spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
 }
-EXPORT_SYMBOL_GPL(percpu_ref_reinit);
+EXPORT_SYMBOL_GPL(percpu_ref_resurrect);
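
The key difference between the two helpers: percpu_ref_reinit() insists the
count has already dropped to zero (i.e. the queue was fully drained), while
percpu_ref_resurrect() only insists the ref has been killed, so it may be
called while references are still held. A minimal sketch of how that enables
unfreezing without draining (illustrative only; the helper name is made up,
and the real freeze/unfreeze paths also manage mq_freeze_depth):

static void example_unfreeze_without_drain(struct request_queue *q)
{
	/* freeze: percpu_ref_tryget_live() in blk_queue_enter() fails */
	percpu_ref_kill(&q->q_usage_counter);

	/* in-flight requests may still hold q_usage_counter here */

	/* unfreeze immediately; the counter need not be zero */
	percpu_ref_resurrect(&q->q_usage_counter);
	wake_up_all(&q->mq_freeze_wq);
}
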
jianchao.wang Sept. 21, 2018, 1:51 a.m. UTC | #2
Hi Bart

On 09/21/2018 01:06 AM, Bart Van Assche wrote:
> On Thu, 2018-09-20 at 18:18 +0800, Jianchao Wang wrote:
>> The current queue freeze, which depends on percpu_ref_kill/reinit, has a
>> limitation: we have to drain the requests before we can unfreeze the queue.
>>
>> Let's rework the queue freeze feature as follows:
>> 1. introduce __percpu_ref_get_many.
>>    It is the same as the original percpu_ref_get_many, but requires the
>>    callers to provide the sched RCU critical section. We put
>>    __percpu_ref_get_many and our own condition check under
>>    rcu_read_lock_sched; with this new helper interface, we save an extra
>>    rcu_read_lock_sched.
>> 2. rework blk_queue_enter as:
>>
>>    rcu_read_lock_sched()
>>    if condition check true
>>      __percpu_ref_get_many(&q->q_usage_counter, 1)
>>    else
>>      goto wait
>>    rcu_read_unlock_sched()
>> 3. use percpu_ref_switch_to_atomic/percpu to switch modes directly.
>>
>> Then we could unfreeze the queue w/o draining requests.
>> In addition, the preempt-only mode code could be simplified.
> Hello Jianchao,
> 
> Thanks for having taken a look. However, the approach of this patch series
> may be more complicated than necessary. Had you considered something like
> the patch below?

This patch is just an attempt to make our life easier w/o changing anything
in the percpu-ref code. :)
In fact, the 1st patch, which introduces a new non-functional-change helper
interface in the percpu-ref code, is not necessary.

This patchset just implements our own condition checking in blk_queue_enter
instead of depending on the __PERCPU_REF_DEAD check in
percpu_ref_tryget_live.

Then we could do more things here, like:
 - unfreeze the queue without draining requests.
 - check whether q->q_usage_counter is zero.
 - add other gate conditions into blk_queue_enter, such as preempt-only, or
   even any others.

So why not try it?
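
To illustrate the gate-condition idea, here is a hypothetical sketch of the
check blk_queue_enter could perform under rcu_read_lock_sched(). The names
blk_queue_gate_allow() and blk_queue_frozen() are illustrative only, not
taken from the actual patches:

static inline bool blk_queue_gate_allow(struct request_queue *q,
					blk_mq_req_flags_t flags)
{
	/* hypothetical flag: set at freeze, cleared at unfreeze */
	if (blk_queue_frozen(q))
		return false;

	/* preempt-only becomes just another gate condition */
	if (blk_queue_preempt_only(q) && !(flags & BLK_MQ_REQ_PREEMPT))
		return false;

	return true;
}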

Thanks
Jianchao
Bart Van Assche Sept. 21, 2018, 2:13 a.m. UTC | #3
On 9/20/18 6:51 PM, jianchao.wang wrote:
> Hi Bart
> 
> On 09/21/2018 01:06 AM, Bart Van Assche wrote:
>> On Thu, 2018-09-20 at 18:18 +0800, Jianchao Wang wrote:
>>> The current queue freeze, which depends on percpu_ref_kill/reinit, has a
>>> limitation: we have to drain the requests before we can unfreeze the queue.
>>>
>>> Let's rework the queue freeze feature as follows:
>>> 1. introduce __percpu_ref_get_many.
>>>    It is the same as the original percpu_ref_get_many, but requires the
>>>    callers to provide the sched RCU critical section. We put
>>>    __percpu_ref_get_many and our own condition check under
>>>    rcu_read_lock_sched; with this new helper interface, we save an extra
>>>    rcu_read_lock_sched.
>>> 2. rework blk_queue_enter as:
>>>
>>>    rcu_read_lock_sched()
>>>    if condition check true
>>>      __percpu_ref_get_many(&q->q_usage_counter, 1)
>>>    else
>>>      goto wait
>>>    rcu_read_unlock_sched()
>>> 3. use percpu_ref_switch_to_atomic/percpu to switch modes directly.
>>>
>>> Then we could unfreeze the queue w/o draining requests.
>>> In addition, the preempt-only mode code could be simplified.
>> Hello Jianchao,
>> 
>> Thanks for having taken a look. However, the approach of this patch series
>> may be more complicated than necessary. Had you considered something like
>> the patch below?
> 
> This patch is just an attempt to make our life easier w/o changing anything
> in the percpu-ref code. :)
> In fact, the 1st patch, which introduces a new non-functional-change helper
> interface in the percpu-ref code, is not necessary.
> 
> This patchset just implements our own condition checking in blk_queue_enter
> instead of depending on the __PERCPU_REF_DEAD check in
> percpu_ref_tryget_live.
> 
> Then we could do more things here, like:
>  - unfreeze the queue without draining requests.
>  - check whether q->q_usage_counter is zero.
>  - add other gate conditions into blk_queue_enter, such as preempt-only, or
>    even any others.
> 
> So why not try it?

Hello Jianchao,

Some time ago Tejun wrote that he wants to keep the dependency on rcu-sched
inside the percpu_ref implementation because he would like to have the
freedom to switch to regular RCU. Your patch series makes the dependency of
percpu_ref counters on rcu-sched visible outside the percpu_ref
implementation. I think that's a disadvantage of your approach. Tejun,
please correct me if I misunderstood you.
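
The point is visible in percpu_ref_tryget_live(): the
rcu_read_lock_sched()/unlock pair stays entirely inside the percpu_ref
implementation. Roughly (a sketch of the current helper, from
include/linux/percpu-refcount.h):

static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
{
	unsigned long __percpu *percpu_count;
	bool ret = false;

	rcu_read_lock_sched();
	if (__ref_is_percpu(ref, &percpu_count)) {
		/* fast path: per-cpu increment */
		this_cpu_inc(*percpu_count);
		ret = true;
	} else if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD)) {
		/* atomic mode, but not killed: try to get a ref */
		ret = atomic_long_inc_not_zero(&ref->count);
	}
	rcu_read_unlock_sched();

	return ret;
}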

Bart.
jianchao.wang Sept. 21, 2018, 2:24 a.m. UTC | #4
Hi Bart

On 09/21/2018 10:13 AM, Bart Van Assche wrote:
> On 9/20/18 6:51 PM, jianchao.wang wrote:
>> Hi Bart
>>
>> On 09/21/2018 01:06 AM, Bart Van Assche wrote:
>>> On Thu, 2018-09-20 at 18:18 +0800, Jianchao Wang wrote:
>>>> The current queue freeze, which depends on percpu_ref_kill/reinit, has a
>>>> limitation: we have to drain the requests before we can unfreeze the queue.
>>>>
>>>> Let's rework the queue freeze feature as follows:
>>>> 1. introduce __percpu_ref_get_many.
>>>>    It is the same as the original percpu_ref_get_many, but requires the
>>>>    callers to provide the sched RCU critical section. We put
>>>>    __percpu_ref_get_many and our own condition check under
>>>>    rcu_read_lock_sched; with this new helper interface, we save an extra
>>>>    rcu_read_lock_sched.
>>>> 2. rework blk_queue_enter as:
>>>>
>>>>    rcu_read_lock_sched()
>>>>    if condition check true
>>>>      __percpu_ref_get_many(&q->q_usage_counter, 1)
>>>>    else
>>>>      goto wait
>>>>    rcu_read_unlock_sched()
>>>> 3. use percpu_ref_switch_to_atomic/percpu to switch modes directly.
>>>>
>>>> Then we could unfreeze the queue w/o draining requests.
>>>> In addition, the preempt-only mode code could be simplified.
>>> Hello Jianchao,
>>> 
>>> Thanks for having taken a look. However, the approach of this patch series
>>> may be more complicated than necessary. Had you considered something like
>>> the patch below?
>>
>> This patch is just an attempt to make our life easier w/o changing anything
>> in the percpu-ref code. :)
>> In fact, the 1st patch, which introduces a new non-functional-change helper
>> interface in the percpu-ref code, is not necessary.
>>
>> This patchset just implements our own condition checking in blk_queue_enter
>> instead of depending on the __PERCPU_REF_DEAD check in
>> percpu_ref_tryget_live.
>>
>> Then we could do more things here, like:
>>  - unfreeze the queue without draining requests.
>>  - check whether q->q_usage_counter is zero.
>>  - add other gate conditions into blk_queue_enter, such as preempt-only, or
>>    even any others.
>>
>> So why not try it?
> 
> Hello Jianchao,
> 
> Some time ago Tejun wrote that he wants to keep the dependency on
> rcu-sched inside the percpu_ref implementation because he would like to
> have the freedom to switch to regular RCU. Your patch series makes the
> dependency of percpu_ref counters on rcu-sched visible outside the
> percpu_ref implementation. I think that's a disadvantage of your
> approach. Tejun, please correct me if I misunderstood you.

It would be easy to fix this by adding some other interfaces like:

static inline void percpu_ref_lock(void)
{
	rcu_read_lock_sched();
}
static inline void percpu_ref_unlock(void)
{
	rcu_read_unlock_sched();
}

But it is up to Tejun whether he agrees to do it like this.
 
Thanks
Jianchao