From patchwork Thu Sep 20 10:18:21 2018
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10607329
From: Jianchao Wang
To: axboe@kernel.dk, tj@kernel.org, kent.overstreet@gmail.com, ming.lei@redhat.com, bart.vanassche@wdc.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/3] percpu_ref: add a new helper interface __percpu_ref_get_many
Date: Thu, 20 Sep 2018 18:18:21 +0800
Message-Id: <1537438703-25217-2-git-send-email-jianchao.w.wang@oracle.com>
In-Reply-To: <1537438703-25217-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1537438703-25217-1-git-send-email-jianchao.w.wang@oracle.com>
This __percpu_ref_get_many is almost the same as percpu_ref_get_many, except
that the caller must provide a sched rcu critical section for it. We want to do
some other condition checking under the sched rcu lock; with this interface,
one extra rcu_read_lock/unlock_sched pair can be saved.

Signed-off-by: Jianchao Wang
---
 include/linux/percpu-refcount.h | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 009cdf3..b86e03b 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -169,21 +169,32 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
  * @ref: percpu_ref to get
  * @nr: number of references to get
  *
- * Analogous to atomic_long_add().
- *
- * This function is safe to call as long as @ref is between init and exit.
+ * This function is the same as percpu_ref_get_many, except that the caller
+ * must provide a sched rcu critical section for it.
  */
-static inline void percpu_ref_get_many(struct percpu_ref *ref, unsigned long nr)
+static inline void __percpu_ref_get_many(struct percpu_ref *ref, unsigned long nr)
 {
 	unsigned long __percpu *percpu_count;
 
-	rcu_read_lock_sched();
-
 	if (__ref_is_percpu(ref, &percpu_count))
 		this_cpu_add(*percpu_count, nr);
 	else
 		atomic_long_add(nr, &ref->count);
+}
+
+/**
+ * percpu_ref_get_many - increment a percpu refcount
+ * @ref: percpu_ref to get
+ * @nr: number of references to get
+ *
+ * Analogous to atomic_long_add().
+ *
+ * This function is safe to call as long as @ref is between init and exit.
+ */
+static inline void percpu_ref_get_many(struct percpu_ref *ref, unsigned long nr)
+{
+	rcu_read_lock_sched();
+	__percpu_ref_get_many(ref, nr);
 	rcu_read_unlock_sched();
 }

From patchwork Thu Sep 20 10:18:22 2018
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10607331
From: Jianchao Wang
To: axboe@kernel.dk, tj@kernel.org, kent.overstreet@gmail.com, ming.lei@redhat.com, bart.vanassche@wdc.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/3] blk-core: rework the queue freeze
Date: Thu, 20 Sep 2018 18:18:22 +0800
Message-Id: <1537438703-25217-3-git-send-email-jianchao.w.wang@oracle.com>
In-Reply-To: <1537438703-25217-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1537438703-25217-1-git-send-email-jianchao.w.wang@oracle.com>
The previous queue freeze depends on percpu_ref_kill/reinit. The limitation is
that we have to drain the q_usage_counter before unfreezing the queue.

To improve this, implement our own condition checking, namely queue_gate,
instead of depending on __PERCPU_REF_DEAD, and put both the queue_gate check
and __percpu_ref_get_many under the sched rcu lock. At the same time, switch
the percpu ref mode between atomic and percpu with
percpu_ref_switch_to_atomic/percpu.

On top of this, introduce BLK_QUEUE_GATE_FROZEN on queue_gate to implement the
queue freeze feature. We can then unfreeze the queue at any time without
draining it. In addition, this approach makes it convenient to implement other
condition checks, such as the preempt-only mode.
Signed-off-by: Jianchao Wang
---
 block/blk-core.c        | 28 +++++++++++++++++-----------
 block/blk-mq.c          |  8 ++++++--
 block/blk.h             |  4 ++++
 drivers/scsi/scsi_lib.c |  2 +-
 include/linux/blkdev.h  |  2 ++
 5 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index dee56c2..f8b8fe2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -910,6 +910,18 @@ struct request_queue *blk_alloc_queue(gfp_t gfp_mask)
 }
 EXPORT_SYMBOL(blk_alloc_queue);
 
+static inline bool blk_queue_gate_allow(struct request_queue *q,
+					blk_mq_req_flags_t flags)
+{
+	if (likely(!q->queue_gate))
+		return true;
+
+	if (test_bit(BLK_QUEUE_GATE_FROZEN, &q->queue_gate))
+		return false;
+
+	return true;
+}
+
 /**
  * blk_queue_enter() - try to increase q->q_usage_counter
  * @q: request queue pointer
@@ -922,8 +934,9 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
 	while (true) {
 		bool success = false;
 
-		rcu_read_lock();
-		if (percpu_ref_tryget_live(&q->q_usage_counter)) {
+		rcu_read_lock_sched();
+		if (blk_queue_gate_allow(q, flags)) {
+			__percpu_ref_get_many(&q->q_usage_counter, 1);
 			/*
 			 * The code that sets the PREEMPT_ONLY flag is
 			 * responsible for ensuring that that flag is globally
@@ -935,7 +948,7 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
 				percpu_ref_put(&q->q_usage_counter);
 			}
 		}
-		rcu_read_unlock();
+		rcu_read_unlock_sched();
 
 		if (success)
 			return 0;
@@ -943,17 +956,10 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
 		if (flags & BLK_MQ_REQ_NOWAIT)
 			return -EBUSY;
 
-		/*
-		 * read pair of barrier in blk_freeze_queue_start(),
-		 * we need to order reading __PERCPU_REF_DEAD flag of
-		 * .q_usage_counter and reading .mq_freeze_depth or
-		 * queue dying flag, otherwise the following wait may
-		 * never return if the two reads are reordered.
-		 */
 		smp_rmb();
 
 		wait_event(q->mq_freeze_wq,
-			   (atomic_read(&q->mq_freeze_depth) == 0 &&
+			   (blk_queue_gate_allow(q, flags) &&
 			    (preempt || !blk_queue_preempt_only(q))) ||
 			   blk_queue_dying(q));
 		if (blk_queue_dying(q))
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 85a1c1a..fc90ad3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -140,7 +140,9 @@ void blk_freeze_queue_start(struct request_queue *q)
 
 	freeze_depth = atomic_inc_return(&q->mq_freeze_depth);
 	if (freeze_depth == 1) {
-		percpu_ref_kill(&q->q_usage_counter);
+		set_bit(BLK_QUEUE_GATE_FROZEN, &q->queue_gate);
+		percpu_ref_put(&q->q_usage_counter);
+		percpu_ref_switch_to_atomic(&q->q_usage_counter, NULL);
 		if (q->mq_ops)
 			blk_mq_run_hw_queues(q, false);
 	}
@@ -198,7 +200,9 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
 	freeze_depth = atomic_dec_return(&q->mq_freeze_depth);
 	WARN_ON_ONCE(freeze_depth < 0);
 	if (!freeze_depth) {
-		percpu_ref_reinit(&q->q_usage_counter);
+		clear_bit(BLK_QUEUE_GATE_FROZEN, &q->queue_gate);
+		percpu_ref_get(&q->q_usage_counter);
+		percpu_ref_switch_to_percpu(&q->q_usage_counter);
 		wake_up_all(&q->mq_freeze_wq);
 	}
 }
diff --git a/block/blk.h b/block/blk.h
index 9db4e38..19d2c00 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -19,6 +19,10 @@
 extern struct dentry *blk_debugfs_root;
 #endif
 
+enum blk_queue_gate_flag_t {
+	BLK_QUEUE_GATE_FROZEN,
+};
+
 struct blk_flush_queue {
 	unsigned int		flush_queue_delayed:1;
 	unsigned int		flush_pending_idx:1;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0adfb3b..1980648 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -3066,7 +3066,7 @@ scsi_device_quiesce(struct scsi_device *sdev)
 	 * unfreeze even if the queue was already frozen before this function
 	 * was called. See also https://lwn.net/Articles/573497/.
 	 */
-	synchronize_rcu();
+	synchronize_sched();
 	blk_mq_unfreeze_queue(q);
 
 	mutex_lock(&sdev->state_mutex);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d6869e0..9f3f0d7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -647,6 +647,8 @@ struct request_queue {
 	struct rcu_head		rcu_head;
 	wait_queue_head_t	mq_freeze_wq;
 	struct percpu_ref	q_usage_counter;
+	unsigned long		queue_gate;
+
 	struct list_head	all_q_node;
 
 	struct blk_mq_tag_set	*tag_set;

From patchwork Thu Sep 20 10:18:23 2018
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10607333
From: Jianchao Wang
To: axboe@kernel.dk, tj@kernel.org, kent.overstreet@gmail.com, ming.lei@redhat.com, bart.vanassche@wdc.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 3/3] block, scsi: rework the preempt only mode
Date: Thu, 20 Sep 2018 18:18:23 +0800
Message-Id: <1537438703-25217-4-git-send-email-jianchao.w.wang@oracle.com>
In-Reply-To: <1537438703-25217-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1537438703-25217-1-git-send-email-jianchao.w.wang@oracle.com>

Migrate the preempt-only mode from queue_flags to queue_gate. Because the
queue_gate check and __percpu_ref_get_many are under the same sched rcu
critical section, we no longer need an extra synchronize_sched after
blk_mq_freeze_queue.

Signed-off-by: Jianchao Wang
---
 block/blk-core.c        | 32 ++++++++++----------------------
 block/blk-mq-debugfs.c  |  1 -
 block/blk.h             |  1 +
 drivers/scsi/scsi_lib.c | 12 ++++--------
 include/linux/blkdev.h  |  3 ---
 5 files changed, 15 insertions(+), 34 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index f8b8fe2..c8d642a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -429,13 +429,13 @@ EXPORT_SYMBOL(blk_sync_queue);
  */
 int blk_set_preempt_only(struct request_queue *q)
 {
-	return blk_queue_flag_test_and_set(QUEUE_FLAG_PREEMPT_ONLY, q);
+	return test_and_set_bit(BLK_QUEUE_GATE_PREEMPT_ONLY, &q->queue_gate);
 }
 EXPORT_SYMBOL_GPL(blk_set_preempt_only);
 
 void blk_clear_preempt_only(struct request_queue *q)
 {
-	blk_queue_flag_clear(QUEUE_FLAG_PREEMPT_ONLY, q);
+	clear_bit(BLK_QUEUE_GATE_PREEMPT_ONLY, &q->queue_gate);
 	wake_up_all(&q->mq_freeze_wq);
 }
 EXPORT_SYMBOL_GPL(blk_clear_preempt_only);
@@ -919,6 +919,10 @@ static inline bool blk_queue_gate_allow(struct request_queue *q,
 	if (test_bit(BLK_QUEUE_GATE_FROZEN, &q->queue_gate))
 		return false;
 
+	if (test_bit(BLK_QUEUE_GATE_PREEMPT_ONLY, &q->queue_gate) &&
+	    !(flags & BLK_MQ_REQ_PREEMPT))
+		return false;
+
 	return true;
 }
 
@@ -929,39 +933,23 @@ static inline bool blk_queue_gate_allow(struct request_queue *q,
  */
 int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
 {
-	const bool preempt = flags & BLK_MQ_REQ_PREEMPT;
-
 	while (true) {
-		bool success = false;
-
 		rcu_read_lock_sched();
 		if (blk_queue_gate_allow(q, flags)) {
 			__percpu_ref_get_many(&q->q_usage_counter, 1);
-			/*
-			 * The code that sets the PREEMPT_ONLY flag is
-			 * responsible for ensuring that that flag is globally
-			 * visible before the queue is unfrozen.
-			 */
-			if (preempt || !blk_queue_preempt_only(q)) {
-				success = true;
-			} else {
-				percpu_ref_put(&q->q_usage_counter);
-			}
+			rcu_read_unlock_sched();
+			return 0;
 		}
 		rcu_read_unlock_sched();
 
-		if (success)
-			return 0;
-
 		if (flags & BLK_MQ_REQ_NOWAIT)
 			return -EBUSY;
 
 		smp_rmb();
 
 		wait_event(q->mq_freeze_wq,
-			   (blk_queue_gate_allow(q, flags) &&
-			    (preempt || !blk_queue_preempt_only(q))) ||
-			   blk_queue_dying(q));
+			   blk_queue_gate_allow(q, flags) ||
+			   blk_queue_dying(q));
 		if (blk_queue_dying(q))
 			return -ENODEV;
 	}
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index cb1e6cf..4174951 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -132,7 +132,6 @@ static const char *const blk_queue_flag_name[] = {
 	QUEUE_FLAG_NAME(REGISTERED),
 	QUEUE_FLAG_NAME(SCSI_PASSTHROUGH),
 	QUEUE_FLAG_NAME(QUIESCED),
-	QUEUE_FLAG_NAME(PREEMPT_ONLY),
 };
 #undef QUEUE_FLAG_NAME
diff --git a/block/blk.h b/block/blk.h
index 19d2c00..e38060b 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -21,6 +21,7 @@ extern struct dentry *blk_debugfs_root;
 
 enum blk_queue_gate_flag_t {
 	BLK_QUEUE_GATE_FROZEN,
+	BLK_QUEUE_GATE_PREEMPT_ONLY,
 };
 
 struct blk_flush_queue {
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 1980648..3c9c33a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -3058,17 +3058,13 @@ scsi_device_quiesce(struct scsi_device *sdev)
 	WARN_ON_ONCE(sdev->quiesced_by && sdev->quiesced_by != current);
 
 	blk_set_preempt_only(q);
-
 	blk_mq_freeze_queue(q);
+	blk_mq_unfreeze_queue(q);
 	/*
-	 * Ensure that the effect of blk_set_preempt_only() will be visible
-	 * for percpu_ref_tryget() callers that occur after the queue
-	 * unfreeze even if the queue was already frozen before this function
-	 * was called. See also https://lwn.net/Articles/573497/.
+	 * When we reach here:
+	 * - PREEMPT_ONLY gate flag is globally visible
+	 * - no non-preempt request in queue
 	 */
-	synchronize_sched();
-	blk_mq_unfreeze_queue(q);
-
 	mutex_lock(&sdev->state_mutex);
 	err = scsi_device_set_state(sdev, SDEV_QUIESCE);
 	if (err == 0)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 9f3f0d7..caa73b8 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -700,7 +700,6 @@ struct request_queue {
 #define QUEUE_FLAG_REGISTERED	26	/* queue has been registered to a disk */
 #define QUEUE_FLAG_SCSI_PASSTHROUGH 27	/* queue supports SCSI commands */
 #define QUEUE_FLAG_QUIESCED	28	/* queue has been quiesced */
-#define QUEUE_FLAG_PREEMPT_ONLY	29	/* only process REQ_PREEMPT requests */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP)	|	\
@@ -738,8 +737,6 @@ bool blk_queue_flag_test_and_clear(unsigned int flag, struct request_queue *q);
 	((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \
 			    REQ_FAILFAST_DRIVER))
 #define blk_queue_quiesced(q)	test_bit(QUEUE_FLAG_QUIESCED, &(q)->queue_flags)
-#define blk_queue_preempt_only(q)				\
-	test_bit(QUEUE_FLAG_PREEMPT_ONLY, &(q)->queue_flags)
 #define blk_queue_fua(q)	test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags)
 
 extern int blk_set_preempt_only(struct request_queue *q);