From patchwork Fri Oct 12 10:07:25 2018
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10638295
From: Jianchao Wang
To: axboe@kernel.dk
Cc: keith.busch@linux.intel.com, ming.lei@redhat.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH V2 1/4] blk-mq: adjust debugfs and sysfs register when updating nr_hw_queues
Date: Fri, 12 Oct 2018 18:07:25 +0800
Message-Id: <1539338848-1789-2-git-send-email-jianchao.w.wang@oracle.com>
In-Reply-To: <1539338848-1789-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1539338848-1789-1-git-send-email-jianchao.w.wang@oracle.com>

blk-mq debugfs and sysfs entries need to be removed before updating the
queue map, otherwise we get wrong results there. This patch fixes that and
removes the redundant debugfs and sysfs register/unregister operations
during __blk_mq_update_nr_hw_queues.

Signed-off-by: Jianchao Wang
Reviewed-by: Ming Lei
---
 block/blk-mq.c | 39 ++++++++++++---------------------------
 1 file changed, 12 insertions(+), 27 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3c39ea..32be432 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2137,8 +2137,6 @@ static void blk_mq_exit_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
 		struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
 {
-	blk_mq_debugfs_unregister_hctx(hctx);
-
 	if (blk_mq_hw_queue_mapped(hctx))
 		blk_mq_tag_idle(hctx);
@@ -2165,6 +2163,7 @@ static void blk_mq_exit_hw_queues(struct request_queue *q,
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (i == nr_queue)
 			break;
+		blk_mq_debugfs_unregister_hctx(hctx);
 		blk_mq_exit_hctx(q, set, hctx, i);
 	}
 }
@@ -2222,8 +2221,6 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	if (hctx->flags & BLK_MQ_F_BLOCKING)
 		init_srcu_struct(hctx->srcu);

-	blk_mq_debugfs_register_hctx(q, hctx);
-
 	return 0;

 free_fq:
@@ -2512,8 +2509,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	int i, j;
 	struct blk_mq_hw_ctx **hctxs = q->queue_hw_ctx;

-	blk_mq_sysfs_unregister(q);
-
 	/* protect against switching io scheduler */
 	mutex_lock(&q->sysfs_lock);
 	for (i = 0; i < set->nr_hw_queues; i++) {
@@ -2561,7 +2556,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	}
 	q->nr_hw_queues = i;
 	mutex_unlock(&q->sysfs_lock);
-	blk_mq_sysfs_register(q);
 }

 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
@@ -2659,25 +2653,6 @@ void blk_mq_free_queue(struct request_queue *q)
 	blk_mq_exit_hw_queues(q, set, set->nr_hw_queues);
 }

-/* Basically redo blk_mq_init_queue with queue frozen */
-static void blk_mq_queue_reinit(struct request_queue *q)
-{
-	WARN_ON_ONCE(!atomic_read(&q->mq_freeze_depth));
-
-	blk_mq_debugfs_unregister_hctxs(q);
-	blk_mq_sysfs_unregister(q);
-
-	/*
-	 * redo blk_mq_init_cpu_queues and blk_mq_init_hw_queues. FIXME: maybe
-	 * we should change hctx numa_node according to the new topology (this
-	 * involves freeing and re-allocating memory, worth doing?)
-	 */
-	blk_mq_map_swqueue(q);
-
-	blk_mq_sysfs_register(q);
-	blk_mq_debugfs_register_hctxs(q);
-}
-
 static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
 {
 	int i;
@@ -2987,11 +2962,21 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 		if (!blk_mq_elv_switch_none(&head, q))
 			goto switch_back;

+	list_for_each_entry(q, &set->tag_list, tag_set_list) {
+		blk_mq_debugfs_unregister_hctxs(q);
+		blk_mq_sysfs_unregister(q);
+	}
+
 	set->nr_hw_queues = nr_hw_queues;
 	blk_mq_update_queue_map(set);
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
 		blk_mq_realloc_hw_ctxs(set, q);
-		blk_mq_queue_reinit(q);
+		blk_mq_map_swqueue(q);
+	}
+
+	list_for_each_entry(q, &set->tag_list, tag_set_list) {
+		blk_mq_sysfs_register(q);
+		blk_mq_debugfs_register_hctxs(q);
 	}

 switch_back:
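
For illustration, the ordering this patch establishes in
__blk_mq_update_nr_hw_queues can be modelled with a small standalone C
program; the helpers below are stand-ins for the blk-mq calls, not kernel
API, and the two-queue tag set is an assumption made only for the sketch.

/* Toy userspace model of the reordered update sequence (not kernel code). */
#include <stdio.h>

#define NQUEUES 2	/* pretend the tag set has two request queues */

static void debugfs_sysfs_unregister(int q) { printf("q%d: unregister debugfs/sysfs\n", q); }
static void update_queue_map(void)          { printf("tag set: update queue map\n"); }
static void realloc_hw_ctxs(int q)          { printf("q%d: realloc hw ctxs\n", q); }
static void map_swqueue(int q)              { printf("q%d: remap sw queues\n", q); }
static void debugfs_sysfs_register(int q)   { printf("q%d: re-register debugfs/sysfs\n", q); }

int main(void)
{
	int q;

	/* 1. Tear down debugfs/sysfs for every queue in the tag set ... */
	for (q = 0; q < NQUEUES; q++)
		debugfs_sysfs_unregister(q);

	/* 2. ... only then touch the queue map and the hctxs ... */
	update_queue_map();
	for (q = 0; q < NQUEUES; q++) {
		realloc_hw_ctxs(q);
		map_swqueue(q);
	}

	/* 3. ... and re-register once the new mapping is in place. */
	for (q = 0; q < NQUEUES; q++)
		debugfs_sysfs_register(q);

	return 0;
}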

From patchwork Fri Oct 12 10:07:26 2018
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10638289
From: Jianchao Wang
To: axboe@kernel.dk
Cc: keith.busch@linux.intel.com, ming.lei@redhat.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH V2 2/4] blk-mq: change gfp flags to GFP_NOIO in blk_mq_realloc_hw_ctxs
Date: Fri, 12 Oct 2018 18:07:26 +0800
Message-Id: <1539338848-1789-3-git-send-email-jianchao.w.wang@oracle.com>
In-Reply-To: <1539338848-1789-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1539338848-1789-1-git-send-email-jianchao.w.wang@oracle.com>

blk_mq_realloc_hw_ctxs could be invoked while the hw queues are being
updated. At that moment, IO is blocked. Change the gfp flags from
GFP_KERNEL to GFP_NOIO to avoid hanging forever during memory allocation
in blk_mq_realloc_hw_ctxs.

Signed-off-by: Jianchao Wang
---
 block/blk-core.c  |  2 +-
 block/blk-flush.c |  6 +++---
 block/blk-mq.c    | 17 ++++++++++-------
 block/blk.h       |  2 +-
 4 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index cff0a60..4fe7084 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1160,7 +1160,7 @@ int blk_init_allocated_queue(struct request_queue *q)
 {
 	WARN_ON_ONCE(q->mq_ops);

-	q->fq = blk_alloc_flush_queue(q, NUMA_NO_NODE, q->cmd_size);
+	q->fq = blk_alloc_flush_queue(q, NUMA_NO_NODE, q->cmd_size, GFP_KERNEL);
 	if (!q->fq)
 		return -ENOMEM;

diff --git a/block/blk-flush.c b/block/blk-flush.c
index ce41f66..8b44b86 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -566,12 +566,12 @@ int blkdev_issue_flush(struct block_device *bdev, gfp_t gfp_mask,
 EXPORT_SYMBOL(blkdev_issue_flush);

 struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q,
-		int node, int cmd_size)
+		int node, int cmd_size, gfp_t flags)
 {
 	struct blk_flush_queue *fq;
 	int rq_sz = sizeof(struct request);

-	fq = kzalloc_node(sizeof(*fq), GFP_KERNEL, node);
+	fq = kzalloc_node(sizeof(*fq), flags, node);
 	if (!fq)
 		goto fail;

@@ -579,7 +579,7 @@ struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q,
 	spin_lock_init(&fq->mq_flush_lock);

 	rq_sz = round_up(rq_sz + cmd_size, cache_line_size());
-	fq->flush_rq = kzalloc_node(rq_sz, GFP_KERNEL, node);
+	fq->flush_rq = kzalloc_node(rq_sz, flags, node);
 	if (!fq->flush_rq)
 		goto fail_rq;

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 32be432..9805d6a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2193,12 +2193,12 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	 * runtime
 	 */
 	hctx->ctxs = kmalloc_array_node(nr_cpu_ids, sizeof(void *),
-			GFP_KERNEL, node);
+			GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, node);
 	if (!hctx->ctxs)
 		goto unregister_cpu_notifier;

-	if (sbitmap_init_node(&hctx->ctx_map, nr_cpu_ids, ilog2(8), GFP_KERNEL,
-			      node))
+	if (sbitmap_init_node(&hctx->ctx_map, nr_cpu_ids, ilog2(8),
+			GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, node))
 		goto free_ctxs;

 	hctx->nr_ctx = 0;
@@ -2211,7 +2211,8 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	    set->ops->init_hctx(hctx, set->driver_data, hctx_idx))
 		goto free_bitmap;

-	hctx->fq = blk_alloc_flush_queue(q, hctx->numa_node, set->cmd_size);
+	hctx->fq = blk_alloc_flush_queue(q, hctx->numa_node, set->cmd_size,
+			GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
 	if (!hctx->fq)
 		goto exit_hctx;

@@ -2519,12 +2520,14 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 		node = blk_mq_hw_queue_to_node(q->mq_map, i);
 		hctxs[i] = kzalloc_node(blk_mq_hw_ctx_size(set),
-				GFP_KERNEL, node);
+				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
+				node);
 		if (!hctxs[i])
 			break;

-		if (!zalloc_cpumask_var_node(&hctxs[i]->cpumask, GFP_KERNEL,
-						node)) {
+		if (!zalloc_cpumask_var_node(&hctxs[i]->cpumask,
+					GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
+					node)) {
 			kfree(hctxs[i]);
 			hctxs[i] = NULL;
 			break;

diff --git a/block/blk.h b/block/blk.h
index 9db4e38..0f86eda 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -124,7 +124,7 @@ static inline void __blk_get_queue(struct request_queue *q)
 }

 struct blk_flush_queue *blk_alloc_flush_queue(struct request_queue *q,
-		int node, int cmd_size);
+		int node, int cmd_size, gfp_t flags);
 void blk_free_flush_queue(struct blk_flush_queue *q);

 int blk_init_rl(struct request_list *rl, struct request_queue *q,
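
For illustration, the new calling convention (the caller decides the
allocation context and blk_alloc_flush_queue simply forwards it to its
internal allocations) can be modelled in plain C. This is only a sketch:
the enum below stands in for gfp_t and GFP_KERNEL/GFP_NOIO, and the helper
names are invented for the example, not kernel API.

/* Toy userspace model of threading an allocation context through a helper. */
#include <stdio.h>
#include <stdlib.h>

enum alloc_flags { MAY_DO_IO, NO_IO };	/* stand-ins for GFP_KERNEL / GFP_NOIO */

struct flush_queue {
	void *flush_rq;
};

static void *zalloc(size_t sz, enum alloc_flags flags)
{
	/* the kernel variant is kzalloc_node(sz, flags, node) */
	(void)flags;
	return calloc(1, sz);
}

static struct flush_queue *alloc_flush_queue(size_t cmd_size,
					     enum alloc_flags flags)
{
	/* Both internal allocations honour the caller-supplied flags. */
	struct flush_queue *fq = zalloc(sizeof(*fq), flags);

	if (!fq)
		return NULL;
	fq->flush_rq = zalloc(64 + cmd_size, flags);
	if (!fq->flush_rq) {
		free(fq);
		return NULL;
	}
	return fq;
}

int main(void)
{
	/* While nr_hw_queues is being updated I/O is frozen, so the caller
	 * asks for a no-I/O allocation. */
	struct flush_queue *fq = alloc_flush_queue(32, NO_IO);

	printf("flush queue allocated: %s\n", fq ? "yes" : "no");
	if (fq) {
		free(fq->flush_rq);
		free(fq);
	}
	return 0;
}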

From patchwork Fri Oct 12 10:07:27 2018
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10638287
From: Jianchao Wang
To: axboe@kernel.dk
Cc: keith.busch@linux.intel.com, ming.lei@redhat.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH V2 3/4] blk-mq: realloc hctx when hw queue is mapped to another node
Date: Fri, 12 Oct 2018 18:07:27 +0800
Message-Id: <1539338848-1789-4-git-send-email-jianchao.w.wang@oracle.com>
In-Reply-To: <1539338848-1789-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1539338848-1789-1-git-send-email-jianchao.w.wang@oracle.com>

When the hw queues and mq_map are updated, a hctx could be mapped to a
different numa node. In that case, we need to realloc the hctx. If that
fails, keep using the previous hctx.

Signed-off-by: Jianchao Wang
---
 block/blk-mq.c | 82 +++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 26 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9805d6a..34f3973 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2504,6 +2504,39 @@ static int blk_mq_hw_ctx_size(struct blk_mq_tag_set *tag_set)
 	return hw_ctx_size;
 }

+static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx(
+		struct blk_mq_tag_set *set, struct request_queue *q,
+		int hctx_idx, int node)
+{
+	struct blk_mq_hw_ctx *hctx;
+
+	hctx = kzalloc_node(blk_mq_hw_ctx_size(set),
+			GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
+			node);
+	if (!hctx)
+		return NULL;
+
+	if (!zalloc_cpumask_var_node(&hctx->cpumask,
+				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
+				node)) {
+		kfree(hctx);
+		return NULL;
+	}
+
+	atomic_set(&hctx->nr_active, 0);
+	hctx->numa_node = node;
+	hctx->queue_num = hctx_idx;
+
+	if (blk_mq_init_hctx(q, set, hctx, hctx_idx)) {
+		free_cpumask_var(hctx->cpumask);
+		kfree(hctx);
+		return NULL;
+	}
+	blk_mq_hctx_kobj_init(hctx);
+
+	return hctx;
+}
+
 static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 						struct request_queue *q)
 {
@@ -2514,37 +2547,34 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	mutex_lock(&q->sysfs_lock);
 	for (i = 0; i < set->nr_hw_queues; i++) {
 		int node;
-
-		if (hctxs[i])
-			continue;
+		struct blk_mq_hw_ctx *hctx;

 		node = blk_mq_hw_queue_to_node(q->mq_map, i);
-		hctxs[i] = kzalloc_node(blk_mq_hw_ctx_size(set),
-				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
-				node);
-		if (!hctxs[i])
-			break;
-
-		if (!zalloc_cpumask_var_node(&hctxs[i]->cpumask,
-					GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
-					node)) {
-			kfree(hctxs[i]);
-			hctxs[i] = NULL;
-			break;
-		}
-
-		atomic_set(&hctxs[i]->nr_active, 0);
-		hctxs[i]->numa_node = node;
-		hctxs[i]->queue_num = i;
+		/*
+		 * If the hw queue has been mapped to another numa node,
+		 * we need to realloc the hctx. If allocation fails, fallback
+		 * to use the previous one.
+		 */
+		if (hctxs[i] && (hctxs[i]->numa_node == node))
+			continue;

-		if (blk_mq_init_hctx(q, set, hctxs[i], i)) {
-			free_cpumask_var(hctxs[i]->cpumask);
-			kfree(hctxs[i]);
-			hctxs[i] = NULL;
-			break;
+		hctx = blk_mq_alloc_and_init_hctx(set, q, i, node);
+		if (hctx) {
+			if (hctxs[i]) {
+				blk_mq_exit_hctx(q, set, hctxs[i], i);
+				kobject_put(&hctxs[i]->kobj);
+			}
+			hctxs[i] = hctx;
+		} else {
+			if (hctxs[i])
+				pr_warn("Allocate new hctx on node %d fails,\
+					fallback to previous one on node %d\n",
+					node, hctxs[i]->numa_node);
+			else
+				break;
 		}
-		blk_mq_hctx_kobj_init(hctxs[i]);
 	}
+
 	for (j = i; j < q->nr_hw_queues; j++) {
 		struct blk_mq_hw_ctx *hctx = hctxs[j];
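
For illustration, the allocate-new/swap-or-keep-old pattern that
blk_mq_alloc_and_init_hctx enables can be modelled with a small standalone
program; struct hctx and the helpers below are toy stand-ins invented for
the sketch, not the kernel structures or API.

/* Toy userspace model of per-node hctx reallocation with fallback. */
#include <stdio.h>
#include <stdlib.h>

struct hctx { int node; };

static struct hctx *alloc_hctx(int node)
{
	struct hctx *h = malloc(sizeof(*h));

	if (h)
		h->node = node;
	return h;
}

static void realloc_for_node(struct hctx **slot, int new_node)
{
	struct hctx *fresh;

	/* Nothing to do if the slot is already on the right node. */
	if (*slot && (*slot)->node == new_node)
		return;

	fresh = alloc_hctx(new_node);
	if (fresh) {
		free(*slot);	/* kernel: blk_mq_exit_hctx() + kobject_put() */
		*slot = fresh;
	} else if (*slot) {
		/* Allocation failed: keep the previous hctx and warn. */
		fprintf(stderr, "keep hctx on node %d, wanted %d\n",
			(*slot)->node, new_node);
	}
}

int main(void)
{
	struct hctx *slot = alloc_hctx(0);

	realloc_for_node(&slot, 1);	/* node changed: swap in a new hctx */
	printf("hctx now on node %d\n", slot ? slot->node : -1);
	free(slot);
	return 0;
}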

From patchwork Fri Oct 12 10:07:28 2018
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10638285
From: Jianchao Wang
To: axboe@kernel.dk
Cc: keith.busch@linux.intel.com, ming.lei@redhat.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH V2 4/4] blk-mq: fallback to previous nr_hw_queues when updating fails
Date: Fri, 12 Oct 2018 18:07:28 +0800
Message-Id: <1539338848-1789-5-git-send-email-jianchao.w.wang@oracle.com>
In-Reply-To: <1539338848-1789-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1539338848-1789-1-git-send-email-jianchao.w.wang@oracle.com>

When we try to increase nr_hw_queues, we may fail due to memory shortage
or other reasons; blk_mq_realloc_hw_ctxs then stops and some entries in
q->queue_hw_ctx are left NULL. However, because the queue map has already
been updated with the new nr_hw_queues, some cpus have been mapped to a hw
queue that just hit the allocation failure, so blk_mq_map_queue could
return NULL. This causes a panic in the following blk_mq_map_swqueue.

To fix it, when increasing nr_hw_queues fails, fall back to the previous
nr_hw_queues and print a warning. In addition, a driver's .map_queues
usually uses completion irq affinity to map hw queues and cpus; after
falling back to the previous nr_hw_queues some cpus would be left without
a hw queue mapping, so use the default blk_mq_map_queues instead.

Reported-by: syzbot+83e8cbe702263932d9d4@syzkaller.appspotmail.com
Signed-off-by: Jianchao Wang
---
 block/blk-mq.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 34f3973..d2ce67a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2540,7 +2540,7 @@ static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx(
 static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 						struct request_queue *q)
 {
-	int i, j;
+	int i, j, end;
 	struct blk_mq_hw_ctx **hctxs = q->queue_hw_ctx;

 	/* protect against switching io scheduler */
@@ -2574,8 +2574,20 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 			break;
 		}
 	}
+	/*
+	 * Increasing nr_hw_queues fails. Free the newly allocated
+	 * hctxs and keep the previous q->nr_hw_queues.
+	 */
+	if (i != set->nr_hw_queues) {
+		j = q->nr_hw_queues;
+		end = i;
+	} else {
+		j = i;
+		end = q->nr_hw_queues;
+		q->nr_hw_queues = set->nr_hw_queues;
+	}

-	for (j = i; j < q->nr_hw_queues; j++) {
+	for (; j < end; j++) {
 		struct blk_mq_hw_ctx *hctx = hctxs[j];

 		if (hctx) {
@@ -2587,7 +2599,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 		}
 	}
-	q->nr_hw_queues = i;
 	mutex_unlock(&q->sysfs_lock);
 }

@@ -2972,6 +2983,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 {
 	struct request_queue *q;
 	LIST_HEAD(head);
+	int prev_nr_hw_queues;

 	lockdep_assert_held(&set->tag_list_lock);

@@ -3000,10 +3012,19 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 		blk_mq_sysfs_unregister(q);
 	}

+	prev_nr_hw_queues = set->nr_hw_queues;
 	set->nr_hw_queues = nr_hw_queues;
 	blk_mq_update_queue_map(set);
+fallback:
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
 		blk_mq_realloc_hw_ctxs(set, q);
+		if (q->nr_hw_queues != set->nr_hw_queues) {
+			pr_warn("Increasing nr_hw_queues to %d fails, fallback to %d\n",
+					nr_hw_queues, prev_nr_hw_queues);
+			set->nr_hw_queues = prev_nr_hw_queues;
+			blk_mq_map_queues(set);
+			goto fallback;
+		}
 		blk_mq_map_swqueue(q);
 	}
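
For illustration, the retry-with-fallback loop added to
__blk_mq_update_nr_hw_queues can be modelled in plain C. This is only a
sketch: realloc_hw_ctxs below is a toy stand-in that pretends only two hw
queues can be allocated, and the function names are invented for the
example, not kernel API.

/* Toy userspace model of "try the new count, fall back to the old one". */
#include <stdio.h>

static int realloc_hw_ctxs(int requested)
{
	/* pretend the system can only back two hw queues */
	return requested <= 2 ? requested : 2;
}

static void update_nr_hw_queues(int *cur, int requested)
{
	int prev = *cur;
	int target = requested;
	int got;

retry:
	got = realloc_hw_ctxs(target);
	if (got != target) {
		fprintf(stderr,
			"increasing nr_hw_queues to %d fails, fallback to %d\n",
			requested, prev);
		target = prev;	/* the kernel also switches to blk_mq_map_queues() */
		goto retry;
	}
	*cur = got;
}

int main(void)
{
	int nr_hw_queues = 1;

	update_nr_hw_queues(&nr_hw_queues, 4);	/* will fall back to 1 */
	printf("nr_hw_queues = %d\n", nr_hw_queues);
	return 0;
}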