From patchwork Thu Sep 27 10:43:03 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10617787
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DF289175A
 for ; Thu, 27 Sep 2018 10:42:19 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CE8F42B083
 for ; Thu, 27 Sep 2018 10:42:19 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
 id C320A2B107; Thu, 27 Sep 2018 10:42:19 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI,
 UNPARSEABLE_RELAY autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5E3662B083
 for ; Thu, 27 Sep 2018 10:42:19 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1727112AbeI0Q74 (ORCPT ); Thu, 27 Sep 2018 12:59:56 -0400
Received: from userp2130.oracle.com ([156.151.31.86]:40926 "EHLO
 userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1726948AbeI0Q74 (ORCPT );
 Thu, 27 Sep 2018 12:59:56 -0400
Received: from pps.filterd (userp2130.oracle.com [127.0.0.1])
 by userp2130.oracle.com (8.16.0.22/8.16.0.22) with SMTP id w8RAdFFc180000;
 Thu, 27 Sep 2018 10:41:14 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to :
 cc : subject : date : message-id : in-reply-to : references;
 s=corp-2018-07-02; bh=HkKuifm6HndyIPAiUM/dsxQmb5jn/HcD2CQ1W0hSYcY=;
 b=Nih9Dj9uVB+5yThlU5f71ZTZwkYnXtcV7R/+r6ls8pEjq6uFlkyMFiZQyiOSr6ZTbg7l
 du/xDG/j0fTHiqI3soBUcV0v0LSi9YdH8OMp01+iyktCSWaJw6Bys1cm1OdmNBVO2bQw
 BYxcocQtKM3JeCDeZbCNaG3lMWe9Wv/U2VMp3DfmF06GX+nOGZ9UMai/xT/YGHrAdMC3
 kWr6WuyJH29cf+pslbSZw+kktyqqVvWMfla1mqPjtJlpcWP6GfXJ2COOoSBdzKJPj/9z
 HmesMd5/qnYyCAquboW7EzhIv/D8r7T0DeCsBPHlYzjdVUnHTWbWXTGuuZl9gSuV2u2B
 3w==
Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71])
 by userp2130.oracle.com with ESMTP id 2mnd5ts9vm-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Thu, 27 Sep 2018 10:41:14 +0000
Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72])
 by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id w8RAfD6r008170
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Thu, 27 Sep 2018 10:41:13 GMT
Received: from abhmp0001.oracle.com (abhmp0001.oracle.com [141.146.116.7])
 by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id w8RAfDlW021352;
 Thu, 27 Sep 2018 10:41:13 GMT
Received: from will-ThinkCentre-M910s.cn.oracle.com (/10.182.70.254)
 by default (Oracle Beehive Gateway v4.0) with ESMTP ;
 Thu, 27 Sep 2018 03:41:12 -0700
From: Jianchao Wang
To: axboe@kernel.dk
Cc: keith.busch@linux.intel.com, ming.lei@redhat.com,
 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] blk-mq: adjust debugfs and sysfs register when updating nr_hw_queues
Date: Thu, 27 Sep 2018 18:43:03 +0800
Message-Id: <1538044984-2147-2-git-send-email-jianchao.w.wang@oracle.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1538044984-2147-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1538044984-2147-1-git-send-email-jianchao.w.wang@oracle.com>
X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9028 signatures=668707
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3
 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0
 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.0.1-1807170000 definitions=main-1809270109
Sender:
linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

blk-mq debugfs and sysfs entries need to be removed before updating
the queue map; otherwise we get wrong results there. This patch fixes
that and removes the redundant debugfs and sysfs register/unregister
operations during __blk_mq_update_nr_hw_queues.

Signed-off-by: Jianchao Wang
Reviewed-by: Ming Lei
---
 block/blk-mq.c | 39 ++++++++++++---------------------------
 1 file changed, 12 insertions(+), 27 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 85a1c1a..6356455 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2137,8 +2137,6 @@ static void blk_mq_exit_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
 		struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
 {
-	blk_mq_debugfs_unregister_hctx(hctx);
-
 	if (blk_mq_hw_queue_mapped(hctx))
 		blk_mq_tag_idle(hctx);
 
@@ -2165,6 +2163,7 @@ static void blk_mq_exit_hw_queues(struct request_queue *q,
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (i == nr_queue)
 			break;
+		blk_mq_debugfs_unregister_hctx(hctx);
 		blk_mq_exit_hctx(q, set, hctx, i);
 	}
 }
@@ -2222,8 +2221,6 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	if (hctx->flags & BLK_MQ_F_BLOCKING)
 		init_srcu_struct(hctx->srcu);
 
-	blk_mq_debugfs_register_hctx(q, hctx);
-
 	return 0;
 
  free_fq:
@@ -2512,8 +2509,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	int i, j;
 	struct blk_mq_hw_ctx **hctxs = q->queue_hw_ctx;
 
-	blk_mq_sysfs_unregister(q);
-
 	/* protect against switching io scheduler */
 	mutex_lock(&q->sysfs_lock);
 	for (i = 0; i < set->nr_hw_queues; i++) {
@@ -2561,7 +2556,6 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	}
 	q->nr_hw_queues = i;
 	mutex_unlock(&q->sysfs_lock);
-	blk_mq_sysfs_register(q);
 }
 
 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
@@ -2659,25 +2653,6 @@ void blk_mq_free_queue(struct request_queue *q)
 	blk_mq_exit_hw_queues(q, set, set->nr_hw_queues);
 }
 
-/* Basically redo blk_mq_init_queue with queue frozen */
-static void blk_mq_queue_reinit(struct request_queue *q)
-{
-	WARN_ON_ONCE(!atomic_read(&q->mq_freeze_depth));
-
-	blk_mq_debugfs_unregister_hctxs(q);
-	blk_mq_sysfs_unregister(q);
-
-	/*
-	 * redo blk_mq_init_cpu_queues and blk_mq_init_hw_queues. FIXME: maybe
-	 * we should change hctx numa_node according to the new topology (this
-	 * involves freeing and re-allocating memory, worth doing?)
-	 */
-	blk_mq_map_swqueue(q);
-
-	blk_mq_sysfs_register(q);
-	blk_mq_debugfs_register_hctxs(q);
-}
-
 static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
 {
 	int i;
@@ -2987,11 +2962,21 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 		if (!blk_mq_elv_switch_none(&head, q))
 			goto switch_back;
 
+	list_for_each_entry(q, &set->tag_list, tag_set_list) {
+		blk_mq_debugfs_unregister_hctxs(q);
+		blk_mq_sysfs_unregister(q);
+	}
+
 	set->nr_hw_queues = nr_hw_queues;
 	blk_mq_update_queue_map(set);
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
 		blk_mq_realloc_hw_ctxs(set, q);
-		blk_mq_queue_reinit(q);
+		blk_mq_map_swqueue(q);
+	}
+
+	list_for_each_entry(q, &set->tag_list, tag_set_list) {
+		blk_mq_sysfs_register(q);
+		blk_mq_debugfs_register_hctxs(q);
 	}
 
 switch_back:

From patchwork Thu Sep 27 10:43:04 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "jianchao.wang"
X-Patchwork-Id: 10617785
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 245F714BD
 for ; Thu, 27 Sep 2018 10:41:28 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 15D032B1B5
 for ; Thu, 27 Sep 2018 10:41:28 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
 id 13EBF2B1C5; Thu, 27 Sep 2018 10:41:28 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI,
 UNPARSEABLE_RELAY autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F00952B1B5
 for ; Thu, 27 Sep 2018 10:41:26 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1727341AbeI0Q66 (ORCPT ); Thu, 27 Sep 2018 12:58:58 -0400
Received: from userp2130.oracle.com ([156.151.31.86]:40218 "EHLO
 userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1726948AbeI0Q66 (ORCPT );
 Thu, 27 Sep 2018 12:58:58 -0400
Received: from pps.filterd (userp2130.oracle.com [127.0.0.1])
 by userp2130.oracle.com (8.16.0.22/8.16.0.22) with SMTP id w8RAdLc5180047;
 Thu, 27 Sep 2018 10:41:16 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to :
 cc : subject : date : message-id : in-reply-to : references;
 s=corp-2018-07-02; bh=5/NiHPpH6FeSskyyYSQt5odXm1q++NojKsiyvpJn4A8=;
 b=V1L3wL9F/xj0pgtRocWmSuVvSBd0PDrYiVPr1TBt8ujm0uLa1++VBXHSdk3b+cKHFAnc
 F45gO5eRiXq1yLtfxExIUuTYN3gcS0ZM9L+qmMy8VWPrDahtOjr8Z0adDwP6D/AlRvbk
 H1brA9o1xZtZ4Psmkaza05khw4wWLicEPWWvGPlYfNSsSp9E5nHaKlW1uEX//vih3CY2
 666UR9dYVgpyoVELNPwKHpTk6b6wMmBqm4/BT7cqwF/0PiVohIL90HKpef9mUG3IvCQ4
 5AbQ13hs3SRiX+dWwodI6x+fJF8HUW591hq03bHZu7a6/3TrvAXNe0uR4DZrob3BszRx
 Zw==
Received: from userv0022.oracle.com (userv0022.oracle.com [156.151.31.74])
 by userp2130.oracle.com with ESMTP id 2mnd5ts9vs-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Thu, 27 Sep 2018 10:41:15 +0000
Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72])
 by userv0022.oracle.com (8.14.4/8.14.4) with ESMTP id w8RAfFJN023187
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Thu, 27 Sep 2018
10:41:15 GMT
Received: from abhmp0001.oracle.com (abhmp0001.oracle.com [141.146.116.7])
 by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id w8RAfFfO021365;
 Thu, 27 Sep 2018 10:41:15 GMT
Received: from will-ThinkCentre-M910s.cn.oracle.com (/10.182.70.254)
 by default (Oracle Beehive Gateway v4.0) with ESMTP ;
 Thu, 27 Sep 2018 03:41:14 -0700
From: Jianchao Wang
To: axboe@kernel.dk
Cc: keith.busch@linux.intel.com, ming.lei@redhat.com,
 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] blk-mq: fallback to previous nr_hw_queues when updating fails
Date: Thu, 27 Sep 2018 18:43:04 +0800
Message-Id: <1538044984-2147-3-git-send-email-jianchao.w.wang@oracle.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1538044984-2147-1-git-send-email-jianchao.w.wang@oracle.com>
References: <1538044984-2147-1-git-send-email-jianchao.w.wang@oracle.com>
X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9028 signatures=668707
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3
 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0
 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.0.1-1807170000 definitions=main-1809270109
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

When we try to increase nr_hw_queues, we may fail due to a shortage of
memory or for some other reason. blk_mq_realloc_hw_ctxs then stops and
some entries in q->queue_hw_ctx are left NULL. However, because the
queue map has already been updated with the new nr_hw_queues, some CPUs
are mapped to a hw queue whose allocation has just failed, so
blk_mq_map_queue can return NULL. This causes a panic in the following
blk_mq_map_swqueue.

To fix it, let blk_mq_realloc_hw_ctxs return false so that
blk_mq_map_swqueue is skipped, and fall back to the previous
nr_hw_queues when increasing nr_hw_queues fails.
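
For illustration only (not part of the patch): the failure mode described
above can be modelled in user space. This is a rough sketch under stated
assumptions, not kernel code. All model_* names, the eight-CPU count and
the round-robin CPU-to-queue mapping are hypothetical stand-ins; the real
map comes from blk_mq_update_queue_map and the driver. Two hw queues
exist, growth to four is requested, and allocation of slots 2 and 3 is
assumed to fail.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS	8	/* hypothetical CPU count, illustration only */
#define MODEL_MAX_HCTX	8	/* hypothetical size of the hctx table */

/* Stand-in for struct blk_mq_hw_ctx; this is not the kernel type. */
struct model_hctx {
	unsigned int idx;
};

/* Simplified CPU -> hw queue index map (round-robin is an assumption). */
static void model_update_queue_map(unsigned int *mq_map,
				   unsigned int nr_hw_queues)
{
	unsigned int cpu;

	assert(nr_hw_queues > 0 && nr_hw_queues <= MODEL_MAX_HCTX);
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		mq_map[cpu] = cpu % nr_hw_queues;
}

/* Count CPUs whose map entry points at an unallocated (NULL) slot. */
static unsigned int model_count_null_lookups(struct model_hctx **hctxs,
					     const unsigned int *mq_map)
{
	unsigned int cpu, n = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (hctxs[mq_map[cpu]] == NULL)
			n++;
	return n;
}

/*
 * Only slots 0 and 1 are populated, as if growing from 2 to 4 hw queues
 * had failed. Returns how many CPUs would look up a NULL hctx when the
 * map was built for nr_in_map hw queues.
 */
static unsigned int model_null_lookups_with(unsigned int nr_in_map)
{
	static struct model_hctx pool[2] = { { 0 }, { 1 } };
	struct model_hctx *hctxs[MODEL_MAX_HCTX] = { &pool[0], &pool[1] };
	unsigned int mq_map[MODEL_NR_CPUS];

	model_update_queue_map(mq_map, nr_in_map);
	return model_count_null_lookups(hctxs, mq_map);
}
```

In this model, model_null_lookups_with(4) counts the CPUs that would hit
a NULL entry while the map still reflects the new queue count, and
model_null_lookups_with(2) shows that rebuilding the map with the
previous count leaves no such CPU.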
Reported-by: syzbot+83e8cbe702263932d9d4@syzkaller.appspotmail.com
Signed-off-by: Jianchao Wang
---
 block/blk-mq.c | 40 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6356455..c867ede 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2503,10 +2503,10 @@ static int blk_mq_hw_ctx_size(struct blk_mq_tag_set *tag_set)
 	return hw_ctx_size;
 }
 
-static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
+static bool blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 						struct request_queue *q)
 {
-	int i, j;
+	int i, j, end;
 	struct blk_mq_hw_ctx **hctxs = q->queue_hw_ctx;
 
 	/* protect against switching io scheduler */
@@ -2542,7 +2542,24 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 		}
 		blk_mq_hctx_kobj_init(hctxs[i]);
 	}
-	for (j = i; j < q->nr_hw_queues; j++) {
+
+	if (i != set->nr_hw_queues) {
+		/*
+		 * Increasing nr_hw_queues fails. Free the newly allocated
+		 * hctxs and keep the previous q->nr_hw_queues.
+		 */
+		j = q->nr_hw_queues;
+		end = i;
+	} else {
+		/*
+		 * If nr_hw_queues is decreased, free the redundant hctxs.
+		 */
+		j = i;
+		end = q->nr_hw_queues;
+		q->nr_hw_queues = set->nr_hw_queues;
+	}
+
+	for (; j < end; j++) {
 		struct blk_mq_hw_ctx *hctx = hctxs[j];
 
 		if (hctx) {
@@ -2554,8 +2571,9 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 
 		}
 	}
-	q->nr_hw_queues = i;
 	mutex_unlock(&q->sysfs_lock);
+
+	return (q->nr_hw_queues == set->nr_hw_queues);
 }
 
 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
@@ -2939,6 +2957,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 {
 	struct request_queue *q;
 	LIST_HEAD(head);
+	int prev_nr_hw_queues;
 
 	lockdep_assert_held(&set->tag_list_lock);
 
@@ -2967,10 +2986,21 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 		blk_mq_sysfs_unregister(q);
 	}
 
+	prev_nr_hw_queues = set->nr_hw_queues;
 	set->nr_hw_queues = nr_hw_queues;
+again:
 	blk_mq_update_queue_map(set);
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
-		blk_mq_realloc_hw_ctxs(set, q);
+		/*
+		 * If increasing nr_hw_queues fail, fallback to previous
+		 * nr_hw_queues.
+		 */
+		if (!blk_mq_realloc_hw_ctxs(set, q)) {
+			pr_warn("updating nr_hw_queues to %d fails, fallback to %d.\n",
+					nr_hw_queues, prev_nr_hw_queues);
+			set->nr_hw_queues = prev_nr_hw_queues;
+			goto again;
+		}
 		blk_mq_map_swqueue(q);
 	}
 
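
For illustration only (not part of the patch): the j/end range selection
in blk_mq_realloc_hw_ctxs and the fallback in
__blk_mq_update_nr_hw_queues can be exercised in user space. This is a
minimal sketch under stated assumptions, not the kernel implementation.
All model_* names are hypothetical, a single queue is modelled instead
of the tag_list walk, malloc/free stand in for hctx setup and teardown,
and fail_at is an invented fault-injection knob.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_HCTX 16	/* hypothetical table size */

struct model_queue {
	void *hctxs[MODEL_MAX_HCTX];	/* stand-in for q->queue_hw_ctx */
	int nr_hw_queues;		/* stand-in for q->nr_hw_queues */
};

/*
 * Model of the reworked realloc step. Allocating slot fail_at or any
 * higher slot is made to fail; pass -1 to never inject a failure.
 * Returns true when the queue ends up with set_nr hw queues.
 */
static bool model_realloc_hw_ctxs(struct model_queue *q, int set_nr,
				  int fail_at)
{
	int i, j, end;

	for (i = 0; i < set_nr; i++) {
		if (q->hctxs[i])
			continue;	/* simplification: keep existing slots */
		if (fail_at >= 0 && i >= fail_at)
			break;		/* injected allocation failure */
		q->hctxs[i] = malloc(1);
		if (!q->hctxs[i])
			break;
	}

	if (i != set_nr) {
		/* Growth failed: free only the slots added in this call. */
		j = q->nr_hw_queues;
		end = i;
	} else {
		/* Success: free surplus slots if the count was reduced. */
		j = i;
		end = q->nr_hw_queues;
		q->nr_hw_queues = set_nr;
	}

	for (; j < end; j++) {
		free(q->hctxs[j]);
		q->hctxs[j] = NULL;
	}

	return q->nr_hw_queues == set_nr;
}

/* Model of the fallback: on failure, redo the step with the old count. */
static int model_update_nr_hw_queues(struct model_queue *q, int *set_nr,
				     int new_nr, int fail_at)
{
	int prev = *set_nr;
	bool ok;

	*set_nr = new_nr;
	if (!model_realloc_hw_ctxs(q, *set_nr, fail_at)) {
		*set_nr = prev;
		/* Every slot below prev still exists, so nothing is allocated. */
		ok = model_realloc_hw_ctxs(q, *set_nr, fail_at);
		assert(ok);
		(void)ok;
	}
	return q->nr_hw_queues;
}
```

Starting from two populated slots, a call such as
model_update_nr_hw_queues(&q, &set_nr, 4, 3) allocates slot 2, fails on
slot 3, frees slot 2 again and leaves both counts at 2, which mirrors
the first branch of the new if/else in the hunk above.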