From patchwork Fri Apr 18 16:36:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057456
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 01/20] block: move blk_mq_add_queue_tag_set() after blk_mq_map_swqueue()
Date: Sat, 19 Apr 2025 00:36:42 +0800
Message-ID: <20250418163708.442085-2-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
Precedence: bulk
X-Mailing-List:
 linux-block@vger.kernel.org

Move blk_mq_add_queue_tag_set() after blk_mq_map_swqueue(), and publish
the request queue to the tagset only after everything is set up.

This is safe because BLK_MQ_F_TAG_QUEUE_SHARED isn't used by
blk_mq_map_swqueue(); the flag is mainly checked in the fast IO code path.

Prepare for removing ->elevator_lock from blk_mq_map_swqueue(), which is
supposed to be called when an elevator switch isn't possible.

Reported-by: Nilay Shroff
Closes: https://lore.kernel.org/linux-block/567cb7ab-23d6-4cee-a915-c8cdac903ddd@linux.ibm.com/
Signed-off-by: Ming Lei
Reviewed-by: Nilay Shroff
---
 block/blk-mq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e0fe12f1320f..7cda919fafba 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4561,8 +4561,8 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	q->nr_requests = set->queue_depth;
 
 	blk_mq_init_cpu_queues(q, set->nr_hw_queues);
-	blk_mq_add_queue_tag_set(set, q);
 	blk_mq_map_swqueue(q);
+	blk_mq_add_queue_tag_set(set, q);
 	return 0;
 
 err_hctxs:

From patchwork Fri Apr 18 16:36:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057457
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 02/20] block: move ELEVATOR_FLAG_DISABLE_WBT as request queue flag
Date: Sat, 19 Apr 2025 00:36:43 +0800
Message-ID: <20250418163708.442085-3-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

ELEVATOR_FLAG_DISABLE_WBT is only used by BFQ to disallow wbt while BFQ is
in use. The flag is set in BFQ's init() and cleared in BFQ's exit().

Make it a request queue flag so that we can avoid dealing with the elevator
switch race. Checking a scheduler flag in wbt_enable_default() isn't
graceful either.
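
Editorial sketch, not part of the patch: a minimal userspace model of why a
queue-level flag sidesteps the switch race. Every model_* name and the
MODEL_* bit numbers are hypothetical stand-ins, and the kernel's atomic bit
helpers are replaced by plain bit arithmetic. The old check has to
dereference q->elevator, which an elevator switch may replace; the new
check reads only the queue's own flag word.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit numbers; the real values depend on the kernel version. */
#define MODEL_QUEUE_FLAG_DISABLE_WBT	0
#define MODEL_ELEVATOR_FLAG_DISABLE_WBT	1

struct model_elevator {
	unsigned long flags;
};

struct model_queue {
	unsigned long queue_flags;
	struct model_elevator *elevator;	/* replaced on an elevator switch */
};

/* Old scheme: the answer lives in whichever elevator is attached right now. */
static bool model_wbt_disabled_old(const struct model_queue *q)
{
	return q->elevator &&
	       (q->elevator->flags & (1UL << MODEL_ELEVATOR_FLAG_DISABLE_WBT));
}

/* New scheme: the answer is read without touching q->elevator at all. */
static bool model_wbt_disabled_new(const struct model_queue *q)
{
	return (q->queue_flags & (1UL << MODEL_QUEUE_FLAG_DISABLE_WBT)) != 0;
}

/* Stand-in for what the scheduler's init path does in the new scheme. */
static void model_flag_set(struct model_queue *q)
{
	q->queue_flags |= 1UL << MODEL_QUEUE_FLAG_DISABLE_WBT;
}

/* Stand-in for what the scheduler's exit path does in the new scheme. */
static void model_flag_clear(struct model_queue *q)
{
	q->queue_flags &= ~(1UL << MODEL_QUEUE_FLAG_DISABLE_WBT);
}
```

In this model the new check stays valid even when no elevator is attached,
which is the property the commit message relies on.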
Signed-off-by: Ming Lei
Reviewed-by: Nilay Shroff
---
 block/bfq-iosched.c    | 4 ++--
 block/blk-mq-debugfs.c | 1 +
 block/blk-wbt.c        | 3 +--
 block/elevator.h       | 1 -
 include/linux/blkdev.h | 3 +++
 5 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index abd80dc13562..40e4106a71e7 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -7210,7 +7210,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
 #endif
 
 	blk_stat_disable_accounting(bfqd->queue);
-	clear_bit(ELEVATOR_FLAG_DISABLE_WBT, &e->flags);
+	blk_queue_flag_clear(QUEUE_FLAG_DISABLE_WBT, bfqd->queue);
 	wbt_enable_default(bfqd->queue->disk);
 
 	kfree(bfqd);
@@ -7397,7 +7397,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	/* We dispatch from request queue wide instead of hw queue */
 	blk_queue_flag_set(QUEUE_FLAG_SQ_SCHED, q);
 
-	set_bit(ELEVATOR_FLAG_DISABLE_WBT, &eq->flags);
+	blk_queue_flag_set(QUEUE_FLAG_DISABLE_WBT, q);
 	wbt_disable_default(q->disk);
 	blk_stat_enable_accounting(q);
 
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 3421b5521fe2..31e249a18407 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -93,6 +93,7 @@ static const char *const blk_queue_flag_name[] = {
 	QUEUE_FLAG_NAME(RQ_ALLOC_TIME),
 	QUEUE_FLAG_NAME(HCTX_ACTIVE),
 	QUEUE_FLAG_NAME(SQ_SCHED),
+	QUEUE_FLAG_NAME(DISABLE_WBT),
 };
 #undef QUEUE_FLAG_NAME
 
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index f1754d07f7e0..29cd2e33666f 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -704,8 +704,7 @@ void wbt_enable_default(struct gendisk *disk)
 	struct rq_qos *rqos;
 	bool enable = IS_ENABLED(CONFIG_BLK_WBT_MQ);
 
-	if (q->elevator &&
-	    test_bit(ELEVATOR_FLAG_DISABLE_WBT, &q->elevator->flags))
+	if (blk_queue_disable_wbt(q))
 		enable = false;
 
 	/* Throttling already enabled? */
diff --git a/block/elevator.h b/block/elevator.h
index e4e44dfac503..e27af5492cdb 100644
--- a/block/elevator.h
+++ b/block/elevator.h
@@ -121,7 +121,6 @@ struct elevator_queue
 };
 
 #define ELEVATOR_FLAG_REGISTERED 0
-#define ELEVATOR_FLAG_DISABLE_WBT 1
 
 /*
  * block elevator interface
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e39c45bc0a97..10410d9b03ad 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -644,6 +644,7 @@ enum {
 	QUEUE_FLAG_RQ_ALLOC_TIME,	/* record rq->alloc_time_ns */
 	QUEUE_FLAG_HCTX_ACTIVE,		/* at least one blk-mq hctx is active */
 	QUEUE_FLAG_SQ_SCHED,		/* single queue style io dispatch */
+	QUEUE_FLAG_DISABLE_WBT,		/* for sched to disable/enable wbt */
 	QUEUE_FLAG_MAX
 };
 
@@ -679,6 +680,8 @@ void blk_queue_flag_clear(unsigned int flag, struct request_queue *q);
 #define blk_queue_sq_sched(q)	test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags)
 #define blk_queue_skip_tagset_quiesce(q) \
 	((q)->limits.features & BLK_FEAT_SKIP_TAGSET_QUIESCE)
+#define blk_queue_disable_wbt(q)	\
+	test_bit(QUEUE_FLAG_DISABLE_WBT, &(q)->queue_flags)
 
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);

From patchwork Fri Apr 18 16:36:44 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057458
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 03/20] block: don't call freeze queue in elevator_switch() and elevator_disable()
Date: Sat, 19 Apr 2025 00:36:44 +0800
Message-ID: <20250418163708.442085-4-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

Both elevator_switch() and elevator_disable() are called only from the
sysfs store and nr_hw_queues update code paths. In both paths the queue
has already been frozen, so don't freeze the queue again in these two
functions.
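
Editorial sketch, not part of the patch: a small userspace model of the
contract this change introduces, where the caller owns the freeze and the
callee only checks for it. All model_* names are hypothetical, and the
`warned` field merely stands in for WARN_ON_ONCE() firing.

```c
#include <assert.h>

struct model_queue {
	int mq_freeze_depth;	/* > 0 while some caller holds the queue frozen */
	int warned;		/* set where the kernel would WARN_ON_ONCE() */
	const char *elevator;	/* name of the attached scheduler */
};

static void model_freeze(struct model_queue *q)
{
	q->mq_freeze_depth++;
}

static void model_unfreeze(struct model_queue *q)
{
	q->mq_freeze_depth--;
}

/* Callee: no longer freezes, only verifies that the caller already did. */
static void model_elevator_switch(struct model_queue *q, const char *name)
{
	if (q->mq_freeze_depth == 0)
		q->warned = 1;
	q->elevator = name;
}

/* Caller, e.g. a sysfs store path: freeze, switch, unfreeze. */
static void model_store(struct model_queue *q, const char *name)
{
	model_freeze(q);
	model_elevator_switch(q, name);
	model_unfreeze(q);
}
```

Calling model_elevator_switch() directly, without the surrounding freeze,
trips the check in this model; that is the misuse the added assertion is
meant to catch.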
Reviewed-by: Nilay Shroff
Signed-off-by: Ming Lei
Reviewed-by: Yu Kuai
---
 block/elevator.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/block/elevator.c b/block/elevator.c
index b4d08026b02c..5051a98dc08c 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -615,12 +615,11 @@ void elevator_init_mq(struct request_queue *q)
  */
 int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
 {
-	unsigned int memflags;
 	int ret;
 
+	WARN_ON_ONCE(q->mq_freeze_depth == 0);
 	lockdep_assert_held(&q->elevator_lock);
 
-	memflags = blk_mq_freeze_queue(q);
 	blk_mq_quiesce_queue(q);
 
 	if (q->elevator) {
@@ -641,7 +640,6 @@ int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
 
 out_unfreeze:
 	blk_mq_unquiesce_queue(q);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	if (ret) {
 		pr_warn("elv: switch to \"%s\" failed, falling back to \"none\"\n",
@@ -653,11 +651,9 @@ int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
 
 void elevator_disable(struct request_queue *q)
 {
-	unsigned int memflags;
-
+	WARN_ON_ONCE(q->mq_freeze_depth == 0);
 	lockdep_assert_held(&q->elevator_lock);
 
-	memflags = blk_mq_freeze_queue(q);
 	blk_mq_quiesce_queue(q);
 
 	elv_unregister_queue(q);
@@ -668,7 +664,6 @@ void elevator_disable(struct request_queue *q)
 	blk_add_trace_msg(q, "elv switch: none");
 
 	blk_mq_unquiesce_queue(q);
-	blk_mq_unfreeze_queue(q, memflags);
 }
 
 /*

From patchwork Fri Apr 18 16:36:45 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057459
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 04/20] block: add two helpers for registering/un-registering sched debugfs
Date: Sat, 19 Apr 2025 00:36:45 +0800
Message-ID: <20250418163708.442085-5-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

Add blk_mq_sched_reg_debugfs()/blk_mq_sched_unreg_debugfs() to clean up
the sched init/exit code a bit.

The order in which the debugfs entries for sched and sched_hctx are
registered and unregistered changes slightly, but this is safe because
sched and sched_hctx are guaranteed to be ready when they are exported via
debugfs.
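
Editorial sketch, not part of the patch: a userspace model of the ordering
described above, under the assumption that exporting happens only after
every hctx is initialised and unexporting happens before any teardown. All
model_* names are hypothetical, and locking is omitted because the model is
single threaded.

```c
#include <assert.h>

#define MODEL_NR_HCTX 3

struct model_queue {
	int hctx_ready[MODEL_NR_HCTX];	/* per-hctx scheduler data initialised */
	int debugfs_exported;		/* sched and sched_hctx entries visible */
};

static void model_sched_reg_debugfs(struct model_queue *q)
{
	q->debugfs_exported = 1;
}

static void model_sched_unreg_debugfs(struct model_queue *q)
{
	q->debugfs_exported = 0;
}

/* fail_at < 0 means every hctx initialises successfully. */
static int model_init_sched(struct model_queue *q, int fail_at)
{
	for (int i = 0; i < MODEL_NR_HCTX; i++) {
		if (i == fail_at)
			return -1;	/* nothing has been exported yet */
		q->hctx_ready[i] = 1;
	}
	model_sched_reg_debugfs(q);	/* export once everything is ready */
	return 0;
}

static void model_exit_sched(struct model_queue *q)
{
	model_sched_unreg_debugfs(q);	/* unexport before tearing down */
	for (int i = 0; i < MODEL_NR_HCTX; i++)
		q->hctx_ready[i] = 0;
}
```

In this model a debugfs reader can never observe a half-initialised or
half-torn-down scheduler, which is the safety argument the message makes.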
Signed-off-by: Ming Lei
Reviewed-by: Yu Kuai
Reviewed-by: Nilay Shroff
---
 block/blk-mq-sched.c | 45 +++++++++++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 9b81771774ef..2abc5e0704e8 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -434,6 +434,30 @@ static int blk_mq_init_sched_shared_tags(struct request_queue *queue)
 	return 0;
 }
 
+static void blk_mq_sched_reg_debugfs(struct request_queue *q)
+{
+	struct blk_mq_hw_ctx *hctx;
+	unsigned long i;
+
+	mutex_lock(&q->debugfs_mutex);
+	blk_mq_debugfs_register_sched(q);
+	queue_for_each_hw_ctx(q, hctx, i)
+		blk_mq_debugfs_register_sched_hctx(q, hctx);
+	mutex_unlock(&q->debugfs_mutex);
+}
+
+static void blk_mq_sched_unreg_debugfs(struct request_queue *q)
+{
+	struct blk_mq_hw_ctx *hctx;
+	unsigned long i;
+
+	mutex_lock(&q->debugfs_mutex);
+	queue_for_each_hw_ctx(q, hctx, i)
+		blk_mq_debugfs_unregister_sched_hctx(hctx);
+	blk_mq_debugfs_unregister_sched(q);
+	mutex_unlock(&q->debugfs_mutex);
+}
+
 /* caller must have a reference to @e, will grab another one if successful */
 int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 {
@@ -467,10 +491,6 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 	if (ret)
 		goto err_free_map_and_rqs;
 
-	mutex_lock(&q->debugfs_mutex);
-	blk_mq_debugfs_register_sched(q);
-	mutex_unlock(&q->debugfs_mutex);
-
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (e->ops.init_hctx) {
 			ret = e->ops.init_hctx(hctx, i);
@@ -482,11 +502,11 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 				return ret;
 			}
 		}
-		mutex_lock(&q->debugfs_mutex);
-		blk_mq_debugfs_register_sched_hctx(q, hctx);
-		mutex_unlock(&q->debugfs_mutex);
 	}
 
+	/* sched is initialized, it is ready to export it via debugfs */
+	blk_mq_sched_reg_debugfs(q);
+
 	return 0;
 
 err_free_map_and_rqs:
@@ -524,11 +544,10 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e)
 	unsigned long i;
 	unsigned int flags = 0;
 
-	queue_for_each_hw_ctx(q, hctx, i) {
-		mutex_lock(&q->debugfs_mutex);
-		blk_mq_debugfs_unregister_sched_hctx(hctx);
-		mutex_unlock(&q->debugfs_mutex);
+	/* unexport via debugfs before exiting sched */
+	blk_mq_sched_unreg_debugfs(q);
 
+	queue_for_each_hw_ctx(q, hctx, i) {
 		if (e->type->ops.exit_hctx && hctx->sched_data) {
 			e->type->ops.exit_hctx(hctx, i);
 			hctx->sched_data = NULL;
@@ -536,10 +555,6 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e)
 		flags = hctx->flags;
 	}
 
-	mutex_lock(&q->debugfs_mutex);
-	blk_mq_debugfs_unregister_sched(q);
-	mutex_unlock(&q->debugfs_mutex);
-
 	if (e->type->ops.exit_sched)
 		e->type->ops.exit_sched(e);
 	blk_mq_sched_tags_teardown(q, flags);

From patchwork Fri Apr 18 16:36:46 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057460
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 05/20] block: move sched debugfs register into elvevator_register_queue
Date: Sat, 19 Apr 2025 00:36:46 +0800
Message-ID: <20250418163708.442085-6-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

The sched debugfs entries share the same lifetime as the scheduler's
kobject, and the same lock (the elevator lock), so move sched debugfs
registration and unregistration into elv_register_queue() and
elv_unregister_queue().

With that, blk_mq_debugfs_register() no longer needs to register sched
debugfs for us.
Signed-off-by: Ming Lei
Reviewed-by: Yu Kuai
Reviewed-by: Nilay Shroff
---
 block/blk-mq-debugfs.c | 11 -----------
 block/blk-mq-sched.c   | 11 ++---------
 block/elevator.c       |  8 ++++++++
 block/elevator.h       |  3 +++
 4 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 31e249a18407..0fa0f8836b7d 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -625,20 +625,9 @@ void blk_mq_debugfs_register(struct request_queue *q)
 
 	debugfs_create_files(q->debugfs_dir, q, blk_mq_debugfs_queue_attrs);
 
-	/*
-	 * blk_mq_init_sched() attempted to do this already, but q->debugfs_dir
-	 * didn't exist yet (because we don't know what to name the directory
-	 * until the queue is registered to a gendisk).
-	 */
-	if (q->elevator && !q->sched_debugfs_dir)
-		blk_mq_debugfs_register_sched(q);
-
-	/* Similarly, blk_mq_init_hctx() couldn't do this previously. */
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (!hctx->debugfs_dir)
 			blk_mq_debugfs_register_hctx(q, hctx);
-		if (q->elevator && !hctx->sched_debugfs_dir)
-			blk_mq_debugfs_register_sched_hctx(q, hctx);
 	}
 
 	if (q->rq_qos) {
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 2abc5e0704e8..336a15ffecfa 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -434,7 +434,7 @@ static int blk_mq_init_sched_shared_tags(struct request_queue *queue)
 	return 0;
 }
 
-static void blk_mq_sched_reg_debugfs(struct request_queue *q)
+void blk_mq_sched_reg_debugfs(struct request_queue *q)
 {
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;
@@ -446,7 +446,7 @@ static void blk_mq_sched_reg_debugfs(struct request_queue *q)
 	mutex_unlock(&q->debugfs_mutex);
 }
 
-static void blk_mq_sched_unreg_debugfs(struct request_queue *q)
+void blk_mq_sched_unreg_debugfs(struct request_queue *q)
 {
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;
@@ -503,10 +503,6 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 			}
 		}
 	}
-
-	/* sched is initialized, it is ready to export it via debugfs */
- blk_mq_sched_reg_debugfs(q); - return 0; err_free_map_and_rqs: @@ -544,9 +540,6 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e) unsigned long i; unsigned int flags = 0; - /* unexport via debugfs before exiting sched */ - blk_mq_sched_unreg_debugfs(q); - queue_for_each_hw_ctx(q, hctx, i) { if (e->type->ops.exit_hctx && hctx->sched_data) { e->type->ops.exit_hctx(hctx, i); diff --git a/block/elevator.c b/block/elevator.c index 5051a98dc08c..d25b9cc6c509 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -472,6 +472,11 @@ int elv_register_queue(struct request_queue *q, bool uevent) if (uevent) kobject_uevent(&e->kobj, KOBJ_ADD); + /* + * Sched is initialized, it is ready to export it via + * debugfs + */ + blk_mq_sched_reg_debugfs(q); set_bit(ELEVATOR_FLAG_REGISTERED, &e->flags); } return error; @@ -486,6 +491,9 @@ void elv_unregister_queue(struct request_queue *q) if (e && test_and_clear_bit(ELEVATOR_FLAG_REGISTERED, &e->flags)) { kobject_uevent(&e->kobj, KOBJ_REMOVE); kobject_del(&e->kobj); + + /* unexport via debugfs before exiting sched */ + blk_mq_sched_unreg_debugfs(q); } } diff --git a/block/elevator.h b/block/elevator.h index e27af5492cdb..9198676644a9 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -181,4 +181,7 @@ extern struct request *elv_rb_find(struct rb_root *, sector_t); #define rq_entry_fifo(ptr) list_entry((ptr), struct request, queuelist) #define rq_fifo_clear(rq) list_del_init(&(rq)->queuelist) +void blk_mq_sched_reg_debugfs(struct request_queue *q); +void blk_mq_sched_unreg_debugfs(struct request_queue *q); + #endif /* _ELEVATOR_H */ From patchwork Fri Apr 18 16:36:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 14057461 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 06/20] block: add & pass 'struct gendisk_data' for retrying add/del disk in updating nr_hw_queues
Date: Sat, 19 Apr 2025 00:36:47 +0800
Message-ID: <20250418163708.442085-7-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>

For blk-mq, the add/del gendisk code is actually a reader of
'nr_hw_queues':

- debugfs / hctx sysfs registration

- scheduler setup, since ->sched_tags depends on hctx, which relies on
  'nr_hw_queues'

Add 'struct gendisk_data' and pass it to the add/del disk helpers, in
preparation for retrying add/del disk while an nr_hw_queues update is in
progress.

Signed-off-by: Ming Lei
---
 block/genhd.c | 105 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 67 insertions(+), 38 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index c2bd86cd09de..4370c5be1f34 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -33,6 +33,13 @@
 #include "blk-rq-qos.h"
 #include "blk-cgroup.h"
 
+struct gendisk_data {
+	struct gendisk *disk;
+	struct device *parent;
+	const struct attribute_group **groups;
+	struct fwnode_handle *fwnode;
+};
+
 static struct kobject *block_depr;
 
 /*
@@ -389,21 +396,9 @@ int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
 	return ret;
 }
 
-/**
- * add_disk_fwnode - add disk information to kernel list with fwnode
- * @parent: parent device for the disk
- * @disk: per-device partitioning information
- * @groups: Additional per-device sysfs groups
- * @fwnode: attached disk fwnode
- *
- * This function registers the partitioning information in @disk
- * with the kernel. Also attach a fwnode to the disk device.
- */
-int __must_check add_disk_fwnode(struct device *parent, struct gendisk *disk,
-				 const struct attribute_group **groups,
-				 struct fwnode_handle *fwnode)
-
+static int __add_disk_fwnode(struct gendisk_data *data)
 {
+	struct gendisk *disk = data->disk;
 	struct device *ddev = disk_to_dev(disk);
 	int ret;
 
@@ -463,11 +458,11 @@ int __must_check add_disk_fwnode(struct device *parent, struct gendisk *disk,
 	/* delay uevents, until we scanned partition table */
 	dev_set_uevent_suppress(ddev, 1);
 
-	ddev->parent = parent;
-	ddev->groups = groups;
+	ddev->parent = data->parent;
+	ddev->groups = data->groups;
 	dev_set_name(ddev, "%s", disk->disk_name);
-	if (fwnode)
-		device_set_node(ddev, fwnode);
+	if (data->fwnode)
+		device_set_node(ddev, data->fwnode);
 	if (!(disk->flags & GENHD_FL_HIDDEN))
 		ddev->devt = MKDEV(disk->major, disk->first_minor);
 	ret = device_add(ddev);
@@ -572,6 +567,30 @@ int __must_check add_disk_fwnode(struct device *parent, struct gendisk *disk,
 	}
 	return ret;
 }
+
+/**
+ * add_disk_fwnode - add disk information to kernel list with fwnode
+ * @parent: parent device for the disk
+ * @disk: per-device partitioning information
+ * @groups: Additional per-device sysfs groups
+ * @fwnode: attached disk fwnode
+ *
+ * This function registers the partitioning information in @disk
+ * with the kernel. Also attach a fwnode to the disk device.
+ */
+int __must_check add_disk_fwnode(struct device *parent, struct gendisk *disk,
+				 const struct attribute_group **groups,
+				 struct fwnode_handle *fwnode)
+{
+	struct gendisk_data data = {
+		.disk = disk,
+		.parent = parent,
+		.groups = groups,
+		.fwnode = fwnode,
+	};
+
+	return __add_disk_fwnode(&data);
+}
 EXPORT_SYMBOL_GPL(add_disk_fwnode);
 
 /**
@@ -652,27 +671,9 @@ void blk_mark_disk_dead(struct gendisk *disk)
 }
 EXPORT_SYMBOL_GPL(blk_mark_disk_dead);
 
-/**
- * del_gendisk - remove the gendisk
- * @disk: the struct gendisk to remove
- *
- * Removes the gendisk and all its associated resources. This deletes the
- * partitions associated with the gendisk, and unregisters the associated
- * request_queue.
- *
- * This is the counter to the respective __device_add_disk() call.
- *
- * The final removal of the struct gendisk happens when its refcount reaches 0
- * with put_disk(), which should be called after del_gendisk(), if
- * __device_add_disk() was used.
- *
- * Drivers exist which depend on the release of the gendisk to be synchronous,
- * it should not be deferred.
- *
- * Context: can sleep
- */
-void del_gendisk(struct gendisk *disk)
+static void __del_gendisk(struct gendisk_data *data)
 {
+	struct gendisk *disk = data->disk;
 	struct request_queue *q = disk->queue;
 	struct block_device *part;
 	unsigned long idx;
@@ -766,6 +767,34 @@ void del_gendisk(struct gendisk *disk)
 }
 EXPORT_SYMBOL(del_gendisk);
 
+/**
+ * del_gendisk - remove the gendisk
+ * @disk: the struct gendisk to remove
+ *
+ * Removes the gendisk and all its associated resources. This deletes the
+ * partitions associated with the gendisk, and unregisters the associated
+ * request_queue.
+ *
+ * This is the counter to the respective __device_add_disk() call.
+ *
+ * The final removal of the struct gendisk happens when its refcount reaches 0
+ * with put_disk(), which should be called after del_gendisk(), if
+ * __device_add_disk() was used.
+ *
+ * Drivers exist which depend on the release of the gendisk to be synchronous,
+ * it should not be deferred.
+ *
+ * Context: can sleep
+ */
+void del_gendisk(struct gendisk *disk)
+{
+	struct gendisk_data data = {
+		.disk = disk,
+	};
+
+	__del_gendisk(&data);
+}
+
 /**
  * invalidate_disk - invalidate the disk
  * @disk: the struct gendisk to invalidate

From patchwork Fri Apr 18 16:36:48 2025
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057462
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 07/20] block: prevent adding/deleting disk during updating nr_hw_queues
Date: Sat, 19 Apr 2025 00:36:48 +0800
Message-ID: <20250418163708.442085-8-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>

Both the disk-adding and disk-deleting code are readers of `nr_hw_queues`,
so neither may be in progress while nr_hw_queues is being updated; a kernel
panic and a KASAN report have been reported in [1].

Prevent adding/deleting a disk during an nr_hw_queues update by setting
set->updating_nr_hwq, and use SRCU so that add/delete disk backs off and
retries once the update has finished. This approach avoids a lot of trouble.

Reported-by: Nilay Shroff
Closes: https://lore.kernel.org/linux-block/a5896cdb-a59a-4a37-9f99-20522f5d2987@linux.ibm.com/
Signed-off-by: Ming Lei
---
 block/blk-mq.c         | 22 +++++++++++++++++++++-
 block/genhd.c          | 36 ++++++++++++++++++++++++++++++++----
 include/linux/blk-mq.h |  5 +++++
 3 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7cda919fafba..e1662617cc7a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4782,12 +4782,18 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
 			goto out_free_srcu;
 	}
 
+	mutex_init(&set->update_nr_hwq_lock);
+	init_waitqueue_head(&set->update_nr_hwq_wq);
+	ret = init_srcu_struct(&set->update_nr_hwq_srcu);
+	if (ret)
+		goto out_cleanup_srcu;
+
 	ret = -ENOMEM;
 	set->tags = kcalloc_node(set->nr_hw_queues,
 				 sizeof(struct blk_mq_tags *), GFP_KERNEL,
 				 set->numa_node);
 	if (!set->tags)
-		goto out_cleanup_srcu;
+		goto out_cleanup_hwq_srcu;
 
 	for (i = 0; i < set->nr_maps; i++) {
 		set->map[i].mq_map = kcalloc_node(nr_cpu_ids,
@@ -4816,6 +4822,8 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
 	}
 	kfree(set->tags);
 	set->tags = NULL;
+out_cleanup_hwq_srcu:
+	cleanup_srcu_struct(&set->update_nr_hwq_srcu);
 out_cleanup_srcu:
 	if (set->flags & BLK_MQ_F_BLOCKING)
 		cleanup_srcu_struct(set->srcu);
@@ -5077,9 +5085,21 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 
 void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues)
 {
+	mutex_lock(&set->update_nr_hwq_lock);
+	/*
+	 * Mark us in updating nr_hw_queues for preventing reader of
+	 * nr_hw_queues, such as adding/deleting disk.
+	 */
+	set->updating_nr_hwq = true;
+	synchronize_srcu(&set->update_nr_hwq_srcu);
+
 	mutex_lock(&set->tag_list_lock);
 	__blk_mq_update_nr_hw_queues(set, nr_hw_queues);
 	mutex_unlock(&set->tag_list_lock);
+
+	set->updating_nr_hwq = false;
+	wake_up_all(&set->update_nr_hwq_wq);
+	mutex_unlock(&set->update_nr_hwq_lock);
 }
 EXPORT_SYMBOL_GPL(blk_mq_update_nr_hw_queues);
 
diff --git a/block/genhd.c b/block/genhd.c
index 4370c5be1f34..d22fdc0d5383 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -396,6 +396,33 @@ int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
 	return ret;
 }
 
+static int retry_on_updating_nr_hwq(struct gendisk_data *data,
+				    int (*cb)(struct gendisk_data *data))
+{
+	struct gendisk *disk = data->disk;
+	struct blk_mq_tag_set *set;
+
+	if (!queue_is_mq(disk->queue))
+		return cb(data);
+
+	set = disk->queue->tag_set;
+	do {
+		int idx, ret;
+
+		idx = srcu_read_lock(&set->update_nr_hwq_srcu);
+		if (set->updating_nr_hwq) {
+			srcu_read_unlock(&set->update_nr_hwq_srcu, idx);
+			goto wait;
+		}
+		ret = cb(data);
+		srcu_read_unlock(&set->update_nr_hwq_srcu, idx);
+		return ret;
+ wait:
+		wait_event_interruptible(set->update_nr_hwq_wq,
+				!set->updating_nr_hwq);
+	} while (true);
+}
+
 static int __add_disk_fwnode(struct gendisk_data *data)
 {
 	struct gendisk *disk = data->disk;
@@ -589,7 +616,7 @@ int __must_check add_disk_fwnode(struct device *parent, struct gendisk *disk,
 		.fwnode = fwnode,
 	};
 
-	return __add_disk_fwnode(&data);
+	return retry_on_updating_nr_hwq(&data, __add_disk_fwnode);
 }
 EXPORT_SYMBOL_GPL(add_disk_fwnode);
 
@@ -671,7 +698,7 @@ void blk_mark_disk_dead(struct gendisk *disk)
 }
 EXPORT_SYMBOL_GPL(blk_mark_disk_dead);
 
-static void __del_gendisk(struct gendisk_data *data)
+static int __del_gendisk(struct gendisk_data *data)
 {
 	struct gendisk *disk = data->disk;
 	struct request_queue *q = disk->queue;
@@ -682,7 +709,7 @@ static void __del_gendisk(struct gendisk_data *data)
 	might_sleep();
 
 	if (WARN_ON_ONCE(!disk_live(disk) && !(disk->flags & GENHD_FL_HIDDEN)))
-		return;
+		return 0;
 
 	disk_del_events(disk);
 
@@ -764,6 +791,7 @@ static void __del_gendisk(struct gendisk_data *data)
 
 	if (start_drain)
 		blk_unfreeze_release_lock(q);
+	return 0;
 }
 EXPORT_SYMBOL(del_gendisk);
 
@@ -792,7 +820,7 @@ void del_gendisk(struct gendisk *disk)
 		.disk = disk,
 	};
 
-	__del_gendisk(&data);
+	retry_on_updating_nr_hwq(&data, __del_gendisk);
 }
 
 /**
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 8eb9b3310167..afe76dcfaa3c 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -527,6 +527,11 @@ struct blk_mq_tag_set {
 	struct mutex		tag_list_lock;
 	struct list_head	tag_list;
 	struct srcu_struct	*srcu;
+
+	bool			updating_nr_hwq;
+	struct mutex		update_nr_hwq_lock;
+	struct srcu_struct	update_nr_hwq_srcu;
+	wait_queue_head_t	update_nr_hwq_wq;
 };
 
 /**

From patchwork Fri Apr 18 16:36:49 2025
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057463
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 08/20] block: don't allow to switch elevator if updating nr_hw_queues is in-progress
Date: Sat, 19 Apr 2025 00:36:49 +0800
Message-ID: <20250418163708.442085-9-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>

The elevator switch code is another `nr_hw_queues` reader in a non-fast-IO
code path, so it must not run while `nr_hw_queues` is being updated.

Take the same approach as for add/del disk: hold the SRCU read lock and
check set->updating_nr_hwq, failing the switch with -EBUSY if an update is
in progress.
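
The reader-side check described above can be modelled in userspace. What follows is a minimal sketch under stated assumptions, not the kernel implementation: `struct tag_set_sketch` and the `sketch_*` helpers are hypothetical names, and a plain atomic reader counter stands in for SRCU, so the writer busy-waits where the kernel code calls synchronize_srcu().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields this series adds to blk_mq_tag_set. */
struct tag_set_sketch {
	atomic_bool updating_nr_hwq;	/* true while the writer is updating */
	atomic_int readers;		/* crude replacement for the SRCU read side */
	int nr_hw_queues;
};

void sketch_init(struct tag_set_sketch *set, int nr_hw_queues)
{
	atomic_init(&set->updating_nr_hwq, false);
	atomic_init(&set->readers, 0);
	set->nr_hw_queues = nr_hw_queues;
}

/* Writer: publish the flag first, then wait for in-flight readers to leave. */
void sketch_update_begin(struct tag_set_sketch *set)
{
	atomic_store(&set->updating_nr_hwq, true);
	while (atomic_load(&set->readers) != 0)
		;	/* assumption: spinning replaces synchronize_srcu() */
}

/* Writer: change the value, then let readers in again. */
void sketch_update_end(struct tag_set_sketch *set, int nr_hw_queues)
{
	set->nr_hw_queues = nr_hw_queues;
	atomic_store(&set->updating_nr_hwq, false);
}

/* Reader: enter the read side, then bail out with -EBUSY if an update runs. */
int sketch_read_nr_hwq(struct tag_set_sketch *set, int *out)
{
	int ret = 0;

	assert(out);
	atomic_fetch_add(&set->readers, 1);
	if (atomic_load(&set->updating_nr_hwq))
		ret = -EBUSY;
	else
		*out = set->nr_hw_queues;
	atomic_fetch_sub(&set->readers, 1);
	return ret;
}
```

The ordering is the point of the sketch: the writer sets the flag before waiting for readers, and a reader registers itself before testing the flag, so a reader either sees the flag and backs off or is seen by the writer and waited for.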
Reported-by: Shinichiro Kawasaki
Closes: https://lore.kernel.org/linux-block/mz4t4tlwiqjijw3zvqnjb7ovvvaegkqganegmmlc567tt5xj67@xal5ro544cnc/
Signed-off-by: Ming Lei
---
 block/elevator.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/block/elevator.c b/block/elevator.c
index d25b9cc6c509..c23912652f96 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -720,9 +720,10 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf,
 {
 	char elevator_name[ELV_NAME_MAX];
 	char *name;
-	int ret;
+	int ret, idx;
 	unsigned int memflags;
 	struct request_queue *q = disk->queue;
+	struct blk_mq_tag_set *set = q->tag_set;
 
 	/*
 	 * If the attribute needs to load a module, do it before freezing the
@@ -734,6 +735,12 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf,
 
 	elv_iosched_load_module(name);
 
+	idx = srcu_read_lock(&set->update_nr_hwq_srcu);
+	if (set->updating_nr_hwq) {
+		ret = -EBUSY;
+		goto exit;
+	}
+
 	memflags = blk_mq_freeze_queue(q);
 	mutex_lock(&q->elevator_lock);
 	ret = elevator_change(q, name);
@@ -741,6 +748,8 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf,
 		ret = count;
 	mutex_unlock(&q->elevator_lock);
 	blk_mq_unfreeze_queue(q, memflags);
+exit:
+	srcu_read_unlock(&set->update_nr_hwq_srcu, idx);
 	return ret;
 }
 

From patchwork Fri Apr 18 16:36:50 2025
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057464
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 09/20] block: simplify elevator rebuild for updating nr_hw_queues
Date: Sat, 19 Apr 2025 00:36:50 +0800
Message-ID: <20250418163708.442085-10-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>

blk_mq_update_nr_hw_queues() changes nr_hw_queues, and the elevator data
depends on it, so the elevator has to be rebuilt.

Since elevator switching is no longer allowed during
blk_mq_update_nr_hw_queues(), we can simply call __elevator_change() with
force set to rebuild the elevator sched tags after nr_hw_queues is updated.
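
Why the rebuild has to be forced can be shown with a small userspace model. This is a sketch under stated assumptions only: `struct queue_sketch` and `sketch_elevator_change()` are hypothetical and merely mirror the `force` early-return logic that the diff gives __elevator_change(); no real scheduler state is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a queue whose scheduler data depends on nr_hw_queues. */
struct queue_sketch {
	char elevator[16];	/* name of the current scheduler */
	int nr_hw_queues;	/* current number of hardware queues */
	int sched_tags_for;	/* nr_hw_queues the scheduler data was built for */
	int rebuilds;		/* how many times the scheduler data was rebuilt */
};

/*
 * Without force, asking for the scheduler that is already active is a no-op.
 * With force, the scheduler data is rebuilt even though the name is unchanged.
 */
int sketch_elevator_change(struct queue_sketch *q, const char *name, bool force)
{
	assert(q && name);

	if (!force && strcmp(q->elevator, name) == 0)
		return 0;

	/* Copy only when the name differs, which also avoids a self-copy. */
	if (strcmp(q->elevator, name) != 0)
		snprintf(q->elevator, sizeof(q->elevator), "%s", name);

	q->sched_tags_for = q->nr_hw_queues;
	q->rebuilds++;
	return 0;
}
```

After nr_hw_queues changes, a non-forced change to the same scheduler name would leave the stale per-hctx data in place, which is the case the `force` argument exists to cover.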
Reviewed-by: Christoph Hellwig Signed-off-by: Ming Lei Reviewed-by: Nilay Shroff --- block/blk-mq.c | 103 ++++++----------------------------------------- block/blk.h | 4 +- block/elevator.c | 12 +++--- 3 files changed, 22 insertions(+), 97 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index e1662617cc7a..0f4a5e674874 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4929,88 +4929,10 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr) return ret; } -/* - * request_queue and elevator_type pair. - * It is just used by __blk_mq_update_nr_hw_queues to cache - * the elevator_type associated with a request_queue. - */ -struct blk_mq_qe_pair { - struct list_head node; - struct request_queue *q; - struct elevator_type *type; -}; - -/* - * Cache the elevator_type in qe pair list and switch the - * io scheduler to 'none' - */ -static bool blk_mq_elv_switch_none(struct list_head *head, - struct request_queue *q) -{ - struct blk_mq_qe_pair *qe; - - qe = kmalloc(sizeof(*qe), GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY); - if (!qe) - return false; - - /* Accessing q->elevator needs protection from ->elevator_lock. 
*/ - mutex_lock(&q->elevator_lock); - - if (!q->elevator) { - kfree(qe); - goto unlock; - } - - INIT_LIST_HEAD(&qe->node); - qe->q = q; - qe->type = q->elevator->type; - /* keep a reference to the elevator module as we'll switch back */ - __elevator_get(qe->type); - list_add(&qe->node, head); - elevator_disable(q); -unlock: - mutex_unlock(&q->elevator_lock); - - return true; -} - -static struct blk_mq_qe_pair *blk_lookup_qe_pair(struct list_head *head, - struct request_queue *q) -{ - struct blk_mq_qe_pair *qe; - - list_for_each_entry(qe, head, node) - if (qe->q == q) - return qe; - - return NULL; -} - -static void blk_mq_elv_switch_back(struct list_head *head, - struct request_queue *q) -{ - struct blk_mq_qe_pair *qe; - struct elevator_type *t; - - qe = blk_lookup_qe_pair(head, q); - if (!qe) - return; - t = qe->type; - list_del(&qe->node); - kfree(qe); - - mutex_lock(&q->elevator_lock); - elevator_switch(q, t); - /* drop the reference acquired in blk_mq_elv_switch_none */ - elevator_put(t); - mutex_unlock(&q->elevator_lock); -} - static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues) { struct request_queue *q; - LIST_HEAD(head); int prev_nr_hw_queues = set->nr_hw_queues; unsigned int memflags; int i; @@ -5028,15 +4950,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, list_for_each_entry(q, &set->tag_list, tag_set_list) blk_mq_freeze_queue_nomemsave(q); - /* - * Switch IO scheduler to 'none', cleaning up the data associated - * with the previous scheduler. We will switch back once we are done - * updating the new sw to hw queue mappings. 
- */ - list_for_each_entry(q, &set->tag_list, tag_set_list) - if (!blk_mq_elv_switch_none(&head, q)) - goto switch_back; - list_for_each_entry(q, &set->tag_list, tag_set_list) { blk_mq_debugfs_unregister_hctxs(q); blk_mq_sysfs_unregister_hctxs(q); @@ -5070,9 +4983,19 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, blk_mq_debugfs_register_hctxs(q); } -switch_back: - list_for_each_entry(q, &set->tag_list, tag_set_list) - blk_mq_elv_switch_back(&head, q); + list_for_each_entry(q, &set->tag_list, tag_set_list) { + const char *name = "none"; + + mutex_lock(&q->elevator_lock); + if (q->elevator && !blk_queue_dying(q)) + name = q->elevator->type->elevator_name; + /* + * nr_hw_queues is changed and elevator data depends on + * it, so we have to force to rebuild elevator + */ + __elevator_change(q, name, true); + mutex_unlock(&q->elevator_lock); + } list_for_each_entry(q, &set->tag_list, tag_set_list) blk_mq_unfreeze_queue_nomemrestore(q); diff --git a/block/blk.h b/block/blk.h index 006e3be433d2..0c3cc1af2525 100644 --- a/block/blk.h +++ b/block/blk.h @@ -319,8 +319,8 @@ bool blk_bio_list_merge(struct request_queue *q, struct list_head *list, bool blk_insert_flush(struct request *rq); -int elevator_switch(struct request_queue *q, struct elevator_type *new_e); -void elevator_disable(struct request_queue *q); +int __elevator_change(struct request_queue *q, const char *elevator_name, + bool force); void elevator_exit(struct request_queue *q); int elv_register_queue(struct request_queue *q, bool uevent); void elv_unregister_queue(struct request_queue *q); diff --git a/block/elevator.c b/block/elevator.c index c23912652f96..f4c02a6c045d 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -621,7 +621,7 @@ void elevator_init_mq(struct request_queue *q) * If switching fails, we are most likely running out of memory and not able * to restore the old io scheduler, so leaving the io scheduler being none. 
*/ -int elevator_switch(struct request_queue *q, struct elevator_type *new_e) +static int elevator_switch(struct request_queue *q, struct elevator_type *new_e) { int ret; @@ -657,7 +657,7 @@ int elevator_switch(struct request_queue *q, struct elevator_type *new_e) return ret; } -void elevator_disable(struct request_queue *q) +static void elevator_disable(struct request_queue *q) { WARN_ON_ONCE(q->mq_freeze_depth == 0); lockdep_assert_held(&q->elevator_lock); @@ -677,7 +677,8 @@ void elevator_disable(struct request_queue *q) /* * Switch this queue to the given IO scheduler. */ -static int elevator_change(struct request_queue *q, const char *elevator_name) +int __elevator_change(struct request_queue *q, const char *elevator_name, + bool force) { struct elevator_type *e; int ret; @@ -692,7 +693,8 @@ static int elevator_change(struct request_queue *q, const char *elevator_name) return 0; } - if (q->elevator && elevator_match(q->elevator->type, elevator_name)) + if (!force && q->elevator && + elevator_match(q->elevator->type, elevator_name)) return 0; e = elevator_find_get(elevator_name); @@ -743,7 +745,7 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf, memflags = blk_mq_freeze_queue(q); mutex_lock(&q->elevator_lock); - ret = elevator_change(q, name); + ret = __elevator_change(q, name, false); if (!ret) ret = count; mutex_unlock(&q->elevator_lock);

From patchwork Fri Apr 18 16:36:51 2025
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057465
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 10/20] block: add helper of elevator_change()
Date: Sat, 19 Apr 2025 00:36:51 +0800
Message-ID: <20250418163708.442085-11-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

Add elevator_change() to simplify elv_iosched_store() a bit. The new helper will be used to unify all scheduler changes.
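The control flow that the new helper factors out of elv_iosched_store() can be sketched in userspace roughly as follows. This is only an illustration, not the kernel code: `struct mock_queue`, `mock_do_change()` and `mock_elevator_change()` are hypothetical names, the freeze and the lock are modeled as plain fields, and the SRCU read-side section around the `updating_nr_hwq` check is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for a request queue; not kernel code. */
struct mock_queue {
	bool updating_nr_hwq;	/* models set->updating_nr_hwq */
	int freeze_depth;	/* models the freeze/unfreeze pairing */
	bool locked;		/* models q->elevator_lock being held */
	char sched[16];		/* name of the current scheduler */
};

/* Models __elevator_change(): the caller must have frozen and locked the queue. */
static int mock_do_change(struct mock_queue *q, const char *name)
{
	assert(q->freeze_depth > 0 && q->locked);
	if (strlen(name) >= sizeof(q->sched))
		return -EINVAL;
	strcpy(q->sched, name);
	return 0;
}

/*
 * Models elevator_change(): refuse with -EBUSY while nr_hw_queues is being
 * updated, otherwise freeze -> lock -> change -> unlock -> unfreeze.
 */
static int mock_elevator_change(struct mock_queue *q, const char *name)
{
	int ret;

	if (q->updating_nr_hwq)
		return -EBUSY;

	q->freeze_depth++;
	q->locked = true;
	ret = mock_do_change(q, name);
	q->locked = false;
	q->freeze_depth--;
	return ret;
}
```

With the sequence in one place, a caller such as the sysfs store path only has to parse the name and map a zero return to `count`.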
Signed-off-by: Ming Lei --- block/elevator.c | 44 +++++++++++++++++++++++++++----------------- 1 file changed, 27 insertions(+), 17 deletions(-) diff --git a/block/elevator.c b/block/elevator.c index f4c02a6c045d..6bf3871c7164 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -53,6 +53,8 @@ static LIST_HEAD(elv_list); */ #define rq_hash_key(rq) (blk_rq_pos(rq) + blk_rq_sectors(rq)) +static int elevator_change(struct request_queue *q, const char *name); + /* * Query io scheduler to see if the current process issuing bio may be * merged with rq. @@ -705,6 +707,28 @@ int __elevator_change(struct request_queue *q, const char *elevator_name, return ret; } +static int elevator_change(struct request_queue *q, const char *name) +{ + struct blk_mq_tag_set *set = q->tag_set; + unsigned int memflags; + int ret, idx; + + idx = srcu_read_lock(&set->update_nr_hwq_srcu); + if (set->updating_nr_hwq) { + ret = -EBUSY; + goto exit; + } + + memflags = blk_mq_freeze_queue(q); + mutex_lock(&q->elevator_lock); + ret = __elevator_change(q, name, false); + mutex_unlock(&q->elevator_lock); + blk_mq_unfreeze_queue(q, memflags); +exit: + srcu_read_unlock(&set->update_nr_hwq_srcu, idx); + return ret; +} + static void elv_iosched_load_module(char *elevator_name) { struct elevator_type *found; @@ -720,12 +744,10 @@ static void elv_iosched_load_module(char *elevator_name) ssize_t elv_iosched_store(struct gendisk *disk, const char *buf, size_t count) { + struct request_queue *q = disk->queue; char elevator_name[ELV_NAME_MAX]; char *name; - int ret, idx; - unsigned int memflags; - struct request_queue *q = disk->queue; - struct blk_mq_tag_set *set = q->tag_set; + int ret; /* * If the attribute needs to load a module, do it before freezing the @@ -737,21 +759,9 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf, elv_iosched_load_module(name); - idx = srcu_read_lock(&set->update_nr_hwq_srcu); - if (set->updating_nr_hwq) { - ret = -EBUSY; - goto exit; - } - - memflags = 
blk_mq_freeze_queue(q); - mutex_lock(&q->elevator_lock); - ret = __elevator_change(q, name, false); + ret = elevator_change(q, name); if (!ret) ret = count; - mutex_unlock(&q->elevator_lock); - blk_mq_unfreeze_queue(q, memflags); -exit: - srcu_read_unlock(&set->update_nr_hwq_srcu, idx); return ret; }

From patchwork Fri Apr 18 16:36:52 2025
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057466
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 11/20] block: move blk_unregister_queue() & device_del() after freeze wait
Date: Sat, 19 Apr 2025 00:36:52 +0800
Message-ID: <20250418163708.442085-12-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

Move blk_unregister_queue() & device_del() after the freeze wait, in preparation for unifying elevator switch.

This is safe: the bdev has been unhashed at the beginning of del_gendisk(), and both blk_unregister_queue() & device_del() deal with kobject & debugfs things only.

Signed-off-by: Ming Lei
--- block/genhd.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/genhd.c b/block/genhd.c index d22fdc0d5383..86c3db5b9305 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -749,8 +749,6 @@ static int __del_gendisk(struct gendisk_data *data) bdi_unregister(disk->bdi); } - blk_unregister_queue(disk); - kobject_put(disk->part0->bd_holder_dir); kobject_put(disk->slave_dir); disk->slave_dir = NULL; @@ -759,10 +757,12 @@ static int __del_gendisk(struct gendisk_data *data) disk->part0->bd_stamp = 0; sysfs_remove_link(block_depr, dev_name(disk_to_dev(disk))); pm_runtime_set_memalloc_noio(disk_to_dev(disk), false); - device_del(disk_to_dev(disk)); blk_mq_freeze_queue_wait(q); + blk_unregister_queue(disk); + device_del(disk_to_dev(disk)); + blk_throtl_cancel_bios(disk); blk_sync_queue(q);

From patchwork Fri Apr 18 16:36:53 2025
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057467
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 12/20] block: add `struct elv_change_ctx` for unifying elevator_change
Date: Sat, 19 Apr 2025 00:36:53 +0800
Message-ID: <20250418163708.442085-13-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

Add `struct elv_change_ctx` in preparation for unifying elevator_change(). With this context, any input & output parameter can be provided & observed in the top-level caller.

It also helps to move kobject & debugfs things out of ->elevator_lock & the queue freeze.
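As a rough illustration of the context-struct pattern described above, the following userspace sketch bundles the request parameters into one struct so the change routine's signature can stay stable as fields are added. The three fields mirror the `name`, `force` and `uevent` members introduced by this patch; everything else (`struct mock_change_ctx`, `struct mock_sched_queue`, `mock_change_with_ctx()`, the counters) is hypothetical and only stands in for the real queue and elevator handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same fields as the patch's struct elv_change_ctx; renamed to mark it as a mock. */
struct mock_change_ctx {
	const char *name;	/* requested scheduler name */
	bool force;		/* rebuild even if the scheduler is unchanged */
	bool uevent;		/* emit a notification after registering */
};

/* Hypothetical queue state, for illustration only. */
struct mock_sched_queue {
	const char *sched;	/* current scheduler name */
	int rebuilds;		/* how many times the scheduler was (re)built */
	int uevents;		/* how many notifications were emitted */
};

/* All inputs arrive through ctx, so new knobs need no signature change. */
static int mock_change_with_ctx(struct mock_sched_queue *q,
				const struct mock_change_ctx *ctx)
{
	assert(q && ctx && ctx->name);

	/* Same scheduler requested and no forced rebuild: nothing to do. */
	if (!ctx->force && strcmp(q->sched, ctx->name) == 0)
		return 0;

	q->sched = ctx->name;
	q->rebuilds++;
	if (ctx->uevent)
		q->uevents++;
	return 0;
}
```

A caller that only wants the default behavior sets `name` and leaves the other fields zero-initialized, much like the designated initializers used in the diff below.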
Signed-off-by: Ming Lei --- block/blk-mq.c | 16 ++++++++++------ block/blk.h | 4 ++-- block/elevator.c | 33 +++++++++++++++++++-------------- block/elevator.h | 7 +++++++ 4 files changed, 38 insertions(+), 22 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 0f4a5e674874..1a287c2e791c 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4984,16 +4984,20 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, } list_for_each_entry(q, &set->tag_list, tag_set_list) { - const char *name = "none"; - - mutex_lock(&q->elevator_lock); - if (q->elevator && !blk_queue_dying(q)) - name = q->elevator->type->elevator_name; /* * nr_hw_queues is changed and elevator data depends on * it, so we have to force to rebuild elevator */ - __elevator_change(q, name, true); + struct elv_change_ctx ctx = { + .name = "none", + .force = true, + .uevent = true, + }; + + mutex_lock(&q->elevator_lock); + if (q->elevator && !blk_queue_dying(q)) + ctx.name = q->elevator->type->elevator_name; + __elevator_change(q, &ctx); mutex_unlock(&q->elevator_lock); } diff --git a/block/blk.h b/block/blk.h index 0c3cc1af2525..be01cb9f3910 100644 --- a/block/blk.h +++ b/block/blk.h @@ -12,6 +12,7 @@ #include "blk-crypto-internal.h" struct elevator_type; +struct elv_change_ctx; #define BLK_DEV_MAX_SECTORS (LLONG_MAX >> 9) #define BLK_MIN_SEGMENT_SIZE 4096 @@ -319,8 +320,7 @@ bool blk_bio_list_merge(struct request_queue *q, struct list_head *list, bool blk_insert_flush(struct request *rq); -int __elevator_change(struct request_queue *q, const char *elevator_name, - bool force); +int __elevator_change(struct request_queue *q, struct elv_change_ctx *ctx); void elevator_exit(struct request_queue *q); int elv_register_queue(struct request_queue *q, bool uevent); void elv_unregister_queue(struct request_queue *q); diff --git a/block/elevator.c b/block/elevator.c index 6bf3871c7164..836138fc148a 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -53,7 +53,8 @@ static 
LIST_HEAD(elv_list); */ #define rq_hash_key(rq) (blk_rq_pos(rq) + blk_rq_sectors(rq)) -static int elevator_change(struct request_queue *q, const char *name); +static int elevator_change(struct request_queue *q, + struct elv_change_ctx *ctx); /* * Query io scheduler to see if the current process issuing bio may be @@ -623,7 +624,8 @@ void elevator_init_mq(struct request_queue *q) * If switching fails, we are most likely running out of memory and not able * to restore the old io scheduler, so leaving the io scheduler being none. */ -static int elevator_switch(struct request_queue *q, struct elevator_type *new_e) +static int elevator_switch(struct request_queue *q, struct elevator_type *new_e, + struct elv_change_ctx *ctx) { int ret; @@ -641,7 +643,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e) if (ret) goto out_unfreeze; - ret = elv_register_queue(q, true); + ret = elv_register_queue(q, ctx->uevent); if (ret) { elevator_exit(q); goto out_unfreeze; @@ -679,9 +681,9 @@ static void elevator_disable(struct request_queue *q) /* * Switch this queue to the given IO scheduler. 
*/ -int __elevator_change(struct request_queue *q, const char *elevator_name, - bool force) +int __elevator_change(struct request_queue *q, struct elv_change_ctx *ctx) { + const char *elevator_name = ctx->name; struct elevator_type *e; int ret; @@ -695,19 +697,20 @@ int __elevator_change(struct request_queue *q, const char *elevator_name, return 0; } - if (!force && q->elevator && + if (!ctx->force && q->elevator && elevator_match(q->elevator->type, elevator_name)) return 0; e = elevator_find_get(elevator_name); if (!e) return -EINVAL; - ret = elevator_switch(q, e); + ret = elevator_switch(q, e, ctx); elevator_put(e); return ret; } -static int elevator_change(struct request_queue *q, const char *name) +static int elevator_change(struct request_queue *q, + struct elv_change_ctx *ctx) { struct blk_mq_tag_set *set = q->tag_set; unsigned int memflags; @@ -721,7 +724,7 @@ static int elevator_change(struct request_queue *q, const char *name) memflags = blk_mq_freeze_queue(q); mutex_lock(&q->elevator_lock); - ret = __elevator_change(q, name, false); + ret = __elevator_change(q, ctx); mutex_unlock(&q->elevator_lock); blk_mq_unfreeze_queue(q, memflags); exit: @@ -729,7 +732,7 @@ static int elevator_change(struct request_queue *q, const char *name) return ret; } -static void elv_iosched_load_module(char *elevator_name) +static void elv_iosched_load_module(const char *elevator_name) { struct elevator_type *found; @@ -746,7 +749,9 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf, { struct request_queue *q = disk->queue; char elevator_name[ELV_NAME_MAX]; - char *name; + struct elv_change_ctx ctx = { + .uevent = true, + }; int ret; /* @@ -755,11 +760,11 @@ ssize_t elv_iosched_store(struct gendisk *disk, const char *buf, * queue is the one for the device storing the module file. 
*/ strscpy(elevator_name, buf, sizeof(elevator_name)); - name = strstrip(elevator_name); + ctx.name = strstrip(elevator_name); - elv_iosched_load_module(name); + elv_iosched_load_module(ctx.name); - ret = elevator_change(q, name); + ret = elevator_change(q, &ctx); if (!ret) ret = count; return ret; diff --git a/block/elevator.h b/block/elevator.h index 9198676644a9..63fc4cad16cc 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -122,6 +122,13 @@ struct elevator_queue #define ELEVATOR_FLAG_REGISTERED 0 +/* Holding context data for changing elevator */ +struct elv_change_ctx { + const char *name; + bool force; + bool uevent; +}; + /* * block elevator interface */

From patchwork Fri Apr 18 16:36:54 2025
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057468
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Nilay Shroff, Shinichiro Kawasaki, Thomas Hellström, Christoph Hellwig, Ming Lei
Subject: [PATCH V2 13/20] block: unifying elevator change
Date: Sat, 19 Apr 2025 00:36:54 +0800
Message-ID: <20250418163708.442085-14-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

Elevator change is one well-defined behavior:

- tear down the current elevator if it exists

- set up the new elevator

It is supposed to cover every case of changing the elevator through a single internal API, typically the following cases:

- set up the default elevator in add_disk()

- switch to none in del_gendisk()

- reset the elevator in blk_mq_update_nr_hw_queues()

- switch the elevator via the sysfs `store` elevator attribute

This patch uses elevator_change() to cover all of the above cases:

- every elevator switch is serialized with the others: add_disk(), del_gendisk() and storing the elevator attribute are serialized already, and blk_mq_update_nr_hw_queues() uses SRCU to sync with the other three cases

- for both add_disk() and del_gendisk(), the queue freeze works in atomic mode or the queue has already been frozen, so the freeze in elevator_change() won't add extra delay

- a `struct elv_change_ctx` instance holds all the info for changing the elevator

Signed-off-by: Ming Lei
--- block/blk-sysfs.c | 18 ++++------- block/blk.h | 5 ++- block/elevator.c | 81 ++++++++++++++++++++++++++--------------------- block/elevator.h | 1 + block/genhd.c | 19 +---------- 5 files changed, 55 insertions(+), 69 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index a2882751f0d2..58c50709bc14 100644 --- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c @@ -869,14 +869,8 @@ int blk_register_queue(struct gendisk *disk) if (ret) goto out_unregister_ia_ranges; + elevator_set_default(q); mutex_lock(&q->elevator_lock); - if (q->elevator) { - ret = elv_register_queue(q, false); - if (ret) { - mutex_unlock(&q->elevator_lock); - goto out_crypto_sysfs_unregister; - } - } wbt_enable_default(disk); mutex_unlock(&q->elevator_lock); @@ -902,8 +896,6 @@ int blk_register_queue(struct gendisk *disk) return ret; -out_crypto_sysfs_unregister: - blk_crypto_sysfs_unregister(disk); out_unregister_ia_ranges: disk_unregister_independent_access_ranges(disk); out_debugfs_remove: @@ -949,9 +941,11 @@ void blk_unregister_queue(struct gendisk *disk) blk_mq_sysfs_unregister(disk); blk_crypto_sysfs_unregister(disk); - mutex_lock(&q->elevator_lock); - elv_unregister_queue(q); - mutex_unlock(&q->elevator_lock); + if (q->elevator) { + blk_mq_quiesce_queue(q); + elevator_set_none(q); + blk_mq_unquiesce_queue(q); + } mutex_lock(&q->sysfs_lock); disk_unregister_independent_access_ranges(disk); diff --git a/block/blk.h b/block/blk.h index be01cb9f3910..0e19c09009ed 100644 --- a/block/blk.h +++ b/block/blk.h @@ -321,9 +321,8 @@ bool blk_bio_list_merge(struct request_queue *q, struct list_head *list, bool blk_insert_flush(struct request *rq); int __elevator_change(struct request_queue *q, struct elv_change_ctx *ctx); -void elevator_exit(struct request_queue *q); -int elv_register_queue(struct request_queue *q, bool uevent); -void elv_unregister_queue(struct request_queue *q); +void elevator_set_default(struct request_queue *q); +void elevator_set_none(struct request_queue *q); ssize_t part_size_show(struct device *dev, struct device_attribute *attr, char *buf); diff --git a/block/elevator.c b/block/elevator.c index 836138fc148a..936d8ec9e9f0 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -151,7 +151,7 @@ static void elevator_release(struct kobject *kobj) kfree(e); } -void elevator_exit(struct request_queue *q) +static 
void elevator_exit(struct request_queue *q) { struct elevator_queue *e = q->elevator; @@ -455,7 +455,7 @@ static const struct kobj_type elv_ktype = { .release = elevator_release, }; -int elv_register_queue(struct request_queue *q, bool uevent) +static int elv_register_queue(struct request_queue *q, bool uevent) { struct elevator_queue *e = q->elevator; int error; @@ -485,7 +485,7 @@ int elv_register_queue(struct request_queue *q, bool uevent) return error; } -void elv_unregister_queue(struct request_queue *q) +static void elv_unregister_queue(struct request_queue *q) { struct elevator_queue *e = q->elevator; @@ -562,60 +562,59 @@ EXPORT_SYMBOL_GPL(elv_unregister); * For single queue devices, default to using mq-deadline. If we have multiple * queues or mq-deadline is not available, default to "none". */ -static struct elevator_type *elevator_get_default(struct request_queue *q) +static bool use_default_elevator(struct request_queue *q) { if (q->tag_set->flags & BLK_MQ_F_NO_SCHED_BY_DEFAULT) - return NULL; + return false; if (q->nr_hw_queues != 1 && !blk_mq_is_shared_tags(q->tag_set->flags)) - return NULL; + return false; - return elevator_find_get("mq-deadline"); + return true; } /* * Use the default elevator settings. If the chosen elevator initialization * fails, fall back to the "none" elevator (no elevator). */ -void elevator_init_mq(struct request_queue *q) +void elevator_set_default(struct request_queue *q) { - struct elevator_type *e; - unsigned int memflags; + struct elv_change_ctx ctx = { + .init = true, + }; int err; - WARN_ON_ONCE(blk_queue_registered(q)); - - if (unlikely(q->elevator)) + if (!queue_is_mq(q)) return; - e = elevator_get_default(q); - if (!e) + ctx.name = use_default_elevator(q) ? 
"mq-deadline" : "none"; + if (!q->elevator && !strcmp(ctx.name, "none")) return; + err = elevator_change(q, &ctx); + if (err < 0) + pr_warn("\"%s\" set elevator failed %d, " + "falling back to \"none\"\n", ctx.name, err); +} - /* - * We are called before adding disk, when there isn't any FS I/O, - * so freezing queue plus canceling dispatch work is enough to - * drain any dispatch activities originated from passthrough - * requests, then no need to quiesce queue which may add long boot - * latency, especially when lots of disks are involved. - * - * Disk isn't added yet, so verifying queue lock only manually. - */ - memflags = blk_mq_freeze_queue(q); - - blk_mq_cancel_work_sync(q); - - err = blk_mq_init_sched(q, e); +void elevator_set_none(struct request_queue *q) +{ + struct elv_change_ctx ctx = { + .name = "none", + .uevent = true, + .init = true, + }; + int err; - blk_mq_unfreeze_queue(q, memflags); + if (!queue_is_mq(q)) + return; - if (err) { - pr_warn("\"%s\" elevator initialization failed, " - "falling back to \"none\"\n", e->elevator_name); - } + if (!q->elevator) + return; - elevator_put(e); + err = elevator_change(q, &ctx); + if (err < 0) + pr_warn("%s: set none elevator failed %d\n", __func__, err); } /* @@ -688,7 +687,7 @@ int __elevator_change(struct request_queue *q, struct elv_change_ctx *ctx) int ret; /* Make sure queue is not in the middle of being removed */ - if (!blk_queue_registered(q)) + if (!ctx->init && !blk_queue_registered(q)) return -ENOENT; if (!strncmp(elevator_name, "none", 4)) { @@ -723,6 +722,16 @@ static int elevator_change(struct request_queue *q, } memflags = blk_mq_freeze_queue(q); + /* + * May be called before adding disk, when there isn't any FS I/O, + * so freezing queue plus canceling dispatch work is enough to + * drain any dispatch activities originated from passthrough + * requests, then no need to quiesce queue which may add long boot + * latency, especially when lots of disks are involved. 
+ * + * Disk isn't added yet, so verifying queue lock only manually. + */ + blk_mq_cancel_work_sync(q); mutex_lock(&q->elevator_lock); ret = __elevator_change(q, ctx); mutex_unlock(&q->elevator_lock); diff --git a/block/elevator.h b/block/elevator.h index 63fc4cad16cc..e74e27dd6586 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -127,6 +127,7 @@ struct elv_change_ctx { const char *name; bool force; bool uevent; + bool init; }; /* diff --git a/block/genhd.c b/block/genhd.c index 86c3db5b9305..de227aa923ed 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -438,12 +438,6 @@ static int __add_disk_fwnode(struct gendisk_data *data) */ if (disk->fops->submit_bio || disk->fops->poll_bio) return -EINVAL; - - /* - * Initialize the I/O scheduler code and pick a default one if - * needed. - */ - elevator_init_mq(disk->queue); } else { if (!disk->fops->submit_bio) return -EINVAL; @@ -587,11 +581,7 @@ static int __add_disk_fwnode(struct gendisk_data *data) if (disk->major == BLOCK_EXT_MAJOR) blk_free_ext_minor(disk->first_minor); out_exit_elevator: - if (disk->queue->elevator) { - mutex_lock(&disk->queue->elevator_lock); - elevator_exit(disk->queue); - mutex_unlock(&disk->queue->elevator_lock); - } + elevator_set_none(disk->queue); return ret; } @@ -771,14 +761,7 @@ static int __del_gendisk(struct gendisk_data *data) if (queue_is_mq(q)) blk_mq_cancel_work_sync(q); - blk_mq_quiesce_queue(q); - if (q->elevator) { - mutex_lock(&q->elevator_lock); - elevator_exit(q); - mutex_unlock(&q->elevator_lock); - } rq_qos_exit(q); - blk_mq_unquiesce_queue(q); /* * If the disk does not own the queue, allow using passthrough requests From patchwork Fri Apr 18 16:36:55 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 14057469 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 
bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4BFE8213E89 for ; Fri, 18 Apr 2025 16:38:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744994306; cv=none; b=qxKUeLPnfiK0dg/3V6fKO0vZLyCqMyaS/lQi9OPd5FGopHsEklO7GkIEvpsehgxxKMmz7ob53HuHWaEsIqt43MyxXdEIuUxFpexOYzXCC7s4D7guPkGHKwdTmmDU/DTfyuDMSwIqGEGb2wUWvjTbkUxDLsjuUpaIqwFpUuTt1nU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744994306; c=relaxed/simple; bh=tRsr3l9dcZhoFdeNn4LxqKqEONS/HFl0QW/N7boD7hY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JG+MgpmEPX0S3Oh4wQGQ8Lo20SQ2xQLH1nk8AagUi/yN+M/0tZ0yaFm8fFFwl6+cOVLhHeeA06y1QT0YbkRKPMwOom5lvppLbJR66Ia33PulMdkuB5e+upj0mSxdgxfpMjFO4Yatp+OmxVp37LV3oOZLsqyGnQLZLqcmO/AFV3o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=UagYH5Uk; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UagYH5Uk" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1744994303; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+q9Sx7hfZ0zwaTfmmHHuVub0hFr8kTt3KPnLwaYFg+g=; b=UagYH5Uk6maXY5UkAJXugvspKDNoU4iseYcYwmWHZ7k2mxS3nP0L6Yp39CDdRsHDemm3vT 
jGS/fW+zYAwsLtZpH4fB3xM+CwmQ7ZC26Wsete4l3TRh1SOsaIe5DoYBNZ+VYzFqSEmmh5 FNSPIdyRt+bXq06z70MIVWi0qdl0dfg= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-616-iGt2HzBVMrKGillM1KKOug-1; Fri, 18 Apr 2025 12:38:20 -0400 X-MC-Unique: iGt2HzBVMrKGillM1KKOug-1 X-Mimecast-MFC-AGG-ID: iGt2HzBVMrKGillM1KKOug_1744994299 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id E9A9019560A7; Fri, 18 Apr 2025 16:38:18 +0000 (UTC) Received: from localhost (unknown [10.72.116.50]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C1B11180045C; Fri, 18 Apr 2025 16:38:17 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org Cc: Nilay Shroff , Shinichiro Kawasaki , =?utf-8?q?Thomas_Hellstr?= =?utf-8?q?=C3=B6m?= , Christoph Hellwig , Ming Lei Subject: [PATCH V2 14/20] block: pass elevator_queue to elv_register_queue & unregister_queue Date: Sat, 19 Apr 2025 00:36:55 +0800 Message-ID: <20250418163708.442085-15-ming.lei@redhat.com> In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com> References: <20250418163708.442085-1-ming.lei@redhat.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Pass elevator_queue reference to elv_register_queue() & elv_unregister_queue(). 
No functional change. This prepares for moving the two functions out of the elevator lock and the queue-frozen section: the old and new elevator queues will then be stored in a `struct elv_change_ctx` instance and can co-exist for a short while, so the specific elevator_queue instance has to be passed to elv_register_queue() & elv_unregister_queue(). Signed-off-by: Ming Lei Reviewed-by: Nilay Shroff --- block/elevator.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/block/elevator.c b/block/elevator.c index 936d8ec9e9f0..568457e01d28 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -455,9 +455,10 @@ static const struct kobj_type elv_ktype = { .release = elevator_release, }; -static int elv_register_queue(struct request_queue *q, bool uevent) +static int elv_register_queue(struct request_queue *q, + struct elevator_queue *e, + bool uevent) { - struct elevator_queue *e = q->elevator; int error; lockdep_assert_held(&q->elevator_lock); @@ -485,10 +486,9 @@ static int elv_register_queue(struct request_queue *q, bool uevent) return error; } -static void elv_unregister_queue(struct request_queue *q) +static void elv_unregister_queue(struct request_queue *q, + struct elevator_queue *e) { - struct elevator_queue *e = q->elevator; - lockdep_assert_held(&q->elevator_lock); if (e && test_and_clear_bit(ELEVATOR_FLAG_REGISTERED, &e->flags)) { @@ -634,7 +634,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e, blk_mq_quiesce_queue(q); if (q->elevator) { - elv_unregister_queue(q); + elv_unregister_queue(q, q->elevator); elevator_exit(q); } @@ -642,7 +642,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e, if (ret) goto out_unfreeze; - ret = elv_register_queue(q, ctx->uevent); + ret = elv_register_queue(q, q->elevator, ctx->uevent); if (ret) { elevator_exit(q); goto out_unfreeze; @@ -667,7 +667,7 @@ static void elevator_disable(struct request_queue *q) blk_mq_quiesce_queue(q); - elv_unregister_queue(q);
+ elv_unregister_queue(q, q->elevator); elevator_exit(q); blk_queue_flag_clear(QUEUE_FLAG_SQ_SCHED, q); q->elevator = NULL; From patchwork Fri Apr 18 16:36:56 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 14057470 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E27E7462 for ; Fri, 18 Apr 2025 16:38:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744994312; cv=none; b=ltyxeX+tFgdGoZdm0aWH4/mWIvvKxI1bgqbmOUiBfPDTZKKNhbWWU6ErFfgfSt59Uk0WUSuToQiKV9CUFDZPhtss7n5gwHg9LLwevW3RmQuIEnpfe04TKjzeHHL4oBx3FToxlqbZAxZMQ49TfmC1mpXv4ybFjBB63NThVOHiXHg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744994312; c=relaxed/simple; bh=q2jCv4fj42tMGcSPxWIaka9P/+dS4gyig+acpdMrlSc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GkOEsXhbX+6fcYoeaZh98Kn4eGht9vWPVF+5oiNH/L3rz0jlmGojUVTOZXICZMwrFexMrKgOLrpGOtdb5IIwITgAmGeThboihH9/08FhojEHEHmjLfmQFPW5hagQH1r5oARrZghCwQQNVUXd+XEzkpwz+r1qaFLPg5DIWsRgz3U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=UA82JnEk; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UA82JnEk" 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1744994309; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5wkWQXgWjr4tyJneJ8rrzbQKv/kR8kzwxaaf7vr4i4g=; b=UA82JnEk3/KE92aZGMkjibpEJTBIUh1KHEMvZ05uIZgDtPBC5hd1XZQNVgKcMeVXwQv0Dp wXlBzYEELue0AOh3km6YcnNDXZVudpptJxM5ynNYKsJZ2aEkfPGq4u1n9iIh2vG998387d JPvd4RPyWD2o6tIQqvamk3T1fCLNqxk= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-168-Src3OM0DOsG-FLZxIrKiPg-1; Fri, 18 Apr 2025 12:38:24 -0400 X-MC-Unique: Src3OM0DOsG-FLZxIrKiPg-1 X-Mimecast-MFC-AGG-ID: Src3OM0DOsG-FLZxIrKiPg_1744994303 Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 11D891956086; Fri, 18 Apr 2025 16:38:23 +0000 (UTC) Received: from localhost (unknown [10.72.116.50]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 05D8119560A3; Fri, 18 Apr 2025 16:38:21 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org Cc: Nilay Shroff , Shinichiro Kawasaki , =?utf-8?q?Thomas_Hellstr?= =?utf-8?q?=C3=B6m?= , Christoph Hellwig , Ming Lei Subject: [PATCH V2 15/20] block: fail to show/store elevator sysfs attribute if elevator is dying Date: Sat, 19 Apr 2025 00:36:56 +0800 Message-ID: <20250418163708.442085-16-ming.lei@redhat.com> In-Reply-To: 
<20250418163708.442085-1-ming.lei@redhat.com> References: <20250418163708.442085-1-ming.lei@redhat.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Prepare for moving elv_register[unregister]_queue out of elevator_lock & queue freezing. After that move, elv_unregister_queue() may have to be called after elevator ->exit() is called, which leaves a small window for the user to run into ->show()/store() and cause a use-after-free. Fail to show/store elevator sysfs attribute if elevator is dying by adding one new flag, ELEVATOR_FLAG_DYING, which is protected by elevator ->sysfs_lock. Signed-off-by: Ming Lei Reviewed-by: Nilay Shroff --- block/blk-mq-sched.c | 1 + block/elevator.c | 10 ++++++++-- block/elevator.h | 1 + 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 336a15ffecfa..55a0fd105147 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -551,5 +551,6 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e) if (e->type->ops.exit_sched) e->type->ops.exit_sched(e); blk_mq_sched_tags_teardown(q, flags); + set_bit(ELEVATOR_FLAG_DYING, &q->elevator->flags); q->elevator = NULL; } diff --git a/block/elevator.c b/block/elevator.c index 568457e01d28..16171ea92f80 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -422,7 +422,10 @@ elv_attr_show(struct kobject *kobj, struct attribute *attr, char *page) e = container_of(kobj, struct elevator_queue, kobj); mutex_lock(&e->sysfs_lock); - error = e->type ? entry->show(e, page) : -ENOENT; + if (test_bit(ELEVATOR_FLAG_DYING, &e->flags)) + error = -ENODEV; + else + error = e->type ? 
entry->show(e, page) : -ENOENT; mutex_unlock(&e->sysfs_lock); return error; } @@ -440,7 +443,10 @@ elv_attr_store(struct kobject *kobj, struct attribute *attr, e = container_of(kobj, struct elevator_queue, kobj); mutex_lock(&e->sysfs_lock); - error = e->type ? entry->store(e, page, length) : -ENOENT; + if (test_bit(ELEVATOR_FLAG_DYING, &e->flags)) + error = -ENODEV; + else + error = e->type ? entry->store(e, page, length) : -ENOENT; mutex_unlock(&e->sysfs_lock); return error; } diff --git a/block/elevator.h b/block/elevator.h index e74e27dd6586..16d8888fa2b2 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -121,6 +121,7 @@ struct elevator_queue }; #define ELEVATOR_FLAG_REGISTERED 0 +#define ELEVATOR_FLAG_DYING 1 /* Holding context data for changing elevator */ struct elv_change_ctx { From patchwork Fri Apr 18 16:36:57 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 14057471 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0BA44208A7 for ; Fri, 18 Apr 2025 16:38:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744994313; cv=none; b=U8huVJ1vmXpVz8NajmOBH+I9s1YKDK2C3IuEk5geHHWgftinAQaSzZurlNJlc3x/EoI6NDH3MaWEZNrRk7ua+5M80f24pB2VYEJfYMFRnWROnQkzyiqc2k4RmlSoo4tXQM0uKkGQtCmb6jSRMZV89JzX4MCjnmer6IYtUroQFts= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744994313; c=relaxed/simple; bh=n0PjAqp1SLEPoDi51x/AOlF+xp2ZX8yyEZ7g8BVGRzs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=GPm9zQUb86TT35RwFSc6ztx8PvIRybzrdx6GXbrlZi/vc9DXgLsqNHiuRK6hQRsMKSS7g7vrrKfp3vuAeB0ssYEHkv1xI2MVSEjMiz1atXSWDZA2ke6sYZ0x88YZKLEUMuYSLeeUGYIAl+2HAhxMcLxLoqzp99WoQjIzQA7lcJk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=F/P756Ze; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="F/P756Ze" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1744994311; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=n003RY9MWUqFmwHicd3yeJ7i5Q8gpS0sXbB8dpyS9uY=; b=F/P756Ze3ISrdYfigqLAi0Y7fKZMJq5438IWFVbhFvyCDR4uLD73XG3jHrIuscPImJLlQy hwuT6/MJZ6PVbfnLIm4GMvH79bMC2mduGqJq4aKh5l3uEsNgPGjoFSiPN0pRZo6DqdVrzx 39779Ujk4T7NN2TsEiouzX4Cd10WlrI= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-241-NWuZpid1NYiTKKwqECIXqw-1; Fri, 18 Apr 2025 12:38:27 -0400 X-MC-Unique: NWuZpid1NYiTKKwqECIXqw-1 X-Mimecast-MFC-AGG-ID: NWuZpid1NYiTKKwqECIXqw_1744994306 Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) 
(No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B6D8C1800876; Fri, 18 Apr 2025 16:38:26 +0000 (UTC) Received: from localhost (unknown [10.72.116.50]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id D27BA19560A3; Fri, 18 Apr 2025 16:38:25 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org Cc: Nilay Shroff , Shinichiro Kawasaki , =?utf-8?q?Thomas_Hellstr?= =?utf-8?q?=C3=B6m?= , Christoph Hellwig , Ming Lei Subject: [PATCH V2 16/20] block: move elv_register[unregister]_queue out of elevator_lock Date: Sat, 19 Apr 2025 00:36:57 +0800 Message-ID: <20250418163708.442085-17-ming.lei@redhat.com> In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com> References: <20250418163708.442085-1-ming.lei@redhat.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Move elv_register[unregister]_queue out of ->elevator_lock & queue freezing, so we can kill many lockdep warnings. elv_register[unregister]_queue() is serialized and only deals with sysfs/debugfs, so it does not need to run with the queue frozen. With this change, elevator's ->exit() is called before elv_unregister_queue(), so the user may still call into ->show()/store() of the elevator's sysfs attributes; this is covered by the `ELEVATOR_FLAG_DYING` flag. For blk-mq debugfs, hctx->sched_tags is always checked with ->elevator_lock held by debugfs code, and hctx->sched_tags is updated with ->elevator_lock held, so there is no such issue there.
Signed-off-by: Ming Lei --- block/blk-mq.c | 9 ++++---- block/blk.h | 1 + block/elevator.c | 58 ++++++++++++++++++++++++++++++++++-------------- block/elevator.h | 5 +++++ 4 files changed, 52 insertions(+), 21 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 1a287c2e791c..9a361a173a8e 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4993,16 +4993,17 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, .force = true, .uevent = true, }; + int ret = -ENODEV; mutex_lock(&q->elevator_lock); if (q->elevator && !blk_queue_dying(q)) ctx.name = q->elevator->type->elevator_name; - __elevator_change(q, &ctx); + ret = __elevator_change(q, &ctx); mutex_unlock(&q->elevator_lock); - } - - list_for_each_entry(q, &set->tag_list, tag_set_list) blk_mq_unfreeze_queue_nomemrestore(q); + if (!ret) + WARN_ON_ONCE(elevator_change_done(q, &ctx)); + } memalloc_noio_restore(memflags); /* Free the excess tags when nr_hw_queues shrink. */ diff --git a/block/blk.h b/block/blk.h index 0e19c09009ed..48cf6b1c36fe 100644 --- a/block/blk.h +++ b/block/blk.h @@ -323,6 +323,7 @@ bool blk_insert_flush(struct request *rq); int __elevator_change(struct request_queue *q, struct elv_change_ctx *ctx); void elevator_set_default(struct request_queue *q); void elevator_set_none(struct request_queue *q); +int elevator_change_done(struct request_queue *q, struct elv_change_ctx *ctx); ssize_t part_size_show(struct device *dev, struct device_attribute *attr, char *buf); diff --git a/block/elevator.c b/block/elevator.c index 16171ea92f80..8652fe45a2db 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -151,18 +151,24 @@ static void elevator_release(struct kobject *kobj) kfree(e); } -static void elevator_exit(struct request_queue *q) +static void __elevator_exit(struct request_queue *q) { struct elevator_queue *e = q->elevator; + lockdep_assert_held(&q->elevator_lock); + ioc_clear_queue(q); blk_mq_sched_free_rqs(q); mutex_lock(&e->sysfs_lock); blk_mq_exit_sched(q, e); 
mutex_unlock(&e->sysfs_lock); +} - kobject_put(&e->kobj); +static void elevator_exit(struct request_queue *q) +{ + __elevator_exit(q); + kobject_put(&q->elevator->kobj); } static inline void __elv_rqhash_del(struct request *rq) @@ -467,8 +473,6 @@ static int elv_register_queue(struct request_queue *q, { int error; - lockdep_assert_held(&q->elevator_lock); - error = kobject_add(&e->kobj, &q->disk->queue_kobj, "iosched"); if (!error) { const struct elv_fs_entry *attr = e->type->elevator_attrs; @@ -495,8 +499,6 @@ static int elv_register_queue(struct request_queue *q, static void elv_unregister_queue(struct request_queue *q, struct elevator_queue *e) { - lockdep_assert_held(&q->elevator_lock); - if (e && test_and_clear_bit(ELEVATOR_FLAG_REGISTERED, &e->flags)) { kobject_uevent(&e->kobj, KOBJ_REMOVE); kobject_del(&e->kobj); @@ -640,19 +642,15 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e, blk_mq_quiesce_queue(q); if (q->elevator) { - elv_unregister_queue(q, q->elevator); - elevator_exit(q); + ctx->old = q->elevator; + __elevator_exit(q); } ret = blk_mq_init_sched(q, new_e); if (ret) goto out_unfreeze; - ret = elv_register_queue(q, q->elevator, ctx->uevent); - if (ret) { - elevator_exit(q); - goto out_unfreeze; - } + ctx->new = q->elevator; blk_add_trace_msg(q, "elv switch: %s", new_e->elevator_name); out_unfreeze: @@ -666,15 +664,16 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e, return ret; } -static void elevator_disable(struct request_queue *q) +static void elevator_disable(struct request_queue *q, + struct elv_change_ctx *ctx) { WARN_ON_ONCE(q->mq_freeze_depth == 0); lockdep_assert_held(&q->elevator_lock); blk_mq_quiesce_queue(q); - elv_unregister_queue(q, q->elevator); - elevator_exit(q); + ctx->old = q->elevator; + __elevator_exit(q); blk_queue_flag_clear(QUEUE_FLAG_SQ_SCHED, q); q->elevator = NULL; q->nr_requests = q->tag_set->queue_depth; @@ -683,6 +682,28 @@ static void 
elevator_disable(struct request_queue *q) blk_mq_unquiesce_queue(q); } +int elevator_change_done(struct request_queue *q, struct elv_change_ctx *ctx) +{ + int ret = 0; + + if (ctx->old) { + elv_unregister_queue(q, ctx->old); + kobject_put(&ctx->old->kobj); + } + if (ctx->new) { + ret = elv_register_queue(q, ctx->new, ctx->uevent); + if (ret) { + unsigned memflags = blk_mq_freeze_queue(q); + + mutex_lock(&q->elevator_lock); + elevator_exit(q); + mutex_unlock(&q->elevator_lock); + blk_mq_unfreeze_queue(q, memflags); + } + } + return 0; +} + /* * Switch this queue to the given IO scheduler. */ @@ -698,7 +719,7 @@ int __elevator_change(struct request_queue *q, struct elv_change_ctx *ctx) if (!strncmp(elevator_name, "none", 4)) { if (q->elevator) - elevator_disable(q); + elevator_disable(q, ctx); return 0; } @@ -742,6 +763,9 @@ static int elevator_change(struct request_queue *q, ret = __elevator_change(q, ctx); mutex_unlock(&q->elevator_lock); blk_mq_unfreeze_queue(q, memflags); + if (!ret) + ret = elevator_change_done(q, ctx); + exit: srcu_read_unlock(&set->update_nr_hwq_srcu, idx); return ret; diff --git a/block/elevator.h b/block/elevator.h index 16d8888fa2b2..486be0690499 100644 --- a/block/elevator.h +++ b/block/elevator.h @@ -129,6 +129,11 @@ struct elv_change_ctx { bool force; bool uevent; bool init; + + /* for unregistering old elevator */ + struct elevator_queue *old; + /* for registering new elevator */ + struct elevator_queue *new; }; /* From patchwork Fri Apr 18 16:36:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ming Lei X-Patchwork-Id: 14057472 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 237462153ED for ; Fri, 18 Apr 2025 16:38:36 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744994317; cv=none; b=gcRALL6X69RfbIUqYNxL5yKAfuMvSHqlFvEfvQVNuBD/nwzEPfjj1vl73f+jmsCkbFienMUZowtq84LyK8PPf30S8oZgImqs5qlQVNPLb8XZZDAAR3zBfdcPjVgcZ4GxMlbdeXRvyQ8Ju9aPmKRCwzpP0QHq3LWNXj3IUDduhrk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744994317; c=relaxed/simple; bh=veltcLZO+kYibCq4cuTVU8/d5uglnZrkyYQ2QeLEU20=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GOMzlfeKLyoh2PDQ3VsHszs6ZVP4RNVJm9+Ril1pZEGD8oOWqwQ/pxmPx7BpRGFFGdeVwZeC3DWpT5Entrh0vStXiszx5PI1DLLf1G0tcwFI+4Dmx3BSWks8FOkVYmKBsc3x/Vc/Pafqoc2ZOy9PlHKXFTnvl+I0VTbe/3QSLUE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=DARrR4yq; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="DARrR4yq" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1744994315; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5umzoEAyWPE4xSTnNJBk8dtRp7sY6P5MKOuIVnJhpBg=; b=DARrR4yqhwX4HqPr6t4oQv1JP49kWdFu4zFyEC8TntULr+Saqf5cioQPX/w9uSCieb0foQ nl90pO5EReMEdtqfHlYxpdFlNhtVUfRa0rkQBqH7/KzP8FcqGEYhi199ZHy+TO0vDyK9wq Kqbr+rzCtklzS8MMYlqO2ctPc5zHr+4= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com 
(ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-433-2aaYd9ulMpqvhkhyGTUJnA-1; Fri, 18 Apr 2025 12:38:31 -0400 X-MC-Unique: 2aaYd9ulMpqvhkhyGTUJnA-1 X-Mimecast-MFC-AGG-ID: 2aaYd9ulMpqvhkhyGTUJnA_1744994310 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 974371955DCC; Fri, 18 Apr 2025 16:38:30 +0000 (UTC) Received: from localhost (unknown [10.72.116.50]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id B39D91800362; Fri, 18 Apr 2025 16:38:29 +0000 (UTC) From: Ming Lei To: Jens Axboe , linux-block@vger.kernel.org Cc: Nilay Shroff , Shinichiro Kawasaki , =?utf-8?q?Thomas_Hellstr?= =?utf-8?q?=C3=B6m?= , Christoph Hellwig , Ming Lei Subject: [PATCH V2 17/20] block: move debugfs/sysfs register out of freezing queue Date: Sat, 19 Apr 2025 00:36:58 +0800 Message-ID: <20250418163708.442085-18-ming.lei@redhat.com> In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com> References: <20250418163708.442085-1-ming.lei@redhat.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Move debugfs/sysfs register out of freezing queue in __blk_mq_update_nr_hw_queues(), so that the following lockdep dependency can be killed: #2 (&q->q_usage_counter(io)#16){++++}-{0:0}: #1 (fs_reclaim){+.+.}-{0:0}: #0 (&sb->s_type->i_mutex_key#3){+.+.}-{4:4}: //debugfs And registering/un-registering debugfs/sysfs does not require queue to be frozen. 
Signed-off-by: Ming Lei
Reviewed-by: Nilay Shroff
---
 block/blk-mq.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9a361a173a8e..8d08127e40be 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4946,15 +4946,15 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 	if (set->nr_maps == 1 && nr_hw_queues == set->nr_hw_queues)
 		return;

-	memflags = memalloc_noio_save();
-	list_for_each_entry(q, &set->tag_list, tag_set_list)
-		blk_mq_freeze_queue_nomemsave(q);
-
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
 		blk_mq_debugfs_unregister_hctxs(q);
 		blk_mq_sysfs_unregister_hctxs(q);
 	}

+	memflags = memalloc_noio_save();
+	list_for_each_entry(q, &set->tag_list, tag_set_list)
+		blk_mq_freeze_queue_nomemsave(q);
+
 	if (blk_mq_realloc_tag_set_tags(set, nr_hw_queues) < 0)
 		goto reregister;

@@ -4977,12 +4977,6 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 		blk_mq_map_swqueue(q);
 	}

-reregister:
-	list_for_each_entry(q, &set->tag_list, tag_set_list) {
-		blk_mq_sysfs_register_hctxs(q);
-		blk_mq_debugfs_register_hctxs(q);
-	}
-
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
 		/*
 		 * nr_hw_queues is changed and elevator data depends on
@@ -5006,6 +5000,12 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 	}
 	memalloc_noio_restore(memflags);

+reregister:
+	list_for_each_entry(q, &set->tag_list, tag_set_list) {
+		blk_mq_sysfs_register_hctxs(q);
+		blk_mq_debugfs_register_hctxs(q);
+	}
+
 	/* Free the excess tags when nr_hw_queues shrink. */
 	for (i = set->nr_hw_queues; i < prev_nr_hw_queues; i++)
 		__blk_mq_free_map_and_rqs(set, i);

From patchwork Fri Apr 18 16:36:59 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057473
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7BB602144D6
	for ; Fri, 18 Apr 2025 16:38:40 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1744994322; cv=none;
	b=CpQvAtsrU84HaTWdM6vZMmyaP+mA/KQs0wq24Obo94lqoaW0xqAYd3Tibr7DY6Ur7oUaZR7OHjLyFD68ZfYTD0MU3Q33P7EMwbE+DLMU7HrRllLd0XF/mi9ik35cq6qsNQCfmC5t0CkWGcGXnBtcJ1jv00cT/KiGubZFZ6Is++o=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1744994322; c=relaxed/simple;
	bh=qIuSPA71+8nJuqcbk1+fokcRFUpBhglUZPGljLPSrf8=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	MIME-Version;
	b=DtjH7SNWR/noAKAOAQbKxzZz44C7y1ev5Na1jGxSEG25oe3SYV3Uj1jxlhERhZh5JHhCWGiVRR/8dCx57jqpZTeRpeEMo6qYROOPNxPKGl9ToackTBKkzSj7Zli3+3De8iwA4EXZLcjd3AMiH7hWwJKZHWViuw8adWM0n082z24=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=pass (p=quarantine dis=none) header.from=redhat.com;
	spf=pass smtp.mailfrom=redhat.com;
	dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=KVgBKL8y;
	arc=none smtp.client-ip=170.10.133.124
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="KVgBKL8y"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1744994319;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=N1pSGQSZdF+YKyoAaPhYH4SBdfrMEc0N0nBXvfyV1tA=;
	b=KVgBKL8yrsQru39rzV0fSguD9l27TgSxLOk9xvZ0Vt6sw+THkEHx5PWZ6zB2uqbom87D6e
	qIN4bc4QazIxuEiKYkOwP822Ut0Iv/SAGRlf0a/UhLeFqKwdBJND4+SaDRzflPDVLISC6A
	ObMVqzBjCh0jxoSF0OjZz2BHvNQnFCc=
Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com
	(ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97])
	by relay.mimecast.com with ESMTP with STARTTLS
	(version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384)
	id us-mta-678-xQtKnscLMD2HQ8a157abSg-1; Fri, 18 Apr 2025 12:38:35 -0400
X-MC-Unique: xQtKnscLMD2HQ8a157abSg-1
X-Mimecast-MFC-AGG-ID: xQtKnscLMD2HQ8a157abSg_1744994314
Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4B55B1800368;
	Fri, 18 Apr 2025 16:38:34 +0000 (UTC)
Received: from localhost (unknown [10.72.116.50])
	by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 2ABD630002C2;
	Fri, 18 Apr 2025 16:38:32 +0000 (UTC)
From: Ming Lei
To: Jens Axboe , linux-block@vger.kernel.org
Cc: Nilay Shroff , Shinichiro Kawasaki , =?utf-8?q?Thomas_Hellstr?= =?utf-8?q?=C3=B6m?= , Christoph Hellwig , Ming Lei
Subject: [PATCH V2 18/20] block: remove several ->elevator_lock
Date: Sat, 19 Apr 2025 00:36:59 +0800
Message-ID: <20250418163708.442085-19-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4

Both blk_mq_map_swqueue() and blk_mq_realloc_hw_ctxs() are called before
the request queue is added to the tagset list, so the two won't run
concurrently with blk_mq_update_nr_hw_queues().

The two functions are only called from queue initialization or
blk_mq_update_nr_hw_queues(), where an elevator switch can't happen.

So remove these ->elevator_lock uses.

Signed-off-by: Ming Lei
---
 block/blk-mq.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8d08127e40be..4de3287ce6e3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4092,8 +4092,6 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 	struct blk_mq_ctx *ctx;
 	struct blk_mq_tag_set *set = q->tag_set;

-	mutex_lock(&q->elevator_lock);
-
 	queue_for_each_hw_ctx(q, hctx, i) {
 		cpumask_clear(hctx->cpumask);
 		hctx->nr_ctx = 0;
@@ -4198,8 +4196,6 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 		hctx->next_cpu = blk_mq_first_mapped_cpu(hctx);
 		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
 	}
-
-	mutex_unlock(&q->elevator_lock);
 }

 /*
@@ -4503,16 +4499,9 @@ static void __blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 }

 static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
-				   struct request_queue *q, bool lock)
+				   struct request_queue *q)
 {
-	if (lock) {
-		/* protect against switching io scheduler */
-		mutex_lock(&q->elevator_lock);
-		__blk_mq_realloc_hw_ctxs(set, q);
-		mutex_unlock(&q->elevator_lock);
-	} else {
-		__blk_mq_realloc_hw_ctxs(set, q);
-	}
+	__blk_mq_realloc_hw_ctxs(set, q);

 	/* unregister cpuhp callbacks for exited hctxs */
 	blk_mq_remove_hw_queues_cpuhp(q);
@@ -4544,7 +4533,7 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,

 	xa_init(&q->hctx_table);

-	blk_mq_realloc_hw_ctxs(set, q, false);
+	blk_mq_realloc_hw_ctxs(set, q);
 	if (!q->nr_hw_queues)
 		goto err_hctxs;

@@ -4961,7 +4950,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 fallback:
 	blk_mq_update_queue_map(set);
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
-		blk_mq_realloc_hw_ctxs(set, q, true);
+		blk_mq_realloc_hw_ctxs(set, q);

 		if (q->nr_hw_queues != set->nr_hw_queues) {
 			int i = prev_nr_hw_queues;

From patchwork Fri Apr 18 16:37:00 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057474
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 77B81221727
	for ; Fri, 18 Apr 2025 16:38:45 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1744994327; cv=none;
	b=lB2PtlKtbh1zqWvlK0S5fF+5wi810w3mpyAP1zfsnqZYpoiE/sMloAheZH0v3UNjHBJkEDi+z0xdAZgBvMoIkPOCB9FrIIAsLyyGIZxA85txXdL5lixb4c8xl2hxAcNr7KoZ0VVZn2oG1tzVs0O/mxgppqS5iorZuUD+hmE2bdY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1744994327; c=relaxed/simple;
	bh=MtThmghw1w3kNZsCcx2xm97LOf7++KfeDdtlvfyijRs=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	MIME-Version;
	b=X6GRFq0QXBqG+u/d/qBPFSnWR4aVXfnnSymddUTjnQZckr36xPpx6/B7K6Pa2/BwYhwh7u5wdoKXW/P5cE8b3JHT/1ORFT07pzjaTiDPTYyAI+bNyUIbWuqBmvKHjchpyv2Jpq0mFEMO53Dg1eIC/VD0nW/idpCVcrJwLkbnfzU=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=pass (p=quarantine dis=none) header.from=redhat.com;
	spf=pass smtp.mailfrom=redhat.com;
	dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=AdPnQqvC;
	arc=none smtp.client-ip=170.10.133.124
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="AdPnQqvC"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1744994324;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=E7U3/0+RbpfT+xqUZlJWpiri/XSmvPo5gL6Xs2YC1xM=;
	b=AdPnQqvCmdTPCBs0rjs8Z4EYPs6xAUiXoIZCP28aBWJE4NoXjHzjyo+5/d0i24EXCxV/Vc
	wqGJto8LoVkFygFnezIpImWQbaIkuu19icoa2RGN3y9ynz6lCksr8S/EfvSQx/M/kcFKIF
	rljNrzARLYWTjDkpLX+551PEBfOn7O0=
Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com
	(ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63])
	by relay.mimecast.com with ESMTP with STARTTLS
	(version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384)
	id us-mta-636-7qbF-NAWM7utTX1qGX09-A-1; Fri, 18 Apr 2025 12:38:39 -0400
X-MC-Unique: 7qbF-NAWM7utTX1qGX09-A-1
X-Mimecast-MFC-AGG-ID: 7qbF-NAWM7utTX1qGX09-A_1744994318
Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1B56D195608C;
	Fri, 18 Apr 2025 16:38:38 +0000 (UTC)
Received: from localhost (unknown [10.72.116.50])
	by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 37EE719560A3;
	Fri, 18 Apr 2025 16:38:36 +0000 (UTC)
From: Ming Lei
To: Jens Axboe , linux-block@vger.kernel.org
Cc: Nilay Shroff , Shinichiro Kawasaki , =?utf-8?q?Thomas_Hellstr?= =?utf-8?q?=C3=B6m?= , Christoph Hellwig , Ming Lei
Subject: [PATCH V2 19/20] block: move hctx cpuhp add/del out of queue freezing
Date: Sat, 19 Apr 2025 00:37:00 +0800
Message-ID: <20250418163708.442085-20-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17

Move hctx cpuhp add/del out of the queue-frozen section so that the
freeze lock is not connected with the cpuhp locks, which avoids the
lockdep warning.

This is safe because neither operation needs the queue to be frozen, and
scheduler switch isn't allowed.

Signed-off-by: Ming Lei
---
 block/blk-mq.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4de3287ce6e3..72f106163466 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4950,7 +4950,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 fallback:
 	blk_mq_update_queue_map(set);
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
-		blk_mq_realloc_hw_ctxs(set, q);
+		__blk_mq_realloc_hw_ctxs(set, q);

 		if (q->nr_hw_queues != set->nr_hw_queues) {
 			int i = prev_nr_hw_queues;
@@ -4993,6 +4993,9 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
 		blk_mq_sysfs_register_hctxs(q);
 		blk_mq_debugfs_register_hctxs(q);
+
+		blk_mq_remove_hw_queues_cpuhp(q);
+		blk_mq_add_hw_queues_cpuhp(q);
 	}

 	/* Free the excess tags when nr_hw_queues shrink. */

From patchwork Fri Apr 18 16:37:01 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 14057475
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id A9C4F2222D3
	for ; Fri, 18 Apr 2025 16:38:45 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1744994327; cv=none;
	b=iuL4+zfS9pO85iZLqrW7uBWEamwAFkmvsz2+mTNpYXf9Z5Vjzbax2T+JU1eHU5NHIbQyNUuyF4j8QLaw+GLG4m2qqN6Z0d2Plyvoj9gNujkJLthieofMmzd2Y0qlRuCpJT02ycdceKK7bKCrTzfFEa8hZE/cIGjGX7gyQSzHhIo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1744994327; c=relaxed/simple;
	bh=GFn+0JsJbd41Gknq+7uBEOwcXxoe58QdIzoMjzwPg4w=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	MIME-Version;
	b=CxU6b1pJm2wiX034cAaFJ2WQEq3MHSTVSEMGMrXMyBu9zhrd8TjAorjGWhLehhNvygxWQEpMeIhHgQ96fzIxWp0noLV6thwKXVeYD6LsqJOoT+lnRslgJti1Vl7nrgQzlYevFbsMY2Na/HscuXvx0ciUbBIPO2Mfq+EpVDdLPN0=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=pass (p=quarantine dis=none) header.from=redhat.com;
	spf=pass smtp.mailfrom=redhat.com;
	dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Uz397ogN;
	arc=none smtp.client-ip=170.10.133.124
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Uz397ogN"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1744994324;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=dRDvh1Lew3SoNPiwpubUaTWEXt/qfAL3pl3cp3qv+3M=;
	b=Uz397ogNketqqXEe0wy2KKf7JtnCjE3D0LCeCDexEWXSBcDbIBlyKUVipSxHr8KVUUrm7l
	Q2X/As2c9xQOYAk5ab/W5hkD67qO3PAAW4irc609XGE1TaQZ2f6q2PtuAtoHBuHlxBDUjs
	AGNHcaaB/fDCiV6Joo66LG6yhGgOSsg=
Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com
	(ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63])
	by relay.mimecast.com with ESMTP with STARTTLS
	(version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384)
	id us-mta-392-cB-dr81wO4uEpnrWBAiyRw-1; Fri, 18 Apr 2025 12:38:43 -0400
X-MC-Unique: cB-dr81wO4uEpnrWBAiyRw-1
X-Mimecast-MFC-AGG-ID: cB-dr81wO4uEpnrWBAiyRw_1744994322
Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4ADFB195608E;
	Fri, 18 Apr 2025 16:38:42 +0000 (UTC)
Received: from localhost (unknown [10.72.116.50])
	by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 07A4A19560A3;
	Fri, 18 Apr 2025 16:38:40 +0000 (UTC)
From: Ming Lei
To: Jens Axboe , linux-block@vger.kernel.org
Cc: Nilay Shroff , Shinichiro Kawasaki , =?utf-8?q?Thomas_Hellstr?= =?utf-8?q?=C3=B6m?= , Christoph Hellwig , Ming Lei
Subject: [PATCH V2 20/20] block: move wbt_enable_default() out of queue freezing from sched ->exit()
Date: Sat, 19 Apr 2025 00:37:01 +0800
Message-ID: <20250418163708.442085-21-ming.lei@redhat.com>
In-Reply-To: <20250418163708.442085-1-ming.lei@redhat.com>
References: <20250418163708.442085-1-ming.lei@redhat.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17

The scheduler's ->exit() is called with the queue frozen and the elevator
lock held, and wbt_enable_default() can't be called with the queue frozen,
otherwise the following lockdep warning is triggered:

	#6 (&q->rq_qos_mutex){+.+.}-{4:4}:
	#5 (&eq->sysfs_lock){+.+.}-{4:4}:
	#4 (&q->elevator_lock){+.+.}-{4:4}:
	#3 (&q->q_usage_counter(io)#3){++++}-{0:0}:
	#2 (fs_reclaim){+.+.}-{0:0}:
	#1 (&sb->s_type->i_mutex_key#3){+.+.}-{4:4}:
	#0 (&q->debugfs_mutex){+.+.}-{4:4}:

Fix the issue by moving wbt_enable_default() out of bfq's exit() and
calling it from elevator_change_done().

Signed-off-by: Ming Lei
---
 block/bfq-iosched.c | 2 +-
 block/elevator.c    | 5 +++++
 block/elevator.h    | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 40e4106a71e7..310ce1d8c41e 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -7211,7 +7211,7 @@ static void bfq_exit_queue(struct elevator_queue *e)

 	blk_stat_disable_accounting(bfqd->queue);
 	blk_queue_flag_clear(QUEUE_FLAG_DISABLE_WBT, bfqd->queue);
-	wbt_enable_default(bfqd->queue->disk);
+	set_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT, &e->flags);

 	kfree(bfqd);
 }
diff --git a/block/elevator.c b/block/elevator.c
index 8652fe45a2db..378553fce5d8 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -687,8 +687,13 @@ int elevator_change_done(struct request_queue *q, struct elv_change_ctx *ctx)
 	int ret = 0;

 	if (ctx->old) {
+		bool enable_wbt = test_bit(ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT,
+				&ctx->old->flags);
+
 		elv_unregister_queue(q, ctx->old);
 		kobject_put(&ctx->old->kobj);
+		if (enable_wbt)
+			wbt_enable_default(q->disk);
 	}
 	if (ctx->new) {
 		ret = elv_register_queue(q, ctx->new, ctx->uevent);
diff --git a/block/elevator.h b/block/elevator.h
index 486be0690499..b14c611c74b6 100644
--- a/block/elevator.h
+++ b/block/elevator.h
@@ -122,6 +122,7 @@ struct elevator_queue

 #define ELEVATOR_FLAG_REGISTERED	0
 #define ELEVATOR_FLAG_DYING		1
+#define ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT	2

 /* Holding context data for changing elevator */
 struct elv_change_ctx {