From patchwork Sun Aug 11 10:19:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 13759723
From: Muchun Song
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 1/4] block: fix request starvation when queue is stopped or quiesced
Date: Sun, 11 Aug 2024 18:19:18 +0800
Message-Id: <20240811101921.4031-2-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20240811101921.4031-1-songmuchun@bytedance.com>
References: <20240811101921.4031-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

Consider the following scenario with a virtio_blk driver.

CPU0                                    CPU1                                    CPU2

blk_mq_try_issue_directly()
  __blk_mq_issue_directly()
    q->mq_ops->queue_rq()
      virtio_queue_rq()
        blk_mq_stop_hw_queue()
                                        blk_mq_try_issue_directly()             virtblk_done()
                                          if (blk_mq_hctx_stopped())
  blk_mq_request_bypass_insert()                                                  blk_mq_start_stopped_hw_queue()
  blk_mq_run_hw_queue()                                                             blk_mq_run_hw_queue()
                                            blk_mq_insert_request()
                                            return // Who is responsible for dispatching this IO request?

After CPU0 has marked the queue as stopped, CPU1 will see that the queue
is stopped. But before CPU1 puts the request on the dispatch list, CPU2
receives the completion interrupt of a request, so it runs the hardware
queue and marks the queue as non-stopped. Meanwhile, CPU1 also runs the
same hardware queue. After both CPU1 and CPU2 complete
blk_mq_run_hw_queue(), CPU1 just puts the request on the same hardware
queue and returns. It seems that this request is then never dispatched.

Fix it by running the hardware queue explicitly after inserting the
request. I think blk_mq_request_issue_directly() needs the same fix for
a similar problem.
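The interleaving above can be replayed step by step in a small user-space
model. This is only a sketch under stated assumptions: `toy_hctx`,
`toy_run_hw_queue()` and `toy_replay()` are hypothetical names invented
for illustration, not blk-mq symbols, and the three CPUs are serialized
into one fixed order rather than run concurrently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a hardware queue; NOT the kernel's struct blk_mq_hw_ctx. */
struct toy_hctx {
	bool stopped;    /* models BLK_MQ_S_STOPPED */
	int pending;     /* requests inserted but not yet dispatched */
	int dispatched;  /* requests handed to the (imaginary) driver */
};

/* Models running the queue: a stopped or empty queue dispatches nothing. */
static void toy_run_hw_queue(struct toy_hctx *h)
{
	if (h->stopped || h->pending == 0)
		return;
	h->dispatched += h->pending;
	h->pending = 0;
}

/*
 * Replays the race in one fixed order and returns how many requests are
 * left stranded. run_after_insert selects the behaviour with the fix.
 */
static int toy_replay(bool run_after_insert)
{
	struct toy_hctx h = { .stopped = false, .pending = 0, .dispatched = 0 };

	h.stopped = true;              /* CPU0: driver stops the queue */
	bool saw_stopped = h.stopped;  /* CPU1: observes the stopped queue */
	assert(saw_stopped);

	h.stopped = false;             /* CPU2: completion restarts the queue */
	toy_run_hw_queue(&h);          /* CPU2: nothing is pending yet */

	h.pending++;                   /* CPU1: inserts its request too late */
	if (run_after_insert)
		toy_run_hw_queue(&h);  /* the explicit run added by the fix */

	return h.pending;
}
```

In this model `toy_replay(false)` leaves one request stranded, while
`toy_replay(true)` leaves none, because the inserting side re-runs the
queue itself instead of relying on a run that has already happened.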
Signed-off-by: Muchun Song
Reviewed-by: Ming Lei
---
 block/blk-mq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3c3c0c21b553..b2d0f22de0c7f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2619,6 +2619,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
 		blk_mq_insert_request(rq, 0);
+		blk_mq_run_hw_queue(hctx, false);
 		return;
 	}
 
@@ -2649,6 +2650,7 @@ static blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
 
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(rq->q)) {
 		blk_mq_insert_request(rq, 0);
+		blk_mq_run_hw_queue(hctx, false);
 		return BLK_STS_OK;
 	}
 

From patchwork Sun Aug 11 10:19:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 13759724
From: Muchun Song
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 2/4] block: fix ordering between checking BLK_MQ_S_STOPPED and adding requests to hctx->dispatch
Date: Sun, 11 Aug 2024 18:19:19 +0800
Message-Id: <20240811101921.4031-3-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20240811101921.4031-1-songmuchun@bytedance.com>
References: <20240811101921.4031-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

Consider the following scenario with a virtio_blk driver.

CPU0                                                    CPU1

blk_mq_try_issue_directly()
  __blk_mq_issue_directly()
    q->mq_ops->queue_rq()
      virtio_queue_rq()
        blk_mq_stop_hw_queue()
                                                        virtblk_done()
  blk_mq_request_bypass_insert()                          blk_mq_start_stopped_hw_queues()
    /* Add IO request to dispatch list */   1) store        blk_mq_start_stopped_hw_queue()
                                                              clear_bit(BLK_MQ_S_STOPPED)                 3) store
  blk_mq_run_hw_queue()                                       blk_mq_run_hw_queue()
    if (!blk_mq_hctx_has_pending())                             if (!blk_mq_hctx_has_pending())           4) load
      return                                                      return
    blk_mq_sched_dispatch_requests()                            blk_mq_sched_dispatch_requests()
      if (blk_mq_hctx_stopped())            2) load               if (blk_mq_hctx_stopped())
        return                                                      return
      __blk_mq_sched_dispatch_requests()                          __blk_mq_sched_dispatch_requests()

A full memory barrier should be inserted between 1) and 2), as well as
between 3) and 4), to make sure that either CPU0 sees that
BLK_MQ_S_STOPPED is cleared or CPU1 sees the dispatch list or the
software queue bitmap being set. Otherwise, it is possible that neither
CPU re-runs the hardware queue, which starves the request.

Signed-off-by: Muchun Song
---
 block/blk-mq.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b2d0f22de0c7f..6f18993b8f454 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2075,6 +2075,13 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 		 * in blk_mq_sched_restart(). Avoid restart code path to
 		 * miss the new added requests to hctx->dispatch, meantime
 		 * SCHED_RESTART is observed here.
+		 *
+		 * This barrier is also used to order adding of dispatch list
+		 * above and the test of BLK_MQ_S_STOPPED in the following
+		 * routine (in blk_mq_delay_run_hw_queue()). Pairs with the
+		 * barrier in blk_mq_start_stopped_hw_queue(). So dispatch code
+		 * could either see BLK_MQ_S_STOPPED is cleared or dispatch list
+		 * to avoid missing dispatching requests.
 		 */
 		smp_mb();
 
@@ -2237,6 +2244,17 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 	if (!need_run)
 		return;
 
+	/*
+	 * This barrier is used to order adding of dispatch list or setting
+	 * of bitmap of any software queue outside of this function and the
+	 * test of BLK_MQ_S_STOPPED in the following routine. Pairs with the
+	 * barrier in blk_mq_start_stopped_hw_queue(). So dispatch code could
+	 * either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
+	 * of bitmap of any software queue to avoid missing dispatching
+	 * requests.
+	 */
+	smp_mb();
+
 	if (async || !cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask)) {
 		blk_mq_delay_run_hw_queue(hctx, 0);
 		return;
@@ -2392,6 +2410,13 @@ void blk_mq_start_stopped_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 		return;
 
 	clear_bit(BLK_MQ_S_STOPPED, &hctx->state);
+	/*
+	 * Pairs with the smp_mb() in blk_mq_run_hw_queue() or
+	 * blk_mq_dispatch_rq_list() to order the clearing of
+	 * BLK_MQ_S_STOPPED and the test of dispatch list or
+	 * bitmap of any software queue.
+	 */
+	smp_mb__after_atomic();
 	blk_mq_run_hw_queue(hctx, async);
 }
 EXPORT_SYMBOL_GPL(blk_mq_start_stopped_hw_queue);

From patchwork Sun Aug 11 10:19:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 13759725
From: Muchun Song
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 3/4] block: fix missing smp_mb in blk_mq_{delay_}run_hw_queues
Date: Sun, 11 Aug 2024 18:19:20 +0800
Message-Id: <20240811101921.4031-4-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20240811101921.4031-1-songmuchun@bytedance.com>
References: <20240811101921.4031-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

Consider the following scenario with a virtio_blk driver.

CPU0                                                        CPU1

/*
 * Add request to dispatch list or set bitmap of
 * software queue.                      1) store            virtblk_done()
 */
blk_mq_run_hw_queues()/blk_mq_delay_run_hw_queues()           blk_mq_start_stopped_hw_queues()
  if (blk_mq_hctx_stopped())            2) load                 blk_mq_start_stopped_hw_queue()
    continue                                                      clear_bit(BLK_MQ_S_STOPPED)             3) store
  blk_mq_run_hw_queue()/blk_mq_delay_run_hw_queue()               blk_mq_run_hw_queue()
                                                                    if (!blk_mq_hctx_has_pending())       4) load
                                                                      return
                                                                    blk_mq_sched_dispatch_requests()

A full memory barrier should be inserted between 1) and 2), as well as
between 3) and 4), to make sure that either CPU0 sees that
BLK_MQ_S_STOPPED is cleared or CPU1 sees the dispatch list or the
software queue bitmap being set. Otherwise, it is possible that neither
CPU re-runs the hardware queue, which starves the request.
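The 1)/2) and 3)/4) pairing above is the classic store-buffering
pattern. A minimal user-space sketch with C11 atomics follows, as an
analogy only: `toy_queue`, `toy_submit_and_check()` and
`toy_restart_and_check()` are hypothetical names, and
`atomic_thread_fence(memory_order_seq_cst)` merely stands in for
smp_mb() and smp_mb__after_atomic(). It is not the kernel API and does
not follow the Linux kernel memory model exactly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy queue state; both fields are hypothetical stand-ins. */
struct toy_queue {
	atomic_bool stopped;  /* models BLK_MQ_S_STOPPED */
	atomic_int pending;   /* models the dispatch list / sw queue bitmap */
};

/*
 * Submitter side: 1) publish the request, full fence, 2) test the
 * stopped flag. Returns true if this side would dispatch.
 */
static bool toy_submit_and_check(struct toy_queue *q)
{
	atomic_fetch_add_explicit(&q->pending, 1, memory_order_relaxed); /* 1) store */
	atomic_thread_fence(memory_order_seq_cst);   /* plays the role of smp_mb() */
	return !atomic_load_explicit(&q->stopped, memory_order_relaxed); /* 2) load */
}

/*
 * Restart side: 3) clear the stopped flag, full fence, 4) test for
 * pending work. Returns true if this side would dispatch.
 */
static bool toy_restart_and_check(struct toy_queue *q)
{
	atomic_store_explicit(&q->stopped, false, memory_order_relaxed); /* 3) store */
	atomic_thread_fence(memory_order_seq_cst);   /* plays the role of smp_mb__after_atomic() */
	return atomic_load_explicit(&q->pending, memory_order_relaxed) > 0; /* 4) load */
}
```

With a sequentially consistent fence between the store and the load on
both sides, C11 rules out the outcome in which both loads return the old
values, so at least one of the two functions returns true. Without the
fences, both could return false, which corresponds to the starvation
described above.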

Signed-off-by: Muchun Song
---
 block/blk-mq.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 6f18993b8f454..385a74e566874 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2299,6 +2299,18 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
 	sq_hctx = NULL;
 	if (blk_queue_sq_sched(q))
 		sq_hctx = blk_mq_get_sq_hctx(q);
+
+	/*
+	 * This barrier is used to order adding of dispatch list or setting
+	 * of bitmap of any software queue outside of this function and the
+	 * test of BLK_MQ_S_STOPPED in the following routine. Pairs with the
+	 * barrier in blk_mq_start_stopped_hw_queue(). So dispatch code could
+	 * either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
+	 * of bitmap of any software queue to avoid missing dispatching
+	 * requests.
+	 */
+	smp_mb();
+
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (blk_mq_hctx_stopped(hctx))
 			continue;
@@ -2327,6 +2339,18 @@ void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs)
 	sq_hctx = NULL;
 	if (blk_queue_sq_sched(q))
 		sq_hctx = blk_mq_get_sq_hctx(q);
+
+	/*
+	 * This barrier is used to order adding of dispatch list or setting
+	 * of bitmap of any software queue outside of this function and the
+	 * test of BLK_MQ_S_STOPPED in the following routine. Pairs with the
+	 * barrier in blk_mq_start_stopped_hw_queue(). So dispatch code could
+	 * either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
+	 * of bitmap of any software queue to avoid missing dispatching
+	 * requests.
+	 */
+	smp_mb();
+
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (blk_mq_hctx_stopped(hctx))
 			continue;

From patchwork Sun Aug 11 10:19:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 13759726
From: Muchun Song
To: axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 4/4] block: fix ordering between checking QUEUE_FLAG_QUIESCED and adding requests to hctx->dispatch
Date: Sun, 11 Aug 2024 18:19:21 +0800
Message-Id: <20240811101921.4031-5-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20240811101921.4031-1-songmuchun@bytedance.com>
References: <20240811101921.4031-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

Consider the following scenario.

CPU0                                                        CPU1

blk_mq_request_issue_directly()                             blk_mq_unquiesce_queue()
  if (blk_queue_quiesced())                                   blk_queue_flag_clear(QUEUE_FLAG_QUIESCED)   3) store
    blk_mq_insert_request()                                   blk_mq_run_hw_queues()
      /*                                                        blk_mq_run_hw_queue()
       * Add request to dispatch list or set bitmap of            if (!blk_mq_hctx_has_pending())         4) load
       * software queue.                1) store                    return
       */
    blk_mq_run_hw_queue()
      if (blk_queue_quiesced())         2) load
        return
      blk_mq_sched_dispatch_requests()

A full memory barrier should be inserted between 1) and 2), as well as
between 3) and 4), to make sure that either CPU0 sees that
QUEUE_FLAG_QUIESCED is cleared or CPU1 sees the dispatch list or the
software queue bitmap being set. Otherwise, it is possible that neither
CPU re-runs the hardware queue, which starves the request.

Signed-off-by: Muchun Song
---
 block/blk-mq.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 385a74e566874..66b21407a9a6c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -264,6 +264,13 @@ void blk_mq_unquiesce_queue(struct request_queue *q)
 		;
 	} else if (!--q->quiesce_depth) {
 		blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
+		/**
+		 * The need of memory barrier is in blk_mq_run_hw_queues() to
+		 * make sure clearing of QUEUE_FLAG_QUIESCED is before the
+		 * checking of dispatch list or bitmap of any software queue.
+		 *
+		 * smp_mb__after_atomic();
+		 */
 		run_queue = true;
 	}
 	spin_unlock_irqrestore(&q->queue_lock, flags);
@@ -2222,6 +2229,21 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 {
 	bool need_run;
 
+	/*
+	 * This barrier is used to order adding of dispatch list or setting
+	 * of bitmap of any software queue outside of this function and the
+	 * test of BLK_MQ_S_STOPPED in the following routine. Pairs with the
+	 * barrier in blk_mq_start_stopped_hw_queue(). So dispatch code could
+	 * either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
+	 * of bitmap of any software queue to avoid missing dispatching
+	 * requests.
+	 *
+	 * This barrier is also used to order adding of dispatch list or
+	 * setting of bitmap of any software queue outside of this function
+	 * and test of QUEUE_FLAG_QUIESCED below.
+	 */
+	smp_mb();
+
 	/*
 	 * We can't run the queue inline with interrupts disabled.
 	 */
@@ -2244,17 +2266,6 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 	if (!need_run)
 		return;
 
-	/*
-	 * This barrier is used to order adding of dispatch list or setting
-	 * of bitmap of any software queue outside of this function and the
-	 * test of BLK_MQ_S_STOPPED in the following routine. Pairs with the
-	 * barrier in blk_mq_start_stopped_hw_queue(). So dispatch code could
-	 * either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
-	 * of bitmap of any software queue to avoid missing dispatching
-	 * requests.
-	 */
-	smp_mb();
-
 	if (async || !cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask)) {
 		blk_mq_delay_run_hw_queue(hctx, 0);
 		return;
@@ -2308,6 +2319,11 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
 	 * either see BLK_MQ_S_STOPPED is cleared or dispatch list or setting
 	 * of bitmap of any software queue to avoid missing dispatching
 	 * requests.
+	 *
+	 * This barrier is also used to order clearing of QUEUE_FLAG_QUIESCED
+	 * outside of this function in blk_mq_unquiesce_queue() and checking
+	 * of dispatch list or bitmap of any software queue in
+	 * blk_mq_run_hw_queue().
 	 */
 	smp_mb();
 