X-Patchwork-Submitter: Stefan Roesch
X-Patchwork-Id: 12745942
From: Stefan Roesch
Subject: [PATCH v1 12/14] mm: support write throttling for async buffered writes
Date: Mon, 14 Feb 2022 09:44:01 -0800
Message-ID: <20220214174403.4147994-13-shr@fb.com>
In-Reply-To: <20220214174403.4147994-1-shr@fb.com>
References: <20220214174403.4147994-1-shr@fb.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

This change adds support for async write throttling in the function
balance_dirty_pages(). Until now, when throttling was required, the
code waited synchronously for as long as the writes were throttled.
This change introduces asynchronous throttling: instead of waiting in
balance_dirty_pages(), the timeout is stored in the task_struct field
bdp_pause. Once the timeout has expired, the writes are no longer
throttled.

- Add a new parameter to the balance_dirty_pages() function
  - This allows the caller to pass in the nowait flag
  - When the nowait flag is specified, the code does not wait in
    balance_dirty_pages(), but instead stores the wait expiration in the
    new task_struct field bdp_pause.

- The function balance_dirty_pages_ratelimited() resets the new values
  in the task_struct, once the timeout has expired

This change is required to support write throttling for the async
buffered writes. While the writes are throttled, io_uring still can make
progress with processing other requests.

Signed-off-by: Stefan Roesch
---
 include/linux/writeback.h |  1 +
 mm/filemap.c              |  2 +-
 mm/page-writeback.c       | 54 ++++++++++++++++++++++++++++-----------
 3 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index fec248ab1fec..48176a8047db 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -373,6 +373,7 @@ unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
 
 void wb_update_bandwidth(struct bdi_writeback *wb);
 void balance_dirty_pages_ratelimited(struct address_space *mapping);
+void balance_dirty_pages_ratelimited_flags(struct address_space *mapping, bool is_async);
 bool wb_over_bg_thresh(struct bdi_writeback *wb);
 
 typedef int (*writepage_t)(struct page *page, struct writeback_control *wbc,
diff --git a/mm/filemap.c b/mm/filemap.c
index 19065ad95a4c..aa51ff1a0e8f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3805,7 +3805,7 @@ static ssize_t do_generic_perform_write(struct file *file, struct iov_iter *i,
 		pos += status;
 		written += status;
 
-		balance_dirty_pages_ratelimited(mapping);
+		balance_dirty_pages_ratelimited_flags(mapping, flags & AOP_FLAGS_NOWAIT);
 	} while (iov_iter_count(i));
 
 	return written ? written : status;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 91d163f8d36b..767d0b997da5 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1558,7 +1558,7 @@ static inline void wb_dirty_limits(struct dirty_throttle_control *dtc)
  * perform some writeout.
  */
 static void balance_dirty_pages(struct bdi_writeback *wb,
-				unsigned long pages_dirtied)
+				unsigned long pages_dirtied, bool is_async)
 {
 	struct dirty_throttle_control gdtc_stor = { GDTC_INIT(wb) };
 	struct dirty_throttle_control mdtc_stor = { MDTC_INIT(wb, &gdtc_stor) };
@@ -1792,6 +1792,14 @@ static void balance_dirty_pages(struct bdi_writeback *wb,
 					  period,
 					  pause,
 					  start_time);
+		if (is_async) {
+			if (current->bdp_nr_dirtied_pause == -1) {
+				current->bdp_pause = now + pause;
+				current->bdp_nr_dirtied_pause = nr_dirtied_pause;
+			}
+			break;
+		}
+
 		__set_current_state(TASK_KILLABLE);
 		wb->dirty_sleep = now;
 		io_schedule_timeout(pause);
@@ -1799,6 +1807,8 @@ static void balance_dirty_pages(struct bdi_writeback *wb,
 		current->dirty_paused_when = now + pause;
 		current->nr_dirtied = 0;
 		current->nr_dirtied_pause = nr_dirtied_pause;
+		current->bdp_nr_dirtied_pause = -1;
+		current->bdp_pause = 0;
 
 		/*
 		 * This is typically equal to (dirty < thresh) and can also
@@ -1863,19 +1873,7 @@ static DEFINE_PER_CPU(int, bdp_ratelimits);
  */
 DEFINE_PER_CPU(int, dirty_throttle_leaks) = 0;
 
-/**
- * balance_dirty_pages_ratelimited - balance dirty memory state
- * @mapping: address_space which was dirtied
- *
- * Processes which are dirtying memory should call in here once for each page
- * which was newly dirtied.  The function will periodically check the system's
- * dirty state and will initiate writeback if needed.
- *
- * Once we're over the dirty memory limit we decrease the ratelimiting
- * by a lot, to prevent individual processes from overshooting the limit
- * by (ratelimit_pages) each.
- */
-void balance_dirty_pages_ratelimited(struct address_space *mapping)
+void balance_dirty_pages_ratelimited_flags(struct address_space *mapping, bool is_async)
 {
 	struct inode *inode = mapping->host;
 	struct backing_dev_info *bdi = inode_to_bdi(inode);
@@ -1886,6 +1884,15 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
 	if (!(bdi->capabilities & BDI_CAP_WRITEBACK))
 		return;
 
+	if (current->bdp_nr_dirtied_pause != -1 && time_after(jiffies, current->bdp_pause)) {
+		current->dirty_paused_when = current->bdp_pause;
+		current->nr_dirtied = 0;
+		current->nr_dirtied_pause = current->bdp_nr_dirtied_pause;
+
+		current->bdp_nr_dirtied_pause = -1;
+		current->bdp_pause = 0;
+	}
+
 	if (inode_cgwb_enabled(inode))
 		wb = wb_get_create_current(bdi, GFP_KERNEL);
 	if (!wb)
@@ -1924,10 +1931,27 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
 	preempt_enable();
 
 	if (unlikely(current->nr_dirtied >= ratelimit))
-		balance_dirty_pages(wb, current->nr_dirtied);
+		balance_dirty_pages(wb, current->nr_dirtied, is_async);
 
 	wb_put(wb);
 }
+
+/**
+ * balance_dirty_pages_ratelimited - balance dirty memory state
+ * @mapping: address_space which was dirtied
+ *
+ * Processes which are dirtying memory should call in here once for each page
+ * which was newly dirtied.  The function will periodically check the system's
+ * dirty state and will initiate writeback if needed.
+ *
+ * Once we're over the dirty memory limit we decrease the ratelimiting
+ * by a lot, to prevent individual processes from overshooting the limit
+ * by (ratelimit_pages) each.
+ */
+void balance_dirty_pages_ratelimited(struct address_space *mapping)
+{
+	balance_dirty_pages_ratelimited_flags(mapping, false);
+}
 EXPORT_SYMBOL(balance_dirty_pages_ratelimited);
 
 /**
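
The deferred-pause bookkeeping added by the hunks above can be stepped
through outside the kernel with a small user-space model. This is only a
sketch under stated assumptions: struct task_model and the model_*()
helpers are hypothetical names invented here, not kernel APIs. The real
code operates on current and jiffies, and actually sleeps in the
synchronous path. The field types are also assumed, since the task_struct
change is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task_struct fields the patch touches. */
struct task_model {
	unsigned long dirty_paused_when;
	int nr_dirtied;
	int nr_dirtied_pause;
	unsigned long bdp_pause;	/* deadline of a deferred pause, 0 if none */
	int bdp_nr_dirtied_pause;	/* -1 means no deferred pause is pending */
};

/* Wrap-safe "a is after b", the same idea as the kernel's time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

static void model_init(struct task_model *t)
{
	assert(t != NULL);
	t->dirty_paused_when = 0;
	t->nr_dirtied = 0;
	t->nr_dirtied_pause = 0;
	t->bdp_pause = 0;
	t->bdp_nr_dirtied_pause = -1;
}

/*
 * Models the throttle point in balance_dirty_pages().
 * Returns true if the caller would have slept (synchronous path).
 */
static bool model_throttle(struct task_model *t, unsigned long now,
			   unsigned long pause, int nr_dirtied_pause,
			   bool is_async)
{
	if (is_async) {
		/* Record the deadline once; a pending pause is not overwritten. */
		if (t->bdp_nr_dirtied_pause == -1) {
			t->bdp_pause = now + pause;
			t->bdp_nr_dirtied_pause = nr_dirtied_pause;
		}
		return false;
	}

	/* The kernel sleeps for 'pause' here before updating the task. */
	t->dirty_paused_when = now + pause;
	t->nr_dirtied = 0;
	t->nr_dirtied_pause = nr_dirtied_pause;
	t->bdp_nr_dirtied_pause = -1;
	t->bdp_pause = 0;
	return true;
}

/*
 * Models the expiry check at the top of
 * balance_dirty_pages_ratelimited_flags().
 * Returns true if a pending deferred pause was retired.
 */
static bool model_expire(struct task_model *t, unsigned long now)
{
	if (t->bdp_nr_dirtied_pause != -1 &&
	    model_time_after(now, t->bdp_pause)) {
		t->dirty_paused_when = t->bdp_pause;
		t->nr_dirtied = 0;
		t->nr_dirtied_pause = t->bdp_nr_dirtied_pause;

		t->bdp_nr_dirtied_pause = -1;
		t->bdp_pause = 0;
		return true;
	}
	return false;
}
```

In this model, an async caller that hits the throttle at time 100 with a
pause of 20 records a deadline of 120 and returns immediately. A later
call to model_expire() retires the pause only once the clock is strictly
past 120, which mirrors the time_after() comparison in the patch.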