From patchwork Thu Sep 15 09:41:56 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 12977126
From: Christoph Hellwig
To: Jens Axboe, Matthew Wilcox, Johannes Weiner, Suren Baghdasaryan,
 Andrew Morton
Cc: Chris Mason, Josef Bacik, David Sterba, Gao Xiang, Chao Yu,
 linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org,
 linux-mm@kvack.org
Subject: [PATCH 1/5] mm: add PSI accounting around ->read_folio and
 ->readahead calls
Date: Thu, 15 Sep 2022 10:41:56 +0100
Message-Id: <20220915094200.139713-2-hch@lst.de>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220915094200.139713-1-hch@lst.de>
References: <20220915094200.139713-1-hch@lst.de>

PSI
tries to account for the cost of bringing back in pages discarded by
the MM LRU management.  Currently the prime place for that is hooked into
the bio submission path, which is a rather bad place:

 - it does not actually account I/O for non-block file systems, of which
   we have many
 - it adds overhead and a layering violation to the block layer

Add the accounting into the two places in the core MM code that read
pages into an address space by calling into ->read_folio and ->readahead
so that the entire file system operations are covered, to broaden
the coverage and allow removing the accounting in the block layer going
forward.

As psi_memstall_enter can deal with nested calls this will not lead to
double accounting even while the bio annotations are still present.

Signed-off-by: Christoph Hellwig
Acked-by: Johannes Weiner
---
 include/linux/pagemap.h |  2 ++
 mm/filemap.c            |  7 +++++++
 mm/readahead.c          | 22 ++++++++++++++++++----
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 0178b2040ea38..201dc7281640b 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1173,6 +1173,8 @@ struct readahead_control {
 	pgoff_t _index;
 	unsigned int _nr_pages;
 	unsigned int _batch_count;
+	bool _workingset;
+	unsigned long _pflags;
 };
 
 #define DEFINE_READAHEAD(ractl, f, r, m, i)				\
diff --git a/mm/filemap.c b/mm/filemap.c
index 15800334147b3..c943d1b90cc26 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2382,6 +2382,8 @@ static void filemap_get_read_batch(struct address_space *mapping,
 static int filemap_read_folio(struct file *file, filler_t filler,
 		struct folio *folio)
 {
+	bool workingset = folio_test_workingset(folio);
+	unsigned long pflags;
 	int error;
 
 	/*
@@ -2390,8 +2392,13 @@ static int filemap_read_folio(struct file *file, filler_t filler,
 	 * fails.
 	 */
 	folio_clear_error(folio);
+
 	/* Start the actual read. The read will unlock the page. */
+	if (unlikely(workingset))
+		psi_memstall_enter(&pflags);
 	error = filler(file, folio);
+	if (unlikely(workingset))
+		psi_memstall_leave(&pflags);
 	if (error)
 		return error;
 
diff --git a/mm/readahead.c b/mm/readahead.c
index fdcd28cbd92de..b10f0cf81d804 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -122,6 +122,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -152,6 +153,8 @@ static void read_pages(struct readahead_control *rac)
 	if (!readahead_count(rac))
 		return;
 
+	if (unlikely(rac->_workingset))
+		psi_memstall_enter(&rac->_pflags);
 	blk_start_plug(&plug);
 
 	if (aops->readahead) {
@@ -179,6 +182,9 @@ static void read_pages(struct readahead_control *rac)
 	}
 
 	blk_finish_plug(&plug);
+	if (unlikely(rac->_workingset))
+		psi_memstall_leave(&rac->_pflags);
+	rac->_workingset = false;
 
 	BUG_ON(readahead_count(rac));
 }
@@ -252,6 +258,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
+		ractl->_workingset |= folio_test_workingset(folio);
 		ractl->_nr_pages++;
 	}
 
@@ -480,11 +487,14 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index,
 	if (index == mark)
 		folio_set_readahead(folio);
 	err = filemap_add_folio(ractl->mapping, folio, index, gfp);
-	if (err)
+	if (err) {
 		folio_put(folio);
-	else
-		ractl->_nr_pages += 1UL << order;
-	return err;
+		return err;
+	}
+
+	ractl->_nr_pages += 1UL << order;
+	ractl->_workingset |= folio_test_workingset(folio);
+	return 0;
 }
 
 void page_cache_ra_order(struct readahead_control *ractl,
@@ -826,6 +836,10 @@ void readahead_expand(struct readahead_control *ractl,
 			put_page(page);
 			return;
 		}
+		if (unlikely(PageWorkingset(page)) && !ractl->_workingset) {
+			ractl->_workingset = true;
+			psi_memstall_enter(&ractl->_pflags);
+		}
 		ractl->_nr_pages++;
 		if (ra) {
 			ra->size++;
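
The commit message relies on psi_memstall_enter tolerating nested calls.
The user-space sketch below is only a toy model of that idea and is not
kernel code: every toy_* name is hypothetical, and the saved-flag scheme is
an assumption about how nesting can be handled rather than a copy of the
PSI implementation. It shows an outer annotation around the read (as this
patch adds) wrapping an inner bio-level annotation, with only one stall
section being opened.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for per-task stall state (hypothetical, not task_struct). */
struct toy_task {
	bool in_memstall;	/* is a stall section currently open? */
	unsigned int sections;	/* stall sections actually opened */
};

static struct toy_task toy_current;

/* Save whether a stall was already open; open a new one only if not. */
static void toy_memstall_enter(unsigned long *flags)
{
	*flags = toy_current.in_memstall;
	if (*flags)
		return;		/* nested call: the outer section accounts */
	toy_current.in_memstall = true;
	toy_current.sections++;
}

/* Close the stall only if the matching enter call opened it. */
static void toy_memstall_leave(unsigned long *flags)
{
	if (*flags)
		return;
	toy_current.in_memstall = false;
}

/* Models the existing bio-level annotation (the inner call). */
static void toy_submit_bio(void)
{
	unsigned long pflags;

	toy_memstall_enter(&pflags);
	/* the I/O would be submitted here */
	toy_memstall_leave(&pflags);
}

/* Models the new annotation around the filler call (the outer call). */
static void toy_read_folio(bool workingset)
{
	unsigned long pflags;

	if (workingset)
		toy_memstall_enter(&pflags);
	toy_submit_bio();
	if (workingset)
		toy_memstall_leave(&pflags);
}

/* Runs one workingset read and returns how many sections were opened. */
static unsigned int toy_demo(void)
{
	toy_current.in_memstall = false;
	toy_current.sections = 0;
	toy_read_folio(true);
	assert(!toy_current.in_memstall);	/* outer leave closed the stall */
	return toy_current.sections;
}
```

Calling toy_demo() from a small main() returns 1: the inner call sees the
stall already open and leaves the accounting to the outer section, which is
the property that lets both annotations coexist during the transition.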