From patchwork Sat Sep 10 06:50:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 12972428 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E1DB8C6FA8D for ; Sat, 10 Sep 2022 06:51:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 04BAF6B0071; Sat, 10 Sep 2022 02:51:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EDBF08D0001; Sat, 10 Sep 2022 02:51:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BCE5A6B0071; Sat, 10 Sep 2022 02:51:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 9F3F86B0072 for ; Sat, 10 Sep 2022 02:51:22 -0400 (EDT) Received: from smtpin31.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 7E3A2A0DA4 for ; Sat, 10 Sep 2022 06:51:22 +0000 (UTC) X-FDA: 79895254404.31.BC03778 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) by imf29.hostedemail.com (Postfix) with ESMTP id 668391200B9 for ; Sat, 10 Sep 2022 06:51:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=TWMB057d75ixVxyjdIFV1XXGOo0lh1V+gLJp5z7ed9M=; b=NBIurit5r6+q8mICiFS5WC3bos uEapd4J0qYeXSXm7FM/nQTunbL1M8v4YBFR9liWo5ysvmgUYjxYsTtwK2M8sQCLKESTfdhMW3H/3l Jz7Q6BUAlaFmzwe8swKOZJT70jf5yW9XUqZoi1Yecm/TY3dhKnFbxNoLAsYNyqPvFRsmJiM/MEZiN npLdahDgF7NnGQTEutWPlQcdWL2ZeMoJ8yHrx6o8043Pvp0q9HwICGWwVeK1zI8LF6iZu6Xv0k04b vTmK+8o2q/D1f7TAGVbdim1bIj0B4DNljspC3aewMIQ4+UoC3TVM/JOldv1tphvsPX88P38NUZCGG TJI26X9A==; Received: from [2001:4bb8:198:38af:e8dc:dbbd:a9d:5c54] (helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1oWuKa-006pZL-QP; Sat, 10 Sep 2022 06:51:09 +0000 From: Christoph Hellwig To: Jens Axboe , Matthew Wilcox , Johannes Weiner , Suren Baghdasaryan , Andrew Morton Cc: Chris Mason , Josef Bacik , David Sterba , Gao Xiang , Chao Yu , linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-mm@kvack.org Subject: [PATCH 1/5] mm: add PSI accounting around ->read_folio and ->readahead calls Date: Sat, 10 Sep 2022 08:50:54 +0200 Message-Id: <20220910065058.3303831-2-hch@lst.de> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220910065058.3303831-1-hch@lst.de> References: <20220910065058.3303831-1-hch@lst.de> MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=infradead.org header.s=bombadil.20210309 header.b=NBIurit5; spf=none (imf29.hostedemail.com: domain of BATV+74593d5eb0e0b46c37a4+6957+infradead.org+hch@bombadil.srs.infradead.org has no SPF policy when checking 198.137.202.133) smtp.mailfrom=BATV+74593d5eb0e0b46c37a4+6957+infradead.org+hch@bombadil.srs.infradead.org; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1662792682; a=rsa-sha256; cv=none; b=WIFifeQgzgVq6zM7LGu9yoYEUhdO6nknsnoC1Dn7hte9H13P8VZdUpXmQyU69ydA8NTnMj oSFIO5A6m8GHbmJoa6+ubDR5GLHsEp6jBhNoFyovYbhwF5A/bsnRcN/n7FHclVAg9oVYz1 3Qqt2xaVPDUd1kILM61SulaAynEZx7w= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1662792682; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=TWMB057d75ixVxyjdIFV1XXGOo0lh1V+gLJp5z7ed9M=; b=AHU8PbmhdfXQewGAua2uDxn0q4aOqgFMBy9HYWUjYy6T7vS9zhgeGrhO0n5q1f2Z4po8ZN t72MFvvDiKKpioYBZ/tC6KTutgBigmDyklGxiPWYHDX0nfoIzeqSVjL5xCnDTN5Z3w33fk gSJEgtETKlHJJ4V+EoBRgRx3/mtYfGo= X-Stat-Signature: emdbdoc8apbg7t3p9n5ec14orf9gz5aa X-Rspam-User: Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=infradead.org header.s=bombadil.20210309 header.b=NBIurit5; spf=none (imf29.hostedemail.com: domain of BATV+74593d5eb0e0b46c37a4+6957+infradead.org+hch@bombadil.srs.infradead.org has no SPF policy when checking 198.137.202.133) smtp.mailfrom=BATV+74593d5eb0e0b46c37a4+6957+infradead.org+hch@bombadil.srs.infradead.org; dmarc=none X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 668391200B9 X-HE-Tag: 1662792681-748360 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: PSI tries to account for the cost of bringing back in pages discarded by the MM LRU management. Currently the prime place for that is hooked into the bio submission path, which is a rather bad place: - it does not actually account I/O for non-block file systems, of which we have many - it adds overhead and a layering violation to the block layer Add the accounting into the two places in the core MM code that read pages into an address space by calling into ->read_folio and ->readahead so that the entire file system operations are covered, to broaden the coverage and allow removing the accounting in the block layer going forward. As psi_memstall_enter can deal with nested calls this will not lead to double accounting even while the bio annotations are still present. Signed-off-by: Christoph Hellwig Acked-by: Johannes Weiner --- include/linux/pagemap.h | 2 ++ mm/filemap.c | 7 +++++++ mm/readahead.c | 22 ++++++++++++++++++---- 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 0178b2040ea38..201dc7281640b 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -1173,6 +1173,8 @@ struct readahead_control { pgoff_t _index; unsigned int _nr_pages; unsigned int _batch_count; + bool _workingset; + unsigned long _pflags; }; #define DEFINE_READAHEAD(ractl, f, r, m, i) \ diff --git a/mm/filemap.c b/mm/filemap.c index 15800334147b3..c943d1b90cc26 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2382,6 +2382,8 @@ static void filemap_get_read_batch(struct address_space *mapping, static int filemap_read_folio(struct file *file, filler_t filler, struct folio *folio) { + bool workingset = folio_test_workingset(folio); + unsigned long pflags; int error; /* @@ -2390,8 +2392,13 @@ static int filemap_read_folio(struct file *file, filler_t filler, * fails. */ folio_clear_error(folio); + /* Start the actual read. The read will unlock the page. */ + if (unlikely(workingset)) + psi_memstall_enter(&pflags); error = filler(file, folio); + if (unlikely(workingset)) + psi_memstall_leave(&pflags); if (error) return error; diff --git a/mm/readahead.c b/mm/readahead.c index fdcd28cbd92de..43631c146d6dc 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -122,6 +122,7 @@ #include #include #include +#include #include #include #include @@ -152,6 +153,8 @@ static void read_pages(struct readahead_control *rac) if (!readahead_count(rac)) return; + if (unlikely(rac->_workingset)) + psi_memstall_enter(&rac->_pflags); blk_start_plug(&plug); if (aops->readahead) { @@ -179,6 +182,9 @@ static void read_pages(struct readahead_control *rac) } blk_finish_plug(&plug); + if (unlikely(rac->_workingset)) + psi_memstall_leave(&rac->_pflags); + rac->_workingset = false; BUG_ON(readahead_count(rac)); } @@ -252,6 +258,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl, } if (i == nr_to_read - lookahead_size) folio_set_readahead(folio); + ractl->_workingset |= folio_test_workingset(folio); ractl->_nr_pages++; } @@ -480,11 +487,14 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index, if (index == mark) folio_set_readahead(folio); err = filemap_add_folio(ractl->mapping, folio, index, gfp); - if (err) + if (err) { folio_put(folio); - else - ractl->_nr_pages += 1UL << order; - return err; + return err; + } + + ractl->_nr_pages += 1UL << order; + ractl->_workingset = folio_test_workingset(folio); + return 0; } void page_cache_ra_order(struct readahead_control *ractl, @@ -826,6 +836,10 @@ void readahead_expand(struct readahead_control *ractl, put_page(page); return; } + if (unlikely(PageWorkingset(page)) && !ractl->_workingset) { + ractl->_workingset = true; + psi_memstall_enter(&ractl->_pflags); + } ractl->_nr_pages++; if (ra) { ra->size++;