From patchwork Thu Aug 12 20:22:06 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 12434185
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd,
	Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE,
	United Kingdom. Registered in England and Wales under Company
	Registration No. 3798903
Subject: [RFC PATCH v2 1/5] nfs: Fix write to swapfile failure due to
	generic_write_checks()
From: David Howells
To: willy@infradead.org
Cc: "Darrick J. Wong", Christoph Hellwig, Trond Myklebust,
	linux-nfs@vger.kernel.org, dhowells@redhat.com, dhowells@redhat.com,
	trond.myklebust@primarydata.com, darrick.wong@oracle.com, hch@lst.de,
	viro@zeniv.linux.org.uk, jlayton@kernel.org, sfrench@samba.org,
	torvalds@linux-foundation.org, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Date: Thu, 12 Aug 2021 21:22:06 +0100
Message-ID: <162879972678.3306668.10709543333474121000.stgit@warthog.procyon.org.uk>
In-Reply-To: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
References: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
User-Agent: StGit/0.23

Trying to use a swapfile on NFS results in every DIO write failing with
ETXTBSY because generic_write_checks(), as called by nfs_direct_write()
from nfs_direct_IO(), forbids writes to swapfiles.

Fix this by introducing a new kiocb flag, IOCB_SWAP, that's set by the swap
code to indicate that the swapper is doing this operation and so overrule
the check in generic_write_checks().
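
For illustration only, the effect of the new flag on the swapfile test can be
modelled in user space roughly as follows. This is a sketch, not kernel code:
struct kiocb_model, its is_swapfile field and write_checks_model() are
hypothetical stand-ins; only the IOCB_SWAP bit position and the ETXTBSY error
are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Bit position taken from the patch; everything else is a stand-in. */
#define IOCB_SWAP	(1 << 21)

/* Hypothetical, minimal stand-in for the kernel's struct kiocb. */
struct kiocb_model {
	int	ki_flags;
	bool	is_swapfile;	/* models IS_SWAPFILE(inode) */
};

/* Models only the swapfile test at the top of generic_write_checks(). */
static int write_checks_model(const struct kiocb_model *iocb)
{
	if (iocb->is_swapfile && !(iocb->ki_flags & IOCB_SWAP))
		return -ETXTBSY;
	return 0;
}

/* Self-check: an ordinary write is refused, a swapper write is allowed. */
static void write_checks_model_demo(void)
{
	struct kiocb_model user_write = { .ki_flags = 0, .is_swapfile = true };
	struct kiocb_model swap_write = { .ki_flags = IOCB_SWAP, .is_swapfile = true };

	assert(write_checks_model(&user_write) == -ETXTBSY);
	assert(write_checks_model(&swap_write) == 0);
}
```

Calling write_checks_model_demo() from a test harness exercises both cases.
The flag only bypasses the swapfile test; the model deliberately leaves out
the other checks that generic_write_checks() performs.
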

Without this patch, the following is seen:

	Write error on dio swapfile (3800334336)

Altering __swap_writepage() to show the error shows:

	Write error (-26) on dio swapfile (3800334336)

Tested by swapping off all swap partitions and then swapping on a prepared
NFS file (CONFIG_NFS_SWAP=y is also needed).  Enough copies of the
following program then need to be run to force swapping to occur (at least
one per gigabyte of RAM):

	#include <stdbool.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>

	int main()
	{
		unsigned int pid = getpid(), iterations = 0;
		size_t i, j, size = 1024 * 1024 * 1024;
		char *p;
		bool mismatch;

		p = malloc(size);
		if (!p) {
			perror("malloc");
			exit(1);
		}

		srand(pid);
		for (i = 0; i < size; i += 4)
			*(unsigned int *)(p + i) = rand();

		do {
			for (j = 0; j < 16; j++) {
				for (i = 0; i < size; i += 4096)
					*(unsigned int *)(p + i) += 1;
				iterations++;
			}

			mismatch = false;
			srand(pid);
			for (i = 0; i < size; i += 4) {
				unsigned int r = rand();
				unsigned int v = *(unsigned int *)(p + i);

				if (i % 4096 == 0)
					v -= iterations;
				if (v != r) {
					fprintf(stderr, "mismatch %zx: %x != %x (diff %x)\n",
						i, v, r, v - r);
					mismatch = true;
				}
			}
		} while (!mismatch);
		exit(1);
	}

Fixes: dc617f29dbe5 ("vfs: don't allow writes to swap files")
Signed-off-by: David Howells
cc: Darrick J. Wong
cc: Christoph Hellwig
cc: Trond Myklebust
cc: linux-nfs@vger.kernel.org
---

 fs/read_write.c    |    2 +-
 include/linux/fs.h |    1 +
 mm/page_io.c       |    7 ++++---
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 9db7adf160d2..daef721ca67e 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1646,7 +1646,7 @@ ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from)
 	loff_t count;
 	int ret;
 
-	if (IS_SWAPFILE(inode))
+	if (IS_SWAPFILE(inode) && !(iocb->ki_flags & IOCB_SWAP))
 		return -ETXTBSY;
 
 	if (!iov_iter_count(from))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 640574294216..b3e6a20f28ef 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -319,6 +319,7 @@ enum rw_hint {
 /* iocb->ki_waitq is valid */
 #define IOCB_WAITQ		(1 << 19)
 #define IOCB_NOIO		(1 << 20)
+#define IOCB_SWAP		(1 << 21)	/* This is a swap request */
 
 struct kiocb {
 	struct file		*ki_filp;
diff --git a/mm/page_io.c b/mm/page_io.c
index d597bc6e6e45..edb72bf624d2 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -303,7 +303,8 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 
 		iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE);
 		init_sync_kiocb(&kiocb, swap_file);
-		kiocb.ki_pos = page_file_offset(page);
+		kiocb.ki_pos	= page_file_offset(page);
+		kiocb.ki_flags	= IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP;
 
 		set_page_writeback(page);
 		unlock_page(page);
@@ -324,8 +325,8 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 			 */
 			set_page_dirty(page);
 			ClearPageReclaim(page);
-			pr_err_ratelimited("Write error on dio swapfile (%llu)\n",
-					   page_file_offset(page));
+			pr_err_ratelimited("Write error (%d) on dio swapfile (%llu)\n",
+					   ret, page_file_offset(page));
 		}
 		end_page_writeback(page);
 		return ret;

From patchwork Thu Aug 12 20:22:15 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 12434187
Subject: [RFC PATCH v2 2/5] mm: Remove the callback func argument from
	__swap_writepage()
From: David Howells
To: willy@infradead.org
Cc: Seth Jennings, Bob Liu, Minchan Kim, Dan Magenheimer,
	dhowells@redhat.com, dhowells@redhat.com,
	trond.myklebust@primarydata.com, darrick.wong@oracle.com, hch@lst.de,
	viro@zeniv.linux.org.uk, jlayton@kernel.org, sfrench@samba.org,
	torvalds@linux-foundation.org, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Date: Thu, 12 Aug 2021 21:22:15 +0100
Message-ID: <162879973548.3306668.4893577928865857447.stgit@warthog.procyon.org.uk>
In-Reply-To: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
References: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
User-Agent: StGit/0.23

Remove the callback func argument from __swap_writepage() as it's
end_swap_bio_write() in both places that call it.

This reverts:

	commit 1eec6702a80e04416d528846a5ff2122484d95ec
	mm: allow for outstanding swap writeback accounting

Signed-off-by: David Howells
cc: Matthew Wilcox (Oracle)
cc: Seth Jennings
cc: Bob Liu
cc: Minchan Kim
cc: Dan Magenheimer
Reviewed-by: Christoph Hellwig
---

 include/linux/swap.h |    4 +---
 mm/page_io.c         |    9 ++++-----
 mm/zswap.c           |    2 +-
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 81801ba78b1e..b785bb041a44 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -425,9 +425,7 @@ extern void kswapd_stop(int nid);
 /* linux/mm/page_io.c */
 extern int swap_readpage(struct page *page, bool do_poll);
 extern int swap_writepage(struct page *page, struct writeback_control *wbc);
-extern void end_swap_bio_write(struct bio *bio);
-extern int __swap_writepage(struct page *page, struct writeback_control *wbc,
-	bio_end_io_t end_write_func);
+extern int __swap_writepage(struct page *page, struct writeback_control *wbc);
 extern int swap_set_page_dirty(struct page *page);
 
 int add_swap_extent(struct swap_info_struct *sis, unsigned long start_page,
diff --git a/mm/page_io.c b/mm/page_io.c
index edb72bf624d2..62cabcdfcec6 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -26,7 +26,7 @@
 #include
 #include
 
-void end_swap_bio_write(struct bio *bio)
+static void end_swap_bio_write(struct bio *bio)
 {
 	struct page *page = bio_first_page_all(bio);
 
@@ -249,7 +249,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
 		end_page_writeback(page);
 		goto out;
 	}
-	ret = __swap_writepage(page, wbc, end_swap_bio_write);
+	ret = __swap_writepage(page, wbc);
 out:
 	return ret;
 }
@@ -282,8 +282,7 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct page *page)
 #define bio_associate_blkg_from_page(bio, page)		do { } while (0)
 #endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */
 
-int __swap_writepage(struct page *page, struct writeback_control *wbc,
-		bio_end_io_t end_write_func)
+int __swap_writepage(struct page *page, struct writeback_control *wbc)
 {
 	struct bio *bio;
 	int ret;
@@ -342,7 +341,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 	bio_set_dev(bio, sis->bdev);
 	bio->bi_iter.bi_sector = swap_page_sector(page);
 	bio->bi_opf = REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc);
-	bio->bi_end_io = end_write_func;
+	bio->bi_end_io = end_swap_bio_write;
 	bio_add_page(bio, page, thp_size(page), 0);
 
 	bio_associate_blkg_from_page(bio, page);
diff --git a/mm/zswap.c b/mm/zswap.c
index 7944e3e57e78..f38e34917aa3 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1011,7 +1011,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	SetPageReclaim(page);
 
 	/* start writeback */
-	__swap_writepage(page, &wbc, end_swap_bio_write);
+	__swap_writepage(page, &wbc);
 	put_page(page);
 	zswap_written_back_pages++;
 

From patchwork Thu Aug 12 20:22:24 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 12434189
Subject: [RFC PATCH v2 3/5] mm: Make swap_readpage() for SWP_FS_OPS use
	->direct_IO() not ->readpage()
From: David Howells
To: willy@infradead.org
Cc: dhowells@redhat.com, dhowells@redhat.com,
	trond.myklebust@primarydata.com, darrick.wong@oracle.com, hch@lst.de,
	viro@zeniv.linux.org.uk, jlayton@kernel.org, sfrench@samba.org,
	torvalds@linux-foundation.org, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Date: Thu, 12 Aug 2021 21:22:24 +0100
Message-ID: <162879974434.3306668.4798886633463058599.stgit@warthog.procyon.org.uk>
In-Reply-To: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
References: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
User-Agent: StGit/0.23

Make
swap_readpage(), when accessing a swap file (SWP_FS_OPS), use the
->direct_IO() method on the filesystem rather than ->readpage().

Suggested-by: Matthew Wilcox (Oracle)
Signed-off-by: David Howells
---

 include/linux/fs.h |    1 
 mm/page_io.c       |  115 +++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 110 insertions(+), 6 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index b3e6a20f28ef..94c47b9b5b1c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -336,6 +336,7 @@ struct kiocb {
 	union {
 		unsigned int		ki_cookie; /* for ->iopoll */
 		struct wait_page_queue	*ki_waitq; /* for async buffered IO */
+		struct page		*ki_swap_page;	/* For swapfile_read/write */
 	};
 
 	randomized_struct_fields_end
diff --git a/mm/page_io.c b/mm/page_io.c
index 62cabcdfcec6..92ec4a7b0545 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -26,6 +26,24 @@
 #include
 #include
 
+/*
+ * Keep track of the kiocb we're using to do async DIO.  We have to
+ * refcount it until various things stop looking at the kiocb *after*
+ * calling ->ki_complete().
+ */
+struct swapfile_kiocb {
+	struct kiocb		iocb;
+	refcount_t		ki_refcnt;
+};
+
+static void swapfile_put_kiocb(struct swapfile_kiocb *ki)
+{
+	if (refcount_dec_and_test(&ki->ki_refcnt)) {
+		fput(ki->iocb.ki_filp);
+		kfree(ki);
+	}
+}
+
 static void end_swap_bio_write(struct bio *bio)
 {
 	struct page *page = bio_first_page_all(bio);
@@ -353,6 +371,96 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc)
 	return 0;
 }
 
+static void __swapfile_read_complete(struct kiocb *iocb, long ret, long ret2)
+{
+	struct page *page = iocb->ki_swap_page;
+
+	if (ret == PAGE_SIZE) {
+		count_vm_event(PSWPIN);
+		SetPageUptodate(page);
+	} else {
+		SetPageError(page);
+		pr_err_ratelimited("Read error (%ld) on dio swapfile (%llu)\n",
+				   ret, page_file_offset(page));
+	}
+
+	unlock_page(page);
+}
+
+static void swapfile_read_complete(struct kiocb *iocb, long ret, long ret2)
+{
+	struct swapfile_kiocb *ki = container_of(iocb, struct swapfile_kiocb, iocb);
+
+	__swapfile_read_complete(iocb, ret, ret2);
+	swapfile_put_kiocb(ki);
+}
+
+static int swapfile_read_sync(struct swap_info_struct *sis, struct page *page)
+{
+	struct kiocb kiocb;
+	struct file *swap_file = sis->swap_file;
+	struct bio_vec bv = {
+		.bv_page	= page,
+		.bv_len		= thp_size(page),
+		.bv_offset	= 0
+	};
+	struct iov_iter to;
+	int ret;
+
+	init_sync_kiocb(&kiocb, swap_file);
+	kiocb.ki_swap_page	= page;
+	kiocb.ki_pos		= page_file_offset(page);
+	kiocb.ki_filp		= swap_file;
+	kiocb.ki_flags		= IOCB_DIRECT | IOCB_SWAP;
+	/* Should set IOCB_HIPRI too, but the box becomes unresponsive whilst
+	 * putting out occasional messages about the NFS sunrpc scheduling
+	 * tasks being hung.
+	 */
+
+	iov_iter_bvec(&to, READ, &bv, 1, thp_size(page));
+	ret = swap_file->f_mapping->a_ops->direct_IO(&kiocb, &to);
+
+	__swapfile_read_complete(&kiocb, ret, 0);
+	return (ret > 0) ? 0 : ret;
+}
+
+static int swapfile_read(struct swap_info_struct *sis, struct page *page,
+			 bool synchronous)
+{
+	struct swapfile_kiocb *ki;
+	struct file *swap_file = sis->swap_file;
+	struct bio_vec bv = {
+		.bv_page	= page,
+		.bv_len		= thp_size(page),
+		.bv_offset	= 0
+	};
+	struct iov_iter to;
+	int ret;
+
+	if (synchronous)
+		return swapfile_read_sync(sis, page);
+
+	ki = kzalloc(sizeof(*ki), GFP_KERNEL);
+	if (!ki)
+		return -ENOMEM;
+
+	refcount_set(&ki->ki_refcnt, 2);
+	init_sync_kiocb(&ki->iocb, swap_file);
+	ki->iocb.ki_swap_page	= page;
+	ki->iocb.ki_flags	= IOCB_DIRECT | IOCB_SWAP;
+	ki->iocb.ki_pos		= page_file_offset(page);
+	ki->iocb.ki_filp	= get_file(swap_file);
+	ki->iocb.ki_complete	= swapfile_read_complete;
+
+	iov_iter_bvec(&to, READ, &bv, 1, thp_size(page));
+	ret = swap_file->f_mapping->a_ops->direct_IO(&ki->iocb, &to);
+
+	if (ret != -EIOCBQUEUED)
+		swapfile_read_complete(&ki->iocb, ret, 0);
+	swapfile_put_kiocb(ki);
+	return (ret > 0) ? 0 : ret;
+}
+
 int swap_readpage(struct page *page, bool synchronous)
 {
 	struct bio *bio;
@@ -380,12 +488,7 @@ int swap_readpage(struct page *page, bool synchronous)
 	}
 
 	if (data_race(sis->flags & SWP_FS_OPS)) {
-		struct file *swap_file = sis->swap_file;
-		struct address_space *mapping = swap_file->f_mapping;
-
-		ret = mapping->a_ops->readpage(swap_file, page);
-		if (!ret)
-			count_vm_event(PSWPIN);
+		ret = swapfile_read(sis, page, synchronous);
 		goto out;
 	}
 

From patchwork Thu Aug 12 20:22:32 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 12434191
Subject: [RFC PATCH v2 4/5] mm: Make __swap_writepage() do async DIO if
	asked for it
From: David Howells
To: willy@infradead.org
Cc: Trond Myklebust, linux-nfs@vger.kernel.org, dhowells@redhat.com,
	dhowells@redhat.com, trond.myklebust@primarydata.com,
	darrick.wong@oracle.com, hch@lst.de, viro@zeniv.linux.org.uk,
	jlayton@kernel.org, sfrench@samba.org, torvalds@linux-foundation.org,
	linux-nfs@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Thu, 12 Aug 2021 21:22:32 +0100
Message-ID: <162879975253.3306668.15630001599959638168.stgit@warthog.procyon.org.uk>
In-Reply-To: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
References: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
User-Agent: StGit/0.23

Make __swap_writepage()'s DIO path do sync DIO if the writeback control's
sync mode is WB_SYNC_ALL and async DIO if not.

Note that this causes hanging processes in sunrpc if the swapfile is on
NFS.  I'm not sure whether it's due to misscheduling or something else.
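
For illustration only, the sync/async choice and the two-reference lifetime
of the async request can be modelled in user space roughly as follows. This
is a sketch under stated assumptions, not kernel code: every name ending in
_model is a hypothetical stand-in, and a plain int counter is used where the
kernel code uses refcount_t.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for WB_SYNC_NONE and WB_SYNC_ALL. */
enum sync_mode_model { SYNC_NONE_MODEL, SYNC_ALL_MODEL };

enum write_path_model { PATH_SYNC_MODEL, PATH_ASYNC_MODEL };

/* Sync DIO for WB_SYNC_ALL, async DIO for anything else. */
static enum write_path_model pick_write_path_model(enum sync_mode_model mode)
{
	return mode == SYNC_ALL_MODEL ? PATH_SYNC_MODEL : PATH_ASYNC_MODEL;
}

/* Hypothetical stand-in for the refcounted async request. */
struct swap_req_model {
	int refcnt;	/* not atomic: single-threaded illustration only */
};

/* Returns true when this put dropped the last reference. */
static bool put_req_model(struct swap_req_model *req)
{
	return --req->refcnt == 0;
}

/*
 * Async lifetime: the request starts with two references, one dropped by
 * the completion handler and one by the submitter, in either order.
 */
static void async_lifetime_model_demo(void)
{
	struct swap_req_model req = { .refcnt = 2 };

	assert(!put_req_model(&req));	/* completion handler's put */
	assert(put_req_model(&req));	/* submitter's put frees the request */
}
```

Starting the count at two means the request stays valid whichever of the
submitter or the completion handler finishes first; the last put is the one
that would release the file reference and free the request.
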
Suggested-by: Matthew Wilcox (Oracle)
Signed-off-by: David Howells
cc: Trond Myklebust
cc: linux-nfs@vger.kernel.org
---

 mm/page_io.c |  145 +++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 102 insertions(+), 43 deletions(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index 92ec4a7b0545..dae7bbd7a842 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -300,6 +300,105 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct page *page)
 #define bio_associate_blkg_from_page(bio, page)		do { } while (0)
 #endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */
 
+static void __swapfile_write_complete(struct kiocb *iocb, long ret, long ret2)
+{
+	struct page *page = iocb->ki_swap_page;
+
+	if (ret == thp_size(page)) {
+		count_vm_event(PSWPOUT);
+		ret = 0;
+	} else {
+		/*
+		 * In the case of swap-over-nfs, this can be a
+		 * temporary failure if the system has limited memory
+		 * for allocating transmit buffers.  Mark the page
+		 * dirty and avoid folio_rotate_reclaimable but
+		 * rate-limit the messages but do not flag PageError
+		 * like the normal direct-to-bio case as it could be
+		 * temporary.
+		 */
+		set_page_dirty(page);
+		ClearPageReclaim(page);
+		pr_err_ratelimited("Write error (%ld) on dio swapfile (%llu)\n",
+				   ret, page_file_offset(page));
+	}
+	end_page_writeback(page);
+}
+
+static void swapfile_write_complete(struct kiocb *iocb, long ret, long ret2)
+{
+	struct swapfile_kiocb *ki = container_of(iocb, struct swapfile_kiocb, iocb);
+
+	__swapfile_write_complete(iocb, ret, ret2);
+	swapfile_put_kiocb(ki);
+}
+
+static int swapfile_write_sync(struct swap_info_struct *sis,
+			       struct page *page, struct writeback_control *wbc)
+{
+	struct kiocb kiocb;
+	struct file *swap_file = sis->swap_file;
+	struct bio_vec bv = {
+		.bv_page = page,
+		.bv_len = thp_size(page),
+		.bv_offset = 0
+	};
+	struct iov_iter from;
+	int ret;
+
+	init_sync_kiocb(&kiocb, swap_file);
+	kiocb.ki_swap_page = page;
+	kiocb.ki_pos = page_file_offset(page);
+	kiocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP;
+
+	set_page_writeback(page);
+	unlock_page(page);
+
+	iov_iter_bvec(&from, WRITE, &bv, 1, thp_size(page));
+	ret = swap_file->f_mapping->a_ops->direct_IO(&kiocb, &from);
+	__swapfile_write_complete(&kiocb, ret, 0);
+	return (ret > 0) ? 0 : ret;
+}
+
+static int swapfile_write(struct swap_info_struct *sis,
+			  struct page *page, struct writeback_control *wbc)
+{
+	struct swapfile_kiocb *ki;
+	struct file *swap_file = sis->swap_file;
+	struct bio_vec bv = {
+		.bv_page = page,
+		.bv_len = thp_size(page),
+		.bv_offset = 0
+	};
+	struct iov_iter from;
+	int ret;
+
+	if (wbc->sync_mode == WB_SYNC_ALL)
+		return swapfile_write_sync(sis, page, wbc);
+
+	ki = kzalloc(sizeof(*ki), GFP_KERNEL);
+	if (!ki)
+		return -ENOMEM;
+
+	refcount_set(&ki->ki_refcnt, 2);
+	iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE);
+	init_sync_kiocb(&ki->iocb, swap_file);
+	ki->iocb.ki_swap_page = page;
+	ki->iocb.ki_pos = page_file_offset(page);
+	ki->iocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP;
+	ki->iocb.ki_complete = swapfile_write_complete;
+	get_file(swap_file);
+
+	set_page_writeback(page);
+	unlock_page(page);
+	ret = swap_file->f_mapping->a_ops->direct_IO(&ki->iocb, &from);
+
+	if (ret != -EIOCBQUEUED)
+		swapfile_write_complete(&ki->iocb, ret, 0);
+	swapfile_put_kiocb(ki);
+	return (ret > 0) ? 0 : ret;
+}
+
 int __swap_writepage(struct page *page, struct writeback_control *wbc)
 {
 	struct bio *bio;
@@ -307,47 +406,8 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc)
 	struct swap_info_struct *sis = page_swap_info(page);
 
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
-	if (data_race(sis->flags & SWP_FS_OPS)) {
-		struct kiocb kiocb;
-		struct file *swap_file = sis->swap_file;
-		struct address_space *mapping = swap_file->f_mapping;
-		struct bio_vec bv = {
-			.bv_page = page,
-			.bv_len = PAGE_SIZE,
-			.bv_offset = 0
-		};
-		struct iov_iter from;
-
-		iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE);
-		init_sync_kiocb(&kiocb, swap_file);
-		kiocb.ki_pos = page_file_offset(page);
-		kiocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP;
-
-		set_page_writeback(page);
-		unlock_page(page);
-		ret = mapping->a_ops->direct_IO(&kiocb, &from);
-		if (ret == PAGE_SIZE) {
-			count_vm_event(PSWPOUT);
-			ret = 0;
-		} else {
-			/*
-			 * In the case of swap-over-nfs, this can be a
-			 * temporary failure if the system has limited
-			 * memory for allocating transmit buffers.
-			 * Mark the page dirty and avoid
-			 * folio_rotate_reclaimable but rate-limit the
-			 * messages but do not flag PageError like
-			 * the normal direct-to-bio case as it could
-			 * be temporary.
-			 */
-			set_page_dirty(page);
-			ClearPageReclaim(page);
-			pr_err_ratelimited("Write error (%d) on dio swapfile (%llu)\n",
-					   ret, page_file_offset(page));
-		}
-		end_page_writeback(page);
-		return ret;
-	}
+	if (data_race(sis->flags & SWP_FS_OPS))
+		return swapfile_write(sis, page, wbc);
 
 	ret = bdev_write_page(sis->bdev, swap_page_sector(page), page, wbc);
 	if (!ret) {
@@ -410,7 +470,6 @@ static int swapfile_read_sync(struct swap_info_struct *sis, struct page *page)
 	init_sync_kiocb(&kiocb, swap_file);
 	kiocb.ki_swap_page = page;
 	kiocb.ki_pos = page_file_offset(page);
-	kiocb.ki_filp = swap_file;
 	kiocb.ki_flags = IOCB_DIRECT | IOCB_SWAP;
 	/* Should set IOCB_HIPRI too, but the box becomes unresponsive whilst
 	 * putting out occasional messages about the NFS sunrpc scheduling
@@ -449,8 +508,8 @@ static int swapfile_read(struct swap_info_struct *sis, struct page *page,
 	ki->iocb.ki_swap_page = page;
 	ki->iocb.ki_flags = IOCB_DIRECT | IOCB_SWAP;
 	ki->iocb.ki_pos = page_file_offset(page);
-	ki->iocb.ki_filp = get_file(swap_file);
 	ki->iocb.ki_complete = swapfile_read_complete;
+	get_file(swap_file);
 
 	iov_iter_bvec(&to, READ, &bv, 1, thp_size(page));
 	ret = swap_file->f_mapping->a_ops->direct_IO(&ki->iocb, &to);

From patchwork Thu Aug 12 20:22:41 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 12434193
Subject: [RFC PATCH v2 5/5] mm: Remove swap BIO paths and only use DIO paths [BROKEN]
From: David Howells
To: willy@infradead.org
Cc: dhowells@redhat.com, dhowells@redhat.com, trond.myklebust@primarydata.com, darrick.wong@oracle.com, hch@lst.de, viro@zeniv.linux.org.uk, jlayton@kernel.org, sfrench@samba.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Thu, 12 Aug 2021 21:22:41 +0100
Message-ID: <162879976139.3306668.12495248062404308890.stgit@warthog.procyon.org.uk>
In-Reply-To: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
References: <162879971699.3306668.8977537647318498651.stgit@warthog.procyon.org.uk>
User-Agent: StGit/0.23
MIME-Version: 1.0

[!] NOTE: This doesn't work and might damage your disk's contents.

Delete the BIO-generating swap read/write paths and always use ->direct_IO().
This puts the mapping layer in the filesystem.

This doesn't work - probably due to ki_pos being set to
page_file_offset(page) which then gets remapped.

Suggested-by: Matthew Wilcox (Oracle)
Signed-off-by: David Howells
---

 fs/direct-io.c      |    2 +
 include/linux/bio.h |    2 +
 mm/page_io.c        |  156 ++-------------------------------------------------
 3 files changed, 9 insertions(+), 151 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index b2e86e739d7a..76eec0a68fa4 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -1216,6 +1216,8 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
 	}
 	if (iocb->ki_flags & IOCB_HIPRI)
 		dio->op_flags |= REQ_HIPRI;
+	if (iocb->ki_flags & IOCB_SWAP)
+		dio->op_flags |= REQ_SWAP;
 
 	/*
 	 * For AIO O_(D)SYNC writes we need to defer completions to a workqueue
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 2203b686e1f0..da75cfa72ed3 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -816,6 +816,8 @@ static inline void bio_set_polled(struct bio *bio, struct kiocb *kiocb)
 	bio->bi_opf |= REQ_HIPRI;
 	if (!is_sync_kiocb(kiocb))
 		bio->bi_opf |= REQ_NOWAIT;
+	if (kiocb->ki_flags & IOCB_SWAP)
+		bio->bi_opf |= REQ_SWAP;
 }
 
 struct bio *blk_next_bio(struct bio *bio, unsigned int nr_pages, gfp_t gfp);
diff --git a/mm/page_io.c b/mm/page_io.c
index dae7bbd7a842..fb260d9c3973 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -44,30 +44,6 @@ static void swapfile_put_kiocb(struct swapfile_kiocb *ki)
 	}
 }
 
-static void end_swap_bio_write(struct bio *bio)
-{
-	struct page *page = bio_first_page_all(bio);
-
-	if (bio->bi_status) {
-		SetPageError(page);
-		/*
-		 * We failed to write the page out to swap-space.
-		 * Re-dirty the page in order to avoid it being reclaimed.
-		 * Also print a dire warning that things will go BAD (tm)
-		 * very quickly.
-		 *
-		 * Also clear PG_reclaim to avoid folio_rotate_reclaimable()
-		 */
-		set_page_dirty(page);
-		pr_alert_ratelimited("Write-error on swap-device (%u:%u:%llu)\n",
-				     MAJOR(bio_dev(bio)), MINOR(bio_dev(bio)),
-				     (unsigned long long)bio->bi_iter.bi_sector);
-		ClearPageReclaim(page);
-	}
-	end_page_writeback(page);
-	bio_put(bio);
-}
-
 static void swap_slot_free_notify(struct page *page)
 {
 	struct swap_info_struct *sis;
@@ -116,32 +92,6 @@ static void swap_slot_free_notify(struct page *page)
 	}
 }
 
-static void end_swap_bio_read(struct bio *bio)
-{
-	struct page *page = bio_first_page_all(bio);
-	struct task_struct *waiter = bio->bi_private;
-
-	if (bio->bi_status) {
-		SetPageError(page);
-		ClearPageUptodate(page);
-		pr_alert_ratelimited("Read-error on swap-device (%u:%u:%llu)\n",
-				     MAJOR(bio_dev(bio)), MINOR(bio_dev(bio)),
-				     (unsigned long long)bio->bi_iter.bi_sector);
-		goto out;
-	}
-
-	SetPageUptodate(page);
-	swap_slot_free_notify(page);
-out:
-	unlock_page(page);
-	WRITE_ONCE(bio->bi_private, NULL);
-	bio_put(bio);
-	if (waiter) {
-		blk_wake_io_task(waiter);
-		put_task_struct(waiter);
-	}
-}
-
 int generic_swapfile_activate(struct swap_info_struct *sis,
 				struct file *swap_file,
 				sector_t *span)
@@ -281,31 +231,12 @@ static inline void count_swpout_vm_event(struct page *page)
 	count_vm_events(PSWPOUT, thp_nr_pages(page));
 }
 
-#if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP)
-static void bio_associate_blkg_from_page(struct bio *bio, struct page *page)
-{
-	struct cgroup_subsys_state *css;
-	struct mem_cgroup *memcg;
-
-	memcg = page_memcg(page);
-	if (!memcg)
-		return;
-
-	rcu_read_lock();
-	css = cgroup_e_css(memcg->css.cgroup, &io_cgrp_subsys);
-	bio_associate_blkg_from_css(bio, css);
-	rcu_read_unlock();
-}
-#else
-#define bio_associate_blkg_from_page(bio, page)		do { } while (0)
-#endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */
-
 static void __swapfile_write_complete(struct kiocb *iocb, long ret, long ret2)
 {
 	struct page *page = iocb->ki_swap_page;
 
 	if (ret == thp_size(page)) {
-		count_vm_event(PSWPOUT);
+		count_swpout_vm_event(page);
 		ret = 0;
 	} else {
 		/*
@@ -401,34 +332,10 @@ static int swapfile_write(struct swap_info_struct *sis,
 
 int __swap_writepage(struct page *page, struct writeback_control *wbc)
 {
-	struct bio *bio;
-	int ret;
 	struct swap_info_struct *sis = page_swap_info(page);
 
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
-	if (data_race(sis->flags & SWP_FS_OPS))
-		return swapfile_write(sis, page, wbc);
-
-	ret = bdev_write_page(sis->bdev, swap_page_sector(page), page, wbc);
-	if (!ret) {
-		count_swpout_vm_event(page);
-		return 0;
-	}
-
-	bio = bio_alloc(GFP_NOIO, 1);
-	bio_set_dev(bio, sis->bdev);
-	bio->bi_iter.bi_sector = swap_page_sector(page);
-	bio->bi_opf = REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc);
-	bio->bi_end_io = end_swap_bio_write;
-	bio_add_page(bio, page, thp_size(page), 0);
-
-	bio_associate_blkg_from_page(bio, page);
-	count_swpout_vm_event(page);
-	set_page_writeback(page);
-	unlock_page(page);
-	submit_bio(bio);
-
-	return 0;
+	return swapfile_write(sis, page, wbc);
 }
 
 static void __swapfile_read_complete(struct kiocb *iocb, long ret, long ret2)
@@ -437,6 +344,7 @@ static void __swapfile_read_complete(struct kiocb *iocb, long ret, long ret2)
 
 	if (ret == PAGE_SIZE) {
 		count_vm_event(PSWPIN);
+		swap_slot_free_notify(page);
 		SetPageUptodate(page);
 	} else {
 		SetPageError(page);
@@ -522,12 +430,9 @@ static int swapfile_read(struct swap_info_struct *sis, struct page *page,
 
 int swap_readpage(struct page *page, bool synchronous)
 {
-	struct bio *bio;
-	int ret = 0;
 	struct swap_info_struct *sis = page_swap_info(page);
-	blk_qc_t qc;
-	struct gendisk *disk;
 	unsigned long pflags;
+	int ret = 0;
 
 	VM_BUG_ON_PAGE(!PageSwapCache(page) && !synchronous, page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
@@ -543,60 +448,9 @@ int swap_readpage(struct page *page, bool synchronous)
 	if (frontswap_load(page) == 0) {
 		SetPageUptodate(page);
 		unlock_page(page);
-		goto out;
-	}
-
-	if (data_race(sis->flags & SWP_FS_OPS)) {
+	} else {
 		ret = swapfile_read(sis, page, synchronous);
-		goto out;
-	}
-
-	if (sis->flags & SWP_SYNCHRONOUS_IO) {
-		ret = bdev_read_page(sis->bdev, swap_page_sector(page), page);
-		if (!ret) {
-			if (trylock_page(page)) {
-				swap_slot_free_notify(page);
-				unlock_page(page);
-			}
-
-			count_vm_event(PSWPIN);
-			goto out;
-		}
 	}
-
-	ret = 0;
-	bio = bio_alloc(GFP_KERNEL, 1);
-	bio_set_dev(bio, sis->bdev);
-	bio->bi_opf = REQ_OP_READ;
-	bio->bi_iter.bi_sector = swap_page_sector(page);
-	bio->bi_end_io = end_swap_bio_read;
-	bio_add_page(bio, page, thp_size(page), 0);
-
-	disk = bio->bi_bdev->bd_disk;
-	/*
-	 * Keep this task valid during swap readpage because the oom killer may
-	 * attempt to access it in the page fault retry time check.
-	 */
-	if (synchronous) {
-		bio->bi_opf |= REQ_HIPRI;
-		get_task_struct(current);
-		bio->bi_private = current;
-	}
-	count_vm_event(PSWPIN);
-	bio_get(bio);
-	qc = submit_bio(bio);
-	while (synchronous) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
-		if (!READ_ONCE(bio->bi_private))
-			break;
-
-		if (!blk_poll(disk->queue, qc, true))
-			blk_io_schedule();
-	}
-	__set_current_state(TASK_RUNNING);
-	bio_put(bio);
-
-out:
 	psi_memstall_leave(&pflags);
 	return ret;
 }