From patchwork Fri Jun 19 09:39:15 2020
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 11613631
From: Andreas Gruenbacher
To: Matthew Wilcox, Linus Torvalds
Cc: Andreas Gruenbacher, Andrew Morton, linux-xfs,
 Junxiao Bi, William Kucharski, Joseph Qi, John Hubbard,
 linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
 cluster-devel@redhat.com, linux-mm@kvack.org, ocfs2-devel@oss.oracle.com,
 linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
 linux-erofs@lists.ozlabs.org, Christoph Hellwig, linux-btrfs@vger.kernel.org
Subject: [PATCH 1/2] fs: Add IOCB_CACHED flag for generic_file_read_iter
Date: Fri, 19 Jun 2020 11:39:15 +0200
Message-Id: <20200619093916.1081129-2-agruenba@redhat.com>
In-Reply-To: <20200619093916.1081129-1-agruenba@redhat.com>
References: <20200619093916.1081129-1-agruenba@redhat.com>

Add an IOCB_CACHED flag which tells generic_file_read_iter to consult only
the page cache, without triggering any filesystem I/O for the actual request
or for readahead.  With this flag, -EAGAIN is returned when regular I/O would
be triggered, as with the IOCB_NOWAIT flag, and -ECANCELED is returned when
readahead would be triggered.

This allows the caller to perform a tentative read out of the page cache, and
to retry the read if the requested pages are not cached.  Please see the next
commit for what this is used for.
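For readers who want the return-code rules in one place, here is a rough
user-space model of the three checks this patch touches.  It is only a sketch:
struct model_page and model_lookup() are hypothetical names that do not exist
in the kernel, and a return value of 0 merely stands for "the read loop would
carry on", whether from the cache or by starting I/O.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values mirror the patch; everything else is a user-space model. */
#define IOCB_NOWAIT	(1 << 7)
#define IOCB_CACHED	(1 << 8)

/* Hypothetical stand-in for the state of one page-cache slot. */
struct model_page {
	bool present;	/* find_get_page() would return a page */
	bool readahead;	/* PageReadahead() would be true */
	bool uptodate;	/* PageUptodate() would be true */
};

/*
 * Returns 0 if the read loop would carry on, or the negative errno it
 * would bail out with.  The checks run in the same order as in the patch.
 */
int model_lookup(const struct model_page *pg, int ki_flags)
{
	assert(pg != NULL);

	if (!pg->present) {
		/* Missing page: both flags refuse to start a synchronous read. */
		if (ki_flags & (IOCB_NOWAIT | IOCB_CACHED))
			return -EAGAIN;
		return 0;	/* would call page_cache_sync_readahead() */
	}
	/* Only IOCB_CACHED refuses to kick off asynchronous readahead. */
	if (pg->readahead && (ki_flags & IOCB_CACHED))
		return -ECANCELED;
	/* Page present but not yet read: waiting for it would block. */
	if (!pg->uptodate && (ki_flags & (IOCB_NOWAIT | IOCB_CACHED)))
		return -EAGAIN;
	return 0;
}
```

In this model, a page carrying the readahead marker yields -ECANCELED under
IOCB_CACHED but is accepted under IOCB_NOWAIT, which is the one place where
the two flags differ.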
Signed-off-by: Andreas Gruenbacher
---
 include/linux/fs.h |  1 +
 mm/filemap.c       | 16 ++++++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6c4ab4dc1cd7..74eade571b1c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -315,6 +315,7 @@ enum rw_hint {
 #define IOCB_SYNC		(1 << 5)
 #define IOCB_WRITE		(1 << 6)
 #define IOCB_NOWAIT		(1 << 7)
+#define IOCB_CACHED		(1 << 8)
 
 struct kiocb {
 	struct file		*ki_filp;
diff --git a/mm/filemap.c b/mm/filemap.c
index f0ae9a6308cb..bd11f27bf6ae 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2028,7 +2028,7 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 		page = find_get_page(mapping, index);
 		if (!page) {
-			if (iocb->ki_flags & IOCB_NOWAIT)
+			if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_CACHED))
 				goto would_block;
 			page_cache_sync_readahead(mapping,
 					ra, filp,
@@ -2038,12 +2038,17 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 				goto no_cached_page;
 		}
 		if (PageReadahead(page)) {
+			if (iocb->ki_flags & IOCB_CACHED) {
+				put_page(page);
+				error = -ECANCELED;
+				goto out;
+			}
 			page_cache_async_readahead(mapping,
 					ra, filp, page,
 					index, last_index - index);
 		}
 		if (!PageUptodate(page)) {
-			if (iocb->ki_flags & IOCB_NOWAIT) {
+			if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_CACHED)) {
 				put_page(page);
 				goto would_block;
 			}
@@ -2249,6 +2254,13 @@ EXPORT_SYMBOL_GPL(generic_file_buffered_read);
  *
  * This is the "read_iter()" routine for all filesystems
  * that can use the page cache directly.
+ *
+ * The IOCB_NOWAIT flag in iocb->ki_flags indicates that -EAGAIN should be
+ * returned if completing the request would require I/O; this does not prevent
+ * readahead.  The IOCB_CACHED flag indicates that -EAGAIN should be returned
+ * as under the IOCB_NOWAIT flag, and that -ECANCELED should be returned when
+ * readahead would be triggered.
+ *
  * Return:
  * * number of bytes copied, even for partial reads
  * * negative error code if nothing was read

From patchwork Fri Jun 19 09:39:16 2020
X-Patchwork-Submitter: Andreas Gruenbacher
X-Patchwork-Id: 11613627
From: Andreas Gruenbacher
To: Matthew Wilcox, Linus Torvalds
Cc: Andreas Gruenbacher, Andrew Morton, linux-xfs,
 Junxiao Bi, William Kucharski, Joseph Qi, John Hubbard,
 linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
 cluster-devel@redhat.com, linux-mm@kvack.org, ocfs2-devel@oss.oracle.com,
 linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
 linux-erofs@lists.ozlabs.org, Christoph Hellwig, linux-btrfs@vger.kernel.org
Subject: [PATCH 2/2] gfs2: Rework read and page fault locking
Date: Fri, 19 Jun 2020 11:39:16 +0200
Message-Id: <20200619093916.1081129-3-agruenba@redhat.com>
In-Reply-To: <20200619093916.1081129-1-agruenba@redhat.com>
References: <20200619093916.1081129-1-agruenba@redhat.com>

The cache consistency model of filesystems like gfs2 is such that if data is
found in the page cache, the data is up to date and can be used without taking
any filesystem locks.  If a page is not cached, filesystem locks must be taken
before populating the page cache.

Thus far, gfs2 has taken the filesystem locks inside the ->readpage and
->readpages address space operations.  This was already causing lock ordering
problems, but commit d4388340ae0b ("fs: convert mpage_readpages to
mpage_readahead") made things worse: the ->readahead operation is called with
the pages to be read ahead already locked, so grabbing the inode's glock can
now deadlock with processes which are holding the inode glock while trying to
lock the same pages.

Fix this by taking the inode glock in the ->read_iter file and ->fault vm
operations.
To avoid taking the inode glock when the data is already cached, the
->read_iter file operation first tries to read the data with the IOCB_CACHED
flag set.  If that fails, the inode glock is locked and the operation is
repeated without the IOCB_CACHED flag.

Signed-off-by: Andreas Gruenbacher
---
 fs/gfs2/aops.c | 27 ++--------------------
 fs/gfs2/file.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 61 insertions(+), 27 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 72c9560f4467..73c2fe768a3f 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -513,26 +513,10 @@ static int __gfs2_readpage(void *file, struct page *page)
 
 static int gfs2_readpage(struct file *file, struct page *page)
 {
-	struct address_space *mapping = page->mapping;
-	struct gfs2_inode *ip = GFS2_I(mapping->host);
-	struct gfs2_holder gh;
 	int error;
 
-	unlock_page(page);
-	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
-	error = gfs2_glock_nq(&gh);
-	if (unlikely(error))
-		goto out;
-	error = AOP_TRUNCATED_PAGE;
-	lock_page(page);
-	if (page->mapping == mapping && !PageUptodate(page))
-		error = __gfs2_readpage(file, page);
-	else
-		unlock_page(page);
-	gfs2_glock_dq(&gh);
-out:
-	gfs2_holder_uninit(&gh);
-	if (error && error != AOP_TRUNCATED_PAGE)
+	error = __gfs2_readpage(file, page);
+	if (error)
 		lock_page(page);
 	return error;
 }
@@ -598,16 +582,9 @@ static void gfs2_readahead(struct readahead_control *rac)
 {
 	struct inode *inode = rac->mapping->host;
 	struct gfs2_inode *ip = GFS2_I(inode);
-	struct gfs2_holder gh;
 
-	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
-	if (gfs2_glock_nq(&gh))
-		goto out_uninit;
 	if (!gfs2_is_stuffed(ip))
 		mpage_readahead(rac, gfs2_block_map);
-	gfs2_glock_dq(&gh);
-out_uninit:
-	gfs2_holder_uninit(&gh);
 }
 
 /**
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index fe305e4bfd37..f729b0ff2a3c 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -558,8 +558,29 @@ static vm_fault_t gfs2_page_mkwrite(struct vm_fault *vmf)
 	return block_page_mkwrite_return(ret);
 }
 
+static vm_fault_t gfs2_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_holder gh;
+	vm_fault_t ret;
+	int err;
+
+	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+	err = gfs2_glock_nq(&gh);
+	if (err) {
+		ret = block_page_mkwrite_return(err);
+		goto out_uninit;
+	}
+	ret = filemap_fault(vmf);
+	gfs2_glock_dq(&gh);
+out_uninit:
+	gfs2_holder_uninit(&gh);
+	return ret;
+}
+
 static const struct vm_operations_struct gfs2_vm_ops = {
-	.fault = filemap_fault,
+	.fault = gfs2_fault,
 	.map_pages = filemap_map_pages,
 	.page_mkwrite = gfs2_page_mkwrite,
 };
@@ -824,15 +845,51 @@ static ssize_t gfs2_file_direct_write(struct kiocb *iocb, struct iov_iter *from)
 
 static ssize_t gfs2_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
+	struct gfs2_inode *ip;
+	struct gfs2_holder gh;
+	size_t written = 0;
 	ssize_t ret;
 
+	gfs2_holder_mark_uninitialized(&gh);
 	if (iocb->ki_flags & IOCB_DIRECT) {
 		ret = gfs2_file_direct_read(iocb, to);
 		if (likely(ret != -ENOTBLK))
 			return ret;
 		iocb->ki_flags &= ~IOCB_DIRECT;
 	}
-	return generic_file_read_iter(iocb, to);
+	iocb->ki_flags |= IOCB_CACHED;
+	ret = generic_file_read_iter(iocb, to);
+	iocb->ki_flags &= ~IOCB_CACHED;
+	if (ret >= 0) {
+		if (!iov_iter_count(to))
+			return ret;
+		written = ret;
+	} else {
+		switch(ret) {
+		case -EAGAIN:
+			if (iocb->ki_flags & IOCB_NOWAIT)
+				return ret;
+			break;
+		case -ECANCELED:
+			break;
+		default:
+			return ret;
+		}
+	}
+	ip = GFS2_I(iocb->ki_filp->f_mapping->host);
+	gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+	ret = gfs2_glock_nq(&gh);
+	if (ret)
+		goto out_uninit;
+	ret = generic_file_read_iter(iocb, to);
+	if (ret > 0)
+		written += ret;
+	if (gfs2_holder_initialized(&gh))
+		gfs2_glock_dq(&gh);
+out_uninit:
+	if (gfs2_holder_initialized(&gh))
+		gfs2_holder_uninit(&gh);
+	return written ? written : ret;
 }
 
 /**
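
The two-pass read in gfs2_file_read_iter can be hard to follow from the diff
alone, so here is a simplified user-space model of its control flow.  It is a
sketch under loud assumptions: struct model_file, model_read() and
model_file_read() are hypothetical names, the "page cache" is just a count of
leading cached bytes that never grows, the lock is a counter standing in for
the inode glock, and the IOCB_NOWAIT early return is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define IOCB_CACHED	(1 << 8)

/* Hypothetical model: the first `cached` bytes of the file are cached. */
struct model_file {
	size_t size;
	size_t cached;
	int lock_taken;	/* how often the slow path took the "glock" */
};

/*
 * Stand-in for generic_file_read_iter(): advances *pos, shrinks *remaining
 * and returns the number of bytes "copied".  With IOCB_CACHED it stops at
 * the end of the cached range and returns -EAGAIN if nothing was cached.
 */
long model_read(struct model_file *f, size_t *pos, size_t *remaining, int flags)
{
	size_t avail = (*pos < f->size) ? f->size - *pos : 0;
	size_t want = (*remaining < avail) ? *remaining : avail;

	assert(f->cached <= f->size);
	if (flags & IOCB_CACHED) {
		size_t hit = (*pos < f->cached) ? f->cached - *pos : 0;

		if (hit < want) {
			if (hit == 0)
				return -EAGAIN;
			want = hit;	/* partial read out of the cache */
		}
	}
	*pos += want;
	*remaining -= want;
	return (long)want;
}

/* Mirrors the shape of the patched gfs2_file_read_iter(). */
long model_file_read(struct model_file *f, size_t pos, size_t count)
{
	size_t written = 0;
	long ret;

	/* Pass 1: tentative read from the cache, no lock held. */
	ret = model_read(f, &pos, &count, IOCB_CACHED);
	if (ret >= 0) {
		if (count == 0)
			return ret;	/* fully satisfied from the cache */
		written = (size_t)ret;
	} else if (ret != -EAGAIN && ret != -ECANCELED) {
		return ret;
	}

	/* Pass 2: take the lock and read the rest without IOCB_CACHED. */
	f->lock_taken++;		/* stands in for gfs2_glock_nq() */
	ret = model_read(f, &pos, &count, 0);
	if (ret > 0)
		written += (size_t)ret;
	/* gfs2_glock_dq() would go here */
	return written ? (long)written : ret;
}
```

With a 100-byte file whose first 40 bytes are cached, a 30-byte read at offset
0 never touches the lock, while a 50-byte read at offset 20 copies 20 bytes in
the first pass and the remaining 30 under the lock, returning 50 in total.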