From patchwork Fri Dec 19 06:24:05 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 5517571
Date: Thu, 18 Dec 2014 22:24:05 -0800
From: Omar Sandoval
To: Al Viro
Cc: Christoph Hellwig, Jan Kara, Andrew Morton, Trond Myklebust,
 David Sterba, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO
Message-ID: <20141219062405.GA11486@mew>
References: <20141215165615.GA19041@infradead.org>
 <20141215221100.GA4637@mew>
 <20141216083543.GA32425@infradead.org>
 <20141216085624.GA25256@mew>
 <20141217080610.GA20335@infradead.org>
 <20141217082020.GH22149@ZenIV.linux.org.uk>
 <20141217082437.GA9301@infradead.org>
 <20141217145832.GA3497@mew>
 <20141217185256.GA5657@infradead.org>
 <20141217220313.GK22149@ZenIV.linux.org.uk>
In-Reply-To: <20141217220313.GK22149@ZenIV.linux.org.uk>
X-Mailing-List: linux-nfs@vger.kernel.org

On Wed, Dec 17, 2014 at 10:03:13PM +0000, Al Viro wrote:
> On Wed, Dec 17, 2014 at 10:52:56AM -0800, Christoph Hellwig wrote:
> > On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote:
> > > See my previous message. If we use O_DIRECT on the original open, then
> > > filesystems that implement bmap but not direct_IO will no longer work.
> > > These are the ones that I found in my tree:
> >
> > In the long run I don't think they are worth keeping. But to keep you
> > out of that discussion you can just try an open without O_DIRECT if the
> > open with the flag failed.
>
> Umm... That's one possibility, of course (and if swapon(2) is on someone's
> hotpath, I really would like to see what the hell they are doing - it has
> to be interesting in a sick way).

If this is the approach you'd prefer, I'll go ahead and do that for v2. I
personally think it looks pretty kludgey, but I'm fine either way:

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 63f55cc..c1b3073 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2379,7 +2379,16 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		name = NULL;
 		goto bad_swap;
 	}
-	swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0);
+	swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0);
+	if (IS_ERR(swap_file) && PTR_ERR(swap_file) == -EINVAL)
+		swap_file = file_open_name(name, O_RDWR | O_LARGEFILE, 0);
 	if (IS_ERR(swap_file)) {
 		error = PTR_ERR(swap_file);
 		swap_file = NULL;

> BTW, speaking of read/write vs. swap - what's the story with e.g. AFS
> write() checking IS_SWAPFILE() and failing with -EBUSY? Note that
> 	* it's done before acquiring i_mutex, so it isn't race-free
> 	* it's dubious from the POSIX POV - EBUSY isn't in the error
> 	  list for write(2).
> 	* other filesystems generally don't have anything of that sort.
> NFS does, but local ones do not...
> Besides, do we even allow swapfiles on AFS?

AFS doesn't implement ->bmap or ->swap_activate, so that code is dead,
probably cargo-culted from the NFS code. It seems pretty pointless, not
only because it's inconsistent with the local filesystems like you
mentioned, but also because it's trivial to bypass with O_DIRECT on NFS:

ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
{
	struct file *file = iocb->ki_filp;
	struct inode *inode = file_inode(file);
	unsigned long written = 0;
	ssize_t result;
	size_t count = iov_iter_count(from);
	loff_t pos = iocb->ki_pos;

	result = nfs_key_timeout_notify(file, inode);
	if (result)
		return result;

	if (file->f_flags & O_DIRECT)
		return nfs_file_direct_write(iocb, from, pos);

	dprintk("NFS: write(%pD2, %zu@%Ld)\n",
		file, count, (long long) pos);

	result = -EBUSY;
	if (IS_SWAPFILE(inode))
		goto out_swapfile;

I think it's safe to scrap that code. However, this also led me to find
that NFS doesn't prevent truncates on an active swapfile. I'm submitting
a patch for that now.
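
For illustration only, the open-with-fallback logic from the diff above can be
mimicked in userspace. This is a rough sketch, not kernel code: the helper
name open_prefer_direct() and its used_direct out-parameter are made up for
this example, and it assumes Linux with glibc, where _GNU_SOURCE exposes
O_DIRECT. Like the diff, it retries without O_DIRECT only when the first
open fails with EINVAL and treats every other error as final.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes O_DIRECT and O_LARGEFILE on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical userspace helper (not part of the patch): try to open
 * @path with O_DIRECT, and fall back to a plain open only if the first
 * attempt failed with EINVAL. On success, *used_direct records which
 * attempt succeeded. On any other error it returns -1 with errno set
 * by open(2) and leaves *used_direct untouched.
 */
int open_prefer_direct(const char *path, int *used_direct)
{
	int fd = open(path, O_RDWR | O_LARGEFILE | O_DIRECT);

	if (fd >= 0) {
		*used_direct = 1;
		return fd;
	}
	if (errno != EINVAL)
		return -1;	/* e.g. ENOENT or EACCES: do not retry */

	/* Assumed to mean "O_DIRECT not supported here"; retry without it. */
	*used_direct = 0;
	return open(path, O_RDWR | O_LARGEFILE);
}
```

A caller would check the returned descriptor as usual and, if it cares,
inspect used_direct to see whether the fallback path was taken. The
sketch shares the weakness of the diff: EINVAL from the first open is
taken to mean that O_DIRECT is unsupported, which is an assumption
rather than a guarantee.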