From patchwork Mon Jun 19 20:33:41 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 9797775
Subject: Re: [PATCH 04/11] fs: add support for allowing applications to
 pass in write life time hints
From: Jens Axboe
To: Christoph Hellwig
Cc: linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
 adilger@dilger.ca, martin.petersen@oracle.com
References: <1497729594-4707-1-git-send-email-axboe@kernel.dk>
 <1497729594-4707-5-git-send-email-axboe@kernel.dk>
 <20170619062706.GC2311@infradead.org>
 <8db675e7-b868-bae1-784a-33cba67d0874@kernel.dk>
 <20170619185805.GC32047@infradead.org>
 <5259a1ba-1c5f-2d89-3b47-3a81cd0a3e4e@kernel.dk>
 <0c3c6a33-77c5-337e-66e0-49fe22d3cc4e@kernel.dk>
Message-ID: <3e6f7b97-8372-b268-f55a-cc88812c1a68@kernel.dk>
Date: Mon, 19 Jun 2017 14:33:41 -0600
In-Reply-To: <0c3c6a33-77c5-337e-66e0-49fe22d3cc4e@kernel.dk>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On 06/19/2017 01:10 PM, Jens Axboe wrote:
> On 06/19/2017 01:00 PM, Jens Axboe wrote:
>> On 06/19/2017 12:58 PM, Christoph Hellwig wrote:
>>> On Mon, Jun 19, 2017 at 10:02:09AM -0600, Jens Axboe wrote:
>>>> Actually, one good use case is O_DIRECT on a block device. Since I'm
>>>> not a huge fan of having per-call hints that is only useful for a
>>>> single case, how about we add the hints to the struct file as well?
>>>> For buffered IO, just grab it from the inode. If we have a file
>>>> available, then that overrides the per-inode setting.
>>>
>>> Even for buffered I/O per-file would seem more useful to be honest.
>>> For the buffer_head based file systems this could even be done fairly
>>> easily.
>>
>> If I add the per-file hint as well, then anywhere that has the file should
>> just grab it from there. Unless not set, then grab from inode.
>>
>> That does raise an issue with the NONE hint being 0. We can tell right now
>> if NONE was set, or nothing was set. This becomes a problem if we want the
>> file hint to override the inode hint. Should probably just bump the values
>> up by one, so that NONE is 1, SHORT is 2, etc.
>
> Actually, we don't have to, as long as the file inherits the inode mask.
> Then we can just use the file hint if it differs from the inode hint.

That doesn't work, in case it's cleared, or for checking whether it has
been set or not. Oh well, I added a NOT_SET variant for this.

See below for an incremental that adds support for file write hints as
well. Use the file write hint, if we have it, otherwise use the inode
provided one.
Setting hints on a file propagates to the inode, only if the inode
doesn't currently have a hint set.

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 113b78c11631..34ca821767a0 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -247,12 +247,16 @@ static long fcntl_rw_hint(struct file *file, unsigned int cmd,
 			  u64 __user *ptr)
 {
 	struct inode *inode = file_inode(file);
+	u64 old_hint, hint;
 	long ret = 0;
-	u64 hint;

 	switch (cmd) {
 	case F_GET_RW_HINT:
-		hint = mask_to_write_hint(inode->i_flags, S_WRITE_LIFE_SHIFT);
+		if (file->f_write_hint != WRITE_LIFE_NOT_SET)
+			hint = file->f_write_hint;
+		else
+			hint = mask_to_write_hint(inode->i_flags,
+							S_WRITE_LIFE_SHIFT);
 		if (put_user(hint, ptr))
 			ret = -EFAULT;
 		break;
@@ -267,7 +271,15 @@ static long fcntl_rw_hint(struct file *file, unsigned int cmd,
 		case WRITE_LIFE_MEDIUM:
 		case WRITE_LIFE_LONG:
 		case WRITE_LIFE_EXTREME:
-			inode_set_write_hint(inode, hint);
+			spin_lock(&file->f_lock);
+			file->f_write_hint = hint;
+			spin_unlock(&file->f_lock);
+
+			/* Only propagate hint to inode, if no hint is set */
+			old_hint = mask_to_write_hint(inode->i_flags,
+							S_WRITE_LIFE_SHIFT);
+			if (old_hint == WRITE_LIFE_NOT_SET)
+				inode_set_write_hint(inode, hint);
 			ret = 0;
 			break;
 		default:
diff --git a/fs/inode.c b/fs/inode.c
index defb015a2c6d..e4a4e123d52b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -134,7 +134,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)

 	inode->i_sb = sb;
 	inode->i_blkbits = sb->s_blocksize_bits;
-	inode->i_flags = 0;
+	inode->i_flags = S_WRITE_LIFE_MASK;
 	atomic_set(&inode->i_count, 1);
 	inode->i_op = &empty_iops;
 	inode->i_fop = &no_open_fops;
diff --git a/fs/open.c b/fs/open.c
index cd0c5be8d012..3fe0c4aa7d27 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -759,6 +759,7 @@ static int do_dentry_open(struct file *f,
 	     likely(f->f_op->write || f->f_op->write_iter))
 		f->f_mode |= FMODE_CAN_WRITE;

+	f->f_write_hint = WRITE_LIFE_NOT_SET;
 	f->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);

 	file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8720251cc153..e81bdb8ec189 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -859,6 +859,7 @@ struct file {
 	 * Must not be taken from IRQ context.
 	 */
 	spinlock_t		f_lock;
+	unsigned int		f_write_hint;
 	atomic_long_t		f_count;
 	unsigned int 		f_flags;
 	fmode_t			f_mode;
@@ -1902,7 +1903,9 @@ enum rw_hint {
 	WRITE_LIFE_SHORT	= RWH_WRITE_LIFE_SHORT,
 	WRITE_LIFE_MEDIUM	= RWH_WRITE_LIFE_MEDIUM,
 	WRITE_LIFE_LONG		= RWH_WRITE_LIFE_LONG,
-	WRITE_LIFE_EXTREME	= RWH_WRITE_LIFE_EXTREME
+	WRITE_LIFE_EXTREME	= RWH_WRITE_LIFE_EXTREME,
+
+	WRITE_LIFE_NOT_SET	= 7,
 };

 static inline unsigned int write_hint_to_mask(enum rw_hint hint,
@@ -1917,12 +1920,25 @@ static inline enum rw_hint mask_to_write_hint(unsigned int mask,
 	return (mask >> shift) & 0x7;
 }

-static inline unsigned int inode_write_hint(struct inode *inode)
+static inline enum rw_hint inode_write_hint(struct inode *inode)
 {
-	if (inode)
-		return mask_to_write_hint(inode->i_flags, S_WRITE_LIFE_SHIFT);
+	enum rw_hint ret = WRITE_LIFE_NONE;

-	return 0;
+	if (inode) {
+		ret = mask_to_write_hint(inode->i_flags, S_WRITE_LIFE_SHIFT);
+		if (ret == WRITE_LIFE_NOT_SET)
+			ret = WRITE_LIFE_NONE;
+	}
+
+	return ret;
+}
+
+static inline enum rw_hint file_write_hint(struct file *file)
+{
+	if (file->f_write_hint != WRITE_LIFE_NOT_SET)
+		return file->f_write_hint;
+
+	return inode_write_hint(file_inode(file));
 }

 /*
@@ -3097,9 +3113,7 @@ static inline bool io_is_direct(struct file *filp)

 static inline int iocb_flags(struct file *file)
 {
-	struct inode *inode = file_inode(file);
 	int res = 0;
-
 	if (file->f_flags & O_APPEND)
 		res |= IOCB_APPEND;
 	if (io_is_direct(file))
@@ -3108,13 +3122,8 @@ static inline int iocb_flags(struct file *file)
 		res |= IOCB_DSYNC;
 	if (file->f_flags & __O_SYNC)
 		res |= IOCB_SYNC;
-	if (mask_to_write_hint(inode->i_flags, S_WRITE_LIFE_SHIFT)) {
-		enum rw_hint hint;
-
-		hint = mask_to_write_hint(inode->i_flags, S_WRITE_LIFE_SHIFT);
-		res |= write_hint_to_mask(hint, IOCB_WRITE_LIFE_SHIFT);
-	}

+	res |= write_hint_to_mask(file->f_write_hint, IOCB_WRITE_LIFE_SHIFT);
 	return res;
 }
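
For anyone following the thread, the lookup and propagation rules in the
incremental above can be modelled in plain userspace C. This is only a
rough sketch and not kernel code: struct model_inode, struct model_file
and the model_*() helpers are made-up names for illustration, the hint is
stored as a plain field rather than packed into i_flags, and there is no
locking. NONE = 0, SHORT = 1 and NOT_SET = 7 come from the thread and the
diff; the values of MEDIUM, LONG and EXTREME are assumed to follow on.

```c
/* Hint values: NONE = 0 and NOT_SET = 7 are from the thread/diff,
 * MEDIUM..EXTREME are assumed to continue the sequence. */
enum rw_hint {
	WRITE_LIFE_NONE = 0,
	WRITE_LIFE_SHORT = 1,
	WRITE_LIFE_MEDIUM = 2,
	WRITE_LIFE_LONG = 3,
	WRITE_LIFE_EXTREME = 4,
	WRITE_LIFE_NOT_SET = 7,
};

/* Hypothetical stand-ins for struct inode and struct file. */
struct model_inode {
	enum rw_hint hint;	/* starts out as WRITE_LIFE_NOT_SET */
};

struct model_file {
	enum rw_hint hint;	/* starts out as WRITE_LIFE_NOT_SET */
	struct model_inode *inode;
};

/* Mirrors inode_write_hint(): an unset inode hint reads back as NONE. */
static enum rw_hint model_inode_write_hint(const struct model_inode *inode)
{
	if (!inode || inode->hint == WRITE_LIFE_NOT_SET)
		return WRITE_LIFE_NONE;

	return inode->hint;
}

/* Mirrors file_write_hint(): the file hint wins if it has been set. */
static enum rw_hint model_file_write_hint(const struct model_file *file)
{
	if (file->hint != WRITE_LIFE_NOT_SET)
		return file->hint;

	return model_inode_write_hint(file->inode);
}

/*
 * Mirrors the set path in fcntl_rw_hint(): always store the hint on the
 * file, and propagate it to the inode only if the inode has none yet.
 * Returns 0 on success, -1 for a value outside NONE..EXTREME (the kernel
 * returns an errno instead; -1 is just a placeholder here).
 */
static int model_set_hint(struct model_file *file, enum rw_hint hint)
{
	if (hint < WRITE_LIFE_NONE || hint > WRITE_LIFE_EXTREME)
		return -1;

	file->hint = hint;
	if (file->inode->hint == WRITE_LIFE_NOT_SET)
		file->inode->hint = hint;

	return 0;
}
```

With two model files opened on the same model inode, setting SHORT through
the first one makes the second report SHORT as well (inherited through the
inode), while a later LONG set through the second changes only that file
and leaves the inode at SHORT.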