From patchwork Tue Jun 27 15:09:48 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 9812619
Subject: Re: [PATCH 1/9] fs: add fcntl() interface for setting/getting write
 life time hints
To: Christoph Hellwig
Cc: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, hch@lst.de,
 martin.petersen@oracle.com
References: <1498491480-16306-1-git-send-email-axboe@kernel.dk>
 <1498491480-16306-2-git-send-email-axboe@kernel.dk>
 <20170627144255.GB2541@infradead.org>
From: Jens Axboe
Message-ID:
Date: Tue, 27 Jun 2017 09:09:48 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.1.1
In-Reply-To: <20170627144255.GB2541@infradead.org>
Content-Language: en-US
Sender:
 linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

On 06/27/2017 08:42 AM, Christoph Hellwig wrote:
> The API looks ok, but the code could use some cleanups. What do
> you think about the incremental patch below:
>
> It refactors various manipulations, and stores the write hint right
> in the iocb as there is a 4 byte hole (this will need some minor
> adjustments in the next patches):

How's this? Fixes for compile, and also squeeze an enum rw_hint into
a hole in the inode structure.

I'll refactor around this and squeeze into bio/rq holes as well, then
re-test it.

diff --git a/fs/fcntl.c b/fs/fcntl.c
index f4e7267d117f..25f96a101f1a 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -243,6 +243,62 @@ static int f_getowner_uids(struct file *filp, unsigned long arg)
 }
 #endif
 
+static bool rw_hint_valid(enum rw_hint hint)
+{
+	switch (hint) {
+	case RWF_WRITE_LIFE_NOT_SET:
+	case RWH_WRITE_LIFE_NONE:
+	case RWH_WRITE_LIFE_SHORT:
+	case RWH_WRITE_LIFE_MEDIUM:
+	case RWH_WRITE_LIFE_LONG:
+	case RWH_WRITE_LIFE_EXTREME:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static long fcntl_rw_hint(struct file *file, unsigned int cmd,
+			  unsigned long arg)
+{
+	struct inode *inode = file_inode(file);
+	u64 *argp = (u64 __user *)arg;
+	enum rw_hint hint;
+
+	switch (cmd) {
+	case F_GET_FILE_RW_HINT:
+		if (put_user(__file_write_hint(file), argp))
+			return -EFAULT;
+		return 0;
+	case F_SET_FILE_RW_HINT:
+		if (get_user(hint, argp))
+			return -EFAULT;
+		if (!rw_hint_valid(hint))
+			return -EINVAL;
+
+		spin_lock(&file->f_lock);
+		file->f_write_hint = hint;
+		spin_unlock(&file->f_lock);
+		return 0;
+	case F_GET_RW_HINT:
+		if (put_user(__inode_write_hint(inode), argp))
+			return -EFAULT;
+		return 0;
+	case F_SET_RW_HINT:
+		if (get_user(hint, argp))
+			return -EFAULT;
+		if (!rw_hint_valid(hint))
+			return -EINVAL;
+
+		inode_lock(inode);
+		inode->i_write_hint = hint;
+		inode_unlock(inode);
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
 static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 		struct file *filp)
 {
@@ -337,6 +393,12 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 	case F_GET_SEALS:
 		err = shmem_fcntl(filp, cmd, arg);
 		break;
+	case F_GET_RW_HINT:
+	case F_SET_RW_HINT:
+	case F_GET_FILE_RW_HINT:
+	case F_SET_FILE_RW_HINT:
+		err = fcntl_rw_hint(filp, cmd, arg);
+		break;
 	default:
 		break;
 	}
diff --git a/fs/inode.c b/fs/inode.c
index db5914783a71..f0e5fc77e6a4 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -146,6 +146,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	i_gid_write(inode, 0);
 	atomic_set(&inode->i_writecount, 0);
 	inode->i_size = 0;
+	inode->i_write_hint = WRITE_LIFE_NOT_SET;
 	inode->i_blocks = 0;
 	inode->i_bytes = 0;
 	inode->i_generation = 0;
diff --git a/fs/open.c b/fs/open.c
index cd0c5be8d012..3fe0c4aa7d27 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -759,6 +759,7 @@ static int do_dentry_open(struct file *f,
 	     likely(f->f_op->write || f->f_op->write_iter))
 		f->f_mode |= FMODE_CAN_WRITE;
 
+	f->f_write_hint = WRITE_LIFE_NOT_SET;
 	f->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
 
 	file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4574121f4746..4587a181162e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -265,6 +265,20 @@ struct page;
 struct address_space;
 struct writeback_control;
 
+#include
+
+/*
+ * Write life time hint values.
+ */
+enum rw_hint {
+	WRITE_LIFE_NOT_SET	= 0,
+	WRITE_LIFE_NONE		= RWH_WRITE_LIFE_NONE,
+	WRITE_LIFE_SHORT	= RWH_WRITE_LIFE_SHORT,
+	WRITE_LIFE_MEDIUM	= RWH_WRITE_LIFE_MEDIUM,
+	WRITE_LIFE_LONG		= RWH_WRITE_LIFE_LONG,
+	WRITE_LIFE_EXTREME	= RWH_WRITE_LIFE_EXTREME,
+};
+
 #define IOCB_EVENTFD		(1 << 0)
 #define IOCB_APPEND		(1 << 1)
 #define IOCB_DIRECT		(1 << 2)
@@ -280,6 +294,7 @@ struct kiocb {
 	void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
 	void			*private;
 	int			ki_flags;
+	enum rw_hint		ki_hint;
 };
 
 static inline bool is_sync_kiocb(struct kiocb *kiocb)
@@ -597,6 +612,7 @@ struct inode {
 	spinlock_t		i_lock;	/* i_blocks, i_bytes, maybe i_size */
 	unsigned short		i_bytes;
 	unsigned int		i_blkbits;
+	enum rw_hint		i_write_hint;
 	blkcnt_t		i_blocks;
 
 #ifdef __NEED_I_SIZE_ORDERED
@@ -851,6 +867,7 @@ struct file {
 	 * Must not be taken from IRQ context.
 	 */
 	spinlock_t		f_lock;
+	enum rw_hint		f_write_hint;
 	atomic_long_t		f_count;
 	unsigned int		f_flags;
 	fmode_t			f_mode;
@@ -1026,8 +1043,6 @@ struct file_lock_context {
 #define OFFT_OFFSET_MAX	INT_LIMIT(off_t)
 #endif
 
-#include
-
 extern void send_sigio(struct fown_struct *fown, int fd, int band);
 
 /*
@@ -1878,6 +1893,35 @@ static inline bool HAS_UNMAPPED_ID(struct inode *inode)
 	return !uid_valid(inode->i_uid) || !gid_valid(inode->i_gid);
 }
 
+static inline enum rw_hint __inode_write_hint(struct inode *inode)
+{
+	return inode->i_write_hint;
+}
+
+static inline enum rw_hint inode_write_hint(struct inode *inode)
+{
+	enum rw_hint ret = __inode_write_hint(inode);
+	if (ret != WRITE_LIFE_NOT_SET)
+		return ret;
+	return WRITE_LIFE_NONE;
+}
+
+static inline enum rw_hint __file_write_hint(struct file *file)
+{
+	if (file->f_write_hint != WRITE_LIFE_NOT_SET)
+		return file->f_write_hint;
+
+	return __inode_write_hint(file_inode(file));
+}
+
+static inline enum rw_hint file_write_hint(struct file *file)
+{
+	enum rw_hint ret = __file_write_hint(file);
+	if (ret != WRITE_LIFE_NOT_SET)
+		return ret;
+	return WRITE_LIFE_NONE;
+}
+
 /*
  * Inode state bits. Protected by inode->i_lock
  *
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 813afd6eee71..ec69d55bcec7 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -43,6 +43,27 @@
 /* (1U << 31) is reserved for signed error codes */
 
 /*
+ * Set/Get write life time hints. {GET,SET}_RW_HINT operate on the
+ * underlying inode, while {GET,SET}_FILE_RW_HINT operate only on
+ * the specific file.
+ */
+#define F_GET_RW_HINT		(F_LINUX_SPECIFIC_BASE + 11)
+#define F_SET_RW_HINT		(F_LINUX_SPECIFIC_BASE + 12)
+#define F_GET_FILE_RW_HINT	(F_LINUX_SPECIFIC_BASE + 13)
+#define F_SET_FILE_RW_HINT	(F_LINUX_SPECIFIC_BASE + 14)
+
+/*
+ * Valid hint values for F_{GET,SET}_RW_HINT. 0 is "not set", or can be
+ * used to clear any hints previously set.
+ */
+#define RWF_WRITE_LIFE_NOT_SET	0
+#define RWH_WRITE_LIFE_NONE	1
+#define RWH_WRITE_LIFE_SHORT	2
+#define RWH_WRITE_LIFE_MEDIUM	3
+#define RWH_WRITE_LIFE_LONG	4
+#define RWH_WRITE_LIFE_EXTREME	5
+
+/*
  * Types of directory notifications that may be requested.
  */
 #define DN_ACCESS	0x00000001	/* File accessed */
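
For anyone who wants to poke at the interface from userspace, here is a
minimal sketch. It assumes the command numbers and hint values from the
uapi hunk of this patch, and that F_LINUX_SPECIFIC_BASE is 1024 as in
asm-generic/fcntl.h. The helper names set_inode_write_hint(),
get_inode_write_hint() and resolve_write_hint() are made up for
illustration and are not part of the patch; resolve_write_hint() only
mirrors the fallback order of file_write_hint() (file hint first, then
inode hint, then NONE).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>

/* Assumption: values copied from the uapi hunk in this patch. */
#ifndef F_LINUX_SPECIFIC_BASE
#define F_LINUX_SPECIFIC_BASE	1024
#endif
#ifndef F_GET_RW_HINT
#define F_GET_RW_HINT		(F_LINUX_SPECIFIC_BASE + 11)
#endif
#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT		(F_LINUX_SPECIFIC_BASE + 12)
#endif

/* Userspace copy of the in-kernel enum, with the numeric values spelled out. */
enum rw_hint {
	WRITE_LIFE_NOT_SET	= 0,
	WRITE_LIFE_NONE		= 1,
	WRITE_LIFE_SHORT	= 2,
	WRITE_LIFE_MEDIUM	= 3,
	WRITE_LIFE_LONG		= 4,
	WRITE_LIFE_EXTREME	= 5,
};

/* Same range check as rw_hint_valid(), done on the raw 64-bit value. */
static inline int rw_hint_valid(uint64_t hint)
{
	return hint <= WRITE_LIFE_EXTREME;
}

/* Hypothetical helper: model of file_write_hint() fallback order. */
static inline enum rw_hint resolve_write_hint(enum rw_hint file_hint,
					      enum rw_hint inode_hint)
{
	enum rw_hint ret = file_hint != WRITE_LIFE_NOT_SET ? file_hint
							   : inode_hint;

	return ret != WRITE_LIFE_NOT_SET ? ret : WRITE_LIFE_NONE;
}

/*
 * Hypothetical helper: set the per-inode hint. The fcntl() argument is a
 * pointer to a 64-bit value, matching the u64 pointer in fcntl_rw_hint().
 * Returns 0 on success or a negative errno.
 */
static inline int set_inode_write_hint(int fd, uint64_t hint)
{
	if (!rw_hint_valid(hint))
		return -EINVAL;
	if (fcntl(fd, F_SET_RW_HINT, &hint) < 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: read back the per-inode hint into *hint. */
static inline int get_inode_write_hint(int fd, uint64_t *hint)
{
	if (fcntl(fd, F_GET_RW_HINT, hint) < 0)
		return -errno;
	return 0;
}
```

A caller would open a file, call set_inode_write_hint(fd, WRITE_LIFE_SHORT)
and then read the value back with get_inode_write_hint(). On a kernel
without this patch the fcntl() calls are expected to fail, so the return
value should be checked rather than assumed.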