diff mbox

[02/16] vfs: check kiocb->ki_flags instead filp->fl_flags

Message ID 1428174805-853-3-git-send-email-dmonakhov@openvz.org (mailing list archive)
State New, archived
Headers show

Commit Message

Dmitry Monakhov April 4, 2015, 7:13 p.m. UTC
generic_write_checks now accept kiocb as an argument
Unfortunetly it is impossible to get rid of old interface because some crappy
do not support write_iter interface so leave __generic_write_checks as backward
compatibility helper.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
---
 include/linux/fs.h |    8 +++++++-
 mm/filemap.c       |   13 +++++++------
 2 files changed, 14 insertions(+), 7 deletions(-)

Comments

Al Viro April 4, 2015, 9:36 p.m. UTC | #1
On Sat, Apr 04, 2015 at 11:13:11PM +0400, Dmitry Monakhov wrote:
> generic_write_checks now accept kiocb as an argument
> Unfortunetly it is impossible to get rid of old interface because some crappy
> do not support write_iter interface so leave __generic_write_checks as backward
> compatibility helper.

Check the current vfs.git#for-next (there's even some generic_write_checks()
work in it).  The same goes for the rest of the series.  Please, rebase it.

What's more, generic_write_checks() should take iov_iter *, not the address
of something its ->count had been copied into.  Note that all callers of that
thing end up doing iov_iter_truncate() pretty soon afterwards.  I hadn't
pushed that one out yet (there is some weirdness in ocfs2 which might be
a bug; I want to sort that out first), but that's where it's going.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Monakhov April 5, 2015, 11:03 a.m. UTC | #2
Al Viro <viro@ZenIV.linux.org.uk> writes:

> On Sat, Apr 04, 2015 at 11:13:11PM +0400, Dmitry Monakhov wrote:
>> generic_write_checks now accept kiocb as an argument
>> Unfortunetly it is impossible to get rid of old interface because some crappy
>> do not support write_iter interface so leave __generic_write_checks as backward
>> compatibility helper.
>
> Check the current vfs.git#for-next (there's even some generic_write_checks()
> work in it).  The same goes for the rest of the series.  Please, rebase it.
Ok. Yes you right. I've prepared patches against vfs.git#for-next(48a0c62) but
it is too old. Will rebase. Also it is reasonable to fold my fs-xxx
conversion to one combo patch similar to (d04cfe7840)
>
> What's more, generic_write_checks() should take iov_iter *, not the address
> of something its ->count had been copied into.  Note that all callers of that
> thing end up doing iov_iter_truncate() pretty soon afterwards.  I hadn't
> pushed that one out yet (there is some weirdness in ocfs2 which might be
> a bug; I want to sort that out first), but that's where it's going.
I'm not sure I have get your point about ocfs2 because it does
iov_iter_truncate() right after generic_write_checks()
But nfs definitely has a bug because iter was not truncated after generic_write_checks
->nfs_file_direct_write
  ->generic_write_checks(file, &pos, &count);
  dreq->bytes_left = count;
  task_io_account_write(count);
  ->nfs_direct_write_schedule_iovec(dreq, iter, pos);
    ###Ignore dreq->bytes_left(ala count) and submit non truncated iter
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
Al Viro April 5, 2015, 6:11 p.m. UTC | #3
On Sun, Apr 05, 2015 at 02:03:22PM +0300, Dmitry Monakhov wrote:

> I'm not sure I have get your point about ocfs2 because it does
> iov_iter_truncate() right after generic_write_checks()

This
        ret = ocfs2_prepare_inode_for_write(file, ppos, count, appending,
                                            &can_do_direct, &has_refcount);
being done before generic_write_checks().  It actually duplicates some
parts of generic_write_checks() inside (O_APPEND-related, and AFAICS
they _are_ triggered twice that way).
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Al Viro April 5, 2015, 9:54 p.m. UTC | #4
On Sun, Apr 05, 2015 at 07:11:45PM +0100, Al Viro wrote:
> On Sun, Apr 05, 2015 at 02:03:22PM +0300, Dmitry Monakhov wrote:
> 
> > I'm not sure I have get your point about ocfs2 because it does
> > iov_iter_truncate() right after generic_write_checks()
> 
> This
>         ret = ocfs2_prepare_inode_for_write(file, ppos, count, appending,
>                                             &can_do_direct, &has_refcount);
> being done before generic_write_checks().  It actually duplicates some
> parts of generic_write_checks() inside (O_APPEND-related, and AFAICS
> they _are_ triggered twice that way).

XFS seems to be buggered as well:
        /* DIO must be aligned to device logical sector size */
        if ((pos | count) & target->bt_logical_sectormask)
                return -EINVAL;

        /* "unaligned" here means not aligned to a filesystem block */
        if ((pos & mp->m_blockmask) || ((pos + count) & mp->m_blockmask))
                unaligned_io = 1;
...
        ret = xfs_file_aio_write_checks(file, &pos, &count, &iolock);

now, play with rlimit() and suddenly the alignment checks above have nothing
to do with what'll actually happen after that sucker - it's calling
generic_write_checks(), so...

Incidentally, we want the result of alignment check to decide how to take
the lock that protects the file size, so simply lifting O_APPEND treatment
above those won't do.  I suspect that in case of lock taken shared we
need to redo alignment checks and treat "it became unaligned" as "unlock
and redo it with lock taken exclusive".

BTW, xfs_break_layouts() having dropped and regained lock would invalidate
the O_APPEND treatment in generic_write_checks() just prior (both in
xfs_file_aio_write_checks())...

Al "really not fond of xfs_rw_ilock()" Viro...

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4c20030..992685e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2570,7 +2570,7 @@  extern int sb_min_blocksize(struct super_block *, int);
 
 extern int generic_file_mmap(struct file *, struct vm_area_struct *);
 extern int generic_file_readonly_mmap(struct file *, struct vm_area_struct *);
-int generic_write_checks(struct file *file, loff_t *pos, size_t *count, int isblk);
+int __generic_write_checks(struct file * file, loff_t *pos, size_t *count, int isblk, int append);
 extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *);
 extern ssize_t __generic_file_write_iter(struct kiocb *, struct iov_iter *);
 extern ssize_t generic_file_write_iter(struct kiocb *, struct iov_iter *);
@@ -2798,6 +2798,12 @@  static inline bool is_nonblock_kiocb(struct kiocb *kiocb)
 	return kiocb->ki_flags & IOCB_NONBLOCK;
 }
 
+static inline int generic_write_checks(struct kiocb *iocb, loff_t *pos, size_t *count, int isblk)
+{
+	return __generic_write_checks(iocb->ki_filp, pos, count, isblk,
+				      is_append_kiocb(iocb));
+}
+
 /* XXX: this is obsolete helper, and will be removed soon.
  * One should use io_direct_kiocb() instead */
 static inline bool io_is_direct(struct file *filp)
diff --git a/mm/filemap.c b/mm/filemap.c
index 876f4e6..b519824 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1694,7 +1694,7 @@  generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 	loff_t *ppos = &iocb->ki_pos;
 	loff_t pos = *ppos;
 
-	if (io_is_direct(file)) {
+	if (is_direct_kiocb(iocb)) {
 		struct address_space *mapping = file->f_mapping;
 		struct inode *inode = mapping->host;
 		size_t count = iov_iter_count(iter);
@@ -2260,7 +2260,8 @@  EXPORT_SYMBOL(read_cache_page_gfp);
  * Returns appropriate error code that caller should return or
  * zero in case that write should be allowed.
  */
-inline int generic_write_checks(struct file *file, loff_t *pos, size_t *count, int isblk)
+inline int __generic_write_checks(struct file *file, loff_t *pos, size_t *count,
+				  int isblk, int is_append)
 {
 	struct inode *inode = file->f_mapping->host;
 	unsigned long limit = rlimit(RLIMIT_FSIZE);
@@ -2270,7 +2271,7 @@  inline int generic_write_checks(struct file *file, loff_t *pos, size_t *count, i
 
 	if (!isblk) {
 		/* FIXME: this is for backwards compatibility with 2.4 */
-		if (file->f_flags & O_APPEND)
+		if (is_append)
                         *pos = i_size_read(inode);
 
 		if (limit != RLIM_INFINITY) {
@@ -2333,7 +2334,7 @@  inline int generic_write_checks(struct file *file, loff_t *pos, size_t *count, i
 	}
 	return 0;
 }
-EXPORT_SYMBOL(generic_write_checks);
+EXPORT_SYMBOL(__generic_write_checks);
 
 int pagecache_write_begin(struct file *file, struct address_space *mapping,
 				loff_t pos, unsigned len, unsigned flags,
@@ -2565,7 +2566,7 @@  ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 
 	/* We can write back this queue in page reclaim */
 	current->backing_dev_info = inode_to_bdi(inode);
-	err = generic_write_checks(file, &pos, &count, S_ISBLK(inode->i_mode));
+	err = generic_write_checks(iocb, &pos, &count, S_ISBLK(inode->i_mode));
 	if (err)
 		goto out;
 
@@ -2582,7 +2583,7 @@  ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	if (err)
 		goto out;
 
-	if (io_is_direct(file)) {
+	if (is_direct_kiocb(iocb)) {
 		loff_t endbyte;
 
 		written = generic_file_direct_write(iocb, from, pos);