diff mbox

[v6,4/9] iomap: Generic inline data handling

Message ID 20180602095717.31641-5-agruenba@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Andreas Gruenbacher June 2, 2018, 9:57 a.m. UTC
Add generic inline data handling by adding a pointer to the inline data
region to struct iomap.  When handling a buffered IOMAP_INLINE write,
iomap_write_begin will copy the current inline data from the inline data
region into the page cache, and iomap_write_end will copy the changes in
the page cache back to the inline data region.

This doesn't cover inline data reads and direct I/O yet because so far,
we have no users.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/buffer.c                 | 20 ++++++++-----
 fs/iomap.c                  | 56 ++++++++++++++++++++++++++++++++++---
 include/linux/buffer_head.h |  3 ++
 include/linux/iomap.h       |  1 +
 4 files changed, 69 insertions(+), 11 deletions(-)

Comments

Christoph Hellwig June 2, 2018, 5:04 p.m. UTC | #1
On Sat, Jun 02, 2018 at 11:57:12AM +0200, Andreas Gruenbacher wrote:
> Add generic inline data handling by adding a pointer to the inline data
> region to struct iomap.  When handling a buffered IOMAP_INLINE write,
> iomap_write_begin will copy the current inline data from the inline data
> region into the page cache, and iomap_write_end will copy the changes in
> the page cache back to the inline data region.

This approach looks good.  A few comments below:

> 
> This doesn't cover inline data reads and direct I/O yet because so far,
> we have no users.

I'm fine with that as a first step, but gfs2 should be able to do
proper direct I/O to inline data and use by new iomap_readpage(s)
easily at least for blocksize == PAGE_SIZE, so this should be added
soon.

> -int generic_write_end(struct file *file, struct address_space *mapping,
> +void __generic_write_end(struct file *file, struct address_space *mapping,
>  			loff_t pos, unsigned len, unsigned copied,
> -			struct page *page, void *fsdata)
> +			struct page *page, void *fsdata, bool dirty_inode)

This is going to clash with

http://git.infradead.org/users/hch/xfs.git/commitdiff/2733909d6b40046ce9c7302c2e742c5e993a0108

It should also be a separate prep patch with a good explanation

>  iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
> -		unsigned copied, struct page *page)
> +		unsigned copied, struct page *page, struct iomap *iomap)

Note that I have another patch adding this parameter.  I think we'll need
a common iomap base tree for the gfs2 and xfs changes for the next
merge window.  I'd be happy to one one up.


> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 918f14075702..c61113c71a60 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -47,6 +47,7 @@ struct iomap {
>  	u64			length;	/* length of mapping, bytes */
>  	u16			type;	/* type of mapping */
>  	u16			flags;	/* flags for mapping */
> +	void			*inline_data;  /* inline data buffer */
>  	struct block_device	*bdev;	/* block device for I/O */
>  	struct dax_device	*dax_dev; /* dax_dev for dax operations */
>  };

Eventually we need to find a way to union the inline_data, bdev and
dax_dev fields, but that can be left to later.
Andreas Grünbacher June 4, 2018, 12:02 p.m. UTC | #2
2018-06-02 19:04 GMT+02:00 Christoph Hellwig <hch@lst.de>:
> On Sat, Jun 02, 2018 at 11:57:12AM +0200, Andreas Gruenbacher wrote:
>> Add generic inline data handling by adding a pointer to the inline data
>> region to struct iomap.  When handling a buffered IOMAP_INLINE write,
>> iomap_write_begin will copy the current inline data from the inline data
>> region into the page cache, and iomap_write_end will copy the changes in
>> the page cache back to the inline data region.
>
> This approach looks good.  A few comments below:
>
>>
>> This doesn't cover inline data reads and direct I/O yet because so far,
>> we have no users.
>
> I'm fine with that as a first step, but gfs2 should be able to do
> proper direct I/O to inline data and use by new iomap_readpage(s)
> easily at least for blocksize == PAGE_SIZE, so this should be added
> soon.

Yes, there are a few open ends in gfs's iomap support, but what we
have so far is already quite useful.

>> -int generic_write_end(struct file *file, struct address_space *mapping,
>> +void __generic_write_end(struct file *file, struct address_space *mapping,
>>                       loff_t pos, unsigned len, unsigned copied,
>> -                     struct page *page, void *fsdata)
>> +                     struct page *page, void *fsdata, bool dirty_inode)
>
> This is going to clash with
>
> http://git.infradead.org/users/hch/xfs.git/commitdiff/2733909d6b40046ce9c7302c2e742c5e993a0108
>
> It should also be a separate prep patch with a good explanation

I'll cherry-pick that commit for now.

>>  iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
>> -             unsigned copied, struct page *page)
>> +             unsigned copied, struct page *page, struct iomap *iomap)
>
> Note that I have another patch adding this parameter.  I think we'll need
> a common iomap base tree for the gfs2 and xfs changes for the next
> merge window.  I'd be happy to one one up.

No objections there, I don't think we'll end up with major conflicts though.

>> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
>> index 918f14075702..c61113c71a60 100644
>> --- a/include/linux/iomap.h
>> +++ b/include/linux/iomap.h
>> @@ -47,6 +47,7 @@ struct iomap {
>>       u64                     length; /* length of mapping, bytes */
>>       u16                     type;   /* type of mapping */
>>       u16                     flags;  /* flags for mapping */
>> +     void                    *inline_data;  /* inline data buffer */
>>       struct block_device     *bdev;  /* block device for I/O */
>>       struct dax_device       *dax_dev; /* dax_dev for dax operations */
>>  };
>
> Eventually we need to find a way to union the inline_data, bdev and
> dax_dev fields, but that can be left to later.

For gfs2, passing a private pointer from iomap_begin to iomap_end (for
the buffer head holding the inode) would help as well, but it's not
critical.

Thanks,
Andreas
Christoph Hellwig June 4, 2018, 12:12 p.m. UTC | #3
On Mon, Jun 04, 2018 at 02:02:10PM +0200, Andreas Grünbacher wrote:
> For gfs2, passing a private pointer from iomap_begin to iomap_end (for
> the buffer head holding the inode) would help as well, but it's not
> critical.

A private pointer is fine with me.
Andreas Grünbacher June 4, 2018, 5:01 p.m. UTC | #4
2018-06-04 14:12 GMT+02:00 Christoph Hellwig <hch@lst.de>:
> On Mon, Jun 04, 2018 at 02:02:10PM +0200, Andreas Grünbacher wrote:
>> For gfs2, passing a private pointer from iomap_begin to iomap_end (for
>> the buffer head holding the inode) would help as well, but it's not
>> critical.
>
> A private pointer is fine with me.

Okay, that'll be a patch for later though.

Andreas
diff mbox

Patch

diff --git a/fs/buffer.c b/fs/buffer.c
index 5220d9efcd18..cfd1ad2a5805 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2112,15 +2112,12 @@  int block_write_end(struct file *file, struct address_space *mapping,
 }
 EXPORT_SYMBOL(block_write_end);
 
-int generic_write_end(struct file *file, struct address_space *mapping,
+void __generic_write_end(struct file *file, struct address_space *mapping,
 			loff_t pos, unsigned len, unsigned copied,
-			struct page *page, void *fsdata)
+			struct page *page, void *fsdata, bool dirty_inode)
 {
 	struct inode *inode = mapping->host;
 	loff_t old_size = inode->i_size;
-	int i_size_changed = 0;
-
-	copied = block_write_end(file, mapping, pos, len, copied, page, fsdata);
 
 	/*
 	 * No need to use i_size_read() here, the i_size
@@ -2131,7 +2128,7 @@  int generic_write_end(struct file *file, struct address_space *mapping,
 	 */
 	if (pos+copied > inode->i_size) {
 		i_size_write(inode, pos+copied);
-		i_size_changed = 1;
+		dirty_inode = true;
 	}
 
 	unlock_page(page);
@@ -2145,9 +2142,18 @@  int generic_write_end(struct file *file, struct address_space *mapping,
 	 * ordering of page lock and transaction start for journaling
 	 * filesystems.
 	 */
-	if (i_size_changed)
+	if (dirty_inode)
 		mark_inode_dirty(inode);
+}
 
+int generic_write_end(struct file *file, struct address_space *mapping,
+			loff_t pos, unsigned len, unsigned copied,
+			struct page *page, void *fsdata)
+{
+	copied = block_write_end(file, mapping, pos, len, copied,
+				 page, fsdata);
+	__generic_write_end(file, mapping, pos, len, copied,
+			    page, fsdata, false);
 	return copied;
 }
 EXPORT_SYMBOL(generic_write_end);
diff --git a/fs/iomap.c b/fs/iomap.c
index a0d3b7742060..857c4b7b54eb 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -108,6 +108,40 @@  iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 		truncate_pagecache_range(inode, max(pos, i_size), pos + len);
 }
 
+static void
+iomap_read_inline_data(struct page *page, struct iomap *iomap, loff_t size)
+{
+	void *data = iomap->inline_data;
+	void *addr;
+
+	if (PageUptodate(page))
+		return;
+
+	BUG_ON(page->index);
+	BUG_ON(size > PAGE_SIZE - offset_in_page(data));
+
+	addr = kmap_atomic(page);
+	memcpy(addr, data, size);
+	memset(addr + size, 0, PAGE_SIZE - size);
+	kunmap_atomic(addr);
+	SetPageUptodate(page);
+}
+
+static void
+iomap_write_inline_data(struct page *page, struct iomap *iomap, off_t pos,
+		unsigned copied)
+{
+	void *data = iomap->inline_data;
+	void *addr;
+
+	BUG_ON(pos + copied > PAGE_SIZE - offset_in_page(data));
+	WARN_ON_ONCE(!PageUptodate(page));
+
+	addr = kmap_atomic(page);
+	memcpy(data + pos, addr + pos, copied);
+	kunmap_atomic(addr);
+}
+
 static int
 iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		struct page **pagep, struct iomap *iomap)
@@ -125,6 +159,11 @@  iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 	if (!page)
 		return -ENOMEM;
 
+	if (iomap->type == IOMAP_INLINE) {
+		iomap_read_inline_data(page, iomap, inode->i_size);
+		goto out;
+	}
+
 	status = __block_write_begin_int(page, pos, len, NULL, iomap);
 	if (unlikely(status)) {
 		unlock_page(page);
@@ -134,16 +173,24 @@  iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		iomap_write_failed(inode, pos, len);
 	}
 
+out:
 	*pagep = page;
 	return status;
 }
 
 static int
 iomap_write_end(struct inode *inode, loff_t pos, unsigned len,
-		unsigned copied, struct page *page)
+		unsigned copied, struct page *page, struct iomap *iomap)
 {
 	int ret;
 
+	if (iomap->type == IOMAP_INLINE) {
+		iomap_write_inline_data(page, iomap, pos, copied);
+		__generic_write_end(NULL, inode->i_mapping, pos, len,
+				copied, page, NULL, true);
+		return copied;
+	}
+
 	ret = generic_write_end(NULL, inode->i_mapping, pos, len,
 			copied, page, NULL);
 	if (ret < len)
@@ -200,7 +247,8 @@  iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 		flush_dcache_page(page);
 
-		status = iomap_write_end(inode, pos, bytes, copied, page);
+		status = iomap_write_end(inode, pos, bytes, copied, page,
+				iomap);
 		if (unlikely(status < 0))
 			break;
 		copied = status;
@@ -294,7 +342,7 @@  iomap_dirty_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
 
 		WARN_ON_ONCE(!PageUptodate(page));
 
-		status = iomap_write_end(inode, pos, bytes, bytes, page);
+		status = iomap_write_end(inode, pos, bytes, bytes, page, iomap);
 		if (unlikely(status <= 0)) {
 			if (WARN_ON_ONCE(status == 0))
 				return -EIO;
@@ -346,7 +394,7 @@  static int iomap_zero(struct inode *inode, loff_t pos, unsigned offset,
 	zero_user(page, offset, bytes);
 	mark_page_accessed(page);
 
-	return iomap_write_end(inode, pos, bytes, bytes, page);
+	return iomap_write_end(inode, pos, bytes, bytes, page, iomap);
 }
 
 static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes,
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 894e5d125de6..c730ae8888a8 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -231,6 +231,9 @@  int __block_write_begin(struct page *page, loff_t pos, unsigned len,
 int block_write_end(struct file *, struct address_space *,
 				loff_t, unsigned, unsigned,
 				struct page *, void *);
+void __generic_write_end(struct file *, struct address_space *,
+			 loff_t, unsigned, unsigned,
+			 struct page *, void *, bool);
 int generic_write_end(struct file *, struct address_space *,
 				loff_t, unsigned, unsigned,
 				struct page *, void *);
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 918f14075702..c61113c71a60 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -47,6 +47,7 @@  struct iomap {
 	u64			length;	/* length of mapping, bytes */
 	u16			type;	/* type of mapping */
 	u16			flags;	/* flags for mapping */
+	void			*inline_data;  /* inline data buffer */
 	struct block_device	*bdev;	/* block device for I/O */
 	struct dax_device	*dax_dev; /* dax_dev for dax operations */
 };