
[RFC,1/3] filemap: convert page_endio to folio_endio

Message ID: 20230315123233.121593-2-p.raghav@samsung.com

Commit Message

Pankaj Raghav March 15, 2023, 12:32 p.m. UTC
page_endio() already works on folios by converting the page into a folio
as its first step. Convert page_endio() to folio_endio() by taking a
folio as the first parameter.

Instead of doing the page-to-folio conversion inside page_endio(), the
consumers of this API now do the conversion themselves and call
folio_endio() in this patch.
The following patches will convert the consumers of this API to use
native folio functions so they can pass folios directly to folio_endio().

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 drivers/block/zram/zram_drv.c | 2 +-
 fs/mpage.c                    | 2 +-
 fs/orangefs/inode.c           | 2 +-
 include/linux/pagemap.h       | 2 +-
 mm/filemap.c                  | 8 +++-----
 5 files changed, 7 insertions(+), 9 deletions(-)

Comments

Luis Chamberlain March 15, 2023, 2:46 p.m. UTC | #1
On Wed, Mar 15, 2023 at 01:32:31PM +0100, Pankaj Raghav wrote:
> page_endio() already works on folios by converting the page into a folio
> as its first step. Convert page_endio() to folio_endio() by taking a
> folio as the first parameter.
> 
> Instead of doing the page-to-folio conversion inside page_endio(), the
> consumers of this API now do the conversion themselves and call
> folio_endio() in this patch.
> The following patches will convert the consumers of this API to use
> native folio functions so they can pass folios directly to folio_endio().
> 
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>

Perhaps the hard thing is which tree this would go through?

  Luis
Hannes Reinecke March 15, 2023, 2:51 p.m. UTC | #2
On 3/15/23 13:32, Pankaj Raghav wrote:
> page_endio() already works on folios by converting the page into a folio
> as its first step. Convert page_endio() to folio_endio() by taking a
> folio as the first parameter.
> 
> Instead of doing the page-to-folio conversion inside page_endio(), the
> consumers of this API now do the conversion themselves and call
> folio_endio() in this patch.
> The following patches will convert the consumers of this API to use
> native folio functions so they can pass folios directly to folio_endio().
> 
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> ---
>   drivers/block/zram/zram_drv.c | 2 +-
>   fs/mpage.c                    | 2 +-
>   fs/orangefs/inode.c           | 2 +-
>   include/linux/pagemap.h       | 2 +-
>   mm/filemap.c                  | 8 +++-----
>   5 files changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index aa490da3cef2..f441251c9138 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -610,7 +610,7 @@ static void zram_page_end_io(struct bio *bio)
>   {
>   	struct page *page = bio_first_page_all(bio);
>   
> -	page_endio(page, op_is_write(bio_op(bio)),
> +	folio_endio(page_folio(page), op_is_write(bio_op(bio)),
>   			blk_status_to_errno(bio->bi_status));
>   	bio_put(bio);
>   }
> diff --git a/fs/mpage.c b/fs/mpage.c
> index 22b9de5ddd68..40e86e839e77 100644
> --- a/fs/mpage.c
> +++ b/fs/mpage.c
> @@ -50,7 +50,7 @@ static void mpage_end_io(struct bio *bio)
>   
>   	bio_for_each_segment_all(bv, bio, iter_all) {
>   		struct page *page = bv->bv_page;
> -		page_endio(page, bio_op(bio),
> +		folio_endio(page_folio(page), bio_op(bio),
>   			   blk_status_to_errno(bio->bi_status));
>   	}
>   
Can't this be converted to use 'bio_for_each_folio_all()' instead of
bio_for_each_segment_all()?

Cheers,

Hannes
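
A minimal sketch of the conversion Hannes suggests, assuming the
folio_endio() signature introduced in this patch and the existing
bio_for_each_folio_all() iterator (untested):

	static void mpage_end_io(struct bio *bio)
	{
		struct folio_iter fi;

		/* Walk each folio in the bio and propagate the completion status. */
		bio_for_each_folio_all(fi, bio)
			folio_endio(fi.folio, bio_op(bio),
				    blk_status_to_errno(bio->bi_status));

		bio_put(bio);
	}
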
Christoph Hellwig March 15, 2023, 2:56 p.m. UTC | #3
Can we take a step back and figure out if page_endio is a good
idea to start with?

The zram usage seems clearly wrong to me.  zram is a block driver
and does not own the pages, so it shouldn't touch any of the page
state.  It seems like this mostly operates on its own
pages allocated using alloc_page, so the harm might not be horrible
at least.

orangefs uses it on readahead pages, with ret known for the whole
iteration.  So one quick loop for the success case and one for the
failure case would look simpler and more obvious.

mpage really should use separate end_io handlers for read vs write
as well, like most other aops do.

So overall I'd be happier to just kill the helper.
Pankaj Raghav March 16, 2023, 10:04 a.m. UTC | #4
Hi Christoph,

On 2023-03-15 15:56, Christoph Hellwig wrote:
> Can we take a step back and figure out if page_endio is a good
> idea to start with?
> 
> The zram usage seems clearly wrong to me.  zram is a block driver
> and does not own the pages, so it shouldn't touch any of the page
> state.  It seems like this mostly operates on its own
> pages allocated using alloc_page, so the harm might not be horrible
> at least.
> 
It looks like this endio function is called when alloc_page() is used (for
partial IO) to trigger writeback from user space via `echo "idle" >
/sys/block/zram0/writeback`.

I don't understand what you mean when you say the harm might not be horrible
if we don't call folio_endio() here. Do you think it is safe to just remove
the folio_endio() call?

> orangefs uses it on readahead pages, with ret known for the whole
> iteration.  So one quick loop for the success case and one for the
> failure case would look simpler and more obvious.
> 
Got it. Something like this?

@@ -266,18 +266,23 @@ static void orangefs_readahead(struct readahead_control *rac)
        iov_iter_xarray(&iter, ITER_DEST, i_pages, offset,
                        readahead_length(rac));

        /* read in the pages. */
-       if ((ret = wait_for_direct_io(ORANGEFS_IO_READ, inode,
-                       &offset, &iter, readahead_length(rac),
-                       inode->i_size, NULL, NULL, rac->file)) < 0)
+       if ((ret = wait_for_direct_io(ORANGEFS_IO_READ, inode, &offset, &iter,
+                                     readahead_length(rac), inode->i_size,
+                                     NULL, NULL, rac->file)) < 0) {
                gossip_debug(GOSSIP_FILE_DEBUG,
-                       "%s: wait_for_direct_io failed. \n", __func__);
-       else
-               ret = 0;
+                            "%s: wait_for_direct_io failed. \n", __func__);

-       /* clean up. */
-       while ((page = readahead_page(rac))) {
-               page_endio(page, false, ret);
-               put_page(page);
+               while ((folio = readahead_folio(rac))) {
+                       folio_clear_uptodate(folio);
+                       folio_set_error(folio);
+                       folio_unlock(folio);
+               }
+               return;
+       }
+
+       while ((folio = readahead_folio(rac))) {
+               folio_mark_uptodate(folio);
+               folio_unlock(folio);
        }
 }

> mpage really should use separate end_io handlers for read vs write
> as well, like most other aops do.
> 

I don't know if this is the right abstraction to do the switch, but let me
know what you think:
@@ -59,7 +59,8 @@ static void mpage_end_io(struct bio *bio)
static struct bio *mpage_bio_submit(struct bio *bio)
 {
-       bio->bi_end_io = mpage_end_io;
+       bio->bi_end_io = (op_is_write(bio_op(bio))) ? mpage_write_end_io :
+                                                     mpage_read_end_io;
        guard_bio_eod(bio);
        submit_bio(bio);
        return NULL;
And mpage_{write,read}_end_io will iterate over the folios in the bio and call
the respective functions.

> So overall I'd be happier to just kill the helper.
Christoph Hellwig March 16, 2023, 3:24 p.m. UTC | #5
On Thu, Mar 16, 2023 at 11:04:54AM +0100, Pankaj Raghav wrote:
> It looks like this endio function is called when alloc_page() is used (for
> partial IO) to trigger writeback from user space via `echo "idle" >
> /sys/block/zram0/writeback`.


Yes.

> I don't understand what you mean when you say the harm might not be horrible
> if we don't call folio_endio() here. Do you think it is safe to just remove
> the folio_endio() call?

I suspect so.  It doesn't seem like the involved pages are ever locked
or have the writeback flag set, so it should be fine.
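
If the call is simply dropped as suggested, the zram completion handler
reduces to releasing the bio; a sketch, assuming nothing else relies on
the folio flags being updated here:

	static void zram_page_end_io(struct bio *bio)
	{
		/* No page/folio state to update; just drop the bio reference. */
		bio_put(bio);
	}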

> +               while ((folio = readahead_folio(rac))) {
> +                       folio_clear_uptodate(folio);
> +                       folio_set_error(folio);
> +                       folio_unlock(folio);
> +               }
> +               return;
> +       }
> +
> +       while ((folio = readahead_folio(rac))) {
> +               folio_mark_uptodate(folio);
> +               folio_unlock(folio);
>         }
>  }

Looks good.

> @@ -59,7 +59,8 @@ static void mpage_end_io(struct bio *bio)
> static struct bio *mpage_bio_submit(struct bio *bio)
>  {
> -       bio->bi_end_io = mpage_end_io;
> +       bio->bi_end_io = (op_is_write(bio_op(bio))) ? mpage_write_end_io :
> +                                                     mpage_read_end_io;
>         guard_bio_eod(bio);
>         submit_bio(bio);
>         return NULL;
> And mpage_{write,read}_end_io will iterate over the folios in the bio and call
> the respective functions.

Yes, although I'd do it with a good old if/else and with fewer braces.
It might make sense to split mpage_bio_submit as well, though.
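
A sketch of the if/else form described above; the mpage_read_end_io()
and mpage_write_end_io() names are taken from Pankaj's draft and are
assumptions at this point:

	static struct bio *mpage_bio_submit(struct bio *bio)
	{
		/* Pick the completion handler based on the data direction. */
		if (op_is_write(bio_op(bio)))
			bio->bi_end_io = mpage_write_end_io;
		else
			bio->bi_end_io = mpage_read_end_io;

		guard_bio_eod(bio);
		submit_bio(bio);
		return NULL;
	}

Splitting mpage_bio_submit() into read and write variants, as suggested,
would remove even this branch.
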
Matthew Wilcox (Oracle) March 17, 2023, 3:31 p.m. UTC | #6
On Thu, Mar 16, 2023 at 11:04:54AM +0100, Pankaj Raghav wrote:
> -       /* clean up. */
> -       while ((page = readahead_page(rac))) {
> -               page_endio(page, false, ret);
> -               put_page(page);
> +               while ((folio = readahead_folio(rac))) {
> +                       folio_clear_uptodate(folio);
> +                       folio_set_error(folio);
> +                       folio_unlock(folio);
> +               }
> +               return;
> +       }
> +
> +       while ((folio = readahead_folio(rac))) {
> +               folio_mark_uptodate(folio);
> +               folio_unlock(folio);
>         }

readahead_folio() is a bit too heavy-weight for that, IMO.  I'd do this
as:

	while ((folio = readahead_folio(rac))) {
		if (!ret)
			folio_mark_uptodate(folio);
		folio_unlock(folio);
	}

(there's no need to call folio_set_error(), nor folio_clear_uptodate())
Pankaj Raghav March 17, 2023, 4:14 p.m. UTC | #7
On 2023-03-17 16:31, Matthew Wilcox wrote:
>> +
>> +       while ((folio = readahead_folio(rac))) {
>> +               folio_mark_uptodate(folio);
>> +               folio_unlock(folio);
>>         }
> 
> readahead_folio() is a bit too heavy-weight for that, IMO.  I'd do this
> as:
> 
> 	while ((folio = readahead_folio(rac))) {
> 		if (!ret)
> 			folio_mark_uptodate(folio);
> 		folio_unlock(folio);
> 	}
> 

This looks good.

> (there's no need to call folio_set_error(), nor folio_clear_uptodate())

I am trying to understand why these calls are not needed for the error case.
I see a similar pattern, e.g. in iomap_finish_folio_read(), where we call these
functions in the error case.

If we don't need to call these anymore, can the mpage code also be shortened like this:

-static void mpage_end_io(struct bio *bio)
+static void mpage_read_end_io(struct bio *bio)
 {
-       struct bio_vec *bv;
-       struct bvec_iter_all iter_all;
+       struct folio_iter fi;
+       int err = blk_status_to_errno(bio->bi_status);

-       bio_for_each_segment_all(bv, bio, iter_all) {
-               struct page *page = bv->bv_page;
-               page_endio(page, bio_op(bio),
-                          blk_status_to_errno(bio->bi_status));
+       bio_for_each_folio_all(fi, bio) {
+               struct folio *folio = fi.folio;
+
+               if (!err)
+                       folio_mark_uptodate(folio);
+               folio_unlock(folio);
+       }
+
+       bio_put(bio);
+}
+
+static void mpage_write_end_io(struct bio *bio)
+{
....
+
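
The write-side handler is elided above; one possible shape, sketched as an
assumption using the existing folio_end_writeback() and mapping_set_error()
helpers:

	static void mpage_write_end_io(struct bio *bio)
	{
		struct folio_iter fi;
		int err = blk_status_to_errno(bio->bi_status);

		bio_for_each_folio_all(fi, bio) {
			/* Record any error on the mapping and finish writeback. */
			if (err)
				mapping_set_error(fi.folio->mapping, err);
			folio_end_writeback(fi.folio);
		}

		bio_put(bio);
	}
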
Pankaj Raghav March 21, 2023, 1:51 p.m. UTC | #8
Hi Willy,

On 2023-03-17 17:14, Pankaj Raghav wrote:
> On 2023-03-17 16:31, Matthew Wilcox wrote:
>>> +
>>> +       while ((folio = readahead_folio(rac))) {
>>> +               folio_mark_uptodate(folio);
>>> +               folio_unlock(folio);
>>>         }
>>
>> readahead_folio() is a bit too heavy-weight for that, IMO.  I'd do this
>> as;
>>
>> 	while ((folio = readahead_folio(rac))) {
>> 		if (!ret)
>> 			folio_mark_uptodate(folio);
>> 		folio_unlock(folio);
>> 	}
>>
> 
> This looks good.
> 
>> (there's no need to call folio_set_error(), nor folio_clear_uptodate())
> 
> I am trying to understand why these calls are not needed for the error case.
> I see a similar pattern, e.g. in iomap_finish_folio_read(), where we call these
> functions in the error case.
> 

I am planning to send the next version. It would be great if I could get the
rationale for not needing to call folio_set_error() or folio_clear_uptodate().

> If we don't need to call these anymore, can the mpage code also be shortened like this:
> 
> -static void mpage_end_io(struct bio *bio)
> +static void mpage_read_end_io(struct bio *bio)
>  {
> -       struct bio_vec *bv;
> -       struct bvec_iter_all iter_all;
> +       struct folio_iter fi;
> +       int err = blk_status_to_errno(bio->bi_status);
> 
> -       bio_for_each_segment_all(bv, bio, iter_all) {
> -               struct page *page = bv->bv_page;
> -               page_endio(page, bio_op(bio),
> -                          blk_status_to_errno(bio->bi_status));
> +       bio_for_each_folio_all(fi, bio) {
> +               struct folio *folio = fi.folio;
> +
> +               if (!err)
> +                       folio_mark_uptodate(folio);
> +               folio_unlock(folio);
> +       }
> +
> +       bio_put(bio);
> +}
> +
> +static void mpage_write_end_io(struct bio *bio)
> +{
> ....
> +

Patch

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index aa490da3cef2..f441251c9138 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -610,7 +610,7 @@  static void zram_page_end_io(struct bio *bio)
 {
 	struct page *page = bio_first_page_all(bio);
 
-	page_endio(page, op_is_write(bio_op(bio)),
+	folio_endio(page_folio(page), op_is_write(bio_op(bio)),
 			blk_status_to_errno(bio->bi_status));
 	bio_put(bio);
 }
diff --git a/fs/mpage.c b/fs/mpage.c
index 22b9de5ddd68..40e86e839e77 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -50,7 +50,7 @@  static void mpage_end_io(struct bio *bio)
 
 	bio_for_each_segment_all(bv, bio, iter_all) {
 		struct page *page = bv->bv_page;
-		page_endio(page, bio_op(bio),
+		folio_endio(page_folio(page), bio_op(bio),
 			   blk_status_to_errno(bio->bi_status));
 	}
 
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index aefdf1d3be7c..b12d099510ea 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -276,7 +276,7 @@  static void orangefs_readahead(struct readahead_control *rac)
 
 	/* clean up. */
 	while ((page = readahead_page(rac))) {
-		page_endio(page, false, ret);
+		folio_endio(page_folio(page), false, ret);
 		put_page(page);
 	}
 }
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index fdcd595d2294..80eab64b834f 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1076,7 +1076,7 @@  int filemap_migrate_folio(struct address_space *mapping, struct folio *dst,
 #else
 #define filemap_migrate_folio NULL
 #endif
-void page_endio(struct page *page, bool is_write, int err);
+void folio_endio(struct folio *folio, bool is_write, int err);
 
 void folio_end_private_2(struct folio *folio);
 void folio_wait_private_2(struct folio *folio);
diff --git a/mm/filemap.c b/mm/filemap.c
index a34abfe8c654..a89940f74974 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1626,13 +1626,11 @@  void folio_end_writeback(struct folio *folio)
 EXPORT_SYMBOL(folio_end_writeback);
 
 /*
- * After completing I/O on a page, call this routine to update the page
+ * After completing I/O on a folio, call this routine to update the folio
  * flags appropriately
  */
-void page_endio(struct page *page, bool is_write, int err)
+void folio_endio(struct folio *folio, bool is_write, int err)
 {
-	struct folio *folio = page_folio(page);
-
 	if (!is_write) {
 		if (!err) {
 			folio_mark_uptodate(folio);
@@ -1653,7 +1651,7 @@  void page_endio(struct page *page, bool is_write, int err)
 		folio_end_writeback(folio);
 	}
 }
-EXPORT_SYMBOL_GPL(page_endio);
+EXPORT_SYMBOL_GPL(folio_endio);
 
 /**
  * __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.