diff mbox series

[2/2] block: remove the ioprio field from struct request

Message ID 20241112170050.1612998-3-hch@lst.de (mailing list archive)
State Not Applicable
Headers show
Series [1/2] block: remove the write_hint field from struct request | expand

Commit Message

Christoph Hellwig Nov. 12, 2024, 5 p.m. UTC
The request ioprio is only initialized from the first attached bio,
so requests without a bio already never set it.  Directly use the
bio field instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-merge.c            | 10 ++++------
 block/blk-mq.c               |  3 +--
 include/linux/blk-mq.h       |  7 +++----
 include/trace/events/block.h |  6 +++---
 4 files changed, 11 insertions(+), 15 deletions(-)

Comments

John Garry Nov. 12, 2024, 6:22 p.m. UTC | #1
On 12/11/2024 17:00, Christoph Hellwig wrote:
>   		if (rq->bio->bi_write_hint != bio->bi_write_hint)
>   			return false;
> +		if (rq->bio->bi_ioprio != bio->bi_ioprio)

At this point bio_prio() looks to only be used in md code. Is it worth 
deleting that wrapper? I doubt its value.

And at many points bio->bi_ioprio is written without using 
bio_set_prio(), so I doubt the use of that one as well..

I do realize that all this is outside the scope of this series.

> +			return false;
>   	}
>   
> -	if (rq->ioprio != bio_prio(bio))
> -		return false;
> -
Bart Van Assche Nov. 12, 2024, 6:32 p.m. UTC | #2
On 11/12/24 9:00 AM, Christoph Hellwig wrote:
> The request ioprio is only initialized from the first attached bio,
> so requests without a bio already never set it.  Directly use the
> bio field instead.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Christoph Hellwig Nov. 13, 2024, 5:06 a.m. UTC | #3
On Tue, Nov 12, 2024 at 06:22:24PM +0000, John Garry wrote:
> And at many points bio->bi_ioprio is written without using bio_set_prio(), 
> so I doubt the use of that one as well..
>
> I do realize that all this is outside the scope of this series.

Agreed to all of the above.  If anyone is in for an easy cleanup, go
for it!
Sam Protsenko Nov. 22, 2024, 5:04 a.m. UTC | #4
Hi Christoph,

This patch causes a regression on E850-96 board. Specifically, there are
two noticeable time lags when booting Debian rootfs:

  1. When systemd reports this stage:

         "Reached target getty.target - Login Prompts."

     it hangs for 5 seconds or so, before going further.

  2. When I'm logging in, the system hangs for 60 seconds or so before
     the shell prompt appears.

It only seems to reproduce the first time (during the boot). My attempt to
re-start the mentioned systemd target or run "login" command again worked
fine.

Reverting commit 6975c1a486a4 ("block: remove the ioprio field from
struct request") fixes the issue for me. Do you have any ideas by chance
what might be the reason? Or maybe you have any pointers on debugging it?

Thanks!
Christoph Hellwig Nov. 22, 2024, 12:04 p.m. UTC | #5
On Thu, Nov 21, 2024 at 11:04:19PM -0600, Sam Protsenko wrote:
> Hi Christoph,
> 
> This patch causes a regression on E850-96 board. Specifically, there are
> two noticeable time lags when booting Debian rootfs:

What storage driver does this board use?  Anything else interesting
about the setup?
Sam Protsenko Nov. 22, 2024, 9:55 p.m. UTC | #6
On Fri, Nov 22, 2024 at 6:04 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, Nov 21, 2024 at 11:04:19PM -0600, Sam Protsenko wrote:
> > Hi Christoph,
> >
> > This patch causes a regression on E850-96 board. Specifically, there are
> > two noticeable time lags when booting Debian rootfs:
>
> What storage driver does this board use?  Anything else interesting
> about the setup?
>

It's an Exynos based board with eMMC, so it uses DW MMC driver, with
Exynos glue layer on top of it, so:

    drivers/mmc/host/dw_mmc.c
    drivers/mmc/host/dw_mmc-exynos.c

I'm using the regular ARM64 defconfig. Nothing fancy about this setup
neither, the device tree with eMMC definition (mmc_0) is here:

    arch/arm64/boot/dts/exynos/exynos850-e850-96.dts

FWIW, I was able to narrow down the issue to dd_insert_request()
function. With this hack the freeze is gone:

8<-------------------------------------------------------------------->8
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index acdc28756d9d..83d272b66e71 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -676,7 +676,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx
*hctx, struct request *rq,
        struct request_queue *q = hctx->queue;
        struct deadline_data *dd = q->elevator->elevator_data;
        const enum dd_data_dir data_dir = rq_data_dir(rq);
-       u16 ioprio = req_get_ioprio(rq);
+       u16 ioprio = 0; /* the same as old req->ioprio */
        u8 ioprio_class = IOPRIO_PRIO_CLASS(ioprio);
        struct dd_per_prio *per_prio;
        enum dd_prio prio;
8<-------------------------------------------------------------------->8

Does it tell you anything about where the possible issue can be?

Thanks!
Bart Van Assche Nov. 22, 2024, 10:18 p.m. UTC | #7
On 11/22/24 1:55 PM, Sam Protsenko wrote:
> On Fri, Nov 22, 2024 at 6:04 AM Christoph Hellwig <hch@lst.de> wrote:
>>
>> On Thu, Nov 21, 2024 at 11:04:19PM -0600, Sam Protsenko wrote:
>>> Hi Christoph,
>>>
>>> This patch causes a regression on E850-96 board. Specifically, there are
>>> two noticeable time lags when booting Debian rootfs:
>>
>> What storage driver does this board use?  Anything else interesting
>> about the setup?
>>
> 
> It's an Exynos based board with eMMC, so it uses DW MMC driver, with
> Exynos glue layer on top of it, so:
> 
>      drivers/mmc/host/dw_mmc.c
>      drivers/mmc/host/dw_mmc-exynos.c
> 
> I'm using the regular ARM64 defconfig. Nothing fancy about this setup
> neither, the device tree with eMMC definition (mmc_0) is here:
> 
>      arch/arm64/boot/dts/exynos/exynos850-e850-96.dts
> 
> FWIW, I was able to narrow down the issue to dd_insert_request()
> function. With this hack the freeze is gone:
> 
> 8<-------------------------------------------------------------------->8
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index acdc28756d9d..83d272b66e71 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -676,7 +676,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx
> *hctx, struct request *rq,
>          struct request_queue *q = hctx->queue;
>          struct deadline_data *dd = q->elevator->elevator_data;
>          const enum dd_data_dir data_dir = rq_data_dir(rq);
> -       u16 ioprio = req_get_ioprio(rq);
> +       u16 ioprio = 0; /* the same as old req->ioprio */
>          u8 ioprio_class = IOPRIO_PRIO_CLASS(ioprio);
>          struct dd_per_prio *per_prio;
>          enum dd_prio prio;
> 8<-------------------------------------------------------------------->8
> 
> Does it tell you anything about where the possible issue can be?

It seems like eMMC devices do not tolerate I/O prioritization. How about
disabling I/O prioritization for eMMC setups? Is the ioprio cgroup
controller perhaps activated by the user space software that is running
on this setup?

Thanks,

Bart.
diff mbox series

Patch

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2306014c108d..df36f83f3738 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -871,11 +871,10 @@  static struct request *attempt_merge(struct request_queue *q,
 		/* Don't merge requests with different write hints. */
 		if (req->bio->bi_write_hint != next->bio->bi_write_hint)
 			return NULL;
+		if (req->bio->bi_ioprio != next->bio->bi_ioprio)
+			return NULL;
 	}
 
-	if (req->ioprio != next->ioprio)
-		return NULL;
-
 	if (!blk_atomic_write_mergeable_rqs(req, next))
 		return NULL;
 
@@ -1007,11 +1006,10 @@  bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 		/* Don't merge requests with different write hints. */
 		if (rq->bio->bi_write_hint != bio->bi_write_hint)
 			return false;
+		if (rq->bio->bi_ioprio != bio->bi_ioprio)
+			return false;
 	}
 
-	if (rq->ioprio != bio_prio(bio))
-		return false;
-
 	if (blk_atomic_write_mergeable_rq_bio(rq, bio) == false)
 		return false;
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 65e6b86d341c..3c6cadba75e3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -842,7 +842,7 @@  static void blk_print_req_error(struct request *req, blk_status_t status)
 		blk_op_str(req_op(req)),
 		(__force u32)(req->cmd_flags & ~REQ_OP_MASK),
 		req->nr_phys_segments,
-		IOPRIO_PRIO_CLASS(req->ioprio));
+		IOPRIO_PRIO_CLASS(req_get_ioprio(req)));
 }
 
 /*
@@ -3306,7 +3306,6 @@  int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 		rq->special_vec = rq_src->special_vec;
 	}
 	rq->nr_phys_segments = rq_src->nr_phys_segments;
-	rq->ioprio = rq_src->ioprio;
 
 	if (rq->bio && blk_crypto_rq_bio_prep(rq, rq->bio, gfp_mask) < 0)
 		goto free_and_out;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 2804fe181d9d..a28264442948 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -156,8 +156,6 @@  struct request {
 	struct blk_crypto_keyslot *crypt_keyslot;
 #endif
 
-	unsigned short ioprio;
-
 	enum mq_rq_state state;
 	atomic_t ref;
 
@@ -221,7 +219,9 @@  static inline bool blk_rq_is_passthrough(struct request *rq)
 
 static inline unsigned short req_get_ioprio(struct request *req)
 {
-	return req->ioprio;
+	if (req->bio)
+		return req->bio->bi_ioprio;
+	return 0;
 }
 
 #define rq_data_dir(rq)		(op_is_write(req_op(rq)) ? WRITE : READ)
@@ -984,7 +984,6 @@  static inline void blk_rq_bio_prep(struct request *rq, struct bio *bio,
 	rq->nr_phys_segments = nr_segs;
 	rq->__data_len = bio->bi_iter.bi_size;
 	rq->bio = rq->biotail = bio;
-	rq->ioprio = bio_prio(bio);
 }
 
 void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 1527d5d45e01..bd0ea07338eb 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -99,7 +99,7 @@  TRACE_EVENT(block_rq_requeue,
 		__entry->dev	   = rq->q->disk ? disk_devt(rq->q->disk) : 0;
 		__entry->sector    = blk_rq_trace_sector(rq);
 		__entry->nr_sector = blk_rq_trace_nr_sectors(rq);
-		__entry->ioprio    = rq->ioprio;
+		__entry->ioprio    = req_get_ioprio(rq);
 
 		blk_fill_rwbs(__entry->rwbs, rq->cmd_flags);
 		__get_str(cmd)[0] = '\0';
@@ -136,7 +136,7 @@  DECLARE_EVENT_CLASS(block_rq_completion,
 		__entry->sector    = blk_rq_pos(rq);
 		__entry->nr_sector = nr_bytes >> 9;
 		__entry->error     = blk_status_to_errno(error);
-		__entry->ioprio    = rq->ioprio;
+		__entry->ioprio    = req_get_ioprio(rq);
 
 		blk_fill_rwbs(__entry->rwbs, rq->cmd_flags);
 		__get_str(cmd)[0] = '\0';
@@ -209,7 +209,7 @@  DECLARE_EVENT_CLASS(block_rq,
 		__entry->sector    = blk_rq_trace_sector(rq);
 		__entry->nr_sector = blk_rq_trace_nr_sectors(rq);
 		__entry->bytes     = blk_rq_bytes(rq);
-		__entry->ioprio	   = rq->ioprio;
+		__entry->ioprio	   = req_get_ioprio(rq);
 
 		blk_fill_rwbs(__entry->rwbs, rq->cmd_flags);
 		__get_str(cmd)[0] = '\0';