Message ID | 20200617135823.980-1-jack@suse.cz (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | blktrace: Provide event for request merging | expand |
Jan, On 6/17/20 7:03 AM, Jan Kara wrote: > Currently blk-mq does not report any event when two requests get merged > in the elevator. This then results in difficult to understand sequence > of events like: > > ... > 8,0 34 1579 0.608765271 2718 I WS 215023504 + 40 [dbench] > 8,0 34 1584 0.609184613 2719 A WS 215023544 + 56 <- (8,4) 2160568 > 8,0 34 1585 0.609184850 2719 Q WS 215023544 + 56 [dbench] > 8,0 34 1586 0.609188524 2719 G WS 215023544 + 56 [dbench] > 8,0 3 602 0.609684162 773 D WS 215023504 + 96 [kworker/3:1H] > 8,0 34 1591 0.609843593 0 C WS 215023504 + 96 [0] > > and you can only guess (after quite some headscratching since the above > excerpt is intermixed with a lot of other IO) that request 215023544+56 > got merged to request 215023504+40. Provide proper event for request > merging like we used to do in the legacy block layer. > > Signed-off-by: Jan Kara <jack@suse.cz> The attempt_merge function is also responsible for the discard merging which doesn't have any direction specified in ELEVATOR_XXX identifiers. In this patch we care unconditionally generating back merge event irrespective of the req_op. Do we need to either generate event iff ELEVATOR_BACK_MERGE true case or add another trace point for discard ? given that it may lead to confusion since elevator flags for the discard doesn't specify the direction.
On Wed, Jun 17, 2020 at 03:58:23PM +0200, Jan Kara wrote: > blk_account_io_merge_request(next); > > + trace_block_rq_merge(q, next); q can be derived from next, no need to explicitly pass it. And yes, I know a lot of existing tracepoints do so, but I plan to fix that up as well.
Hello, On Wed 17-06-20 17:41:36, Chaitanya Kulkarni wrote: > On 6/17/20 7:03 AM, Jan Kara wrote: > > Currently blk-mq does not report any event when two requests get merged > > in the elevator. This then results in difficult to understand sequence > > of events like: > > > > ... > > 8,0 34 1579 0.608765271 2718 I WS 215023504 + 40 [dbench] > > 8,0 34 1584 0.609184613 2719 A WS 215023544 + 56 <- (8,4) 2160568 > > 8,0 34 1585 0.609184850 2719 Q WS 215023544 + 56 [dbench] > > 8,0 34 1586 0.609188524 2719 G WS 215023544 + 56 [dbench] > > 8,0 3 602 0.609684162 773 D WS 215023504 + 96 [kworker/3:1H] > > 8,0 34 1591 0.609843593 0 C WS 215023504 + 96 [0] > > > > and you can only guess (after quite some headscratching since the above > > excerpt is intermixed with a lot of other IO) that request 215023544+56 > > got merged to request 215023504+40. Provide proper event for request > > merging like we used to do in the legacy block layer. > > > > Signed-off-by: Jan Kara <jack@suse.cz> > > The attempt_merge function is also responsible for the discard merging > which doesn't have any direction specified in ELEVATOR_XXX identifiers. > In this patch we care unconditionally generating back merge event > irrespective of the req_op. > > Do we need to either generate event iff ELEVATOR_BACK_MERGE true case or > add another trace point for discard ? given that it may lead to > confusion since elevator flags for the discard doesn't specify the > direction. attempt_merge() is always called so that 'req' is always the request with the lower sector, 'next' is the request with a higher sector, and 'next' is discarded and 'req' is updated. So attempt_merge() always performs only one direction of a merge and I don't think it makes any sence to distinguish back merges and forward merges in this case. So discards don't need any special treatment either AFAICT. Honza
On Wed 17-06-20 23:53:59, Christoph Hellwig wrote: > On Wed, Jun 17, 2020 at 03:58:23PM +0200, Jan Kara wrote: > > blk_account_io_merge_request(next); > > > > + trace_block_rq_merge(q, next); > > q can be derived from next, no need to explicitly pass it. And yes, > I know a lot of existing tracepoints do so, but I plan to fix that up > as well. OK, I'll update the patch. Honza
On 6/18/20 5:04 AM, Jan Kara wrote: >> The attempt_merge function is also responsible for the discard merging >> which doesn't have any direction specified in ELEVATOR_XXX identifiers. >> In this patch we care unconditionally generating back merge event >> irrespective of the req_op. >> >> Do we need to either generate event iff ELEVATOR_BACK_MERGE true case or >> add another trace point for discard ? given that it may lead to >> confusion since elevator flags for the discard doesn't specify the >> direction. > attempt_merge() is always called so that 'req' is always the request with > the lower sector, 'next' is the request with a higher sector, and 'next' is > discarded and 'req' is updated. So attempt_merge() always performs only one > direction of a merge and I don't think it makes any sence to distinguish > back merges and forward merges in this case. So discards don't need any > special treatment either AFAICT. > Please disregard my comment regarding adding a separate tracpoint for discard. I understand the code and that is what I'm saying it will only back-merge. The concern is just like we have ELEVATOR_BACK_MERGE used in attempt_merge, we should also have ELEVATOR_DISCARD_BACK_MERGE and use it the attempt_merge() then this tracepoint will map to the back-merge which your patch does with the use of BLK_TC_BACKMERGE nothing needed in your patch to change regarding tracepoint. > Honza > > -- Jan Kara <jack@suse.com> SUSE Labs, CR
On Wed 17-06-20 23:53:59, Christoph Hellwig wrote: > On Wed, Jun 17, 2020 at 03:58:23PM +0200, Jan Kara wrote: > > blk_account_io_merge_request(next); > > > > + trace_block_rq_merge(q, next); > > q can be derived from next, no need to explicitly pass it. And yes, > I know a lot of existing tracepoints do so, but I plan to fix that up > as well. I had a look into it now and I could do this but that would mean that block_rq_merge trace event would now differ from all other similar events and we couldn't use block_rq event class and have to define our own and so overall it would IMO make future conversion to get rid of 'q' argument more difficult, not simpler. So I can do this but are you really sure? Honza
On 6/18/20 9:13 AM, Jan Kara wrote: > On Wed 17-06-20 23:53:59, Christoph Hellwig wrote: >> On Wed, Jun 17, 2020 at 03:58:23PM +0200, Jan Kara wrote: >>> blk_account_io_merge_request(next); >>> >>> + trace_block_rq_merge(q, next); >> q can be derived from next, no need to explicitly pass it. And yes, >> I know a lot of existing tracepoints do so, but I plan to fix that up >> as well. > I had a look into it now and I could do this but that would mean that > block_rq_merge trace event would now differ from all other similar events > and we couldn't use block_rq event class and have to define our own and so > overall it would IMO make future conversion to get rid of 'q' argument more > difficult, not simpler. So I can do this but are you really sure? > > Honza > -- Jan Kara <jack@suse.com> SUSE Labs, CR If you guys are okay then I can look into this and send out a patch and we can keep this patch separate.
On Thu, Jun 18, 2020 at 06:13:31PM +0200, Jan Kara wrote: > On Wed 17-06-20 23:53:59, Christoph Hellwig wrote: > > On Wed, Jun 17, 2020 at 03:58:23PM +0200, Jan Kara wrote: > > > blk_account_io_merge_request(next); > > > > > > + trace_block_rq_merge(q, next); > > > > q can be derived from next, no need to explicitly pass it. And yes, > > I know a lot of existing tracepoints do so, but I plan to fix that up > > as well. > > I had a look into it now and I could do this but that would mean that > block_rq_merge trace event would now differ from all other similar events > and we couldn't use block_rq event class and have to define our own and so > overall it would IMO make future conversion to get rid of 'q' argument more > difficult, not simpler. So I can do this but are you really sure? Ok, let's keep the original version for now then.
On Wed 17-06-20 15:58:23, Jan Kara wrote: > Currently blk-mq does not report any event when two requests get merged > in the elevator. This then results in difficult to understand sequence > of events like: > > ... > 8,0 34 1579 0.608765271 2718 I WS 215023504 + 40 [dbench] > 8,0 34 1584 0.609184613 2719 A WS 215023544 + 56 <- (8,4) 2160568 > 8,0 34 1585 0.609184850 2719 Q WS 215023544 + 56 [dbench] > 8,0 34 1586 0.609188524 2719 G WS 215023544 + 56 [dbench] > 8,0 3 602 0.609684162 773 D WS 215023504 + 96 [kworker/3:1H] > 8,0 34 1591 0.609843593 0 C WS 215023504 + 96 [0] > > and you can only guess (after quite some headscratching since the above > excerpt is intermixed with a lot of other IO) that request 215023544+56 > got merged to request 215023504+40. Provide proper event for request > merging like we used to do in the legacy block layer. > > Signed-off-by: Jan Kara <jack@suse.cz> Jens, it seems people are fine with this patch in the end. Can you please merge it? Thanks! Honza > --- > block/blk-merge.c | 2 ++ > include/trace/events/block.h | 15 +++++++++++++++ > kernel/trace/blktrace.c | 10 ++++++++++ > 3 files changed, 27 insertions(+) > > diff --git a/block/blk-merge.c b/block/blk-merge.c > index f0b0bae075a0..9c9fb21584b6 100644 > --- a/block/blk-merge.c > +++ b/block/blk-merge.c > @@ -793,6 +793,8 @@ static struct request *attempt_merge(struct request_queue *q, > */ > blk_account_io_merge_request(next); > > + trace_block_rq_merge(q, next); > + > /* > * ownership of bio passed from next to req, return 'next' for > * the caller to free > diff --git a/include/trace/events/block.h b/include/trace/events/block.h > index 81b43f5bdf23..b81205560782 100644 > --- a/include/trace/events/block.h > +++ b/include/trace/events/block.h > @@ -211,6 +211,21 @@ DEFINE_EVENT(block_rq, block_rq_issue, > TP_ARGS(q, rq) > ); > > +/** > + * block_rq_merge - merge request with another one in the elevator > + * @q: queue holding operation > + * @rq: block IO operation operation request > + * > + * Called when block operation request @rq from queue @q is merged to another > + * request queued in the elevator. > + */ > +DEFINE_EVENT(block_rq, block_rq_merge, > + > + TP_PROTO(struct request_queue *q, struct request *rq), > + > + TP_ARGS(q, rq) > +); > + > /** > * block_bio_bounce - used bounce buffer when processing block operation > * @q: queue holding the block operation > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c > index ea47f2084087..41521216d3eb 100644 > --- a/kernel/trace/blktrace.c > +++ b/kernel/trace/blktrace.c > @@ -834,6 +834,13 @@ static void blk_add_trace_rq_issue(void *ignore, > blk_trace_request_get_cgid(q, rq)); > } > > +static void blk_add_trace_rq_merge(void *ignore, > + struct request_queue *q, struct request *rq) > +{ > + blk_add_trace_rq(rq, 0, blk_rq_bytes(rq), BLK_TA_BACKMERGE, > + blk_trace_request_get_cgid(q, rq)); > +} > + > static void blk_add_trace_rq_requeue(void *ignore, > struct request_queue *q, > struct request *rq) > @@ -1115,6 +1122,8 @@ static void blk_register_tracepoints(void) > WARN_ON(ret); > ret = register_trace_block_rq_issue(blk_add_trace_rq_issue, NULL); > WARN_ON(ret); > + ret = register_trace_block_rq_merge(blk_add_trace_rq_merge, NULL); > + WARN_ON(ret); > ret = register_trace_block_rq_requeue(blk_add_trace_rq_requeue, NULL); > WARN_ON(ret); > ret = register_trace_block_rq_complete(blk_add_trace_rq_complete, NULL); > @@ -1161,6 +1170,7 @@ static void blk_unregister_tracepoints(void) > unregister_trace_block_bio_bounce(blk_add_trace_bio_bounce, NULL); > unregister_trace_block_rq_complete(blk_add_trace_rq_complete, NULL); > unregister_trace_block_rq_requeue(blk_add_trace_rq_requeue, NULL); > + unregister_trace_block_rq_merge(blk_add_trace_rq_merge, NULL); > unregister_trace_block_rq_issue(blk_add_trace_rq_issue, NULL); > unregister_trace_block_rq_insert(blk_add_trace_rq_insert, NULL); > > -- > 2.16.4 >
Hi Jan and Jens On 6/25/20 12:51 PM, Jan Kara wrote: > On Wed 17-06-20 15:58:23, Jan Kara wrote: >> Currently blk-mq does not report any event when two requests get merged >> in the elevator. This then results in difficult to understand sequence >> of events like: >> >> ... >> 8,0 34 1579 0.608765271 2718 I WS 215023504 + 40 [dbench] >> 8,0 34 1584 0.609184613 2719 A WS 215023544 + 56 <- (8,4) 2160568 >> 8,0 34 1585 0.609184850 2719 Q WS 215023544 + 56 [dbench] >> 8,0 34 1586 0.609188524 2719 G WS 215023544 + 56 [dbench] >> 8,0 3 602 0.609684162 773 D WS 215023504 + 96 [kworker/3:1H] >> 8,0 34 1591 0.609843593 0 C WS 215023504 + 96 [0] >> >> and you can only guess (after quite some headscratching since the above >> excerpt is intermixed with a lot of other IO) that request 215023544+56 >> got merged to request 215023504+40. Provide proper event for request >> merging like we used to do in the legacy block layer. >> >> Signed-off-by: Jan Kara<jack@suse.cz> > Jens, it seems people are fine with this patch in the end. Can you please > merge it? Thanks! > I'm sending blktrace cleanup based on the discussion [1], if you guys want I can also add this patch and keep Jan Author and have cleanup done with this patch included separtely. Please let me know that is not accepted I'll send my series separately. [1] https://marc.info/?l=linux-block&m=159296928424805&w=2 >
On 6/25/20 1:51 PM, Jan Kara wrote: > On Wed 17-06-20 15:58:23, Jan Kara wrote: >> Currently blk-mq does not report any event when two requests get merged >> in the elevator. This then results in difficult to understand sequence >> of events like: >> >> ... >> 8,0 34 1579 0.608765271 2718 I WS 215023504 + 40 [dbench] >> 8,0 34 1584 0.609184613 2719 A WS 215023544 + 56 <- (8,4) 2160568 >> 8,0 34 1585 0.609184850 2719 Q WS 215023544 + 56 [dbench] >> 8,0 34 1586 0.609188524 2719 G WS 215023544 + 56 [dbench] >> 8,0 3 602 0.609684162 773 D WS 215023504 + 96 [kworker/3:1H] >> 8,0 34 1591 0.609843593 0 C WS 215023504 + 96 [0] >> >> and you can only guess (after quite some headscratching since the above >> excerpt is intermixed with a lot of other IO) that request 215023544+56 >> got merged to request 215023504+40. Provide proper event for request >> merging like we used to do in the legacy block layer. >> >> Signed-off-by: Jan Kara <jack@suse.cz> > > Jens, it seems people are fine with this patch in the end. Can you please > merge it? Thanks! Applied for 5.9, thanks.
diff --git a/block/blk-merge.c b/block/blk-merge.c index f0b0bae075a0..9c9fb21584b6 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -793,6 +793,8 @@ static struct request *attempt_merge(struct request_queue *q, */ blk_account_io_merge_request(next); + trace_block_rq_merge(q, next); + /* * ownership of bio passed from next to req, return 'next' for * the caller to free diff --git a/include/trace/events/block.h b/include/trace/events/block.h index 81b43f5bdf23..b81205560782 100644 --- a/include/trace/events/block.h +++ b/include/trace/events/block.h @@ -211,6 +211,21 @@ DEFINE_EVENT(block_rq, block_rq_issue, TP_ARGS(q, rq) ); +/** + * block_rq_merge - merge request with another one in the elevator + * @q: queue holding operation + * @rq: block IO operation operation request + * + * Called when block operation request @rq from queue @q is merged to another + * request queued in the elevator. + */ +DEFINE_EVENT(block_rq, block_rq_merge, + + TP_PROTO(struct request_queue *q, struct request *rq), + + TP_ARGS(q, rq) +); + /** * block_bio_bounce - used bounce buffer when processing block operation * @q: queue holding the block operation diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index ea47f2084087..41521216d3eb 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -834,6 +834,13 @@ static void blk_add_trace_rq_issue(void *ignore, blk_trace_request_get_cgid(q, rq)); } +static void blk_add_trace_rq_merge(void *ignore, + struct request_queue *q, struct request *rq) +{ + blk_add_trace_rq(rq, 0, blk_rq_bytes(rq), BLK_TA_BACKMERGE, + blk_trace_request_get_cgid(q, rq)); +} + static void blk_add_trace_rq_requeue(void *ignore, struct request_queue *q, struct request *rq) @@ -1115,6 +1122,8 @@ static void blk_register_tracepoints(void) WARN_ON(ret); ret = register_trace_block_rq_issue(blk_add_trace_rq_issue, NULL); WARN_ON(ret); + ret = register_trace_block_rq_merge(blk_add_trace_rq_merge, NULL); + WARN_ON(ret); ret = register_trace_block_rq_requeue(blk_add_trace_rq_requeue, NULL); WARN_ON(ret); ret = register_trace_block_rq_complete(blk_add_trace_rq_complete, NULL); @@ -1161,6 +1170,7 @@ static void blk_unregister_tracepoints(void) unregister_trace_block_bio_bounce(blk_add_trace_bio_bounce, NULL); unregister_trace_block_rq_complete(blk_add_trace_rq_complete, NULL); unregister_trace_block_rq_requeue(blk_add_trace_rq_requeue, NULL); + unregister_trace_block_rq_merge(blk_add_trace_rq_merge, NULL); unregister_trace_block_rq_issue(blk_add_trace_rq_issue, NULL); unregister_trace_block_rq_insert(blk_add_trace_rq_insert, NULL);
Currently blk-mq does not report any event when two requests get merged in the elevator. This then results in difficult to understand sequence of events like: ... 8,0 34 1579 0.608765271 2718 I WS 215023504 + 40 [dbench] 8,0 34 1584 0.609184613 2719 A WS 215023544 + 56 <- (8,4) 2160568 8,0 34 1585 0.609184850 2719 Q WS 215023544 + 56 [dbench] 8,0 34 1586 0.609188524 2719 G WS 215023544 + 56 [dbench] 8,0 3 602 0.609684162 773 D WS 215023504 + 96 [kworker/3:1H] 8,0 34 1591 0.609843593 0 C WS 215023504 + 96 [0] and you can only guess (after quite some headscratching since the above excerpt is intermixed with a lot of other IO) that request 215023544+56 got merged to request 215023504+40. Provide proper event for request merging like we used to do in the legacy block layer. Signed-off-by: Jan Kara <jack@suse.cz> --- block/blk-merge.c | 2 ++ include/trace/events/block.h | 15 +++++++++++++++ kernel/trace/blktrace.c | 10 ++++++++++ 3 files changed, 27 insertions(+)