Message ID | 20200408201450.3959560-3-tj@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/5] blk-iocost: switch to fixed non-auto-decaying use_delay | expand |
Hi Tejun, On Wed, Apr 08, 2020 at 04:14:47PM -0400, Tejun Heo wrote: > Currently, at the time of completeion, there's no way of knowing how big a > request was. blk-iocost will need this information to account for IO size when > calculating expected latencies. > > This patch adds rq->io_data_len which remembers blk_rq_bytes() at the time the > request gets issued. The field is enabled iff CONFIG_BLK_IO_DATA_LEN is set and > doesn't increase the size of the struct even when enabled. Almost all __blk_mq_end_request() follow blk_update_request(), so the completed bytes can be passed to __blk_mq_end_request(), then we can avoid to introduce this field. Also there is just 20 callers of __blk_mq_end_request(), looks this kind of change shouldn't be too big. Thanks, Ming
On Thu, Apr 09, 2020 at 09:44:06AM +0800, Ming Lei wrote: > Almost all __blk_mq_end_request() follow blk_update_request(), so the > completed bytes can be passed to __blk_mq_end_request(), then we can > avoid to introduce this field. But on some drivers blk_update_request() may be called multiple times before __blk_mq_end_request() is called and what's needed here is the total number of bytes in the whole request, not just in the final completion. > Also there is just 20 callers of __blk_mq_end_request(), looks this kind > of change shouldn't be too big. This would work iff we get rid of partial completions and if we get rid of partial completions, we might as well stop exposing blk_update_request() and __blk_mq_end_request(). Thanks.
On Wed, Apr 08, 2020 at 10:11:19PM -0400, Tejun Heo wrote: > On Thu, Apr 09, 2020 at 09:44:06AM +0800, Ming Lei wrote: > > Almost all __blk_mq_end_request() follow blk_update_request(), so the > > completed bytes can be passed to __blk_mq_end_request(), then we can > > avoid to introduce this field. > > But on some drivers blk_update_request() may be called multiple times before > __blk_mq_end_request() is called and what's needed here is the total number of > bytes in the whole request, not just in the final completion. OK. Another choice might be to record request bytes in rq's payload when calling .queue_rq() only for these drivers. > > > Also there is just 20 callers of __blk_mq_end_request(), looks this kind > > of change shouldn't be too big. > > This would work iff we get rid of partial completions and if we get rid of > partial completions, we might as well stop exposing blk_update_request() and > __blk_mq_end_request(). Indeed, we can store the completed bytes in request payload, so looks killing partial completion shouldn't be too hard. Thanks, Ming
On 2020-04-08 13:14, Tejun Heo wrote: > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index 32868fbedc9e..bfd34c6a27ef 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -142,6 +142,14 @@ struct request { > > /* the following two fields are internal, NEVER access directly */ > unsigned int __data_len; /* total data len */ > +#ifdef CONFIG_BLK_RQ_IO_DATA_LEN > + /* > + * Total data len at the time of issue. This doesn't get deducted by > + * blk_update_request() and can be used by completion path to determine > + * the request size. > + */ > + unsigned int io_data_len; > +#endif > sector_t __sector; /* sector cursor */ > > struct bio *bio; So we have one struct member with the description "total data len" and another struct member with the description "total data len at the time of issue"? How could one not get confused by these descriptions? This change makes the comment above __data_len incorrect. Please update that comment or move io_data_len in front of that comment. How does this change interact with the code in drivers/scsi/sd.c that manipulates __data_len directly? Thanks, Bart.
On 09/04/2020 05:38, Ming Lei wrote: > On Wed, Apr 08, 2020 at 10:11:19PM -0400, Tejun Heo wrote: >> On Thu, Apr 09, 2020 at 09:44:06AM +0800, Ming Lei wrote: >>> Almost all __blk_mq_end_request() follow blk_update_request(), so the >>> completed bytes can be passed to __blk_mq_end_request(), then we can >>> avoid to introduce this field. >> >> But on some drivers blk_update_request() may be called multiple times before >> __blk_mq_end_request() is called and what's needed here is the total number of >> bytes in the whole request, not just in the final completion. > > OK. > > Another choice might be to record request bytes in rq's payload > when calling .queue_rq() only for these drivers. > >> >>> Also there is just 20 callers of __blk_mq_end_request(), looks this kind >>> of change shouldn't be too big. >> >> This would work iff we get rid of partial completions and if we get rid of >> partial completions, we might as well stop exposing blk_update_request() and >> __blk_mq_end_request(). > > Indeed, we can store the completed bytes in request payload, so looks killing > partial completion shouldn't be too hard. struct request already has such field (see @stats_sectors) because of the same root-cause. I'd prefer killing it as well by following Ming's way, but otherwise it could be easily adopted.
Hello, On Wed, Apr 08, 2020 at 08:44:17PM -0700, Bart Van Assche wrote: > On 2020-04-08 13:14, Tejun Heo wrote: > > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > > index 32868fbedc9e..bfd34c6a27ef 100644 > > --- a/include/linux/blkdev.h > > +++ b/include/linux/blkdev.h > > @@ -142,6 +142,14 @@ struct request { > > > > /* the following two fields are internal, NEVER access directly */ > > unsigned int __data_len; /* total data len */ > > +#ifdef CONFIG_BLK_RQ_IO_DATA_LEN > > + /* > > + * Total data len at the time of issue. This doesn't get deducted by > > + * blk_update_request() and can be used by completion path to determine > > + * the request size. > > + */ > > + unsigned int io_data_len; > > +#endif > > sector_t __sector; /* sector cursor */ > > > > struct bio *bio; > > So we have one struct member with the description "total data len" and > another struct member with the description "total data len at the time > of issue"? How could one not get confused by these descriptions? The new one explicitly says it doesn't get deducted by update_request. > This change makes the comment above __data_len incorrect. Please update > that comment or move io_data_len in front of that comment. Sure. > How does this change interact with the code in drivers/scsi/sd.c that > manipulates __data_len directly? It doesn't. Thanks.
Hello, On Thu, Apr 09, 2020 at 10:38:57AM +0800, Ming Lei wrote: > On Wed, Apr 08, 2020 at 10:11:19PM -0400, Tejun Heo wrote: > > On Thu, Apr 09, 2020 at 09:44:06AM +0800, Ming Lei wrote: > > > Almost all __blk_mq_end_request() follow blk_update_request(), so the > > > completed bytes can be passed to __blk_mq_end_request(), then we can > > > avoid to introduce this field. > > > > But on some drivers blk_update_request() may be called multiple times before > > __blk_mq_end_request() is called and what's needed here is the total number of > > bytes in the whole request, not just in the final completion. > > OK. > > Another choice might be to record request bytes in rq's payload > when calling .queue_rq() only for these drivers. There sure are multiple ways to skin a cat. > > > Also there is just 20 callers of __blk_mq_end_request(), looks this kind > > > of change shouldn't be too big. > > > > This would work iff we get rid of partial completions and if we get rid of > > partial completions, we might as well stop exposing blk_update_request() and > > __blk_mq_end_request(). > > Indeed, we can store the completed bytes in request payload, so looks killing > partial completion shouldn't be too hard. There's a reason why we've had partial completions. On slower IO devices, like floppy, partial completions actually are advantageous. I'm not arguing this still holds up as a valid justification but getting rid of partial completions isn't just a decision about plumbing details either. Thanks.
Hello, On Thu, Apr 09, 2020 at 08:08:59AM +0300, Pavel Begunkov wrote: > struct request already has such field (see @stats_sectors) because of the same > root-cause. I'd prefer killing it as well by following Ming's way, but otherwise > it could be easily adopted. Oh, I completely missed that field. Lemme just use that one instead. Please disregard this patch. As for killing partial completions, while I'm not necessarily against it, it isn't a no brainer either. Thanks.
diff --git a/block/Kconfig b/block/Kconfig index 3bc76bb113a0..48308e600dc8 100644 --- a/block/Kconfig +++ b/block/Kconfig @@ -26,6 +26,9 @@ menuconfig BLOCK if BLOCK +config BLK_RQ_IO_DATA_LEN + bool + config BLK_RQ_ALLOC_TIME bool diff --git a/block/blk-mq.c b/block/blk-mq.c index f6291ceedee4..64ed22712fe4 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -415,6 +415,9 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op, return ERR_PTR(-EWOULDBLOCK); rq->__data_len = 0; +#ifdef CONFIG_BLK_RQ_IO_DATA_LEN + rq->io_data_len = 0; +#endif rq->__sector = (sector_t) -1; rq->bio = rq->biotail = NULL; return rq; @@ -655,6 +658,9 @@ void blk_mq_start_request(struct request *rq) trace_block_rq_issue(q, rq); +#ifdef CONFIG_BLK_RQ_IO_DATA_LEN + rq->io_data_len = blk_rq_bytes(rq); +#endif if (test_bit(QUEUE_FLAG_STATS, &q->queue_flags)) { rq->io_start_time_ns = ktime_get_ns(); rq->stats_sectors = blk_rq_sectors(rq); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 32868fbedc9e..bfd34c6a27ef 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -142,6 +142,14 @@ struct request { /* the following two fields are internal, NEVER access directly */ unsigned int __data_len; /* total data len */ +#ifdef CONFIG_BLK_RQ_IO_DATA_LEN + /* + * Total data len at the time of issue. This doesn't get deducted by + * blk_update_request() and can be used by completion path to determine + * the request size. + */ + unsigned int io_data_len; +#endif sector_t __sector; /* sector cursor */ struct bio *bio;
Currently, at the time of completeion, there's no way of knowing how big a request was. blk-iocost will need this information to account for IO size when calculating expected latencies. This patch adds rq->io_data_len which remembers blk_rq_bytes() at the time the request gets issued. The field is enabled iff CONFIG_BLK_IO_DATA_LEN is set and doesn't increase the size of the struct even when enabled. Signed-off-by: Tejun Heo <tj@kernel.org> --- block/Kconfig | 3 +++ block/blk-mq.c | 6 ++++++ include/linux/blkdev.h | 8 ++++++++ 3 files changed, 17 insertions(+)