diff mbox series

[V2,8/8] blk-mq: remove blk_mq_tagset_busy_iter

Message ID 1553492318-1810-9-git-send-email-jianchao.w.wang@oracle.com (mailing list archive)
State New, archived
Headers show
Series : blk-mq: use static_rqs to iterate busy tags | expand

Commit Message

jianchao.wang March 25, 2019, 5:38 a.m. UTC
As nobody uses blk_mq_tagset_busy_iter, remove it.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 block/blk-mq-tag.c     | 95 --------------------------------------------------
 include/linux/blk-mq.h |  2 --
 2 files changed, 97 deletions(-)

Comments

Hannes Reinecke March 25, 2019, 7:18 a.m. UTC | #1
On 3/25/19 6:38 AM, Jianchao Wang wrote:
> As nobody uses blk_mq_tagset_busy_iter, remove it.
> 
> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> ---
>   block/blk-mq-tag.c     | 95 --------------------------------------------------
>   include/linux/blk-mq.h |  2 --
>   2 files changed, 97 deletions(-)
> 
Please, don't.

I'm currently implementing reserved commands for SCSI and reworking the 
SCSI error handling where I rely on this interface quite heavily.

Cheers,

Hannes
jianchao.wang March 25, 2019, 7:37 a.m. UTC | #2
Hi Hannes

On 3/25/19 3:18 PM, Hannes Reinecke wrote:
> On 3/25/19 6:38 AM, Jianchao Wang wrote:
>> As nobody uses blk_mq_tagset_busy_iter, remove it.
>>
>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>> ---
>>   block/blk-mq-tag.c     | 95 --------------------------------------------------
>>   include/linux/blk-mq.h |  2 --
>>   2 files changed, 97 deletions(-)
>>
> Please, don't.
> 
> I'm currently implementing reserved commands for SCSI and reworking the SCSI error handling where I rely on this interface quite heavily.


blk_mq_tagset_busy_iter could access some stale requests which maybe freed due to io scheduler switching, request_queue cleanup (shared tagset)
when there is someone submits io and gets driver tag. When io scheduler attached, even quiesce request_queue won't work.

If this patchset is accepted, blk_mq_tagset_busy_iter could be replaced with blk_mq_queue_inflight_tag_iter which needs
to be invoked by every request_queue that shares the tagset.

Please go ahead your work.

Thanks
Jianchao

> 
> Cheers,
> 
> Hannes
Hannes Reinecke March 25, 2019, 8:25 a.m. UTC | #3
On 3/25/19 8:37 AM, jianchao.wang wrote:
> Hi Hannes
> 
> On 3/25/19 3:18 PM, Hannes Reinecke wrote:
>> On 3/25/19 6:38 AM, Jianchao Wang wrote:
>>> As nobody uses blk_mq_tagset_busy_iter, remove it.
>>>
>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>> ---
>>>    block/blk-mq-tag.c     | 95 --------------------------------------------------
>>>    include/linux/blk-mq.h |  2 --
>>>    2 files changed, 97 deletions(-)
>>>
>> Please, don't.
>>
>> I'm currently implementing reserved commands for SCSI and reworking the SCSI error handling where I rely on
>> this interface quite heavily.
> 
> 
> blk_mq_tagset_busy_iter could access some stale requests which maybe freed due to io scheduler switching,
> request_queue cleanup (shared tagset)
> when there is someone submits io and gets driver tag. When io scheduler attached, even quiesce
> request_queue won't work.
> 
> If this patchset is accepted, blk_mq_tagset_busy_iter could be replaced with blk_mq_queue_inflight_tag_iter
> which needs to be invoked by every request_queue that shares the tagset.
> 
The point is, at that time I do _not_ have a request queue to work with.

Most SCSI drivers have a host-wide shared tagset, which is used by all 
request queues on that host.
Iterating over the shared tagset is far more efficient than to traverse 
over all devices and the attached request queues.

If I had to traverse all request queues I would need to add additional 
locking to ensure this traversal is race-free, making it a really 
cumbersome interface to use.

Plus the tagset iter is understood to be used only in cases where I/O is 
stopped from the upper layers (ie no new I/O will be submitted).
So here we only need to protect against I/O being completed, which is 
not what this patchset is about.

So my objection still stands: Please, don't.

Cheers,

Hannes
jianchao.wang March 25, 2019, 9:12 a.m. UTC | #4
On 3/25/19 4:25 PM, Hannes Reinecke wrote:
> On 3/25/19 8:37 AM, jianchao.wang wrote:
>> Hi Hannes
>>
>> On 3/25/19 3:18 PM, Hannes Reinecke wrote:
>>> On 3/25/19 6:38 AM, Jianchao Wang wrote:
>>>> As nobody uses blk_mq_tagset_busy_iter, remove it.
>>>>
>>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>>> ---
>>>>    block/blk-mq-tag.c     | 95 --------------------------------------------------
>>>>    include/linux/blk-mq.h |  2 --
>>>>    2 files changed, 97 deletions(-)
>>>>
>>> Please, don't.
>>>
>>> I'm currently implementing reserved commands for SCSI and reworking the SCSI error handling where I rely on
>>> this interface quite heavily.
>>
>>
>> blk_mq_tagset_busy_iter could access some stale requests which maybe freed due to io scheduler switching,
>> request_queue cleanup (shared tagset)
>> when there is someone submits io and gets driver tag. When io scheduler attached, even quiesce
                                                         ^^^^
                                                      s/When/Without
>> request_queue won't work.
>>
>> If this patchset is accepted, blk_mq_tagset_busy_iter could be replaced with blk_mq_queue_inflight_tag_iter
>> which needs to be invoked by every request_queue that shares the tagset.
>>
> The point is, at that time I do _not_ have a request queue to work with.
> 
> Most SCSI drivers have a host-wide shared tagset, which is used by all request queues on that host.
> Iterating over the shared tagset is far more efficient than to traverse over all devices and the attached request queues.
> 
> If I had to traverse all request queues I would need to add additional locking to ensure this traversal is race-free, making it a really cumbersome interface to use.

Yes, the new interface int this patchset is indeed not convenient to use with shared tagset case.
Perhaps we could introduce a interface to iterate all of the request_queue that shares a same tagset, such as,

mutex_lock(&set->tag_list_lock)
list_for_each_entry(q, &set->tag_list, tag_set_list)
...
mutex_unlock(&set->tag_list_lock)

> 
> Plus the tagset iter is understood to be used only in cases where I/O is stopped from the upper layers (ie no new I/O will be submitted).

The window is during get driver tag and store the rq into tags->rqs[].
w/ io-scheduler attached, we need to quiesce the request_queue to stop the driver tag allocation attempts.
w/o io-scheduler attached, quiesce request_queue cannot work, we need to freeze the queue and drain all of tasks that enters blk_mq_make_request
to rule out all of attempts of allocating driver tags. Unfortunately, we cannot distinguish between the task entering .make_request and allocated
requests both of which occupies q_usage_counter.

So to stop all of the attmpts of allocating driver tag, we have freeze & drain and quiesce the request_queue.


Thanks
Jianchao

> So here we only need to protect against I/O being completed, which is not what this patchset is about.
> 
> So my objection still stands: Please, don't.
> 
> Cheers,
> 
> Hannes
Jens Axboe March 26, 2019, 2:17 p.m. UTC | #5
On 3/25/19 2:25 AM, Hannes Reinecke wrote:
> On 3/25/19 8:37 AM, jianchao.wang wrote:
>> Hi Hannes
>>
>> On 3/25/19 3:18 PM, Hannes Reinecke wrote:
>>> On 3/25/19 6:38 AM, Jianchao Wang wrote:
>>>> As nobody uses blk_mq_tagset_busy_iter, remove it.
>>>>
>>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>>> ---
>>>>    block/blk-mq-tag.c     | 95 --------------------------------------------------
>>>>    include/linux/blk-mq.h |  2 --
>>>>    2 files changed, 97 deletions(-)
>>>>
>>> Please, don't.
>>>
>>> I'm currently implementing reserved commands for SCSI and reworking
>>> the SCSI error handling where I rely on this interface quite
>>> heavily.
>>
>>
>> blk_mq_tagset_busy_iter could access some stale requests which maybe
>> freed due to io scheduler switching, request_queue cleanup (shared
>> tagset) when there is someone submits io and gets driver tag. When io
>> scheduler attached, even quiesce request_queue won't work.
>>
>> If this patchset is accepted, blk_mq_tagset_busy_iter could be
>> replaced with blk_mq_queue_inflight_tag_iter which needs to be
>> invoked by every request_queue that shares the tagset.
>>
> The point is, at that time I do _not_ have a request queue to work
> with.
> 
> Most SCSI drivers have a host-wide shared tagset, which is used by all
> request queues on that host.  Iterating over the shared tagset is far
> more efficient than to traverse over all devices and the attached
> request queues.
> 
> If I had to traverse all request queues I would need to add additional
> locking to ensure this traversal is race-free, making it a really
> cumbersome interface to use.
> 
> Plus the tagset iter is understood to be used only in cases where I/O
> is stopped from the upper layers (ie no new I/O will be submitted).
> So here we only need to protect against I/O being completed, which is
> not what this patchset is about.
> 
> So my objection still stands: Please, don't.

We can't just keep an interface that's hard to use correctly just
because you have something pending for that. Jianchao has good
suggestions for you on how to proceed.
diff mbox series

Patch

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index bf7b235..a6a28dd 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -279,101 +279,6 @@  static void bt_for_each(struct blk_mq_hw_ctx *hctx, struct sbitmap_queue *bt,
 	sbitmap_for_each_set(&bt->sb, bt_iter, &iter_data);
 }
 
-struct bt_tags_iter_data {
-	struct blk_mq_tags *tags;
-	busy_tag_iter_fn *fn;
-	void *data;
-	bool reserved;
-};
-
-static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
-{
-	struct bt_tags_iter_data *iter_data = data;
-	struct blk_mq_tags *tags = iter_data->tags;
-	bool reserved = iter_data->reserved;
-	struct request *rq;
-
-	if (!reserved)
-		bitnr += tags->nr_reserved_tags;
-
-	/*
-	 * We can hit rq == NULL here, because the tagging functions
-	 * test and set the bit before assining ->rqs[].
-	 */
-	rq = tags->rqs[bitnr];
-	if (rq && blk_mq_request_started(rq))
-		return iter_data->fn(rq, iter_data->data, reserved);
-
-	return true;
-}
-
-/**
- * bt_tags_for_each - iterate over the requests in a tag map
- * @tags:	Tag map to iterate over.
- * @bt:		sbitmap to examine. This is either the breserved_tags member
- *		or the bitmap_tags member of struct blk_mq_tags.
- * @fn:		Pointer to the function that will be called for each started
- *		request. @fn will be called as follows: @fn(rq, @data,
- *		@reserved) where rq is a pointer to a request. Return true
- *		to continue iterating tags, false to stop.
- * @data:	Will be passed as second argument to @fn.
- * @reserved:	Indicates whether @bt is the breserved_tags member or the
- *		bitmap_tags member of struct blk_mq_tags.
- */
-static void bt_tags_for_each(struct blk_mq_tags *tags, struct sbitmap_queue *bt,
-			     busy_tag_iter_fn *fn, void *data, bool reserved)
-{
-	struct bt_tags_iter_data iter_data = {
-		.tags = tags,
-		.fn = fn,
-		.data = data,
-		.reserved = reserved,
-	};
-
-	if (tags->rqs)
-		sbitmap_for_each_set(&bt->sb, bt_tags_iter, &iter_data);
-}
-
-/**
- * blk_mq_all_tag_busy_iter - iterate over all started requests in a tag map
- * @tags:	Tag map to iterate over.
- * @fn:		Pointer to the function that will be called for each started
- *		request. @fn will be called as follows: @fn(rq, @priv,
- *		reserved) where rq is a pointer to a request. 'reserved'
- *		indicates whether or not @rq is a reserved request. Return
- *		true to continue iterating tags, false to stop.
- * @priv:	Will be passed as second argument to @fn.
- */
-static void blk_mq_all_tag_busy_iter(struct blk_mq_tags *tags,
-		busy_tag_iter_fn *fn, void *priv)
-{
-	if (tags->nr_reserved_tags)
-		bt_tags_for_each(tags, &tags->breserved_tags, fn, priv, true);
-	bt_tags_for_each(tags, &tags->bitmap_tags, fn, priv, false);
-}
-
-/**
- * blk_mq_tagset_busy_iter - iterate over all started requests in a tag set
- * @tagset:	Tag set to iterate over.
- * @fn:		Pointer to the function that will be called for each started
- *		request. @fn will be called as follows: @fn(rq, @priv,
- *		reserved) where rq is a pointer to a request. 'reserved'
- *		indicates whether or not @rq is a reserved request. Return
- *		true to continue iterating tags, false to stop.
- * @priv:	Will be passed as second argument to @fn.
- */
-void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
-		busy_tag_iter_fn *fn, void *priv)
-{
-	int i;
-
-	for (i = 0; i < tagset->nr_hw_queues; i++) {
-		if (tagset->tags && tagset->tags[i])
-			blk_mq_all_tag_busy_iter(tagset->tags[i], fn, priv);
-	}
-}
-EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
-
 /**
  * __blk_mq_queue_tag_busy_iter - iterate over all busy or inflight tags
  * @q:		Request queue to examine.
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index dff9bb6..3ba8001 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -318,8 +318,6 @@  void blk_mq_unquiesce_queue(struct request_queue *q);
 void blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs);
 bool blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
 void blk_mq_run_hw_queues(struct request_queue *q, bool async);
-void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
-		busy_tag_iter_fn *fn, void *priv);
 void blk_mq_queue_tag_inflight_iter(struct request_queue *q,
 		busy_iter_fn *fn, void *priv);
 void blk_mq_freeze_queue(struct request_queue *q);