
[rfc,04/10] block: Add a non-selective polling interface

Message ID 1489065402-14757-5-git-send-email-sagi@grimberg.me (mailing list archive)
State RFC
Headers show

Commit Message

Sagi Grimberg March 9, 2017, 1:16 p.m. UTC
For a server/target appliance mode where we don't
necessarily care about specific IOs but rather want
to poll opportunistically, it is useful to have a
non-selective polling interface.

Expose blk_mq_poll_batch, a batched blkdev polling
interface, so our nvme target (and others) can use it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 block/blk-mq.c         | 14 ++++++++++++++
 include/linux/blk-mq.h |  2 ++
 include/linux/blkdev.h |  1 +
 3 files changed, 17 insertions(+)

Comments

Johannes Thumshirn March 9, 2017, 1:44 p.m. UTC | #1
On 03/09/2017 02:16 PM, Sagi Grimberg wrote:
> For a server/target appliance mode where we don't
> necessarily care about specific IOs but rather want
> to poll opportunisticly, it is useful to have a
> non-selective polling interface.
> 
> Expose a blk_poll_batch for a batched blkdev polling
> interface so our nvme target (and others) can use.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>  block/blk-mq.c         | 14 ++++++++++++++
>  include/linux/blk-mq.h |  2 ++
>  include/linux/blkdev.h |  1 +
>  3 files changed, 17 insertions(+)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index b2fd175e84d7..1962785b571a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2911,6 +2911,20 @@ bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
>  }
>  EXPORT_SYMBOL_GPL(blk_mq_poll);
>  
> +int blk_mq_poll_batch(struct request_queue *q, unsigned int batch)
> +{
> +	struct blk_mq_hw_ctx *hctx;
> +
> +	if (!q->mq_ops || !q->mq_ops->poll_batch)
> +		return 0;
> +
> +	hctx = blk_mq_map_queue(q, smp_processor_id());
> +	return q->mq_ops->poll_batch(hctx, batch);
> +}
> +EXPORT_SYMBOL_GPL(blk_mq_poll_batch);
> +
> +
> +

There are quite a few extra blank lines, and I'm not really fond of the
->poll_batch() name. It's a bit confusing alongside ->poll(), and we also
have irq_poll(). But the only alternative that comes to mind is
complete_batch(), which "races" with ->complete().

Otherwise looks OK,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Bart Van Assche March 9, 2017, 4:25 p.m. UTC | #2
On Thu, 2017-03-09 at 15:16 +0200, Sagi Grimberg wrote:
> +int blk_mq_poll_batch(struct request_queue *q, unsigned int batch)
> +{
> +	struct blk_mq_hw_ctx *hctx;
> +
> +	if (!q->mq_ops || !q->mq_ops->poll_batch)
> +		return 0;
> +
> +	hctx = blk_mq_map_queue(q, smp_processor_id());
> +	return q->mq_ops->poll_batch(hctx, batch);
> +}
> +EXPORT_SYMBOL_GPL(blk_mq_poll_batch);

A new exported function without any documentation? Wow. Please add a header
above this function that documents at least which other completion processing
code can execute concurrently with this function and from which contexts the
other completion processing code can be called (e.g. blk_mq_poll() and
.complete()).

Why return 0 if (!q->mq_ops || !q->mq_ops->poll_batch)? Shouldn't that be a
WARN_ON_ONCE() instead? I think it is an error to call blk_mq_poll_batch()
against a queue that does not define .poll_batch().

Additionally, I think making the hardware context an argument of this function
instead of using blk_mq_map_queue(q, smp_processor_id()) would make this
function much more versatile.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Damien Le Moal March 10, 2017, 3:04 a.m. UTC | #3
On 3/9/17 22:44, Johannes Thumshirn wrote:
> On 03/09/2017 02:16 PM, Sagi Grimberg wrote:
>> For a server/target appliance mode where we don't
>> necessarily care about specific IOs but rather want
>> to poll opportunisticly, it is useful to have a
>> non-selective polling interface.
>>
>> Expose a blk_poll_batch for a batched blkdev polling
>> interface so our nvme target (and others) can use.
>>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>> ---
>>  block/blk-mq.c         | 14 ++++++++++++++
>>  include/linux/blk-mq.h |  2 ++
>>  include/linux/blkdev.h |  1 +
>>  3 files changed, 17 insertions(+)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index b2fd175e84d7..1962785b571a 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2911,6 +2911,20 @@ bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
>>  }
>>  EXPORT_SYMBOL_GPL(blk_mq_poll);
>>  
>> +int blk_mq_poll_batch(struct request_queue *q, unsigned int batch)
>> +{
>> +	struct blk_mq_hw_ctx *hctx;
>> +
>> +	if (!q->mq_ops || !q->mq_ops->poll_batch)
>> +		return 0;
>> +
>> +	hctx = blk_mq_map_queue(q, smp_processor_id());
>> +	return q->mq_ops->poll_batch(hctx, batch);
>> +}
>> +EXPORT_SYMBOL_GPL(blk_mq_poll_batch);
>> +
>> +
>> +
> 
> Quite some additional newlines and I'm not really fond of the
> ->poll_batch() name. It's a bit confusing with ->poll() and we also have
> irq_poll(). But the only thing that would come to my mind is
> complete_batch() which "races" with ->complete().

What about ->check_completions()? After all, that is what both ->poll()
and ->poll_batch() do, just with a different stop condition, no?
It would also be easy to merge the two: both the tag and the batch limit
are unsigned int, so they could share a "cookie" argument, with a flag
telling the driver how to interpret it (as a tag or as a batch limit).
e.g. something like:

+typedef int (check_completions_fn)(struct blk_mq_hw_ctx *,
		enum blk_mq_check_flags, /* flag (TAG or BATCH) */
		unsigned int); /* Target tag or batch limit */

Best regards.
Sagi Grimberg March 13, 2017, 8:15 a.m. UTC | #4
>> +int blk_mq_poll_batch(struct request_queue *q, unsigned int batch)
>> +{
>> +	struct blk_mq_hw_ctx *hctx;
>> +
>> +	if (!q->mq_ops || !q->mq_ops->poll_batch)
>> +		return 0;
>> +
>> +	hctx = blk_mq_map_queue(q, smp_processor_id());
>> +	return q->mq_ops->poll_batch(hctx, batch);
>> +}
>> +EXPORT_SYMBOL_GPL(blk_mq_poll_batch);
>
> A new exported function without any documentation? Wow.

I just copied blk_mq_poll export...

> Please add a header
> above this function that documents at least which other completion processing
> code can execute concurrently with this function and from which contexts the
> other completion processing code can be called (e.g. blk_mq_poll() and
> .complete()).

I can do that, and I'll document blk_mq_poll too.

> Why to return if (!q->mq_ops || !q->mq_ops->poll_batch)? Shouldn't that be a
> WARN_ON_ONCE() instead? I think it is an error to calling blk_mq_poll_batch()
> against a queue that does not define .poll_batch().

Not really: we don't know whether the block driver actually supports
poll_batch (or poll, for that matter). Instead of making every caller
check the condition at the call site, we check it inside the call.

It's not really a bug, it's harmless.

> Additionally, I think making the hardware context an argument of this function
> instead of using blk_mq_map_queue(q, smp_processor_id()) would make this
> function much more versatile.

What do you mean? Remember that the caller's interface to the device is
a request queue; it doesn't even know if it's a blk-mq device. Can you
explain in more detail what you would like to see?
Sagi Grimberg March 13, 2017, 8:26 a.m. UTC | #5
>> Quite some additional newlines and I'm not really fond of the
>> ->poll_batch() name. It's a bit confusing with ->poll() and we also have
>> irq_poll(). But the only thing that would come to my mind is
>> complete_batch() which "races" with ->complete().
>
> What about ->check_completions()? After all, that is what both ->poll()
> and ->poll_batch do but with a different stop condition, no ?
> So it would also be easy to merge the two: both tag and batch are
> unsigned int which could be called "cookie", and add a flag to tell how
> to interpret it (as a tag or a batch limit).
> e.g. something like:
>
> +typedef int (check_completions_fn)(struct blk_mq_hw_ctx *,
> 		enum blk_mq_check_flags, /* flag (TAG or BATCH) */
> 		unsigned int); /* Target tag or batch limit */
>

I'd rather not unite poll/poll_batch, but if this is something
that people want I can definitely do it.
Bart Van Assche March 14, 2017, 9:21 p.m. UTC | #6
On Mon, 2017-03-13 at 10:15 +0200, Sagi Grimberg wrote:
> > Additionally, I think making the hardware context an argument of this function
> > instead of using blk_mq_map_queue(q, smp_processor_id()) would make this
> > function much more versatile.
> 
> What do you mean? remember that the callers interface to the device is
> a request queue, it doesn't even know if its a blk-mq device. Can you
> explain in more details what you would like to see?

Have you considered making the CPU ID an argument instead of using the
smp_processor_id() result?

Bart.

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index b2fd175e84d7..1962785b571a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2911,6 +2911,20 @@  bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 }
 EXPORT_SYMBOL_GPL(blk_mq_poll);
 
+int blk_mq_poll_batch(struct request_queue *q, unsigned int batch)
+{
+	struct blk_mq_hw_ctx *hctx;
+
+	if (!q->mq_ops || !q->mq_ops->poll_batch)
+		return 0;
+
+	hctx = blk_mq_map_queue(q, smp_processor_id());
+	return q->mq_ops->poll_batch(hctx, batch);
+}
+EXPORT_SYMBOL_GPL(blk_mq_poll_batch);
+
+
+
 void blk_mq_disable_hotplug(void)
 {
 	mutex_lock(&all_q_mutex);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index b296a9006117..e1f33cad3067 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -100,6 +100,7 @@  typedef void (busy_iter_fn)(struct blk_mq_hw_ctx *, struct request *, void *,
 typedef void (busy_tag_iter_fn)(struct request *, void *, bool);
 typedef int (poll_fn)(struct blk_mq_hw_ctx *, unsigned int);
 typedef int (map_queues_fn)(struct blk_mq_tag_set *set);
+typedef int (poll_batch_fn)(struct blk_mq_hw_ctx *, unsigned int);
 
 
 struct blk_mq_ops {
@@ -117,6 +118,7 @@  struct blk_mq_ops {
 	 * Called to poll for completion of a specific tag.
 	 */
 	poll_fn			*poll;
+	poll_batch_fn		*poll_batch;
 
 	softirq_done_fn		*complete;
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 796016e63c1d..a93507e61a57 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -971,6 +971,7 @@  extern void blk_execute_rq_nowait(struct request_queue *, struct gendisk *,
 				  struct request *, int, rq_end_io_fn *);
 
 bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
+int blk_mq_poll_batch(struct request_queue *q, unsigned int batch);
 
 static inline struct request_queue *bdev_get_queue(struct block_device *bdev)
 {