
block: Avoid polling configuration errors

Message ID 20240531091015.2636025-1-xue01.he@samsung.com (mailing list archive)
State New, archived
Series: block: Avoid polling configuration errors

Commit Message

hexue May 31, 2024, 9:10 a.m. UTC
A misconfiguration is possible: if an application does polled IO
against a device that has no poll queue, the process keeps issuing
syscalls between user space and kernel space as in normal polled IO,
so CPU utilization reaches 100%, while the IO actually completes via
interrupts.

This patch returns an error indicating the operation is not supported
when the underlying device has no poll queue, avoiding wasting both
performance and CPU time.

Signed-off-by: hexue <xue01.he@samsung.com>
---
 block/blk-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
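
For context, a minimal userspace sketch that reproduces the
misconfiguration. It assumes liburing and an NVMe drive whose driver
was loaded without poll queues (e.g. nvme.poll_queues=0); the device
path and sizes are illustrative:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	void *buf;
	int fd, ret;

	/* IOPOLL requires O_DIRECT */
	fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);
	if (fd < 0 || posix_memalign(&buf, 4096, 4096))
		return 1;

	/* Reap completions by polling instead of waiting for IRQs */
	ret = io_uring_queue_init(8, &ring, IORING_SETUP_IOPOLL);
	if (ret < 0)
		return 1;

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_read(sqe, fd, buf, 4096, 0);
	io_uring_submit(&ring);

	/* If the device has no poll queues, this spins at 100% CPU
	 * until the interrupt-driven completion is found. */
	ret = io_uring_wait_cqe(&ring, &cqe);
	if (!ret)
		io_uring_cqe_seen(&ring, cqe);

	io_uring_queue_exit(&ring);
	close(fd);
	free(buf);
	return ret ? 1 : 0;
}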

Comments

Christoph Hellwig June 1, 2024, 7:38 a.m. UTC | #1
On Fri, May 31, 2024 at 05:10:15PM +0800, hexue wrote:
> A misconfiguration is possible: if an application does polled IO
> against a device that has no poll queue, the process keeps issuing
> syscalls between user space and kernel space as in normal polled IO,
> so CPU utilization reaches 100%, while the IO actually completes via
> interrupts.
> 
> This patch returns an error indicating the operation is not supported
> when the underlying device has no poll queue, avoiding wasting both
> performance and CPU time.

This feels like the wrong place to check for this.

As we've dropped synchronous polling, we now only support
thread-based polling, right now only through io_uring.

So we need to ensure REQ_POLLED doesn't even get set for any
other I/O.
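
For reference, REQ_POLLED is applied per bio at submission time when
the issuing kiocb carries IOCB_HIPRI, which io_uring sets for rings
created with IORING_SETUP_IOPOLL. A paraphrased sketch of that marking
(the helper name is illustrative, not verbatim kernel source):

/* Paraphrased sketch: the direct IO path propagates the kiocb's
 * IOCB_HIPRI flag (set by io_uring for IOPOLL rings) to the bio. */
static inline void mark_bio_polled(struct bio *bio, struct kiocb *kiocb)
{
	if (kiocb->ki_flags & IOCB_HIPRI)
		bio->bi_opf |= REQ_POLLED;
}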
hexue June 5, 2024, 9:01 a.m. UTC | #2
On Sat, Jun 1, 2024 at 00:38:49 -0700, Christoph Hellwig wrote:
>On Fri, May 31, 2024 at 05:10:15PM +0800, hexue wrote:
>> A misconfiguration is possible: if an application does polled IO
>> against a device that has no poll queue, the process keeps issuing
>> syscalls between user space and kernel space as in normal polled IO,
>> so CPU utilization reaches 100%, while the IO actually completes via
>> interrupts.
>> 
>> This patch returns an error indicating the operation is not supported
>> when the underlying device has no poll queue, avoiding wasting both
>> performance and CPU time.
>
>This feels like the wrong place to check for this.
>
>As we've dropped synchronous polling, we now only support
>thread-based polling, right now only through io_uring.
>
>So we need to ensure REQ_POLLED doesn't even get set for any
>other I/O.

Sorry, I'm not entirely clear on the impact of REQ_POLLED in this
context. From what I found, REQ_POLLED is set on a request when it is
polled via io_uring or io_uring_cmd, but here we only check the state
of the request_queue.

Do you mean that making this check here may not be a good choice,
because it could affect other IO (taking the same path), and that the
change should instead target io_uring?
Jens Axboe June 12, 2024, 8:53 p.m. UTC | #3
On 6/1/24 1:38 AM, Christoph Hellwig wrote:
> On Fri, May 31, 2024 at 05:10:15PM +0800, hexue wrote:
>> A misconfiguration is possible: if an application does polled IO
>> against a device that has no poll queue, the process keeps issuing
>> syscalls between user space and kernel space as in normal polled IO,
>> so CPU utilization reaches 100%, while the IO actually completes via
>> interrupts.
>>
>> This patch returns an error indicating the operation is not supported
>> when the underlying device has no poll queue, avoiding wasting both
>> performance and CPU time.
> 
> This feels like the wrong place to check for this.
> 
> As we've dropped synchronous polling, we now only support
> thread-based polling, right now only through io_uring.
> 
> So we need to ensure REQ_POLLED doesn't even get set for any
> other I/O.

We happily allow async polled IO, even if the destination queue
isn't polled (or doesn't exist). This is different from the old
sync polled support.

It'll work just fine, it just won't really do what you expect in the
sense that IRQs are still being triggered. The app side won't wait,
however; it'll just busy-poll on the completion and either race with
the IRQ delivery or find it once completed.

So I think the bigger question here is if we want to change that. It
can indicate a bad configuration, but there's also risk there in terms
of breaking a setup that already works for someone. You'd get
-EOPNOTSUPP rather than just (suboptimal) completed IO.
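
For illustration, the busy-poll described here as seen from the
application side, as a liburing sketch (ring setup omitted):

struct io_uring_cqe *cqe;

/* Busy-poll for the completion. With no poll queue this burns a
 * full CPU until the IRQ path completes the request and the CQE
 * becomes visible. */
while (io_uring_peek_cqe(&ring, &cqe) == -EAGAIN)
	;
io_uring_cqe_seen(&ring, cqe);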
Christoph Hellwig June 13, 2024, 8:07 a.m. UTC | #4
On Wed, Jun 12, 2024 at 02:53:27PM -0600, Jens Axboe wrote:
> > So we need to ensure REQ_POLLED doesn't even get set for any
> > other I/O.
> 
> We happily allow async polled IO, even if the destination queue
> isn't polled (or doesn't exist). This is different from the old
> sync polled support.

Yes, and for that to work we can't start returning -EOPNOTSUPP as in
this patch, as BLK_QC_T_NONE can be cleared for all kinds of reasons.

So if we want some kind of error handling that people don't even
bother to poll for devices where it is not supported we need that
check much earlier (probably in io_uring).
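
A sketch of the kind of early check being suggested, as a hypothetical
io_uring-side helper (io_uring has no such hook today; the helper name
is illustrative, it assumes the file is a raw block device, and it only
sees the top-level queue, not what a stacked device resolves to further
down):

/* Hypothetical: reject IOPOLL up front when the block device's
 * top-level queue has no poll queues. */
static int io_check_iopoll_support(struct file *file)
{
	struct block_device *bdev = I_BDEV(file->f_mapping->host);
	struct request_queue *q = bdev_get_queue(bdev);

	if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
		return -EOPNOTSUPP;
	return 0;
}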
Jens Axboe June 16, 2024, 2:39 a.m. UTC | #5
On 6/13/24 2:07 AM, Christoph Hellwig wrote:
> On Wed, Jun 12, 2024 at 02:53:27PM -0600, Jens Axboe wrote:
>>> So we need to ensure REQ_POLLED doesn't even get set for any
>>> other I/O.
>>
>> We happily allow async polled IO, even if the destination queue
>> isn't polled (or doesn't exist). This is different from the old
>> sync polled support.
> 
> Yes, and for that to work we can't start returning -EOPNOTSUPP as in
> this patch, as BLK_QC_T_NONE can be cleared for all kinds of reasons.
> 
> So if we want some kind of error handling that people don't even
> bother to poll for devices where it is not supported we need that
> check much earlier (probably in io_uring).

There's just no way we can do that, who knows if you'll run into a
polled queue or not further down the stack.

IMHO there's nothing wrong with the current code. If you do stupid
things (do polled IO without having polled queues), then you get to
collect stupid prizes (potentially excessive CPU usage).
hexue June 19, 2024, 6:38 a.m. UTC | #6
On 6/16/24 2:39 AM, Jens Axboe wrote:
>On 6/13/24 2:07 AM, Christoph Hellwig wrote:
>>> We happily allow async polled IO, even if the destination queue
>>> isn't polled (or doesn't exist). This is different from the old
>>> sync polled support.
>> 
>> Yes, and for that to work we can't start returning -EOPNOTSUPP as in
>> this patch, as BLK_QC_T_NONE can be cleared for all kinds of reasons.
>> 
>> So if we want some kind of error handling that people don't even
>> bother to poll for devices where it is not supported we need that
>> check much earlier (probably in io_uring).
>
>There's just no way we can do that, who knows if you'll run into a
>polled queue or not further down the stack.
>
>IMHO there's nothing wrong with the current code. If you do stupid
>things (do polled IO without having polled queues), then you get to
>collect stupid prizes (potentially excessive CPU usage).

I think the problem is that when the user makes this incorrect
configuration, there is no error feedback, so the user is unaware of
it and can easily draw wrong conclusions about performance. I would
like to add some feedback to help users correct the configuration
more easily.

I take your point, so I'm considering whether it would be more
reasonable not to return -EOPNOTSUPP and stop the operation directly.
Instead, we could detect this condition and emit a notification (a
warning, perhaps?), letting the user know without disrupting the
original flow. Do you think this approach is more reasonable?
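
A sketch of that alternative, keeping bio_poll()'s return value
unchanged and warning once instead (the message text is illustrative):

	q = bdev_get_queue(bdev);
	if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags)) {
		/* Assumption: warn once rather than failing the IO */
		pr_warn_once("%s: polled IO on a queue without poll queues\n",
			     bdev->bd_disk->disk_name);
	}
	if (cookie == BLK_QC_T_NONE ||
	    !test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
		return 0;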

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 01186333c88e..0afcd74ae939 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -913,7 +913,7 @@ int bio_poll(struct bio *bio, struct io_comp_batch *iob, unsigned int flags)
 	q = bdev_get_queue(bdev);
 	if (cookie == BLK_QC_T_NONE ||
 	    !test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
-		return 0;
+		return -EOPNOTSUPP;
 
 	blk_flush_plug(current->plug, false);