diff mbox series

target/file: don't zero iter before iov_iter_bvec

Message ID 34cd22d6cec046e3adf402accb1453cc255b9042.1610207523.git.asml.silence@gmail.com (mailing list archive)
State Accepted
Headers show
Series target/file: don't zero iter before iov_iter_bvec | expand

Commit Message

Pavel Begunkov Jan. 9, 2021, 3:53 p.m. UTC
iov_iter_bvec() initialises iterators well, no need to pre-zero it
beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it
out and generate extra code for that (confirmed with assembly).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 drivers/target/target_core_file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Chaitanya Kulkarni Jan. 9, 2021, 8:09 p.m. UTC | #1
On 1/9/21 07:59, Pavel Begunkov wrote:
> iov_iter_bvec() initialises iterators well, no need to pre-zero it
> beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it
> out and generate extra code for that (confirmed with assembly).
It will be great if we can quantify this optimization with the actual
performance
numbers.
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  drivers/target/target_core_file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
> index cce455929778..5a66854def95 100644
> --- a/drivers/target/target_core_file.c
> +++ b/drivers/target/target_core_file.c
> @@ -267,7 +267,7 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
>  	struct fd_dev *fd_dev = FD_DEV(dev);
>  	struct file *file = fd_dev->fd_file;
>  	struct target_core_file_cmd *aio_cmd;
> -	struct iov_iter iter = {};
> +	struct iov_iter iter;
>  	struct scatterlist *sg;
>  	ssize_t len = 0;
>  	int ret = 0, i;
Pavel Begunkov Jan. 9, 2021, 8:37 p.m. UTC | #2
On 09/01/2021 20:09, Chaitanya Kulkarni wrote:
> On 1/9/21 07:59, Pavel Begunkov wrote:
>> iov_iter_bvec() initialises iterators well, no need to pre-zero it
>> beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it
>> out and generate extra code for that (confirmed with assembly).
> It will be great if we can quantify this optimization with the actual
> performance
> numbers.

I expect you won't find any, but such little things can pile up
into a not-easy-to-spot overhead over time.

In any case, I don't think this requires performance justification
because it neither makes it less safe or uglier. Those iov_iter*()
are there to handle initialisation, that's a part of the iter API.

>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>  drivers/target/target_core_file.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
>> index cce455929778..5a66854def95 100644
>> --- a/drivers/target/target_core_file.c
>> +++ b/drivers/target/target_core_file.c
>> @@ -267,7 +267,7 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
>>  	struct fd_dev *fd_dev = FD_DEV(dev);
>>  	struct file *file = fd_dev->fd_file;
>>  	struct target_core_file_cmd *aio_cmd;
>> -	struct iov_iter iter = {};
>> +	struct iov_iter iter;
>>  	struct scatterlist *sg;
>>  	ssize_t len = 0;
>>  	int ret = 0, i;
>
Chaitanya Kulkarni Jan. 9, 2021, 8:52 p.m. UTC | #3
On 1/9/21 12:40, Pavel Begunkov wrote:
> I expect you won't find any, but such little things can pile up
> into a not-easy-to-spot overhead over time.

That is what I suspected with the resulting assembly. The commit log
needs to document that there is no direct impact on the performance
which can be seen with this patch, but this is nice to have
micro-optimization in the long run.
Pavel Begunkov Jan. 9, 2021, 9:25 p.m. UTC | #4
On 09/01/2021 20:52, Chaitanya Kulkarni wrote:
> On 1/9/21 12:40, Pavel Begunkov wrote:
>> I expect you won't find any, but such little things can pile up
>> into a not-easy-to-spot overhead over time.
> 
> That is what I suspected with the resulting assembly. The commit log
> needs to document that there is no direct impact on the performance

It's obvious that 3-4 extra mov $0 off(%reg) won't change performance
but still hasn't been formally confirmed ...

> which can be seen with this patch, but this is nice to have

... so if you don't mind, I won't be resending just for that.
Chaitanya Kulkarni Jan. 11, 2021, 2:06 a.m. UTC | #5
On 1/9/21 13:29, Pavel Begunkov wrote:
> On 09/01/2021 20:52, Chaitanya Kulkarni wrote:
>> On 1/9/21 12:40, Pavel Begunkov wrote:
>>> I expect you won't find any, but such little things can pile up
>>> into a not-easy-to-spot overhead over time.
>> That is what I suspected with the resulting assembly. The commit log
>> needs to document that there is no direct impact on the performance
> It's obvious that 3-4 extra mov $0 off(%reg) won't change performance
> but still hasn't been formally confirmed ...
This is obvious for you and me since we spent time into looking into
resulting assembly not every reviewer is expected to do that see [1].
>
>> which can be seen with this patch, but this is nice to have
> ... so if you don't mind, I won't be resending just for that.
As per commit log guidelines [1] you have to quantify the optimization.

Since you cannot quantify the optimization modify the commit log explaining
that there is not significant performance benefit observe.
> -- Pavel Begunkov
[1] https://www.kernel.org/doc/html/v4.10/process/submitting-patches.html
Pavel Begunkov Jan. 11, 2021, 2:28 a.m. UTC | #6
On 11/01/2021 02:06, Chaitanya Kulkarni wrote:
> On 1/9/21 13:29, Pavel Begunkov wrote:
>> On 09/01/2021 20:52, Chaitanya Kulkarni wrote:
>>> On 1/9/21 12:40, Pavel Begunkov wrote:
>>>> I expect you won't find any, but such little things can pile up
>>>> into a not-easy-to-spot overhead over time.
>>> That is what I suspected with the resulting assembly. The commit log
>>> needs to document that there is no direct impact on the performance
>> It's obvious that 3-4 extra mov $0 off(%reg) won't change performance
>> but still hasn't been formally confirmed ...
> This is obvious for you and me since we spent time into looking into
> resulting assembly not every reviewer is expected to do that see [1].
>>
>>> which can be seen with this patch, but this is nice to have
>> ... so if you don't mind, I won't be resending just for that.
> As per commit log guidelines [1] you have to quantify the optimization.
> 
> Since you cannot quantify the optimization modify the commit log explaining

And then you see "Optimizations usually aren’t free but trade-offs
between", and the patch doesn't fall under it.

Let me be frank, I see it more like as a whim. If the maintainer agrees
with that strange requirement of yours and want to bury it under
bureaucracy, fine by me, don't take it, I don't care, but I haven't
ever been asked here to do that for patches as this.

> that there is not significant performance benefit observe.

It's not "I cannot" but rather "I haven't even tried to and expect...".
Don't mix, there is a huge difference between.
Bart Van Assche Jan. 11, 2021, 2:49 a.m. UTC | #7
On 1/9/21 7:53 AM, Pavel Begunkov wrote:
> iov_iter_bvec() initialises iterators well, no need to pre-zero it
> beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it
> out and generate extra code for that (confirmed with assembly).

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Chaitanya Kulkarni Jan. 11, 2021, 5:23 a.m. UTC | #8
On 1/10/21 18:32, Pavel Begunkov wrote:
> On 11/01/2021 02:06, Chaitanya Kulkarni wrote:
>> On 1/9/21 13:29, Pavel Begunkov wrote:
>>> On 09/01/2021 20:52, Chaitanya Kulkarni wrote:
>>>> On 1/9/21 12:40, Pavel Begunkov wrote:
>>>>> I expect you won't find any, but such little things can pile up
>>>>> into a not-easy-to-spot overhead over time.
>>>> That is what I suspected with the resulting assembly. The commit log
>>>> needs to document that there is no direct impact on the performance
>>> It's obvious that 3-4 extra mov $0 off(%reg) won't change performance
>>> but still hasn't been formally confirmed ...
>> This is obvious for you and me since we spent time into looking into
>> resulting assembly not every reviewer is expected to do that see [1].
>>>> which can be seen with this patch, but this is nice to have
>>> ... so if you don't mind, I won't be resending just for that.
>> As per commit log guidelines [1] you have to quantify the optimization.
>>
>> Since you cannot quantify the optimization modify the commit log explaining
> And then you see "Optimizations usually aren’t free but trade-offs
> between", and the patch doesn't fall under it.
First part applies to all the optimizations with and without tradeoffs
"Quantify optimizations and trade-offs."
The later part doesn't mean optimizations without trade-offs should be
allowed without having any supportive data.
>
> Let me be frank, I see it more like as a whim. If the maintainer agrees
> with that strange requirement of yours and want to bury it under
> bureaucracy, fine by me, don't take it, I don't care, but I haven't
> ever been asked here to do that for patches as this.
I didn't write the commit log guidelines, as a reviewer I'm following them.
The patch commit log claims optimization with neither having any data nor
having the supporting fact ("possibly no observable difference but in the
long term it matters") for the completeness.
> It's not "I cannot" but rather "I haven't even tried to and expect...".
> Don't mix, there is a huge difference between.
Then provide the numbers to support your claim.
Pavel Begunkov Jan. 11, 2021, 5:31 a.m. UTC | #9
On 11/01/2021 05:23, Chaitanya Kulkarni wrote:
> On 1/10/21 18:32, Pavel Begunkov wrote:
>> On 11/01/2021 02:06, Chaitanya Kulkarni wrote:
>>> On 1/9/21 13:29, Pavel Begunkov wrote:
>>>> On 09/01/2021 20:52, Chaitanya Kulkarni wrote:
>>>>> On 1/9/21 12:40, Pavel Begunkov wrote:
>>>>>> I expect you won't find any, but such little things can pile up
>>>>>> into a not-easy-to-spot overhead over time.
>>>>> That is what I suspected with the resulting assembly. The commit log
>>>>> needs to document that there is no direct impact on the performance
>>>> It's obvious that 3-4 extra mov $0 off(%reg) won't change performance
>>>> but still hasn't been formally confirmed ...
>>> This is obvious for you and me since we spent time into looking into
>>> resulting assembly not every reviewer is expected to do that see [1].
>>>>> which can be seen with this patch, but this is nice to have
>>>> ... so if you don't mind, I won't be resending just for that.
>>> As per commit log guidelines [1] you have to quantify the optimization.
>>>
>>> Since you cannot quantify the optimization modify the commit log explaining
>> And then you see "Optimizations usually aren’t free but trade-offs
>> between", and the patch doesn't fall under it.
> First part applies to all the optimizations with and without tradeoffs
> "Quantify optimizations and trade-offs."
> The later part doesn't mean optimizations without trade-offs should be
> allowed without having any supportive data.
>>
>> Let me be frank, I see it more like as a whim. If the maintainer agrees
>> with that strange requirement of yours and want to bury it under
>> bureaucracy, fine by me, don't take it, I don't care, but I haven't
>> ever been asked here to do that for patches as this.
> I didn't write the commit log guidelines, as a reviewer I'm following them.
> The patch commit log claims optimization with neither having any data nor
> having the supporting fact ("possibly no observable difference but in the
> long term it matters") for the completeness.
>> It's not "I cannot" but rather "I haven't even tried to and expect...".
>> Don't mix, there is a huge difference between.
> Then provide the numbers to support your claim.

What claim? I didn't make any regarding performance, you may want to
re-read the commit message.

Anyway, I'll halt replying to this topic. Nothing personal, but it's
getting annoying.
Martin K. Petersen Jan. 13, 2021, 5:34 a.m. UTC | #10
Pavel,

> iov_iter_bvec() initialises iterators well, no need to pre-zero it
> beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it
> out and generate extra code for that (confirmed with assembly).

Applied to 5.12/scsi-staging, thanks!
Martin K. Petersen Jan. 15, 2021, 4:08 a.m. UTC | #11
On Sat, 9 Jan 2021 15:53:27 +0000, Pavel Begunkov wrote:

> iov_iter_bvec() initialises iterators well, no need to pre-zero it
> beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it
> out and generate extra code for that (confirmed with assembly).

Applied to 5.12/scsi-queue, thanks!

[1/1] target/file: don't zero iter before iov_iter_bvec
      https://git.kernel.org/mkp/scsi/c/6b1dba3d8c85
diff mbox series

Patch

diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
index cce455929778..5a66854def95 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -267,7 +267,7 @@  fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 	struct fd_dev *fd_dev = FD_DEV(dev);
 	struct file *file = fd_dev->fd_file;
 	struct target_core_file_cmd *aio_cmd;
-	struct iov_iter iter = {};
+	struct iov_iter iter;
 	struct scatterlist *sg;
 	ssize_t len = 0;
 	int ret = 0, i;