Message ID | 34cd22d6cec046e3adf402accb1453cc255b9042.1610207523.git.asml.silence@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | target/file: don't zero iter before iov_iter_bvec | expand |
On 1/9/21 07:59, Pavel Begunkov wrote: > iov_iter_bvec() initialises iterators well, no need to pre-zero it > beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it > out and generate extra code for that (confirmed with assembly). It will be great if we can quantify this optimization with the actual performance numbers. > Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> > --- > drivers/target/target_core_file.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c > index cce455929778..5a66854def95 100644 > --- a/drivers/target/target_core_file.c > +++ b/drivers/target/target_core_file.c > @@ -267,7 +267,7 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, > struct fd_dev *fd_dev = FD_DEV(dev); > struct file *file = fd_dev->fd_file; > struct target_core_file_cmd *aio_cmd; > - struct iov_iter iter = {}; > + struct iov_iter iter; > struct scatterlist *sg; > ssize_t len = 0; > int ret = 0, i;
On 09/01/2021 20:09, Chaitanya Kulkarni wrote: > On 1/9/21 07:59, Pavel Begunkov wrote: >> iov_iter_bvec() initialises iterators well, no need to pre-zero it >> beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it >> out and generate extra code for that (confirmed with assembly). > It will be great if we can quantify this optimization with the actual > performance > numbers. I expect you won't find any, but such little things can pile up into a not-easy-to-spot overhead over time. In any case, I don't think this requires performance justification because it neither makes it less safe or uglier. Those iov_iter*() are there to handle initialisation, that's a part of the iter API. >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> >> --- >> drivers/target/target_core_file.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c >> index cce455929778..5a66854def95 100644 >> --- a/drivers/target/target_core_file.c >> +++ b/drivers/target/target_core_file.c >> @@ -267,7 +267,7 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, >> struct fd_dev *fd_dev = FD_DEV(dev); >> struct file *file = fd_dev->fd_file; >> struct target_core_file_cmd *aio_cmd; >> - struct iov_iter iter = {}; >> + struct iov_iter iter; >> struct scatterlist *sg; >> ssize_t len = 0; >> int ret = 0, i; >
On 1/9/21 12:40, Pavel Begunkov wrote: > I expect you won't find any, but such little things can pile up > into a not-easy-to-spot overhead over time. That is what I suspected with the resulting assembly. The commit log needs to document that there is no direct impact on the performance which can be seen with this patch, but this is nice to have micro-optimization in the long run.
On 09/01/2021 20:52, Chaitanya Kulkarni wrote: > On 1/9/21 12:40, Pavel Begunkov wrote: >> I expect you won't find any, but such little things can pile up >> into a not-easy-to-spot overhead over time. > > That is what I suspected with the resulting assembly. The commit log > needs to document that there is no direct impact on the performance It's obvious that 3-4 extra mov $0 off(%reg) won't change performance but still hasn't been formally confirmed ... > which can be seen with this patch, but this is nice to have ... so if you don't mind, I won't be resending just for that.
On 1/9/21 13:29, Pavel Begunkov wrote: > On 09/01/2021 20:52, Chaitanya Kulkarni wrote: >> On 1/9/21 12:40, Pavel Begunkov wrote: >>> I expect you won't find any, but such little things can pile up >>> into a not-easy-to-spot overhead over time. >> That is what I suspected with the resulting assembly. The commit log >> needs to document that there is no direct impact on the performance > It's obvious that 3-4 extra mov $0 off(%reg) won't change performance > but still hasn't been formally confirmed ... This is obvious for you and me since we spent time into looking into resulting assembly not every reviewer is expected to do that see [1]. > >> which can be seen with this patch, but this is nice to have > ... so if you don't mind, I won't be resending just for that. As per commit log guidelines [1] you have to quantify the optimization. Since you cannot quantify the optimization modify the commit log explaining that there is not significant performance benefit observe. > -- Pavel Begunkov [1] https://www.kernel.org/doc/html/v4.10/process/submitting-patches.html
On 11/01/2021 02:06, Chaitanya Kulkarni wrote: > On 1/9/21 13:29, Pavel Begunkov wrote: >> On 09/01/2021 20:52, Chaitanya Kulkarni wrote: >>> On 1/9/21 12:40, Pavel Begunkov wrote: >>>> I expect you won't find any, but such little things can pile up >>>> into a not-easy-to-spot overhead over time. >>> That is what I suspected with the resulting assembly. The commit log >>> needs to document that there is no direct impact on the performance >> It's obvious that 3-4 extra mov $0 off(%reg) won't change performance >> but still hasn't been formally confirmed ... > This is obvious for you and me since we spent time into looking into > resulting assembly not every reviewer is expected to do that see [1]. >> >>> which can be seen with this patch, but this is nice to have >> ... so if you don't mind, I won't be resending just for that. > As per commit log guidelines [1] you have to quantify the optimization. > > Since you cannot quantify the optimization modify the commit log explaining And then you see "Optimizations usually aren’t free but trade-offs between", and the patch doesn't fall under it. Let me be frank, I see it more like as a whim. If the maintainer agrees with that strange requirement of yours and want to bury it under bureaucracy, fine by me, don't take it, I don't care, but I haven't ever been asked here to do that for patches as this. > that there is not significant performance benefit observe. It's not "I cannot" but rather "I haven't even tried to and expect...". Don't mix, there is a huge difference between.
On 1/9/21 7:53 AM, Pavel Begunkov wrote: > iov_iter_bvec() initialises iterators well, no need to pre-zero it > beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it > out and generate extra code for that (confirmed with assembly). Reviewed-by: Bart Van Assche <bvanassche@acm.org>
On 1/10/21 18:32, Pavel Begunkov wrote: > On 11/01/2021 02:06, Chaitanya Kulkarni wrote: >> On 1/9/21 13:29, Pavel Begunkov wrote: >>> On 09/01/2021 20:52, Chaitanya Kulkarni wrote: >>>> On 1/9/21 12:40, Pavel Begunkov wrote: >>>>> I expect you won't find any, but such little things can pile up >>>>> into a not-easy-to-spot overhead over time. >>>> That is what I suspected with the resulting assembly. The commit log >>>> needs to document that there is no direct impact on the performance >>> It's obvious that 3-4 extra mov $0 off(%reg) won't change performance >>> but still hasn't been formally confirmed ... >> This is obvious for you and me since we spent time into looking into >> resulting assembly not every reviewer is expected to do that see [1]. >>>> which can be seen with this patch, but this is nice to have >>> ... so if you don't mind, I won't be resending just for that. >> As per commit log guidelines [1] you have to quantify the optimization. >> >> Since you cannot quantify the optimization modify the commit log explaining > And then you see "Optimizations usually aren’t free but trade-offs > between", and the patch doesn't fall under it. First part applies to all the optimizations with and without tradeoffs "Quantify optimizations and trade-offs." The later part doesn't mean optimizations without trade-offs should be allowed without having any supportive data. > > Let me be frank, I see it more like as a whim. If the maintainer agrees > with that strange requirement of yours and want to bury it under > bureaucracy, fine by me, don't take it, I don't care, but I haven't > ever been asked here to do that for patches as this. I didn't write the commit log guidelines, as a reviewer I'm following them. The patch commit log claims optimization with neither having any data nor having the supporting fact ("possibly no observable difference but in the long term it matters") for the completeness. > It's not "I cannot" but rather "I haven't even tried to and expect...". > Don't mix, there is a huge difference between. Then provide the numbers to support your claim.
On 11/01/2021 05:23, Chaitanya Kulkarni wrote: > On 1/10/21 18:32, Pavel Begunkov wrote: >> On 11/01/2021 02:06, Chaitanya Kulkarni wrote: >>> On 1/9/21 13:29, Pavel Begunkov wrote: >>>> On 09/01/2021 20:52, Chaitanya Kulkarni wrote: >>>>> On 1/9/21 12:40, Pavel Begunkov wrote: >>>>>> I expect you won't find any, but such little things can pile up >>>>>> into a not-easy-to-spot overhead over time. >>>>> That is what I suspected with the resulting assembly. The commit log >>>>> needs to document that there is no direct impact on the performance >>>> It's obvious that 3-4 extra mov $0 off(%reg) won't change performance >>>> but still hasn't been formally confirmed ... >>> This is obvious for you and me since we spent time into looking into >>> resulting assembly not every reviewer is expected to do that see [1]. >>>>> which can be seen with this patch, but this is nice to have >>>> ... so if you don't mind, I won't be resending just for that. >>> As per commit log guidelines [1] you have to quantify the optimization. >>> >>> Since you cannot quantify the optimization modify the commit log explaining >> And then you see "Optimizations usually aren’t free but trade-offs >> between", and the patch doesn't fall under it. > First part applies to all the optimizations with and without tradeoffs > "Quantify optimizations and trade-offs." > The later part doesn't mean optimizations without trade-offs should be > allowed without having any supportive data. >> >> Let me be frank, I see it more like as a whim. If the maintainer agrees >> with that strange requirement of yours and want to bury it under >> bureaucracy, fine by me, don't take it, I don't care, but I haven't >> ever been asked here to do that for patches as this. > I didn't write the commit log guidelines, as a reviewer I'm following them. > The patch commit log claims optimization with neither having any data nor > having the supporting fact ("possibly no observable difference but in the > long term it matters") for the completeness. >> It's not "I cannot" but rather "I haven't even tried to and expect...". >> Don't mix, there is a huge difference between. > Then provide the numbers to support your claim. What claim? I didn't make any regarding performance, you may want to re-read the commit message. Anyway, I'll halt replying to this topic. Nothing personal, but it's getting annoying.
Pavel, > iov_iter_bvec() initialises iterators well, no need to pre-zero it > beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it > out and generate extra code for that (confirmed with assembly). Applied to 5.12/scsi-staging, thanks!
On Sat, 9 Jan 2021 15:53:27 +0000, Pavel Begunkov wrote: > iov_iter_bvec() initialises iterators well, no need to pre-zero it > beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it > out and generate extra code for that (confirmed with assembly). Applied to 5.12/scsi-queue, thanks! [1/1] target/file: don't zero iter before iov_iter_bvec https://git.kernel.org/mkp/scsi/c/6b1dba3d8c85
diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c index cce455929778..5a66854def95 100644 --- a/drivers/target/target_core_file.c +++ b/drivers/target/target_core_file.c @@ -267,7 +267,7 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, struct fd_dev *fd_dev = FD_DEV(dev); struct file *file = fd_dev->fd_file; struct target_core_file_cmd *aio_cmd; - struct iov_iter iter = {}; + struct iov_iter iter; struct scatterlist *sg; ssize_t len = 0; int ret = 0, i;
iov_iter_bvec() initialises iterators well, no need to pre-zero it beforehand as done in fd_execute_rw_aio(). Compilers can't optimise it out and generate extra code for that (confirmed with assembly). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> --- drivers/target/target_core_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)