[1/2] IB/isert: use unlikely macro in the fast path

Message ID	20200805121231.166162-1-maxg@mellanox.com (mailing list archive)
State	Rejected
Delegated to:	Leon Romanovsky
From: Max Gurtovoy <maxg@mellanox.com>
To: sagi@grimberg.me, linux-rdma@vger.kernel.org, jgg@nvidia.com, jgg@mellanox.com, dledford@redhat.com, leonro@mellanox.com
Cc: oren@mellanox.com, Max Gurtovoy <maxg@mellanox.com>
Subject: [PATCH 1/2] IB/isert: use unlikely macro in the fast path
Date: Wed, 5 Aug 2020 15:12:30 +0300
Series	[1/2] IB/isert: use unlikely macro in the fast path; [2/2] IB/isert: remove duplicated error prints

Max Gurtovoy Aug. 5, 2020, 12:12 p.m. UTC

Add a performance optimization that might slightly improve benchmarks
with small IO sizes.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
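
The diff itself is not reproduced in this thread. As a rough sketch of the kind of change being discussed, the snippet below defines `likely()`/`unlikely()` the way the kernel does (on top of GCC's `__builtin_expect`) and annotates the error branch of a call that rarely fails. The functions `demo_post_recv` and `demo_handle_completion` are hypothetical stand-ins for illustration and are not taken from `ib_isert.c`.

```c
/* Same shape as the kernel's definitions in include/linux/compiler.h. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for a fast-path call that rarely fails. */
static int demo_post_recv(int queue_depth)
{
	return queue_depth > 0 ? 0 : -1;
}

/* Hypothetical caller: the error branch is annotated as the cold path. */
static int demo_handle_completion(int queue_depth)
{
	int ret = demo_post_recv(queue_depth);

	if (unlikely(ret)) {
		/* Error handling would go here; expected to run rarely. */
		return ret;
	}

	return 0;
}
```

The annotation is only a hint about which branch the compiler should lay out as the fall-through path; it does not change what the function returns.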

Leon Romanovsky Aug. 5, 2020, 1:16 p.m. UTC | #1

On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
> Add performance optimization that might slightly improve small IO sizes
> benchmarks.
>
> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> ---
>  drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

I find the expectation from "unlikely/likely" keywords to be overrated.

When we introduced disaggregated post send verbs in rdma-core, we
benchmarked likely/unlikely and didn't find any significant difference
for code with and without such keywords.

Thanks

Max Gurtovoy Aug. 5, 2020, 3:14 p.m. UTC | #2

On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
> On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
>> Add performance optimization that might slightly improve small IO sizes
>> benchmarks.
>>
>> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
>> ---
>>   drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
> I find the expectation from "unlikely/likely" keywords to be overrated.
>
> When we introduced dissagregate post send verbs in rdma-core, we
> benchmarked likely/unlikely and didn't find any significant difference
> for code with and without such keywords.
>
> Thanks

Leon,

We are using these small optimizations in all our ULPs and we saw 
benefit in large scale and high loads (we did the same in NVMf/RDMA).

These kinds of optimizations might not be visible immediately, but they
accumulate.

I don't know why you compare user-space benchmarks to storage drivers.

Can you please review the code ?

Sagi,

Can you send your comments as well ?

Leon Romanovsky Aug. 5, 2020, 4:06 p.m. UTC | #3

On Wed, Aug 05, 2020 at 06:14:16PM +0300, Max Gurtovoy wrote:
>
> On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
> > On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
> > > Add performance optimization that might slightly improve small IO sizes
> > > benchmarks.
> > >
> > > Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> > > ---
> > >   drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > I find the expectation from "unlikely/likely" keywords to be overrated.
> >
> > When we introduced dissagregate post send verbs in rdma-core, we
> > benchmarked likely/unlikely and didn't find any significant difference
> > for code with and without such keywords.
> >
> > Thanks
>
> Leon,
>
> We are using these small optimizations in all our ULPs and we saw benefit in
> large scale and high loads (we did the same in NVMf/RDMA).
>
> These kind of optimizations might not be seen immediately but are
> accumulated.
>
> I don't know why do you compare user-space benchmarks to storage drivers.

Why not? It produces the same asm code and both have the same performance
characteristics.

>
> Can you please review the code ?

There is nothing to review here, the patch is straightforward, I just
don't believe in it.

>
> Sagi,
>
> Can you send your comments as well ?
>
>

Max Gurtovoy Aug. 5, 2020, 4:28 p.m. UTC | #4

On 8/5/2020 7:06 PM, Leon Romanovsky wrote:
> On Wed, Aug 05, 2020 at 06:14:16PM +0300, Max Gurtovoy wrote:
>> On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
>>> On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
>>>> Add performance optimization that might slightly improve small IO sizes
>>>> benchmarks.
>>>>
>>>> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
>>>> ---
>>>>    drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>> I find the expectation from "unlikely/likely" keywords to be overrated.
>>>
>>> When we introduced dissagregate post send verbs in rdma-core, we
>>> benchmarked likely/unlikely and didn't find any significant difference
>>> for code with and without such keywords.
>>>
>>> Thanks
>> Leon,
>>
>> We are using these small optimizations in all our ULPs and we saw benefit in
>> large scale and high loads (we did the same in NVMf/RDMA).
>>
>> These kind of optimizations might not be seen immediately but are
>> accumulated.
>>
>> I don't know why do you compare user-space benchmarks to storage drivers.
> Why not? It produces same asm code and both have same performance
> characteristic.
>
>> Can you please review the code ?
> There is nothing to review here, the patch is straightforward, I just
> don't believe in it.

It's OK.

Just ignore it if you don't want to review it.

The maintainers of iser target will review and decide if they believe in 
it or not.


>> Sagi,
>>
>> Can you send your comments as well ?
>>
>>

Leon Romanovsky Aug. 5, 2020, 4:37 p.m. UTC | #5

On Wed, Aug 05, 2020 at 07:28:50PM +0300, Max Gurtovoy wrote:
>
> On 8/5/2020 7:06 PM, Leon Romanovsky wrote:
> > On Wed, Aug 05, 2020 at 06:14:16PM +0300, Max Gurtovoy wrote:
> > > On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
> > > > On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
> > > > > Add performance optimization that might slightly improve small IO sizes
> > > > > benchmarks.
> > > > >
> > > > > Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> > > > > ---
> > > > >    drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
> > > > >    1 file changed, 2 insertions(+), 2 deletions(-)
> > > > I find the expectation from "unlikely/likely" keywords to be overrated.
> > > >
> > > > When we introduced dissagregate post send verbs in rdma-core, we
> > > > benchmarked likely/unlikely and didn't find any significant difference
> > > > for code with and without such keywords.
> > > >
> > > > Thanks
> > > Leon,
> > >
> > > We are using these small optimizations in all our ULPs and we saw benefit in
> > > large scale and high loads (we did the same in NVMf/RDMA).
> > >
> > > These kind of optimizations might not be seen immediately but are
> > > accumulated.
> > >
> > > I don't know why do you compare user-space benchmarks to storage drivers.
> > Why not? It produces same asm code and both have same performance
> > characteristic.
> >
> > > Can you please review the code ?
> > There is nothing to review here, the patch is straightforward, I just
> > don't believe in it.
>
> Its ok.
>
> Just ignore it if you don't want to review it.

OK, just because you asked.

I reviewed this patch and didn't find any justification for the performance
claim. Can you please provide us with before/after numbers so we will be able
to decide based on reliable data? It will help us to review our drivers
and improve them even more.

>
> The maintainers of iser target will review and decide if they believe in it
> or not.

Sure, I don't care who will provide numbers.

Thanks

>
>
> > > Sagi,
> > >
> > > Can you send your comments as well ?
> > >
> > >

Max Gurtovoy Aug. 6, 2020, 10:56 a.m. UTC | #6

On 8/5/2020 7:37 PM, Leon Romanovsky wrote:
> On Wed, Aug 05, 2020 at 07:28:50PM +0300, Max Gurtovoy wrote:
>> On 8/5/2020 7:06 PM, Leon Romanovsky wrote:
>>> On Wed, Aug 05, 2020 at 06:14:16PM +0300, Max Gurtovoy wrote:
>>>> On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
>>>>> On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
>>>>>> Add performance optimization that might slightly improve small IO sizes
>>>>>> benchmarks.
>>>>>>
>>>>>> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
>>>>>> ---
>>>>>>     drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
>>>>>>     1 file changed, 2 insertions(+), 2 deletions(-)
>>>>> I find the expectation from "unlikely/likely" keywords to be overrated.
>>>>>
>>>>> When we introduced dissagregate post send verbs in rdma-core, we
>>>>> benchmarked likely/unlikely and didn't find any significant difference
>>>>> for code with and without such keywords.
>>>>>
>>>>> Thanks
>>>> Leon,
>>>>
>>>> We are using these small optimizations in all our ULPs and we saw benefit in
>>>> large scale and high loads (we did the same in NVMf/RDMA).
>>>>
>>>> These kind of optimizations might not be seen immediately but are
>>>> accumulated.
>>>>
>>>> I don't know why do you compare user-space benchmarks to storage drivers.
>>> Why not? It produces same asm code and both have same performance
>>> characteristic.
>>>
>>>> Can you please review the code ?
>>> There is nothing to review here, the patch is straightforward, I just
>>> don't believe in it.
>> Its ok.
>>
>> Just ignore it if you don't want to review it.
> OK, just because you asked.
>
> I reviewed this patch and didn't find any justification for performance
> claim, can you please provide us numbers before/after so we will be able
> to decide based on reliable data? It will help us to review our drivers
> and improve them even more.

As I said, these are incremental optimizations that probably won't be
visible immediately with 1 or 2 changes. But accumulated small
optimizations can add up to 3%-4%.

If you don't believe in this patch - ignore it and review others. I'm 
sure you have a lot. Let other maintainers review it.

You're also welcome to remove the likely/unlikely macros from the whole Linux
kernel, and let's see what comments that gets from other maintainers.

>> The maintainers of iser target will review and decide if they believe in it
>> or not.
> Sure, I don't care who will provide numbers.

I'm not talking about providing numbers.

>
> Thanks
>
>>
>>>> Sagi,
>>>>
>>>> Can you send your comments as well ?
>>>>
>>>>

Sagi Grimberg Aug. 6, 2020, 7:43 p.m. UTC | #7

Looks fine,

Acked-by: Sagi Grimberg <sagi@grimberg.me>

Sagi Grimberg Aug. 6, 2020, 7:51 p.m. UTC | #8

> I reviewed this patch and didn't find any justification for performance
> claim, can you please provide us numbers before/after so we will be able
> to decide based on reliable data? It will help us to review our drivers
> and improve them even more.

I don't see any reason to demand evidence as justification here. It's a
fastpath call, which is unlikely to fail, and these macros are
considered common practice.

There is no reason to make Max go and quantify a micro-optimization.

Leon Romanovsky Aug. 7, 2020, 4:09 p.m. UTC | #9

On Thu, Aug 06, 2020 at 12:51:15PM -0700, Sagi Grimberg wrote:
>
> > I reviewed this patch and didn't find any justification for performance
> > claim, can you please provide us numbers before/after so we will be able
> > to decide based on reliable data? It will help us to review our drivers
> > and improve them even more.
>
> I don't see any reason to find evidence in justification here. It's a
> fastpath call, which is unlikely to fail, and these macros are
> considered common practice.
>
> There is no reason to make Max to go and quantify a micro-optimization.

Unfortunately Max didn't try to see if these likely/unlikely macros
change something, but I did.

A simple objdump -d before and after shows that GCC 9 generates the same
ISERT code before and after this patch. This is expected and there are a lot
of reasons for that, but all of them can be reduced to two:
* First, GCC is awesome at building profiled code with the right predictions for
standard flows.
* Second, likely/unlikely is intended to be used when input/output is random
from GCC's point of view.

So, in summary, there is no optimization here, just misuse of the unlikely macro.

BTW, old GCCs behave the same, and the kernel is full of wrong copy/paste.

Thanks
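
The before/after comparison described above can be tried on a small scale. The sketch below is not the ISERT code; it is a pair of hypothetical functions (`demo_check_plain`, `demo_check_annotated`) that differ only in the `unlikely()` annotation, with a hypothetical out-of-line error handler so the branch stays visible in the disassembly. Whether the two functions compile to the same instructions depends on the compiler version and flags, so no particular result is claimed here.

```c
/* Same shape as the kernel's unlikely() in include/linux/compiler.h. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical error handler, kept out of line so the branch is visible. */
__attribute__((noinline)) int demo_report_error(int err)
{
	/* Normalize to a negative error code. */
	return err < 0 ? err : -err;
}

/* Error check without any branch hint. */
int demo_check_plain(int ret)
{
	if (ret)
		return demo_report_error(ret);
	return 0;
}

/* Identical check, with the error branch marked as unlikely. */
int demo_check_annotated(int ret)
{
	if (unlikely(ret))
		return demo_report_error(ret);
	return 0;
}
```

Compiling this file with `gcc -O2 -c` and running `objdump -d` on the resulting object lets the two function bodies be compared side by side.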

Sagi Grimberg Aug. 7, 2020, 4:33 p.m. UTC | #10

>>> I reviewed this patch and didn't find any justification for performance
>>> claim, can you please provide us numbers before/after so we will be able
>>> to decide based on reliable data? It will help us to review our drivers
>>> and improve them even more.
>>
>> I don't see any reason to find evidence in justification here. It's a
>> fastpath call, which is unlikely to fail, and these macros are
>> considered common practice.
>>
>> There is no reason to make Max to go and quantify a micro-optimization.
> 
> Unfortunately Max didn't try to see if these likely/unlikely macros
> change something, but I did.
> 
> Simple objdump -d before and after shows that GCC 9 generates same
> ISERT code before and after this patch. It is expected and there are a lot
> of reasons for that, but all of them can be reduced to two:
> * First, GCC is awesome in building profiled code with right predictions for
> standard flows.
> * Second, likely/unlikely is intended to be used when input/output is random
> from GCC point of view.
> 
> So as a summary, there is no optimization here, just misuse of unlikely macro.
> 
> BTW, old GCCs behave the same and kernel full of wrong copy/paste.

If that is the case, then we can drop this patch. Thanks for checking.
