Message ID | 20200805121231.166162-1-maxg@mellanox.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Leon Romanovsky |
Series | [1/2] IB/isert: use unlikely macro in the fast path |
On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
> Add performance optimization that might slightly improve small IO sizes
> benchmarks.
>
> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> ---
> drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

I find the expectation from "unlikely/likely" keywords to be overrated.

When we introduced dissagregate post send verbs in rdma-core, we
benchmarked likely/unlikely and didn't find any significant difference
for code with and without such keywords.

Thanks

On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
> On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
>> Add performance optimization that might slightly improve small IO sizes
>> benchmarks.
>>
>> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
>> ---
>> drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
> I find the expectation from "unlikely/likely" keywords to be overrated.
>
> When we introduced dissagregate post send verbs in rdma-core, we
> benchmarked likely/unlikely and didn't find any significant difference
> for code with and without such keywords.
>
> Thanks

Leon,

We are using these small optimizations in all our ULPs and we saw benefit in
large scale and high loads (we did the same in NVMf/RDMA).

These kind of optimizations might not be seen immediately but are
accumulated.

I don't know why do you compare user-space benchmarks to storage drivers.

Can you please review the code ?

Sagi,

Can you send your comments as well ?

On Wed, Aug 05, 2020 at 06:14:16PM +0300, Max Gurtovoy wrote:
>
> On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
> > On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
> > > Add performance optimization that might slightly improve small IO sizes
> > > benchmarks.
> > >
> > > Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> > > ---
> > > drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > I find the expectation from "unlikely/likely" keywords to be overrated.
> >
> > When we introduced dissagregate post send verbs in rdma-core, we
> > benchmarked likely/unlikely and didn't find any significant difference
> > for code with and without such keywords.
> >
> > Thanks
>
> Leon,
>
> We are using these small optimizations in all our ULPs and we saw benefit in
> large scale and high loads (we did the same in NVMf/RDMA).
>
> These kind of optimizations might not be seen immediately but are
> accumulated.
>
> I don't know why do you compare user-space benchmarks to storage drivers.

Why not? It produces same asm code and both have same performance
characteristic.

>
> Can you please review the code ?

There is nothing to review here, the patch is straightforward, I just
don't believe in it.

>
> Sagi,
>
> Can you send your comments as well ?
>
>

On 8/5/2020 7:06 PM, Leon Romanovsky wrote:
> On Wed, Aug 05, 2020 at 06:14:16PM +0300, Max Gurtovoy wrote:
>> On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
>>> On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
>>>> Add performance optimization that might slightly improve small IO sizes
>>>> benchmarks.
>>>>
>>>> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
>>>> ---
>>>> drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>> I find the expectation from "unlikely/likely" keywords to be overrated.
>>>
>>> When we introduced dissagregate post send verbs in rdma-core, we
>>> benchmarked likely/unlikely and didn't find any significant difference
>>> for code with and without such keywords.
>>>
>>> Thanks
>> Leon,
>>
>> We are using these small optimizations in all our ULPs and we saw benefit in
>> large scale and high loads (we did the same in NVMf/RDMA).
>>
>> These kind of optimizations might not be seen immediately but are
>> accumulated.
>>
>> I don't know why do you compare user-space benchmarks to storage drivers.
> Why not? It produces same asm code and both have same performance
> characteristic.
>
>> Can you please review the code ?
> There is nothing to review here, the patch is straightforward, I just
> don't believe in it.

Its ok.

Just ignore it if you don't want to review it.

The maintainers of iser target will review and decide if they believe in it
or not.

>> Sagi,
>>
>> Can you send your comments as well ?
>>
>>

On Wed, Aug 05, 2020 at 07:28:50PM +0300, Max Gurtovoy wrote:
>
> On 8/5/2020 7:06 PM, Leon Romanovsky wrote:
> > On Wed, Aug 05, 2020 at 06:14:16PM +0300, Max Gurtovoy wrote:
> > > On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
> > > > On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
> > > > > Add performance optimization that might slightly improve small IO sizes
> > > > > benchmarks.
> > > > >
> > > > > Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> > > > > ---
> > > > > drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
> > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > > I find the expectation from "unlikely/likely" keywords to be overrated.
> > > >
> > > > When we introduced dissagregate post send verbs in rdma-core, we
> > > > benchmarked likely/unlikely and didn't find any significant difference
> > > > for code with and without such keywords.
> > > >
> > > > Thanks
> > > Leon,
> > >
> > > We are using these small optimizations in all our ULPs and we saw benefit in
> > > large scale and high loads (we did the same in NVMf/RDMA).
> > >
> > > These kind of optimizations might not be seen immediately but are
> > > accumulated.
> > >
> > > I don't know why do you compare user-space benchmarks to storage drivers.
> > Why not? It produces same asm code and both have same performance
> > characteristic.
> >
> > > Can you please review the code ?
> > There is nothing to review here, the patch is straightforward, I just
> > don't believe in it.
>
> Its ok.
>
> Just ignore it if you don't want to review it.

OK, just because you asked.

I reviewed this patch and didn't find any justification for performance
claim, can you please provide us numbers before/after so we will be able
to decide based on reliable data? It will help us to review our drivers
and improve them even more.

>
> The maintainers of iser target will review and decide if they believe in it
> or not.

Sure, I don't care who will provide numbers.

Thanks

> >
> > > Sagi,
> > >
> > > Can you send your comments as well ?
> > >
> > >

On 8/5/2020 7:37 PM, Leon Romanovsky wrote:
> On Wed, Aug 05, 2020 at 07:28:50PM +0300, Max Gurtovoy wrote:
>> On 8/5/2020 7:06 PM, Leon Romanovsky wrote:
>>> On Wed, Aug 05, 2020 at 06:14:16PM +0300, Max Gurtovoy wrote:
>>>> On 8/5/2020 4:16 PM, Leon Romanovsky wrote:
>>>>> On Wed, Aug 05, 2020 at 03:12:30PM +0300, Max Gurtovoy wrote:
>>>>>> Add performance optimization that might slightly improve small IO sizes
>>>>>> benchmarks.
>>>>>>
>>>>>> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
>>>>>> ---
>>>>>> drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>> I find the expectation from "unlikely/likely" keywords to be overrated.
>>>>>
>>>>> When we introduced dissagregate post send verbs in rdma-core, we
>>>>> benchmarked likely/unlikely and didn't find any significant difference
>>>>> for code with and without such keywords.
>>>>>
>>>>> Thanks
>>>> Leon,
>>>>
>>>> We are using these small optimizations in all our ULPs and we saw benefit in
>>>> large scale and high loads (we did the same in NVMf/RDMA).
>>>>
>>>> These kind of optimizations might not be seen immediately but are
>>>> accumulated.
>>>>
>>>> I don't know why do you compare user-space benchmarks to storage drivers.
>>> Why not? It produces same asm code and both have same performance
>>> characteristic.
>>>
>>>> Can you please review the code ?
>>> There is nothing to review here, the patch is straightforward, I just
>>> don't believe in it.
>> Its ok.
>>
>> Just ignore it if you don't want to review it.
> OK, just because you asked.
>
> I reviewed this patch and didn't find any justification for performance
> claim, can you please provide us numbers before/after so we will be able
> to decide based on reliable data? It will help us to review our drivers
> and improve them even more.

As I said, these are incremental optimizations that probably won't be seen
immediately with 1 or 2 changes.

But accumulated small optimizations can reach to 3%-4%.

If you don't believe in this patch - ignore it and review others. I'm sure
you have a lot.

Let other maintainers review it.

You're also welcomed to remove the likely/unlikely macros from all Linux
kernel and let's see what comments will it get from other maintainers.

>> The maintainers of iser target will review and decide if they believe in it
>> or not.
> Sure, I don't care who will provide numbers.

I'm not talking about providing numbers.

>
> Thanks
>
>>
>>>> Sagi,
>>>>
>>>> Can you send your comments as well ?
>>>>
>>>>

Looks fine,
Acked-by: Sagi Grimberg <sagi@grimberg.me>
> I reviewed this patch and didn't find any justification for performance
> claim, can you please provide us numbers before/after so we will be able
> to decide based on reliable data? It will help us to review our drivers
> and improve them even more.

I don't see any reason to find evidence in justification here. It's a
fastpath call, which is unlikely to fail, and these macros are
considered common practice.

There is no reason to make Max to go and quantify a micro-optimization.

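For readers unfamiliar with the macros being argued about, here is a minimal userspace sketch of what `unlikely()` amounts to. It assumes the common `__builtin_expect` form (the kernel's own definitions live in `include/linux/compiler.h` and may carry extra instrumentation depending on configuration). `fake_post_recv()` and `post_recv_checked()` are hypothetical stand-ins for `ib_post_recv()` and `isert_post_recv()`, not the driver's code.

```c
#include <stdio.h>

/*
 * Userspace stand-ins for the kernel's branch-hint macros.
 * Assumption: plain __builtin_expect wrappers, supported by GCC and Clang.
 * The hint only tells the compiler which outcome to lay out as the
 * fall-through path; the value of the condition is unchanged.
 */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for ib_post_recv(): 0 on success, negative on error. */
static int fake_post_recv(int fail)
{
	return fail ? -22 : 0;
}

/* Same shape as the patched error check in isert_post_recv(). */
static int post_recv_checked(int fail)
{
	int ret = fake_post_recv(fail);

	if (unlikely(ret))
		fprintf(stderr, "post_recv failed with ret: %d\n", ret);

	return ret;
}
```

The macro never changes program behaviour, only (possibly) code layout, which is why the disagreement in this thread is about whether the hint has any measurable effect at all.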
On Thu, Aug 06, 2020 at 12:51:15PM -0700, Sagi Grimberg wrote:
>
> > I reviewed this patch and didn't find any justification for performance
> > claim, can you please provide us numbers before/after so we will be able
> > to decide based on reliable data? It will help us to review our drivers
> > and improve them even more.
>
> I don't see any reason to find evidence in justification here. It's a
> fastpath call, which is unlikely to fail, and these macros are
> considered common practice.
>
> There is no reason to make Max to go and quantify a micro-optimization.

Unfortunately Max didn't try to see if these likely/unlikely macros
change something, but I did.

Simple objdump -d before and after shows that GCC 9 generates same
ISERT code before and after this patch. It is expected and there are a lot
of reasons for that, but all of them can be reduced to two:
 * First, GCC is awesome in building profiled code with right predictions for
   standard flows.
 * Second, likely/unlikely is intended to be used when input/output is random
   from GCC point of view.

So as a summary, there is no optimization here, just misuse of unlikely macro.

BTW, old GCCs behave the same and kernel full of wrong copy/paste.

Thanks

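The before/after disassembly comparison described above can be tried on a small userspace file. This is only a sketch under stated assumptions: `USE_HINT`, `maybe_unlikely()`, `fake_post_send()` and `post_response()` are hypothetical names chosen for illustration, and the result for this toy file says nothing definitive about the real ISERT object code, which depends on compiler version and flags.

```c
#include <stdio.h>

/*
 * Build the same source twice, with and without the hint, then compare
 * the disassembly (the file name hint.c is an assumption):
 *
 *   gcc -O2 -c hint.c -o plain.o
 *   gcc -O2 -DUSE_HINT -c hint.c -o hinted.o
 *   objdump -d plain.o  > plain.txt
 *   objdump -d hinted.o > hinted.txt
 *   diff plain.txt hinted.txt
 */
#ifdef USE_HINT
#define maybe_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define maybe_unlikely(x) (x)
#endif

/* Hypothetical stand-in for ib_post_send(); noinline keeps the call visible. */
__attribute__((noinline)) int fake_post_send(int arg)
{
	return arg < 0 ? -5 : 0;
}

/* Same shape as the patched error check in isert_post_response(). */
int post_response(int arg)
{
	int ret = fake_post_send(arg);

	if (maybe_unlikely(ret)) {
		fprintf(stderr, "post_send failed with %d\n", ret);
		return ret;
	}

	return 0;
}
```

If `diff` reports only the file-name header lines, the compiler made the same layout choice with and without the hint for this function, which is the kind of outcome reported above for GCC 9.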
>>> I reviewed this patch and didn't find any justification for performance
>>> claim, can you please provide us numbers before/after so we will be able
>>> to decide based on reliable data? It will help us to review our drivers
>>> and improve them even more.
>>
>> I don't see any reason to find evidence in justification here. It's a
>> fastpath call, which is unlikely to fail, and these macros are
>> considered common practice.
>>
>> There is no reason to make Max to go and quantify a micro-optimization.
>
> Unfortunately Max didn't try to see if these likely/unlikely macros
> change something, but I did.
>
> Simple objdump -d before and after shows that GCC 9 generates same
> ISERT code before and after this patch. It is expected and there are a lot
> of reasons for that, but all of them can be reduced to two:
> * First, GCC is awesome in building profiled code with right predictions for
> standard flows.
> * Second, likely/unlikely is intended to be used when input/output is random
> from GCC point of view.
>
> So as a summary, there is no optimization here, just misuse of unlikely macro.
>
> BTW, old GCCs behave the same and kernel full of wrong copy/paste.

if that is the case, then we can drop this patch. Thanks for checking.

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index b7df38ee8ae0..c818eebe6538 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -847,7 +847,7 @@ isert_post_recv(struct isert_conn *isert_conn, struct iser_rx_desc *rx_desc)
 	rx_wr.next = NULL;
 
 	ret = ib_post_recv(isert_conn->qp, &rx_wr, NULL);
-	if (ret)
+	if (unlikely(ret))
 		isert_err("ib_post_recv() failed with ret: %d\n", ret);
 
 	return ret;
@@ -1831,7 +1831,7 @@ isert_post_response(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd)
 	}
 
 	ret = ib_post_send(isert_conn->qp, &isert_cmd->tx_desc.send_wr, NULL);
-	if (ret) {
+	if (unlikely(ret)) {
 		isert_err("ib_post_send failed with %d\n", ret);
 		return ret;
 	}

Add performance optimization that might slightly improve small IO sizes
benchmarks.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)