mbox series

[PATCHv2,for-rc,0/6] Bugfixes for send queue overflow by heartbeat

Message ID 20210712060750.16494-1-jinpu.wang@ionos.com (mailing list archive)
Headers show
Series Bugfixes for send queue overflow by heartbeat | expand

Message

Jinpu Wang July 12, 2021, 6:07 a.m. UTC
Hi Jason, hi Doug,

Please consider to include following changes to upstream.

This patchset fix a regression since b38041d50add ("RDMA/rtrs: Do not signal for heatbeat").

In commit b38041d50add, the signal flag is droped to fix the send queue full
logic, but introduced a worse bug the send queue overflow on both clt and srv
by heartbeat, sorry.

The patchset is orgnized as:
- patch1 debug patch.
- patch2 preparation.
- patch3 signal both IO and heartbeat.
- patch4 cleanup.
- patch5 cleanup
- patch6 move sq_wr_avail to account send queue full correctly.

The patches are created base v5.14-rc1.

Since v1:
* rebased to latest v5.14-rc1, target rc instread of for-next.

v1: https://lore.kernel.org/linux-rdma/20210629065321.12600-1-jinpu.wang@ionos.com/T/#t

Jack Wang (6):
  RDMA/rtrs: Add error messages for failed operations.
  RDMA/rtrs: move wr_cnt from rtrs_srv_con to rtrs_con
  RDMA/rtrs: Enable the same selective signal for heartbeat and IO
  RDMA/rtrs: Make rtrs_post_rdma_write_imm_empty static
  RDMA/rtrs: Remove unused flags parameter
  RDMA/rtrs: Move sq_wr_avail to rtrs_con

 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 11 ++++++++---
 drivers/infiniband/ulp/rtrs/rtrs-clt.h |  1 -
 drivers/infiniband/ulp/rtrs/rtrs-pri.h |  6 +++---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 19 ++++++++++---------
 drivers/infiniband/ulp/rtrs/rtrs-srv.h |  2 --
 drivers/infiniband/ulp/rtrs/rtrs.c     | 23 ++++++++++++++++-------
 6 files changed, 37 insertions(+), 25 deletions(-)

Comments

Jason Gunthorpe July 12, 2021, 5:35 p.m. UTC | #1
On Mon, Jul 12, 2021 at 08:07:44AM +0200, Jack Wang wrote:
> Hi Jason, hi Doug,
> 
> Please consider to include following changes to upstream.
> 
> This patchset fix a regression since b38041d50add ("RDMA/rtrs: Do not signal for heatbeat").
> 
> In commit b38041d50add, the signal flag is droped to fix the send queue full
> logic, but introduced a worse bug the send queue overflow on both clt and srv
> by heartbeat, sorry.
> 
> The patchset is orgnized as:
> - patch1 debug patch.
> - patch2 preparation.
> - patch3 signal both IO and heartbeat.
> - patch4 cleanup.
> - patch5 cleanup
> - patch6 move sq_wr_avail to account send queue full correctly.
> 
> The patches are created base v5.14-rc1.
> 
> Since v1:
> * rebased to latest v5.14-rc1, target rc instread of for-next.
> 
> v1: https://lore.kernel.org/linux-rdma/20210629065321.12600-1-jinpu.wang@ionos.com/T/#t
> 
> Jack Wang (6):
>   RDMA/rtrs: Add error messages for failed operations.
>   RDMA/rtrs: move wr_cnt from rtrs_srv_con to rtrs_con
>   RDMA/rtrs: Enable the same selective signal for heartbeat and IO
>   RDMA/rtrs: Make rtrs_post_rdma_write_imm_empty static
>   RDMA/rtrs: Remove unused flags parameter
>   RDMA/rtrs: Move sq_wr_avail to rtrs_con

This is not really structured well enough for a -rc patch. There
should be no unncessary changes and each patch should try to be self
contained. Avoid "cleanup". Carefully describe what user visible
defect each patch is fixing.

If you really want it to be in -rc then it needs reorganizing,
otherwise I can put it in -next

Jason
Jinpu Wang July 12, 2021, 6:58 p.m. UTC | #2
On Mon, Jul 12, 2021 at 7:35 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Mon, Jul 12, 2021 at 08:07:44AM +0200, Jack Wang wrote:
> > Hi Jason, hi Doug,
> >
> > Please consider to include following changes to upstream.
> >
> > This patchset fix a regression since b38041d50add ("RDMA/rtrs: Do not signal for heatbeat").
> >
> > In commit b38041d50add, the signal flag is droped to fix the send queue full
> > logic, but introduced a worse bug the send queue overflow on both clt and srv
> > by heartbeat, sorry.
> >
> > The patchset is orgnized as:
> > - patch1 debug patch.
> > - patch2 preparation.
> > - patch3 signal both IO and heartbeat.
> > - patch4 cleanup.
> > - patch5 cleanup
> > - patch6 move sq_wr_avail to account send queue full correctly.
> >
> > The patches are created base v5.14-rc1.
> >
> > Since v1:
> > * rebased to latest v5.14-rc1, target rc instread of for-next.
> >
> > v1: https://lore.kernel.org/linux-rdma/20210629065321.12600-1-jinpu.wang@ionos.com/T/#t
> >
> > Jack Wang (6):
> >   RDMA/rtrs: Add error messages for failed operations.
> >   RDMA/rtrs: move wr_cnt from rtrs_srv_con to rtrs_con
> >   RDMA/rtrs: Enable the same selective signal for heartbeat and IO
> >   RDMA/rtrs: Make rtrs_post_rdma_write_imm_empty static
> >   RDMA/rtrs: Remove unused flags parameter
> >   RDMA/rtrs: Move sq_wr_avail to rtrs_con
>
> This is not really structured well enough for a -rc patch. There
> should be no unncessary changes and each patch should try to be self
> contained. Avoid "cleanup". Carefully describe what user visible
> defect each patch is fixing.
>
> If you really want it to be in -rc then it needs reorganizing,
> otherwise I can put it in -next
>
> Jason
Hi Jason,

Thanks for your suggestion, I think it would be ok to put them in for-next.

Regards!
Jason Gunthorpe July 15, 2021, 5:41 p.m. UTC | #3
On Mon, Jul 12, 2021 at 08:07:44AM +0200, Jack Wang wrote:
> Hi Jason, hi Doug,
> 
> Please consider to include following changes to upstream.
> 
> This patchset fix a regression since b38041d50add ("RDMA/rtrs: Do not signal for heatbeat").
> 
> In commit b38041d50add, the signal flag is droped to fix the send queue full
> logic, but introduced a worse bug the send queue overflow on both clt and srv
> by heartbeat, sorry.
> 
> The patchset is orgnized as:
> - patch1 debug patch.
> - patch2 preparation.
> - patch3 signal both IO and heartbeat.
> - patch4 cleanup.
> - patch5 cleanup
> - patch6 move sq_wr_avail to account send queue full correctly.
> 
> The patches are created base v5.14-rc1.
> 
> Since v1:
> * rebased to latest v5.14-rc1, target rc instread of for-next.
> 
> v1: https://lore.kernel.org/linux-rdma/20210629065321.12600-1-jinpu.wang@ionos.com/T/#t
> 
> Jack Wang (6):
>   RDMA/rtrs: Add error messages for failed operations.
>   RDMA/rtrs: move wr_cnt from rtrs_srv_con to rtrs_con
>   RDMA/rtrs: Enable the same selective signal for heartbeat and IO
>   RDMA/rtrs: Make rtrs_post_rdma_write_imm_empty static
>   RDMA/rtrs: Remove unused flags parameter
>   RDMA/rtrs: Move sq_wr_avail to rtrs_con

Applied to for-next, thanks

Jason