[RESEND,for-next,v6,0/4] RDMA/rxe: Fix no completion event issue

Message ID 1658307368-1851-1-git-send-email-lizhijian@fujitsu.com
Series RDMA/rxe: Fix no completion event issue

Message

Li Zhijian July 20, 2022, 8:56 a.m. UTC
No changes since v5. This is just a resend through a different SMTP server,
because Microsoft Exchange mangled the patches.

We observed that no more completions are generated after a few incorrect
posts, so polling for a completion blocks. The issue is easy to reproduce
with the following pattern:

a. post correct RDMA_WRITE
b. poll completion event
while true {
  c. post incorrect RDMA_WRITE(wrong rkey for example)
  d. poll completion event <<<< block after 2 incorrect RDMA_WRITE posts
}
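
For readers who want to play with the pattern without RDMA hardware, here is
a minimal sketch of steps a-d as a small harness. Everything in it is a
hypothetical stand-in: FakeQP, post_write, poll, reproduce and the
stall_after parameter are made-up names, not rxe, libibverbs or pyverbs
APIs. The stall is simply emulated by a counter, so the sketch shows the
symptom only and says nothing about the cause inside the driver.

```python
from collections import deque


class FakeQP:
    """Hypothetical stand-in for a queue pair (not rxe, not pyverbs).

    stall_after: number of incorrect posts after which no further
    completions are produced. None means completions never stop,
    which is the behaviour one would expect from a healthy QP.
    """

    def __init__(self, stall_after=None):
        self.stall_after = stall_after
        self.bad_posts = 0
        self.cq = deque()  # pending completions, oldest first

    def post_write(self, rkey_ok):
        """Queue one RDMA_WRITE; rkey_ok=False models a wrong rkey."""
        if not rkey_ok:
            self.bad_posts += 1
        stalled = (self.stall_after is not None
                   and self.bad_posts >= self.stall_after)
        if not stalled:
            self.cq.append("success" if rkey_ok else "error")

    def poll(self):
        """Return one completion, or None where a real poll would block."""
        return self.cq.popleft() if self.cq else None


def reproduce(qp, max_bad_posts=10):
    """Run steps a-d; return the incorrect-post count at which polling
    gets no completion, or None if every post completed."""
    qp.post_write(rkey_ok=True)        # a. post correct RDMA_WRITE
    assert qp.poll() == "success"      # b. poll completion event
    for n in range(1, max_bad_posts + 1):
        qp.post_write(rkey_ok=False)   # c. post incorrect RDMA_WRITE
        if qp.poll() is None:          # d. poll completion event
            return n
    return None
```

Calling reproduce(FakeQP(stall_after=2)) emulates the reported symptom,
while reproduce(FakeQP()) emulates a QP that keeps returning an error
completion for every incorrect post. A real reproducer would replace FakeQP
with actual verbs calls and a poll with a timeout.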

v4 added a new patch from Bob that makes the requester stop executing QP
operations as soon as possible.

Both the blktests and pyverbs tests pass.

Bob Pearson (1):
  RDMA/rxe: Split qp state for requester and completer

Li Zhijian (3):
  RDMA/rxe: Update wqe_index for each wqe error completion
  RDMA/rxe: Generate error completion for error requester QP state
  RDMA/rxe: Fix typo in comment

 drivers/infiniband/sw/rxe/rxe_comp.c  |  6 +++---
 drivers/infiniband/sw/rxe/rxe_qp.c    |  5 +++++
 drivers/infiniband/sw/rxe/rxe_req.c   | 16 +++++++++++++++-
 drivers/infiniband/sw/rxe/rxe_task.c  |  2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.h |  1 +
 5 files changed, 25 insertions(+), 5 deletions(-)

Comments

Jason Gunthorpe July 29, 2022, 7:48 p.m. UTC | #1
On Wed, Jul 20, 2022 at 04:56:04AM -0400, Li Zhijian wrote:
> No change since v5, just resend it via another smtp instead of
> Microsoft Exchange which made patches messed up.
> 
> It's observed that no more completion occurs after a few incorrect posts.
> Actually, it will block the polling. we can easily reproduce it by the below
> pattern.
> 
> a. post correct RDMA_WRITE
> b. poll completion event
> while true {
>   c. post incorrect RDMA_WRITE(wrong rkey for example)
>   d. poll completion event <<<< block after 2 incorrect RDMA_WRITE posts
> }
> 
> V4 add new patch from Bob where it make requester stop executing qp
> operation as soon as possible.
> 
> Both blktests and pyverbs tests are passed fine.
>
> Bob Pearson (1):
>   RDMA/rxe: Split qp state for requester and completer
> 
> Li Zhijian (3):
>   RDMA/rxe: Update wqe_index for each wqe error completion
>   RDMA/rxe: Generate error completion for error requester QP state
>   RDMA/rxe: Fix typo in comment

Bob, are you OK with these?

Jason

Bob Pearson Aug. 1, 2022, 7:24 p.m. UTC | #2
On 7/29/22 14:48, Jason Gunthorpe wrote:
> On Wed, Jul 20, 2022 at 04:56:04AM -0400, Li Zhijian wrote:
>> No change since v5, just resend it via another smtp instead of
>> Microsoft Exchange which made patches messed up.
>>
>> It's observed that no more completion occurs after a few incorrect posts.
>> Actually, it will block the polling. we can easily reproduce it by the below
>> pattern.
>>
>> a. post correct RDMA_WRITE
>> b. poll completion event
>> while true {
>>   c. post incorrect RDMA_WRITE(wrong rkey for example)
>>   d. poll completion event <<<< block after 2 incorrect RDMA_WRITE posts
>> }
>>
>> V4 add new patch from Bob where it make requester stop executing qp
>> operation as soon as possible.
>>
>> Both blktests and pyverbs tests are passed fine.
>>
>> Bob Pearson (1):
>>   RDMA/rxe: Split qp state for requester and completer
>>
>> Li Zhijian (3):
>>   RDMA/rxe: Update wqe_index for each wqe error completion
>>   RDMA/rxe: Generate error completion for error requester QP state
>>   RDMA/rxe: Fix typo in comment
> 
> Bob are you Ok with these?
> 
> Jason

Yes. I reviewed these a while ago and suggested a change, which he included.
I'm fine with this.

Bob

Jason Gunthorpe Aug. 2, 2022, 4:54 p.m. UTC | #3
On Wed, Jul 20, 2022 at 04:56:04AM -0400, Li Zhijian wrote:
> No change since v5, just resend it via another smtp instead of
> Microsoft Exchange which made patches messed up.
> 
> It's observed that no more completion occurs after a few incorrect posts.
> Actually, it will block the polling. we can easily reproduce it by the below
> pattern.
> 
> a. post correct RDMA_WRITE
> b. poll completion event
> while true {
>   c. post incorrect RDMA_WRITE(wrong rkey for example)
>   d. poll completion event <<<< block after 2 incorrect RDMA_WRITE posts
> }
> 
> V4 add new patch from Bob where it make requester stop executing qp
> operation as soon as possible.
> 
> Both blktests and pyverbs tests are passed fine.
> 
> Bob Pearson (1):
>   RDMA/rxe: Split qp state for requester and completer
> 
> Li Zhijian (3):
>   RDMA/rxe: Update wqe_index for each wqe error completion
>   RDMA/rxe: Generate error completion for error requester QP state
>   RDMA/rxe: Fix typo in comment

Applied to for-next, thanks

Jason