Message ID | 20220825110255.658706-1-matsuda-daisuke@fujitsu.com |
---|---|
State | Changes Requested |
Delegated to: | Jason Gunthorpe |
Series | RDMA/rxe: Ratelimit error messages of read_reply() |
On 8/25/22 06:02, Daisuke Matsuda wrote:
> When responder cannot copy data from a user MR, error messages overflow.
> This is because an incoming RDMA Read request can result in multiple Read
> responses. If the target MR is somehow unavailable, then the error message
> is generated for every Read response.
>
> For the same reason, the error message for packet transmission should also
> be ratelimited.
>
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
> index b36ec5c4d5e0..f9e9679b5e32 100644
> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> @@ -812,7 +812,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
>  	err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
>  			  payload, RXE_FROM_MR_OBJ);
>  	if (err)
> -		pr_err("Failed copying memory\n");
> +		pr_err_ratelimited("Failed copying memory\n");
>  	if (mr)
>  		rxe_put(mr);
>
> @@ -824,7 +824,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
>
>  	err = rxe_xmit_packet(qp, &ack_pkt, skb);
>  	if (err) {
> -		pr_err("Failed sending RDMA reply.\n");
> +		pr_err_ratelimited("Failed sending RDMA reply.\n");
>  		return RESPST_ERR_RNR;
>  	}
>

Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
On Thu, Aug 25, 2022 at 08:02:55PM +0900, Daisuke Matsuda wrote:
> When responder cannot copy data from a user MR, error messages overflow.
> This is because an incoming RDMA Read request can result in multiple Read
> responses. If the target MR is somehow unavailable, then the error message
> is generated for every Read response.
>
> For the same reason, the error message for packet transmission should also
> be ratelimited.
>
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

These lines should be deleted, network packets should never trigger
printing.

Jason
On Friday, August 26, 2022 9:28 PM, Jason Gunthorpe wrote:
> On Thu, Aug 25, 2022 at 08:02:55PM +0900, Daisuke Matsuda wrote:
> > When responder cannot copy data from a user MR, error messages overflow.
> > This is because an incoming RDMA Read request can result in multiple Read
> > responses. If the target MR is somehow unavailable, then the error message
> > is generated for every Read response.
> >
> > For the same reason, the error message for packet transmission should also
> > be ratelimited.
> >
> > Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
> > ---
> >  drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
>
> These lines should be deleted, network packets should never trigger
> printing.
>
> Jason

Okay. I will post another patch to do that.

I wonder if we should also delete some messages in rxe_rcv() and its callees.
Some of them appear to be triggerable by packets from an arbitrary client, even
when there is no established connection between the requesting and responding
nodes.

As far as I know, the message below can cause a message overflow.
=====
static int hdr_check(struct rxe_pkt_info *pkt)
{
~~~~~
	if (qpn != IB_MULTICAST_QPN) {
		index = (qpn == 1) ? port->qp_gsi_index : qpn;

		qp = rxe_pool_get_index(&rxe->qp_pool, index);
		if (unlikely(!qp)) {
			pr_warn_ratelimited("no qp matches qpn 0x%x\n", qpn);
			goto err1;
		}
=====

Daisuke Matsuda
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index b36ec5c4d5e0..f9e9679b5e32 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -812,7 +812,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
 	err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
 			  payload, RXE_FROM_MR_OBJ);
 	if (err)
-		pr_err("Failed copying memory\n");
+		pr_err_ratelimited("Failed copying memory\n");
 	if (mr)
 		rxe_put(mr);
 
@@ -824,7 +824,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
 
 	err = rxe_xmit_packet(qp, &ack_pkt, skb);
 	if (err) {
-		pr_err("Failed sending RDMA reply.\n");
+		pr_err_ratelimited("Failed sending RDMA reply.\n");
 		return RESPST_ERR_RNR;
 	}
 
When responder cannot copy data from a user MR, error messages overflow.
This is because an incoming RDMA Read request can result in multiple Read
responses. If the target MR is somehow unavailable, then the error message
is generated for every Read response.

For the same reason, the error message for packet transmission should also
be ratelimited.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
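
The overflow described in the commit message can be illustrated with a small
userspace model. This is only a sketch and not the rxe or printk code: the
names ratelimit_sketch, ratelimit_sketch_allow(), read_response_count() and
count_messages() are hypothetical, time is counted in arbitrary ticks, and the
model assumes one Read response per MTU-sized chunk of a non-empty request. It
shows how many messages one failing Read request would emit with and without
an interval/burst limit of the kind the *_ratelimited helpers apply.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an interval/burst rate limiter (not kernel code). */
struct ratelimit_sketch {
	unsigned long interval;	/* window length, in arbitrary ticks */
	unsigned int burst;	/* messages allowed per window */
	unsigned long begin;	/* tick at which the current window started */
	unsigned int printed;	/* messages emitted in the current window */
	unsigned int missed;	/* messages suppressed in the current window */
	bool started;		/* false until the first call */
};

/* Return true if a message may be emitted at tick `now`. */
static bool ratelimit_sketch_allow(struct ratelimit_sketch *rs,
				   unsigned long now)
{
	/* Open a new window on first use or once the interval has elapsed. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/*
 * Assumption: a Read request of `length` bytes (length > 0) is answered
 * with one response per `mtu`-sized chunk, i.e. ceil(length / mtu).
 */
static unsigned long read_response_count(unsigned long length,
					 unsigned long mtu)
{
	return (length + mtu - 1) / mtu;
}

/*
 * Count the messages emitted when every one of `responses` Read responses
 * fails at tick `now`. A NULL `rs` models an unlimited pr_err().
 */
static unsigned long count_messages(unsigned long responses,
				    struct ratelimit_sketch *rs,
				    unsigned long now)
{
	unsigned long emitted = 0;
	unsigned long i;

	for (i = 0; i < responses; i++) {
		if (!rs || ratelimit_sketch_allow(rs, now))
			emitted++;
	}

	return emitted;
}
```

Under these assumptions, a single 1 MiB Read against an unavailable MR with a
1024-byte MTU produces 1024 messages when unlimited, and at most `burst`
messages per window when limited. The limiter still lets a remote peer produce
log output in every window, which is consistent with the review comment that
packet-triggered messages should be removed rather than ratelimited.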