| Message ID | 20241003124611.35060-1-showrya@chelsio.com (mailing list archive) |
| --- | --- |
| State | Superseded |
| Series | [for-rc] RDMA/siw: add sendpage_ok() check to disable MSG_SPLICE_PAGES |
> -----Original Message-----
> From: Showrya M N <showrya@chelsio.com>
> Sent: Thursday, October 3, 2024 2:46 PM
> To: jgg@nvidia.com; leonro@nvidia.com; Bernard Metzler <BMT@zurich.ibm.com>
> Cc: linux-rdma@vger.kernel.org; Showrya M N <showrya@chelsio.com>; Potnuri Bharat Teja <bharat@chelsio.com>
> Subject: [EXTERNAL] [PATCH for-rc] RDMA/siw: add sendpage_ok() check to disable MSG_SPLICE_PAGES
>
> While running iSER over siw, the initiator machine encounters a warning
> from skb_splice_from_iter() indicating that a slab page is being passed
> for splicing. To address this, it is better to add a sendpage_ok() check
> within the driver itself, and if it returns 0, the MSG_SPLICE_PAGES flag
> should be disabled before entering the network stack.
>
> A similar issue has been discussed for NVMe in this thread:
> https://lore.kernel.org/all/20240530142417.146696-1-ofir.gal@volumez.com/
>
> stack trace:
> ...
> [ 2157.532917] WARNING: CPU: 0 PID: 5342 at net/core/skbuff.c:7140 skb_splice_from_iter+0x173/0x320
> [ 2157.533064] Call Trace:
> [ 2157.533069]  ? __warn+0x84/0x130
> [ 2157.533073]  ? skb_splice_from_iter+0x173/0x320
> [ 2157.533075]  ? report_bug+0xfc/0x1e0
> [ 2157.533081]  ? handle_bug+0x3f/0x70
> [ 2157.533085]  ? exc_invalid_op+0x17/0x70
> [ 2157.533088]  ? asm_exc_invalid_op+0x1a/0x20
> [ 2157.533096]  ? skb_splice_from_iter+0x173/0x320
> [ 2157.533101]  tcp_sendmsg_locked+0x368/0xe40
> [ 2157.533111]  siw_tx_hdt+0x695/0xa40 [siw]
> [ 2157.533134]  ? sched_balance_find_src_group+0x44/0x4f0
> [ 2157.533143]  ? __update_load_avg_cfs_rq+0x272/0x300
> [ 2157.533152]  ? place_entity+0x19/0xf0
> [ 2157.533157]  ? enqueue_entity+0xdb/0x3d0
> [ 2157.533162]  ? pick_eevdf+0xe2/0x120
> [ 2157.533169]  ? check_preempt_wakeup_fair+0x161/0x1f0
> [ 2157.533174]  ? wakeup_preempt+0x61/0x70
> [ 2157.533177]  ? ttwu_do_activate+0x5d/0x1e0
> [ 2157.533183]  ? try_to_wake_up+0x78/0x610
> [ 2157.533188]  ? xas_load+0xd/0xb0
> [ 2157.533193]  ? xa_load+0x80/0xb0
> [ 2157.533200]  siw_qp_sq_process+0x102/0xb00 [siw]
> [ 2157.533213]  ? __pfx_siw_run_sq+0x10/0x10 [siw]
> [ 2157.533224]  siw_sq_resume+0x39/0x110 [siw]
> [ 2157.533236]  siw_run_sq+0x74/0x160 [siw]
> [ 2157.533246]  ? __pfx_autoremove_wake_function+0x10/0x10
> [ 2157.533252]  kthread+0xd2/0x100
> [ 2157.533257]  ? __pfx_kthread+0x10/0x10
> [ 2157.533261]  ret_from_fork+0x34/0x40
> [ 2157.533266]  ? __pfx_kthread+0x10/0x10
> [ 2157.533269]  ret_from_fork_asm+0x1a/0x30
> .
> [ 2157.533301] iser: iser_qp_event_callback: qp event QP request error (2)
> [ 2157.533307] iser: iser_qp_event_callback: qp event send queue drained (5)
> [ 2157.533348] connection26:0: detected conn error (1011)
>
> Signed-off-by: Showrya M N <showrya@chelsio.com>
> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
> ---
>  drivers/infiniband/sw/siw/siw_qp_tx.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 64ad9e0895bd..d777d06037db 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -334,6 +334,9 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
> 		bvec_set_page(&bvec, page[i], bytes, offset);
> 		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
>
> +		if (!sendpage_ok(page[i]))
> +			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> +

Thanks! This looks good to me. Although, I would suggest moving this
further up, just before bvec_set_page(): while the check is not doing
anything to the page, it looks cleaner to first adjust the flags before
linking the bvec with the page.

Thanks!
Bernard

> try_page_again:
> 		lock_sock(sk);
> 		rv = tcp_sendmsg_locked(sk, &msg, size);
> --
> 2.39.1
Ok Bernard, I will send a v2.

Thanks,
Showrya

-----Original Message-----
From: Bernard Metzler <BMT@zurich.ibm.com>
Sent: 03 October 2024 19:56
To: Showrya M N <showrya@chelsio.com>; jgg@nvidia.com; leonro@nvidia.com
Cc: linux-rdma@vger.kernel.org; Potnuri Bharat Teja <bharat@chelsio.com>
Subject: RE: [PATCH for-rc] RDMA/siw: add sendpage_ok() check to disable MSG_SPLICE_PAGES

[full quoted patch and review snipped; see the message above]
diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 64ad9e0895bd..d777d06037db 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -334,6 +334,9 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
 		bvec_set_page(&bvec, page[i], bytes, offset);
 		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
+		if (!sendpage_ok(page[i]))
+			msg.msg_flags &= ~MSG_SPLICE_PAGES;
+
 try_page_again:
 		lock_sock(sk);
 		rv = tcp_sendmsg_locked(sk, &msg, size);