diff mbox series

[net-next,v6,2/4] vsock/virtio: support to send non-linear skb

Message ID 20230814212720.3679058-3-AVKrasnov@sberdevices.ru (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series vsock/virtio/vhost: MSG_ZEROCOPY preparations | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1330 this patch: 1330
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 1353 this patch: 1353
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1353 this patch: 1353
netdev/checkpatch warning WARNING: line length of 86 exceeds 80 columns WARNING: line length of 87 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns WARNING: line length of 91 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Arseniy Krasnov Aug. 14, 2023, 9:27 p.m. UTC
For non-linear skb use its pages from fragment array as buffers in
virtio tx queue. These pages are already pinned by 'get_user_pages()'
during such skb creation.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---
 Changelog:
 v2 -> v3:
  * Comment about 'page_to_virt()' is updated. I don't remove R-b,
    as this change is quiet small I guess.

 net/vmw_vsock/virtio_transport.c | 41 +++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 6 deletions(-)

Comments

Paolo Abeni Aug. 22, 2023, 8:16 a.m. UTC | #1
Hi,

I'm sorry for the long delay here. I was OoO in the past few weeks.

On Tue, 2023-08-15 at 00:27 +0300, Arseniy Krasnov wrote:
> For non-linear skb use its pages from fragment array as buffers in
> virtio tx queue. These pages are already pinned by 'get_user_pages()'
> during such skb creation.
> 
> Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
>  Changelog:
>  v2 -> v3:
>   * Comment about 'page_to_virt()' is updated. I don't remove R-b,
>     as this change is quiet small I guess.
> 
>  net/vmw_vsock/virtio_transport.c | 41 +++++++++++++++++++++++++++-----
>  1 file changed, 35 insertions(+), 6 deletions(-)
> 
> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
> index e95df847176b..7bbcc8093e51 100644
> --- a/net/vmw_vsock/virtio_transport.c
> +++ b/net/vmw_vsock/virtio_transport.c
> @@ -100,7 +100,9 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>  	vq = vsock->vqs[VSOCK_VQ_TX];
>  
>  	for (;;) {
> -		struct scatterlist hdr, buf, *sgs[2];
> +		/* +1 is for packet header. */
> +		struct scatterlist *sgs[MAX_SKB_FRAGS + 1];
> +		struct scatterlist bufs[MAX_SKB_FRAGS + 1];

Note that MAX_SKB_FRAGS depends on a config knob (CONFIG_MAX_SKB_FRAGS)
and valid/reasonable values are up to 45. The total stack usage can be
pretty large (~700 bytes).

As this is under the vsk tx lock, have you considered moving such data
in the virtio_vsock struct?

>  		int ret, in_sg = 0, out_sg = 0;
>  		struct sk_buff *skb;
>  		bool reply;
> @@ -111,12 +113,39 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>  
>  		virtio_transport_deliver_tap_pkt(skb);
>  		reply = virtio_vsock_skb_reply(skb);
> +		sg_init_one(&bufs[out_sg], virtio_vsock_hdr(skb),
> +			    sizeof(*virtio_vsock_hdr(skb)));
> +		sgs[out_sg] = &bufs[out_sg];
> +		out_sg++;
> +
> +		if (!skb_is_nonlinear(skb)) {
> +			if (skb->len > 0) {
> +				sg_init_one(&bufs[out_sg], skb->data, skb->len);
> +				sgs[out_sg] = &bufs[out_sg];
> +				out_sg++;
> +			}
> +		} else {
> +			struct skb_shared_info *si;
> +			int i;
> +
> +			si = skb_shinfo(skb);

This assumes that the paged skb does not carry any actual data in the
head buffer (only the header). Is that constraint enforced somewhere
else? Otherwise a

	WARN_ON_ONCE(skb_headlen(skb) > sizeof(*virtio_vsock_hdr(skb))

could be helpful to catch early possible bugs.

Thanks!

Paolo

> +
> +			for (i = 0; i < si->nr_frags; i++) {
> +				skb_frag_t *skb_frag = &si->frags[i];
> +				void *va;
>  
> -		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
> -		sgs[out_sg++] = &hdr;
> -		if (skb->len > 0) {
> -			sg_init_one(&buf, skb->data, skb->len);
> -			sgs[out_sg++] = &buf;
> +				/* We will use 'page_to_virt()' for the userspace page
> +				 * here, because virtio or dma-mapping layers will call
> +				 * 'virt_to_phys()' later to fill the buffer descriptor.
> +				 * We don't touch memory at "virtual" address of this page.
> +				 */
> +				va = page_to_virt(skb_frag->bv_page);
> +				sg_init_one(&bufs[out_sg],
> +					    va + skb_frag->bv_offset,
> +					    skb_frag->bv_len);
> +				sgs[out_sg] = &bufs[out_sg];
> +				out_sg++;
> +			}
>  		}
>  
>  		ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_KERNEL);
Arseniy Krasnov Aug. 23, 2023, 6:54 a.m. UTC | #2
On 22.08.2023 11:16, Paolo Abeni wrote:
> Hi,
> 
> I'm sorry for the long delay here. I was OoO in the past few weeks.
> 
> On Tue, 2023-08-15 at 00:27 +0300, Arseniy Krasnov wrote:
>> For non-linear skb use its pages from fragment array as buffers in
>> virtio tx queue. These pages are already pinned by 'get_user_pages()'
>> during such skb creation.
>>
>> Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
>> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
>> ---
>>  Changelog:
>>  v2 -> v3:
>>   * Comment about 'page_to_virt()' is updated. I don't remove R-b,
>>     as this change is quiet small I guess.
>>
>>  net/vmw_vsock/virtio_transport.c | 41 +++++++++++++++++++++++++++-----
>>  1 file changed, 35 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>> index e95df847176b..7bbcc8093e51 100644
>> --- a/net/vmw_vsock/virtio_transport.c
>> +++ b/net/vmw_vsock/virtio_transport.c
>> @@ -100,7 +100,9 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>>  	vq = vsock->vqs[VSOCK_VQ_TX];
>>  
>>  	for (;;) {
>> -		struct scatterlist hdr, buf, *sgs[2];
>> +		/* +1 is for packet header. */
>> +		struct scatterlist *sgs[MAX_SKB_FRAGS + 1];
>> +		struct scatterlist bufs[MAX_SKB_FRAGS + 1];
> 
> Note that MAX_SKB_FRAGS depends on a config knob (CONFIG_MAX_SKB_FRAGS)
> and valid/reasonable values are up to 45. The total stack usage can be
> pretty large (~700 bytes).
> 
> As this is under the vsk tx lock, have you considered moving such data
> in the virtio_vsock struct?

I think yes, there will be no problem if these temporary variables will be moved
into this global struct. I'll add comment about this reason.

> 
>>  		int ret, in_sg = 0, out_sg = 0;
>>  		struct sk_buff *skb;
>>  		bool reply;
>> @@ -111,12 +113,39 @@ virtio_transport_send_pkt_work(struct work_struct *work)
>>  
>>  		virtio_transport_deliver_tap_pkt(skb);
>>  		reply = virtio_vsock_skb_reply(skb);
>> +		sg_init_one(&bufs[out_sg], virtio_vsock_hdr(skb),
>> +			    sizeof(*virtio_vsock_hdr(skb)));
>> +		sgs[out_sg] = &bufs[out_sg];
>> +		out_sg++;
>> +
>> +		if (!skb_is_nonlinear(skb)) {
>> +			if (skb->len > 0) {
>> +				sg_init_one(&bufs[out_sg], skb->data, skb->len);
>> +				sgs[out_sg] = &bufs[out_sg];
>> +				out_sg++;
>> +			}
>> +		} else {
>> +			struct skb_shared_info *si;
>> +			int i;
>> +
>> +			si = skb_shinfo(skb);
> 
> This assumes that the paged skb does not carry any actual data in the
> head buffer (only the header). Is that constraint enforced somewhere
> else? Otherwise a
> 
> 	WARN_ON_ONCE(skb_headlen(skb) > sizeof(*virtio_vsock_hdr(skb))
> 
> could be helpful to catch early possible bugs.

Yes, such skbs have data only in paged part, while linear buffer contains only
header. Ok, let's add this warning here to prevent future bugs.

Thanks, Arseniy

> 
> Thanks!
> 
> Paolo
> 
>> +
>> +			for (i = 0; i < si->nr_frags; i++) {
>> +				skb_frag_t *skb_frag = &si->frags[i];
>> +				void *va;
>>  
>> -		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
>> -		sgs[out_sg++] = &hdr;
>> -		if (skb->len > 0) {
>> -			sg_init_one(&buf, skb->data, skb->len);
>> -			sgs[out_sg++] = &buf;
>> +				/* We will use 'page_to_virt()' for the userspace page
>> +				 * here, because virtio or dma-mapping layers will call
>> +				 * 'virt_to_phys()' later to fill the buffer descriptor.
>> +				 * We don't touch memory at "virtual" address of this page.
>> +				 */
>> +				va = page_to_virt(skb_frag->bv_page);
>> +				sg_init_one(&bufs[out_sg],
>> +					    va + skb_frag->bv_offset,
>> +					    skb_frag->bv_len);
>> +				sgs[out_sg] = &bufs[out_sg];
>> +				out_sg++;
>> +			}
>>  		}
>>  
>>  		ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_KERNEL);
>
diff mbox series

Patch

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e95df847176b..7bbcc8093e51 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -100,7 +100,9 @@  virtio_transport_send_pkt_work(struct work_struct *work)
 	vq = vsock->vqs[VSOCK_VQ_TX];
 
 	for (;;) {
-		struct scatterlist hdr, buf, *sgs[2];
+		/* +1 is for packet header. */
+		struct scatterlist *sgs[MAX_SKB_FRAGS + 1];
+		struct scatterlist bufs[MAX_SKB_FRAGS + 1];
 		int ret, in_sg = 0, out_sg = 0;
 		struct sk_buff *skb;
 		bool reply;
@@ -111,12 +113,39 @@  virtio_transport_send_pkt_work(struct work_struct *work)
 
 		virtio_transport_deliver_tap_pkt(skb);
 		reply = virtio_vsock_skb_reply(skb);
+		sg_init_one(&bufs[out_sg], virtio_vsock_hdr(skb),
+			    sizeof(*virtio_vsock_hdr(skb)));
+		sgs[out_sg] = &bufs[out_sg];
+		out_sg++;
+
+		if (!skb_is_nonlinear(skb)) {
+			if (skb->len > 0) {
+				sg_init_one(&bufs[out_sg], skb->data, skb->len);
+				sgs[out_sg] = &bufs[out_sg];
+				out_sg++;
+			}
+		} else {
+			struct skb_shared_info *si;
+			int i;
+
+			si = skb_shinfo(skb);
+
+			for (i = 0; i < si->nr_frags; i++) {
+				skb_frag_t *skb_frag = &si->frags[i];
+				void *va;
 
-		sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb)));
-		sgs[out_sg++] = &hdr;
-		if (skb->len > 0) {
-			sg_init_one(&buf, skb->data, skb->len);
-			sgs[out_sg++] = &buf;
+				/* We will use 'page_to_virt()' for the userspace page
+				 * here, because virtio or dma-mapping layers will call
+				 * 'virt_to_phys()' later to fill the buffer descriptor.
+				 * We don't touch memory at "virtual" address of this page.
+				 */
+				va = page_to_virt(skb_frag->bv_page);
+				sg_init_one(&bufs[out_sg],
+					    va + skb_frag->bv_offset,
+					    skb_frag->bv_len);
+				sgs[out_sg] = &bufs[out_sg];
+				out_sg++;
+			}
 		}
 
 		ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_KERNEL);