Message ID | 20230727222627.1895355-1-AVKrasnov@sberdevices.ru (mailing list archive) |
---|---|
Series | vsock/virtio/vhost: MSG_ZEROCOPY preparations |
On Fri, Jul 28, 2023 at 01:26:23AM +0300, Arseniy Krasnov wrote:
> Hello,
>
> this patchset is the first of three parts of another big patchset for
> MSG_ZEROCOPY flag support:
> https://lore.kernel.org/netdev/20230701063947.3422088-1-AVKrasnov@sberdevices.ru/

Overall looks good. Two points I'd like to see addressed:

- what's the performance with all these changes - still the same?

- most systems have a copybreak scheme where buffers smaller than a
  given size are copied directly. This will address the regression you
  see with small buffers - but we need to find that value. We know it's
  between 4k and 32k :)

> During review of this series, Stefano Garzarella <sgarzare@redhat.com>
> suggested splitting it into three parts to simplify review and merging:
>
> 1) virtio and vhost updates (for fragged skbs) <--- this patchset
> 2) AF_VSOCK updates (allow enabling MSG_ZEROCOPY mode and reading
>    tx completions) and updates for Documentation/.
> 3) Updates for tests and utils.
>
> This series enables handling of fragged skbs in the virtio and vhost
> parts. The new logic won't be triggered yet, because the SO_ZEROCOPY
> option is still impossible to enable at this moment (the next bunch of
> patches from the big set above will enable it).
>
> I've included changelogs in some patches anyway, because there were some
> comments during review of the last big patchset from the link above.
>
> Head for this patchset is 9d0cd5d25f7d45bce01bbb3193b54ac24b3a60f3
>
> Link to v1:
> https://lore.kernel.org/netdev/20230717210051.856388-1-AVKrasnov@sberdevices.ru/
> Link to v2:
> https://lore.kernel.org/netdev/20230718180237.3248179-1-AVKrasnov@sberdevices.ru/
> Link to v3:
> https://lore.kernel.org/netdev/20230720214245.457298-1-AVKrasnov@sberdevices.ru/
>
> Changelog:
> * Patchset rebased and tested on new HEAD of net-next (see hash above).
> * See per-patch changelog after ---.
>
> Arseniy Krasnov (4):
>   vsock/virtio/vhost: read data from non-linear skb
>   vsock/virtio: support to send non-linear skb
>   vsock/virtio: non-linear skb handling for tap
>   vsock/virtio: MSG_ZEROCOPY flag support
>
>  drivers/vhost/vsock.c                   |  14 +-
>  include/linux/virtio_vsock.h            |   6 +
>  net/vmw_vsock/virtio_transport.c        |  79 +++++-
>  net/vmw_vsock/virtio_transport_common.c | 312 ++++++++++++++++++------
>  4 files changed, 330 insertions(+), 81 deletions(-)
>
> --
> 2.25.1
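A copybreak check like the one Michael describes would sit at the point where the transport chooses between copying the payload and pinning user pages. The sketch below is illustrative only: the helper name `use_zerocopy` and the 16 KiB threshold are assumptions (the thread only narrows the right value to somewhere between 4k and 32k), not code from this patchset.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical copybreak threshold; the thread only establishes that
 * the right value lies between 4 KiB and 32 KiB, so 16 KiB is a
 * placeholder to be tuned by measurement. */
#define VSOCK_COPYBREAK (16 * 1024)

/* Decide whether a MSG_ZEROCOPY send should really pin user pages or
 * silently fall back to copying: small buffers are cheaper to copy
 * than to pin, map, and complete asynchronously. */
static bool use_zerocopy(size_t payload_len, bool msg_zerocopy)
{
	if (!msg_zerocopy)
		return false;

	return payload_len >= VSOCK_COPYBREAK;
}
```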
On 28.07.2023 08:45, Michael S. Tsirkin wrote:
> On Fri, Jul 28, 2023 at 01:26:23AM +0300, Arseniy Krasnov wrote:
>> Hello,
>>
>> this patchset is the first of three parts of another big patchset for
>> MSG_ZEROCOPY flag support:
>> https://lore.kernel.org/netdev/20230701063947.3422088-1-AVKrasnov@sberdevices.ru/
>
> overall looks good. Two points I'd like to see addressed:

Thanks!

> - what's the performance with all these changes - still same?

Yes, I performed quick tests and the results seem to be the same. This is
because the last implemented logic, where I compare the size of the payload
against 'num_max', is for the "emergency" case and is not triggered in the
default environment. Anyway, I'll retest at least in the nested guest case.

> - most systems have a copybreak scheme where buffers
>   smaller than a given size are copied directly.
>   This will address regression you see with small buffers -
>   but need to find that value. we know it's between 4k and 32k :)

I see. You suggest to find this value and add this check to decide whether
to use zerocopy or copy?

Thanks, Arseniy

[...]
On 28.07.2023 11:00, Arseniy Krasnov wrote:
> On 28.07.2023 08:45, Michael S. Tsirkin wrote:
>> - what's the performance with all these changes - still same?
>
> Yes, I performed quick tests and the results seem to be the same. This is
> because the last implemented logic, where I compare the size of the payload
> against 'num_max', is for the "emergency" case and is not triggered in the
> default environment. Anyway, I'll retest at least in the nested guest case.

The "default environment" is vanilla Qemu, where the queue size is 128
elements. To test this logic I rebuilt Qemu with, for example, a queue of
8 elements.

Thanks, Arseniy

[...]
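For readers outside the thread: the 'num_max' check referred to above guards the case where a fragmented payload would need more virtqueue descriptors than the queue holds at all. Below is a minimal sketch of that idea, under the assumption of roughly one descriptor per page fragment plus one for the virtio-vsock header; the helper `fits_in_virtqueue` is hypothetical, not the patchset's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

/* Hypothetical illustration of the "emergency" fallback: a non-linear
 * skb consumes about one descriptor per page fragment plus one for the
 * header, and that total can never exceed the queue size ('num_max').
 * With Qemu's default of 128 descriptors this check almost never
 * fires; with a rebuilt queue of 8 elements it does. */
static bool fits_in_virtqueue(size_t payload_len, size_t num_max)
{
	size_t descs = (payload_len + PAGE_SIZE - 1) / PAGE_SIZE + 1;

	return descs <= num_max;
}
```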
On 28.07.2023 08:45, Michael S. Tsirkin wrote:
> overall looks good. Two points I'd like to see addressed:
> - what's the performance with all these changes - still same?

Hello Michael, here are results on the last version:

There is some difference between these numbers and the numbers from the
link (those were for v3). It looks like the new version of zerocopy has
become slower on big buffers. But anyway, it is faster than copy mode in
all cases (except the <<<<<< marked line below, but I had the same result
for this testcase in v3 before). I tried to find the reason for this
difference by switching back to the v3 version, but it seems it is not
easy - I get the current results again. I guess the reason may be:

1) A change in my environment - I perform this test in nested
   virtualization mode, so the host OS may also affect performance.
2) My mistake in v3 :(

Anyway:

1) MSG_ZEROCOPY is still faster than copy, as expected.
2) I've added a column with a benchmark on 'net-next' without the
   MSG_ZEROCOPY patchset. It seems it doesn't affect copy performance.
   Cases where we have a difference like 26 against 29 are not a big
   deal - the final result is unstable with some error, e.g. if you run
   the same test again, you can get the opposite result, like 29
   against 26.
3) The numbers below can be considered valid. This is the newest
   measurement.

G2H transmission (values are Gbit/s). Core i7 with nested guest:

*----------*------*----------*-----------------------*
|          |      |          | copy w/o              |
| buf size | copy | zerocopy | MSG_ZEROCOPY          |
|          |      |          | patchset              |
*----------*------*----------*-----------------------*
| 4KB      | 3    | 11       | 3                     |
*----------*------*----------*-----------------------*
| 32KB     | 9    | 70       | 10                    |
*----------*------*----------*-----------------------*
| 256KB    | 30   | 224      | 29                    |
*----------*------*----------*-----------------------*
| 1M       | 27   | 285      | 30                    |
*----------*------*----------*-----------------------*
| 8M       | 26   | 365      | 29                    |
*----------*------*----------*-----------------------*

H2G. Core i7 with nested guest:

*----------*------*----------*-----------------------*
|          |      |          | copy w/o              |
| buf size | copy | zerocopy | MSG_ZEROCOPY          |
|          |      |          | patchset              |
*----------*------*----------*-----------------------*
| 4KB      | 17   | 10       | 17                    | <<<<<<
*----------*------*----------*-----------------------*
| 32KB     | 30   | 61       | 31                    |
*----------*------*----------*-----------------------*
| 256KB    | 35   | 214      | 30                    |
*----------*------*----------*-----------------------*
| 1M       | 29   | 292      | 28                    |
*----------*------*----------*-----------------------*
| 8M       | 28   | 341      | 28                    |
*----------*------*----------*-----------------------*

Loopback. Core i7 with nested guest:

*----------*------*----------*-----------------------*
|          |      |          | copy w/o              |
| buf size | copy | zerocopy | MSG_ZEROCOPY          |
|          |      |          | patchset              |
*----------*------*----------*-----------------------*
| 4KB      | 8    | 7        | 8                     |
*----------*------*----------*-----------------------*
| 32KB     | 27   | 43       | 30                    |
*----------*------*----------*-----------------------*
| 256KB    | 38   | 100      | 39                    |
*----------*------*----------*-----------------------*
| 1M       | 37   | 141      | 39                    |
*----------*------*----------*-----------------------*
| 8M       | 40   | 201      | 36                    |
*----------*------*----------*-----------------------*

Thanks, Arseniy

> - most systems have a copybreak scheme where buffers
>   smaller than a given size are copied directly.
>   This will address regression you see with small buffers -
>   but need to find that value. we know it's between 4k and 32k :)

[...]
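As context for the copy vs. zerocopy columns above, this is roughly how a sender drives such transfers once the follow-up AF_VSOCK series lands: enable SO_ZEROCOPY, send with MSG_ZEROCOPY, and reap completions from the socket error queue before reusing buffers. A minimal sketch, assuming AF_VSOCK gains the same SO_ZEROCOPY/MSG_ERRQUEUE flow as the existing TCP/UDP zerocopy API; the CID/port values and the lack of completion-range checking are simplifications.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/vm_sockets.h>

int main(void)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid = VMADDR_CID_HOST,	/* placeholder peer CID */
		.svm_port = 1234,		/* placeholder port */
	};
	static char buf[256 * 1024];		/* one of the benchmarked sizes */
	int one = 1;
	int fd;

	fd = socket(AF_VSOCK, SOCK_STREAM, 0);
	if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		return 1;

	/* Opt in; MSG_ZEROCOPY is only honoured once SO_ZEROCOPY is set. */
	if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)) < 0)
		return 1;

	/* The kernel pins these pages instead of copying them, so 'buf'
	 * must not be modified until the completion below arrives. */
	if (send(fd, buf, sizeof(buf), MSG_ZEROCOPY) < 0)
		return 1;

	/* Reap the completion notification from the socket error queue.
	 * A real sender would parse the struct sock_extended_err cmsg to
	 * match completion ranges; polling for POLLERR beats sleeping. */
	for (;;) {
		char control[128];
		struct msghdr msg = {
			.msg_control = control,
			.msg_controllen = sizeof(control),
		};

		if (recvmsg(fd, &msg, MSG_ERRQUEUE) >= 0)
			break;			/* buf may be reused now */
		if (errno != EAGAIN)
			return 1;
		usleep(1000);
	}

	close(fd);
	return 0;
}
```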