Message ID | 20210726163137.2589102-1-arseny.krasnov@kaspersky.com (mailing list archive) |
---|---|
Series | virtio/vsock: introduce MSG_EOR flag for SEQPACKET |
On Mon, Jul 26, 2021 at 07:31:33PM +0300, Arseny Krasnov wrote:
> This patchset implements support of MSG_EOR bit for SEQPACKET
> AF_VSOCK sockets over virtio transport.
> Idea is to distinguish concepts of 'messages' and 'records'.
> Message is result of sending calls: 'write()', 'send()', 'sendmsg()'
> etc. It has fixed maximum length, and its bounds are visible using
> return from receive calls: 'read()', 'recv()', 'recvmsg()' etc.
> Current implementation based on message definition above.
> Record has unlimited length, it consists of multiple messages,
> and bounds of record are visible via MSG_EOR flag returned from
> 'recvmsg()' call. Sender passes MSG_EOR to sending system call and
> receiver will see MSG_EOR when corresponding message is processed.
> To support MSG_EOR new bit was added along with existing
> 'VIRTIO_VSOCK_SEQ_EOR': 'VIRTIO_VSOCK_SEQ_EOM'(end-of-message) - now it
> works in the same way as 'VIRTIO_VSOCK_SEQ_EOR'. But 'VIRTIO_VSOCK_SEQ_EOR'
> is used to mark 'MSG_EOR' bit passed from userspace.

At this point it's probably better to rename the old flag, so we stay
compatible.

What happens if one of the two peers does not support MSG_EOR handling,
while the other does?

I'll do a closer review in the next few days.

Thanks,
Stefano

On 27.07.2021 10:59, Stefano Garzarella wrote:
> On Mon, Jul 26, 2021 at 07:31:33PM +0300, Arseny Krasnov wrote:
>> To support MSG_EOR new bit was added along with existing
>> 'VIRTIO_VSOCK_SEQ_EOR': 'VIRTIO_VSOCK_SEQ_EOM'(end-of-message) - now it
>> works in the same way as 'VIRTIO_VSOCK_SEQ_EOR'. But 'VIRTIO_VSOCK_SEQ_EOR'
>> is used to mark 'MSG_EOR' bit passed from userspace.
> At this point it's probably better to rename the old flag, so we stay
> compatible.
>
> What happens if one of the two peers does not support MSG_EOR handling,
> while the other does?
>
> I'll do a closer review in the next few days.

Thank you. I also think MSG_EOR support must be described in the spec.

On Tue, Jul 27, 2021 at 12:34:36PM +0300, Arseny Krasnov wrote:
>On 27.07.2021 10:59, Stefano Garzarella wrote:
>> At this point it's probably better to rename the old flag, so we stay
>> compatible.
>>
>> What happens if one of the two peers does not support MSG_EOR handling,
>> while the other does?
>>
>> I'll do a closer review in the next few days.
>
>Thank you. I also think MSG_EOR support must be described in the spec.

Yep, sure!

What do you think about the concerns above?

Stefano

On 27.07.2021 12:58, Stefano Garzarella wrote:
> On Tue, Jul 27, 2021 at 12:34:36PM +0300, Arseny Krasnov wrote:
>> On 27.07.2021 10:59, Stefano Garzarella wrote:
>>> At this point it's probably better to rename the old flag, so we stay
>>> compatible.
>>>
>>> What happens if one of the two peers does not support MSG_EOR handling,
>>> while the other does?
>>
>> Thank you. I also think MSG_EOR support must be described in the spec.
> Yep, sure!
>
> What do you think about the concerns above?

I think you are right. I'll rename EOR -> EOM, and EOR will be added by
the patch.

Hi Arseny,

On Mon, Jul 26, 2021 at 07:31:33PM +0300, Arseny Krasnov wrote:
> This patchset implements support of MSG_EOR bit for SEQPACKET
> AF_VSOCK sockets over virtio transport.
> Idea is to distinguish concepts of 'messages' and 'records'.
> Message is result of sending calls: 'write()', 'send()', 'sendmsg()'
> etc. It has fixed maximum length, and its bounds are visible using
> return from receive calls: 'read()', 'recv()', 'recvmsg()' etc.
> Current implementation based on message definition above.

Okay, so the implementation we merged is wrong, right?
Should we disable the feature bit in stable kernels that contain it? Or
maybe we can backport the fixes...

> Record has unlimited length, it consists of multiple messages,
> and bounds of record are visible via MSG_EOR flag returned from
> 'recvmsg()' call. Sender passes MSG_EOR to sending system call and
> receiver will see MSG_EOR when corresponding message is processed.
> To support MSG_EOR new bit was added along with existing
> 'VIRTIO_VSOCK_SEQ_EOR': 'VIRTIO_VSOCK_SEQ_EOM'(end-of-message) - now it
> works in the same way as 'VIRTIO_VSOCK_SEQ_EOR'. But 'VIRTIO_VSOCK_SEQ_EOR'
> is used to mark 'MSG_EOR' bit passed from userspace.

I understand that it makes sense to remap VIRTIO_VSOCK_SEQ_EOR to
MSG_EOR to make the user understand the boundaries, but why do we need
EOM as well?

Why do we care about the boundaries of a message within a record?
I mean, if the sender makes 3 calls:
    send(A1, 0)
    send(A2, 0)
    send(A3, MSG_EOR);

IIUC it should be fine if the receiver, for example, receives all in one
single recv() call with MSG_EOR set, so why do we need EOM?

Thanks,
Stefano

On 04.08.2021 15:57, Stefano Garzarella wrote:
> Hi Arseny,
>
> On Mon, Jul 26, 2021 at 07:31:33PM +0300, Arseny Krasnov wrote:
>> This patchset implements support of MSG_EOR bit for SEQPACKET
>> AF_VSOCK sockets over virtio transport.
>> Idea is to distinguish concepts of 'messages' and 'records'.
>> Message is result of sending calls: 'write()', 'send()', 'sendmsg()'
>> etc. It has fixed maximum length, and its bounds are visible using
>> return from receive calls: 'read()', 'recv()', 'recvmsg()' etc.
>> Current implementation based on message definition above.
> Okay, so the implementation we merged is wrong, right?
> Should we disable the feature bit in stable kernels that contain it? Or
> maybe we can backport the fixes...

Hi,

No, this is correct, and it is message-boundary based. The idea of this
patchset is to add an extra boundary marker, which I think could be
useful when we want to send data in seqpacket mode whose length
is bigger than the maximum message length (this is limited by the
transport). Of course we can fragment a big piece of data into small
messages, but this requires carrying fragmentation info in the data
protocol. So in this case, when we want to maintain boundaries, the
receiver calls recvmsg() until MSG_EOR is found.
But when the receiver knows that the data fits in the maximum datagram
length, it doesn't care about checking MSG_EOR and just calls recv() or
read() (i.e. message-based mode).

Thank you

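The "receiver calls recvmsg() until MSG_EOR is found" pattern described
above could look roughly like the sketch below. It is only an illustration,
not code from the patchset: recv_record(), the recv_one_fn callback and the
in-memory fake_src are hypothetical names introduced here so the loop can be
exercised without a vsock transport, and treating a zero-length read as
end-of-stream is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical callback: receive ONE message into buf, report the
 * resulting msg_flags in *flags, return the byte count or -1. */
typedef ssize_t (*recv_one_fn)(void *ctx, void *buf, size_t len, int *flags);

/* Append messages to out until one arrives with MSG_EOR set.
 * Returns the record length, or -1 on error, end-of-stream before
 * MSG_EOR, truncation, or when the record does not fit in cap bytes. */
ssize_t recv_record(recv_one_fn recv_one, void *ctx, char *out, size_t cap)
{
    size_t used = 0;

    assert(recv_one != NULL && out != NULL);
    for (;;) {
        int flags = 0;
        ssize_t n;

        if (used == cap)
            return -1;                  /* no room for the next message */
        n = recv_one(ctx, out + used, cap - used, &flags);
        if (n <= 0)
            return -1;                  /* error or end-of-stream */
        if (flags & MSG_TRUNC)
            return -1;                  /* part of a message was lost */
        used += (size_t)n;
        if (flags & MSG_EOR)
            return (ssize_t)used;       /* end of record reached */
    }
}

/* Socket-backed callback: ctx points to a connected SEQPACKET fd. */
ssize_t recv_one_socket(void *ctx, void *buf, size_t len, int *flags)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t n = recvmsg(*(int *)ctx, &msg, 0);

    if (n >= 0)
        *flags = msg.msg_flags;
    return n;
}

/* In-memory stand-in for a socket, used only to exercise the loop. */
struct fake_msg { const char *data; int flags; };
struct fake_src { const struct fake_msg *msgs; size_t count, next; };

ssize_t recv_one_fake(void *ctx, void *buf, size_t len, int *flags)
{
    struct fake_src *src = ctx;
    const struct fake_msg *m;
    size_t n;

    if (src->next == src->count)
        return 0;                       /* nothing left to deliver */
    m = &src->msgs[src->next++];
    n = strlen(m->data);
    *flags = m->flags;
    if (n > len) {                      /* mimic datagram truncation */
        n = len;
        *flags |= MSG_TRUNC;
    }
    memcpy(buf, m->data, n);
    return (ssize_t)n;
}
```

A receiver that knows every payload fits into one message can skip this
loop entirely and call recv() or read() once per message, which is the
message-based mode mentioned above.
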
On Thu, Aug 05, 2021 at 11:33:12AM +0300, Arseny Krasnov wrote:
>On 04.08.2021 15:57, Stefano Garzarella wrote:
>> Okay, so the implementation we merged is wrong, right?
>> Should we disable the feature bit in stable kernels that contain it? Or
>> maybe we can backport the fixes...
>
>Hi,
>
>No, this is correct, and it is message-boundary based. The idea of this
>patchset is to add an extra boundary marker, which I think could be
>useful when we want to send data in seqpacket mode whose length
>is bigger than the maximum message length (this is limited by the
>transport). Of course we can fragment a big piece of data into small
>messages, but this requires carrying fragmentation info in the data
>protocol. So in this case, when we want to maintain boundaries, the
>receiver calls recvmsg() until MSG_EOR is found.
>But when the receiver knows that the data fits in the maximum datagram
>length, it doesn't care about checking MSG_EOR and just calls recv() or
>read() (i.e. message-based mode).

I'm not sure we should maintain boundaries of multiple send(), from
POSIX standard [1]:

   SOCK_SEQPACKET
     Provides sequenced, reliable, bidirectional, connection-mode
     transmission paths for records. A record can be sent using one or
     more output operations and received using one or more input
     operations, but a single operation never transfers part of more than
     one record. Record boundaries are visible to the receiver via the
     MSG_EOR flag.

From my understanding a record could be sent with multiple send() and
received, for example, with a single recvmsg().
The only boundary should be the MSG_EOR flag set by the user on the last
send() of a record.

From send() description [2]:

   MSG_EOR
     Terminates a record (if supported by the protocol).

From recvmsg() description [3]:

   MSG_EOR
     End-of-record was received (if supported by the protocol).

Thanks,
Stefano

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/send.html
[3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recvmsg.html

On 05.08.2021 12:06, Stefano Garzarella wrote:
> I'm not sure we should maintain boundaries of multiple send(), from
> POSIX standard [1]:

Yes, but also from POSIX: calls like send() and sendmsg() operate on a
"message", and if we check recvmsg() we will find the following:

   For message-based sockets, such as SOCK_DGRAM and SOCK_SEQPACKET, the
   entire message shall be read in a single operation. If a message is
   too long to fit in the supplied buffers, and MSG_PEEK is not set in
   the flags argument, the excess bytes shall be discarded.

I understand this to mean that send() boundaries must also be maintained.
I've checked SEQPACKET in AF_UNIX and AX_25 - neither supports MSG_EOR,
so send() boundaries must be supported.

>    SOCK_SEQPACKET
>      Provides sequenced, reliable, bidirectional, connection-mode
>      transmission paths for records. A record can be sent using one or
>      more output operations and received using one or more input
>      operations, but a single operation never transfers part of more than
>      one record. Record boundaries are visible to the receiver via the
>      MSG_EOR flag.
>
> From my understanding a record could be sent with multiple send() and
> received, for example, with a single recvmsg().
> The only boundary should be the MSG_EOR flag set by the user on the last
> send() of a record.

You are right, if we are talking about a "record".

> From send() description [2]:
>
>    MSG_EOR
>      Terminates a record (if supported by the protocol).
>
> From recvmsg() description [3]:
>
>    MSG_EOR
>      End-of-record was received (if supported by the protocol).
>
> Thanks,
> Stefano
>
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html
> [2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/send.html
> [3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recvmsg.html

P.S.: it seems SEQPACKET is such an exotic thing that everyone implements
it in their own manner, because I've tested the SCTP seqpacket
implementation and found that:

1) It doesn't support the MSG_EOR bit on the send side, but uses MSG_EOR
   on the receiver side to mark the MESSAGE boundary.

2) According to POSIX, any extra bytes that didn't fit in the user's
   buffer must be dropped, but SCTP doesn't drop them - you can read the
   rest of the datagram in subsequent calls.

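The POSIX truncation rule quoted above (excess bytes of a too-long message
are discarded) can be observed from userspace with a small experiment. The
sketch below is an illustration only and assumes Linux, where
AF_UNIX/SOCK_SEQPACKET socket pairs are available; it uses AF_UNIX rather
than AF_VSOCK so it runs without a guest, and other transports (SCTP, as
noted above) may behave differently. The struct trunc_result and the
function names are made up for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

struct trunc_result {
    ssize_t first_len;    /* bytes returned by the short read */
    int     first_trunc;  /* non-zero if MSG_TRUNC was reported */
    ssize_t second_len;   /* bytes returned by the following read */
    char    second[16];   /* payload of the following read */
};

/* recvmsg() wrapper that also reports the returned msg_flags. */
static ssize_t recv_flags(int fd, void *buf, size_t len, int *flags)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t n = recvmsg(fd, &msg, 0);

    if (n >= 0)
        *flags = msg.msg_flags;
    return n;
}

/* Send an 8-byte and a 2-byte message, read the first one into a
 * 4-byte buffer, then read again. Returns 0 on success, -1 on error. */
int seqpacket_truncation_demo(struct trunc_result *res)
{
    int sv[2];
    int rc = -1;

    assert(res != NULL);
    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
        return -1;

    if (send(sv[0], "ABCDEFGH", 8, 0) == 8 &&
        send(sv[0], "xy", 2, 0) == 2) {
        char small[4];
        int flags = 0;

        /* Buffer is shorter than the message: the tail is dropped. */
        res->first_len = recv_flags(sv[1], small, sizeof small, &flags);
        res->first_trunc = (flags & MSG_TRUNC) != 0;

        /* The next read starts at the next message, not at "EFGH". */
        flags = 0;
        res->second_len = recv_flags(sv[1], res->second,
                                     sizeof res->second, &flags);
        rc = (res->first_len < 0 || res->second_len < 0) ? -1 : 0;
    }
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

With the discard behaviour, the second read returns the second message
("xy") and the tail of the first one is gone; a transport that keeps the
excess bytes, as described for SCTP above, would instead return the
remainder of the first message.
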
On Thu, Aug 05, 2021 at 12:21:57PM +0300, Arseny Krasnov wrote:
>On 05.08.2021 12:06, Stefano Garzarella wrote:
>> I'm not sure we should maintain boundaries of multiple send(), from
>> POSIX standard [1]:
>
>Yes, but also from POSIX: calls like send() and sendmsg() operate on a
>"message", and if we check recvmsg() we will find the following:
>
>   For message-based sockets, such as SOCK_DGRAM and SOCK_SEQPACKET, the
>   entire message shall be read in a single operation. If a message is
>   too long to fit in the supplied buffers, and MSG_PEEK is not set in
>   the flags argument, the excess bytes shall be discarded.
>
>I understand this to mean that send() boundaries must also be maintained.
>I've checked SEQPACKET in AF_UNIX and AX_25 - neither supports MSG_EOR,
>so send() boundaries must be supported.
>
>> From my understanding a record could be sent with multiple send() and
>> received, for example, with a single recvmsg().
>> The only boundary should be the MSG_EOR flag set by the user on the
>> last send() of a record.
>
>You are right, if we are talking about a "record".
>
>P.S.: it seems SEQPACKET is such an exotic thing that everyone implements
>it in their own manner, because I've tested the SCTP seqpacket
>implementation and found that:
>
>1) It doesn't support the MSG_EOR bit on the send side, but uses MSG_EOR
>   on the receiver side to mark the MESSAGE boundary.
>
>2) According to POSIX, any extra bytes that didn't fit in the user's
>   buffer must be dropped, but SCTP doesn't drop them - you can read the
>   rest of the datagram in subsequent calls.

Thanks for this useful information, now I see the differences and why we
should support both.

I think it is better to include them in the cover letter.

I'm going to review the patches right now :-)

Stefano

	This patchset implements support of MSG_EOR bit for SEQPACKET
AF_VSOCK sockets over virtio transport.
	Idea is to distinguish concepts of 'messages' and 'records'.
Message is result of sending calls: 'write()', 'send()', 'sendmsg()'
etc. It has fixed maximum length, and its bounds are visible using
return from receive calls: 'read()', 'recv()', 'recvmsg()' etc.
Current implementation is based on message definition above.
	Record has unlimited length, it consists of multiple messages,
and bounds of record are visible via MSG_EOR flag returned from
'recvmsg()' call. Sender passes MSG_EOR to sending system call and
receiver will see MSG_EOR when corresponding message is processed.
	To support MSG_EOR new bit was added along with existing
'VIRTIO_VSOCK_SEQ_EOR': 'VIRTIO_VSOCK_SEQ_EOM'(end-of-message) - now it
works in the same way as 'VIRTIO_VSOCK_SEQ_EOR'. But 'VIRTIO_VSOCK_SEQ_EOR'
is used to mark 'MSG_EOR' bit passed from userspace.
	This patchset includes simple test for MSG_EOR.
	Also I've added new vsock test for '-EAGAIN' receive result.

 Arseny Krasnov(7):
  virtio/vsock: add 'VIRTIO_VSOCK_SEQ_EOM' bit
  vsock: rename implementation from 'record' to 'message'
  vhost/vsock: support MSG_EOR bit processing
  virito/vsock: support MSG_EOR bit processing
  af_vsock: rename variables in receive loop
  vsock_test: update message bounds test for MSG_EOR
  vsock_test: 'SO_RCVTIMEO' test for SEQPACKET

 drivers/vhost/vsock.c                   | 28 +++++++----
 include/uapi/linux/virtio_vsock.h       |  1 +
 net/vmw_vsock/af_vsock.c                | 10 ++--
 net/vmw_vsock/virtio_transport_common.c | 23 +++++----
 tools/testing/vsock/vsock_test.c        | 57 ++++++++++++++++++++++-
 5 files changed, 96 insertions(+), 23 deletions(-)

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
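
The relationship between the two header bits described in the cover letter
can be sketched as a pair of helper functions. This is a rough illustration
under stated assumptions, not the code from the patches: the SKETCH_SEQ_*
names and their bit values are placeholders chosen for this example (the
real values are whatever include/uapi/linux/virtio_vsock.h defines), and
splitting one message into several packets at offsets off/len is an
assumption about how the transport fragments data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Placeholder bit values, for illustration only. */
#define SKETCH_SEQ_EOM (1u << 0)   /* last packet of a message */
#define SKETCH_SEQ_EOR (1u << 1)   /* sender passed MSG_EOR */

/* Sender side: header flags for the packet that carries bytes
 * [off, off + len) of a message that is total bytes long. */
uint32_t sketch_tx_flags(size_t off, size_t len, size_t total,
                         int send_flags)
{
    uint32_t flags = 0;

    assert(off + len <= total);
    if (off + len == total) {           /* final packet of the message */
        flags |= SKETCH_SEQ_EOM;
        if (send_flags & MSG_EOR)
            flags |= SKETCH_SEQ_EOR;    /* record ends with this message */
    }
    return flags;
}

/* Receiver side: returns 1 when the packet completes a message and
 * sets MSG_EOR in *msg_flags if that message also ends a record. */
int sketch_rx_packet(uint32_t pkt_flags, int *msg_flags)
{
    if (!(pkt_flags & SKETCH_SEQ_EOM))
        return 0;                       /* more packets to come */
    if (pkt_flags & SKETCH_SEQ_EOR)
        *msg_flags |= MSG_EOR;
    return 1;
}
```

In this sketch the EOM bit alone preserves send() boundaries, while the EOR
bit is only ever set together with EOM, which matches the idea that a
record ends at the end of some message.
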