From patchwork Mon Jul 18 08:12:52 2022
From: Arseniy Krasnov
To: Stefano Garzarella, "David S. Miller", Jakub Kicinski, Paolo Abeni,
    Stefan Hajnoczi, "Michael S. Tsirkin", Arseniy Krasnov, Krasnov Arseniy
CC: kvm@vger.kernel.org, netdev@vger.kernel.org,
    virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
    kernel
Subject: [RFC PATCH v1 0/3] virtio/vsock: use SO_RCVLOWAT to set POLLIN/POLLRDNORM
Date: Mon, 18 Jul 2022 08:12:52 +0000
X-Patchwork-Submitter: Arseniy Krasnov
X-Patchwork-Id: 12920978
X-Patchwork-State: RFC
X-Mailing-List: netdev@vger.kernel.org

Hello,

during my experiments with zerocopy receive, I found that in some cases the
poll() implementation violates POSIX: when a socket has a non-default
SO_RCVLOWAT (i.e. not 1), poll() always sets the POLLIN and POLLRDNORM bits
in 'revents', even if the number of bytes available to read on the socket is
smaller than the SO_RCVLOWAT value. In this case the user sees the POLLIN
flag and then tries to read data (for example with a read() call), but the
read blocks, because the SO_RCVLOWAT logic is implemented in the dequeue loop
in af_vsock.c. At the same time, POSIX requires that:

"POLLIN     Data other than high-priority data may be read without blocking.
 POLLRDNORM Normal data may be read without blocking."

See https://www.open-std.org/jtc1/sc22/open/n4217.pdf, page 293.

So poll() returns POLLIN, but the subsequent read call blocks.

Also, in the socket(7) man page I found that:

"Since Linux 2.6.28, select(2), poll(2), and epoll(7) indicate a socket as
readable only if at least SO_RCVLOWAT bytes are available."

I checked the TCP callback for poll() (net/ipv4/tcp.c, tcp_poll()): it uses
the SO_RCVLOWAT value to set the POLLIN bit. I also tested this case with a
TCP socket, and it works as POSIX requires.

I've added fixes to af_vsock.c and virtio_transport_common.c, and a test is
also implemented.

What do you think, guys?

Thank you

Arseniy Krasnov (3):
  vsock_test: POLLIN + SO_RCVLOWAT test.
  virtio/vsock: use 'target' in notify_poll_in callback.
  vsock: use sk_skrcvlowat to set POLLIN,POLLRDNORM bits.

 net/vmw_vsock/af_vsock.c                |  2 +-
 net/vmw_vsock/virtio_transport_common.c |  2 +-
 tools/testing/vsock/vsock_test.c        | 90 +++++++++++++++++++++++++++++++++
 3 files changed, 92 insertions(+), 2 deletions(-)
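
For illustration only, the readiness rule discussed in this cover letter can be modelled in plain C. This is a rough sketch, not code from the series: the helper names pollin_any_data(), pollin_with_lowat() and poll_with_lowat() are hypothetical, and treating values below 1 as 1 is a simplification assumed here. Only setsockopt(2) with SO_RCVLOWAT and poll(2) are real interfaces.

```c
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Behaviour described in the report: any queued byte raises POLLIN. */
static bool pollin_any_data(size_t bytes_queued)
{
	return bytes_queued > 0;
}

/*
 * Behaviour the series aims for: POLLIN only once at least SO_RCVLOWAT
 * bytes are queued. Values below 1 are treated as 1 (assumption made
 * for this sketch).
 */
static bool pollin_with_lowat(size_t bytes_queued, int rcvlowat)
{
	size_t target = rcvlowat > 0 ? (size_t)rcvlowat : 1;

	return bytes_queued >= target;
}

/*
 * Userspace side: set SO_RCVLOWAT on fd, then poll it for input.
 * Returns -1 on error, 0 on timeout, 1 if POLLIN was reported.
 */
static int poll_with_lowat(int fd, int rcvlowat, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT,
		       &rcvlowat, sizeof(rcvlowat)) < 0)
		return -1;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;

	return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

With one byte queued and SO_RCVLOWAT set to 128, pollin_any_data() reports readable while pollin_with_lowat() does not; that gap is the case where a following read() would block. On a connected socket, poll_with_lowat(fd, 128, 1000) would be expected to time out until 128 bytes have arrived, provided the socket family honours SO_RCVLOWAT in its poll callback.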