From patchwork Tue Jan 12 12:15:58 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Catangiu, Adrian Costin"
X-Patchwork-Id: 12013327
Subject: [PATCH v4 0/2] System Generation ID driver and VMGENID backend
Date: Tue, 12 Jan 2021 14:15:58 +0200
Message-ID: <1610453760-13812-1-git-send-email-acatan@amazon.com>
X-Mailer: git-send-email 2.7.4
Cc: Jason@zx2c4.com, dgunigun@redhat.com, mst@redhat.com,
 ghammer@redhat.com, vijaysun@ca.ibm.com, 0x7f454c46@gmail.com,
 mhocko@kernel.org, oridgar@gmail.com, avagin@gmail.com, pavel@ucw.cz,
 ptikhomirov@virtuozzo.com, corbet@lwn.net, mpe@ellerman.id.au,
 rafael@kernel.org, ebiggers@kernel.org, borntraeger@de.ibm.com,
 sblbir@amazon.com, bonzini@gnu.org, arnd@arndb.de, jannh@google.com,
 raduweis@amazon.com, asmehra@redhat.com, Adrian Catangiu,
 graf@amazon.com, rppt@kernel.org, luto@kernel.org, gil@azul.com,
 colmmacc@amazon.com, tytso@mit.edu, gregkh@linuxfoundation.org,
 areber@redhat.com, ebiederm@xmission.com, ovzxemul@gmail.com, w@1wt.eu,
 dwmw@amazon.co.uk
From: "Catangiu, Adrian Costin"

This feature is aimed at virtualized or containerized environments
where VM or container snapshotting duplicates memory state, which is a
challenge for applications that want to generate unique data such as
request IDs, UUIDs, and cryptographic nonces.

The patch set introduces a mechanism that provides a userspace
interface for applications and libraries to be made aware of
uniqueness-breaking events such as VM or container snapshotting, and
allows them to react and adapt to such events.

Solving the uniqueness problem strongly enough for cryptographic
purposes requires a mechanism which can deterministically reseed
userspace PRNGs with new entropy at restore time.
This mechanism must also support the high-throughput and low-latency
use-cases that led programmers to pick a userspace PRNG in the first
place; be usable by both application code and libraries; allow
transparent retrofitting behind existing popular PRNG interfaces
without changing application code; be efficient, especially on
snapshot restore; and be simple enough for wide adoption.

The first patch in the set implements a device driver which exposes a
read-only device /dev/sysgenid to userspace, which contains a
monotonically increasing u32 generation counter. Libraries and
applications are expected to open() the device, and then call read()
which blocks until the SysGenId changes. Following an update, read()
calls no longer block until the application acknowledges the new
SysGenId by write()ing it back to the device. Non-blocking read() calls
return EAGAIN when there is no new SysGenId available. Alternatively,
libraries can mmap() the device to get a single shared page which
contains the latest SysGenId at offset 0.

SysGenId also supports a notification mechanism exposed as two IOCTLs
on the device. SYSGENID_GET_OUTDATED_WATCHERS immediately returns the
number of file descriptors to the device that were open during the last
SysGenId change but have not yet acknowledged the new id.
SYSGENID_WAIT_WATCHERS blocks until there are no open file handles on
the device which haven't acknowledged the new id. These two interfaces
are intended for serverless and container control planes, which want to
confirm that all application code has detected and reacted to the new
SysGenId before sending an invoke to the newly-restored sandbox.

The second patch in the set adds a VmGenId driver which makes use of
the ACPI vmgenid device to drive SysGenId and to reseed kernel entropy
on VM snapshots.

---

v3 -> v4:

  - split functionality in two separate kernel modules:
    1. drivers/misc/sysgenid.c which provides the generic userspace
       interface and mechanisms
    2. drivers/virt/vmgenid.c as VMGENID ACPI device driver that seeds
       kernel entropy and acts as a driving backend for the generic
       sysgenid
  - renamed /dev/vmgenid -> /dev/sysgenid
  - renamed uapi header file vmgenid.h -> sysgenid.h
  - renamed ioctls VMGENID_* -> SYSGENID_*
  - added 'min_gen' parameter to SYSGENID_FORCE_GEN_UPDATE ioctl
  - fixed races in documentation examples
  - various style nits
  - rebased on top of Linus' latest

v2 -> v3:

  - separate the core driver logic and interface from the ACPI device.
    The ACPI vmgenid device is now one possible backend.
  - fix issue when timeout=0 in VMGENID_WAIT_WATCHERS
  - add locking to avoid races between fs ops handlers and hw irq
    driven generation updates
  - change VMGENID_WAIT_WATCHERS ioctl so if the current caller is
    outdated or a generation change happens while waiting (thus making
    current caller outdated), the ioctl returns -EINTR to signal the
    user to handle the event and retry. Fixes blocking on oneself.
  - add VMGENID_FORCE_GEN_UPDATE ioctl conditioned by
    CAP_CHECKPOINT_RESTORE capability, through which software can force
    a generation bump.

v1 -> v2:

  - expose to userspace a monotonically increasing u32 Vm Gen Counter
    instead of the hw VmGen UUID
  - since the hw/hypervisor-provided 128-bit UUID is not public
    anymore, add it to the kernel RNG as device randomness
  - insert driver page containing Vm Gen Counter in the user vma in
    the driver's mmap handler instead of using a fault handler
  - turn driver into a misc device driver to auto-create /dev/vmgenid
  - change ioctl arg to avoid leaking kernel structs to userspace
  - update documentation
  - various nits
  - rebase on top of Linus' latest

Adrian Catangiu (2):
  drivers/misc: sysgenid: add system generation id driver
  drivers/virt: vmgenid: add vm generation id driver

 Documentation/misc-devices/sysgenid.rst | 240 +++++++++++++++++++++++++
 Documentation/virt/vmgenid.rst          |  34 ++++
 drivers/misc/Kconfig                    |  16 ++
 drivers/misc/Makefile                   |   1 +
 drivers/misc/sysgenid.c                 | 298 ++++++++++++++++++++++++++++++++
 drivers/virt/Kconfig                    |  14 ++
 drivers/virt/Makefile                   |   1 +
 drivers/virt/vmgenid.c                  | 153 ++++++++++++++++
 include/uapi/linux/sysgenid.h           |  18 ++
 9 files changed, 775 insertions(+)
 create mode 100644 Documentation/misc-devices/sysgenid.rst
 create mode 100644 Documentation/virt/vmgenid.rst
 create mode 100644 drivers/misc/sysgenid.c
 create mode 100644 drivers/virt/vmgenid.c
 create mode 100644 include/uapi/linux/sysgenid.h
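
For illustration, the read()/write() acknowledgement flow described in
this cover letter could look roughly like the userspace sketch below.
It is not code from the patches: the helper names (sysgenid_open,
sysgenid_read_gen, sysgenid_ack_gen, sysgenid_handle_change) and the
reseed callback are hypothetical, and the sketch assumes the counter is
transferred as one native-endian u32 per read() or write() call, which
the text implies but does not state outright.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Open the device read-write: write access is needed to acknowledge a
 * generation by writing it back. With nonblock set, reads are expected
 * to fail with EAGAIN when no new generation is pending.
 */
int sysgenid_open(const char *path, int nonblock)
{
    return open(path, O_RDWR | (nonblock ? O_NONBLOCK : 0));
}

/*
 * Read one u32 generation counter from fd.
 * Returns 0 on success, -1 with errno set on failure. A short read is
 * reported as EIO (an assumption of this sketch, not driver behaviour).
 */
int sysgenid_read_gen(int fd, uint32_t *gen)
{
    ssize_t n;

    assert(gen != NULL);
    do {
        n = read(fd, gen, sizeof(*gen));
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -1;
    if ((size_t)n != sizeof(*gen)) {
        errno = EIO;
        return -1;
    }
    return 0;
}

/* Acknowledge a generation by writing it back to fd. */
int sysgenid_ack_gen(int fd, uint32_t gen)
{
    ssize_t n;

    do {
        n = write(fd, &gen, sizeof(gen));
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -1;
    if ((size_t)n != sizeof(gen)) {
        errno = EIO;
        return -1;
    }
    return 0;
}

/*
 * Wait for a new generation, let the caller adapt (for example reseed
 * a userspace PRNG) through the optional callback, then acknowledge.
 * The acknowledgement comes last so that watchers are only counted as
 * up to date after they have actually reacted.
 */
int sysgenid_handle_change(int fd, void (*reseed)(uint32_t gen))
{
    uint32_t gen;

    if (sysgenid_read_gen(fd, &gen) < 0)
        return -1;
    if (reseed)
        reseed(gen);
    return sysgenid_ack_gen(fd, gen);
}
```

A library would presumably call sysgenid_open("/dev/sysgenid", 0) once
and then loop on sysgenid_handle_change() from a dedicated thread. With
nonblock set, a failure with errno == EAGAIN would simply mean that no
new generation is pending.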