From patchwork Fri Jul 16 21:26:20 2021
X-Patchwork-Submitter: Oliver Upton
X-Patchwork-Id: 12382907
Date: Fri, 16 Jul 2021 21:26:20 +0000
In-Reply-To: <20210716212629.2232756-1-oupton@google.com>
Message-Id: <20210716212629.2232756-4-oupton@google.com>
References: <20210716212629.2232756-1-oupton@google.com>
X-Mailer: git-send-email 2.32.0.402.g57bb445576-goog
Subject: [PATCH v2 03/12] KVM: x86: Expose TSC offset controls to userspace
From: Oliver Upton
To: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu
Cc: Paolo Bonzini, Sean Christopherson, Marc Zyngier, Peter Shier,
 Jim Mattson, David Matlack, Ricardo Koller, Jing Zhang,
 Raghavendra Rao Anata, James Morse, Alexandru Elisei,
 Suzuki K Poulose, linux-arm-kernel@lists.infradead.org, Oliver Upton

To date, VMM-directed TSC synchronization and migration have been a bit
messy. KVM has some baked-in heuristics around TSC writes to infer
whether the VMM is attempting to synchronize. This is problematic, as it
depends on host userspace writing to the guest's TSC within 1 second of
the last write.

A much cleaner approach to configuring the guest's view of the TSC is to
simply migrate the TSC offset for every vCPU. Offsets are idempotent, and
thus not subject to change depending on when the VMM actually reads/writes
values from/to KVM. The VMM can then read the TSC once with KVM_GET_CLOCK
to capture a (realtime, host_tsc) pair at the instant when the guest is
paused.
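
For illustration only (this sketch is not part of the patch): a minimal
userspace example of how a VMM might save and restore a vCPU's TSC offset
through the new attribute. The vcpu_fd descriptor and the vmm_* helper
names are hypothetical, and error handling is elided.

	#include <linux/kvm.h>
	#include <stdint.h>
	#include <sys/ioctl.h>

	/* Read the current L1 TSC offset out of a vCPU. */
	static int vmm_get_tsc_offset(int vcpu_fd, uint64_t *offset)
	{
		struct kvm_device_attr attr = {
			.group = KVM_VCPU_TSC_CTRL,
			.attr  = KVM_VCPU_TSC_OFFSET,
			.addr  = (uint64_t)(unsigned long)offset,
		};

		return ioctl(vcpu_fd, KVM_GET_DEVICE_ATTR, &attr);
	}

	/* Write a previously saved TSC offset back into a vCPU. */
	static int vmm_set_tsc_offset(int vcpu_fd, uint64_t offset)
	{
		struct kvm_device_attr attr = {
			.group = KVM_VCPU_TSC_CTRL,
			.attr  = KVM_VCPU_TSC_OFFSET,
			.addr  = (uint64_t)(unsigned long)&offset,
		};

		return ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
	}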
Cc: David Matlack
Signed-off-by: Oliver Upton
---
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/include/uapi/asm/kvm.h |   4 +
 arch/x86/kvm/x86.c              | 167 +++++++++++++++++++++++++++++++
 3 files changed, 172 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e527d7259415..45134b7b14d6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1070,6 +1070,7 @@ struct kvm_arch {
 	u64 last_tsc_nsec;
 	u64 last_tsc_write;
 	u32 last_tsc_khz;
+	u64 last_tsc_offset;
 	u64 cur_tsc_nsec;
 	u64 cur_tsc_write;
 	u64 cur_tsc_offset;
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index a6c327f8ad9e..0b22e1e84e78 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -503,4 +503,8 @@ struct kvm_pmu_event_filter {
 #define KVM_PMU_EVENT_ALLOW 0
 #define KVM_PMU_EVENT_DENY 1
 
+/* for KVM_{GET,SET,HAS}_DEVICE_ATTR */
+#define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
+#define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
+
 #endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e1b7c8b67428..d22de0a1988a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2411,6 +2411,11 @@ static void kvm_vcpu_write_tsc_offset(struct kvm_vcpu *vcpu, u64 l1_offset)
 	static_call(kvm_x86_write_tsc_offset)(vcpu, vcpu->arch.tsc_offset);
 }
 
+static u64 kvm_vcpu_read_tsc_offset(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.l1_tsc_offset;
+}
+
 static void kvm_vcpu_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 l1_multiplier)
 {
 	vcpu->arch.l1_tsc_scaling_ratio = l1_multiplier;
@@ -2467,6 +2472,7 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
 	kvm->arch.last_tsc_nsec = ns;
 	kvm->arch.last_tsc_write = tsc;
 	kvm->arch.last_tsc_khz = vcpu->arch.virtual_tsc_khz;
+	kvm->arch.last_tsc_offset = offset;
 
 	vcpu->arch.last_guest_tsc = tsc;
 
@@ -4914,6 +4920,137 @@ static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int kvm_arch_tsc_has_attr(struct kvm_vcpu *vcpu,
+				 struct kvm_device_attr *attr)
+{
+	int r;
+
+	switch (attr->attr) {
+	case KVM_VCPU_TSC_OFFSET:
+		r = 0;
+		break;
+	default:
+		r = -ENXIO;
+	}
+
+	return r;
+}
+
+static int kvm_arch_tsc_get_attr(struct kvm_vcpu *vcpu,
+				 struct kvm_device_attr *attr)
+{
+	void __user *uaddr = (void __user *)attr->addr;
+	int r;
+
+	switch (attr->attr) {
+	case KVM_VCPU_TSC_OFFSET: {
+		u64 offset;
+
+		offset = kvm_vcpu_read_tsc_offset(vcpu);
+		r = -EFAULT;
+		if (copy_to_user(uaddr, &offset, sizeof(offset)))
+			break;
+
+		r = 0;
+		break;
+	}
+	default:
+		r = -ENXIO;
+	}
+
+	return r;
+}
+
+static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,
+				 struct kvm_device_attr *attr)
+{
+	void __user *uaddr = (void __user *)attr->addr;
+	struct kvm *kvm = vcpu->kvm;
+	int r;
+
+	switch (attr->attr) {
+	case KVM_VCPU_TSC_OFFSET: {
+		u64 offset, tsc, ns;
+		unsigned long flags;
+		bool matched;
+
+		r = -EFAULT;
+		if (copy_from_user(&offset, uaddr, sizeof(offset)))
+			break;
+
+		raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
+
+		matched = (vcpu->arch.virtual_tsc_khz &&
+			   kvm->arch.last_tsc_khz == vcpu->arch.virtual_tsc_khz &&
+			   kvm->arch.last_tsc_offset == offset);
+
+		tsc = kvm_scale_tsc(vcpu, rdtsc(), vcpu->arch.l1_tsc_scaling_ratio) + offset;
+		ns = get_kvmclock_base_ns();
+
+		__kvm_synchronize_tsc(vcpu, offset, tsc, ns, matched);
+		raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags);
+
+		r = 0;
+		break;
+	}
+	default:
+		r = -ENXIO;
+	}
+
+	return r;
+}
+
+static int kvm_vcpu_ioctl_has_device_attr(struct kvm_vcpu *vcpu,
+					  struct kvm_device_attr *attr)
+{
+	int r;
+
+	switch (attr->group) {
+	case KVM_VCPU_TSC_CTRL:
+		r = kvm_arch_tsc_has_attr(vcpu, attr);
+		break;
+	default:
+		r = -ENXIO;
+		break;
+	}
+
+	return r;
+}
+
+static int kvm_vcpu_ioctl_get_device_attr(struct kvm_vcpu *vcpu,
+					  struct kvm_device_attr *attr)
+{
+	int r;
+
+	switch (attr->group) {
+	case KVM_VCPU_TSC_CTRL:
+		r = kvm_arch_tsc_get_attr(vcpu, attr);
+		break;
+	default:
+		r = -ENXIO;
+		break;
+	}
+
+	return r;
+}
+
+static int kvm_vcpu_ioctl_set_device_attr(struct kvm_vcpu *vcpu,
+					  struct kvm_device_attr *attr)
+{
+	int r;
+
+	switch (attr->group) {
+	case KVM_VCPU_TSC_CTRL:
+		r = kvm_arch_tsc_set_attr(vcpu, attr);
+		break;
+	default:
+		r = -ENXIO;
+		break;
+	}
+
+	return r;
+}
+
 static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 				     struct kvm_enable_cap *cap)
 {
@@ -5368,6 +5505,36 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = __set_sregs2(vcpu, u.sregs2);
 		break;
 	}
+	case KVM_HAS_DEVICE_ATTR: {
+		struct kvm_device_attr attr;
+
+		r = -EFAULT;
+		if (copy_from_user(&attr, argp, sizeof(attr)))
+			goto out;
+
+		r = kvm_vcpu_ioctl_has_device_attr(vcpu, &attr);
+		break;
+	}
+	case KVM_GET_DEVICE_ATTR: {
+		struct kvm_device_attr attr;
+
+		r = -EFAULT;
+		if (copy_from_user(&attr, argp, sizeof(attr)))
+			goto out;
+
+		r = kvm_vcpu_ioctl_get_device_attr(vcpu, &attr);
+		break;
+	}
+	case KVM_SET_DEVICE_ATTR: {
+		struct kvm_device_attr attr;
+
+		r = -EFAULT;
+		if (copy_from_user(&attr, argp, sizeof(attr)))
+			goto out;
+
+		r = kvm_vcpu_ioctl_set_device_attr(vcpu, &attr);
+		break;
+	}
 	default:
 		r = -EINVAL;
 	}
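
As with other uses of the device attribute API, userspace should probe for
support before relying on the attribute. A hypothetical probe (again not
part of the patch) against the KVM_HAS_DEVICE_ATTR handling added above,
which returns 0 when the attribute exists and -ENXIO otherwise:

	/* Returns nonzero if this vCPU exposes KVM_VCPU_TSC_OFFSET. */
	static int vmm_has_tsc_offset_attr(int vcpu_fd)
	{
		struct kvm_device_attr attr = {
			.group = KVM_VCPU_TSC_CTRL,
			.attr  = KVM_VCPU_TSC_OFFSET,
		};

		return ioctl(vcpu_fd, KVM_HAS_DEVICE_ATTR, &attr) == 0;
	}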