From patchwork Fri Apr 19 12:51:58 2024
X-Patchwork-Submitter: David Woodhouse
X-Patchwork-Id: 13636347
Message-ID: <4cb5e38cfce2dbd73635177842e79d479fc2942f.camel@infradead.org>
Subject: [PATCH 11/10] KVM: x86: Fix software TSC upscaling in kvm_update_guest_time()
From: David Woodhouse
To: kvm@vger.kernel.org, Dongli Zhang
Cc: Paolo Bonzini, Jonathan Corbet, Sean Christopherson, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org,
 "H. Peter Anvin", Paul Durrant, Shuah Khan, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
 Oliver Upton, Marcelo Tosatti, jalliste@amazon.co.uk, sveith@amazon.de
Date: Fri, 19 Apr 2024 13:51:58 +0100
In-Reply-To: <20240418193528.41780-1-dwmw2@infradead.org>
References: <20240418193528.41780-1-dwmw2@infradead.org>

From: David Woodhouse

There was some confusion in kvm_guest_time_update() when software needs
to advance the guest TSC.

In master clock mode, there are two points of time which need to be
taken into account.
First there is the master clock reference point, stored in
kvm->arch.master_kernel_ns (and associated host TSC ->master_cycle_now).
Secondly, there is the time *now*, at the point kvm_guest_time_update()
is being called.

With software TSC upscaling, the guest TSC is getting further and
further ahead of the host TSC as time elapses. So at time "now", the
guest TSC should be further ahead of the host than it was at
master_kernel_ns.

The adjustment in kvm_guest_time_update() was not taking that into
account, and was only advancing the guest TSC by the appropriate amount
for master_kernel_ns, *not* the current time. Fix it to calculate them
both correctly.

Since the KVM clock reference point in master_kernel_ns might actually
be *earlier* than the reference point used for the guest TSC
(vcpu->arch.this_tsc_nsec), this might lead to a negative delta. Fix the
compute_guest_tsc() function to cope with negative numbers, which then
means there is no need to force a master clock update when the guest TSC
is written.

Signed-off-by: David Woodhouse
---

Untested. Thrown on the pile at
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/clocks
which we'll be testing more next week...
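To make the negative-delta case concrete, here is a minimal user-space
sketch of the signed scaling described above. It is an illustration under
stated assumptions, not the kernel code: scale_delta() is a stand-in that
mimics the shift-then-multiply arithmetic of pvclock_scale_delta(), and
struct tsc_ref and guest_tsc_at() are hypothetical names that only mirror
the fields compute_guest_tsc() reads.

```c
#include <stdint.h>

/*
 * User-space stand-in for the kernel's pvclock_scale_delta(): shift the
 * delta, then multiply by a 32-bit fixed-point fraction (mul_frac / 2^32).
 */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* 64x32 multiply, keeping bits 32..95 of the product. */
	uint64_t hi = (delta >> 32) * mul_frac;
	uint64_t lo = ((delta & 0xffffffffULL) * mul_frac) >> 32;

	return hi + lo;
}

/* Hypothetical container for the per-vCPU fields used by the calculation. */
struct tsc_ref {
	int64_t  this_tsc_nsec;   /* kernel time at the last guest TSC write */
	uint64_t this_tsc_write;  /* guest TSC value written at that time */
	uint32_t virtual_tsc_mult;
	int      virtual_tsc_shift;
};

/* Guest TSC expected at kernel_ns, whichever side of the reference it is. */
static uint64_t guest_tsc_at(const struct tsc_ref *r, int64_t kernel_ns)
{
	int64_t delta = kernel_ns - r->this_tsc_nsec;
	uint64_t tsc = r->this_tsc_write;

	/* The scaling helper is unsigned, so scale |delta| and apply the sign. */
	if (delta >= 0)
		tsc += scale_delta((uint64_t)delta, r->virtual_tsc_mult,
				   r->virtual_tsc_shift);
	else
		tsc -= scale_delta((uint64_t)(-delta), r->virtual_tsc_mult,
				   r->virtual_tsc_shift);

	return tsc;
}
```

With virtual_tsc_mult = 0x80000000 and virtual_tsc_shift = 2 (two guest
cycles per nanosecond), a reference of 5000 cycles written at 1000 ns
yields 6000 cycles at 1500 ns and 4800 cycles at 900 ns. Passing the
negative delta straight into the unsigned helper would instead scale a
value close to 2^64.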
 arch/x86/kvm/x86.c | 61 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 44 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5cd92f4b4c97..a78adef698bd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2488,10 +2488,19 @@ static int kvm_set_tsc_khz(struct kvm_vcpu *vcpu, u32 user_tsc_khz)
 
 static u64 compute_guest_tsc(struct kvm_vcpu *vcpu, s64 kernel_ns)
 {
-	u64 tsc = pvclock_scale_delta(kernel_ns-vcpu->arch.this_tsc_nsec,
-				      vcpu->arch.virtual_tsc_mult,
-				      vcpu->arch.virtual_tsc_shift);
-	tsc += vcpu->arch.this_tsc_write;
+	s64 delta = kernel_ns - vcpu->arch.this_tsc_nsec;
+	u64 tsc = vcpu->arch.this_tsc_write;
+
+	/* pvclock_scale_delta cannot cope with negative deltas */
+	if (delta >= 0)
+		tsc += pvclock_scale_delta(delta,
+					   vcpu->arch.virtual_tsc_mult,
+					   vcpu->arch.virtual_tsc_shift);
+	else
+		tsc -= pvclock_scale_delta(-delta,
+					   vcpu->arch.virtual_tsc_mult,
+					   vcpu->arch.virtual_tsc_shift);
+
 	return tsc;
 }
 
@@ -2502,7 +2511,7 @@ static inline bool gtod_is_based_on_tsc(int mode)
 }
 #endif
 
-static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu, bool new_generation)
+static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_X86_64
 	struct kvm_arch *ka = &vcpu->kvm->arch;
@@ -2519,12 +2528,9 @@ static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu, bool new_generation)
 
 	/*
 	 * Request a masterclock update if the masterclock needs to be toggled
-	 * on/off, or when starting a new generation and the masterclock is
-	 * enabled (compute_guest_tsc() requires the masterclock snapshot to be
-	 * taken _after_ the new generation is created).
+	 * on/off.
 	 */
-	if ((ka->use_master_clock && new_generation) ||
-	    (ka->use_master_clock != use_master_clock))
+	if ((ka->use_master_clock != use_master_clock))
 		kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);
 
 	trace_kvm_track_tsc(vcpu->vcpu_id, ka->nr_vcpus_matched_tsc,
@@ -2702,7 +2708,7 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
 	vcpu->arch.this_tsc_nsec = kvm->arch.cur_tsc_nsec;
 	vcpu->arch.this_tsc_write = kvm->arch.cur_tsc_write;
 
-	kvm_track_tsc_matching(vcpu, !matched);
+	kvm_track_tsc_matching(vcpu);
 }
 
 static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 *user_value)
@@ -3296,8 +3302,6 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 		kernel_ns = get_kvmclock_base_ns();
 	}
 
-	tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
-
 	/*
 	 * We may have to catch up the TSC to match elapsed wall clock
 	 * time for two reasons, even if kvmclock is used.
@@ -3309,11 +3313,34 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	 *	very slowly.
 	 */
 	if (vcpu->tsc_catchup) {
-		u64 tsc = compute_guest_tsc(v, kernel_ns);
-		if (tsc > tsc_timestamp) {
-			adjust_tsc_offset_guest(v, tsc - tsc_timestamp);
-			tsc_timestamp = tsc;
+		uint64_t now_guest_tsc_adjusted;
+		uint64_t now_guest_tsc_unadjusted;
+		int64_t now_guest_tsc_delta;
+
+		tsc_timestamp = compute_guest_tsc(v, kernel_ns);
+
+		if (use_master_clock) {
+			uint64_t now_host_tsc;
+			int64_t now_kernel_ns;
+
+			if (!kvm_get_time_and_clockread(&now_kernel_ns, &now_host_tsc)) {
+				now_kernel_ns = get_kvmclock_base_ns();
+				now_host_tsc = rdtsc();
+			}
+			now_guest_tsc_adjusted = compute_guest_tsc(v, now_kernel_ns);
+			now_guest_tsc_unadjusted = kvm_read_l1_tsc(v, now_host_tsc);
+		} else {
+			now_guest_tsc_adjusted = tsc_timestamp;
+			now_guest_tsc_unadjusted = kvm_read_l1_tsc(v, host_tsc);
 		}
+
+		now_guest_tsc_delta = now_guest_tsc_adjusted -
+			now_guest_tsc_unadjusted;
+
+		if (now_guest_tsc_delta > 0)
+			adjust_tsc_offset_guest(v, now_guest_tsc_delta);
+	} else {
+		tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
 	}
 
 	local_irq_restore(flags);
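
The difference between adjusting for the master clock reference point and
adjusting for "now" can be sketched with a simplified user-space model.
Everything here is an assumption for illustration only: struct
clock_sample, struct vcpu_model and the three helpers are hypothetical,
the guest rate is an integer number of cycles per nanosecond rather than
a mult/shift pair, and the unadjusted guest TSC is modelled as host TSC
plus a fixed offset.

```c
#include <stdint.h>

/* Hypothetical pairing of kernel time with the host TSC read at that time. */
struct clock_sample {
	int64_t  kernel_ns;
	uint64_t host_tsc;
};

/* Hypothetical, simplified vCPU state (integer rate instead of mult/shift). */
struct vcpu_model {
	int64_t  ref_ns;              /* time of the last guest TSC write */
	uint64_t ref_guest_tsc;       /* guest TSC value written at ref_ns */
	int64_t  guest_cycles_per_ns; /* guest TSC frequency */
	uint64_t l1_tsc_offset;       /* guest TSC = host TSC + offset */
};

/* Guest TSC the guest should observe at time ns (the "adjusted" value). */
static uint64_t guest_tsc_expected(const struct vcpu_model *m, int64_t ns)
{
	/* Unsigned wraparound keeps this correct for ns < ref_ns as well. */
	return m->ref_guest_tsc +
	       (uint64_t)((ns - m->ref_ns) * m->guest_cycles_per_ns);
}

/* Guest TSC produced by the current offset alone (the "unadjusted" value). */
static uint64_t guest_tsc_unadjusted(const struct vcpu_model *m,
				     uint64_t host_tsc)
{
	return host_tsc + m->l1_tsc_offset;
}

/* Cycles the offset must advance so the guest sees the expected TSC at s. */
static int64_t catchup_delta(const struct vcpu_model *m,
			     const struct clock_sample *s)
{
	return (int64_t)(guest_tsc_expected(m, s->kernel_ns) -
			 guest_tsc_unadjusted(m, s->host_tsc));
}
```

With the host TSC at one cycle per nanosecond, the guest at two, and both
counters starting from zero, a master clock sample taken at 1000 ns gives
a delta of 1000 cycles, while a sample taken "now" at 1500 ns gives 1500.
Advancing only by the former leaves the guest 500 cycles short, which is
the shortfall the commit message describes.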