[RFC] KVM: x86: Allow userspace exit on HLT and MWAIT, else yield on MWAIT

From: David Woodhouse <dwmw@amazon.co.uk>

The VMM may have work to do on behalf of the guest, and it's often
desirable to use the cycles when the vCPUs are idle.

When the vCPU uses HLT this works out OK because the VMM can run its
tasks in a separate thread which gets scheduled when the in-kernel
emulation of HLT schedules away. It isn't perfect, because it doesn't
easily allow for handling both low-priority maintenance tasks, where
the VMM wants to wait until the vCPU is idle, and higher-priority tasks,
where the VMM does want to preempt the vCPU. It can also lead to noisy
neighbour effects when a host isn't necessarily sized to expect any
given VMM to suddenly be contending for many *more* pCPUs than it has
vCPUs.

In addition, there are times when we need to expose MWAIT to a guest
for compatibility with a previous environment. And MWAIT is much harder
because it's very hard to emulate properly.

There were attempts at doing so based on marking the target page read-
only in MONITOR and triggering the wake when it takes a minor fault,
but so far they haven't led to a working solution:
https://www.contrib.andrew.cmu.edu/~somlo/OSXKVM/mwait.html

So when a guest executes MWAIT, either we've disabled exit-on-mwait and
the guest actually sits in non-root mode hogging the pCPU, or if we do
enable exit-on-mwait the kernel just treats it as a NOP and bounces
right back into the guest to busy-wait round its idle loop.

For a start, we can stick a yield() into that busy-loop. The yield()
has fairly poorly defined semantics, but it's better than *nothing* and
does allow a VMM's thread-based I/O and maintenance tasks to run a
*little* better.

Better still, we can bounce all the way out to *userspace* on an MWAIT
exit, and let the VMM perform some of its pending work right there and
then in the vCPU thread before re-entering the vCPU. That's much nicer
than yield(). The vCPU is still runnable, since we still don't have a
*real* emulation of MWAIT, so the vCPU thread can do a *little* bit of
work and then go back into the vCPU for another turn around the loop.
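
As a rough userspace-side illustration of that loop (not the code in
this patch), here is a minimal sketch with the KVM_RUN ioctl replaced by
a callback so it can run standalone. Every name in it (sketch_exit,
work_queue, vcpu_loop, fake_guest) is hypothetical, and the per-exit
work budget is an assumed policy rather than anything the patch defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exit reasons for this sketch only. The real ones live in
 * <linux/kvm.h>, and KVM_EXIT_MWAIT is merely proposed by this RFC. */
enum sketch_exit { SKETCH_EXIT_MWAIT, SKETCH_EXIT_SHUTDOWN };

/* Pending VMM work, modelled as plain counters. */
struct work_queue {
	unsigned int pending;
	unsigned int done;
};

/* Run at most 'budget' items so the guest is not kept waiting for long. */
unsigned int run_pending_work(struct work_queue *q, unsigned int budget)
{
	unsigned int n = 0;

	while (q->pending > 0 && n < budget) {
		q->pending--;
		q->done++;
		n++;
	}
	return n;
}

/* Stand-in for ioctl(vcpu_fd, KVM_RUN): reports why the guest exited. */
typedef enum sketch_exit (*enter_guest_fn)(void *ctx);

/* vCPU thread loop. Returns the number of guest entries performed. */
unsigned int vcpu_loop(enter_guest_fn enter, void *ctx,
		       struct work_queue *q, unsigned int budget)
{
	unsigned int entries = 0;

	assert(enter != NULL && q != NULL);
	for (;;) {
		enum sketch_exit reason = enter(ctx);

		entries++;
		if (reason == SKETCH_EXIT_SHUTDOWN)
			return entries;
		/* MWAIT exit: the vCPU is still runnable, so do a *little*
		 * work here and go straight back into the guest. */
		run_pending_work(q, budget);
	}
}

/* Fake guest for demonstration: executes MWAIT a few times, then stops. */
struct fake_guest {
	unsigned int mwaits_left;
};

enum sketch_exit fake_enter(void *ctx)
{
	struct fake_guest *g = ctx;

	if (g->mwaits_left > 0) {
		g->mwaits_left--;
		return SKETCH_EXIT_MWAIT;
	}
	return SKETCH_EXIT_SHUTDOWN;
}
```

The budget matters because the guest believes it is idling in MWAIT and
expects to notice a wakeup promptly; a real VMM would probably bound the
work by time rather than by item count.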

And if we're going to do that kind of task processing for MWAIT-idle
guests directly from the vCPU thread, it's neater to do it for HLT-idle
guests that way too.

For HLT, the vCPU *isn't* runnable; it'll be in KVM_MP_STATE_HALTED.
The VMM can poll the mp_state and know when the vCPU should be run
again. It can't use poll() for that, although we might want to hook up
something like that (or just a signal or eventfd) for other reasons for
VSM anyway.
The VMM can also just do some work and then re-enter the vCPU without
the corresponding bit set in the kvm_run struct.
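
One way a VMM might choose between those options after an HLT exit is
sketched below. This is an assumed policy, not something the patch
mandates. The struct mirrors the single-field layout of struct
kvm_mp_state as returned by KVM_GET_MP_STATE, and the two constants copy
the values of KVM_MP_STATE_RUNNABLE and KVM_MP_STATE_HALTED from
<linux/kvm.h>; the SKETCH_ and STEP_ names themselves are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local copies so the sketch builds without kernel headers. */
#define SKETCH_MP_STATE_RUNNABLE 0
#define SKETCH_MP_STATE_HALTED   3

/* Same shape as struct kvm_mp_state. */
struct sketch_mp_state {
	unsigned int mp_state;
};

enum next_step {
	/* The vCPU has been woken: enter it again. */
	STEP_RUN_GUEST,
	/* Still halted and the VMM has work: do some, then look at
	 * mp_state again. */
	STEP_DO_WORK_AND_RECHECK,
	/* Still halted and nothing to do: enter without the exit-on-HLT
	 * bit so the in-kernel HLT emulation blocks as it does today. */
	STEP_RUN_GUEST_BLOCKING
};

enum next_step after_hlt_exit(const struct sketch_mp_state *mp,
			      bool work_pending)
{
	assert(mp != NULL);

	if (mp->mp_state != SKETCH_MP_STATE_HALTED)
		return STEP_RUN_GUEST;
	return work_pending ? STEP_DO_WORK_AND_RECHECK
			    : STEP_RUN_GUEST_BLOCKING;
}
```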

So, er, what does this patch do? Add a capability, define two bits for
exiting to userspace on HLT or MWAIT — in the kvm_run struct rather
than needing a separate ioctl to turn them on or off, so that the VMM
can make the decision each time it enters the vCPU. Hook it up to
(ab?)use the existing KVM_EXIT_HLT which was previously only used when
the local APIC was emulated in userspace, and add a new KVM_EXIT_MWAIT.
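
To show why per-entry bits are convenient, here is a small sketch of a
VMM choosing them just before each entry. The bit names, the
userspace_exits field and the 'supported' mask are all placeholders
invented for this example; they are assumptions, not the identifiers or
the capability semantics this patch actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request bits, one per instruction. */
#define SKETCH_EXIT_ON_HLT   (UINT32_C(1) << 0)
#define SKETCH_EXIT_ON_MWAIT (UINT32_C(1) << 1)

/* Stand-in for the relevant part of the shared kvm_run struct. */
struct sketch_run {
	uint32_t userspace_exits;
};

/*
 * Called right before every guest entry. 'supported' is whatever mask
 * the VMM learned when it probed the capability (assumed here to be a
 * bitmask). No extra ioctl is needed to flip the bits between entries.
 */
void prepare_entry(struct sketch_run *run, bool work_pending,
		   uint32_t supported)
{
	uint32_t want = 0;

	assert(run != NULL);
	if (work_pending)
		want = SKETCH_EXIT_ON_HLT | SKETCH_EXIT_ON_MWAIT;
	run->userspace_exits = want & supported;
}
```

With nothing queued the VMM leaves both bits clear and the kernel
behaves as before; as soon as work shows up, the next entry asks for the
exits.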

Fairly much untested.

If this approach seems reasonable, of course I'll add test cases and
proper documentation before posting it for real. This is the proof of
concept, before we even put it through testing to see what performance
we get out of it, especially for those obnoxious MWAIT-enabled guests.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>

Message ID	1b52b557beb6606007f7ec5672eab0adf1606a34.camel@infradead.org (mailing list archive)
State	New, archived
Date	Mon, 18 Sep 2023 11:06:24 +0200
To	kvm@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>
Cc	Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>, linux-kernel@vger.kernel.org, graf@amazon.de, Nicolas Saenz Julienne <nsaenz@amazon.es>, "Griffoul, Fred" <fgriffo@amazon.com>
