[0/2] x86/kvm: Force legacy PCI hole as WB under SNP/TDX

Message ID 20250201005048.657470-1-seanjc@google.com (mailing list archive)

Message

Sean Christopherson Feb. 1, 2025, 12:50 a.m. UTC
Attempt to hack around the SNP/TDX guest MTRR disaster by hijacking
x86_platform.is_untracked_pat_range() to force the legacy PCI hole, i.e.
memory from TOLUD => 4GiB, as unconditionally writeback.
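
For illustration only, the range test this implies can be sketched as a small
userspace predicate. This is not the code from the patches: the function name
and the explicit tolud parameter are hypothetical, and the only facts assumed
are that the hook is asked about a [start, end) physical range and that the
legacy PCI hole spans TOLUD (top of low usable DRAM) up to 4GiB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_SZ_4G (1ULL << 32)

/*
 * Hypothetical helper: return true if the physical range [start, end)
 * lies entirely inside the legacy PCI hole, i.e. [tolud, 4GiB).
 * A range that only partially overlaps the hole is not matched.
 */
bool example_in_legacy_pci_hole(uint64_t start, uint64_t end, uint64_t tolud)
{
	assert(start <= end);	/* callers pass a well-formed range */

	return start >= tolud && end <= EXAMPLE_SZ_4G;
}
```

With an assumed TOLUD of 2GiB, a page at 0xC0000000 would match, while a page
in low RAM or a range that crosses the 4GiB boundary would not.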

TDX in particular has created an impossible situation with MTRRs.  Because
TDX disallows toggling CR0.CD, TDX enabling decided the easiest solution
was to ignore MTRRs entirely (because omitting the CR0.CD write is obviously
too simple).

Unfortunately, under KVM at least, the kernel subtly relies on MTRRs to
make ACPI play nice with device drivers.  ACPI tries to map ranges it finds
as WB, which in turn prevents device drivers from mapping device memory as
WC/UC-.
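
The conflict described above can be pictured with a toy model. This is a
sketch under stated assumptions, not the kernel's memory-type tracking code:
all names are hypothetical, and the model simply assumes that a new mapping
request is refused when it overlaps an existing mapping of a different cache
type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache types, named after the ones mentioned in the text. */
enum example_memtype { EXAMPLE_WB, EXAMPLE_WC, EXAMPLE_UC_MINUS };

/* An existing mapping of the physical range [start, end). */
struct example_reservation {
	uint64_t start;
	uint64_t end;
	enum example_memtype type;
};

/*
 * Return true if a request for [start, end) with the given type overlaps
 * the existing reservation and asks for a different cache type.
 */
bool example_memtype_conflicts(const struct example_reservation *existing,
			       uint64_t start, uint64_t end,
			       enum example_memtype type)
{
	bool overlaps;

	assert(start < end);	/* empty ranges are not meaningful here */

	overlaps = start < existing->end && existing->start < end;
	return overlaps && existing->type != type;
}
```

In this model, once an ACPI-style WB mapping covers a device's memory, a later
driver request for WC or UC- over the same range is reported as a conflict,
while a WB request or a non-overlapping range is not.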

For the record, I hate this hack.  But it's the safest approach I can come
up with.  E.g. forcing ioremap() to always use WB scares me because it's
possible, however unlikely, that the kernel could try to map non-emulated
memory (that is presented as MMIO to the guest) as WC/UC-, and silently
forcing those mappings to WB could do weird things.

My initial thought was to effectively revert the offending commit and
skip the cache disabling/enabling, i.e. the problematic CR0.CD toggling,
but unfortunately OVMF/EDKII has also added code to skip MTRR setup. :-(

Sean Christopherson (2):
  x86/mtrr: Return success vs. "failure" from guest_force_mtrr_state()
  x86/kvm: Override low memory above TOLUD to WB when MTRRs are forced
    WB

 arch/x86/include/asm/mtrr.h        |  5 +++--
 arch/x86/kernel/cpu/mtrr/generic.c | 11 +++++++----
 arch/x86/kernel/kvm.c              | 31 ++++++++++++++++++++++++++++--
 3 files changed, 39 insertions(+), 8 deletions(-)


base-commit: fd8c09ad0d87783b9b6a27900d66293be45b7bad

Comments

Dionna Amalie Glaze Feb. 1, 2025, 2:25 p.m. UTC | #1
On Fri, Jan 31, 2025 at 4:50 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Attempt to hack around the SNP/TDX guest MTRR disaster by hijacking
> x86_platform.is_untracked_pat_range() to force the legacy PCI hole, i.e.
> memory from TOLUD => 4GiB, as unconditionally writeback.
>
> TDX in particular has created an impossible situation with MTRRs.  Because
> TDX disallows toggling CR0.CD, TDX enabling decided the easiest solution
> was to ignore MTRRs entirely (because omitting the CR0.CD write is obviously
> too simple).
>
> Unfortunately, under KVM at least, the kernel subtly relies on MTRRs to
> make ACPI play nice with device drivers.  ACPI tries to map ranges it finds
> as WB, which in turn prevents device drivers from mapping device memory as
> WC/UC-.
>
> For the record, I hate this hack.  But it's the safest approach I can come
> up with.  E.g. forcing ioremap() to always use WB scares me because it's
> possible, however unlikely, that the kernel could try to map non-emulated
> memory (that is presented as MMIO to the guest) as WC/UC-, and silently
> forcing those mappings to WB could do weird things.
>
> My initial thought was to effectively revert the offending commit and
> skip the cache disabling/enabling, i.e. the problematic CR0.CD toggling,
> but unfortunately OVMF/EDKII has also added code to skip MTRR setup. :-(
>

EDK2 has a bug tracker. Maybe this is still fixable on Intel's end.
Adding Qinglan, Isaku, and Min to comment.

> Sean Christopherson (2):
>   x86/mtrr: Return success vs. "failure" from guest_force_mtrr_state()
>   x86/kvm: Override low memory above TOLUD to WB when MTRRs are forced
>     WB
>
>  arch/x86/include/asm/mtrr.h        |  5 +++--
>  arch/x86/kernel/cpu/mtrr/generic.c | 11 +++++++----
>  arch/x86/kernel/kvm.c              | 31 ++++++++++++++++++++++++++++--
>  3 files changed, 39 insertions(+), 8 deletions(-)
>
>
> base-commit: fd8c09ad0d87783b9b6a27900d66293be45b7bad
> --
> 2.48.1.362.g079036d154-goog
>