Message ID | 552B6923.3020602@siemens.com (mailing list archive)
---|---
State | New, archived
2015-04-13 08:58+0200, Jan Kiszka:
> When hardware supports the g_pat VMCB field, we can use it for emulating
> the PAT configuration that the guest configures by writing to the
> corresponding MSR.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
2015-04-13 08:58+0200, Jan Kiszka:
> When hardware supports the g_pat VMCB field, we can use it for emulating
> the PAT configuration that the guest configures by writing to the
> corresponding MSR.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
>
> RFC because it is only compile-tested.
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> @@ -3245,6 +3245,15 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  	case MSR_VM_IGNNE:
>  		vcpu_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, data);
>  		break;
> +	case MSR_IA32_CR_PAT:
> +		if (npt_enabled) {
> +			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> +				return 1;
> +			svm->vmcb->save.g_pat = data;
> +			vcpu->arch.pat = data;

Disregarding my Reviewed-by, the code is missing:

  mark_dirty(svm->vmcb, VMCB_NPT);

Also,

Tested-by: Radim Krčmář <rkrcmar@redhat.com>
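(For illustration, the quoted hunk with the suggested mark_dirty() call folded
in might read as follows; a sketch only, not the patch that was applied:)

	case MSR_IA32_CR_PAT:
		if (npt_enabled) {
			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
				return 1;
			vcpu->arch.pat = data;
			svm->vmcb->save.g_pat = data;
			/* Clear the VMCB clean bit for the NP state so the
			 * CPU reloads g_pat on the next VMRUN. */
			mark_dirty(svm->vmcb, VMCB_NPT);
			break;
		}
		/* fall through */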
2015-04-20 18:14+0200, Radim Krčmář:
> Tested-by: Radim Krčmář <rkrcmar@redhat.com>
Uncached accesses were roughly 20x slower.
In case anyone wanted to reproduce, I used this as a kvm-unit-test:
---
#include "processor.h"
#define NR_TOP_LOOPS 24
#define NR_MEM_LOOPS 10
#define MEM_ELEMENTS 1024
static volatile u64 pat_test_memory[MEM_ELEMENTS];
static void flush_tlb(void)
{
	write_cr3(read_cr3());
}
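
/* Program IA32_PAT (MSR 0x277) and flush the TLB so the new caching
 * attributes take effect on subsequent accesses. */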
static void set_pat(u64 val)
{
	wrmsr(0x277, val);
	flush_tlb();
}
static u64 time_memory_accesses(void)
{
	u64 tsc_before = rdtsc();

	for (unsigned loop = 0; loop < NR_MEM_LOOPS; loop++)
		for (unsigned i = 0; i < MEM_ELEMENTS; i++)
			pat_test_memory[i]++;

	return rdtsc() - tsc_before;
}
int main(int argc, char **argv)
{
	unsigned error = 0;

	for (unsigned loop = 0; loop < NR_TOP_LOOPS; loop++) {
		u64 time_uc, time_wb;
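
		/* PAT value 0 makes every entry UC;
		 * 0x0606060606060606 makes every entry WB. */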
		set_pat(0);
		time_uc = time_memory_accesses();

		set_pat(0x0606060606060606ULL);
		time_wb = time_memory_accesses();
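
		/* Count an error unless UC accesses were at least 4x
		 * slower than WB accesses. */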
		if (time_uc < time_wb * 4)
			error++;

		printf("%02d uc: %10lld wb: %8lld\n", loop, time_uc, time_wb);
	}

	report("guest PAT", !error);

	return report_summary();
}
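
(If the test were added to the tree, a hypothetical x86/unittests.cfg entry
along these lines could be used to run it from the kvm-unit-tests harness;
the test and file names are made up:)

[pat]
file = pat.flat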
On 2015-04-20 19:16, Radim Krčmář wrote:
> 2015-04-20 18:14+0200, Radim Krčmář:
>> Tested-by: Radim Krčmář <rkrcmar@redhat.com>
>
> Uncached accesses were roughly 20x slower.
> In case anyone wanted to reproduce, I used this as a kvm-unit-test:
>
> ---
> [code]

Great, thanks. Will you push it to the unit tests? Could raise
motivation to fix the !NPT/EPT case.

Jan
2015-04-20 19:16+0200, Radim Krčmář:
> Uncached accesses were roughly 20x slower.
Sorry, a zero is missing there ... they were 200 times slower.
2015-04-20 19:21+0200, Jan Kiszka:
> On 2015-04-20 19:16, Radim Krčmář wrote:
>> 2015-04-20 18:14+0200, Radim Krčmář:
>>> Tested-by: Radim Krčmář <rkrcmar@redhat.com>
>>
>> Uncached accesses were roughly 20x slower.
>> In case anyone wanted to reproduce, I used this as a kvm-unit-test:
>>
>> ---
>> [code]
>
> Great, thanks. Will you push it to the unit tests? Could raise
> motivation to fix the !NPT/EPT case.

It can't be included in `run_tests.sh`, because we intentionally ignore
PAT for normal RAM on VMX, so the test does "fail" ...

I'll think about how to make the test fool-proof first, and also look at how
to fix the !NPT/EPT case without affecting the case we care about too much.
(And whether we can do a similar trick with NPT.)
On 2015-04-20 19:33, Radim Krčmář wrote:
> 2015-04-20 19:21+0200, Jan Kiszka:
>> Great, thanks. Will you push it to the unit tests? Could raise
>> motivation to fix the !NPT/EPT case.
>
> It can't be included in `run_tests.sh`, because we intentionally ignore
> PAT for normal RAM on VMX, so the test does "fail" ...

That ignoring is encoded into the EPT? Hmm... Maybe we can create an
ivshmem device and use that as the test target.

> I'll think about how to make the test fool-proof first, and also look at how
> to fix the !NPT/EPT case without affecting the case we care about too much.
> (And whether we can do a similar trick with NPT.)

OK.

Jan
On 2015-04-20 19:37, Jan Kiszka wrote:
> On 2015-04-20 19:33, Radim Krčmář wrote:
>> It can't be included in `run_tests.sh`, because we intentionally ignore
>> PAT for normal RAM on VMX, so the test does "fail" ...
>
> That ignoring is encoded into the EPT? Hmm... Maybe we can create an
> ivshmem device and use that as the test target.

And do you also know why it is ignored on Intel? Side effects on the host?

Jan
2015-04-20 19:45+0200, Jan Kiszka:
> On 2015-04-20 19:37, Jan Kiszka wrote:
>> That ignoring is encoded into the EPT?

Yes, it's the VMX_EPT_IPAT_BIT.

> And do you also know why it is ignored on Intel? Side effects on the host?

I think it is an optimization exclusive to Intel. We know that the other
end is not real hardware that could bypass the CPU caches when accessing
memory, so there is little reason to slow the guest down.

>> Hmm... Maybe we can create an
>> ivshmem device and use that as the test target.

Good idea, thanks. (I haven't used it yet, so parts of it might be able to
do what is needed without creating a dependency on the whole ivshmem
system.)
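(For context, the VMX-side behaviour being discussed can be sketched roughly
as follows; the function name and body are a paraphrase for illustration, not
verbatim kernel code:)

/* Sketch: EPT leaves get WB plus the "ignore PAT" bit unless the VM has
 * non-coherent DMA, in which case the guest memory type is honoured. */
static u64 vmx_get_mt_mask_sketch(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
{
	if (is_mmio)
		return MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;

	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) |
		       VMX_EPT_IPAT_BIT;

	/* Non-coherent DMA present: honour the guest MTRRs/PAT. */
	return kvm_get_guest_memory_type(vcpu, gfn) << VMX_EPT_MT_EPTE_SHIFT;
}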
On 2015-04-20 20:33, Radim Krčmář wrote:
> 2015-04-20 19:45+0200, Jan Kiszka:
>> And do you also know why it is ignored on Intel? Side effects on the host?
>
> I think it is an optimization exclusive to Intel. We know that the other
> end is not real hardware that could bypass the CPU caches when accessing
> memory, so there is little reason to slow the guest down.

If the guest pushes data for DMA into RAM, it may assume that the data lands
there directly, without the need for explicit flushes, because it has
caching disabled - no?

Jan
2015-04-20 20:41+0200, Jan Kiszka:
> If the guest pushes data for DMA into RAM, it may assume that the data lands
> there directly, without the need for explicit flushes, because it has
> caching disabled - no?

Yes, the guest can't tell the difference.

In the presence of an assigned device, we always consider PAT; with VFIO,
ignoring PAT is decided by the coherent-DMA capability. Emulated DMA
requests are handled by the host, so the IPAT bit optimizes the cases where
we know that the caches are always right, but the guest doesn't.
On 20/04/2015 20:41, Jan Kiszka wrote:
> If the guest pushes data for DMA into RAM, it may assume that the data lands
> there directly, without the need for explicit flushes, because it has
> caching disabled - no?

Yes, but Intel IOMMUs can have snooping control, and in that case you can
just set memory to WB. On Intel, KVM trusts the guest's PAT if the IOMMU is
in use and you do not have an Intel IOMMU with snooping control; in this
case kvm_arch_has_noncoherent_dma(vcpu->kvm) returns true.

The same should work for AMD, so you can set the gPAT:

- to the guest's value if kvm_arch_has_noncoherent_dma(vcpu->kvm), and then
  you return cachemode2protval(kvm_get_guest_memory_type(...)) from
  svm_get_mt_mask to layer the guest MTRRs on top of the guest PAT;

- otherwise, to all WB (0x0606060606060606), and then you can return either
  0 or _PAGE_NOCACHE from svm_get_mt_mask to achieve either UC (for MMIO
  regions) or WB (for everything else).

To sum up, you have:

             IOMMU?                      no IOMMU?
guest PAT    obeyed                      ignored (all WB)
guest MTRR   obeyed (svm_get_mt_mask)    ignored (UC if MMIO, else WB)
host PAT     always ignored              always ignored
host MTRR    always obeyed               always obeyed

I think that kvm_arch_has_noncoherent_dma() can be resampled, and gPAT
updated, in wbinvd_interception.

Paolo
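(A rough sketch of that proposal on the SVM side follows; the helper names
and structure are invented for illustration, it assumes the g_pat patch
below, and it is not existing code:)

/* Sketch: pick the effective gPAT depending on whether the VM has
 * non-coherent DMA, and mark the NP VMCB state dirty so it is reloaded. */
static void svm_update_gpat_sketch(struct vcpu_svm *svm)
{
	struct kvm_vcpu *vcpu = &svm->vcpu;

	if (kvm_arch_has_noncoherent_dma(vcpu->kvm))
		/* Non-coherent DMA present: obey the guest's PAT. */
		svm->vmcb->save.g_pat = vcpu->arch.pat;
	else
		/* Otherwise force WB in all eight PAT entries. */
		svm->vmcb->save.g_pat = 0x0606060606060606ULL;

	mark_dirty(svm->vmcb, VMCB_NPT);
}

/* Sketch: NPT memory type, mirroring the summary table above. */
static u64 svm_get_mt_mask_sketch(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
{
	if (kvm_arch_has_noncoherent_dma(vcpu->kvm))
		/* Layer the guest MTRRs on top of the guest PAT. */
		return cachemode2protval(kvm_get_guest_memory_type(vcpu, gfn));

	/* UC for MMIO regions, WB for everything else. */
	return is_mmio ? _PAGE_NOCACHE : 0;
}

(The gPAT update would then run on guest PAT writes and be re-evaluated in
wbinvd_interception, as suggested above.)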
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ce741b8..9439c6c 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3245,6 +3245,15 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	case MSR_VM_IGNNE:
 		vcpu_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, data);
 		break;
+	case MSR_IA32_CR_PAT:
+		if (npt_enabled) {
+			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
+				return 1;
+			svm->vmcb->save.g_pat = data;
+			vcpu->arch.pat = data;
+			break;
+		}
+		/* fall through */
 	default:
 		return kvm_set_msr_common(vcpu, msr);
 	}
When hardware supports the g_pat VMCB field, we can use it for emulating
the PAT configuration that the guest configures by writing to the
corresponding MSR.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---

RFC because it is only compile-tested.

 arch/x86/kvm/svm.c | 9 +++++++++
 1 file changed, 9 insertions(+)