
[RFC] KVM: SVM: Sync g_pat with guest-written PAT value

Message ID 552B6923.3020602@siemens.com (mailing list archive)
State New, archived

Commit Message

Jan Kiszka April 13, 2015, 6:58 a.m. UTC
When hardware supports the g_pat VMCB field, we can use it for emulating
the PAT configuration that the guest configures by writing to the
corresponding MSR.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---

RFC because it is only compile-tested.

 arch/x86/kvm/svm.c | 9 +++++++++
 1 file changed, 9 insertions(+)
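The patch reuses kvm_mtrr_valid() to reject PAT writes that contain reserved memory types. For reference, the architectural rule can be sketched as a standalone check (a hypothetical helper for illustration, not the kernel's implementation): each of the eight byte-wide PAT entries must encode a defined memory type, and encodings 2 and 3 are reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Defined PAT memory types: UC=0, WC=1, WT=4, WP=5, WB=6, UC-=7.
 * Encodings 2 and 3 (and anything above 7) are reserved; writing them
 * to IA32_PAT must raise #GP, which the patch emulates by returning 1. */
static bool pat_entry_valid(uint8_t type)
{
	return type == 0 || type == 1 || type == 4 ||
	       type == 5 || type == 6 || type == 7;
}

static bool pat_valid(uint64_t pat)
{
	for (int i = 0; i < 8; i++)
		if (!pat_entry_valid((pat >> (8 * i)) & 0xff))
			return false;
	return true;
}
```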

Comments

Radim Krčmář April 17, 2015, 4:59 p.m. UTC | #1
2015-04-13 08:58+0200, Jan Kiszka:
> When hardware supports the g_pat VMCB field, we can use it for emulating
> the PAT configuration that the guest configures by writing to the
> corresponding MSR.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Radim Krčmář April 20, 2015, 4:14 p.m. UTC | #2
2015-04-13 08:58+0200, Jan Kiszka:
> When hardware supports the g_pat VMCB field, we can use it for emulating
> the PAT configuration that the guest configures by writing to the
> corresponding MSR.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
> 
> RFC because it is only compile-tested.
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> @@ -3245,6 +3245,15 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  	case MSR_VM_IGNNE:
>  		vcpu_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, data);
>  		break;
> +	case MSR_IA32_CR_PAT:
> +		if (npt_enabled) {
> +			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
> +				return 1;
> +			svm->vmcb->save.g_pat = data;
> +			vcpu->arch.pat = data;

Disregarding my Reviewed-by, the code is missing:

  mark_dirty(svm->vmcb, VMCB_NPT);
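A toy model of why that call matters (field names and the bit position are illustrative, not the kernel's exact definitions): VMCB clean bits let the CPU cache unchanged VMCB state across VMRUNs, so a write to a tracked field must clear the matching bit or the new value may silently not be re-read.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit index for NPT-related VMCB state. */
enum { VMCB_NPT = 4 };

struct vmcb {
	uint64_t g_pat;
	uint32_t clean;		/* set bit = "unchanged, CPU may use cache" */
};

static void mark_dirty(struct vmcb *vmcb, int bit)
{
	vmcb->clean &= ~(1u << bit);
}

static void set_g_pat(struct vmcb *vmcb, uint64_t pat)
{
	vmcb->g_pat = pat;
	mark_dirty(vmcb, VMCB_NPT);	/* the call the review found missing */
}
```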

Also,

Tested-by: Radim Krčmář <rkrcmar@redhat.com>
Radim Krčmář April 20, 2015, 5:16 p.m. UTC | #3
2015-04-20 18:14+0200, Radim Krčmář:
> Tested-by: Radim Krčmář <rkrcmar@redhat.com>

Uncached accesses were roughly 20x slower.
In case anyone wanted to reproduce, I used this as a kvm-unit-test:

---
#include "processor.h"

#define NR_TOP_LOOPS 24
#define NR_MEM_LOOPS 10
#define MEM_ELEMENTS 1024

static volatile u64 pat_test_memory[MEM_ELEMENTS];

static void flush_tlb(void)
{
	write_cr3(read_cr3());
}

static void set_pat(u64 val)
{
	wrmsr(0x277, val);
	flush_tlb();
}

static u64 time_memory_accesses(void)
{
	u64 tsc_before = rdtsc();

	for (unsigned loop = 0; loop < NR_MEM_LOOPS; loop++)
		for (unsigned i = 0; i < MEM_ELEMENTS; i++)
			pat_test_memory[i]++;

	return rdtsc() - tsc_before;
}

int main(int argc, char **argv)
{
	unsigned error = 0;

	for (unsigned loop = 0; loop < NR_TOP_LOOPS; loop++) {
		u64 time_uc, time_wb;

		set_pat(0);
		time_uc = time_memory_accesses();

		set_pat(0x0606060606060606ULL);
		time_wb = time_memory_accesses();

		if (time_uc < time_wb * 4)
			error++;

		printf("%02d uc: %10lld wb: %8lld\n", loop, time_uc, time_wb);
	}

	report("guest PAT", !error);

	return report_summary();
}
Jan Kiszka April 20, 2015, 5:21 p.m. UTC | #4
On 2015-04-20 19:16, Radim Krčmář wrote:
> 2015-04-20 18:14+0200, Radim Krčmář:
>> Tested-by: Radim Krčmář <rkrcmar@redhat.com>
> 
> Uncached accesses were roughly 20x slower.
> In case anyone wanted to reproduce, I used this as a kvm-unit-test:
> 
| [code]

Great, thanks. Will you push it to the unit tests? Could raise
motivation to fix the !NPT/EPT case.

Jan
Radim Krčmář April 20, 2015, 5:22 p.m. UTC | #5
2015-04-20 19:16+0200, Radim Krčmář:
> Uncached accesses were roughly 20x slower.

Sorry, a zero is missing there ... they were 200 times slower.
Radim Krčmář April 20, 2015, 5:33 p.m. UTC | #6
2015-04-20 19:21+0200, Jan Kiszka:
> Great, thanks. Will you push it to the unit tests? Could raise
> motivation to fix the !NPT/EPT case.

It can't be included in `run_tests.sh`, because we intentionally ignore
PAT for normal RAM on VMX and the test does "fail" ...

I'll think about how to make the test fool-proof first, and also look at
how to fix the !NPT/EPT case without affecting the case we care about
too much.
(And if we can do a similar trick with NPT.)
Jan Kiszka April 20, 2015, 5:37 p.m. UTC | #7
On 2015-04-20 19:33, Radim Krčmář wrote:
> It can't be included in `run_tests.sh`, because we intentionally ignore
> PAT for normal RAM on VMX and the test does "fail" ...

That ignoring is encoded into the EPT? Hmm... Maybe we can create an
ivshmem device and use that as a test target.

> 
> I'll think about how to make the test fool-proof first, and also look at
> how to fix the !NPT/EPT case without affecting the case we care about
> too much. (And if we can do a similar trick with NPT.)
> 

OK.

Jan
Jan Kiszka April 20, 2015, 5:45 p.m. UTC | #8
On 2015-04-20 19:37, Jan Kiszka wrote:
> That ignoring is encoded into the EPT? Hmm... Maybe we can create an
> ivshmem device and use that as a test target.

And do you also know why it is ignored on Intel? Side effects on the host?

Jan
Radim Krčmář April 20, 2015, 6:33 p.m. UTC | #9
2015-04-20 19:45+0200, Jan Kiszka:
> On 2015-04-20 19:37, Jan Kiszka wrote:
> > That ignoring is encoded into the EPT?

Yes, it's the VMX_EPT_IPAT_BIT.

> And do you also know why it is ignored on Intel? Side effects on the host?

I think it is an optimization exclusive to Intel.
We know that the other side is not real hardware, which could avoid CPU
caches when accessing memory, so there is little reason to slow the
guest down.

> > Hmm... Maybe we can create an ivshmem device and use that as a test
> > target.

Good idea, thanks.
(Haven't used it yet, so its parts might be able to do what is needed
 without creating a dependency on the whole ivshmem system.)
Jan Kiszka April 20, 2015, 6:41 p.m. UTC | #10
On 2015-04-20 20:33, Radim Krčmář wrote:
> 2015-04-20 19:45+0200, Jan Kiszka:
>>> That ignoring is encoded into the EPT?
> 
> Yes, it's the VMX_EPT_IPAT_BIT.
> 
>> And do you also know why it is ignored on Intel? Side effects on the host?
> 
> I think it is an optimization exclusive to Intel.
> We know that the other side is not real hardware, which could avoid CPU
> caches when accessing memory, so there is little reason to slow the
> guest down.

If the guest pushes data for DMA into RAM, it may assume that it lands
there directly, without the need for explicit flushes, because it has
caching disabled - no?

Jan
Radim Krčmář April 21, 2015, 12:10 p.m. UTC | #11
2015-04-20 20:41+0200, Jan Kiszka:
> If the guest pushes data for DMA into RAM, it may assume that it lands
> there directly, without the need for explicit flushes, because it has
> caching disabled - no?

Yes, the guest can't tell the difference.

In the presence of an assigned device, we always consider PAT;
with VFIO, ignoring PAT is decided by coherent DMA capability.
Emulated DMA requests are handled by the host, so the IPAT bit optimizes
cases where we know that caches are always right, but the guest doesn't.
Paolo Bonzini April 21, 2015, 12:11 p.m. UTC | #12
On 20/04/2015 20:41, Jan Kiszka wrote:
> If the guest pushes data for DMA into RAM, it may assume that it lands
> there directly, without the need for explicit flushes, because it has
> caching disabled - no?

Yes, but Intel IOMMUs can have snooping control and in this case you can
just set memory to WB.

On Intel, KVM trusts the guest's PAT if the IOMMU is in use, and you do
not have an Intel IOMMU with snooping control.  In this case
kvm_arch_has_noncoherent_dma(vcpu->kvm) returns true.

The same should work for AMD, so you can set the gPAT:

- to the guest's value if kvm_arch_has_noncoherent_dma(vcpu->kvm), and
then you return cachemode2protval(kvm_get_guest_memory_type(...)) from
svm_get_mt_mask to layer the guest MTRRs on top of the guest PAT.

- otherwise, to all WB (0x0606060606060606), and then you can return
either 0 or _PAGE_NOCACHE from svm_get_mt_mask to achieve either UC (for
MMIO regions) or WB (for everything else).

To sum up you have:

               IOMMU?                      no IOMMU?
guest PAT      obeyed                      ignored (all WB)
guest MTRR     obeyed (svm_get_mt_mask)    ignored (UC if MMIO, else WB)
host PAT       always ignored              always ignored
host MTRR      always obeyed               always obeyed
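The decision table can be condensed into a small pure function. The sketch below models the described policy; the helper names and the flat memory-type encodings are illustrative, not KVM's actual API (noncoherent_dma stands for the result of kvm_arch_has_noncoherent_dma(), guest_mt for the type derived from the guest's MTRRs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAT_ALL_WB 0x0606060606060606ULL	/* every PAT entry write-back */
#define MT_WB 0x06ULL
#define MT_UC 0x00ULL

/* With noncoherent (assigned-device) DMA, obey the guest's PAT;
 * otherwise force all entries to write-back. */
static uint64_t effective_g_pat(bool noncoherent_dma, uint64_t guest_pat)
{
	return noncoherent_dma ? guest_pat : PAT_ALL_WB;
}

/* Model of what svm_get_mt_mask would contribute: layer guest MTRRs on
 * top of guest PAT in the noncoherent case, else UC for MMIO and WB for
 * everything else. */
static uint64_t mt_mask(bool noncoherent_dma, bool is_mmio, uint64_t guest_mt)
{
	if (noncoherent_dma)
		return guest_mt;
	return is_mmio ? MT_UC : MT_WB;
}
```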

I think that kvm_arch_has_noncoherent_dma() can be resampled, and gPAT
updated, in wbinvd_interception.

Paolo

Patch

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ce741b8..9439c6c 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3245,6 +3245,15 @@  static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	case MSR_VM_IGNNE:
 		vcpu_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, data);
 		break;
+	case MSR_IA32_CR_PAT:
+		if (npt_enabled) {
+			if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
+				return 1;
+			svm->vmcb->save.g_pat = data;
+			vcpu->arch.pat = data;
+			break;
+		}
+		/* fall through */
 	default:
 		return kvm_set_msr_common(vcpu, msr);
 	}