| Message ID | 4c04e5661688cf1de3e3fd668b0a78b23b6d7b2e.1698261255.git.edwin.torok@cloud.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | vPMU bugfixes and support for PMUv5 |
On 25.10.2023 21:29, Edwin Török wrote:
> This can be useful if you realize you have to inspect the value of an
> MSR in production, without having to change into a new Xen first that
> handles the MSR.

Yet on a non-pinned Dom0 you'd still be lost. Since iirc we generally
advise against pinning, I wonder of how much use such a change would
be, when it effectively undoes what we deliberately did a while ago.

> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -1933,6 +1933,9 @@ static int cf_check svm_msr_read_intercept(
>          break;
>
>      default:
> +        if ( is_hwdom_pinned_vcpu(v) && !rdmsr_safe(msr, *msr_content) )
> +            break;
> +
>          if ( d->arch.msr_relaxed && !rdmsr_safe(msr, tmp) )
>          {
>              *msr_content = 0;

If we went as far as undoing some of what was done, I'd then wonder
whether instead we should mandate relaxed mode to be enabled on such a
Dom0. Then, instead of returning fake 0 here, the actual value could
be returned in the specific case of (pinned?) Dom0.

Jan
> On 30 Oct 2023, at 16:29, Jan Beulich <jbeulich@suse.com> wrote:
>
> On 25.10.2023 21:29, Edwin Török wrote:
>> This can be useful if you realize you have to inspect the value of an
>> MSR in production, without having to change into a new Xen first that
>> handles the MSR.
>
> Yet on a non-pinned Dom0 you'd still be lost. Since iirc we generally
> advise against pinning,

You can temporarily pin while debugging the issue, e.g. with
'xl vcpu-pin': pin just one vCPU of Dom0 and "walk" all your physical
CPUs with it if you have to, so that you query them all. Although that
is more invasive than just reading a value.

Or alternatively have another (privileged) interface to read the MSR
for a given core without exposing it to any guests; that way you don't
affect the running system at all (which would be preferable in a
production environment), i.e. a Xen equivalent of 'rdmsr'.

> I wonder of how much use such a change would be,
> when it effectively undoes what we deliberately did a while ago.
>
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -1933,6 +1933,9 @@ static int cf_check svm_msr_read_intercept(
>>          break;
>>
>>      default:
>> +        if ( is_hwdom_pinned_vcpu(v) && !rdmsr_safe(msr, *msr_content) )
>> +            break;
>> +
>>          if ( d->arch.msr_relaxed && !rdmsr_safe(msr, tmp) )
>>          {
>>              *msr_content = 0;
>
> If we went as far as undoing some of what was done, I'd then wonder
> whether instead we should mandate relaxed mode to be enabled on such a
> Dom0. Then, instead of returning fake 0 here, the actual value could
> be returned in the specific case of (pinned?) Dom0.

Can relaxed mode be enabled at runtime? I'd be happy with either
solution, but it should be something that can be enabled at runtime
(if you have to reboot Xen, you may lose the bug repro that you want
to gather more information on).

Although changing such a setting in a production environment may still
be risky: the guest may become very confused that it previously read
some 0s, now there are real values, and later, when you flip the switch
off, it gets 0s again.

Best regards,
--Edwin
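[For illustration, the temporary-pinning workflow described above can be sketched as a dry run. The MSR number, pCPU count and domain name are placeholder assumptions; on a real host you would run the echoed 'xl' (Xen toolstack) and 'rdmsr' (msr-tools) commands as root instead of printing them.]

```shell
MSR=0xc0010015   # hypothetical MSR under investigation
NR_PCPUS=4       # e.g. taken from 'xl info'

for pcpu in $(seq 0 $((NR_PCPUS - 1))); do
    echo "xl vcpu-pin Domain-0 0 $pcpu"  # pin Dom0's vCPU 0 to this pCPU
    echo "rdmsr $MSR"                    # then read the MSR from Dom0
done
```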
On 31.10.2023 10:31, Edwin Torok wrote:
>> On 30 Oct 2023, at 16:29, Jan Beulich <jbeulich@suse.com> wrote:
>> On 25.10.2023 21:29, Edwin Török wrote:
>>> This can be useful if you realize you have to inspect the value of an
>>> MSR in production, without having to change into a new Xen first that
>>> handles the MSR.
>>
>> Yet on a non-pinned Dom0 you'd still be lost. Since iirc we generally
>> advise against pinning,
>
> You can temporarily pin while debugging the issue (e.g. pin just 1 CPU
> from Dom0, and "walk" all your physical CPUs with it if you have to,
> so that you query them all), e.g. with 'xl vcpu-pin'.
> Although that is more invasive than reading a value.
>
> Or alternatively have another (privileged) interface to read the MSR
> for a given core without exposing it to any guests, that way you don't
> affect the running system at all (which would be preferable in a
> production environment), i.e. a Xen equivalent of 'rdmsr'.

The interface we have (XENPF_resource_op) is, despite being privileged,
deliberately (so far at least) not permitting access to arbitrary MSRs.
In our old XenoLinux forward port we had an extension to the msr.ko
module to allow pCPU-based MSR accesses (and I had a private extension
to the rdmsr/wrmsr user space tools making use of that), but even that
would have been subject to restrictions enforced by Xen as to which
MSRs are accessible.

>> I wonder of how much use such a change would be,
>> when it effectively undoes what we deliberately did a while ago.
>>
>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>> @@ -1933,6 +1933,9 @@ static int cf_check svm_msr_read_intercept(
>>>          break;
>>>
>>>      default:
>>> +        if ( is_hwdom_pinned_vcpu(v) && !rdmsr_safe(msr, *msr_content) )
>>> +            break;
>>> +
>>>          if ( d->arch.msr_relaxed && !rdmsr_safe(msr, tmp) )
>>>          {
>>>              *msr_content = 0;
>>
>> If we went as far as undoing some of what was done, I'd then wonder
>> whether instead we should mandate relaxed mode to be enabled on such a
>> Dom0. Then, instead of returning fake 0 here, the actual value could
>> be returned in the specific case of (pinned?) Dom0.
>
> Can relaxed mode be enabled at runtime?

Not right now, no. But a hypfs control could certainly be added, with
suitable justification.

> I'd be happy with either solution, but it should be something that can
> be enabled at runtime (if you have to reboot Xen then you may lose the
> bug repro that you want to gather more information on).
> Although changing such a setting in a production environment may still
> be risky, because the guest will then become very confused that it has
> previously read some 0s, now there are some real values, and later
> when you flip the switch off it gets 0s again.

Indeed. If you flipped such a control for any domain at runtime, you'd
better first check that this wouldn't cause any such issues.

Jan
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 24c417ca71..45f8e1ffd1 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1933,6 +1933,9 @@ static int cf_check svm_msr_read_intercept(
         break;
 
     default:
+        if ( is_hwdom_pinned_vcpu(v) && !rdmsr_safe(msr, *msr_content) )
+            break;
+
         if ( d->arch.msr_relaxed && !rdmsr_safe(msr, tmp) )
         {
             *msr_content = 0;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 1edc7f1e91..f6e5123f66 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3377,6 +3377,9 @@ static int cf_check vmx_msr_read_intercept(
         if ( vmx_read_guest_msr(curr, msr, msr_content) == 0 )
             break;
 
+        if ( is_hwdom_pinned_vcpu(curr) && !rdmsr_safe(msr, *msr_content) )
+            return X86EMUL_OKAY;
+
         if ( is_last_branch_msr(msr) )
         {
             *msr_content = 0;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 0d9f84f458..978ae679a2 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -985,6 +985,9 @@ static int cf_check read_msr(
         }
         /* fall through */
     default:
+        if ( is_hwdom_pinned_vcpu(curr) && !rdmsr_safe(reg, *val) )
+            return X86EMUL_OKAY;
+
         if ( currd->arch.msr_relaxed && !rdmsr_safe(reg, tmp) )
         {
             *val = 0;