Message ID | 1450393254-4285-1-git-send-email-boris.ostrovsky@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On 17/12/15 23:00, Boris Ostrovsky wrote:
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index a7767f8..871aca0 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -3019,6 +3019,25 @@ long do_mmuext_op(
>              break;
>          }
>
> +        if ( has_hvm_container_domain(d) )
> +        {
> +            switch ( op.cmd )
> +            {
> +            case MMUEXT_PIN_L1_TABLE:
> +            case MMUEXT_PIN_L2_TABLE:
> +            case MMUEXT_PIN_L3_TABLE:
> +            case MMUEXT_PIN_L4_TABLE:
> +            case MMUEXT_UNPIN_TABLE:
> +                if ( is_control_domain(d) )
> +                    break;

This needs to be an XSM check, rather than a dom0 check. Consider the
usecase of a PVH/DMLite domain builder stubdomain.

Everything else looks OK.

~Andrew

> +                /* fallthrough */
> +            default:
> +                MEM_LOG("Invalid extended pt command %#x", op.cmd);
> +                rc = -EOPNOTSUPP;
> +                goto done;
> +            }
> +        }
> +
>          okay = 1;
>
>          switch ( op.cmd )
>>> On 18.12.15 at 17:28, <andrew.cooper3@citrix.com> wrote:
> On 17/12/15 23:00, Boris Ostrovsky wrote:
>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>> index a7767f8..871aca0 100644
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -3019,6 +3019,25 @@ long do_mmuext_op(
>>              break;
>>          }
>>
>> +        if ( has_hvm_container_domain(d) )
>> +        {
>> +            switch ( op.cmd )
>> +            {
>> +            case MMUEXT_PIN_L1_TABLE:
>> +            case MMUEXT_PIN_L2_TABLE:
>> +            case MMUEXT_PIN_L3_TABLE:
>> +            case MMUEXT_PIN_L4_TABLE:
>> +            case MMUEXT_UNPIN_TABLE:
>> +                if ( is_control_domain(d) )
>> +                    break;
>
> This needs to be an XSM check, rather than a dom0 check. Consider the
> usecase of a PVH/DMLite domain builder stubdomain.

But wouldn't that be the control domain then? Afaict by making this
an XSM check we'd also permit the hardware domain access to these,
for no reason. In fact we should probably further restrict this to
d != pg_owner.

Jan
On 12/18/2015 11:37 AM, Jan Beulich wrote:
>>>> On 18.12.15 at 17:28, <andrew.cooper3@citrix.com> wrote:
>> On 17/12/15 23:00, Boris Ostrovsky wrote:
>>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>>> index a7767f8..871aca0 100644
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -3019,6 +3019,25 @@ long do_mmuext_op(
>>>              break;
>>>          }
>>>
>>> +        if ( has_hvm_container_domain(d) )
>>> +        {
>>> +            switch ( op.cmd )
>>> +            {
>>> +            case MMUEXT_PIN_L1_TABLE:
>>> +            case MMUEXT_PIN_L2_TABLE:
>>> +            case MMUEXT_PIN_L3_TABLE:
>>> +            case MMUEXT_PIN_L4_TABLE:
>>> +            case MMUEXT_UNPIN_TABLE:
>>> +                if ( is_control_domain(d) )
>>> +                    break;
>> This needs to be an XSM check, rather than a dom0 check. Consider the
>> usecase of a PVH/DMLite domain builder stubdomain.
> But wouldn't that be the control domain then? Afaict by making this
> an XSM check we'd also permit the hardware domain access to these,
> for no reason. In fact we should probably further restrict this to
> d != pg_owner.

We already do this at the top of do_mmuext_op():

    rc = xsm_mmuext_op(XSM_TARGET, d, pg_owner);

In fact, there is xsm_memory_pin_page() test under pin_page label so I
wonder whether I need any test, including is_control_domain()? (Maybe
for the UNPIN).

-boris
On 18/12/15 16:37, Jan Beulich wrote:
>>>> On 18.12.15 at 17:28, <andrew.cooper3@citrix.com> wrote:
>> On 17/12/15 23:00, Boris Ostrovsky wrote:
>>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>>> index a7767f8..871aca0 100644
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -3019,6 +3019,25 @@ long do_mmuext_op(
>>>              break;
>>>          }
>>>
>>> +        if ( has_hvm_container_domain(d) )
>>> +        {
>>> +            switch ( op.cmd )
>>> +            {
>>> +            case MMUEXT_PIN_L1_TABLE:
>>> +            case MMUEXT_PIN_L2_TABLE:
>>> +            case MMUEXT_PIN_L3_TABLE:
>>> +            case MMUEXT_PIN_L4_TABLE:
>>> +            case MMUEXT_UNPIN_TABLE:
>>> +                if ( is_control_domain(d) )
>>> +                    break;
>> This needs to be an XSM check, rather than a dom0 check. Consider the
>> usecase of a PVH/DMLite domain builder stubdomain.
> But wouldn't that be the control domain then? Afaict by making this
> an XSM check we'd also permit the hardware domain access to these,
> for no reason. In fact we should probably further restrict this to
> d != pg_owner.

Any domain needing to construct PV domains needs to be able to make
these hypercalls against the target domain.

Therefore, the only valid check is whether XSM will permit 'current' to
issue the hypercall against 'd', irrespective of whether current is the
control domain, the hardware domain, or something else.

I think all that is needed is xsm_mmuext_op(XSM_TARGET, d, pg_owner)

~Andrew
>>> On 18.12.15 at 17:59, <andrew.cooper3@citrix.com> wrote:
> On 18/12/15 16:37, Jan Beulich wrote:
>>>>> On 18.12.15 at 17:28, <andrew.cooper3@citrix.com> wrote:
>>> On 17/12/15 23:00, Boris Ostrovsky wrote:
>>>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>>>> index a7767f8..871aca0 100644
>>>> --- a/xen/arch/x86/mm.c
>>>> +++ b/xen/arch/x86/mm.c
>>>> @@ -3019,6 +3019,25 @@ long do_mmuext_op(
>>>>              break;
>>>>          }
>>>>
>>>> +        if ( has_hvm_container_domain(d) )
>>>> +        {
>>>> +            switch ( op.cmd )
>>>> +            {
>>>> +            case MMUEXT_PIN_L1_TABLE:
>>>> +            case MMUEXT_PIN_L2_TABLE:
>>>> +            case MMUEXT_PIN_L3_TABLE:
>>>> +            case MMUEXT_PIN_L4_TABLE:
>>>> +            case MMUEXT_UNPIN_TABLE:
>>>> +                if ( is_control_domain(d) )
>>>> +                    break;
>>> This needs to be an XSM check, rather than a dom0 check. Consider the
>>> usecase of a PVH/DMLite domain builder stubdomain.
>> But wouldn't that be the control domain then? Afaict by making this
>> an XSM check we'd also permit the hardware domain access to these,
>> for no reason. In fact we should probably further restrict this to
>> d != pg_owner.
>
> Any domain needing to construct PV domains needs to be able to make
> these hypercalls against the target domain.
>
> Therefore, the only valid check is whether XSM will permit 'current' to
> issue the hypercall against 'd', irrespective of whether current is the
> control domain, the hardware domain, or something else.
>
> I think all that is needed is xsm_mmuext_op(XSM_TARGET, d, pg_owner)

Which, as Boris has just pointed out, is already there. But which
also allows the d to issue such operations on itself.

Jan
On 18/12/15 17:10, Jan Beulich wrote:
>>>> On 18.12.15 at 17:59, <andrew.cooper3@citrix.com> wrote:
>> On 18/12/15 16:37, Jan Beulich wrote:
>>>>>> On 18.12.15 at 17:28, <andrew.cooper3@citrix.com> wrote:
>>>> On 17/12/15 23:00, Boris Ostrovsky wrote:
>>>>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>>>>> index a7767f8..871aca0 100644
>>>>> --- a/xen/arch/x86/mm.c
>>>>> +++ b/xen/arch/x86/mm.c
>>>>> @@ -3019,6 +3019,25 @@ long do_mmuext_op(
>>>>>              break;
>>>>>          }
>>>>>
>>>>> +        if ( has_hvm_container_domain(d) )
>>>>> +        {
>>>>> +            switch ( op.cmd )
>>>>> +            {
>>>>> +            case MMUEXT_PIN_L1_TABLE:
>>>>> +            case MMUEXT_PIN_L2_TABLE:
>>>>> +            case MMUEXT_PIN_L3_TABLE:
>>>>> +            case MMUEXT_PIN_L4_TABLE:
>>>>> +            case MMUEXT_UNPIN_TABLE:
>>>>> +                if ( is_control_domain(d) )
>>>>> +                    break;
>>>> This needs to be an XSM check, rather than a dom0 check. Consider the
>>>> usecase of a PVH/DMLite domain builder stubdomain.
>>> But wouldn't that be the control domain then? Afaict by making this
>>> an XSM check we'd also permit the hardware domain access to these,
>>> for no reason. In fact we should probably further restrict this to
>>> d != pg_owner.
>> Any domain needing to construct PV domains needs to be able to make
>> these hypercalls against the target domain.
>>
>> Therefore, the only valid check is whether XSM will permit 'current' to
>> issue the hypercall against 'd', irrespective of whether current is the
>> control domain, the hardware domain, or something else.
>>
>> I think all that is needed is xsm_mmuext_op(XSM_TARGET, d, pg_owner)
> Which, as Boris has just pointed out, is already there.

So it is. That is good.

> But which also allows the d to issue such operations on itself.

For safety's sake, it is probably worth having either do_mmuext_op() or
the XSM hook bail early if d is not a PV guest.

I would hesitate at putting that check inside the hvm conditional at
this point.

~Andrew
On 12/18/2015 12:16 PM, Andrew Cooper wrote:
> On 18/12/15 17:10, Jan Beulich wrote:
>>>>> On 18.12.15 at 17:59, <andrew.cooper3@citrix.com> wrote:
>>> On 18/12/15 16:37, Jan Beulich wrote:
>>>>>>> On 18.12.15 at 17:28, <andrew.cooper3@citrix.com> wrote:
>>>>> On 17/12/15 23:00, Boris Ostrovsky wrote:
>>>>>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
>>>>>> index a7767f8..871aca0 100644
>>>>>> --- a/xen/arch/x86/mm.c
>>>>>> +++ b/xen/arch/x86/mm.c
>>>>>> @@ -3019,6 +3019,25 @@ long do_mmuext_op(
>>>>>>              break;
>>>>>>          }
>>>>>>
>>>>>> +        if ( has_hvm_container_domain(d) )
>>>>>> +        {
>>>>>> +            switch ( op.cmd )
>>>>>> +            {
>>>>>> +            case MMUEXT_PIN_L1_TABLE:
>>>>>> +            case MMUEXT_PIN_L2_TABLE:
>>>>>> +            case MMUEXT_PIN_L3_TABLE:
>>>>>> +            case MMUEXT_PIN_L4_TABLE:
>>>>>> +            case MMUEXT_UNPIN_TABLE:
>>>>>> +                if ( is_control_domain(d) )
>>>>>> +                    break;
>>>>> This needs to be an XSM check, rather than a dom0 check. Consider the
>>>>> usecase of a PVH/DMLite domain builder stubdomain.
>>>> But wouldn't that be the control domain then? Afaict by making this
>>>> an XSM check we'd also permit the hardware domain access to these,
>>>> for no reason. In fact we should probably further restrict this to
>>>> d != pg_owner.
>>> Any domain needing to construct PV domains needs to be able to make
>>> these hypercalls against the target domain.
>>>
>>> Therefore, the only valid check is whether XSM will permit 'current' to
>>> issue the hypercall against 'd', irrespective of whether current is the
>>> control domain, the hardware domain, or something else.
>>>
>>> I think all that is needed is xsm_mmuext_op(XSM_TARGET, d, pg_owner)
>> Which, as Boris has just pointed out, is already there.
> So it is. That is good.
>
>> But which also allows the d to issue such operations on itself.

Won't get_pg_owner() fail in that case? (domid == curr->domain_id) test?

> For safety's sake, it is probably worth having either do_mmuext_op() or
> the XSM hook bail early if d is not a PV guest.
>
> I would hesitate at putting that check inside the hvm conditional at
> this point.

I am not sure what you meant here.

-boris
>>> On 18.12.15 at 18:33, <boris.ostrovsky@oracle.com> wrote:
> On 12/18/2015 12:16 PM, Andrew Cooper wrote:
>> On 18/12/15 17:10, Jan Beulich wrote:
>>>>>> On 18.12.15 at 17:59, <andrew.cooper3@citrix.com> wrote:
>>>> I think all that is needed is xsm_mmuext_op(XSM_TARGET, d, pg_owner)
>>> Which, as Boris has just pointed out, is already there.
>> So it is. That is good.
>>
>>> But which also allows the d to issue such operations on itself.
>
> Won't get_pg_owner() fail in that case? (domid == curr->domain_id) test?

No, because of the even earlier domid == DOMID_SELF check.

Jan
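The ordering Jan refers to can be modelled in a few lines. The following is a minimal sketch, not the hypervisor's get_pg_owner(): the function name resolve_pg_owner() and the owner_result enum are hypothetical, and only the order of the two comparisons is taken from the thread. It shows why a caller that passes DOMID_SELF is accepted as its own page owner before the "domid == curr->domain_id" rejection is ever reached.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not Xen's definitions. */
typedef uint16_t domid_t;
#define DOMID_SELF ((domid_t)0x7FF0)  /* value as in Xen's public headers */

enum owner_result {
    OWNER_IS_CALLER,   /* DOMID_SELF: the caller operates on its own pages */
    OWNER_REJECTED,    /* caller named its own numeric ID as a foreign domain */
    OWNER_IS_FOREIGN   /* some other domain; further checks would follow */
};

/*
 * Resolve the page owner in the order described in the thread:
 * the DOMID_SELF test comes first, the "is it my own ID" test second.
 */
static enum owner_result resolve_pg_owner(domid_t requested, domid_t caller_id)
{
    if (requested == DOMID_SELF)
        return OWNER_IS_CALLER;

    if (requested == caller_id)
        return OWNER_REJECTED;

    return OWNER_IS_FOREIGN;
}
```

Under this model a domain can still target itself through DOMID_SELF, which is the case Jan is concerned about; only the explicit numeric self-reference is refused.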
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 7cc057b..d071b13 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5188,6 +5188,9 @@ static hvm_hypercall_t *const hvm_hypercall64_table[NR_hypercalls] = {
     HYPERCALL(sysctl),
     HYPERCALL(domctl),
     HYPERCALL(tmem_op),
+    HYPERCALL(platform_op),
+    HYPERCALL(mmuext_op),
+    HYPERCALL(xenpmu_op),
     [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
 };

@@ -5209,48 +5212,8 @@ static hvm_hypercall_t *const hvm_hypercall32_table[NR_hypercalls] = {
     HYPERCALL(sysctl),
     HYPERCALL(domctl),
     HYPERCALL(tmem_op),
-    [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
-};
-
-static hvm_hypercall_t *const pvh_hypercall64_table[NR_hypercalls] = {
-    HYPERCALL(platform_op),
-    HYPERCALL(memory_op),
-    HYPERCALL(xen_version),
-    HYPERCALL(console_io),
-    [ __HYPERVISOR_grant_table_op ] = (hvm_hypercall_t *)hvm_grant_table_op,
-    HYPERCALL(vcpu_op),
-    HYPERCALL(mmuext_op),
-    HYPERCALL(xsm_op),
-    HYPERCALL(sched_op),
-    HYPERCALL(event_channel_op),
-    [ __HYPERVISOR_physdev_op ] = (hvm_hypercall_t *)hvm_physdev_op,
-    HYPERCALL(hvm_op),
-    HYPERCALL(sysctl),
-    HYPERCALL(domctl),
-    HYPERCALL(xenpmu_op),
-    [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
-};
-
-extern int compat_mmuext_op(XEN_GUEST_HANDLE_PARAM(void) cmp_uops,
-                            unsigned int count,
-                            XEN_GUEST_HANDLE_PARAM(uint) pdone,
-                            unsigned int foreigndom);
-static hvm_hypercall_t *const pvh_hypercall32_table[NR_hypercalls] = {
-    HYPERCALL(platform_op),
-    COMPAT_CALL(memory_op),
-    HYPERCALL(xen_version),
-    HYPERCALL(console_io),
-    [ __HYPERVISOR_grant_table_op ] =
-        (hvm_hypercall_t *)hvm_grant_table_op_compat32,
-    COMPAT_CALL(vcpu_op),
+    COMPAT_CALL(platform_op),
     COMPAT_CALL(mmuext_op),
-    HYPERCALL(xsm_op),
-    COMPAT_CALL(sched_op),
-    HYPERCALL(event_channel_op),
-    [ __HYPERVISOR_physdev_op ] = (hvm_hypercall_t *)hvm_physdev_op_compat32,
-    HYPERCALL(hvm_op),
-    HYPERCALL(sysctl),
-    HYPERCALL(domctl),
     HYPERCALL(xenpmu_op),
     [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation
 };
@@ -5284,9 +5247,7 @@ int hvm_do_hypercall(struct cpu_user_regs *regs)
     if ( (eax & 0x80000000) && is_viridian_domain(currd) )
         return viridian_hypercall(regs);

-    if ( (eax >= NR_hypercalls) ||
-         !(is_pvh_domain(currd) ? pvh_hypercall32_table[eax]
-                                : hvm_hypercall32_table[eax]) )
+    if ( (eax >= NR_hypercalls) || !hvm_hypercall32_table[eax] )
     {
         regs->eax = -ENOSYS;
         return HVM_HCALL_completed;
@@ -5320,9 +5281,8 @@ int hvm_do_hypercall(struct cpu_user_regs *regs)
 #endif

         curr->arch.hvm_vcpu.hcall_64bit = 1;
-        regs->rax = (is_pvh_domain(currd)
-                     ? pvh_hypercall64_table
-                     : hvm_hypercall64_table)[eax](rdi, rsi, rdx, r10, r8, r9);
+        regs->rax = hvm_hypercall64_table[eax](rdi, rsi, rdx, r10, r8, r9);
+
         curr->arch.hvm_vcpu.hcall_64bit = 0;

 #ifndef NDEBUG
@@ -5366,10 +5326,7 @@ int hvm_do_hypercall(struct cpu_user_regs *regs)
         }
 #endif

-        regs->_eax = (is_pvh_vcpu(curr)
-                      ? pvh_hypercall32_table
-                      : hvm_hypercall32_table)[eax](ebx, ecx, edx,
-                                                    esi, edi, ebp);
+        regs->_eax = hvm_hypercall32_table[eax](ebx, ecx, edx, esi, edi, ebp);

 #ifndef NDEBUG
         if ( !curr->arch.hvm_vcpu.hcall_preempted )
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a7767f8..871aca0 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3019,6 +3019,25 @@ long do_mmuext_op(
             break;
         }

+        if ( has_hvm_container_domain(d) )
+        {
+            switch ( op.cmd )
+            {
+            case MMUEXT_PIN_L1_TABLE:
+            case MMUEXT_PIN_L2_TABLE:
+            case MMUEXT_PIN_L3_TABLE:
+            case MMUEXT_PIN_L4_TABLE:
+            case MMUEXT_UNPIN_TABLE:
+                if ( is_control_domain(d) )
+                    break;
+                /* fallthrough */
+            default:
+                MEM_LOG("Invalid extended pt command %#x", op.cmd);
+                rc = -EOPNOTSUPP;
+                goto done;
+            }
+        }
+
         okay = 1;

         switch ( op.cmd )
@@ -3448,6 +3467,7 @@ long do_mmuext_op(
             break;
         }

+ done:
         if ( unlikely(!okay) && !rc )
             rc = -EINVAL;
         if ( unlikely(rc) )
diff --git a/xen/arch/x86/x86_64/compat/mm.c b/xen/arch/x86/x86_64/compat/mm.c
index 178e42d..58be8ad 100644
--- a/xen/arch/x86/x86_64/compat/mm.c
+++ b/xen/arch/x86/x86_64/compat/mm.c
@@ -215,13 +215,15 @@ int compat_update_va_mapping_otherdomain(unsigned long va, u32 lo, u32 hi,

 DEFINE_XEN_GUEST_HANDLE(mmuext_op_compat_t);

-int compat_mmuext_op(XEN_GUEST_HANDLE_PARAM(mmuext_op_compat_t) cmp_uops,
+int compat_mmuext_op(XEN_GUEST_HANDLE_PARAM(void) arg,
                      unsigned int count,
                      XEN_GUEST_HANDLE_PARAM(uint) pdone,
                      unsigned int foreigndom)
 {
     unsigned int i, preempt_mask;
     int rc = 0;
+    XEN_GUEST_HANDLE_PARAM(mmuext_op_compat_t) cmp_uops =
+        guest_handle_cast(arg, mmuext_op_compat_t);
     XEN_GUEST_HANDLE_PARAM(mmuext_op_t) nat_ops;

     if ( unlikely(count == MMU_UPDATE_PREEMPTED) &&
diff --git a/xen/include/asm-x86/hypercall.h b/xen/include/asm-x86/hypercall.h
index afa8ba9..945d58a 100644
--- a/xen/include/asm-x86/hypercall.h
+++ b/xen/include/asm-x86/hypercall.h
@@ -110,4 +110,13 @@ extern int arch_compat_vcpu_op(
     int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg);

+extern int compat_mmuext_op(
+    XEN_GUEST_HANDLE_PARAM(void) arg,
+    unsigned int count,
+    XEN_GUEST_HANDLE_PARAM(uint) pdone,
+    unsigned int foreigndom);
+
+extern int compat_platform_op(
+    XEN_GUEST_HANDLE_PARAM(void) u_xenpf_op);
+
 #endif /* __ASM_X86_HYPERCALL_H__ */
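The control flow of the mm.c hunk above is easy to misread because of the break-then-fallthrough pattern, so here is a small standalone model of it. This is a sketch under stated assumptions: filter_mmuext_cmd() and the demo_cmd enum are hypothetical names, the enum values are illustrative rather than the real MMUEXT_* numbers, and the two booleans stand in for has_hvm_container_domain(d) and is_control_domain(d). It returns 0 where the patch lets the command proceed and -EOPNOTSUPP where the patch jumps to the done label.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative command set; values are NOT the real MMUEXT_* numbers. */
enum demo_cmd {
    DEMO_PIN_L1_TABLE,
    DEMO_PIN_L2_TABLE,
    DEMO_PIN_L3_TABLE,
    DEMO_PIN_L4_TABLE,
    DEMO_UNPIN_TABLE,
    DEMO_TLB_FLUSH_MULTI   /* stands for any command outside the pin/unpin set */
};

/*
 * Model of the filter added to do_mmuext_op():
 *  - non-HVM-container callers are not filtered here at all;
 *  - HVM-container callers may only pin/unpin, and only if they are
 *    the control domain; everything else gets -EOPNOTSUPP.
 */
static int filter_mmuext_cmd(bool hvm_container, bool control_domain,
                             enum demo_cmd cmd)
{
    if (!hvm_container)
        return 0;

    switch (cmd) {
    case DEMO_PIN_L1_TABLE:
    case DEMO_PIN_L2_TABLE:
    case DEMO_PIN_L3_TABLE:
    case DEMO_PIN_L4_TABLE:
    case DEMO_UNPIN_TABLE:
        if (control_domain)
            return 0;
        /* fallthrough */
    default:
        return -EOPNOTSUPP;
    }
}
```

The model makes the point of contention in the review visible: the only input that distinguishes an allowed pin from a refused one is the control_domain flag, which is what the reviewers suggest leaving to the existing XSM hooks instead.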
The tables are almost identical and therefore there is little reason to
keep both sets.

PVH needs 3 extra hypercalls:
* mmuext_op. MMUEXT_PIN_L<x>_TABLE are required by the control domain
  (dom0) when building guests. We add MMUEXT_UNPIN_TABLE for
  completeness.
* platform_op. These are only available to privileged domains. We will
  (eventually) have privileged HVMlite guests and therefore shouldn't
  limit this to PVH only.
* xenpmu_op. Any guest with !has_vlapic() (i.e. PV, PVH and HVMlite)
  should be able to use it.

Note that until recently PVH guests used mmuext_op's MMUEXT_INVLPG_MULTI
and MMUEXT_TLB_FLUSH_MULTI commands but it has been determined that
using the former was incorrect and using the latter is correct for now
but is not guaranteed to work in the future.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
v3:
* Move compat declarations to hypercall.h, redefine compat_mmuext_op
  to take XEN_GUEST_HANDLE_PARAM(void) as first argument (and then
  cast it)
* Add MMUEXT_UNPIN_TABLE to allowable commands
* Return -EOPNOTSUPP

 xen/arch/x86/hvm/hvm.c          | 59 +++++---------------------------------
 xen/arch/x86/mm.c               | 20 +++++++++++++
 xen/arch/x86/x86_64/compat/mm.c |  4 ++-
 xen/include/asm-x86/hypercall.h |  9 ++++++
 4 files changed, 40 insertions(+), 52 deletions(-)
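To illustrate what merging the tables buys, the following is a minimal sketch of dispatch through a single sparse function-pointer table, assuming nothing about Xen beyond what the diff shows. All names here (call_table, dispatch, demo_platform_op, demo_mmuext_op, NR_CALLS and the CALL_* indices) are hypothetical. Unpopulated slots are NULL, and the bounds test plus NULL test mirrors the simplified "(eax >= NR_hypercalls) || !hvm_hypercall32_table[eax]" condition in hvm_do_hypercall().

```c
#include <errno.h>
#include <stddef.h>

#define NR_CALLS 8   /* hypothetical table size */

typedef long (*hypercall_fn)(unsigned long a0, unsigned long a1);

/* Hypothetical handlers standing in for real hypercall implementations. */
static long demo_platform_op(unsigned long a0, unsigned long a1)
{
    return (long)(a0 + a1);
}

static long demo_mmuext_op(unsigned long a0, unsigned long a1)
{
    return (long)(a0 * a1);
}

enum { CALL_PLATFORM_OP = 1, CALL_MMUEXT_OP = 3 };

/* One table for every guest type; slots not listed stay NULL. */
static const hypercall_fn call_table[NR_CALLS] = {
    [CALL_PLATFORM_OP] = demo_platform_op,
    [CALL_MMUEXT_OP]   = demo_mmuext_op,
};

/* Out-of-range or unpopulated entries yield -ENOSYS, as in the patch. */
static long dispatch(unsigned int nr, unsigned long a0, unsigned long a1)
{
    if (nr >= NR_CALLS || !call_table[nr])
        return -ENOSYS;

    return call_table[nr](a0, a1);
}
```

With one table, per-guest-type policy can no longer be expressed by leaving a slot empty, which is why the patch moves the PVH-specific restriction on mmuext_op into do_mmuext_op() itself.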