Message ID | 20240717161415.3586195-4-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | XSA-402 followup | expand |
On 17.07.2024 18:14, Andrew Cooper wrote: > l1_disallow_mask() yields L1_DISALLOW_MASK with PAGE_CACHE_ATTRS conditionally > permitted. First, rewrite it as a plain function. > > Next, correct some dubious behaviours. > > The use of is_pv_domain() is tautological; l1_disallow_mask() is only used in > the PV pagetable code, and it return true for system domains as well. Well, only quite. What you say is entirely true for mod_l1_entry(), but sadly not for get_page_from_l1e(): That's also used by shadow_get_page_from_l1e(), i.e. theoretically from a PVH Dom0 running in shadow mode. I don't think I can spot anywhere we'd reject that combination. > The use of has_arch_pdevs() is wonky; by making the use of cache attributes > dependent on the instantanious property of whether any devices is assigned, > means that a guest could have created a legal PTE which will fail validation > at a later point when the device has been removed. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: Jan Beulich <JBeulich@suse.com> > CC: Roger Pau Monné <roger.pau@citrix.com> > > RFC. I've not tested this, and I doubt it will work to start with owing to > the removal of the dom_io special case which IIRC dom0 uses to map arbitrary > MMIO. I think that check has been dead for a long time. DomIO can't be the page table owner; it can only be the owner of a page to be mapped. The check likely was needed only before the proper splitting of which domain involved plays which role. Dropping this may then want to be done as a separate change, but it could of course remain here as long as properly explained in the description. Maybe for the time being we want to retain an assertion to this effect. > Furthermore, the rangeset_is_empty() calls have the same problem that > has_arch_pdevs() has; they're not invariants on the domain. Also, VMs > wanting/needing encrypted memory need fully working cacheability irrespective > of device assignment. > > I expect the way we actually want to fix this is to have a CDF flag for > allowing reduced cahcebility, and for this expression to simplify to just: > > if ( d->options & XEN_DOMCTL_CDF_any_cacheability ) > disallow &= ~PAGE_CACHE_ATTRS; > > which is simpler still. Perhaps. Or refuse altering the two rangesets post-creation for domains without IOMMUs ("post-creation" here including the establishing of any mapping on behalf of the domain). Jan
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 95795567f2a5..31937319c057 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -162,13 +162,17 @@ static uint32_t __ro_after_init base_disallow_mask; #define L4_DISALLOW_MASK (base_disallow_mask) -#define l1_disallow_mask(d) \ - (((d) != dom_io) && \ - (rangeset_is_empty((d)->iomem_caps) && \ - rangeset_is_empty((d)->arch.ioport_caps) && \ - !has_arch_pdevs(d) && \ - is_pv_domain(d)) ? \ - L1_DISALLOW_MASK : (L1_DISALLOW_MASK & ~PAGE_CACHE_ATTRS)) +static uint32_t l1_disallow_mask(const struct domain *d) +{ + uint32_t disallow = L1_DISALLOW_MASK; + + if ( (d->options & XEN_DOMCTL_CDF_iommu) || + !rangeset_is_empty(d->iomem_caps) || + !rangeset_is_empty(d->arch.ioport_caps) ) + disallow &= ~PAGE_CACHE_ATTRS; + + return disallow; +} static s8 __read_mostly opt_mmio_relax;
l1_disallow_mask() yields L1_DISALLOW_MASK with PAGE_CACHE_ATTRS conditionally permitted. First, rewrite it as a plain function. Next, correct some dubious behaviours. The use of is_pv_domain() is tautological; l1_disallow_mask() is only used in the PV pagetable code, and it return true for system domains as well. The use of has_arch_pdevs() is wonky; by making the use of cache attributes dependent on the instantanious property of whether any devices is assigned, means that a guest could have created a legal PTE which will fail validation at a later point when the device has been removed. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> RFC. I've not tested this, and I doubt it will work to start with owing to the removal of the dom_io special case which IIRC dom0 uses to map arbitrary MMIO. Furthermore, the rangeset_is_empty() calls have the same problem that has_arch_pdevs() has; they're not invariants on the domain. Also, VMs wanting/needing encrypted memory need fully working cacheability irrespective of device assignment. I expect the way we actually want to fix this is to have a CDF flag for allowing reduced cahcebility, and for this expression to simplify to just: if ( d->options & XEN_DOMCTL_CDF_any_cacheability ) disallow &= ~PAGE_CACHE_ATTRS; which is simpler still. For the current form, bloat-o-meter reports: add/remove: 1/0 grow/shrink: 1/2 up/down: 75/-280 (-205) Function old new delta l1_disallow_mask - 74 +74 mod_l1_entry.cold 55 56 +1 get_page_from_l1e 1271 1167 -104 mod_l1_entry 1860 1684 -176 which is an even bigger improvement than simply not duplicating the logic. --- xen/arch/x86/mm.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)