Message ID | 467d24e1-8ed3-4dda-a334-70ff747bf94b@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | VT-d: SATC handling and ATS tidying | expand |
On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote: > Make the variable a tristate, with (as done elsewhere) a negative value > meaning "default". Since all use sites need looking at, also rename it > to match our usual "opt_*" pattern. While touching it, also move it to > .data.ro_after_init. > > The only place it retains boolean nature is pci_ats_device(), for now. Why does it retain the boolean nature in pci_ats_device()? I assume this is to avoid having to touch the line again in a further patch, as given the current logic pci_ats_device() would also want to treat -1 as ATS disabled. I think this is all fine because you add additional opt_ats > 0 checks before the call to pci_ats_device(), but would be good to know this is the intention. > In AMD code re-order conditionals to have the config space accesses > after (cheaper) flag checks. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > In domain_context_mapping_one() I'm a little puzzled that translation > type is selected based on only IOMMU and global properties, i.e. not > taking the device itself into account. That seems like a bug to me, we should check that the device supports ATS (and has it enabled) before setting the translation type to CONTEXT_TT_DEV_IOTLB unconditionally. We should likely use ats_device() instead of ats_enabled in domain_context_mapping_one(). There's also IMO a second bug here, which is that we possibly attempt to flush the device IOTLB before having ATS enabled. We flush the device TLB in domain_context_mapping_one(), yet ATS is enabled by the caller afterwards (see domain_context_mapping()). > > --- a/xen/drivers/passthrough/amd/iommu_cmd.c > +++ b/xen/drivers/passthrough/amd/iommu_cmd.c > @@ -282,7 +282,7 @@ void amd_iommu_flush_iotlb(u8 devfn, con > struct amd_iommu *iommu; > unsigned int req_id, queueid, maxpend; > > - if ( !ats_enabled ) > + if ( opt_ats <= 0 ) > return; > > if ( !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ) > @@ -340,7 +340,7 @@ static void _amd_iommu_flush_pages(struc > flush_command_buffer(iommu, 0); > } > > - if ( ats_enabled ) > + if ( opt_ats > 0 ) > { > amd_iommu_flush_all_iotlbs(d, daddr, order); > > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_ > dte->ex = ivrs_dev->dte_allow_exclusion; > dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT); > > - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > + if ( opt_ats > 0 && > !ivrs_dev->block_ats && > - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) > - dte->i = ats_enabled; > + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > + pci_ats_device(iommu->seg, bus, pdev->devfn) ) > + dte->i = true; > > spin_unlock_irqrestore(&iommu->lock, flags); > > @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_ > ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags, > ACPI_IVHD_SYSTEM_MGMT)); > > - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > + if ( opt_ats > 0 && > !ivrs_dev->block_ats && > - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) > - ASSERT(dte->i == ats_enabled); > + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > + pci_ats_device(iommu->seg, bus, pdev->devfn) ) > + ASSERT(dte->i); > > spin_unlock_irqrestore(&iommu->lock, flags); > > @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_ > > ASSERT(pcidevs_locked()); > > - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > + if ( opt_ats > 0 && > !ivrs_dev->block_ats && > iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > + pci_ats_device(iommu->seg, bus, pdev->devfn) && > !pci_ats_enabled(iommu->seg, bus, pdev->devfn) ) Seeing that this same set of conditions is used in 3 different checks, could we add a wrapper for it? opt_ats > 0 && !ivrs_dev->block_ats && iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && pci_ats_device(iommu->seg, bus, pdev->devfn) pci_device_ats_capable()? or some such. Thanks, Roger.
On 08.02.2024 12:49, Roger Pau Monné wrote: > On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote: >> Make the variable a tristate, with (as done elsewhere) a negative value >> meaning "default". Since all use sites need looking at, also rename it >> to match our usual "opt_*" pattern. While touching it, also move it to >> .data.ro_after_init. >> >> The only place it retains boolean nature is pci_ats_device(), for now. > > Why does it retain the boolean nature in pci_ats_device()? > > I assume this is to avoid having to touch the line again in a further > patch, as given the current logic pci_ats_device() would also want to > treat -1 as ATS disabled. No, then I would need to touch the line. The function wants to treat -1 as "maybe enabled", so the caller can know whether a device is an ATS device regardless of whether ATS use is fully off, or only "soft-off". > I think this is all fine because you add additional opt_ats > 0 checks > before the call to pci_ats_device(), but would be good to know this is > the intention. Note how amd_iommu_disable_domain_device() does not gain such a check, for exactly the reason named above: The function would better turn off ATS whenever enabled, no matter for what reason. And of course - none of this "soft-off" vs "fully off" matters for AMD (which is the only user of the function) right now anyway, seeing they don't have an equivalent of the ATC_REQUIRED flag. >> In AMD code re-order conditionals to have the config space accesses >> after (cheaper) flag checks. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> --- >> In domain_context_mapping_one() I'm a little puzzled that translation >> type is selected based on only IOMMU and global properties, i.e. not >> taking the device itself into account. > > That seems like a bug to me, we should check that the device supports > ATS (and has it enabled) before setting the translation type to > CONTEXT_TT_DEV_IOTLB unconditionally. We should likely use > ats_device() instead of ats_enabled in domain_context_mapping_one(). Will try to remember to add yet another patch then. > There's also IMO a second bug here, which is that we possibly attempt > to flush the device IOTLB before having ATS enabled. We flush the > device TLB in domain_context_mapping_one(), yet ATS is enabled by the > caller afterwards (see domain_context_mapping()). You may be right with this; I'd need to read up on whether such flushing is permissible. >> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c >> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c >> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_ >> dte->ex = ivrs_dev->dte_allow_exclusion; >> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT); >> >> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && >> + if ( opt_ats > 0 && >> !ivrs_dev->block_ats && >> - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) >> - dte->i = ats_enabled; >> + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && >> + pci_ats_device(iommu->seg, bus, pdev->devfn) ) >> + dte->i = true; >> >> spin_unlock_irqrestore(&iommu->lock, flags); >> >> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_ >> ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags, >> ACPI_IVHD_SYSTEM_MGMT)); >> >> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && >> + if ( opt_ats > 0 && >> !ivrs_dev->block_ats && >> - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) >> - ASSERT(dte->i == ats_enabled); >> + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && >> + pci_ats_device(iommu->seg, bus, pdev->devfn) ) >> + ASSERT(dte->i); >> >> spin_unlock_irqrestore(&iommu->lock, flags); >> >> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_ >> >> ASSERT(pcidevs_locked()); >> >> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && >> + if ( opt_ats > 0 && >> !ivrs_dev->block_ats && >> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && >> + pci_ats_device(iommu->seg, bus, pdev->devfn) && >> !pci_ats_enabled(iommu->seg, bus, pdev->devfn) ) > > Seeing that this same set of conditions is used in 3 different checks, > could we add a wrapper for it? > > opt_ats > 0 && !ivrs_dev->block_ats && > iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > pci_ats_device(iommu->seg, bus, pdev->devfn) > > pci_device_ats_capable()? or some such. I was pondering that, yes (iirc already once when adding block_ats). Problem is the name. "capable" isn't quite right when considering the tristate opt_ats. And pci_device_may_use_ats() reads, well, clumsy to me. If you have any good idea for a name that's fully applicable and not odd or overly long, I can certainly introduce such a helper. Jan
On Thu, Feb 08, 2024 at 04:49:46PM +0100, Jan Beulich wrote: > On 08.02.2024 12:49, Roger Pau Monné wrote: > > On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote: > >> Make the variable a tristate, with (as done elsewhere) a negative value > >> meaning "default". Since all use sites need looking at, also rename it > >> to match our usual "opt_*" pattern. While touching it, also move it to > >> .data.ro_after_init. > >> > >> The only place it retains boolean nature is pci_ats_device(), for now. > > > > Why does it retain the boolean nature in pci_ats_device()? > > > > I assume this is to avoid having to touch the line again in a further > > patch, as given the current logic pci_ats_device() would also want to > > treat -1 as ATS disabled. > > No, then I would need to touch the line. The function wants to treat > -1 as "maybe enabled", so the caller can know whether a device is an > ATS device regardless of whether ATS use is fully off, or only > "soft-off". I have to admit I'm slightly concerned about this soft-off. Given the current status of ATS itself in Xen, and the technology itself, I think a user should always opt-in to ATS usage. > > I think this is all fine because you add additional opt_ats > 0 checks > > before the call to pci_ats_device(), but would be good to know this is > > the intention. > > Note how amd_iommu_disable_domain_device() does not gain such a > check, for exactly the reason named above: The function would better > turn off ATS whenever enabled, no matter for what reason. > > And of course - none of this "soft-off" vs "fully off" matters for > AMD (which is the only user of the function) right now anyway, seeing > they don't have an equivalent of the ATC_REQUIRED flag. > > >> In AMD code re-order conditionals to have the config space accesses > >> after (cheaper) flag checks. > >> > >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > >> --- > >> In domain_context_mapping_one() I'm a little puzzled that translation > >> type is selected based on only IOMMU and global properties, i.e. not > >> taking the device itself into account. > > > > That seems like a bug to me, we should check that the device supports > > ATS (and has it enabled) before setting the translation type to > > CONTEXT_TT_DEV_IOTLB unconditionally. We should likely use > > ats_device() instead of ats_enabled in domain_context_mapping_one(). > > Will try to remember to add yet another patch then. > > > There's also IMO a second bug here, which is that we possibly attempt > > to flush the device IOTLB before having ATS enabled. We flush the > > device TLB in domain_context_mapping_one(), yet ATS is enabled by the > > caller afterwards (see domain_context_mapping()). > > You may be right with this; I'd need to read up on whether such > flushing is permissible. > > >> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > >> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > >> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_ > >> dte->ex = ivrs_dev->dte_allow_exclusion; > >> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT); > >> > >> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > >> + if ( opt_ats > 0 && > >> !ivrs_dev->block_ats && > >> - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) > >> - dte->i = ats_enabled; > >> + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > >> + pci_ats_device(iommu->seg, bus, pdev->devfn) ) > >> + dte->i = true; > >> > >> spin_unlock_irqrestore(&iommu->lock, flags); > >> > >> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_ > >> ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags, > >> ACPI_IVHD_SYSTEM_MGMT)); > >> > >> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > >> + if ( opt_ats > 0 && > >> !ivrs_dev->block_ats && > >> - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) > >> - ASSERT(dte->i == ats_enabled); > >> + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > >> + pci_ats_device(iommu->seg, bus, pdev->devfn) ) > >> + ASSERT(dte->i); > >> > >> spin_unlock_irqrestore(&iommu->lock, flags); > >> > >> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_ > >> > >> ASSERT(pcidevs_locked()); > >> > >> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > >> + if ( opt_ats > 0 && > >> !ivrs_dev->block_ats && > >> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > >> + pci_ats_device(iommu->seg, bus, pdev->devfn) && > >> !pci_ats_enabled(iommu->seg, bus, pdev->devfn) ) > > > > Seeing that this same set of conditions is used in 3 different checks, > > could we add a wrapper for it? > > > > opt_ats > 0 && !ivrs_dev->block_ats && > > iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > > pci_ats_device(iommu->seg, bus, pdev->devfn) > > > > pci_device_ats_capable()? or some such. > > I was pondering that, yes (iirc already once when adding block_ats). > Problem is the name. "capable" isn't quite right when considering > the tristate opt_ats. And pci_device_may_use_ats() reads, well, > clumsy to me. If you have any good idea for a name that's fully > applicable and not odd or overly long, I can certainly introduce > such a helper. But if ATS is soft-disabled (-1) or hard disabled (0), it's fine to consider the devices as not ATS capable for the context here? Thanks, Roger.
On 12.02.2024 10:39, Roger Pau Monné wrote: > On Thu, Feb 08, 2024 at 04:49:46PM +0100, Jan Beulich wrote: >> On 08.02.2024 12:49, Roger Pau Monné wrote: >>> On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote: >>>> Make the variable a tristate, with (as done elsewhere) a negative value >>>> meaning "default". Since all use sites need looking at, also rename it >>>> to match our usual "opt_*" pattern. While touching it, also move it to >>>> .data.ro_after_init. >>>> >>>> The only place it retains boolean nature is pci_ats_device(), for now. >>> >>> Why does it retain the boolean nature in pci_ats_device()? >>> >>> I assume this is to avoid having to touch the line again in a further >>> patch, as given the current logic pci_ats_device() would also want to >>> treat -1 as ATS disabled. >> >> No, then I would need to touch the line. The function wants to treat >> -1 as "maybe enabled", so the caller can know whether a device is an >> ATS device regardless of whether ATS use is fully off, or only >> "soft-off". > > I have to admit I'm slightly concerned about this soft-off. Given the > current status of ATS itself in Xen, and the technology itself, I > think a user should always opt-in to ATS usage. The plan is to follow your suggestion in patch 3 and require explicit enabling for passing through of such devices. For Dom0, however, I think it is important that we respect the firmware request by default. The only viable(?!) alternative would be to panic() instead. >>>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c >>>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c >>>> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_ >>>> dte->ex = ivrs_dev->dte_allow_exclusion; >>>> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT); >>>> >>>> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && >>>> + if ( opt_ats > 0 && >>>> !ivrs_dev->block_ats && >>>> - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) >>>> - dte->i = ats_enabled; >>>> + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && >>>> + pci_ats_device(iommu->seg, bus, pdev->devfn) ) >>>> + dte->i = true; >>>> >>>> spin_unlock_irqrestore(&iommu->lock, flags); >>>> >>>> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_ >>>> ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags, >>>> ACPI_IVHD_SYSTEM_MGMT)); >>>> >>>> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && >>>> + if ( opt_ats > 0 && >>>> !ivrs_dev->block_ats && >>>> - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) >>>> - ASSERT(dte->i == ats_enabled); >>>> + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && >>>> + pci_ats_device(iommu->seg, bus, pdev->devfn) ) >>>> + ASSERT(dte->i); >>>> >>>> spin_unlock_irqrestore(&iommu->lock, flags); >>>> >>>> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_ >>>> >>>> ASSERT(pcidevs_locked()); >>>> >>>> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && >>>> + if ( opt_ats > 0 && >>>> !ivrs_dev->block_ats && >>>> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && >>>> + pci_ats_device(iommu->seg, bus, pdev->devfn) && >>>> !pci_ats_enabled(iommu->seg, bus, pdev->devfn) ) >>> >>> Seeing that this same set of conditions is used in 3 different checks, >>> could we add a wrapper for it? >>> >>> opt_ats > 0 && !ivrs_dev->block_ats && >>> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && >>> pci_ats_device(iommu->seg, bus, pdev->devfn) >>> >>> pci_device_ats_capable()? or some such. >> >> I was pondering that, yes (iirc already once when adding block_ats). >> Problem is the name. "capable" isn't quite right when considering >> the tristate opt_ats. And pci_device_may_use_ats() reads, well, >> clumsy to me. If you have any good idea for a name that's fully >> applicable and not odd or overly long, I can certainly introduce >> such a helper. > > But if ATS is soft-disabled (-1) or hard disabled (0), it's fine to > consider the devices as not ATS capable for the context here? I don't like mixing capability and policy aspects into a resulting "capable". Jan
On Mon, Feb 12, 2024 at 11:45:33AM +0100, Jan Beulich wrote: > On 12.02.2024 10:39, Roger Pau Monné wrote: > > On Thu, Feb 08, 2024 at 04:49:46PM +0100, Jan Beulich wrote: > >> On 08.02.2024 12:49, Roger Pau Monné wrote: > >>> On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote: > >>>> Make the variable a tristate, with (as done elsewhere) a negative value > >>>> meaning "default". Since all use sites need looking at, also rename it > >>>> to match our usual "opt_*" pattern. While touching it, also move it to > >>>> .data.ro_after_init. > >>>> > >>>> The only place it retains boolean nature is pci_ats_device(), for now. > >>> > >>> Why does it retain the boolean nature in pci_ats_device()? > >>> > >>> I assume this is to avoid having to touch the line again in a further > >>> patch, as given the current logic pci_ats_device() would also want to > >>> treat -1 as ATS disabled. > >> > >> No, then I would need to touch the line. The function wants to treat > >> -1 as "maybe enabled", so the caller can know whether a device is an > >> ATS device regardless of whether ATS use is fully off, or only > >> "soft-off". > > > > I have to admit I'm slightly concerned about this soft-off. Given the > > current status of ATS itself in Xen, and the technology itself, I > > think a user should always opt-in to ATS usage. > > The plan is to follow your suggestion in patch 3 and require explicit > enabling for passing through of such devices. For Dom0, however, I > think it is important that we respect the firmware request by default. > The only viable(?!) alternative would be to panic() instead. Or assign to domIO? > >>>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > >>>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > >>>> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_ > >>>> dte->ex = ivrs_dev->dte_allow_exclusion; > >>>> dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT); > >>>> > >>>> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > >>>> + if ( opt_ats > 0 && > >>>> !ivrs_dev->block_ats && > >>>> - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) > >>>> - dte->i = ats_enabled; > >>>> + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > >>>> + pci_ats_device(iommu->seg, bus, pdev->devfn) ) > >>>> + dte->i = true; > >>>> > >>>> spin_unlock_irqrestore(&iommu->lock, flags); > >>>> > >>>> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_ > >>>> ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags, > >>>> ACPI_IVHD_SYSTEM_MGMT)); > >>>> > >>>> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > >>>> + if ( opt_ats > 0 && > >>>> !ivrs_dev->block_ats && > >>>> - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) > >>>> - ASSERT(dte->i == ats_enabled); > >>>> + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > >>>> + pci_ats_device(iommu->seg, bus, pdev->devfn) ) > >>>> + ASSERT(dte->i); > >>>> > >>>> spin_unlock_irqrestore(&iommu->lock, flags); > >>>> > >>>> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_ > >>>> > >>>> ASSERT(pcidevs_locked()); > >>>> > >>>> - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && > >>>> + if ( opt_ats > 0 && > >>>> !ivrs_dev->block_ats && > >>>> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > >>>> + pci_ats_device(iommu->seg, bus, pdev->devfn) && > >>>> !pci_ats_enabled(iommu->seg, bus, pdev->devfn) ) > >>> > >>> Seeing that this same set of conditions is used in 3 different checks, > >>> could we add a wrapper for it? > >>> > >>> opt_ats > 0 && !ivrs_dev->block_ats && > >>> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && > >>> pci_ats_device(iommu->seg, bus, pdev->devfn) > >>> > >>> pci_device_ats_capable()? or some such. > >> > >> I was pondering that, yes (iirc already once when adding block_ats). > >> Problem is the name. "capable" isn't quite right when considering > >> the tristate opt_ats. And pci_device_may_use_ats() reads, well, > >> clumsy to me. If you have any good idea for a name that's fully > >> applicable and not odd or overly long, I can certainly introduce > >> such a helper. > > > > But if ATS is soft-disabled (-1) or hard disabled (0), it's fine to > > consider the devices as not ATS capable for the context here? > > I don't like mixing capability and policy aspects into a resulting > "capable". IMO we should prefer avoiding code repetition, even if at the cost of having a handler that have a maybe not ideal naming, but I can't force you to do that. Thanks, Roger.
On 12.02.2024 16:38, Roger Pau Monné wrote: > On Mon, Feb 12, 2024 at 11:45:33AM +0100, Jan Beulich wrote: >> On 12.02.2024 10:39, Roger Pau Monné wrote: >>> On Thu, Feb 08, 2024 at 04:49:46PM +0100, Jan Beulich wrote: >>>> On 08.02.2024 12:49, Roger Pau Monné wrote: >>>>> On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote: >>>>>> Make the variable a tristate, with (as done elsewhere) a negative value >>>>>> meaning "default". Since all use sites need looking at, also rename it >>>>>> to match our usual "opt_*" pattern. While touching it, also move it to >>>>>> .data.ro_after_init. >>>>>> >>>>>> The only place it retains boolean nature is pci_ats_device(), for now. >>>>> >>>>> Why does it retain the boolean nature in pci_ats_device()? >>>>> >>>>> I assume this is to avoid having to touch the line again in a further >>>>> patch, as given the current logic pci_ats_device() would also want to >>>>> treat -1 as ATS disabled. >>>> >>>> No, then I would need to touch the line. The function wants to treat >>>> -1 as "maybe enabled", so the caller can know whether a device is an >>>> ATS device regardless of whether ATS use is fully off, or only >>>> "soft-off". >>> >>> I have to admit I'm slightly concerned about this soft-off. Given the >>> current status of ATS itself in Xen, and the technology itself, I >>> think a user should always opt-in to ATS usage. >> >> The plan is to follow your suggestion in patch 3 and require explicit >> enabling for passing through of such devices. For Dom0, however, I >> think it is important that we respect the firmware request by default. >> The only viable(?!) alternative would be to panic() instead. > > Or assign to domIO? Not behind Dom0's back - I can't see how that would work if then Dom0 tried to load a driver for the device. Jan
--- a/xen/drivers/passthrough/amd/iommu_cmd.c +++ b/xen/drivers/passthrough/amd/iommu_cmd.c @@ -282,7 +282,7 @@ void amd_iommu_flush_iotlb(u8 devfn, con struct amd_iommu *iommu; unsigned int req_id, queueid, maxpend; - if ( !ats_enabled ) + if ( opt_ats <= 0 ) return; if ( !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ) @@ -340,7 +340,7 @@ static void _amd_iommu_flush_pages(struc flush_command_buffer(iommu, 0); } - if ( ats_enabled ) + if ( opt_ats > 0 ) { amd_iommu_flush_all_iotlbs(d, daddr, order); --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_ dte->ex = ivrs_dev->dte_allow_exclusion; dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT); - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && + if ( opt_ats > 0 && !ivrs_dev->block_ats && - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) - dte->i = ats_enabled; + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && + pci_ats_device(iommu->seg, bus, pdev->devfn) ) + dte->i = true; spin_unlock_irqrestore(&iommu->lock, flags); @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_ ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT)); - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && + if ( opt_ats > 0 && !ivrs_dev->block_ats && - iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) ) - ASSERT(dte->i == ats_enabled); + iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && + pci_ats_device(iommu->seg, bus, pdev->devfn) ) + ASSERT(dte->i); spin_unlock_irqrestore(&iommu->lock, flags); @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_ ASSERT(pcidevs_locked()); - if ( pci_ats_device(iommu->seg, bus, pdev->devfn) && + if ( opt_ats > 0 && !ivrs_dev->block_ats && iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) && + pci_ats_device(iommu->seg, bus, pdev->devfn) && !pci_ats_enabled(iommu->seg, bus, pdev->devfn) ) { if ( devfn == pdev->devfn ) --- a/xen/drivers/passthrough/ats.c +++ b/xen/drivers/passthrough/ats.c @@ -18,8 +18,8 @@ #include <xen/pci_regs.h> #include "ats.h" -bool __read_mostly ats_enabled; -boolean_param("ats", ats_enabled); +int8_t __ro_after_init opt_ats = -1; +boolean_param("ats", opt_ats); int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list) { --- a/xen/drivers/passthrough/ats.h +++ b/xen/drivers/passthrough/ats.h @@ -22,7 +22,7 @@ #define ATS_QUEUE_DEPTH_MASK 0x1f #define ATS_ENABLE (1<<15) -extern bool ats_enabled; +extern int8_t opt_ats; int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list); void disable_ats_device(struct pci_dev *pdev); @@ -43,7 +43,7 @@ static inline int pci_ats_enabled(int se static inline int pci_ats_device(int seg, int bus, int devfn) { - if ( !ats_enabled ) + if ( !opt_ats ) return 0; return pci_find_ext_capability(PCI_SBDF(seg, bus, devfn), --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -1543,7 +1543,7 @@ int domain_context_mapping_one( } context_set_address_root(lctxt, root); - if ( ats_enabled && ecap_dev_iotlb(iommu->ecap) ) + if ( opt_ats > 0 && ecap_dev_iotlb(iommu->ecap) ) context_set_translation_type(lctxt, CONTEXT_TT_DEV_IOTLB); else context_set_translation_type(lctxt, CONTEXT_TT_MULTI_LEVEL); --- a/xen/drivers/passthrough/vtd/x86/ats.c +++ b/xen/drivers/passthrough/vtd/x86/ats.c @@ -46,7 +46,7 @@ int ats_device(const struct pci_dev *pde struct acpi_drhd_unit *ats_drhd; int pos; - if ( !ats_enabled || !iommu_qinval ) + if ( opt_ats <= 0 || !iommu_qinval ) return 0; if ( !ecap_queued_inval(drhd->iommu->ecap) ||
Make the variable a tristate, with (as done elsewhere) a negative value meaning "default". Since all use sites need looking at, also rename it to match our usual "opt_*" pattern. While touching it, also move it to .data.ro_after_init. The only place it retains boolean nature is pci_ats_device(), for now. In AMD code re-order conditionals to have the config space accesses after (cheaper) flag checks. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- In domain_context_mapping_one() I'm a little puzzled that translation type is selected based on only IOMMU and global properties, i.e. not taking the device itself into account.