diff mbox series

[2/7] IOMMU: rename and re-type ats_enabled

Message ID 467d24e1-8ed3-4dda-a334-70ff747bf94b@suse.com (mailing list archive)
State New, archived
Headers show
Series VT-d: SATC handling and ATS tidying | expand

Commit Message

Jan Beulich Feb. 5, 2024, 1:55 p.m. UTC
Make the variable a tristate, with (as done elsewhere) a negative value
meaning "default". Since all use sites need looking at, also rename it
to match our usual "opt_*" pattern. While touching it, also move it to
.data.ro_after_init.

The only place it retains boolean nature is pci_ats_device(), for now.

In AMD code re-order conditionals to have the config space accesses
after (cheaper) flag checks.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
In domain_context_mapping_one() I'm a little puzzled that translation
type is selected based on only IOMMU and global properties, i.e. not
taking the device itself into account.

Comments

Roger Pau Monné Feb. 8, 2024, 11:49 a.m. UTC | #1
On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote:
> Make the variable a tristate, with (as done elsewhere) a negative value
> meaning "default". Since all use sites need looking at, also rename it
> to match our usual "opt_*" pattern. While touching it, also move it to
> .data.ro_after_init.
> 
> The only place it retains boolean nature is pci_ats_device(), for now.

Why does it retain the boolean nature in pci_ats_device()?

I assume this is to avoid having to touch the line again in a further
patch, as given the current logic pci_ats_device() would also want to
treat -1 as ATS disabled.

I think this is all fine because you add additional opt_ats > 0 checks
before the call to pci_ats_device(), but would be good to know this is
the intention.

> In AMD code re-order conditionals to have the config space accesses
> after (cheaper) flag checks.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> In domain_context_mapping_one() I'm a little puzzled that translation
> type is selected based on only IOMMU and global properties, i.e. not
> taking the device itself into account.

That seems like a bug to me, we should check that the device supports
ATS (and has it enabled) before setting the translation type to
CONTEXT_TT_DEV_IOTLB unconditionally.  We should likely use
ats_device() instead of ats_enabled in domain_context_mapping_one().

There's also IMO a second bug here, which is that we possibly attempt
to flush the device IOTLB before having ATS enabled.  We flush the
device TLB in domain_context_mapping_one(), yet ATS is enabled by the
caller afterwards (see domain_context_mapping()).

> 
> --- a/xen/drivers/passthrough/amd/iommu_cmd.c
> +++ b/xen/drivers/passthrough/amd/iommu_cmd.c
> @@ -282,7 +282,7 @@ void amd_iommu_flush_iotlb(u8 devfn, con
>      struct amd_iommu *iommu;
>      unsigned int req_id, queueid, maxpend;
>  
> -    if ( !ats_enabled )
> +    if ( opt_ats <= 0 )
>          return;
>  
>      if ( !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) )
> @@ -340,7 +340,7 @@ static void _amd_iommu_flush_pages(struc
>          flush_command_buffer(iommu, 0);
>      }
>  
> -    if ( ats_enabled )
> +    if ( opt_ats > 0 )
>      {
>          amd_iommu_flush_all_iotlbs(d, daddr, order);
>  
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_
>          dte->ex = ivrs_dev->dte_allow_exclusion;
>          dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>  
> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> +        if ( opt_ats > 0 &&
>               !ivrs_dev->block_ats &&
> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
> -            dte->i = ats_enabled;
> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
> +            dte->i = true;
>  
>          spin_unlock_irqrestore(&iommu->lock, flags);
>  
> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_
>          ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags,
>                                           ACPI_IVHD_SYSTEM_MGMT));
>  
> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> +        if ( opt_ats > 0 &&
>               !ivrs_dev->block_ats &&
> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
> -            ASSERT(dte->i == ats_enabled);
> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
> +            ASSERT(dte->i);
>  
>          spin_unlock_irqrestore(&iommu->lock, flags);
>  
> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_
>  
>      ASSERT(pcidevs_locked());
>  
> -    if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> +    if ( opt_ats > 0 &&
>           !ivrs_dev->block_ats &&
>           iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> +         pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>           !pci_ats_enabled(iommu->seg, bus, pdev->devfn) )

Seeing that this same set of conditions is used in 3 different checks,
could we add a wrapper for it?

opt_ats > 0 && !ivrs_dev->block_ats &&
iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
pci_ats_device(iommu->seg, bus, pdev->devfn)

pci_device_ats_capable()? or some such.

Thanks, Roger.
Jan Beulich Feb. 8, 2024, 3:49 p.m. UTC | #2
On 08.02.2024 12:49, Roger Pau Monné wrote:
> On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote:
>> Make the variable a tristate, with (as done elsewhere) a negative value
>> meaning "default". Since all use sites need looking at, also rename it
>> to match our usual "opt_*" pattern. While touching it, also move it to
>> .data.ro_after_init.
>>
>> The only place it retains boolean nature is pci_ats_device(), for now.
> 
> Why does it retain the boolean nature in pci_ats_device()?
> 
> I assume this is to avoid having to touch the line again in a further
> patch, as given the current logic pci_ats_device() would also want to
> treat -1 as ATS disabled.

No, then I would need to touch the line. The function wants to treat
-1 as "maybe enabled", so the caller can know whether a device is an
ATS device regardless of whether ATS use is fully off, or only
"soft-off".

> I think this is all fine because you add additional opt_ats > 0 checks
> before the call to pci_ats_device(), but would be good to know this is
> the intention.

Note how amd_iommu_disable_domain_device() does not gain such a
check, for exactly the reason named above: The function would better
turn off ATS whenever enabled, no matter for what reason.

And of course - none of this "soft-off" vs "fully off" matters for
AMD (which is the only user of the function) right now anyway, seeing
they don't have an equivalent of the ATC_REQUIRED flag.

>> In AMD code re-order conditionals to have the config space accesses
>> after (cheaper) flag checks.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> In domain_context_mapping_one() I'm a little puzzled that translation
>> type is selected based on only IOMMU and global properties, i.e. not
>> taking the device itself into account.
> 
> That seems like a bug to me, we should check that the device supports
> ATS (and has it enabled) before setting the translation type to
> CONTEXT_TT_DEV_IOTLB unconditionally.  We should likely use
> ats_device() instead of ats_enabled in domain_context_mapping_one().

Will try to remember to add yet another patch then.

> There's also IMO a second bug here, which is that we possibly attempt
> to flush the device IOTLB before having ATS enabled.  We flush the
> device TLB in domain_context_mapping_one(), yet ATS is enabled by the
> caller afterwards (see domain_context_mapping()).

You may be right with this; I'd need to read up on whether such
flushing is permissible.

>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_
>>          dte->ex = ivrs_dev->dte_allow_exclusion;
>>          dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>>  
>> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>> +        if ( opt_ats > 0 &&
>>               !ivrs_dev->block_ats &&
>> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
>> -            dte->i = ats_enabled;
>> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
>> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
>> +            dte->i = true;
>>  
>>          spin_unlock_irqrestore(&iommu->lock, flags);
>>  
>> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_
>>          ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags,
>>                                           ACPI_IVHD_SYSTEM_MGMT));
>>  
>> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>> +        if ( opt_ats > 0 &&
>>               !ivrs_dev->block_ats &&
>> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
>> -            ASSERT(dte->i == ats_enabled);
>> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
>> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
>> +            ASSERT(dte->i);
>>  
>>          spin_unlock_irqrestore(&iommu->lock, flags);
>>  
>> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_
>>  
>>      ASSERT(pcidevs_locked());
>>  
>> -    if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>> +    if ( opt_ats > 0 &&
>>           !ivrs_dev->block_ats &&
>>           iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
>> +         pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>>           !pci_ats_enabled(iommu->seg, bus, pdev->devfn) )
> 
> Seeing that this same set of conditions is used in 3 different checks,
> could we add a wrapper for it?
> 
> opt_ats > 0 && !ivrs_dev->block_ats &&
> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> pci_ats_device(iommu->seg, bus, pdev->devfn)
> 
> pci_device_ats_capable()? or some such.

I was pondering that, yes (iirc already once when adding block_ats).
Problem is the name. "capable" isn't quite right when considering
the tristate opt_ats. And pci_device_may_use_ats() reads, well,
clumsy to me. If you have any good idea for a name that's fully
applicable and not odd or overly long, I can certainly introduce
such a helper.

Jan
Roger Pau Monné Feb. 12, 2024, 9:39 a.m. UTC | #3
On Thu, Feb 08, 2024 at 04:49:46PM +0100, Jan Beulich wrote:
> On 08.02.2024 12:49, Roger Pau Monné wrote:
> > On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote:
> >> Make the variable a tristate, with (as done elsewhere) a negative value
> >> meaning "default". Since all use sites need looking at, also rename it
> >> to match our usual "opt_*" pattern. While touching it, also move it to
> >> .data.ro_after_init.
> >>
> >> The only place it retains boolean nature is pci_ats_device(), for now.
> > 
> > Why does it retain the boolean nature in pci_ats_device()?
> > 
> > I assume this is to avoid having to touch the line again in a further
> > patch, as given the current logic pci_ats_device() would also want to
> > treat -1 as ATS disabled.
> 
> No, then I would need to touch the line. The function wants to treat
> -1 as "maybe enabled", so the caller can know whether a device is an
> ATS device regardless of whether ATS use is fully off, or only
> "soft-off".

I have to admit I'm slightly concerned about this soft-off.  Given the
current status of ATS itself in Xen, and the technology itself, I
think a user should always opt-in to ATS usage.

> > I think this is all fine because you add additional opt_ats > 0 checks
> > before the call to pci_ats_device(), but would be good to know this is
> > the intention.
> 
> Note how amd_iommu_disable_domain_device() does not gain such a
> check, for exactly the reason named above: The function would better
> turn off ATS whenever enabled, no matter for what reason.
> 
> And of course - none of this "soft-off" vs "fully off" matters for
> AMD (which is the only user of the function) right now anyway, seeing
> they don't have an equivalent of the ATC_REQUIRED flag.
> 
> >> In AMD code re-order conditionals to have the config space accesses
> >> after (cheaper) flag checks.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> ---
> >> In domain_context_mapping_one() I'm a little puzzled that translation
> >> type is selected based on only IOMMU and global properties, i.e. not
> >> taking the device itself into account.
> > 
> > That seems like a bug to me, we should check that the device supports
> > ATS (and has it enabled) before setting the translation type to
> > CONTEXT_TT_DEV_IOTLB unconditionally.  We should likely use
> > ats_device() instead of ats_enabled in domain_context_mapping_one().
> 
> Will try to remember to add yet another patch then.
> 
> > There's also IMO a second bug here, which is that we possibly attempt
> > to flush the device IOTLB before having ATS enabled.  We flush the
> > device TLB in domain_context_mapping_one(), yet ATS is enabled by the
> > caller afterwards (see domain_context_mapping()).
> 
> You may be right with this; I'd need to read up on whether such
> flushing is permissible.
> 
> >> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_
> >>          dte->ex = ivrs_dev->dte_allow_exclusion;
> >>          dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
> >>  
> >> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> >> +        if ( opt_ats > 0 &&
> >>               !ivrs_dev->block_ats &&
> >> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
> >> -            dte->i = ats_enabled;
> >> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> >> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
> >> +            dte->i = true;
> >>  
> >>          spin_unlock_irqrestore(&iommu->lock, flags);
> >>  
> >> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_
> >>          ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags,
> >>                                           ACPI_IVHD_SYSTEM_MGMT));
> >>  
> >> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> >> +        if ( opt_ats > 0 &&
> >>               !ivrs_dev->block_ats &&
> >> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
> >> -            ASSERT(dte->i == ats_enabled);
> >> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> >> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
> >> +            ASSERT(dte->i);
> >>  
> >>          spin_unlock_irqrestore(&iommu->lock, flags);
> >>  
> >> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_
> >>  
> >>      ASSERT(pcidevs_locked());
> >>  
> >> -    if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> >> +    if ( opt_ats > 0 &&
> >>           !ivrs_dev->block_ats &&
> >>           iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> >> +         pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> >>           !pci_ats_enabled(iommu->seg, bus, pdev->devfn) )
> > 
> > Seeing that this same set of conditions is used in 3 different checks,
> > could we add a wrapper for it?
> > 
> > opt_ats > 0 && !ivrs_dev->block_ats &&
> > iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> > pci_ats_device(iommu->seg, bus, pdev->devfn)
> > 
> > pci_device_ats_capable()? or some such.
> 
> I was pondering that, yes (iirc already once when adding block_ats).
> Problem is the name. "capable" isn't quite right when considering
> the tristate opt_ats. And pci_device_may_use_ats() reads, well,
> clumsy to me. If you have any good idea for a name that's fully
> applicable and not odd or overly long, I can certainly introduce
> such a helper.

But if ATS is soft-disabled (-1) or hard disabled (0), it's fine to
consider the devices as not ATS capable for the context here?

Thanks, Roger.
Jan Beulich Feb. 12, 2024, 10:45 a.m. UTC | #4
On 12.02.2024 10:39, Roger Pau Monné wrote:
> On Thu, Feb 08, 2024 at 04:49:46PM +0100, Jan Beulich wrote:
>> On 08.02.2024 12:49, Roger Pau Monné wrote:
>>> On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote:
>>>> Make the variable a tristate, with (as done elsewhere) a negative value
>>>> meaning "default". Since all use sites need looking at, also rename it
>>>> to match our usual "opt_*" pattern. While touching it, also move it to
>>>> .data.ro_after_init.
>>>>
>>>> The only place it retains boolean nature is pci_ats_device(), for now.
>>>
>>> Why does it retain the boolean nature in pci_ats_device()?
>>>
>>> I assume this is to avoid having to touch the line again in a further
>>> patch, as given the current logic pci_ats_device() would also want to
>>> treat -1 as ATS disabled.
>>
>> No, then I would need to touch the line. The function wants to treat
>> -1 as "maybe enabled", so the caller can know whether a device is an
>> ATS device regardless of whether ATS use is fully off, or only
>> "soft-off".
> 
> I have to admit I'm slightly concerned about this soft-off.  Given the
> current status of ATS itself in Xen, and the technology itself, I
> think a user should always opt-in to ATS usage.

The plan is to follow your suggestion in patch 3 and require explicit
enabling for passing through of such devices. For Dom0, however, I
think it is important that we respect the firmware request by default.
The only viable(?!) alternative would be to panic() instead.

>>>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
>>>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
>>>> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_
>>>>          dte->ex = ivrs_dev->dte_allow_exclusion;
>>>>          dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
>>>>  
>>>> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>>>> +        if ( opt_ats > 0 &&
>>>>               !ivrs_dev->block_ats &&
>>>> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
>>>> -            dte->i = ats_enabled;
>>>> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
>>>> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
>>>> +            dte->i = true;
>>>>  
>>>>          spin_unlock_irqrestore(&iommu->lock, flags);
>>>>  
>>>> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_
>>>>          ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags,
>>>>                                           ACPI_IVHD_SYSTEM_MGMT));
>>>>  
>>>> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>>>> +        if ( opt_ats > 0 &&
>>>>               !ivrs_dev->block_ats &&
>>>> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
>>>> -            ASSERT(dte->i == ats_enabled);
>>>> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
>>>> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
>>>> +            ASSERT(dte->i);
>>>>  
>>>>          spin_unlock_irqrestore(&iommu->lock, flags);
>>>>  
>>>> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_
>>>>  
>>>>      ASSERT(pcidevs_locked());
>>>>  
>>>> -    if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>>>> +    if ( opt_ats > 0 &&
>>>>           !ivrs_dev->block_ats &&
>>>>           iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
>>>> +         pci_ats_device(iommu->seg, bus, pdev->devfn) &&
>>>>           !pci_ats_enabled(iommu->seg, bus, pdev->devfn) )
>>>
>>> Seeing that this same set of conditions is used in 3 different checks,
>>> could we add a wrapper for it?
>>>
>>> opt_ats > 0 && !ivrs_dev->block_ats &&
>>> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
>>> pci_ats_device(iommu->seg, bus, pdev->devfn)
>>>
>>> pci_device_ats_capable()? or some such.
>>
>> I was pondering that, yes (iirc already once when adding block_ats).
>> Problem is the name. "capable" isn't quite right when considering
>> the tristate opt_ats. And pci_device_may_use_ats() reads, well,
>> clumsy to me. If you have any good idea for a name that's fully
>> applicable and not odd or overly long, I can certainly introduce
>> such a helper.
> 
> But if ATS is soft-disabled (-1) or hard disabled (0), it's fine to
> consider the devices as not ATS capable for the context here?

I don't like mixing capability and policy aspects into a resulting
"capable".

Jan
Roger Pau Monné Feb. 12, 2024, 3:38 p.m. UTC | #5
On Mon, Feb 12, 2024 at 11:45:33AM +0100, Jan Beulich wrote:
> On 12.02.2024 10:39, Roger Pau Monné wrote:
> > On Thu, Feb 08, 2024 at 04:49:46PM +0100, Jan Beulich wrote:
> >> On 08.02.2024 12:49, Roger Pau Monné wrote:
> >>> On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote:
> >>>> Make the variable a tristate, with (as done elsewhere) a negative value
> >>>> meaning "default". Since all use sites need looking at, also rename it
> >>>> to match our usual "opt_*" pattern. While touching it, also move it to
> >>>> .data.ro_after_init.
> >>>>
> >>>> The only place it retains boolean nature is pci_ats_device(), for now.
> >>>
> >>> Why does it retain the boolean nature in pci_ats_device()?
> >>>
> >>> I assume this is to avoid having to touch the line again in a further
> >>> patch, as given the current logic pci_ats_device() would also want to
> >>> treat -1 as ATS disabled.
> >>
> >> No, then I would need to touch the line. The function wants to treat
> >> -1 as "maybe enabled", so the caller can know whether a device is an
> >> ATS device regardless of whether ATS use is fully off, or only
> >> "soft-off".
> > 
> > I have to admit I'm slightly concerned about this soft-off.  Given the
> > current status of ATS itself in Xen, and the technology itself, I
> > think a user should always opt-in to ATS usage.
> 
> The plan is to follow your suggestion in patch 3 and require explicit
> enabling for passing through of such devices. For Dom0, however, I
> think it is important that we respect the firmware request by default.
> The only viable(?!) alternative would be to panic() instead.

Or assign to domIO?

> >>>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >>>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> >>>> @@ -185,10 +185,11 @@ static int __must_check amd_iommu_setup_
> >>>>          dte->ex = ivrs_dev->dte_allow_exclusion;
> >>>>          dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
> >>>>  
> >>>> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> >>>> +        if ( opt_ats > 0 &&
> >>>>               !ivrs_dev->block_ats &&
> >>>> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
> >>>> -            dte->i = ats_enabled;
> >>>> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> >>>> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
> >>>> +            dte->i = true;
> >>>>  
> >>>>          spin_unlock_irqrestore(&iommu->lock, flags);
> >>>>  
> >>>> @@ -248,10 +249,11 @@ static int __must_check amd_iommu_setup_
> >>>>          ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags,
> >>>>                                           ACPI_IVHD_SYSTEM_MGMT));
> >>>>  
> >>>> -        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> >>>> +        if ( opt_ats > 0 &&
> >>>>               !ivrs_dev->block_ats &&
> >>>> -             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
> >>>> -            ASSERT(dte->i == ats_enabled);
> >>>> +             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> >>>> +             pci_ats_device(iommu->seg, bus, pdev->devfn) )
> >>>> +            ASSERT(dte->i);
> >>>>  
> >>>>          spin_unlock_irqrestore(&iommu->lock, flags);
> >>>>  
> >>>> @@ -268,9 +270,10 @@ static int __must_check amd_iommu_setup_
> >>>>  
> >>>>      ASSERT(pcidevs_locked());
> >>>>  
> >>>> -    if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> >>>> +    if ( opt_ats > 0 &&
> >>>>           !ivrs_dev->block_ats &&
> >>>>           iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> >>>> +         pci_ats_device(iommu->seg, bus, pdev->devfn) &&
> >>>>           !pci_ats_enabled(iommu->seg, bus, pdev->devfn) )
> >>>
> >>> Seeing that this same set of conditions is used in 3 different checks,
> >>> could we add a wrapper for it?
> >>>
> >>> opt_ats > 0 && !ivrs_dev->block_ats &&
> >>> iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
> >>> pci_ats_device(iommu->seg, bus, pdev->devfn)
> >>>
> >>> pci_device_ats_capable()? or some such.
> >>
> >> I was pondering that, yes (iirc already once when adding block_ats).
> >> Problem is the name. "capable" isn't quite right when considering
> >> the tristate opt_ats. And pci_device_may_use_ats() reads, well,
> >> clumsy to me. If you have any good idea for a name that's fully
> >> applicable and not odd or overly long, I can certainly introduce
> >> such a helper.
> > 
> > But if ATS is soft-disabled (-1) or hard disabled (0), it's fine to
> > consider the devices as not ATS capable for the context here?
> 
> I don't like mixing capability and policy aspects into a resulting
> "capable".

IMO we should prefer avoiding code repetition, even if at the cost
of having a handler that have a maybe not ideal naming, but I can't
force you to do that.

Thanks, Roger.
Jan Beulich Feb. 12, 2024, 3:59 p.m. UTC | #6
On 12.02.2024 16:38, Roger Pau Monné wrote:
> On Mon, Feb 12, 2024 at 11:45:33AM +0100, Jan Beulich wrote:
>> On 12.02.2024 10:39, Roger Pau Monné wrote:
>>> On Thu, Feb 08, 2024 at 04:49:46PM +0100, Jan Beulich wrote:
>>>> On 08.02.2024 12:49, Roger Pau Monné wrote:
>>>>> On Mon, Feb 05, 2024 at 02:55:43PM +0100, Jan Beulich wrote:
>>>>>> Make the variable a tristate, with (as done elsewhere) a negative value
>>>>>> meaning "default". Since all use sites need looking at, also rename it
>>>>>> to match our usual "opt_*" pattern. While touching it, also move it to
>>>>>> .data.ro_after_init.
>>>>>>
>>>>>> The only place it retains boolean nature is pci_ats_device(), for now.
>>>>>
>>>>> Why does it retain the boolean nature in pci_ats_device()?
>>>>>
>>>>> I assume this is to avoid having to touch the line again in a further
>>>>> patch, as given the current logic pci_ats_device() would also want to
>>>>> treat -1 as ATS disabled.
>>>>
>>>> No, then I would need to touch the line. The function wants to treat
>>>> -1 as "maybe enabled", so the caller can know whether a device is an
>>>> ATS device regardless of whether ATS use is fully off, or only
>>>> "soft-off".
>>>
>>> I have to admit I'm slightly concerned about this soft-off.  Given the
>>> current status of ATS itself in Xen, and the technology itself, I
>>> think a user should always opt-in to ATS usage.
>>
>> The plan is to follow your suggestion in patch 3 and require explicit
>> enabling for passing through of such devices. For Dom0, however, I
>> think it is important that we respect the firmware request by default.
>> The only viable(?!) alternative would be to panic() instead.
> 
> Or assign to domIO?

Not behind Dom0's back - I can't see how that would work if then Dom0
tried to load a driver for the device.

Jan
diff mbox series

Patch

--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -282,7 +282,7 @@  void amd_iommu_flush_iotlb(u8 devfn, con
     struct amd_iommu *iommu;
     unsigned int req_id, queueid, maxpend;
 
-    if ( !ats_enabled )
+    if ( opt_ats <= 0 )
         return;
 
     if ( !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) )
@@ -340,7 +340,7 @@  static void _amd_iommu_flush_pages(struc
         flush_command_buffer(iommu, 0);
     }
 
-    if ( ats_enabled )
+    if ( opt_ats > 0 )
     {
         amd_iommu_flush_all_iotlbs(d, daddr, order);
 
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -185,10 +185,11 @@  static int __must_check amd_iommu_setup_
         dte->ex = ivrs_dev->dte_allow_exclusion;
         dte->sys_mgt = MASK_EXTR(ivrs_dev->device_flags, ACPI_IVHD_SYSTEM_MGMT);
 
-        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
+        if ( opt_ats > 0 &&
              !ivrs_dev->block_ats &&
-             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
-            dte->i = ats_enabled;
+             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
+             pci_ats_device(iommu->seg, bus, pdev->devfn) )
+            dte->i = true;
 
         spin_unlock_irqrestore(&iommu->lock, flags);
 
@@ -248,10 +249,11 @@  static int __must_check amd_iommu_setup_
         ASSERT(dte->sys_mgt == MASK_EXTR(ivrs_dev->device_flags,
                                          ACPI_IVHD_SYSTEM_MGMT));
 
-        if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
+        if ( opt_ats > 0 &&
              !ivrs_dev->block_ats &&
-             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
-            ASSERT(dte->i == ats_enabled);
+             iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
+             pci_ats_device(iommu->seg, bus, pdev->devfn) )
+            ASSERT(dte->i);
 
         spin_unlock_irqrestore(&iommu->lock, flags);
 
@@ -268,9 +270,10 @@  static int __must_check amd_iommu_setup_
 
     ASSERT(pcidevs_locked());
 
-    if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
+    if ( opt_ats > 0 &&
          !ivrs_dev->block_ats &&
          iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) &&
+         pci_ats_device(iommu->seg, bus, pdev->devfn) &&
          !pci_ats_enabled(iommu->seg, bus, pdev->devfn) )
     {
         if ( devfn == pdev->devfn )
--- a/xen/drivers/passthrough/ats.c
+++ b/xen/drivers/passthrough/ats.c
@@ -18,8 +18,8 @@ 
 #include <xen/pci_regs.h>
 #include "ats.h"
 
-bool __read_mostly ats_enabled;
-boolean_param("ats", ats_enabled);
+int8_t __ro_after_init opt_ats = -1;
+boolean_param("ats", opt_ats);
 
 int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list)
 {
--- a/xen/drivers/passthrough/ats.h
+++ b/xen/drivers/passthrough/ats.h
@@ -22,7 +22,7 @@ 
 #define ATS_QUEUE_DEPTH_MASK     0x1f
 #define ATS_ENABLE               (1<<15)
 
-extern bool ats_enabled;
+extern int8_t opt_ats;
 
 int enable_ats_device(struct pci_dev *pdev, struct list_head *ats_list);
 void disable_ats_device(struct pci_dev *pdev);
@@ -43,7 +43,7 @@  static inline int pci_ats_enabled(int se
 
 static inline int pci_ats_device(int seg, int bus, int devfn)
 {
-    if ( !ats_enabled )
+    if ( !opt_ats )
         return 0;
 
     return pci_find_ext_capability(PCI_SBDF(seg, bus, devfn),
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1543,7 +1543,7 @@  int domain_context_mapping_one(
         }
 
         context_set_address_root(lctxt, root);
-        if ( ats_enabled && ecap_dev_iotlb(iommu->ecap) )
+        if ( opt_ats > 0 && ecap_dev_iotlb(iommu->ecap) )
             context_set_translation_type(lctxt, CONTEXT_TT_DEV_IOTLB);
         else
             context_set_translation_type(lctxt, CONTEXT_TT_MULTI_LEVEL);
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -46,7 +46,7 @@  int ats_device(const struct pci_dev *pde
     struct acpi_drhd_unit *ats_drhd;
     int pos;
 
-    if ( !ats_enabled || !iommu_qinval )
+    if ( opt_ats <= 0 || !iommu_qinval )
         return 0;
 
     if ( !ecap_queued_inval(drhd->iommu->ecap) ||