diff mbox series

drm/i915: Use per device iommu check

Message ID 20211109121759.170915-1-tvrtko.ursulin@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915: Use per device iommu check | expand

Commit Message

Tvrtko Ursulin Nov. 9, 2021, 12:17 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
and probe presence of iommu domain per device to accurately reflect its
status.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
---
Baolu, is my understanding here correct? Maybe I am confused by both
intel_iommu_gfx_mapped and dmar_map_gfx being globals in the intel_iommu
driver. But it certainly appears the setup can assign some iommu ops (and
assign the discrete i915 to iommu group) when those two are set to off.
---
 drivers/gpu/drm/i915/display/intel_bw.c      |  2 +-
 drivers/gpu/drm/i915/display/intel_display.c |  2 +-
 drivers/gpu/drm/i915/display/intel_fbc.c     |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c   |  2 +-
 drivers/gpu/drm/i915/gem/i915_gemfs.c        |  2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c         |  4 ++--
 drivers/gpu/drm/i915/i915_debugfs.c          |  1 +
 drivers/gpu/drm/i915/i915_drv.c              |  7 +++++++
 drivers/gpu/drm/i915/i915_drv.h              | 13 +++++++------
 drivers/gpu/drm/i915/i915_gpu_error.c        |  5 +----
 drivers/gpu/drm/i915/intel_device_info.c     | 14 +-------------
 drivers/gpu/drm/i915/intel_pm.c              |  2 +-
 12 files changed, 25 insertions(+), 31 deletions(-)

Comments

Lucas De Marchi Nov. 9, 2021, 5:19 p.m. UTC | #1
On Tue, Nov 09, 2021 at 12:17:59PM +0000, Tvrtko Ursulin wrote:
>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
>On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
>disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>and probe presence of iommu domain per device to accurately reflect its
>status.

nice, I was just starting to look into thus but for another reason: we
are adding support for other archs, like aarch64, and the global from here
was a problem

should we change drivers/gpu/drm/i915/Kconfig.debug to stop selecting
CONFIG_INTEL_IOMMU and CONFIG_INTEL_IOMMU_DEFAULT_ON?


thanks
Lucas De Marchi
Tvrtko Ursulin Nov. 9, 2021, 5:35 p.m. UTC | #2
On 09/11/2021 17:19, Lucas De Marchi wrote:
> On Tue, Nov 09, 2021 at 12:17:59PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>> and probe presence of iommu domain per device to accurately reflect its
>> status.
> 
> nice, I was just starting to look into thus but for another reason: we
> are adding support for other archs, like aarch64, and the global from here
> was a problem

Yes I realized the other iommu angle as well. To do this properly we 
need to sort the intel_vtd_active call sites into at least two buckets - 
which are truly about VT-d and which are just IOMMU.

For instance the THP decision in i915_gemfs.co would be "are we behind 
any iommu". Some other call sites are possibly only about the bugs in 
the igfx iommu. Not sure if there is a third bucket for any potential 
differences between igfx iommu and other Intel iommu in case of dgfx.

I'd like to hear from Baolu as well to confirm if intel_iommu driver is 
handling igfx + dgfx correctly in respect to the two global variables I 
mention in the commit message.

> should we change drivers/gpu/drm/i915/Kconfig.debug to stop selecting
> CONFIG_INTEL_IOMMU and CONFIG_INTEL_IOMMU_DEFAULT_ON?

Don't know. For debug it is useful since it can catch more issues but 
whether or not kconfig can be improved to select the right one for the 
platform? I guess select X if X86, select Y if Z?

Regards,

Tvrtko
Baolu Lu Nov. 10, 2021, 7:12 a.m. UTC | #3
Hi Tvrtko,

On 2021/11/9 20:17, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
> 
> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
> and probe presence of iommu domain per device to accurately reflect its
> status.
> 
> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
> Cc: Lu Baolu<baolu.lu@linux.intel.com>
> ---
> Baolu, is my understanding here correct? Maybe I am confused by both
> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the intel_iommu
> driver. But it certainly appears the setup can assign some iommu ops (and
> assign the discrete i915 to iommu group) when those two are set to off.

diff --git a/drivers/gpu/drm/i915/i915_drv.h 
b/drivers/gpu/drm/i915/i915_drv.h
index e967cd08f23e..9fb38a54f1fe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
  #define HAS_D12_PLANE_MINIMIZATION(dev_priv) (IS_ROCKETLAKE(dev_priv) || \
  					      IS_ALDERLAKE_S(dev_priv))

-static inline bool intel_vtd_active(void)
+static inline bool intel_vtd_active(struct drm_i915_private *i915)
  {
-#ifdef CONFIG_INTEL_IOMMU
-	if (intel_iommu_gfx_mapped)
+	if (iommu_get_domain_for_dev(i915->drm.dev))
  		return true;
-#endif

  	/* Running as a guest, we assume the host is enforcing VT'd */
  	return run_as_guest();
  }

Have you verified this change? I am afraid that
iommu_get_domain_for_dev() always gets a valid iommu domain even
intel_iommu_gfx_mapped == 0.

A possible way could look like this:

static bool intel_vtd_active(struct drm_i915_private *i915)
{
         struct iommu_domain *domain;

         domain = iommu_get_domain_for_dev(i915->drm.dev);

         if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
                 return true;

	... ...
}

Actually I don't like this either since it checks the domain->type out
of the iommu subsystem. We could refactor this later by export an iommu
interface for this check.

Best regards,
baolu
Baolu Lu Nov. 10, 2021, 7:25 a.m. UTC | #4
On 2021/11/10 1:35, Tvrtko Ursulin wrote:
> 
> On 09/11/2021 17:19, Lucas De Marchi wrote:
>> On Tue, Nov 09, 2021 at 12:17:59PM +0000, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
>>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>>> and probe presence of iommu domain per device to accurately reflect its
>>> status.
>>
>> nice, I was just starting to look into thus but for another reason: we
>> are adding support for other archs, like aarch64, and the global from 
>> here
>> was a problem
> 
> Yes I realized the other iommu angle as well. To do this properly we 
> need to sort the intel_vtd_active call sites into at least two buckets - 
> which are truly about VT-d and which are just IOMMU.
> 
> For instance the THP decision in i915_gemfs.co would be "are we behind 
> any iommu". Some other call sites are possibly only about the bugs in 
> the igfx iommu. Not sure if there is a third bucket for any potential 
> differences between igfx iommu and other Intel iommu in case of dgfx.
> 
> I'd like to hear from Baolu as well to confirm if intel_iommu driver is 
> handling igfx + dgfx correctly in respect to the two global variables I 
> mention in the commit message.

I strongly agree that the drivers should call the IOMMU interface
directly for portability. For Intel graphic driver, we have two issues:

#1) driver asks vt-d driver for identity map with intel_iommu=igfx_off.
#2) driver query the status with a global intel_iommu_gfx_mapped.

We need to solve these two problems step by step. This patch is
definitely a good start point.

Best regards,
baolu
Tvrtko Ursulin Nov. 10, 2021, 9:30 a.m. UTC | #5
On 10/11/2021 07:12, Lu Baolu wrote:
> Hi Tvrtko,
> 
> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>
>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>> and probe presence of iommu domain per device to accurately reflect its
>> status.
>>
>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>> ---
>> Baolu, is my understanding here correct? Maybe I am confused by both
>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the intel_iommu
>> driver. But it certainly appears the setup can assign some iommu ops (and
>> assign the discrete i915 to iommu group) when those two are set to off.
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> b/drivers/gpu/drm/i915/i915_drv.h
> index e967cd08f23e..9fb38a54f1fe 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) (IS_ROCKETLAKE(dev_priv) 
> || \
>                             IS_ALDERLAKE_S(dev_priv))
> 
> -static inline bool intel_vtd_active(void)
> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>   {
> -#ifdef CONFIG_INTEL_IOMMU
> -    if (intel_iommu_gfx_mapped)
> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>           return true;
> -#endif
> 
>       /* Running as a guest, we assume the host is enforcing VT'd */
>       return run_as_guest();
>   }
> 
> Have you verified this change? I am afraid that
> iommu_get_domain_for_dev() always gets a valid iommu domain even
> intel_iommu_gfx_mapped == 0.

Yes it seems to work as is:

default:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

intel_iommu=igfx_off:

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

On my system dri device 0 is integrated graphics and 1 is discrete.

Regards,

Tvrtko

> 
> A possible way could look like this:
> 
> static bool intel_vtd_active(struct drm_i915_private *i915)
> {
>          struct iommu_domain *domain;
> 
>          domain = iommu_get_domain_for_dev(i915->drm.dev);
> 
>          if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
>                  return true;
> 
>      ... ...
> }
> 
> Actually I don't like this either since it checks the domain->type out
> of the iommu subsystem. We could refactor this later by export an iommu
> interface for this check.
> 
> Best regards,
> baolu
Tvrtko Ursulin Nov. 10, 2021, 9:35 a.m. UTC | #6
On 10/11/2021 07:25, Lu Baolu wrote:
> On 2021/11/10 1:35, Tvrtko Ursulin wrote:
>>
>> On 09/11/2021 17:19, Lucas De Marchi wrote:
>>> On Tue, Nov 09, 2021 at 12:17:59PM +0000, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
>>>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>>>> and probe presence of iommu domain per device to accurately reflect its
>>>> status.
>>>
>>> nice, I was just starting to look into thus but for another reason: we
>>> are adding support for other archs, like aarch64, and the global from 
>>> here
>>> was a problem
>>
>> Yes I realized the other iommu angle as well. To do this properly we 
>> need to sort the intel_vtd_active call sites into at least two buckets 
>> - which are truly about VT-d and which are just IOMMU.
>>
>> For instance the THP decision in i915_gemfs.co would be "are we behind 
>> any iommu". Some other call sites are possibly only about the bugs in 
>> the igfx iommu. Not sure if there is a third bucket for any potential 
>> differences between igfx iommu and other Intel iommu in case of dgfx.
>>
>> I'd like to hear from Baolu as well to confirm if intel_iommu driver 
>> is handling igfx + dgfx correctly in respect to the two global 
>> variables I mention in the commit message.
> 
> I strongly agree that the drivers should call the IOMMU interface
> directly for portability. For Intel graphic driver, we have two issues:
> 
> #1) driver asks vt-d driver for identity map with intel_iommu=igfx_off.
> #2) driver query the status with a global intel_iommu_gfx_mapped.
> 
> We need to solve these two problems step by step. This patch is
> definitely a good start point.

(I should have really consolidated the thread, but never mind now.)

You mean good starting point for the discussion or between your first 
and second email you started thinking it may even work?

Because as I wrote in the other email, it appears to work. But I fully 
accept it may be by accident and you may suggest a proper API to be 
added to the IOMMU core, which I would then be happy to use.

If maybe not immediately, perhaps we could start with this patch and 
going forward add something more detailed. Like for instance allowing us 
to query the name/id of the iommu driver in case i915 needs to apply 
different quirks across them? Not sure how feasible that would be, but 
at the moment the need does sound plausible to me.

Regards,

Tvrtko
Baolu Lu Nov. 10, 2021, 12:04 p.m. UTC | #7
On 2021/11/10 17:30, Tvrtko Ursulin wrote:
> 
> On 10/11/2021 07:12, Lu Baolu wrote:
>> Hi Tvrtko,
>>
>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>
>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
>>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>>> and probe presence of iommu domain per device to accurately reflect its
>>> status.
>>>
>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>> ---
>>> Baolu, is my understanding here correct? Maybe I am confused by both
>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the intel_iommu
>>> driver. But it certainly appears the setup can assign some iommu ops 
>>> (and
>>> assign the discrete i915 to iommu group) when those two are set to off.
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index e967cd08f23e..9fb38a54f1fe 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>> (IS_ROCKETLAKE(dev_priv) || \
>>                             IS_ALDERLAKE_S(dev_priv))
>>
>> -static inline bool intel_vtd_active(void)
>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>   {
>> -#ifdef CONFIG_INTEL_IOMMU
>> -    if (intel_iommu_gfx_mapped)
>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>           return true;
>> -#endif
>>
>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>       return run_as_guest();
>>   }
>>
>> Have you verified this change? I am afraid that
>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>> intel_iommu_gfx_mapped == 0.
> 
> Yes it seems to work as is:
> 
> default:
> 
> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
> 
> intel_iommu=igfx_off:
> 
> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
> 
> On my system dri device 0 is integrated graphics and 1 is discrete.

The drm device 0 has a dedicated iommu. When the user request igfx not
mapped, the VT-d implementation will turn it off to save power. But for
shared iommu, you definitely will get it enabled.

Best regards,
baolu
Tvrtko Ursulin Nov. 10, 2021, 12:08 p.m. UTC | #8
On 10/11/2021 12:04, Lu Baolu wrote:
> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>
>> On 10/11/2021 07:12, Lu Baolu wrote:
>>> Hi Tvrtko,
>>>
>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>
>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option only
>>>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>>>> and probe presence of iommu domain per device to accurately reflect its
>>>> status.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>> ---
>>>> Baolu, is my understanding here correct? Maybe I am confused by both
>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>> intel_iommu
>>>> driver. But it certainly appears the setup can assign some iommu ops 
>>>> (and
>>>> assign the discrete i915 to iommu group) when those two are set to off.
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index e967cd08f23e..9fb38a54f1fe 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>> (IS_ROCKETLAKE(dev_priv) || \
>>>                             IS_ALDERLAKE_S(dev_priv))
>>>
>>> -static inline bool intel_vtd_active(void)
>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>   {
>>> -#ifdef CONFIG_INTEL_IOMMU
>>> -    if (intel_iommu_gfx_mapped)
>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>           return true;
>>> -#endif
>>>
>>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>>       return run_as_guest();
>>>   }
>>>
>>> Have you verified this change? I am afraid that
>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>> intel_iommu_gfx_mapped == 0.
>>
>> Yes it seems to work as is:
>>
>> default:
>>
>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>
>> intel_iommu=igfx_off:
>>
>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>
>> On my system dri device 0 is integrated graphics and 1 is discrete.
> 
> The drm device 0 has a dedicated iommu. When the user request igfx not
> mapped, the VT-d implementation will turn it off to save power. But for
> shared iommu, you definitely will get it enabled.

Sorry I am not following, what exactly do you mean? Is there a platform 
with integrated graphics without a dedicated iommu, in which case 
intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and 
iommu_get_domain_for_dev returning non-NULL?

Regards,

Tvrtko
Robin Murphy Nov. 10, 2021, 12:16 p.m. UTC | #9
On 2021-11-10 09:35, Tvrtko Ursulin wrote:
> 
> On 10/11/2021 07:25, Lu Baolu wrote:
>> On 2021/11/10 1:35, Tvrtko Ursulin wrote:
>>>
>>> On 09/11/2021 17:19, Lucas De Marchi wrote:
>>>> On Tue, Nov 09, 2021 at 12:17:59PM +0000, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>
>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option 
>>>>> only
>>>>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>>>>> and probe presence of iommu domain per device to accurately reflect 
>>>>> its
>>>>> status.
>>>>
>>>> nice, I was just starting to look into thus but for another reason: we
>>>> are adding support for other archs, like aarch64, and the global 
>>>> from here
>>>> was a problem
>>>
>>> Yes I realized the other iommu angle as well. To do this properly we 
>>> need to sort the intel_vtd_active call sites into at least two 
>>> buckets - which are truly about VT-d and which are just IOMMU.
>>>
>>> For instance the THP decision in i915_gemfs.co would be "are we 
>>> behind any iommu". Some other call sites are possibly only about the 
>>> bugs in the igfx iommu. Not sure if there is a third bucket for any 
>>> potential differences between igfx iommu and other Intel iommu in 
>>> case of dgfx.
>>>
>>> I'd like to hear from Baolu as well to confirm if intel_iommu driver 
>>> is handling igfx + dgfx correctly in respect to the two global 
>>> variables I mention in the commit message.
>>
>> I strongly agree that the drivers should call the IOMMU interface
>> directly for portability. For Intel graphic driver, we have two issues:
>>
>> #1) driver asks vt-d driver for identity map with intel_iommu=igfx_off.
>> #2) driver query the status with a global intel_iommu_gfx_mapped.
>>
>> We need to solve these two problems step by step. This patch is
>> definitely a good start point.
> 
> (I should have really consolidated the thread, but never mind now.)
> 
> You mean good starting point for the discussion or between your first 
> and second email you started thinking it may even work?
> 
> Because as I wrote in the other email, it appears to work. But I fully 
> accept it may be by accident and you may suggest a proper API to be 
> added to the IOMMU core, which I would then be happy to use.

The "proper" API at the moment is device_iommu_mapped(), but indeed that 
only answers the "is this device connected to an IOMMU at all?" 
question, it doesn't tell you whether that IOMMU is translating or just 
bypassing.

If translation only really matters in terms of DMA API usage - i.e. 
you're not interested in using the IOMMU API directly - then I reckon it 
would be fairly reasonable to use dma_get_ops() to look at whether 
you've got dma-direct or iommu_dma_ops. At the moment that's sufficient 
to tell you whether your DMA is translated or not. If a more formal 
interface is wanted in future, I'm inclined to think that it would still 
belong at the DMA API level, since the public IOMMU API is really all 
about explicit translation, whereas what we care about here is reflected 
more in its internal interaction with the DMA APIs.

> If maybe not immediately, perhaps we could start with this patch and 
> going forward add something more detailed. Like for instance allowing us 
> to query the name/id of the iommu driver in case i915 needs to apply 
> different quirks across them? Not sure how feasible that would be, but 
> at the moment the need does sound plausible to me.

FWIW I'd be wary of trying to get that clever - with iGFX it's easy to 
make fine-grained decisions because you've got a known and fixed 
integration with a particular IOMMU, but once you get out into the wider 
world you'll run into not only multiple different IOMMU implementations 
behind the same driver, but even the exact same IOMMU IP having 
different characteristics in different SoCs. Even something as 
seemingly-innocuous as an "is it worth using 2MB large pages?" quirk 
list could effectively become the cross product of various kernel config 
options and all PCIe-capable SoCs in existence.

Cheers,
Robin.
Baolu Lu Nov. 10, 2021, 12:26 p.m. UTC | #10
On 2021/11/10 17:35, Tvrtko Ursulin wrote:
> 
> On 10/11/2021 07:25, Lu Baolu wrote:
>> On 2021/11/10 1:35, Tvrtko Ursulin wrote:
>>>
>>> On 09/11/2021 17:19, Lucas De Marchi wrote:
>>>> On Tue, Nov 09, 2021 at 12:17:59PM +0000, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>
>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option 
>>>>> only
>>>>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>>>>> and probe presence of iommu domain per device to accurately reflect 
>>>>> its
>>>>> status.
>>>>
>>>> nice, I was just starting to look into thus but for another reason: we
>>>> are adding support for other archs, like aarch64, and the global 
>>>> from here
>>>> was a problem
>>>
>>> Yes I realized the other iommu angle as well. To do this properly we 
>>> need to sort the intel_vtd_active call sites into at least two 
>>> buckets - which are truly about VT-d and which are just IOMMU.
>>>
>>> For instance the THP decision in i915_gemfs.co would be "are we 
>>> behind any iommu". Some other call sites are possibly only about the 
>>> bugs in the igfx iommu. Not sure if there is a third bucket for any 
>>> potential differences between igfx iommu and other Intel iommu in 
>>> case of dgfx.
>>>
>>> I'd like to hear from Baolu as well to confirm if intel_iommu driver 
>>> is handling igfx + dgfx correctly in respect to the two global 
>>> variables I mention in the commit message.
>>
>> I strongly agree that the drivers should call the IOMMU interface
>> directly for portability. For Intel graphic driver, we have two issues:
>>
>> #1) driver asks vt-d driver for identity map with intel_iommu=igfx_off.
>> #2) driver query the status with a global intel_iommu_gfx_mapped.
>>
>> We need to solve these two problems step by step. This patch is
>> definitely a good start point.
> 
> (I should have really consolidated the thread, but never mind now.)
> 
> You mean good starting point for the discussion or between your first 
> and second email you started thinking it may even work?
> 
> Because as I wrote in the other email, it appears to work. But I fully 
> accept it may be by accident and you may suggest a proper API to be 
> added to the IOMMU core, which I would then be happy to use.
> 
> If maybe not immediately, perhaps we could start with this patch and 
> going forward add something more detailed. Like for instance allowing us 
> to query the name/id of the iommu driver in case i915 needs to apply 
> different quirks across them? Not sure how feasible that would be, but 
> at the moment the need does sound plausible to me.

The whole story in my mind looks like below:

1. The graphic device driver has its own way to let the user specify
    iommu-bypass mode. For example, early_param()...

2. If the iommu-bypass mode is set, the graphic device will get below
    set:

	dev->dma_ops_bypass = true;

3. The iommu probe procedure will use identity domain (non-mapped mode)
    for the device because value of dev->dma_ops_bypass is true.

That's all. The device driver doesn't need to check iommu for the
mapping mode. And this framework will be generic and could be used by
other device drivers.

The difficulty in doing so is that step 2 must happen before or during
the device probe process. At present, I haven't figured out how to do
it yet. :-)

Best regards,
baolu
Baolu Lu Nov. 10, 2021, 12:35 p.m. UTC | #11
On 2021/11/10 20:08, Tvrtko Ursulin wrote:
> 
> On 10/11/2021 12:04, Lu Baolu wrote:
>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>
>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>> Hi Tvrtko,
>>>>
>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>
>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option 
>>>>> only
>>>>> disables the igfx iommu. Stop relying on global intel_iommu_gfx_mapped
>>>>> and probe presence of iommu domain per device to accurately reflect 
>>>>> its
>>>>> status.
>>>>>
>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>> ---
>>>>> Baolu, is my understanding here correct? Maybe I am confused by both
>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>> intel_iommu
>>>>> driver. But it certainly appears the setup can assign some iommu 
>>>>> ops (and
>>>>> assign the discrete i915 to iommu group) when those two are set to 
>>>>> off.
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>
>>>> -static inline bool intel_vtd_active(void)
>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>   {
>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>> -    if (intel_iommu_gfx_mapped)
>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>           return true;
>>>> -#endif
>>>>
>>>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>>>       return run_as_guest();
>>>>   }
>>>>
>>>> Have you verified this change? I am afraid that
>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>> intel_iommu_gfx_mapped == 0.
>>>
>>> Yes it seems to work as is:
>>>
>>> default:
>>>
>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>
>>> intel_iommu=igfx_off:
>>>
>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>
>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>
>> The drm device 0 has a dedicated iommu. When the user request igfx not
>> mapped, the VT-d implementation will turn it off to save power. But for
>> shared iommu, you definitely will get it enabled.
> 
> Sorry I am not following, what exactly do you mean? Is there a platform 
> with integrated graphics without a dedicated iommu, in which case 
> intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and 
> iommu_get_domain_for_dev returning non-NULL?

Your code always work for an igfx with a dedicated iommu. This might be
always true on today's platforms. But from driver's point of view, we
should not make such assumption.

For example, if the iommu implementation decides not to turn off the
graphic iommu (perhaps due to some hw quirk or for graphic
virtualization), your code will be broken.

Best regards,
baolu
Tvrtko Ursulin Nov. 10, 2021, 2:11 p.m. UTC | #12
On 10/11/2021 12:35, Lu Baolu wrote:
> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>
>> On 10/11/2021 12:04, Lu Baolu wrote:
>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>
>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>> Hi Tvrtko,
>>>>>
>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>
>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option 
>>>>>> only
>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>> intel_iommu_gfx_mapped
>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>> reflect its
>>>>>> status.
>>>>>>
>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>> ---
>>>>>> Baolu, is my understanding here correct? Maybe I am confused by both
>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>> intel_iommu
>>>>>> driver. But it certainly appears the setup can assign some iommu 
>>>>>> ops (and
>>>>>> assign the discrete i915 to iommu group) when those two are set to 
>>>>>> off.
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>
>>>>> -static inline bool intel_vtd_active(void)
>>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>>   {
>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>> -    if (intel_iommu_gfx_mapped)
>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>           return true;
>>>>> -#endif
>>>>>
>>>>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>>>>       return run_as_guest();
>>>>>   }
>>>>>
>>>>> Have you verified this change? I am afraid that
>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>> intel_iommu_gfx_mapped == 0.
>>>>
>>>> Yes it seems to work as is:
>>>>
>>>> default:
>>>>
>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>
>>>> intel_iommu=igfx_off:
>>>>
>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>
>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>
>>> The drm device 0 has a dedicated iommu. When the user request igfx not
>>> mapped, the VT-d implementation will turn it off to save power. But for
>>> shared iommu, you definitely will get it enabled.
>>
>> Sorry I am not following, what exactly do you mean? Is there a 
>> platform with integrated graphics without a dedicated iommu, in which 
>> case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and 
>> iommu_get_domain_for_dev returning non-NULL?
> 
> Your code always work for an igfx with a dedicated iommu. This might be
> always true on today's platforms. But from driver's point of view, we
> should not make such assumption.
> 
> For example, if the iommu implementation decides not to turn off the
> graphic iommu (perhaps due to some hw quirk or for graphic
> virtualization), your code will be broken.

If I got it right, this would go back to your earlier recommendation to 
have the check look like this:

static bool intel_vtd_active(struct drm_i915_private *i915)
{
         struct iommu_domain *domain;

         domain = iommu_get_domain_for_dev(i915->drm.dev);
         if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
                 return true;
	...

This would be okay as a first step?

Elsewhere in the thread Robin suggested looking at the dec->dma_ops and 
comparing against iommu_dma_ops. These two solution would be effectively 
the same?

Regards,

Tvrtko
Robin Murphy Nov. 10, 2021, 2:37 p.m. UTC | #13
On 2021-11-10 14:11, Tvrtko Ursulin wrote:
> 
> On 10/11/2021 12:35, Lu Baolu wrote:
>> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>>
>>> On 10/11/2021 12:04, Lu Baolu wrote:
>>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>>> Hi Tvrtko,
>>>>>>
>>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>
>>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off 
>>>>>>> option only
>>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>>> intel_iommu_gfx_mapped
>>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>>> reflect its
>>>>>>> status.
>>>>>>>
>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>>> ---
>>>>>>> Baolu, is my understanding here correct? Maybe I am confused by both
>>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>>> intel_iommu
>>>>>>> driver. But it certainly appears the setup can assign some iommu 
>>>>>>> ops (and
>>>>>>> assign the discrete i915 to iommu group) when those two are set 
>>>>>>> to off.
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>>
>>>>>> -static inline bool intel_vtd_active(void)
>>>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>>>   {
>>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>>> -    if (intel_iommu_gfx_mapped)
>>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>>           return true;
>>>>>> -#endif
>>>>>>
>>>>>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>>>>>       return run_as_guest();
>>>>>>   }
>>>>>>
>>>>>> Have you verified this change? I am afraid that
>>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>>> intel_iommu_gfx_mapped == 0.
>>>>>
>>>>> Yes it seems to work as is:
>>>>>
>>>>> default:
>>>>>
>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>
>>>>> intel_iommu=igfx_off:
>>>>>
>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>
>>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>>
>>>> The drm device 0 has a dedicated iommu. When the user request igfx not
>>>> mapped, the VT-d implementation will turn it off to save power. But for
>>>> shared iommu, you definitely will get it enabled.
>>>
>>> Sorry I am not following, what exactly do you mean? Is there a 
>>> platform with integrated graphics without a dedicated iommu, in which 
>>> case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and 
>>> iommu_get_domain_for_dev returning non-NULL?
>>
>> Your code always work for an igfx with a dedicated iommu. This might be
>> always true on today's platforms. But from driver's point of view, we
>> should not make such assumption.
>>
>> For example, if the iommu implementation decides not to turn off the
>> graphic iommu (perhaps due to some hw quirk or for graphic
>> virtualization), your code will be broken.
> 
> If I got it right, this would go back to your earlier recommendation to 
> have the check look like this:
> 
> static bool intel_vtd_active(struct drm_i915_private *i915)
> {
>          struct iommu_domain *domain;
> 
>          domain = iommu_get_domain_for_dev(i915->drm.dev);
>          if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
>                  return true;
>      ...
> 
> This would be okay as a first step?
> 
> Elsewhere in the thread Robin suggested looking at the dec->dma_ops and 
> comparing against iommu_dma_ops. These two solution would be effectively 
> the same?

Effectively, yes. See iommu_setup_dma_ops() - the only way to end up 
with iommu_dma_ops is if a managed translation domain is present; if the 
IOMMU is present but the default domain type has been set to passthrough 
(either globally or forced for the given device) it will do nothing and 
leave you with dma-direct, while if the IOMMU has been ignored entirely 
then it should never even be called. Thus it neatly encapsulates what 
you're after here.

Cheers,
Robin.
Tvrtko Ursulin Nov. 11, 2021, 3:06 p.m. UTC | #14
On 10/11/2021 12:35, Lu Baolu wrote:
> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>
>> On 10/11/2021 12:04, Lu Baolu wrote:
>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>
>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>> Hi Tvrtko,
>>>>>
>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>
>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off option 
>>>>>> only
>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>> intel_iommu_gfx_mapped
>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>> reflect its
>>>>>> status.
>>>>>>
>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>> ---
>>>>>> Baolu, is my understanding here correct? Maybe I am confused by both
>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>> intel_iommu
>>>>>> driver. But it certainly appears the setup can assign some iommu 
>>>>>> ops (and
>>>>>> assign the discrete i915 to iommu group) when those two are set to 
>>>>>> off.
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>
>>>>> -static inline bool intel_vtd_active(void)
>>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>>   {
>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>> -    if (intel_iommu_gfx_mapped)
>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>           return true;
>>>>> -#endif
>>>>>
>>>>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>>>>       return run_as_guest();
>>>>>   }
>>>>>
>>>>> Have you verified this change? I am afraid that
>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>> intel_iommu_gfx_mapped == 0.
>>>>
>>>> Yes it seems to work as is:
>>>>
>>>> default:
>>>>
>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>
>>>> intel_iommu=igfx_off:
>>>>
>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>
>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>
>>> The drm device 0 has a dedicated iommu. When the user request igfx not
>>> mapped, the VT-d implementation will turn it off to save power. But for
>>> shared iommu, you definitely will get it enabled.
>>
>> Sorry I am not following, what exactly do you mean? Is there a 
>> platform with integrated graphics without a dedicated iommu, in which 
>> case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and 
>> iommu_get_domain_for_dev returning non-NULL?
> 
> Your code always work for an igfx with a dedicated iommu. This might be
> always true on today's platforms. But from driver's point of view, we
> should not make such assumption.
> 
> For example, if the iommu implementation decides not to turn off the
> graphic iommu (perhaps due to some hw quirk or for graphic
> virtualization), your code will be broken.

I tried your suggestion (checking for __IOMMU_DOMAIN_PAGING) and it works better, however I have observed one odd behaviour (for me at least).

In short - why does the DMAR mode for the discrete device change depending on igfx_off parameter?

Consider the laptop has these two graphics cards:

# cat /sys/kernel/debug/dri/0/name
i915 dev=0000:00:02.0 unique=0000:00:02.0 # integrated

# cat /sys/kernel/debug/dri/1/name
i915 dev=0000:03:00.0 unique=0000:03:00.0 # discrete

Booting with different options:
===============================

default / intel_iommu=on
------------------------

# cat /sys/class/iommu/dmar0/devices/0000:00:02.0/iommu_group/type
DMA-FQ
# cat /sys/class/iommu/dmar2/devices/0000:03:00.0/iommu_group/type
DMA-FQ

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled

All good.

intel_iommu=igfx_off
--------------------

## no dmar0 in sysfs
# cat /sys/class/iommu/dmar2/devices/0000:03:00.0/iommu_group/type
identity

Unexpected!?

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled # At least the i915 patch detects it correctly.

intel_iommu=off
---------------

## no dmar0 in sysfs
## no dmar2 in sysfs

# grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
/sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
/sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled

All good.

The fact discrete graphics changes from translated to pass-through when igfx_off is set is surprising to me. Is this a bug?

Regards,

Tvrtko
Tvrtko Ursulin Nov. 11, 2021, 3:18 p.m. UTC | #15
On 10/11/2021 14:37, Robin Murphy wrote:
> On 2021-11-10 14:11, Tvrtko Ursulin wrote:
>>
>> On 10/11/2021 12:35, Lu Baolu wrote:
>>> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>>>
>>>> On 10/11/2021 12:04, Lu Baolu wrote:
>>>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>>>> Hi Tvrtko,
>>>>>>>
>>>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>>
>>>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off 
>>>>>>>> option only
>>>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>>>> intel_iommu_gfx_mapped
>>>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>>>> reflect its
>>>>>>>> status.
>>>>>>>>
>>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>>>> ---
>>>>>>>> Baolu, is my understanding here correct? Maybe I am confused by 
>>>>>>>> both
>>>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>>>> intel_iommu
>>>>>>>> driver. But it certainly appears the setup can assign some iommu 
>>>>>>>> ops (and
>>>>>>>> assign the discrete i915 to iommu group) when those two are set 
>>>>>>>> to off.
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>>>
>>>>>>> -static inline bool intel_vtd_active(void)
>>>>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>>>>   {
>>>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>>>> -    if (intel_iommu_gfx_mapped)
>>>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>>>           return true;
>>>>>>> -#endif
>>>>>>>
>>>>>>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>>>>>>       return run_as_guest();
>>>>>>>   }
>>>>>>>
>>>>>>> Have you verified this change? I am afraid that
>>>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>>>> intel_iommu_gfx_mapped == 0.
>>>>>>
>>>>>> Yes it seems to work as is:
>>>>>>
>>>>>> default:
>>>>>>
>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>
>>>>>> intel_iommu=igfx_off:
>>>>>>
>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>
>>>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>>>
>>>>> The drm device 0 has a dedicated iommu. When the user request igfx not
>>>>> mapped, the VT-d implementation will turn it off to save power. But 
>>>>> for
>>>>> shared iommu, you definitely will get it enabled.
>>>>
>>>> Sorry I am not following, what exactly do you mean? Is there a 
>>>> platform with integrated graphics without a dedicated iommu, in 
>>>> which case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 
>>>> 0 and iommu_get_domain_for_dev returning non-NULL?
>>>
>>> Your code always work for an igfx with a dedicated iommu. This might be
>>> always true on today's platforms. But from driver's point of view, we
>>> should not make such assumption.
>>>
>>> For example, if the iommu implementation decides not to turn off the
>>> graphic iommu (perhaps due to some hw quirk or for graphic
>>> virtualization), your code will be broken.
>>
>> If I got it right, this would go back to your earlier recommendation 
>> to have the check look like this:
>>
>> static bool intel_vtd_active(struct drm_i915_private *i915)
>> {
>>          struct iommu_domain *domain;
>>
>>          domain = iommu_get_domain_for_dev(i915->drm.dev);
>>          if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
>>                  return true;
>>      ...
>>
>> This would be okay as a first step?
>>
>> Elsewhere in the thread Robin suggested looking at the dec->dma_ops 
>> and comparing against iommu_dma_ops. These two solution would be 
>> effectively the same?
> 
> Effectively, yes. See iommu_setup_dma_ops() - the only way to end up 
> with iommu_dma_ops is if a managed translation domain is present; if the 
> IOMMU is present but the default domain type has been set to passthrough 
> (either globally or forced for the given device) it will do nothing and 
> leave you with dma-direct, while if the IOMMU has been ignored entirely 
> then it should never even be called. Thus it neatly encapsulates what 
> you're after here.

One concern I have is whether the pass-through mode truly does nothing 
or addresses perhaps still go through the dmar hardware just with no 
translation?

If latter then most like for like change is actually exactly what the 
first version of my patch did. That is replace intel_iommu_gfx_mapped 
with a plain non-NULL check on iommu_get_domain_for_dev.

Regards,

Tvrtko
Baolu Lu Nov. 12, 2021, 12:53 a.m. UTC | #16
On 11/11/21 11:06 PM, Tvrtko Ursulin wrote:
> 
> On 10/11/2021 12:35, Lu Baolu wrote:
>> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>>
>>> On 10/11/2021 12:04, Lu Baolu wrote:
>>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>>> Hi Tvrtko,
>>>>>>
>>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>
>>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off 
>>>>>>> option only
>>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>>> intel_iommu_gfx_mapped
>>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>>> reflect its
>>>>>>> status.
>>>>>>>
>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>>> ---
>>>>>>> Baolu, is my understanding here correct? Maybe I am confused by both
>>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>>> intel_iommu
>>>>>>> driver. But it certainly appears the setup can assign some iommu 
>>>>>>> ops (and
>>>>>>> assign the discrete i915 to iommu group) when those two are set 
>>>>>>> to off.
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>>
>>>>>> -static inline bool intel_vtd_active(void)
>>>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>>>   {
>>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>>> -    if (intel_iommu_gfx_mapped)
>>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>>           return true;
>>>>>> -#endif
>>>>>>
>>>>>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>>>>>       return run_as_guest();
>>>>>>   }
>>>>>>
>>>>>> Have you verified this change? I am afraid that
>>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>>> intel_iommu_gfx_mapped == 0.
>>>>>
>>>>> Yes it seems to work as is:
>>>>>
>>>>> default:
>>>>>
>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>
>>>>> intel_iommu=igfx_off:
>>>>>
>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>
>>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>>
>>>> The drm device 0 has a dedicated iommu. When the user request igfx not
>>>> mapped, the VT-d implementation will turn it off to save power. But for
>>>> shared iommu, you definitely will get it enabled.
>>>
>>> Sorry I am not following, what exactly do you mean? Is there a 
>>> platform with integrated graphics without a dedicated iommu, in which 
>>> case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 0 and 
>>> iommu_get_domain_for_dev returning non-NULL?
>>
>> Your code always work for an igfx with a dedicated iommu. This might be
>> always true on today's platforms. But from driver's point of view, we
>> should not make such assumption.
>>
>> For example, if the iommu implementation decides not to turn off the
>> graphic iommu (perhaps due to some hw quirk or for graphic
>> virtualization), your code will be broken.
> 
> I tried your suggestion (checking for __IOMMU_DOMAIN_PAGING) and it 
> works better, however I have observed one odd behaviour (for me at least).
> 
> In short - why does the DMAR mode for the discrete device change 
> depending on igfx_off parameter?
> 
> Consider the laptop has these two graphics cards:
> 
> # cat /sys/kernel/debug/dri/0/name
> i915 dev=0000:00:02.0 unique=0000:00:02.0 # integrated
> 
> # cat /sys/kernel/debug/dri/1/name
> i915 dev=0000:03:00.0 unique=0000:03:00.0 # discrete
> 
> Booting with different options:
> ===============================
> 
> default / intel_iommu=on
> ------------------------
> 
> # cat /sys/class/iommu/dmar0/devices/0000:00:02.0/iommu_group/type
> DMA-FQ
> # cat /sys/class/iommu/dmar2/devices/0000:03:00.0/iommu_group/type
> DMA-FQ
> 
> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
> 
> All good.
> 
> intel_iommu=igfx_off
> --------------------
> 
> ## no dmar0 in sysfs
> # cat /sys/class/iommu/dmar2/devices/0000:03:00.0/iommu_group/type
> identity
> 
> Unexpected!?
> 
> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
> /sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled # At least the 
> i915 patch detects it correctly.
> 
> intel_iommu=off
> ---------------
> 
> ## no dmar0 in sysfs
> ## no dmar2 in sysfs
> 
> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
> /sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled
> 
> All good.
> 
> The fact discrete graphics changes from translated to pass-through when 
> igfx_off is set is surprising to me. Is this a bug?

The existing VT-d implementation doesn't distinguish igfx from dgfx. It
only checks whether the device is of a display class:

#define IS_GFX_DEVICE(pdev) ((pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY)

When igfx_off is specified, all graphic devices will put into pass-
through (the same meaning as identity mapping) mode. For igfx, since
the iommu is always dedicated, hence it further turn off the iommu
(hence there's no iommu domain) to save power.

Best regards,
baolu
Baolu Lu Nov. 12, 2021, 12:58 a.m. UTC | #17
On 11/11/21 11:18 PM, Tvrtko Ursulin wrote:
> 
> On 10/11/2021 14:37, Robin Murphy wrote:
>> On 2021-11-10 14:11, Tvrtko Ursulin wrote:
>>>
>>> On 10/11/2021 12:35, Lu Baolu wrote:
>>>> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 10/11/2021 12:04, Lu Baolu wrote:
>>>>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>>>>> Hi Tvrtko,
>>>>>>>>
>>>>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>>>
>>>>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off 
>>>>>>>>> option only
>>>>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>>>>> intel_iommu_gfx_mapped
>>>>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>>>>> reflect its
>>>>>>>>> status.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>>>>> ---
>>>>>>>>> Baolu, is my understanding here correct? Maybe I am confused by 
>>>>>>>>> both
>>>>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>>>>> intel_iommu
>>>>>>>>> driver. But it certainly appears the setup can assign some 
>>>>>>>>> iommu ops (and
>>>>>>>>> assign the discrete i915 to iommu group) when those two are set 
>>>>>>>>> to off.
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>>>>
>>>>>>>> -static inline bool intel_vtd_active(void)
>>>>>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>>>>>   {
>>>>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>>>>> -    if (intel_iommu_gfx_mapped)
>>>>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>>>>           return true;
>>>>>>>> -#endif
>>>>>>>>
>>>>>>>>       /* Running as a guest, we assume the host is enforcing 
>>>>>>>> VT'd */
>>>>>>>>       return run_as_guest();
>>>>>>>>   }
>>>>>>>>
>>>>>>>> Have you verified this change? I am afraid that
>>>>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>>>>> intel_iommu_gfx_mapped == 0.
>>>>>>>
>>>>>>> Yes it seems to work as is:
>>>>>>>
>>>>>>> default:
>>>>>>>
>>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>>
>>>>>>> intel_iommu=igfx_off:
>>>>>>>
>>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>>
>>>>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>>>>
>>>>>> The drm device 0 has a dedicated iommu. When the user request igfx 
>>>>>> not
>>>>>> mapped, the VT-d implementation will turn it off to save power. 
>>>>>> But for
>>>>>> shared iommu, you definitely will get it enabled.
>>>>>
>>>>> Sorry I am not following, what exactly do you mean? Is there a 
>>>>> platform with integrated graphics without a dedicated iommu, in 
>>>>> which case intel_iommu=igfx_off results in intel_iommu_gfx_mapped 
>>>>> == 0 and iommu_get_domain_for_dev returning non-NULL?
>>>>
>>>> Your code always work for an igfx with a dedicated iommu. This might be
>>>> always true on today's platforms. But from driver's point of view, we
>>>> should not make such assumption.
>>>>
>>>> For example, if the iommu implementation decides not to turn off the
>>>> graphic iommu (perhaps due to some hw quirk or for graphic
>>>> virtualization), your code will be broken.
>>>
>>> If I got it right, this would go back to your earlier recommendation 
>>> to have the check look like this:
>>>
>>> static bool intel_vtd_active(struct drm_i915_private *i915)
>>> {
>>>          struct iommu_domain *domain;
>>>
>>>          domain = iommu_get_domain_for_dev(i915->drm.dev);
>>>          if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
>>>                  return true;
>>>      ...
>>>
>>> This would be okay as a first step?
>>>
>>> Elsewhere in the thread Robin suggested looking at the dec->dma_ops 
>>> and comparing against iommu_dma_ops. These two solution would be 
>>> effectively the same?
>>
>> Effectively, yes. See iommu_setup_dma_ops() - the only way to end up 
>> with iommu_dma_ops is if a managed translation domain is present; if 
>> the IOMMU is present but the default domain type has been set to 
>> passthrough (either globally or forced for the given device) it will 
>> do nothing and leave you with dma-direct, while if the IOMMU has been 
>> ignored entirely then it should never even be called. Thus it neatly 
>> encapsulates what you're after here.
> 
> One concern I have is whether the pass-through mode truly does nothing 
> or addresses perhaps still go through the dmar hardware just with no 
> translation?

Pass-through mode means the latter.

> 
> If latter then most like for like change is actually exactly what the 
> first version of my patch did. That is replace intel_iommu_gfx_mapped 
> with a plain non-NULL check on iommu_get_domain_for_dev.

Depends on what you want here,

#1) the graphic device works in iommu pass-through mode
    - device have an iommu
    - but iommu does no translation
    - the dma transactions go through iommu with the same destination
      memory address specified by the device;

#2) the graphic device works without a system iommu
    - the iommu is off
    - there's no iommu on the path of DMA transaction.

My suggestion works for #1). Robin's suggestion (device_iommu_mapped())
could work for #2).

Best regards,
baolu
Tvrtko Ursulin Nov. 12, 2021, 1:40 p.m. UTC | #18
On 12/11/2021 00:53, Lu Baolu wrote:
> On 11/11/21 11:06 PM, Tvrtko Ursulin wrote:
>>
>> On 10/11/2021 12:35, Lu Baolu wrote:
>>> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>>>
>>>> On 10/11/2021 12:04, Lu Baolu wrote:
>>>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>>>> Hi Tvrtko,
>>>>>>>
>>>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>>
>>>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off 
>>>>>>>> option only
>>>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>>>> intel_iommu_gfx_mapped
>>>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>>>> reflect its
>>>>>>>> status.
>>>>>>>>
>>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>>>> ---
>>>>>>>> Baolu, is my understanding here correct? Maybe I am confused by 
>>>>>>>> both
>>>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>>>> intel_iommu
>>>>>>>> driver. But it certainly appears the setup can assign some iommu 
>>>>>>>> ops (and
>>>>>>>> assign the discrete i915 to iommu group) when those two are set 
>>>>>>>> to off.
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>>>
>>>>>>> -static inline bool intel_vtd_active(void)
>>>>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>>>>   {
>>>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>>>> -    if (intel_iommu_gfx_mapped)
>>>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>>>           return true;
>>>>>>> -#endif
>>>>>>>
>>>>>>>       /* Running as a guest, we assume the host is enforcing VT'd */
>>>>>>>       return run_as_guest();
>>>>>>>   }
>>>>>>>
>>>>>>> Have you verified this change? I am afraid that
>>>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>>>> intel_iommu_gfx_mapped == 0.
>>>>>>
>>>>>> Yes it seems to work as is:
>>>>>>
>>>>>> default:
>>>>>>
>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>
>>>>>> intel_iommu=igfx_off:
>>>>>>
>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>
>>>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>>>
>>>>> The drm device 0 has a dedicated iommu. When the user request igfx not
>>>>> mapped, the VT-d implementation will turn it off to save power. But 
>>>>> for
>>>>> shared iommu, you definitely will get it enabled.
>>>>
>>>> Sorry I am not following, what exactly do you mean? Is there a 
>>>> platform with integrated graphics without a dedicated iommu, in 
>>>> which case intel_iommu=igfx_off results in intel_iommu_gfx_mapped == 
>>>> 0 and iommu_get_domain_for_dev returning non-NULL?
>>>
>>> Your code always work for an igfx with a dedicated iommu. This might be
>>> always true on today's platforms. But from driver's point of view, we
>>> should not make such assumption.
>>>
>>> For example, if the iommu implementation decides not to turn off the
>>> graphic iommu (perhaps due to some hw quirk or for graphic
>>> virtualization), your code will be broken.
>>
>> I tried your suggestion (checking for __IOMMU_DOMAIN_PAGING) and it 
>> works better, however I have observed one odd behaviour (for me at 
>> least).
>>
>> In short - why does the DMAR mode for the discrete device change 
>> depending on igfx_off parameter?
>>
>> Consider the laptop has these two graphics cards:
>>
>> # cat /sys/kernel/debug/dri/0/name
>> i915 dev=0000:00:02.0 unique=0000:00:02.0 # integrated
>>
>> # cat /sys/kernel/debug/dri/1/name
>> i915 dev=0000:03:00.0 unique=0000:03:00.0 # discrete
>>
>> Booting with different options:
>> ===============================
>>
>> default / intel_iommu=on
>> ------------------------
>>
>> # cat /sys/class/iommu/dmar0/devices/0000:00:02.0/iommu_group/type
>> DMA-FQ
>> # cat /sys/class/iommu/dmar2/devices/0000:03:00.0/iommu_group/type
>> DMA-FQ
>>
>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>
>> All good.
>>
>> intel_iommu=igfx_off
>> --------------------
>>
>> ## no dmar0 in sysfs
>> # cat /sys/class/iommu/dmar2/devices/0000:03:00.0/iommu_group/type
>> identity
>>
>> Unexpected!?
>>
>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled # At least 
>> the i915 patch detects it correctly.
>>
>> intel_iommu=off
>> ---------------
>>
>> ## no dmar0 in sysfs
>> ## no dmar2 in sysfs
>>
>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled
>>
>> All good.
>>
>> The fact discrete graphics changes from translated to pass-through 
>> when igfx_off is set is surprising to me. Is this a bug?
> 
> The existing VT-d implementation doesn't distinguish igfx from dgfx. It
> only checks whether the device is of a display class:
> 
> #define IS_GFX_DEVICE(pdev) ((pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY)
> 
> When igfx_off is specified, all graphic devices will put into pass-
> through (the same meaning as identity mapping) mode. For igfx, since
> the iommu is always dedicated, hence it further turn off the iommu
> (hence there's no iommu domain) to save power.

Ah okay. Is this something we want to change/fix?

Regards,

Tvrtko
Tvrtko Ursulin Nov. 12, 2021, 2:10 p.m. UTC | #19
On 12/11/2021 00:58, Lu Baolu wrote:
> On 11/11/21 11:18 PM, Tvrtko Ursulin wrote:
>>
>> On 10/11/2021 14:37, Robin Murphy wrote:
>>> On 2021-11-10 14:11, Tvrtko Ursulin wrote:
>>>>
>>>> On 10/11/2021 12:35, Lu Baolu wrote:
>>>>> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>>>>>
>>>>>> On 10/11/2021 12:04, Lu Baolu wrote:
>>>>>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>>>>>
>>>>>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>>>>>> Hi Tvrtko,
>>>>>>>>>
>>>>>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>>>>
>>>>>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off 
>>>>>>>>>> option only
>>>>>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>>>>>> intel_iommu_gfx_mapped
>>>>>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>>>>>> reflect its
>>>>>>>>>> status.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>>>>>> ---
>>>>>>>>>> Baolu, is my understanding here correct? Maybe I am confused 
>>>>>>>>>> by both
>>>>>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>>>>>> intel_iommu
>>>>>>>>>> driver. But it certainly appears the setup can assign some 
>>>>>>>>>> iommu ops (and
>>>>>>>>>> assign the discrete i915 to iommu group) when those two are 
>>>>>>>>>> set to off.
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>>>>>
>>>>>>>>> -static inline bool intel_vtd_active(void)
>>>>>>>>> +static inline bool intel_vtd_active(struct drm_i915_private 
>>>>>>>>> *i915)
>>>>>>>>>   {
>>>>>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>>>>>> -    if (intel_iommu_gfx_mapped)
>>>>>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>>>>>           return true;
>>>>>>>>> -#endif
>>>>>>>>>
>>>>>>>>>       /* Running as a guest, we assume the host is enforcing 
>>>>>>>>> VT'd */
>>>>>>>>>       return run_as_guest();
>>>>>>>>>   }
>>>>>>>>>
>>>>>>>>> Have you verified this change? I am afraid that
>>>>>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>>>>>> intel_iommu_gfx_mapped == 0.
>>>>>>>>
>>>>>>>> Yes it seems to work as is:
>>>>>>>>
>>>>>>>> default:
>>>>>>>>
>>>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>>>
>>>>>>>> intel_iommu=igfx_off:
>>>>>>>>
>>>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>>>
>>>>>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>>>>>
>>>>>>> The drm device 0 has a dedicated iommu. When the user request 
>>>>>>> igfx not
>>>>>>> mapped, the VT-d implementation will turn it off to save power. 
>>>>>>> But for
>>>>>>> shared iommu, you definitely will get it enabled.
>>>>>>
>>>>>> Sorry I am not following, what exactly do you mean? Is there a 
>>>>>> platform with integrated graphics without a dedicated iommu, in 
>>>>>> which case intel_iommu=igfx_off results in intel_iommu_gfx_mapped 
>>>>>> == 0 and iommu_get_domain_for_dev returning non-NULL?
>>>>>
>>>>> Your code always work for an igfx with a dedicated iommu. This 
>>>>> might be
>>>>> always true on today's platforms. But from driver's point of view, we
>>>>> should not make such assumption.
>>>>>
>>>>> For example, if the iommu implementation decides not to turn off the
>>>>> graphic iommu (perhaps due to some hw quirk or for graphic
>>>>> virtualization), your code will be broken.
>>>>
>>>> If I got it right, this would go back to your earlier recommendation 
>>>> to have the check look like this:
>>>>
>>>> static bool intel_vtd_active(struct drm_i915_private *i915)
>>>> {
>>>>          struct iommu_domain *domain;
>>>>
>>>>          domain = iommu_get_domain_for_dev(i915->drm.dev);
>>>>          if (domain && (domain->type & __IOMMU_DOMAIN_PAGING))
>>>>                  return true;
>>>>      ...
>>>>
>>>> This would be okay as a first step?
>>>>
>>>> Elsewhere in the thread Robin suggested looking at the dec->dma_ops 
>>>> and comparing against iommu_dma_ops. These two solution would be 
>>>> effectively the same?
>>>
>>> Effectively, yes. See iommu_setup_dma_ops() - the only way to end up 
>>> with iommu_dma_ops is if a managed translation domain is present; if 
>>> the IOMMU is present but the default domain type has been set to 
>>> passthrough (either globally or forced for the given device) it will 
>>> do nothing and leave you with dma-direct, while if the IOMMU has been 
>>> ignored entirely then it should never even be called. Thus it neatly 
>>> encapsulates what you're after here.
>>
>> One concern I have is whether the pass-through mode truly does nothing 
>> or addresses perhaps still go through the dmar hardware just with no 
>> translation?
> 
> Pass-through mode means the latter.
> 
>>
>> If latter then most like for like change is actually exactly what the 
>> first version of my patch did. That is replace intel_iommu_gfx_mapped 
>> with a plain non-NULL check on iommu_get_domain_for_dev.
> 
> Depends on what you want here,
> 
> #1) the graphic device works in iommu pass-through mode
>     - device have an iommu
>     - but iommu does no translation
>     - the dma transactions go through iommu with the same destination
>       memory address specified by the device;

Do you have any indications of the slowdown this adds?

> #2) the graphic device works without a system iommu
>     - the iommu is off
>     - there's no iommu on the path of DMA transaction.
> 
> My suggestion works for #1). Robin's suggestion (device_iommu_mapped())
> could work for #2).

On the question of what do I want here. It seems that to preserve 
like-for-like with the current and past i915 usage, ie. 
intel_iommu_gfx_mapped, the first version of my patch should be used.

In other words if I configure the boot with iommu=pt, then 
intel_iommu_gfx_mapped is true. So if I add the __IOMMU_DOMAIN_PAGING 
check, the new intel_vtd_active would return false, where the old 
version would return true.

So v1 of the patch feels like the safest route given I don't know which 
workarounds are due remapping slowdown, and which may be present even in 
the pass-through mode.

I would explain the situation in the comment inside intel_vtd_active for 
future reference.

Regards,

Tvrtko
Tvrtko Ursulin Nov. 25, 2021, 10 a.m. UTC | #20
On 12/11/2021 13:40, Tvrtko Ursulin wrote:
> 
> On 12/11/2021 00:53, Lu Baolu wrote:
>> On 11/11/21 11:06 PM, Tvrtko Ursulin wrote:
>>>
>>> On 10/11/2021 12:35, Lu Baolu wrote:
>>>> On 2021/11/10 20:08, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 10/11/2021 12:04, Lu Baolu wrote:
>>>>>> On 2021/11/10 17:30, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>> On 10/11/2021 07:12, Lu Baolu wrote:
>>>>>>>> Hi Tvrtko,
>>>>>>>>
>>>>>>>> On 2021/11/9 20:17, Tvrtko Ursulin wrote:
>>>>>>>>> From: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>>>
>>>>>>>>> On igfx + dgfx setups, it appears that intel_iommu=igfx_off 
>>>>>>>>> option only
>>>>>>>>> disables the igfx iommu. Stop relying on global 
>>>>>>>>> intel_iommu_gfx_mapped
>>>>>>>>> and probe presence of iommu domain per device to accurately 
>>>>>>>>> reflect its
>>>>>>>>> status.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Tvrtko Ursulin<tvrtko.ursulin@intel.com>
>>>>>>>>> Cc: Lu Baolu<baolu.lu@linux.intel.com>
>>>>>>>>> ---
>>>>>>>>> Baolu, is my understanding here correct? Maybe I am confused by 
>>>>>>>>> both
>>>>>>>>> intel_iommu_gfx_mapped and dmar_map_gfx being globals in the 
>>>>>>>>> intel_iommu
>>>>>>>>> driver. But it certainly appears the setup can assign some 
>>>>>>>>> iommu ops (and
>>>>>>>>> assign the discrete i915 to iommu group) when those two are set 
>>>>>>>>> to off.
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>>>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> index e967cd08f23e..9fb38a54f1fe 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>>>> @@ -1763,26 +1763,27 @@ static inline bool run_as_guest(void)
>>>>>>>>   #define HAS_D12_PLANE_MINIMIZATION(dev_priv) 
>>>>>>>> (IS_ROCKETLAKE(dev_priv) || \
>>>>>>>>                             IS_ALDERLAKE_S(dev_priv))
>>>>>>>>
>>>>>>>> -static inline bool intel_vtd_active(void)
>>>>>>>> +static inline bool intel_vtd_active(struct drm_i915_private *i915)
>>>>>>>>   {
>>>>>>>> -#ifdef CONFIG_INTEL_IOMMU
>>>>>>>> -    if (intel_iommu_gfx_mapped)
>>>>>>>> +    if (iommu_get_domain_for_dev(i915->drm.dev))
>>>>>>>>           return true;
>>>>>>>> -#endif
>>>>>>>>
>>>>>>>>       /* Running as a guest, we assume the host is enforcing 
>>>>>>>> VT'd */
>>>>>>>>       return run_as_guest();
>>>>>>>>   }
>>>>>>>>
>>>>>>>> Have you verified this change? I am afraid that
>>>>>>>> iommu_get_domain_for_dev() always gets a valid iommu domain even
>>>>>>>> intel_iommu_gfx_mapped == 0.
>>>>>>>
>>>>>>> Yes it seems to work as is:
>>>>>>>
>>>>>>> default:
>>>>>>>
>>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>>
>>>>>>> intel_iommu=igfx_off:
>>>>>>>
>>>>>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>>>>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>>>>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>>>>>
>>>>>>> On my system dri device 0 is integrated graphics and 1 is discrete.
>>>>>>
>>>>>> The drm device 0 has a dedicated iommu. When the user request igfx 
>>>>>> not
>>>>>> mapped, the VT-d implementation will turn it off to save power. 
>>>>>> But for
>>>>>> shared iommu, you definitely will get it enabled.
>>>>>
>>>>> Sorry I am not following, what exactly do you mean? Is there a 
>>>>> platform with integrated graphics without a dedicated iommu, in 
>>>>> which case intel_iommu=igfx_off results in intel_iommu_gfx_mapped 
>>>>> == 0 and iommu_get_domain_for_dev returning non-NULL?
>>>>
>>>> Your code always work for an igfx with a dedicated iommu. This might be
>>>> always true on today's platforms. But from driver's point of view, we
>>>> should not make such assumption.
>>>>
>>>> For example, if the iommu implementation decides not to turn off the
>>>> graphic iommu (perhaps due to some hw quirk or for graphic
>>>> virtualization), your code will be broken.
>>>
>>> I tried your suggestion (checking for __IOMMU_DOMAIN_PAGING) and it 
>>> works better, however I have observed one odd behaviour (for me at 
>>> least).
>>>
>>> In short - why does the DMAR mode for the discrete device change 
>>> depending on igfx_off parameter?
>>>
>>> Consider the laptop has these two graphics cards:
>>>
>>> # cat /sys/kernel/debug/dri/0/name
>>> i915 dev=0000:00:02.0 unique=0000:00:02.0 # integrated
>>>
>>> # cat /sys/kernel/debug/dri/1/name
>>> i915 dev=0000:03:00.0 unique=0000:03:00.0 # discrete
>>>
>>> Booting with different options:
>>> ===============================
>>>
>>> default / intel_iommu=on
>>> ------------------------
>>>
>>> # cat /sys/class/iommu/dmar0/devices/0000:00:02.0/iommu_group/type
>>> DMA-FQ
>>> # cat /sys/class/iommu/dmar2/devices/0000:03:00.0/iommu_group/type
>>> DMA-FQ
>>>
>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: enabled
>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: enabled
>>>
>>> All good.
>>>
>>> intel_iommu=igfx_off
>>> --------------------
>>>
>>> ## no dmar0 in sysfs
>>> # cat /sys/class/iommu/dmar2/devices/0000:03:00.0/iommu_group/type
>>> identity
>>>
>>> Unexpected!?
>>>
>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled # At least 
>>> the i915 patch detects it correctly.
>>>
>>> intel_iommu=off
>>> ---------------
>>>
>>> ## no dmar0 in sysfs
>>> ## no dmar2 in sysfs
>>>
>>> # grep -i iommu /sys/kernel/debug/dri/*/i915_capabilities
>>> /sys/kernel/debug/dri/0/i915_capabilities:iommu: disabled
>>> /sys/kernel/debug/dri/1/i915_capabilities:iommu: disabled
>>>
>>> All good.
>>>
>>> The fact discrete graphics changes from translated to pass-through 
>>> when igfx_off is set is surprising to me. Is this a bug?
>>
>> The existing VT-d implementation doesn't distinguish igfx from dgfx. It
>> only checks whether the device is of a display class:
>>
>> #define IS_GFX_DEVICE(pdev) ((pdev->class >> 16) == 
>> PCI_BASE_CLASS_DISPLAY)
>>
>> When igfx_off is specified, all graphic devices will put into pass-
>> through (the same meaning as identity mapping) mode. For igfx, since
>> the iommu is always dedicated, hence it further turn off the iommu
>> (hence there's no iommu domain) to save power.
> 
> Ah okay. Is this something we want to change/fix?

Ping on this - don't we want to fix igfx_off option to not apply to 
discrete GPUs?

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 8d9d888e9316..a4d8088e4f71 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -490,7 +490,7 @@  static unsigned int intel_bw_data_rate(struct drm_i915_private *dev_priv,
 	for_each_pipe(dev_priv, pipe)
 		data_rate += bw_state->data_rate[pipe];
 
-	if (DISPLAY_VER(dev_priv) >= 13 && intel_vtd_active())
+	if (DISPLAY_VER(dev_priv) >= 13 && intel_vtd_active(dev_priv))
 		data_rate = data_rate * 105 / 100;
 
 	return data_rate;
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 29392dfc46c8..80d206f3e9da 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -1315,7 +1315,7 @@  static bool needs_async_flip_vtd_wa(const struct intel_crtc_state *crtc_state)
 {
 	struct drm_i915_private *i915 = to_i915(crtc_state->uapi.crtc->dev);
 
-	return crtc_state->uapi.async_flip && intel_vtd_active() &&
+	return crtc_state->uapi.async_flip && intel_vtd_active(i915) &&
 		(DISPLAY_VER(i915) == 9 || IS_BROADWELL(i915) || IS_HASWELL(i915));
 }
 
diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
index 834eb4cc7c10..0ccbfc9e4101 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -1539,7 +1539,7 @@  static int intel_sanitize_fbc_option(struct drm_i915_private *dev_priv)
 static bool need_fbc_vtd_wa(struct drm_i915_private *dev_priv)
 {
 	/* WaFbcTurnOffFbcWhenHyperVisorIsUsed:skl,bxt */
-	if (intel_vtd_active() &&
+	if (intel_vtd_active(dev_priv) &&
 	    (IS_SKYLAKE(dev_priv) || IS_BROXTON(dev_priv))) {
 		drm_info(&dev_priv->drm,
 			 "Disabling framebuffer compression (FBC) to prevent screen flicker with VT-d enabled\n");
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index ddd37ccb1362..cf100c0ea3b7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -399,7 +399,7 @@  static int i915_gem_init_stolen(struct intel_memory_region *mem)
 		return 0;
 	}
 
-	if (intel_vtd_active() && GRAPHICS_VER(i915) < 8) {
+	if (intel_vtd_active(i915) && GRAPHICS_VER(i915) < 8) {
 		drm_notice(&i915->drm,
 			   "%s, disabling use of stolen memory\n",
 			   "DMAR active");
diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c
index dbdbdc344d87..11cd66d183e6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gemfs.c
+++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c
@@ -31,7 +31,7 @@  int i915_gemfs_init(struct drm_i915_private *i915)
 	 */
 
 	opts = NULL;
-	if (intel_vtd_active()) {
+	if (intel_vtd_active(i915)) {
 		if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
 			static char huge_opt[] = "huge=within_size"; /* r/w */
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 1fb4a03d7ac3..ddb4a9d039d4 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -104,7 +104,7 @@  static bool needs_idle_maps(struct drm_i915_private *i915)
 	 * Query intel_iommu to see if we need the workaround. Presumably that
 	 * was loaded first.
 	 */
-	if (!intel_vtd_active())
+	if (!intel_vtd_active(i915))
 		return false;
 
 	if (GRAPHICS_VER(i915) == 5 && IS_MOBILE(i915))
@@ -1231,7 +1231,7 @@  int i915_ggtt_probe_hw(struct drm_i915_private *i915)
 	if (ret)
 		return ret;
 
-	if (intel_vtd_active())
+	if (intel_vtd_active(i915))
 		drm_info(&i915->drm, "VT-d active for gfx access\n");
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index fe638b5da7c0..390d541f64ea 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -65,6 +65,7 @@  static int i915_capabilities(struct seq_file *m, void *data)
 
 	intel_device_info_print_static(INTEL_INFO(i915), &p);
 	intel_device_info_print_runtime(RUNTIME_INFO(i915), &p);
+	i915_print_iommu_status(i915, &p);
 	intel_gt_info_print(&i915->gt.info, &p);
 	intel_driver_caps_print(&i915->caps, &p);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f184b44d05f2..8efd0ad5bef0 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -733,6 +733,12 @@  static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	i915_gem_driver_unregister(dev_priv);
 }
 
+void
+i915_print_iommu_status(struct drm_i915_private *i915, struct drm_printer *p)
+{
+	drm_printf(p, "iommu: %s\n", enableddisabled(intel_vtd_active(i915)));
+}
+
 static void i915_welcome_messages(struct drm_i915_private *dev_priv)
 {
 	if (drm_debug_enabled(DRM_UT_DRIVER)) {
@@ -748,6 +754,7 @@  static void i915_welcome_messages(struct drm_i915_private *dev_priv)
 
 		intel_device_info_print_static(INTEL_INFO(dev_priv), &p);
 		intel_device_info_print_runtime(RUNTIME_INFO(dev_priv), &p);
+		i915_print_iommu_status(dev_priv, &p);
 		intel_gt_info_print(&dev_priv->gt.info, &p);
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e967cd08f23e..9fb38a54f1fe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1763,26 +1763,27 @@  static inline bool run_as_guest(void)
 #define HAS_D12_PLANE_MINIMIZATION(dev_priv) (IS_ROCKETLAKE(dev_priv) || \
 					      IS_ALDERLAKE_S(dev_priv))
 
-static inline bool intel_vtd_active(void)
+static inline bool intel_vtd_active(struct drm_i915_private *i915)
 {
-#ifdef CONFIG_INTEL_IOMMU
-	if (intel_iommu_gfx_mapped)
+	if (iommu_get_domain_for_dev(i915->drm.dev))
 		return true;
-#endif
 
 	/* Running as a guest, we assume the host is enforcing VT'd */
 	return run_as_guest();
 }
 
+void
+i915_print_iommu_status(struct drm_i915_private *i915, struct drm_printer *p);
+
 static inline bool intel_scanout_needs_vtd_wa(struct drm_i915_private *dev_priv)
 {
-	return GRAPHICS_VER(dev_priv) >= 6 && intel_vtd_active();
+	return GRAPHICS_VER(dev_priv) >= 6 && intel_vtd_active(dev_priv);
 }
 
 static inline bool
 intel_ggtt_update_needs_vtd_wa(struct drm_i915_private *i915)
 {
-	return IS_BROXTON(i915) && intel_vtd_active();
+	return IS_BROXTON(i915) && intel_vtd_active(i915);
 }
 
 static inline bool
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index aa2b3aad9643..7672927ed10b 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1750,10 +1750,7 @@  static void capture_gen(struct i915_gpu_coredump *error)
 	error->wakelock = atomic_read(&i915->runtime_pm.wakeref_count);
 	error->suspended = i915->runtime_pm.suspended;
 
-	error->iommu = -1;
-#ifdef CONFIG_INTEL_IOMMU
-	error->iommu = intel_iommu_gfx_mapped;
-#endif
+	error->iommu = intel_vtd_active(i915);
 	error->reset_count = i915_reset_count(&i915->gpu_error);
 	error->suspend_count = i915->suspend_count;
 
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 6e6b317bc33c..e6605b5181a5 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -83,17 +83,6 @@  const char *intel_platform_name(enum intel_platform platform)
 	return platform_names[platform];
 }
 
-static const char *iommu_name(void)
-{
-	const char *msg = "n/a";
-
-#ifdef CONFIG_INTEL_IOMMU
-	msg = enableddisabled(intel_iommu_gfx_mapped);
-#endif
-
-	return msg;
-}
-
 void intel_device_info_print_static(const struct intel_device_info *info,
 				    struct drm_printer *p)
 {
@@ -114,7 +103,6 @@  void intel_device_info_print_static(const struct intel_device_info *info,
 		drm_printf(p, "display version: %u\n", info->display.ver);
 
 	drm_printf(p, "gt: %d\n", info->gt);
-	drm_printf(p, "iommu: %s\n", iommu_name());
 	drm_printf(p, "memory-regions: %x\n", info->memory_regions);
 	drm_printf(p, "page-sizes: %x\n", info->page_sizes);
 	drm_printf(p, "platform: %s\n", intel_platform_name(info->platform));
@@ -374,7 +362,7 @@  void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 			info->display.has_dsc = 0;
 	}
 
-	if (GRAPHICS_VER(dev_priv) == 6 && intel_vtd_active()) {
+	if (GRAPHICS_VER(dev_priv) == 6 && intel_vtd_active(dev_priv)) {
 		drm_info(&dev_priv->drm,
 			 "Disabling ppGTT for VT-d support\n");
 		info->ppgtt_type = INTEL_PPGTT_NONE;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 59adf0ce6719..f5b567f87c24 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -98,7 +98,7 @@  static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
 		 * "Plane N strech max must be programmed to 11b (x1)
 		 *  when Async flips are enabled on that plane."
 		 */
-		if (!IS_GEMINILAKE(dev_priv) && intel_vtd_active())
+		if (!IS_GEMINILAKE(dev_priv) && intel_vtd_active(dev_priv))
 			intel_uncore_rmw(&dev_priv->uncore, CHICKEN_PIPESL_1(pipe),
 					 SKL_PLANE1_STRETCH_MAX_MASK, SKL_PLANE1_STRETCH_MAX_X1);
 	}