mbox series

[v11,00/13] SMMUv3 Nested Stage Setup (IOMMU part)

Message ID 20200414150607.28488-1-eric.auger@redhat.com (mailing list archive)
Headers show
Series SMMUv3 Nested Stage Setup (IOMMU part) | expand

Message

Eric Auger April 14, 2020, 3:05 p.m. UTC
This version fixes an issue observed by Shameer on an SMMU 3.2,
when moving from dual stage config to stage 1 only config.
The 2 high 64b of the STE now get reset. Otherwise, leaving the
S2TTB set may cause a C_BAD_STE error.

This series can be found at:
https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
(including the VFIO part)
The QEMU fellow series still can be found at:
https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6

Users have expressed interest in that work and tested v9/v10:
- https://patchwork.kernel.org/cover/11039995/#23012381
- https://patchwork.kernel.org/cover/11039995/#23197235

Background:

This series brings the IOMMU part of HW nested paging support
in the SMMUv3. The VFIO part is submitted separately.

The IOMMU API is extended to support 2 new API functionalities:
1) pass the guest stage 1 configuration
2) pass stage 1 MSI bindings

Then those capabilities gets implemented in the SMMUv3 driver.

The virtualizer passes information through the VFIO user API
which cascades them to the iommu subsystem. This allows the guest
to own stage 1 tables and context descriptors (so-called PASID
table) while the host owns stage 2 tables and main configuration
structures (STE).

Best Regards

Eric


History:

v10 -> v11:
- S2TTB reset when S2 is off
- fix compil issue when CONFIG_IOMMU_DMA is not set

v9 -> v10:
- rebase on top of 5.6.0-rc3

v8 -> v9:
- rebase on 5.3
- split iommu/vfio parts

v6 -> v8:
- Implement VFIO-PCI device specific interrupt framework

v7 -> v8:
- rebase on top of v5.2-rc1 and especially
  8be39a1a04c1  iommu/arm-smmu-v3: Add a master->domain pointer
- dynamic alloc of s1_cfg/s2_cfg
- __arm_smmu_tlb_inv_asid/s1_range_nosync
- check there is no HW MSI regions
- asid invalidation using pasid extended struct (change in the uapi)
- add s1_live/s2_live checks
- move check about support of nested stages in domain finalise
- fixes in error reporting according to the discussion with Robin
- reordered the patches to have first iommu/smmuv3 patches and then
  VFIO patches

v6 -> v7:
- removed device handle from bind/unbind_guest_msi
- added "iommu/smmuv3: Nested mode single MSI doorbell per domain
  enforcement"
- added few uapi comments as suggested by Jean, Jacop and Alex

v5 -> v6:
- Fix compilation issue when CONFIG_IOMMU_API is unset

v4 -> v5:
- fix bug reported by Vincent: fault handler unregistration now happens in
  vfio_pci_release
- IOMMU_FAULT_PERM_* moved outside of struct definition + small
  uapi changes suggested by Kean-Philippe (except fetch_addr)
- iommu: introduce device fault report API: removed the PRI part.
- see individual logs for more details
- reset the ste abort flag on detach

v3 -> v4:
- took into account Alex, jean-Philippe and Robin's comments on v3
- rework of the smmuv3 driver integration
- add tear down ops for msi binding and PASID table binding
- fix S1 fault propagation
- put fault reporting patches at the beginning of the series following
  Jean-Philippe's request
- update of the cache invalidate and fault API uapis
- VFIO fault reporting rework with 2 separate regions and one mmappable
  segment for the fault queue
- moved to PATCH

v2 -> v3:
- When registering the S1 MSI binding we now store the device handle. This
  addresses Robin's comment about discimination of devices beonging to
  different S1 groups and using different physical MSI doorbells.
- Change the fault reporting API: use VFIO_PCI_DMA_FAULT_IRQ_INDEX to
  set the eventfd and expose the faults through an mmappable fault region

v1 -> v2:
- Added the fault reporting capability
- asid properly passed on invalidation (fix assignment of multiple
  devices)
- see individual change logs for more info

Eric Auger (11):
  iommu: Introduce bind/unbind_guest_msi
  iommu/smmuv3: Dynamically allocate s1_cfg and s2_cfg
  iommu/smmuv3: Get prepared for nested stage support
  iommu/smmuv3: Implement attach/detach_pasid_table
  iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs
  iommu/smmuv3: Implement cache_invalidate
  dma-iommu: Implement NESTED_MSI cookie
  iommu/smmuv3: Nested mode single MSI doorbell per domain enforcement
  iommu/smmuv3: Enforce incompatibility between nested mode and HW MSI
    regions
  iommu/smmuv3: Implement bind/unbind_guest_msi
  iommu/smmuv3: Report non recoverable faults

Jacob Pan (1):
  iommu: Introduce attach/detach_pasid_table API

Jean-Philippe Brucker (1):
  iommu/arm-smmu-v3: Maintain a SID->device structure

 drivers/iommu/arm-smmu-v3.c | 744 ++++++++++++++++++++++++++++++++----
 drivers/iommu/dma-iommu.c   | 142 ++++++-
 drivers/iommu/iommu.c       |  56 +++
 include/linux/dma-iommu.h   |  16 +
 include/linux/iommu.h       |  38 ++
 include/uapi/linux/iommu.h  |  51 +++
 6 files changed, 975 insertions(+), 72 deletions(-)

Comments

Zhangfei Gao April 16, 2020, 4:25 a.m. UTC | #1
On 2020/4/14 下午11:05, Eric Auger wrote:
> This version fixes an issue observed by Shameer on an SMMU 3.2,
> when moving from dual stage config to stage 1 only config.
> The 2 high 64b of the STE now get reset. Otherwise, leaving the
> S2TTB set may cause a C_BAD_STE error.
>
> This series can be found at:
> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
> (including the VFIO part)
> The QEMU fellow series still can be found at:
> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
>
> Users have expressed interest in that work and tested v9/v10:
> - https://patchwork.kernel.org/cover/11039995/#23012381
> - https://patchwork.kernel.org/cover/11039995/#23197235
>
> Background:
>
> This series brings the IOMMU part of HW nested paging support
> in the SMMUv3. The VFIO part is submitted separately.
>
> The IOMMU API is extended to support 2 new API functionalities:
> 1) pass the guest stage 1 configuration
> 2) pass stage 1 MSI bindings
>
> Then those capabilities gets implemented in the SMMUv3 driver.
>
> The virtualizer passes information through the VFIO user API
> which cascades them to the iommu subsystem. This allows the guest
> to own stage 1 tables and context descriptors (so-called PASID
> table) while the host owns stage 2 tables and main configuration
> structures (STE).
>
>

Thanks Eric

Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator.
1. no-sva works, where guest app directly use physical address via ioctl.
2. vSVA still not work, same as v10,
3.  the v10 issue reported by Shameer has been solved,  first start qemu 
with  iommu=smmuv3, then start qemu without  iommu=smmuv3
4. no-sva also works without  iommu=smmuv3

Test details in https://docs.qq.com/doc/DRU5oR1NtUERseFNL

Thanks
Eric Auger April 16, 2020, 7:45 a.m. UTC | #2
Hi Zhangfei,

On 4/16/20 6:25 AM, Zhangfei Gao wrote:
> 
> 
> On 2020/4/14 下午11:05, Eric Auger wrote:
>> This version fixes an issue observed by Shameer on an SMMU 3.2,
>> when moving from dual stage config to stage 1 only config.
>> The 2 high 64b of the STE now get reset. Otherwise, leaving the
>> S2TTB set may cause a C_BAD_STE error.
>>
>> This series can be found at:
>> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
>> (including the VFIO part)
>> The QEMU fellow series still can be found at:
>> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
>>
>> Users have expressed interest in that work and tested v9/v10:
>> - https://patchwork.kernel.org/cover/11039995/#23012381
>> - https://patchwork.kernel.org/cover/11039995/#23197235
>>
>> Background:
>>
>> This series brings the IOMMU part of HW nested paging support
>> in the SMMUv3. The VFIO part is submitted separately.
>>
>> The IOMMU API is extended to support 2 new API functionalities:
>> 1) pass the guest stage 1 configuration
>> 2) pass stage 1 MSI bindings
>>
>> Then those capabilities gets implemented in the SMMUv3 driver.
>>
>> The virtualizer passes information through the VFIO user API
>> which cascades them to the iommu subsystem. This allows the guest
>> to own stage 1 tables and context descriptors (so-called PASID
>> table) while the host owns stage 2 tables and main configuration
>> structures (STE).
>>
>>
> 
> Thanks Eric
> 
> Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator.
> 1. no-sva works, where guest app directly use physical address via ioctl.
Thank you for the testing. Glad it works for you.
> 2. vSVA still not work, same as v10,
Yes that's normal this series is not meant to support vSVM at this stage.

I intend to add the missing pieces during the next weeks.

Thanks

Eric
> 3.  the v10 issue reported by Shameer has been solved,  first start qemu
> with  iommu=smmuv3, then start qemu without  iommu=smmuv3
> 4. no-sva also works without  iommu=smmuv3
> 
> Test details in https://docs.qq.com/doc/DRU5oR1NtUERseFNL
> 
> Thanks
>
Shameerali Kolothum Thodi April 30, 2020, 9:39 a.m. UTC | #3
Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 16 April 2020 08:45
> To: Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
> Cc: jean-philippe@linaro.org; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; alex.williamson@redhat.com;
> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
> tn@semihalf.com; bbhushan2@marvell.com
> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Zhangfei,
> 
> On 4/16/20 6:25 AM, Zhangfei Gao wrote:
> >
> >
> > On 2020/4/14 下午11:05, Eric Auger wrote:
> >> This version fixes an issue observed by Shameer on an SMMU 3.2,
> >> when moving from dual stage config to stage 1 only config.
> >> The 2 high 64b of the STE now get reset. Otherwise, leaving the
> >> S2TTB set may cause a C_BAD_STE error.
> >>
> >> This series can be found at:
> >> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
> >> (including the VFIO part)
> >> The QEMU fellow series still can be found at:
> >> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
> >>
> >> Users have expressed interest in that work and tested v9/v10:
> >> - https://patchwork.kernel.org/cover/11039995/#23012381
> >> - https://patchwork.kernel.org/cover/11039995/#23197235
> >>
> >> Background:
> >>
> >> This series brings the IOMMU part of HW nested paging support
> >> in the SMMUv3. The VFIO part is submitted separately.
> >>
> >> The IOMMU API is extended to support 2 new API functionalities:
> >> 1) pass the guest stage 1 configuration
> >> 2) pass stage 1 MSI bindings
> >>
> >> Then those capabilities gets implemented in the SMMUv3 driver.
> >>
> >> The virtualizer passes information through the VFIO user API
> >> which cascades them to the iommu subsystem. This allows the guest
> >> to own stage 1 tables and context descriptors (so-called PASID
> >> table) while the host owns stage 2 tables and main configuration
> >> structures (STE).
> >>
> >>
> >
> > Thanks Eric
> >
> > Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator.
> > 1. no-sva works, where guest app directly use physical address via ioctl.
> Thank you for the testing. Glad it works for you.
> > 2. vSVA still not work, same as v10,
> Yes that's normal this series is not meant to support vSVM at this stage.
> 
> I intend to add the missing pieces during the next weeks.

Thanks for that. I have made an attempt to add the vSVA based on 
your v10 + JPBs sva patches. The host kernel and Qemu changes can 
be found here[1][2].

This basically adds multiple pasid support on top of your changes.
I have done some basic sanity testing and we have some initial success
with the zip vf dev on our D06 platform. Please note that the STALL event is
not yet supported though, but works fine if we mlock() guest usr mem.

I also noted that Intel patches for vSVA has couple of changes in the vfio interfaces
and hope there will be a convergence soon. Please let me know your plans
of a respin of this series and see whether incorporating the changes for multiple
pasid make sense or not for now.

Thanks,
Shameer

[1]https://github.com/hisilicon/qemu/tree/v4.2.0-2stage-rfcv6-vsva-prototype-v1
[2]https://github.com/hisilicon/kernel-dev/tree/vsva-prototype-host-v1

> Thanks
> 
> Eric
> > 3.  the v10 issue reported by Shameer has been solved,  first start qemu
> > with  iommu=smmuv3, then start qemu without  iommu=smmuv3
> > 4. no-sva also works without  iommu=smmuv3
> >
> > Test details in https://docs.qq.com/doc/DRU5oR1NtUERseFNL
> >
> > Thanks
> >
Shameerali Kolothum Thodi May 7, 2020, 6:59 a.m. UTC | #4
Hi Eric,

> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: 30 April 2020 10:38
> To: 'Auger Eric' <eric.auger@redhat.com>; Zhangfei Gao
> <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
> Cc: jean-philippe@linaro.org; alex.williamson@redhat.com;
> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
> tn@semihalf.com; bbhushan2@marvell.com
> Subject: RE: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Eric,
> 
> > -----Original Message-----
> > From: Auger Eric [mailto:eric.auger@redhat.com]
> > Sent: 16 April 2020 08:45
> > To: Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
> > iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> > kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
> > joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
> > Cc: jean-philippe@linaro.org; Shameerali Kolothum Thodi
> > <shameerali.kolothum.thodi@huawei.com>; alex.williamson@redhat.com;
> > jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
> > tn@semihalf.com; bbhushan2@marvell.com
> > Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> >
> > Hi Zhangfei,
> >
> > On 4/16/20 6:25 AM, Zhangfei Gao wrote:
> > >
> > >
> > > On 2020/4/14 下午11:05, Eric Auger wrote:
> > >> This version fixes an issue observed by Shameer on an SMMU 3.2,
> > >> when moving from dual stage config to stage 1 only config.
> > >> The 2 high 64b of the STE now get reset. Otherwise, leaving the
> > >> S2TTB set may cause a C_BAD_STE error.
> > >>
> > >> This series can be found at:
> > >> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
> > >> (including the VFIO part)
> > >> The QEMU fellow series still can be found at:
> > >> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
> > >>
> > >> Users have expressed interest in that work and tested v9/v10:
> > >> - https://patchwork.kernel.org/cover/11039995/#23012381
> > >> - https://patchwork.kernel.org/cover/11039995/#23197235
> > >>
> > >> Background:
> > >>
> > >> This series brings the IOMMU part of HW nested paging support
> > >> in the SMMUv3. The VFIO part is submitted separately.
> > >>
> > >> The IOMMU API is extended to support 2 new API functionalities:
> > >> 1) pass the guest stage 1 configuration
> > >> 2) pass stage 1 MSI bindings
> > >>
> > >> Then those capabilities gets implemented in the SMMUv3 driver.
> > >>
> > >> The virtualizer passes information through the VFIO user API
> > >> which cascades them to the iommu subsystem. This allows the guest
> > >> to own stage 1 tables and context descriptors (so-called PASID
> > >> table) while the host owns stage 2 tables and main configuration
> > >> structures (STE).
> > >>
> > >>
> > >
> > > Thanks Eric
> > >
> > > Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator.
> > > 1. no-sva works, where guest app directly use physical address via ioctl.
> > Thank you for the testing. Glad it works for you.
> > > 2. vSVA still not work, same as v10,
> > Yes that's normal this series is not meant to support vSVM at this stage.
> >
> > I intend to add the missing pieces during the next weeks.
> 
> Thanks for that. I have made an attempt to add the vSVA based on
> your v10 + JPBs sva patches. The host kernel and Qemu changes can
> be found here[1][2].
> 
> This basically adds multiple pasid support on top of your changes.
> I have done some basic sanity testing and we have some initial success
> with the zip vf dev on our D06 platform. Please note that the STALL event is
> not yet supported though, but works fine if we mlock() guest usr mem.

I have added STALL support for our vSVA prototype and it seems to be
working(on our hardware). I have updated the kernel and qemu branches with
the same[1][2]. I should warn you though that these are prototype code and I am pretty
much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost everything.
But thought of sharing, in case if it is useful somehow!.

Thanks,
Shameer

[1]https://github.com/hisilicon/kernel-dev/commits/vsva-prototype-host-v1

[2]https://github.com/hisilicon/qemu/tree/v4.2.0-2stage-rfcv6-vsva-prototype-v1
Eric Auger May 7, 2020, 7:45 a.m. UTC | #5
Hi Shameer,

On 5/7/20 8:59 AM, Shameerali Kolothum Thodi wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Shameerali Kolothum Thodi
>> Sent: 30 April 2020 10:38
>> To: 'Auger Eric' <eric.auger@redhat.com>; Zhangfei Gao
>> <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
>> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
>> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
>> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
>> Cc: jean-philippe@linaro.org; alex.williamson@redhat.com;
>> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
>> tn@semihalf.com; bbhushan2@marvell.com
>> Subject: RE: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
>>
>> Hi Eric,
>>
>>> -----Original Message-----
>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>> Sent: 16 April 2020 08:45
>>> To: Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
>>> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
>>> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
>>> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
>>> Cc: jean-philippe@linaro.org; Shameerali Kolothum Thodi
>>> <shameerali.kolothum.thodi@huawei.com>; alex.williamson@redhat.com;
>>> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
>>> tn@semihalf.com; bbhushan2@marvell.com
>>> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
>>>
>>> Hi Zhangfei,
>>>
>>> On 4/16/20 6:25 AM, Zhangfei Gao wrote:
>>>>
>>>>
>>>> On 2020/4/14 下午11:05, Eric Auger wrote:
>>>>> This version fixes an issue observed by Shameer on an SMMU 3.2,
>>>>> when moving from dual stage config to stage 1 only config.
>>>>> The 2 high 64b of the STE now get reset. Otherwise, leaving the
>>>>> S2TTB set may cause a C_BAD_STE error.
>>>>>
>>>>> This series can be found at:
>>>>> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
>>>>> (including the VFIO part)
>>>>> The QEMU fellow series still can be found at:
>>>>> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
>>>>>
>>>>> Users have expressed interest in that work and tested v9/v10:
>>>>> - https://patchwork.kernel.org/cover/11039995/#23012381
>>>>> - https://patchwork.kernel.org/cover/11039995/#23197235
>>>>>
>>>>> Background:
>>>>>
>>>>> This series brings the IOMMU part of HW nested paging support
>>>>> in the SMMUv3. The VFIO part is submitted separately.
>>>>>
>>>>> The IOMMU API is extended to support 2 new API functionalities:
>>>>> 1) pass the guest stage 1 configuration
>>>>> 2) pass stage 1 MSI bindings
>>>>>
>>>>> Then those capabilities gets implemented in the SMMUv3 driver.
>>>>>
>>>>> The virtualizer passes information through the VFIO user API
>>>>> which cascades them to the iommu subsystem. This allows the guest
>>>>> to own stage 1 tables and context descriptors (so-called PASID
>>>>> table) while the host owns stage 2 tables and main configuration
>>>>> structures (STE).
>>>>>
>>>>>
>>>>
>>>> Thanks Eric
>>>>
>>>> Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator.
>>>> 1. no-sva works, where guest app directly use physical address via ioctl.
>>> Thank you for the testing. Glad it works for you.
>>>> 2. vSVA still not work, same as v10,
>>> Yes that's normal this series is not meant to support vSVM at this stage.
>>>
>>> I intend to add the missing pieces during the next weeks.
>>
>> Thanks for that. I have made an attempt to add the vSVA based on
>> your v10 + JPBs sva patches. The host kernel and Qemu changes can
>> be found here[1][2].
>>
>> This basically adds multiple pasid support on top of your changes.
>> I have done some basic sanity testing and we have some initial success
>> with the zip vf dev on our D06 platform. Please note that the STALL event is
>> not yet supported though, but works fine if we mlock() guest usr mem.
> 
> I have added STALL support for our vSVA prototype and it seems to be
> working(on our hardware). I have updated the kernel and qemu branches with
> the same[1][2]. I should warn you though that these are prototype code and I am pretty
> much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost everything.
> But thought of sharing, in case if it is useful somehow!.

Thank you very much for your work. I intend to look at your additions by
beginning of next week.

Best Regards

Eric
> 
> Thanks,
> Shameer
> 
> [1]https://github.com/hisilicon/kernel-dev/commits/vsva-prototype-host-v1
> 
> [2]https://github.com/hisilicon/qemu/tree/v4.2.0-2stage-rfcv6-vsva-prototype-v1
>
Eric Auger May 13, 2020, 1:28 p.m. UTC | #6
Hi Shameer,

On 5/7/20 8:59 AM, Shameerali Kolothum Thodi wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Shameerali Kolothum Thodi
>> Sent: 30 April 2020 10:38
>> To: 'Auger Eric' <eric.auger@redhat.com>; Zhangfei Gao
>> <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
>> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
>> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
>> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
>> Cc: jean-philippe@linaro.org; alex.williamson@redhat.com;
>> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
>> tn@semihalf.com; bbhushan2@marvell.com
>> Subject: RE: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
>>
>> Hi Eric,
>>
>>> -----Original Message-----
>>> From: Auger Eric [mailto:eric.auger@redhat.com]
>>> Sent: 16 April 2020 08:45
>>> To: Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
>>> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
>>> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
>>> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
>>> Cc: jean-philippe@linaro.org; Shameerali Kolothum Thodi
>>> <shameerali.kolothum.thodi@huawei.com>; alex.williamson@redhat.com;
>>> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
>>> tn@semihalf.com; bbhushan2@marvell.com
>>> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
>>>
>>> Hi Zhangfei,
>>>
>>> On 4/16/20 6:25 AM, Zhangfei Gao wrote:
>>>>
>>>>
>>>> On 2020/4/14 下午11:05, Eric Auger wrote:
>>>>> This version fixes an issue observed by Shameer on an SMMU 3.2,
>>>>> when moving from dual stage config to stage 1 only config.
>>>>> The 2 high 64b of the STE now get reset. Otherwise, leaving the
>>>>> S2TTB set may cause a C_BAD_STE error.
>>>>>
>>>>> This series can be found at:
>>>>> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
>>>>> (including the VFIO part)
>>>>> The QEMU fellow series still can be found at:
>>>>> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
>>>>>
>>>>> Users have expressed interest in that work and tested v9/v10:
>>>>> - https://patchwork.kernel.org/cover/11039995/#23012381
>>>>> - https://patchwork.kernel.org/cover/11039995/#23197235
>>>>>
>>>>> Background:
>>>>>
>>>>> This series brings the IOMMU part of HW nested paging support
>>>>> in the SMMUv3. The VFIO part is submitted separately.
>>>>>
>>>>> The IOMMU API is extended to support 2 new API functionalities:
>>>>> 1) pass the guest stage 1 configuration
>>>>> 2) pass stage 1 MSI bindings
>>>>>
>>>>> Then those capabilities gets implemented in the SMMUv3 driver.
>>>>>
>>>>> The virtualizer passes information through the VFIO user API
>>>>> which cascades them to the iommu subsystem. This allows the guest
>>>>> to own stage 1 tables and context descriptors (so-called PASID
>>>>> table) while the host owns stage 2 tables and main configuration
>>>>> structures (STE).
>>>>>
>>>>>
>>>>
>>>> Thanks Eric
>>>>
>>>> Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator.
>>>> 1. no-sva works, where guest app directly use physical address via ioctl.
>>> Thank you for the testing. Glad it works for you.
>>>> 2. vSVA still not work, same as v10,
>>> Yes that's normal this series is not meant to support vSVM at this stage.
>>>
>>> I intend to add the missing pieces during the next weeks.
>>
>> Thanks for that. I have made an attempt to add the vSVA based on
>> your v10 + JPBs sva patches. The host kernel and Qemu changes can
>> be found here[1][2].
>>
>> This basically adds multiple pasid support on top of your changes.
>> I have done some basic sanity testing and we have some initial success
>> with the zip vf dev on our D06 platform. Please note that the STALL event is
>> not yet supported though, but works fine if we mlock() guest usr mem.
> 
> I have added STALL support for our vSVA prototype and it seems to be
> working(on our hardware). I have updated the kernel and qemu branches with
> the same[1][2]. I should warn you though that these are prototype code and I am pretty
> much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost everything.
> But thought of sharing, in case if it is useful somehow!.

Thank you again for sharing the POC. I looked at the kernel and QEMU
branches.

Here are some preliminary comments:
- "arm-smmu-v3: Reset S2TTB while switching back from nested stage":  as
you mentionned S2TTB reset now is featured in v11
- "arm-smmu-v3: Add support for multiple pasid in nested mode": I could
easily integrate this into my series. Update the iommu api first and
pass multiple CD info in a separate patch
- "arm-smmu-v3: Add support to Invalidate CD": CD invalidation should be
cascaded to host through the PASID cache invalidation uapi (no pb you
warned us for the POC you simply used VFIO_IOMMU_SET_PASID_TABLE). I
think I should add this support in my original series although it does
not seem to trigger any issue up to now.
- "arm-smmu-v3: Remove duplication of fault propagation". I understand
the transcode is done somewhere else with SVA but we still need to do it
if a single CD is used, right? I will review the SVA code to better
understand.
- for the STALL response injection I would tend to use a new VFIO region
for responses. At the moment there is a single VFIO region for reporting
the fault.

On QEMU side:
- I am currently working on 3.2 range invalidation support which is
needed for DPDK/VFIO
- While at it I will look at how to incrementally introduce some of the
features you need in this series.

Thanks

Eric



> 
> Thanks,
> Shameer
> 
> [1]https://github.com/hisilicon/kernel-dev/commits/vsva-prototype-host-v1
> 
> [2]https://github.com/hisilicon/qemu/tree/v4.2.0-2stage-rfcv6-vsva-prototype-v1
>
Shameerali Kolothum Thodi May 13, 2020, 3:57 p.m. UTC | #7
Hi Eric,

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 13 May 2020 14:29
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
> Cc: jean-philippe@linaro.org; alex.williamson@redhat.com;
> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
> tn@semihalf.com; bbhushan2@marvell.com
> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> 
[...]

> >>> Yes that's normal this series is not meant to support vSVM at this stage.
> >>>
> >>> I intend to add the missing pieces during the next weeks.
> >>
> >> Thanks for that. I have made an attempt to add the vSVA based on
> >> your v10 + JPBs sva patches. The host kernel and Qemu changes can
> >> be found here[1][2].
> >>
> >> This basically adds multiple pasid support on top of your changes.
> >> I have done some basic sanity testing and we have some initial success
> >> with the zip vf dev on our D06 platform. Please note that the STALL event is
> >> not yet supported though, but works fine if we mlock() guest usr mem.
> >
> > I have added STALL support for our vSVA prototype and it seems to be
> > working(on our hardware). I have updated the kernel and qemu branches
> with
> > the same[1][2]. I should warn you though that these are prototype code and I
> am pretty
> > much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost
> everything.
> > But thought of sharing, in case if it is useful somehow!.
> 
> Thank you again for sharing the POC. I looked at the kernel and QEMU
> branches.
> 
> Here are some preliminary comments:
> - "arm-smmu-v3: Reset S2TTB while switching back from nested stage":  as
> you mentionned S2TTB reset now is featured in v11

Yes.

> - "arm-smmu-v3: Add support for multiple pasid in nested mode": I could
> easily integrate this into my series. Update the iommu api first and
> pass multiple CD info in a separate patch

Ok.
> - "arm-smmu-v3: Add support to Invalidate CD": CD invalidation should be
> cascaded to host through the PASID cache invalidation uapi (no pb you
> warned us for the POC you simply used VFIO_IOMMU_SET_PASID_TABLE). I
> think I should add this support in my original series although it does
> not seem to trigger any issue up to now.

Agree. Cache invalidation uapi is a better interface for this. Also I don’t think
this matters for non-vsva cases as Guest kernel table/CD(pasid 0) will never
get invalidated. 

> - "arm-smmu-v3: Remove duplication of fault propagation". I understand
> the transcode is done somewhere else with SVA but we still need to do it
> if a single CD is used, right? I will review the SVA code to better
> understand.

Hmm..not sure. Need to take another look to see whether we need a special
handling for single CD or not.

> - for the STALL response injection I would tend to use a new VFIO region
> for responses. At the moment there is a single VFIO region for reporting
> the fault.

Sure. That will be much cleaner and probably improve the context switch
latency. Another thing I noted with STALL is that pasid_valid flag needs to be
taken care in the SVA kernel path. 

"iommu: Remove pasid validity check for STALL model page response msg"
Not sure this one is a proper way to handle this.
 
> On QEMU side:
> - I am currently working on 3.2 range invalidation support which is
> needed for DPDK/VFIO
> - While at it I will look at how to incrementally introduce some of the
> features you need in this series.

Ok. 

Thanks for taking a look at the POC.

Cheers,
Shameer
Eric Auger Nov. 17, 2020, 8:39 a.m. UTC | #8
Hi Shameer,

On 5/13/20 5:57 PM, Shameerali Kolothum Thodi wrote:
> Hi Eric,
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:eric.auger@redhat.com]
>> Sent: 13 May 2020 14:29
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
>> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
>> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
>> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
>> Cc: jean-philippe@linaro.org; alex.williamson@redhat.com;
>> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
>> tn@semihalf.com; bbhushan2@marvell.com
>> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
>>
> [...]
> 
>>>>> Yes that's normal this series is not meant to support vSVM at this stage.
>>>>>
>>>>> I intend to add the missing pieces during the next weeks.
>>>>
>>>> Thanks for that. I have made an attempt to add the vSVA based on
>>>> your v10 + JPBs sva patches. The host kernel and Qemu changes can
>>>> be found here[1][2].
>>>>
>>>> This basically adds multiple pasid support on top of your changes.
>>>> I have done some basic sanity testing and we have some initial success
>>>> with the zip vf dev on our D06 platform. Please note that the STALL event is
>>>> not yet supported though, but works fine if we mlock() guest usr mem.
>>>
>>> I have added STALL support for our vSVA prototype and it seems to be
>>> working(on our hardware). I have updated the kernel and qemu branches
>> with
>>> the same[1][2]. I should warn you though that these are prototype code and I
>> am pretty
>>> much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost
>> everything.
>>> But thought of sharing, in case if it is useful somehow!.
>>
>> Thank you again for sharing the POC. I looked at the kernel and QEMU
>> branches.
>>
>> Here are some preliminary comments:
>> - "arm-smmu-v3: Reset S2TTB while switching back from nested stage":  as
>> you mentionned S2TTB reset now is featured in v11
> 
> Yes.
> 
>> - "arm-smmu-v3: Add support for multiple pasid in nested mode": I could
>> easily integrate this into my series. Update the iommu api first and
>> pass multiple CD info in a separate patch
> 
> Ok.
in v12, I added
[PATCH v12 14/15] iommu/smmuv3: Accept configs with more than one
context descriptor

I don't think you need to add s1cdmax addition as we already have
pasid_bits which should do the job.

>> - "arm-smmu-v3: Add support to Invalidate CD": CD invalidation should be
>> cascaded to host through the PASID cache invalidation uapi (no pb you
>> warned us for the POC you simply used VFIO_IOMMU_SET_PASID_TABLE). I
>> think I should add this support in my original series although it does
>> not seem to trigger any issue up to now.
> 
> Agree. Cache invalidation uapi is a better interface for this. Also I don’t think
> this matters for non-vsva cases as Guest kernel table/CD(pasid 0) will never
> get invalidated. 
in v12 I added [PATCH v12 15/15] iommu/smmuv3: Add PASID cache
invalidation per PASID. I have not tested it though.
> 
>> - "arm-smmu-v3: Remove duplication of fault propagation". I understand
>> the transcode is done somewhere else with SVA but we still need to do it
>> if a single CD is used, right? I will review the SVA code to better
>> understand.

Since I have rebase on 5.10-rc4 you will still have this duplication to
handle.
> 
> Hmm..not sure. Need to take another look to see whether we need a special
> handling for single CD or not.
> 
>> - for the STALL response injection I would tend to use a new VFIO region
>> for responses. At the moment there is a single VFIO region for reporting
>> the fault.

in v12 I added a new VFIO region to inject your fault. This was tested
with dummy event injection, this should work properly.

If we clearly identify all the public dependencies needed for vSVA/ARM I
can help you respinning on top of them

Thanks

Eric
> 
> Sure. That will be much cleaner and probably improve the context switch
> latency. Another thing I noted with STALL is that pasid_valid flag needs to be
> taken care in the SVA kernel path. 
> 
> "iommu: Remove pasid validity check for STALL model page response msg"
> Not sure this one is a proper way to handle this.
>  
>> On QEMU side:
>> - I am currently working on 3.2 range invalidation support which is
>> needed for DPDK/VFIO
>> - While at it I will look at how to incrementally introduce some of the
>> features you need in this series.
> 
> Ok. 
> 
> Thanks for taking a look at the POC.
> 
> Cheers,
> Shameer
>
Shameerali Kolothum Thodi Nov. 17, 2020, 9:16 a.m. UTC | #9
Hi Eric,

First, many thanks for the respin. I will go through all of these(iommu/vfio/Qemu)
and will do a thorough verification/tests on our hardware. 

> -----Original Message-----
> From: Auger Eric [mailto:eric.auger@redhat.com]
> Sent: 17 November 2020 08:40
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
> Cc: jean-philippe@linaro.org; alex.williamson@redhat.com;
> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com; peter.maydell@linaro.org;
> tn@semihalf.com; bbhushan2@marvell.com
> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Shameer,
> 
> On 5/13/20 5:57 PM, Shameerali Kolothum Thodi wrote:
> > Hi Eric,
> >
> >> -----Original Message-----
> >> From: Auger Eric [mailto:eric.auger@redhat.com]
> >> Sent: 13 May 2020 14:29
> >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> >> Zhangfei Gao <zhangfei.gao@linaro.org>; eric.auger.pro@gmail.com;
> >> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org;
> >> kvm@vger.kernel.org; kvmarm@lists.cs.columbia.edu; will@kernel.org;
> >> joro@8bytes.org; maz@kernel.org; robin.murphy@arm.com
> >> Cc: jean-philippe@linaro.org; alex.williamson@redhat.com;
> >> jacob.jun.pan@linux.intel.com; yi.l.liu@intel.com;
> peter.maydell@linaro.org;
> >> tn@semihalf.com; bbhushan2@marvell.com
> >> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> >>
> > [...]
> >
> >>>>> Yes that's normal this series is not meant to support vSVM at this stage.
> >>>>>
> >>>>> I intend to add the missing pieces during the next weeks.
> >>>>
> >>>> Thanks for that. I have made an attempt to add the vSVA based on
> >>>> your v10 + JPBs sva patches. The host kernel and Qemu changes can
> >>>> be found here[1][2].
> >>>>
> >>>> This basically adds multiple pasid support on top of your changes.
> >>>> I have done some basic sanity testing and we have some initial success
> >>>> with the zip vf dev on our D06 platform. Please note that the STALL event
> is
> >>>> not yet supported though, but works fine if we mlock() guest usr mem.
> >>>
> >>> I have added STALL support for our vSVA prototype and it seems to be
> >>> working(on our hardware). I have updated the kernel and qemu branches
> >> with
> >>> the same[1][2]. I should warn you though that these are prototype code
> and I
> >> am pretty
> >>> much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost
> >> everything.
> >>> But thought of sharing, in case if it is useful somehow!.
> >>
> >> Thank you again for sharing the POC. I looked at the kernel and QEMU
> >> branches.
> >>
> >> Here are some preliminary comments:
> >> - "arm-smmu-v3: Reset S2TTB while switching back from nested stage":
> as
> >> you mentionned S2TTB reset now is featured in v11
> >
> > Yes.
> >
> >> - "arm-smmu-v3: Add support for multiple pasid in nested mode": I could
> >> easily integrate this into my series. Update the iommu api first and
> >> pass multiple CD info in a separate patch
> >
> > Ok.
> in v12, I added
> [PATCH v12 14/15] iommu/smmuv3: Accept configs with more than one
> context descriptor
> 
> I don't think you need to add s1cdmax addition as we already have
> pasid_bits which should do the job.

Ok.
 
> >> - "arm-smmu-v3: Add support to Invalidate CD": CD invalidation should be
> >> cascaded to host through the PASID cache invalidation uapi (no pb you
> >> warned us for the POC you simply used VFIO_IOMMU_SET_PASID_TABLE). I
> >> think I should add this support in my original series although it does
> >> not seem to trigger any issue up to now.
> >
> > Agree. Cache invalidation uapi is a better interface for this. Also I don’t think
> > this matters for non-vsva cases as Guest kernel table/CD(pasid 0) will never
> > get invalidated.
> in v12 I added [PATCH v12 15/15] iommu/smmuv3: Add PASID cache
> invalidation per PASID. I have not tested it though.

Ok. Will verify this.

> >> - "arm-smmu-v3: Remove duplication of fault propagation". I understand
> >> the transcode is done somewhere else with SVA but we still need to do it
> >> if a single CD is used, right? I will review the SVA code to better
> >> understand.
> 
> Since I have rebase on 5.10-rc4 you will still have this duplication to
> handle.

OK.

> > Hmm..not sure. Need to take another look to see whether we need a special
> > handling for single CD or not.
> >
> >> - for the STALL response injection I would tend to use a new VFIO region
> >> for responses. At the moment there is a single VFIO region for reporting
> >> the fault.
> 
> in v12 I added a new VFIO region to inject your fault. This was tested
> with dummy event injection, this should work properly.

That’s cool. I will check this.
 
> If we clearly identify all the public dependencies needed for vSVA/ARM I
> can help you respinning on top of them

Sure. I will rebase the vSVA related changes on top of your series and then
we can take it from there. Thanks for your support.

Thanks,
Shameer

> Thanks
> 
> Eric
> >
> > Sure. That will be much cleaner and probably improve the context switch
> > latency. Another thing I noted with STALL is that pasid_valid flag needs to be
> > taken care in the SVA kernel path.
> >
> > "iommu: Remove pasid validity check for STALL model page response msg"
> > Not sure this one is a proper way to handle this.
> >
> >> On QEMU side:
> >> - I am currently working on 3.2 range invalidation support which is
> >> needed for DPDK/VFIO
> >> - While at it I will look at how to incrementally introduce some of the
> >> features you need in this series.
> >
> > Ok.
> >
> > Thanks for taking a look at the POC.
> >
> > Cheers,
> > Shameer
> >