Message ID | 171026725393.8367.17497620074051138306.stgit@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Series | powerpc: pSeries: vfio: iommu: Re-enable support for SPAPR TCE VFIO |
On Tue, Mar 12, 2024 at 01:14:20PM -0500, Shivaprasad G Bhat wrote:
> The commit 090bad39b237a ("powerpc/powernv: Add indirect levels to
> it_userspace") which implemented the tce indirect levels
> support for PowerNV ended up removing the single level support
> which existed by default(generic tce_iommu_userspace_view_alloc/free()
> calls). On pSeries the TCEs are single level, and the allocation
> of userspace view is lost with the removal of generic code.

:( :(

If this has been broken since 2018 and nobody cared till now can we
please go in a direction of moving this code to the new iommu APIs
instead of doubling down on more of this old stuff that apparently
almost nobody cares about ??

Jason

----- Original Message -----
> From: "Jason Gunthorpe" <jgg@ziepe.ca>
> To: "Shivaprasad G Bhat" <sbhat@linux.ibm.com>
> Cc: "Timothy Pearson" <tpearson@raptorengineering.com>, "Alex Williamson" <alex.williamson@redhat.com>, "linuxppc-dev"
> <linuxppc-dev@lists.ozlabs.org>, "Michael Ellerman" <mpe@ellerman.id.au>, "npiggin" <npiggin@gmail.com>, "christophe
> leroy" <christophe.leroy@csgroup.eu>, "aneesh kumar" <aneesh.kumar@kernel.org>, "naveen n rao"
> <naveen.n.rao@linux.ibm.com>, "gbatra" <gbatra@linux.vnet.ibm.com>, brking@linux.vnet.ibm.com, "Alexey Kardashevskiy"
> <aik@ozlabs.ru>, robh@kernel.org, "linux-kernel" <linux-kernel@vger.kernel.org>, "kvm" <kvm@vger.kernel.org>, "aik"
> <aik@amd.com>, msuchanek@suse.de, "jroedel" <jroedel@suse.de>, "vaibhav" <vaibhav@linux.ibm.com>, svaidy@linux.ibm.com
> Sent: Tuesday, March 19, 2024 9:32:02 AM
> Subject: Re: [RFC PATCH 1/3] powerpc/pseries/iommu: Bring back userspace view for single level TCE tables

> On Tue, Mar 12, 2024 at 01:14:20PM -0500, Shivaprasad G Bhat wrote:
>> The commit 090bad39b237a ("powerpc/powernv: Add indirect levels to
>> it_userspace") which implemented the tce indirect levels
>> support for PowerNV ended up removing the single level support
>> which existed by default(generic tce_iommu_userspace_view_alloc/free()
>> calls). On pSeries the TCEs are single level, and the allocation
>> of userspace view is lost with the removal of generic code.
>
> :( :(
>
> If this has been broken since 2018 and nobody cared till now can we
> please go in a direction of moving this code to the new iommu APIs
> instead of doubling down on more of this old stuff that apparently
> almost nobody cares about ??
>
> Jason

Just FYI Raptor is working on porting things over to the new APIs.
RFC patches should be posted in the next week or two.

Hi Jason,

On 3/19/24 20:02, Jason Gunthorpe wrote:
> On Tue, Mar 12, 2024 at 01:14:20PM -0500, Shivaprasad G Bhat wrote:
>> The commit 090bad39b237a ("powerpc/powernv: Add indirect levels to
>> it_userspace") which implemented the tce indirect levels
>> support for PowerNV ended up removing the single level support
>> which existed by default(generic tce_iommu_userspace_view_alloc/free()
>> calls). On pSeries the TCEs are single level, and the allocation
>> of userspace view is lost with the removal of generic code.
> :( :(
>
> If this has been broken since 2018 and nobody cared till now can we
> please go in a direction of moving this code to the new iommu APIs
> instead of doubling down on more of this old stuff that apparently
> almost nobody cares about ??

We have existing software stack deployments using VFIO userspace
device assignment running on Power platform. We have to enable
similar software stack on newer generation Power10 platform and
also in a pSeries lpar environment. These distros rely on VFIO
enabled in kernel and currently have IOMMUFD disabled.

This patch series is a simpler low risk enablement that functionally
get the software stack working while we continue to enable and move
to IOMMUFD in phases. We have to fix the older APIs in order to stage
the functional enablement in small increments.

We are working on iommufd support for pSeries and looking forward
to Timothy's patches.

-Thanks
Shivaprasad

> Jason

Jason Gunthorpe <jgg@ziepe.ca> writes:
> On Tue, Mar 12, 2024 at 01:14:20PM -0500, Shivaprasad G Bhat wrote:
>> The commit 090bad39b237a ("powerpc/powernv: Add indirect levels to
>> it_userspace") which implemented the tce indirect levels
>> support for PowerNV ended up removing the single level support
>> which existed by default(generic tce_iommu_userspace_view_alloc/free()
>> calls). On pSeries the TCEs are single level, and the allocation
>> of userspace view is lost with the removal of generic code.
>
> :( :(
>
> If this has been broken since 2018 and nobody cared till now can we
> please go in a direction of moving this code to the new iommu APIs
> instead of doubling down on more of this old stuff that apparently
> almost nobody cares about ??

It's broken *on pseries* (Linux as a guest), but it works fine on
powernv (aka bare metal, aka Linux as Hypervisor).

What's changed is folks are now testing it on pseries with Linux as a
nested hypervisor.

cheers

On Tue, Mar 19, 2024 at 01:36:51PM -0500, Timothy Pearson wrote:
> > On Tue, Mar 12, 2024 at 01:14:20PM -0500, Shivaprasad G Bhat wrote:
> >> The commit 090bad39b237a ("powerpc/powernv: Add indirect levels to
> >> it_userspace") which implemented the tce indirect levels
> >> support for PowerNV ended up removing the single level support
> >> which existed by default(generic tce_iommu_userspace_view_alloc/free()
> >> calls). On pSeries the TCEs are single level, and the allocation
> >> of userspace view is lost with the removal of generic code.
> >
> > :( :(
> >
> > If this has been broken since 2018 and nobody cared till now can we
> > please go in a direction of moving this code to the new iommu APIs
> > instead of doubling down on more of this old stuff that apparently
> > almost nobody cares about ??
>
> Just FYI Raptor is working on porting things over to the new APIs.
> RFC patches should be posted in the next week or two.

There was a discussion about this at LPC a few weeks ago, did any
patches get prepared?

Jason

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index e8c4129697b1..40de8d55faef 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -143,7 +143,7 @@ static int tce_build_pSeries(struct iommu_table *tbl, long index,
 }
 
 
-static void tce_free_pSeries(struct iommu_table *tbl, long index, long npages)
+static void tce_clear_pSeries(struct iommu_table *tbl, long index, long npages)
 {
 	__be64 *tcep;
 
@@ -162,6 +162,11 @@ static unsigned long tce_get_pseries(struct iommu_table *tbl, long index)
 	return be64_to_cpu(*tcep);
 }
 
+static void tce_free_pSeries(struct iommu_table *tbl)
+{
+	/* Do nothing. */
+}
+
 static void tce_free_pSeriesLP(unsigned long liobn, long, long, long);
 static void tce_freemulti_pSeriesLP(struct iommu_table*, long, long);
 
@@ -576,7 +581,7 @@ struct iommu_table_ops iommu_table_lpar_multi_ops;
 
 struct iommu_table_ops iommu_table_pseries_ops = {
 	.set = tce_build_pSeries,
-	.clear = tce_free_pSeries,
+	.clear = tce_clear_pSeries,
 	.get = tce_get_pseries
 };
 
@@ -685,15 +690,23 @@ static int tce_exchange_pseries(struct iommu_table *tbl, long index, unsigned
 
 	return rc;
 }
+
+static __be64 *tce_useraddr_pSeriesLP(struct iommu_table *tbl, long index,
+				      bool __always_unused alloc)
+{
+	return tbl->it_userspace ? &tbl->it_userspace[index - tbl->it_offset] : NULL;
+}
 #endif
 
 struct iommu_table_ops iommu_table_lpar_multi_ops = {
 	.set = tce_buildmulti_pSeriesLP,
 #ifdef CONFIG_IOMMU_API
 	.xchg_no_kill = tce_exchange_pseries,
+	.useraddrptr = tce_useraddr_pSeriesLP,
 #endif
 	.clear = tce_freemulti_pSeriesLP,
-	.get = tce_get_pSeriesLP
+	.get = tce_get_pSeriesLP,
+	.free = tce_free_pSeries
 };
 
 /*
diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index a94ec6225d31..1cf36d687559 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -177,6 +177,50 @@ static long tce_iommu_register_pages(struct tce_container *container,
 	return ret;
 }
 
+static long tce_iommu_userspace_view_alloc(struct iommu_table *tbl,
+		struct mm_struct *mm)
+{
+	unsigned long cb = ALIGN(sizeof(tbl->it_userspace[0]) *
+			tbl->it_size, PAGE_SIZE);
+	unsigned long *uas;
+	long ret;
+
+	if (tbl->it_indirect_levels)
+		return 0;
+
+	WARN_ON(tbl->it_userspace);
+
+	ret = account_locked_vm(mm, cb >> PAGE_SHIFT, true);
+	if (ret)
+		return ret;
+
+	uas = vzalloc(cb);
+	if (!uas) {
+		account_locked_vm(mm, cb >> PAGE_SHIFT, false);
+		return -ENOMEM;
+	}
+	tbl->it_userspace = (__be64 *) uas;
+
+	return 0;
+}
+
+static void tce_iommu_userspace_view_free(struct iommu_table *tbl,
+		struct mm_struct *mm)
+{
+	unsigned long cb = ALIGN(sizeof(tbl->it_userspace[0]) *
+			tbl->it_size, PAGE_SIZE);
+
+	if (!tbl->it_userspace)
+		return;
+
+	if (tbl->it_indirect_levels)
+		return;
+
+	vfree(tbl->it_userspace);
+	tbl->it_userspace = NULL;
+	account_locked_vm(mm, cb >> PAGE_SHIFT, false);
+}
+
 static bool tce_page_is_contained(struct mm_struct *mm, unsigned long hpa,
 		unsigned int it_page_shift)
 {
@@ -554,6 +598,12 @@ static long tce_iommu_build_v2(struct tce_container *container,
 	unsigned long hpa;
 	enum dma_data_direction dirtmp;
 
+	if (!tbl->it_userspace) {
+		ret = tce_iommu_userspace_view_alloc(tbl, container->mm);
+		if (ret)
+			return ret;
+	}
+
 	for (i = 0; i < pages; ++i) {
 		struct mm_iommu_table_group_mem_t *mem = NULL;
 		__be64 *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl, entry + i);
@@ -637,6 +687,7 @@ static void tce_iommu_free_table(struct tce_container *container,
 {
 	unsigned long pages = tbl->it_allocated_size >> PAGE_SHIFT;
 
+	tce_iommu_userspace_view_free(tbl, container->mm);
 	iommu_tce_table_put(tbl);
 	account_locked_vm(container->mm, pages, false);
 }

Commit 090bad39b237a ("powerpc/powernv: Add indirect levels to
it_userspace"), which implemented TCE indirect levels support for
PowerNV, ended up removing the single level support that existed by
default (the generic tce_iommu_userspace_view_alloc/free() calls).
On pSeries the TCEs are single level, so the allocation of the
userspace view was lost when the generic code was removed.

This patch attempts to bring it back for pSeries on the refactored
code base.

On pSeries, the windows/tables are "borrowed", so it_ops->free() is
not called during the container detach or the TCE release call paths,
as the table is not really freed. So, decouple the userspace view
array free and alloc from the table's it_ops, just the way it was
before.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/iommu.c |   19 ++++++++++--
 drivers/vfio/vfio_iommu_spapr_tce.c    |   51 ++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+), 3 deletions(-)
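
For readers who want to follow the lifecycle described in the commit message outside the kernel, here is a minimal user-space sketch of a single-level userspace view: sized from the table, allocated lazily on first use, indexed by `index - it_offset`, and freed separately from the table itself. It is not the kernel code. Every `sketch_*` name, the 4 KiB page size, the use of `calloc()` in place of `vzalloc()`, and the simple page counter standing in for `account_locked_vm()` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed page size for the sketch; the real PAGE_SIZE depends on the kernel config. */
#define SKETCH_PAGE_SHIFT 12UL
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical stand-in for the few struct iommu_table fields the patch touches. */
struct sketch_table {
	unsigned long it_offset;          /* first TCE index covered by the window */
	unsigned long it_size;            /* number of TCE entries in the window */
	unsigned long it_indirect_levels; /* 0 for single-level tables */
	uint64_t *it_userspace;           /* userspace view, NULL until first use */
};

/* Hypothetical stand-in for per-mm locked-memory accounting. */
struct sketch_mm {
	unsigned long locked_pages;
	unsigned long limit_pages;
};

/* Bytes needed for one view entry per TCE, rounded up to whole pages. */
static unsigned long sketch_view_bytes(const struct sketch_table *tbl)
{
	return SKETCH_ALIGN(sizeof(tbl->it_userspace[0]) * tbl->it_size,
			    SKETCH_PAGE_SIZE);
}

/* Charge (inc != 0) or uncharge pages; fails if the limit would be exceeded. */
static int sketch_account(struct sketch_mm *mm, unsigned long pages, int inc)
{
	if (inc) {
		if (mm->locked_pages + pages > mm->limit_pages)
			return -1;
		mm->locked_pages += pages;
	} else {
		assert(mm->locked_pages >= pages);
		mm->locked_pages -= pages;
	}
	return 0;
}

/*
 * Allocate the view for a single-level table. The patch leaves the
 * "already allocated" check to the caller and only warns here; the
 * sketch folds that check in and returns success instead.
 */
static int sketch_view_alloc(struct sketch_table *tbl, struct sketch_mm *mm)
{
	unsigned long cb = sketch_view_bytes(tbl);

	if (tbl->it_indirect_levels || tbl->it_userspace)
		return 0;

	if (sketch_account(mm, cb >> SKETCH_PAGE_SHIFT, 1))
		return -1;

	tbl->it_userspace = calloc(1, cb);
	if (!tbl->it_userspace) {
		/* Undo the charge if the allocation itself fails. */
		sketch_account(mm, cb >> SKETCH_PAGE_SHIFT, 0);
		return -1;
	}
	return 0;
}

/* Free the view and uncharge it, independently of the table's own lifetime. */
static void sketch_view_free(struct sketch_table *tbl, struct sketch_mm *mm)
{
	if (!tbl->it_userspace || tbl->it_indirect_levels)
		return;

	free(tbl->it_userspace);
	tbl->it_userspace = NULL;
	sketch_account(mm, sketch_view_bytes(tbl) >> SKETCH_PAGE_SHIFT, 0);
}

/* Same lookup shape as the single-level useraddrptr callback in the patch. */
static uint64_t *sketch_useraddr(struct sketch_table *tbl, unsigned long index)
{
	return tbl->it_userspace ?
		&tbl->it_userspace[index - tbl->it_offset] : NULL;
}
```

A caller would run `sketch_view_alloc()` before the first mapping, use `sketch_useraddr()` for each entry, and call `sketch_view_free()` when the container releases the table. Keeping the free step outside any per-table callback mirrors the reasoning in the commit message: a borrowed table is never really freed, so a free hook on the table would not run.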