Message ID | 1493114272-30093-1-git-send-email-sunil.kovvuri@gmail.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, Apr 25, 2017 at 3:27 PM, <sunil.kovvuri@gmail.com> wrote:
> From: Sunil Goutham <sgoutham@cavium.com>
>
> For software initiated address translation, when domain type is
> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
> i.e return the same IOVA as translated address.
>
> This patch is an extension to Will Deacon's patchset
> "Implement SMMU passthrough using the default domain".
>
> Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
> ---
>
> V2
> - As per Will's suggestion applied fix to SMMUv3 driver as well.
>
>  drivers/iommu/arm-smmu-v3.c | 3 +++
>  drivers/iommu/arm-smmu.c    | 3 +++
>  2 files changed, 6 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 05b4592..d412bdd 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -1714,6 +1714,9 @@ arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
>  	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>  	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>
> +	if (domain->type == IOMMU_DOMAIN_IDENTITY)
> +		return iova;
> +
>  	if (!ops)
>  		return 0;
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index bfab4f7..81088cd 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1459,6 +1459,9 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
>  	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>  	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
>
> +	if (domain->type == IOMMU_DOMAIN_IDENTITY)
> +		return iova;
> +
>  	if (!ops)
>  		return 0;
>
> --
> 2.7.4
>

Will, if you are okay with the patch, can you please ACK.

Thanks,
Sunil.
Hi Sunil,

On Tue, Apr 25, 2017 at 03:27:52PM +0530, sunil.kovvuri@gmail.com wrote:
> From: Sunil Goutham <sgoutham@cavium.com>
>
> For software initiated address translation, when domain type is
> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
> i.e return the same IOVA as translated address.
>
> This patch is an extension to Will Deacon's patchset
> "Implement SMMU passthrough using the default domain".
>
> Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
> ---
>
> V2
> - As per Will's suggestion applied fix to SMMUv3 driver as well.

This follows what the AMD driver does, so:

Acked-by: Will Deacon <will.deacon@arm.com>

but I still think that having drivers/net/ethernet/cavium/thunder/nicvf_queues.c
poke around with the physical address to get at the struct pages underlying
a DMA buffer is really dodgy. Is there no way this can be avoided, perhaps
by tracking the pages some other way (although I don't understand why you're
having to mess with the page reference counts to start with)?

At least, I think you should be checking the domain type in
nicvf_iova_to_phys, which clearly expects a DMA domain if one exists at all.

Joerg: sorry, this is another one for you to pick up if possible.

Cheers,

Will
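Will's suggestion to check the domain type in nicvf_iova_to_phys can be sketched in user space. This is only an illustration of one possible reading: the `mock_*` types, the offset-based "translation" and the `MOCK_PHYS_INVALID` sentinel are assumptions made so that the example compiles on its own, and none of them are the kernel's or the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's domain types (not the real enum). */
enum mock_domain_type {
	MOCK_DOMAIN_IDENTITY,	/* IOMMU bypassed: IOVA == physical address */
	MOCK_DOMAIN_DMA,	/* translation managed by the DMA API */
	MOCK_DOMAIN_UNMANAGED	/* anything the caller does not expect */
};

struct mock_domain {
	enum mock_domain_type type;
	uint64_t offset;	/* toy translation: phys = iova + offset */
};

/* Sentinel for "cannot translate"; an assumption of this sketch. */
#define MOCK_PHYS_INVALID UINT64_MAX

/* Toy replacement for a page-table walk in a DMA domain. */
static uint64_t mock_walk(const struct mock_domain *d, uint64_t iova)
{
	return iova + d->offset;
}

/* Translate a DMA address, refusing domain types the caller did not expect. */
uint64_t mock_nic_iova_to_phys(const struct mock_domain *d, uint64_t dma_addr)
{
	if (d == NULL)
		return dma_addr;	/* no IOMMU: assume bus address == physical */

	switch (d->type) {
	case MOCK_DOMAIN_IDENTITY:
		return dma_addr;	/* bypass: nothing to translate */
	case MOCK_DOMAIN_DMA:
		return mock_walk(d, dma_addr);
	default:
		return MOCK_PHYS_INVALID;
	}
}
```

With the check on the caller's side, an unexpected domain type yields an explicit error value instead of an address that merely looks plausible.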
On Wed, Apr 26, 2017 at 11:01:50AM +0100, Will Deacon wrote:
> Joerg: sorry, this is another one for you to pick up if possible.
Applied.
On Wed, Apr 26, 2017 at 3:31 PM, Will Deacon <will.deacon@arm.com> wrote:
> Hi Sunil,
>
> On Tue, Apr 25, 2017 at 03:27:52PM +0530, sunil.kovvuri@gmail.com wrote:
>> From: Sunil Goutham <sgoutham@cavium.com>
>>
>> For software initiated address translation, when domain type is
>> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
>> i.e return the same IOVA as translated address.
>>
>> This patch is an extension to Will Deacon's patchset
>> "Implement SMMU passthrough using the default domain".
>>
>> Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
>> ---
>>
>> V2
>> - As per Will's suggestion applied fix to SMMUv3 driver as well.
>
> This follows what the AMD driver does, so:
>
> Acked-by: Will Deacon <will.deacon@arm.com>

Thanks,

>
> but I still think that having drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> poke around with the physical address to get at the struct pages underlying
> a DMA buffer is really dodgy.

To be precise, the driver is not dealing with page structures. Just like
any other NIC driver, it needs to know the virtual address of the buffer
the packet was DMA'ed to, so that an SKB can be framed and handed over to
the network stack. For the reasons mentioned below, it's not possible in
this driver to maintain a list of DMA address to virtual address mappings.
Hence the IOMMU API is used to translate the DMA address to a physical
address and finally to a virtual address. I don't see anything dodgy here.

> Is there no way this can be avoided, perhaps by tracking the pages some other way

I have explained that in the commit message
--
Also VNIC doesn't have a separate receive buffer ring per receive
queue, so there is no 1:1 descriptor index matching between CQE_RX
and the index in buffer ring from where a buffer has been used for
DMA'ing. Unlike other NICs, here it's not possible to maintain dma
address to virt address mappings within the driver. This leaves us
no other choice but to use IOMMU's IOVA address conversion API to
get buffer's virtual address which can be given to network stack
for processing.
--

>(although I don't understand why you're having to mess with the page reference
>counts to start with)?

Not sure why you say it's a mess; adjusting page reference counts is quite
common if you check other NIC drivers. On ARM64, especially when using
64KB pages, if we have only one packet buffer per page then we have to
set aside a whole lot of memory, which is sometimes not possible on
embedded platforms. Hence multiple packet buffers per page, with the page
reference count set accordingly.

>
> At least, I think you should be checking the domain type in
> nicvf_iova_to_phys, which clearly expects a DMA domain if one exists at all.

Probably, but I don't think the network maintainers would be okay with it,
since such details should be hidden from a network driver's point of view.
Conversely, one can argue that a NIC driver shouldn't even have to check
whether a domain is set or not.

Thanks,
Sunil.

>
> Joerg: sorry, this is another one for you to pick up if possible.
>
> Cheers,
>
> Will
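The point about several packet buffers sharing one 64KB page, with the page reference count set to match, can be illustrated with a small user-space model. This is a sketch under stated assumptions: `struct mock_page` and the `mock_*` helpers are hypothetical, the 2048-byte buffer size in the usage note is chosen only for the arithmetic, and the kernel's real page reference counting works differently in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a page shared by several receive buffers. */
struct mock_page {
	size_t size;		/* page size in bytes, e.g. 65536 */
	unsigned int refcount;	/* one reference per outstanding buffer */
};

/* Number of whole buffers of buf_size bytes that fit in one page. */
size_t mock_buffers_per_page(size_t page_size, size_t buf_size)
{
	return buf_size ? page_size / buf_size : 0;
}

/* Carve a page into buffers and take one reference per buffer. */
size_t mock_carve_page(struct mock_page *page, size_t buf_size)
{
	size_t n = mock_buffers_per_page(page->size, buf_size);

	page->refcount += (unsigned int)n;
	return n;
}

/* Drop one buffer's reference; returns 1 when the page can be freed. */
int mock_put_buffer(struct mock_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
	return page->refcount == 0;
}
```

With a 64KB page and 2048-byte buffers, `mock_carve_page()` yields 32 buffers, and the page only becomes freeable once the 32nd buffer has been released. One buffer per page would instead need 32 times as much memory for the same number of buffers.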
On Wed, Apr 26, 2017 at 04:13:29PM +0530, Sunil Kovvuri wrote:
> On Wed, Apr 26, 2017 at 3:31 PM, Will Deacon <will.deacon@arm.com> wrote:
> > Hi Sunil,
> >
> > On Tue, Apr 25, 2017 at 03:27:52PM +0530, sunil.kovvuri@gmail.com wrote:
> >> From: Sunil Goutham <sgoutham@cavium.com>
> >>
> >> For software initiated address translation, when domain type is
> >> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
> >> i.e return the same IOVA as translated address.
> >>
> >> This patch is an extension to Will Deacon's patchset
> >> "Implement SMMU passthrough using the default domain".
> >>
> >> Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
> >> ---
> >>
> >> V2
> >> - As per Will's suggestion applied fix to SMMUv3 driver as well.
> >
> > This follows what the AMD driver does, so:
> >
> > Acked-by: Will Deacon <will.deacon@arm.com>
>
> Thanks,
>
> >
> > but I still think that having drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> > poke around with the physical address to get at the struct pages underlying
> > a DMA buffer is really dodgy.
>
> To be precise, the driver is not dealing with page structures. Just like
> any other NIC driver, it needs to know the virtual address of the buffer
> the packet was DMA'ed to, so that an SKB can be framed and handed over to
> the network stack. For the reasons mentioned below, it's not possible in
> this driver to maintain a list of DMA address to virtual address mappings.
> Hence the IOMMU API is used to translate the DMA address to a physical
> address and finally to a virtual address. I don't see anything dodgy here.

It's dodgy because you're the only NIC driver using iommu_iova_to_phys
directly and, afaict, the driver could just stash either the struct page
or the virtual address at the point of allocation.

> > Is there no way this can be avoided, perhaps by tracking the pages some other way
>
> I have explained that in the commit message
> --
> Also VNIC doesn't have a separate receive buffer ring per receive
> queue, so there is no 1:1 descriptor index matching between CQE_RX
> and the index in buffer ring from where a buffer has been used for
> DMA'ing. Unlike other NICs, here it's not possible to maintain dma
> address to virt address mappings within the driver. This leaves us
> no other choice but to use IOMMU's IOVA address conversion API to
> get buffer's virtual address which can be given to network stack
> for processing.
> --
>
> >(although I don't understand why you're having to mess with the page reference
> >counts to start with)?
> Not sure why you say it's a mess; adjusting page reference counts is quite
> common if you check other NIC drivers. On ARM64, especially when using
> 64KB pages, if we have only one packet buffer per page then we have to
> set aside a whole lot of memory, which is sometimes not possible on
> embedded platforms. Hence multiple packet buffers per page, with the page
> reference count set accordingly.

I wasn't saying that was a mess, I was just saying that I didn't understand
why you mess (verb) with the page reference counts (my ignorance of the
network layer). The code that I think is a mess is:

	phys_addr = nicvf_iova_to_phys(nic, buf_addr);
	[...]
	put_page(virt_to_page(phys_to_virt(phys_addr)));

because:

 (a) You have the information you need at allocation time, but you've
     failed to record that and are trying to use the IOMMU API to
     reconstruct the CPU virtual address

 (b) When there isn't an IOMMU present, you assume that bus addresses ==
     physical addresses

 (c) You assume that the DMA buffer is mapped in the linear mapping

That's probably all true for ThunderX/arm64, but it's generally not portable
or reliable code. If you could get a handle to the struct page that you
allocated in the first place, then you could use page_address to get its
virtual address instead of having to go via the physical address.

Will
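Point (a) above, recording what is needed at allocation time, can be sketched for the case where the hardware reports only a DMA address and no ring index. The example below is a hypothetical user-space illustration, not code from the driver: the `mock_*` names, the fixed 64-slot table and the hash constant are all assumptions, it relies on each live buffer having a unique DMA address, and it says nothing about whether such a lookup would be cheap enough on a real receive path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_SLOTS 64u	/* must stay 64 to match the 6-bit hash below */

enum mock_slot_state { MOCK_EMPTY = 0, MOCK_USED, MOCK_DEAD };

struct mock_slot {
	uint64_t dma_addr;		/* key: address later reported by HW */
	void *vaddr;			/* value: CPU virtual address of the buffer */
	enum mock_slot_state state;
};

/* A zero-initialised table is empty. */
struct mock_rbuf_table {
	struct mock_slot slots[MOCK_SLOTS];
};

/* Multiplicative hash; keeps the top 6 bits, so the result is 0..63. */
static unsigned int mock_hash(uint64_t dma_addr)
{
	return (unsigned int)((dma_addr * 0x9E3779B97F4A7C15ull) >> 58);
}

/* Record a mapping at allocation time. Returns 0, or -1 if the table is full. */
int mock_rbuf_record(struct mock_rbuf_table *t, uint64_t dma_addr, void *vaddr)
{
	unsigned int start = mock_hash(dma_addr);

	for (unsigned int i = 0; i < MOCK_SLOTS; i++) {
		struct mock_slot *s = &t->slots[(start + i) % MOCK_SLOTS];

		if (s->state != MOCK_USED) {	/* empty slot or tombstone */
			s->dma_addr = dma_addr;
			s->vaddr = vaddr;
			s->state = MOCK_USED;
			return 0;
		}
	}
	return -1;
}

/* Look up and remove a mapping at completion time. Returns NULL if unknown. */
void *mock_rbuf_take(struct mock_rbuf_table *t, uint64_t dma_addr)
{
	unsigned int start = mock_hash(dma_addr);

	for (unsigned int i = 0; i < MOCK_SLOTS; i++) {
		struct mock_slot *s = &t->slots[(start + i) % MOCK_SLOTS];

		if (s->state == MOCK_EMPTY)
			return NULL;		/* end of the probe chain */
		if (s->state == MOCK_USED && s->dma_addr == dma_addr) {
			s->state = MOCK_DEAD;	/* tombstone keeps later chains intact */
			return s->vaddr;
		}
	}
	return NULL;
}
```

Keying the table by DMA address rather than by ring index is the design choice being illustrated: the mapping is stored when the buffer is allocated, so no IOMMU query is needed to get back to the virtual address.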
On Wed, Apr 26, 2017 at 5:06 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Apr 26, 2017 at 04:13:29PM +0530, Sunil Kovvuri wrote:
>> On Wed, Apr 26, 2017 at 3:31 PM, Will Deacon <will.deacon@arm.com> wrote:
>> > Hi Sunil,
>> >
>> > On Tue, Apr 25, 2017 at 03:27:52PM +0530, sunil.kovvuri@gmail.com wrote:
>> >> From: Sunil Goutham <sgoutham@cavium.com>
>> >>
>> >> For software initiated address translation, when domain type is
>> >> IOMMU_DOMAIN_IDENTITY i.e SMMU is bypassed, mimic HW behavior
>> >> i.e return the same IOVA as translated address.
>> >>
>> >> This patch is an extension to Will Deacon's patchset
>> >> "Implement SMMU passthrough using the default domain".
>> >>
>> >> Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
>> >> ---
>> >>
>> >> V2
>> >> - As per Will's suggestion applied fix to SMMUv3 driver as well.
>> >
>> > This follows what the AMD driver does, so:
>> >
>> > Acked-by: Will Deacon <will.deacon@arm.com>
>>
>> Thanks,
>>
>> >
>> > but I still think that having drivers/net/ethernet/cavium/thunder/nicvf_queues.c
>> > poke around with the physical address to get at the struct pages underlying
>> > a DMA buffer is really dodgy.
>>
>> To be precise, the driver is not dealing with page structures. Just like
>> any other NIC driver, it needs to know the virtual address of the buffer
>> the packet was DMA'ed to, so that an SKB can be framed and handed over to
>> the network stack. For the reasons mentioned below, it's not possible in
>> this driver to maintain a list of DMA address to virtual address mappings.
>> Hence the IOMMU API is used to translate the DMA address to a physical
>> address and finally to a virtual address. I don't see anything dodgy here.
>
> It's dodgy because you're the only NIC driver using iommu_iova_to_phys
> directly and, afaict, the driver could just stash either the struct page
> or the virtual address at the point of allocation.

Well, the driver has to be written based on how the HW functions, even if
that means using an API that no other driver has used before.

>
>> > Is there no way this can be avoided, perhaps by tracking the pages some other way
>>
>> I have explained that in the commit message
>> --
>> Also VNIC doesn't have a separate receive buffer ring per receive
>> queue, so there is no 1:1 descriptor index matching between CQE_RX
>> and the index in buffer ring from where a buffer has been used for
>> DMA'ing. Unlike other NICs, here it's not possible to maintain dma
>> address to virt address mappings within the driver. This leaves us
>> no other choice but to use IOMMU's IOVA address conversion API to
>> get buffer's virtual address which can be given to network stack
>> for processing.
>> --
>>
>> >(although I don't understand why you're having to mess with the page reference
>> >counts to start with)?
>> Not sure why you say it's a mess; adjusting page reference counts is quite
>> common if you check other NIC drivers. On ARM64, especially when using
>> 64KB pages, if we have only one packet buffer per page then we have to
>> set aside a whole lot of memory, which is sometimes not possible on
>> embedded platforms. Hence multiple packet buffers per page, with the page
>> reference count set accordingly.
>
> I wasn't saying that was a mess, I was just saying that I didn't understand
> why you mess (verb) with the page reference counts (my ignorance of the
> network layer). The code that I think is a mess is:
>
>	phys_addr = nicvf_iova_to_phys(nic, buf_addr);
>	[...]
>	put_page(virt_to_page(phys_to_virt(phys_addr)));

Even if it were possible to record the info in this driver, the page
reference count still needs to be released to free the page, otherwise
the page is gone.

>
> because:
>
>  (a) You have the information you need at allocation time, but you've
>      failed to record that and are trying to use the IOMMU API to
>      reconstruct the CPU virtual address

That's exactly what I have explained in the commit message, i.e. why I
cannot record info at the time of allocation. Also, the HW gives the
address of the buffer (IOVA or physical) to which it has DMA'ed the packet,
not an index into the buffer ring. There is one single buffer ring for
8 receive queues, so there is no way to map the DMA address seen at a
receive queue back to info recorded in the buffer ring.

All you said is possible, and it is exactly what I would have done if the
HW gave me an index into the buffer ring instead of the DMA'ed address.
Then I wouldn't have been hit so hard by all the bottlenecks in the ARM
IOMMU infrastructure.

Thanks,
Sunil.

>
>  (b) When there isn't an IOMMU present, you assume that bus addresses ==
>      physical addresses
>
>  (c) You assume that the DMA buffer is mapped in the linear mapping
>
> That's probably all true for ThunderX/arm64, but it's generally not portable
> or reliable code. If you could get a handle to the struct page that you
> allocated in the first place, then you could use page_address to get its
> virtual address instead of having to go via the physical address.
>
> Will
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 05b4592..d412bdd 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1714,6 +1714,9 @@ arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
 
+	if (domain->type == IOMMU_DOMAIN_IDENTITY)
+		return iova;
+
 	if (!ops)
 		return 0;
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index bfab4f7..81088cd 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1459,6 +1459,9 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
 
+	if (domain->type == IOMMU_DOMAIN_IDENTITY)
+		return iova;
+
 	if (!ops)
 		return 0;
 
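The effect of the two hunks can be modelled in user space. The sketch below is not the kernel code: the `mock_*` types and the toy walker are assumptions that stand in for `struct iommu_domain`, `struct io_pgtable_ops` and a real page-table walk, and the identity domain is given a NULL ops pointer on the presumption that no page tables are set up for a bypassed domain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures touched by the patch. */
enum mock_type { MOCK_IDENTITY, MOCK_DMA };

struct mock_pgtbl_ops {
	uint64_t (*iova_to_phys)(struct mock_pgtbl_ops *ops, uint64_t iova);
};

struct mock_smmu_domain {
	enum mock_type type;
	struct mock_pgtbl_ops *ops;	/* NULL when no page tables exist */
};

/* Toy walker: pretends every IOVA maps to IOVA + 1 MiB. */
uint64_t mock_walk_plus_1m(struct mock_pgtbl_ops *ops, uint64_t iova)
{
	(void)ops;
	return iova + 0x100000;
}

/* Same order of checks as the patched functions. */
uint64_t mock_smmu_iova_to_phys(const struct mock_smmu_domain *domain,
				uint64_t iova)
{
	/* New check: a bypassed domain translates every IOVA to itself. */
	if (domain->type == MOCK_IDENTITY)
		return iova;

	/* Existing check: without page tables there is nothing to walk. */
	if (!domain->ops)
		return 0;

	return domain->ops->iova_to_phys(domain->ops, iova);
}
```

The order of the checks is what matters here: the identity test has to come before the `!ops` test, otherwise an identity domain without page tables would fall through to `return 0`.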