diff mbox series

[09/16] dma-direct: Support PCI P2PDMA pages in dma-direct map_sg

Message ID 20210408170123.8788-10-logang@deltatee.com (mailing list archive)
State Superseded
Headers show
Series Add new DMA mapping operation for P2PDMA | expand

Commit Message

Logan Gunthorpe April 8, 2021, 5:01 p.m. UTC
Add PCI P2PDMA support for dma_direct_map_sg() so that it can map
PCI P2PDMA pages directly without a hack in the callers. This allows
for heterogeneous SGLs that contain both P2PDMA and regular pages.

SGL segments that contain PCI bus addresses are marked with
sg_mark_pci_p2pdma() and are ignored when unmapped.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 kernel/dma/direct.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

Comments

Jason Gunthorpe April 27, 2021, 7:33 p.m. UTC | #1
On Thu, Apr 08, 2021 at 11:01:16AM -0600, Logan Gunthorpe wrote:
> Add PCI P2PDMA support for dma_direct_map_sg() so that it can map
> PCI P2PDMA pages directly without a hack in the callers. This allows
> for heterogeneous SGLs that contain both P2PDMA and regular pages.
> 
> SGL segments that contain PCI bus addresses are marked with
> sg_mark_pci_p2pdma() and are ignored when unmapped.
> 
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
>  kernel/dma/direct.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 002268262c9a..108dfb4ecbd5 100644
> +++ b/kernel/dma/direct.c
> @@ -13,6 +13,7 @@
>  #include <linux/vmalloc.h>
>  #include <linux/set_memory.h>
>  #include <linux/slab.h>
> +#include <linux/pci-p2pdma.h>
>  #include "direct.h"
>  
>  /*
> @@ -387,19 +388,37 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
>  	struct scatterlist *sg;
>  	int i;
>  
> -	for_each_sg(sgl, sg, nents, i)
> +	for_each_sg(sgl, sg, nents, i) {
> +		if (sg_is_pci_p2pdma(sg)) {
> +			sg_unmark_pci_p2pdma(sg);

This doesn't seem nice, the DMA layer should only alter the DMA
portion of the SG, not the other portions. Is it necessary?

Jason
Jason Gunthorpe April 27, 2021, 7:40 p.m. UTC | #2
On Tue, Apr 27, 2021 at 04:33:51PM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 08, 2021 at 11:01:16AM -0600, Logan Gunthorpe wrote:
> > Add PCI P2PDMA support for dma_direct_map_sg() so that it can map
> > PCI P2PDMA pages directly without a hack in the callers. This allows
> > for heterogeneous SGLs that contain both P2PDMA and regular pages.
> > 
> > SGL segments that contain PCI bus addresses are marked with
> > sg_mark_pci_p2pdma() and are ignored when unmapped.
> > 
> > Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> >  kernel/dma/direct.c | 25 ++++++++++++++++++++++---
> >  1 file changed, 22 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> > index 002268262c9a..108dfb4ecbd5 100644
> > +++ b/kernel/dma/direct.c
> > @@ -13,6 +13,7 @@
> >  #include <linux/vmalloc.h>
> >  #include <linux/set_memory.h>
> >  #include <linux/slab.h>
> > +#include <linux/pci-p2pdma.h>
> >  #include "direct.h"
> >  
> >  /*
> > @@ -387,19 +388,37 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
> >  	struct scatterlist *sg;
> >  	int i;
> >  
> > -	for_each_sg(sgl, sg, nents, i)
> > +	for_each_sg(sgl, sg, nents, i) {
> > +		if (sg_is_pci_p2pdma(sg)) {
> > +			sg_unmark_pci_p2pdma(sg);
> 
> This doesn't seem nice, the DMA layer should only alter the DMA
> portion of the SG, not the other portions. Is it necessary?

Oh, I got it completely wrong what this is for.

This should be named sg_dma_mark_pci_p2p() and similar for other
functions to make it clear it is part of the DMA side of the SG
interface (eg it is like sg_dma_address, sg_dma_len, etc)

Jason
Logan Gunthorpe April 27, 2021, 10:56 p.m. UTC | #3
On 2021-04-27 1:40 p.m., Jason Gunthorpe wrote:
> On Tue, Apr 27, 2021 at 04:33:51PM -0300, Jason Gunthorpe wrote:
>> On Thu, Apr 08, 2021 at 11:01:16AM -0600, Logan Gunthorpe wrote:
>>> Add PCI P2PDMA support for dma_direct_map_sg() so that it can map
>>> PCI P2PDMA pages directly without a hack in the callers. This allows
>>> for heterogeneous SGLs that contain both P2PDMA and regular pages.
>>>
>>> SGL segments that contain PCI bus addresses are marked with
>>> sg_mark_pci_p2pdma() and are ignored when unmapped.
>>>
>>> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
>>>  kernel/dma/direct.c | 25 ++++++++++++++++++++++---
>>>  1 file changed, 22 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
>>> index 002268262c9a..108dfb4ecbd5 100644
>>> +++ b/kernel/dma/direct.c
>>> @@ -13,6 +13,7 @@
>>>  #include <linux/vmalloc.h>
>>>  #include <linux/set_memory.h>
>>>  #include <linux/slab.h>
>>> +#include <linux/pci-p2pdma.h>
>>>  #include "direct.h"
>>>  
>>>  /*
>>> @@ -387,19 +388,37 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
>>>  	struct scatterlist *sg;
>>>  	int i;
>>>  
>>> -	for_each_sg(sgl, sg, nents, i)
>>> +	for_each_sg(sgl, sg, nents, i) {
>>> +		if (sg_is_pci_p2pdma(sg)) {
>>> +			sg_unmark_pci_p2pdma(sg);
>>
>> This doesn't seem nice, the DMA layer should only alter the DMA
>> portion of the SG, not the other portions. Is it necessary?
> 
> Oh, I got it completely wrong what this is for.
> 
> This should be named sg_dma_mark_pci_p2p() and similar for other
> functions to make it clear it is part of the DMA side of the SG
> interface (eg it is like sg_dma_address, sg_dma_len, etc)

Fair point. Yes, I'll rename this for the next version.

Logan
John Hubbard May 2, 2021, 11:28 p.m. UTC | #4
On 4/8/21 10:01 AM, Logan Gunthorpe wrote:
> Add PCI P2PDMA support for dma_direct_map_sg() so that it can map
> PCI P2PDMA pages directly without a hack in the callers. This allows
> for heterogeneous SGLs that contain both P2PDMA and regular pages.
> 
> SGL segments that contain PCI bus addresses are marked with
> sg_mark_pci_p2pdma() and are ignored when unmapped.
> 
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> ---
>   kernel/dma/direct.c | 25 ++++++++++++++++++++++---
>   1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 002268262c9a..108dfb4ecbd5 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -13,6 +13,7 @@
>   #include <linux/vmalloc.h>
>   #include <linux/set_memory.h>
>   #include <linux/slab.h>
> +#include <linux/pci-p2pdma.h>
>   #include "direct.h"
>   
>   /*
> @@ -387,19 +388,37 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,

This routine now deserves a little bit of commenting, now that it is
doing less obvious things. How about something like this:

/*
  * Unmaps pages, except for PCI_P2PDMA pages, which were never mapped in the
  * first place. Instead of unmapping PCI_P2PDMA entries, simply remove the
  * SG_PCI_P2PDMA mark
  */
void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
		int nents, enum dma_data_direction dir, unsigned long attrs)
{


>   	struct scatterlist *sg;
>   	int i;
>   
> -	for_each_sg(sgl, sg, nents, i)
> +	for_each_sg(sgl, sg, nents, i) {
> +		if (sg_is_pci_p2pdma(sg)) {
> +			sg_unmark_pci_p2pdma(sg);
> +			continue;
> +		}
> +
>   		dma_direct_unmap_page(dev, sg->dma_address, sg_dma_len(sg), dir,
>   			     attrs);
> +	}

The same thing can be achieved with fewer lines and a bit more clarity.
Can we please do it like this instead:

	for_each_sg(sgl, sg, nents, i) {
		if (sg_is_pci_p2pdma(sg))
			sg_unmark_pci_p2pdma(sg);
		else
			dma_direct_unmap_page(dev, sg->dma_address,
					      sg_dma_len(sg), dir, attrs);
	}


>   }
>   #endif
>   

Also here, a block comment for the function would be nice. How about
approximately this:

/*
  * Maps each SG segment. Returns the number of entries mapped, or 0 upon
  * failure. If any entry could not be mapped, then no entries are mapped.
  */

I'll stop complaining about the pre-existing return code conventions,
since by now you know what I was thinking of saying. :)

>   int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>   		enum dma_data_direction dir, unsigned long attrs)
>   {
> -	int i;
> +	struct pci_p2pdma_map_state p2pdma_state = {};

Is it worth putting this stuff on the stack--is there a noticeable
performance improvement from caching the state? Because if it's
invisible, then simplicity is better. I suspect you're right, and that
it *is* worth it, but it's good to know for real.

>   	struct scatterlist *sg;
> +	int i, ret = 0;
>   
>   	for_each_sg(sgl, sg, nents, i) {
> +		if (is_pci_p2pdma_page(sg_page(sg))) {
> +			ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg,
> +						     attrs);
> +			if (ret < 0) {
> +				goto out_unmap;
> +			} else if (ret) {
> +				ret = 0;
> +				continue;

Is this a bug? If neither of those "if" branches fires (ret == 0), then
the code (probably unintentionally) falls through and continues on to
attempt to call dma_direct_map_page()--despite being a PCI_P2PDMA page!

See below for suggestions:

> +			}
> +		}
> +
>   		sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
>   				sg->offset, sg->length, dir, attrs);
>   		if (sg->dma_address == DMA_MAPPING_ERROR)

This is another case in which "continue" is misleading and not as good
as "else". Because unless I'm wrong above, you really only want to take
one path *or* the other.

Also, the "else if (ret)" can be simplified to just setting ret = 0
unconditionally.

Given all that, here's a suggested alternative, which is both shorter
and clearer, IMHO:

	for_each_sg(sgl, sg, nents, i) {
		if (is_pci_p2pdma_page(sg_page(sg))) {
			ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg,
						     attrs);
			if (ret < 0)
				goto out_unmap;
			else
				ret = 0;
		} else {
			sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
					sg->offset, sg->length, dir, attrs);
			if (sg->dma_address == DMA_MAPPING_ERROR)
				goto out_unmap;
			sg_dma_len(sg) = sg->length;
		}
	}

thanks,
John Hubbard May 2, 2021, 11:32 p.m. UTC | #5
On 5/2/21 4:28 PM, John Hubbard wrote:
> On 4/8/21 10:01 AM, Logan Gunthorpe wrote:
...
>> @@ -387,19 +388,37 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
> 
> This routine now deserves a little bit of commenting, now that it is
> doing less obvious things. How about something like this:
> 
> /*
> * Unmaps pages, except for PCI_P2PDMA pages, which were never mapped in the
> * first place. Instead of unmapping PCI_P2PDMA entries, simply remove the
> * SG_PCI_P2PDMA mark
> */

I got that kind of wrong. They *were* mapped, but need to be left mostly
alone...maybe you can word it better. Here's my second draft:

/*
  * Unmaps pages, except for PCI_P2PDMA pages, which should not be unmapped at
  * this point. Instead of unmapping PCI_P2PDMA entries, simply remove the
  * SG_PCI_P2PDMA mark.
  */

...am I getting close? :)

thanks,
Logan Gunthorpe May 3, 2021, 4:55 p.m. UTC | #6
On 2021-05-02 5:28 p.m., John Hubbard wrote:
>> @@ -387,19 +388,37 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
> 
> This routine now deserves a little bit of commenting, now that it is
> doing less obvious things. How about something like this:
> 
> /*
>   * Unmaps pages, except for PCI_P2PDMA pages, which were never mapped in the
>   * first place. Instead of unmapping PCI_P2PDMA entries, simply remove the
>   * SG_PCI_P2PDMA mark
>   */
> void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
> 		int nents, enum dma_data_direction dir, unsigned long attrs)
> {
> 

Ok.

>>   	struct scatterlist *sg;
>>   	int i;
>>   
>> -	for_each_sg(sgl, sg, nents, i)
>> +	for_each_sg(sgl, sg, nents, i) {
>> +		if (sg_is_pci_p2pdma(sg)) {
>> +			sg_unmark_pci_p2pdma(sg);
>> +			continue;
>> +		}
>> +
>>   		dma_direct_unmap_page(dev, sg->dma_address, sg_dma_len(sg), dir,
>>   			     attrs);
>> +	}
> 
> The same thing can be achieved with fewer lines and a bit more clarity.
> Can we please do it like this instead:
> 
> 	for_each_sg(sgl, sg, nents, i) {
> 		if (sg_is_pci_p2pdma(sg))
> 			sg_unmark_pci_p2pdma(sg);
> 		else
> 			dma_direct_unmap_page(dev, sg->dma_address,
> 					      sg_dma_len(sg), dir, attrs);
> 	}
> 
> 

That's debatable (the way I did it emphasizes the common case). But I'll
consider changing it.

> 
> Also here, a block comment for the function would be nice. How about
> approximately this:
> 
> /*
>   * Maps each SG segment. Returns the number of entries mapped, or 0 upon
>   * failure. If any entry could not be mapped, then no entries are mapped.
>   */
> 
> I'll stop complaining about the pre-existing return code conventions,
> since by now you know what I was thinking of saying. :)

Not really part of this patchset... Seems like if you think there should
be a comment like that here, you should send a patch. But this patch
starts returning a negative value here.

>>   int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>>   		enum dma_data_direction dir, unsigned long attrs)
>>   {
>> -	int i;
>> +	struct pci_p2pdma_map_state p2pdma_state = {};
> 
> Is it worth putting this stuff on the stack--is there a noticeable
> performance improvement from caching the state? Because if it's
> invisible, then simplicity is better. I suspect you're right, and that
> it *is* worth it, but it's good to know for real.
> 
>>   	struct scatterlist *sg;
>> +	int i, ret = 0;
>>   
>>   	for_each_sg(sgl, sg, nents, i) {
>> +		if (is_pci_p2pdma_page(sg_page(sg))) {
>> +			ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg,
>> +						     attrs);
>> +			if (ret < 0) {
>> +				goto out_unmap;
>> +			} else if (ret) {
>> +				ret = 0;
>> +				continue;
> 
> Is this a bug? If neither of those "if" branches fires (ret == 0), then
> the code (probably unintentionally) falls through and continues on to
> attempt to call dma_direct_map_page()--despite being a PCI_P2PDMA page!

No, it's not a bug. Per the documentation of pci_p2pdma_map_segment(),
if it returns zero the segment should be mapped normally. P2PDMA pages
must be mapped with physical addresses (or IOVA addresses) if the TLPS
for the transaction will go through the host bridge.

> See below for suggestions:
> 
>> +			}
>> +		}
>> +
>>   		sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
>>   				sg->offset, sg->length, dir, attrs);
>>   		if (sg->dma_address == DMA_MAPPING_ERROR)
> 
> This is another case in which "continue" is misleading and not as good
> as "else". Because unless I'm wrong above, you really only want to take
> one path *or* the other.

No, per above, it's not one path or the other. If it's a P2PDMA page it
may still need to be mapped normally.

> Also, the "else if (ret)" can be simplified to just setting ret = 0
> unconditionally.

I don't follow. If ret is set, we need to unset it before the end of the
loop.

> Given all that, here's a suggested alternative, which is both shorter
> and clearer, IMHO:
> 
> 	for_each_sg(sgl, sg, nents, i) {
> 		if (is_pci_p2pdma_page(sg_page(sg))) {
> 			ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg,
> 						     attrs);
> 			if (ret < 0)
> 				goto out_unmap;
> 			else
> 				ret = 0;
> 		} else {
> 			sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
> 					sg->offset, sg->length, dir, attrs);
> 			if (sg->dma_address == DMA_MAPPING_ERROR)
> 				goto out_unmap;
> 			sg_dma_len(sg) = sg->length;
> 		}
> 	}

No, per the comments above, this does not accomplish the same thing and
is not correct.

I'll try to add a comment to the code to make it more clearer. But the
kernel doc on pci_p2pdma_map_segment() does mention what must be done
for different return values explicitly.

Logan
Logan Gunthorpe May 3, 2021, 5:04 p.m. UTC | #7
Oops missed a comment:

On 2021-05-02 5:28 p.m., John Hubbard wrote:
>>   int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>>   		enum dma_data_direction dir, unsigned long attrs)
>>   {
>> -	int i;
>> +	struct pci_p2pdma_map_state p2pdma_state = {};
> 
> Is it worth putting this stuff on the stack--is there a noticeable
> performance improvement from caching the state? Because if it's
> invisible, then simplicity is better. I suspect you're right, and that
> it *is* worth it, but it's good to know for real.

I haven't measured it (it would be hard to measure), but I think it's
fairly clear here. Without the state, xa_load() would need to be called
on *every* page in an SGL that maps only P2PDMA memory from one device.
With the state, it only needs to be called once. xa_load() is cheap, but
it is not that cheap.

There's essentially the same optimization in get_user_pages for
ZONE_DEVICE pages. So, if it is necessary there, it should be necessary
here.

Logan
Logan Gunthorpe May 3, 2021, 5:06 p.m. UTC | #8
On 2021-05-02 5:32 p.m., John Hubbard wrote:
> On 5/2/21 4:28 PM, John Hubbard wrote:
>> On 4/8/21 10:01 AM, Logan Gunthorpe wrote:
> ...
>>> @@ -387,19 +388,37 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
>>
>> This routine now deserves a little bit of commenting, now that it is
>> doing less obvious things. How about something like this:
>>
>> /*
>> * Unmaps pages, except for PCI_P2PDMA pages, which were never mapped in the
>> * first place. Instead of unmapping PCI_P2PDMA entries, simply remove the
>> * SG_PCI_P2PDMA mark
>> */
> 
> I got that kind of wrong. They *were* mapped, but need to be left mostly
> alone...maybe you can word it better. Here's my second draft:
> 
> /*
>   * Unmaps pages, except for PCI_P2PDMA pages, which should not be unmapped at
>   * this point. Instead of unmapping PCI_P2PDMA entries, simply remove the
>   * SG_PCI_P2PDMA mark.
>   */
> 
> ...am I getting close? :)

I don't think your original comment was wrong per se. But I guess it
depends on your definition of "mapped". In dma-direct the physical
address is added to the SGL and, on some arches, the address has to be
synced on unmap. With P2PDMA, the PCI bus address is sometimes added to
the SGL and no sync is necessary at the end.

Logan
John Hubbard May 4, 2021, 12:01 a.m. UTC | #9
On 5/3/21 10:04 AM, Logan Gunthorpe wrote:
> Oops missed a comment:
> 
> On 2021-05-02 5:28 p.m., John Hubbard wrote:
>>>    int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>>>    		enum dma_data_direction dir, unsigned long attrs)
>>>    {
>>> -	int i;
>>> +	struct pci_p2pdma_map_state p2pdma_state = {};
>>
>> Is it worth putting this stuff on the stack--is there a noticeable
>> performance improvement from caching the state? Because if it's
>> invisible, then simplicity is better. I suspect you're right, and that
>> it *is* worth it, but it's good to know for real.
> 
> I haven't measured it (it would be hard to measure), but I think it's
> fairly clear here. Without the state, xa_load() would need to be called
> on *every* page in an SGL that maps only P2PDMA memory from one device.
> With the state, it only needs to be called once. xa_load() is cheap, but
> it is not that cheap.

OK, thanks for spelling it out for me. :)

> 
> There's essentially the same optimization in get_user_pages for
> ZONE_DEVICE pages. So, if it is necessary there, it should be necessary
> here.
> 

Right, that's a pretty solid example.

thanks,
John Hubbard May 4, 2021, 12:12 a.m. UTC | #10
On 5/3/21 9:55 AM, Logan Gunthorpe wrote:
...
>> The same thing can be achieved with fewer lines and a bit more clarity.
>> Can we please do it like this instead:
>>
>> 	for_each_sg(sgl, sg, nents, i) {
>> 		if (sg_is_pci_p2pdma(sg))
>> 			sg_unmark_pci_p2pdma(sg);
>> 		else
>> 			dma_direct_unmap_page(dev, sg->dma_address,
>> 					      sg_dma_len(sg), dir, attrs);
>> 	}
>>
>>
> 
> That's debatable (the way I did it emphasizes the common case). But I'll
> consider changing it.
> 

Thanks for considering it.

>>
>> Also here, a block comment for the function would be nice. How about
>> approximately this:
>>
>> /*
>>    * Maps each SG segment. Returns the number of entries mapped, or 0 upon
>>    * failure. If any entry could not be mapped, then no entries are mapped.
>>    */
>>
>> I'll stop complaining about the pre-existing return code conventions,
>> since by now you know what I was thinking of saying. :)
> 
> Not really part of this patchset... Seems like if you think there should
> be a comment like that here, you should send a patch. But this patch
> starts returning a negative value here.

OK, that's fine. Like you say, that comment is rather beyond this patchset.

>>>    	for_each_sg(sgl, sg, nents, i) {
>>> +		if (is_pci_p2pdma_page(sg_page(sg))) {
>>> +			ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg,
>>> +						     attrs);
>>> +			if (ret < 0) {
>>> +				goto out_unmap;
>>> +			} else if (ret) {
>>> +				ret = 0;
>>> +				continue;
>>
>> Is this a bug? If neither of those "if" branches fires (ret == 0), then
>> the code (probably unintentionally) falls through and continues on to
>> attempt to call dma_direct_map_page()--despite being a PCI_P2PDMA page!
> 
> No, it's not a bug. Per the documentation of pci_p2pdma_map_segment(),
> if it returns zero the segment should be mapped normally. P2PDMA pages
> must be mapped with physical addresses (or IOVA addresses) if the TLPS
> for the transaction will go through the host bridge.

Could we maybe put a little comment there, to that effect? It would be
nice to call out that point, especially since it is common to miss one
case (negative, 0, positive) when handling return values. Sort of like
we used to put "// fallthrough" in the case statements. Not a big deal
of course.

> 
>> See below for suggestions:
>>
>>> +			}
>>> +		}
>>> +
>>>    		sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
>>>    				sg->offset, sg->length, dir, attrs);
>>>    		if (sg->dma_address == DMA_MAPPING_ERROR)
>>
>> This is another case in which "continue" is misleading and not as good
>> as "else". Because unless I'm wrong above, you really only want to take
>> one path *or* the other.
> 
> No, per above, it's not one path or the other. If it's a P2PDMA page it
> may still need to be mapped normally.
> 

Right. That follows.

thanks,
diff mbox series

Patch

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 002268262c9a..108dfb4ecbd5 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -13,6 +13,7 @@ 
 #include <linux/vmalloc.h>
 #include <linux/set_memory.h>
 #include <linux/slab.h>
+#include <linux/pci-p2pdma.h>
 #include "direct.h"
 
 /*
@@ -387,19 +388,37 @@  void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
 	struct scatterlist *sg;
 	int i;
 
-	for_each_sg(sgl, sg, nents, i)
+	for_each_sg(sgl, sg, nents, i) {
+		if (sg_is_pci_p2pdma(sg)) {
+			sg_unmark_pci_p2pdma(sg);
+			continue;
+		}
+
 		dma_direct_unmap_page(dev, sg->dma_address, sg_dma_len(sg), dir,
 			     attrs);
+	}
 }
 #endif
 
 int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 		enum dma_data_direction dir, unsigned long attrs)
 {
-	int i;
+	struct pci_p2pdma_map_state p2pdma_state = {};
 	struct scatterlist *sg;
+	int i, ret = 0;
 
 	for_each_sg(sgl, sg, nents, i) {
+		if (is_pci_p2pdma_page(sg_page(sg))) {
+			ret = pci_p2pdma_map_segment(&p2pdma_state, dev, sg,
+						     attrs);
+			if (ret < 0) {
+				goto out_unmap;
+			} else if (ret) {
+				ret = 0;
+				continue;
+			}
+		}
+
 		sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
 				sg->offset, sg->length, dir, attrs);
 		if (sg->dma_address == DMA_MAPPING_ERROR)
@@ -411,7 +430,7 @@  int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 
 out_unmap:
 	dma_direct_unmap_sg(dev, sgl, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC);
-	return 0;
+	return ret;
 }
 
 dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,