diff mbox series

[v2,1/5] PCI/P2PDMA: Don't enforce ACS check for functions of same device

Message ID 20241021052236.1820329-2-vivek.kasireddy@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/xe/sriov: Don't migrate dmabuf BO to System RAM while running in VM | expand

Commit Message

Kasireddy, Vivek Oct. 21, 2024, 5:21 a.m. UTC
Functions of the same PCI device (such as a PF and a VF) share the
same bus and have a common root port and typically, the PF provisions
resources for the VF. Therefore, they can be considered compatible
as far as P2P access is considered.

Currently, although the distance (2) is correctly calculated for
functions of the same device, an ACS check failure prevents P2P DMA
access between them. Therefore, introduce a small function named
pci_devs_are_p2pdma_compatible() to determine if the provider and
client belong to the same device and facilitate P2P DMA between
them by not enforcing the ACS check.

v2:
- Relax the enforcment of ACS check only for Intel GPU functions
  as they are P2PDMA compatible given the way the PF provisions
  the resources among multiple VFs.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: <linux-pci@vger.kernel.org>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
 drivers/pci/p2pdma.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Comments

Bjorn Helgaas Oct. 22, 2024, 3:16 p.m. UTC | #1
On Sun, Oct 20, 2024 at 10:21:29PM -0700, Vivek Kasireddy wrote:
> Functions of the same PCI device (such as a PF and a VF) share the
> same bus and have a common root port and typically, the PF provisions
> resources for the VF. Therefore, they can be considered compatible
> as far as P2P access is considered.
> 
> Currently, although the distance (2) is correctly calculated for
> functions of the same device, an ACS check failure prevents P2P DMA
> access between them. Therefore, introduce a small function named
> pci_devs_are_p2pdma_compatible() to determine if the provider and
> client belong to the same device and facilitate P2P DMA between
> them by not enforcing the ACS check.
> 
> v2:
> - Relax the enforcment of ACS check only for Intel GPU functions
>   as they are P2PDMA compatible given the way the PF provisions
>   the resources among multiple VFs.

I don't want version history in the commit log.  If the content is
useful, just incorporate it here directly (without the version info),
and put the version-to-version changelog below the "---".

> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: <linux-pci@vger.kernel.org>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> ---
>  drivers/pci/p2pdma.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index 4f47a13cb500..a230e661f939 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -535,6 +535,17 @@ static unsigned long map_types_idx(struct pci_dev *client)
>  	return (pci_domain_nr(client->bus) << 16) | pci_dev_id(client);
>  }
>  
> +static bool pci_devs_are_p2pdma_compatible(struct pci_dev *provider,
> +					   struct pci_dev *client)
> +{
> +	if (provider->vendor == PCI_VENDOR_ID_INTEL) {
> +		if (pci_is_vga(provider) && pci_is_vga(client))
> +			return pci_physfn(provider) == pci_physfn(client);
> +	}

This doesn't explain why this should be specific to Intel or VGA.  As
far as I can tell, everything mentioned in the commit log is generic.

I see the previous comments
(https://lore.kernel.org/all/eddb423c-945f-40c9-b904-43ea8371f1c4@deltatee.com/),
but none of that context was captured here.

I'm not sure what you refer to by "PF provisions resources for the
VF".  Isn't it *always* the case that the architected PCI resources
(BARs) are configured by the PF?  It sounds like you're referring to
something Intel GPU-specific beyond that?

> +	return false;
> +}
> +
>  /*
>   * Calculate the P2PDMA mapping type and distance between two PCI devices.
>   *
> @@ -634,7 +645,7 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
>  
>  	*dist = dist_a + dist_b;
>  
> -	if (!acs_cnt) {
> +	if (!acs_cnt || pci_devs_are_p2pdma_compatible(provider, client)) {
>  		map_type = PCI_P2PDMA_MAP_BUS_ADDR;
>  		goto done;
>  	}
> @@ -696,7 +707,9 @@ int pci_p2pdma_distance_many(struct pci_dev *provider, struct device **clients,
>  		return -1;
>  
>  	for (i = 0; i < num_clients; i++) {
> -		pci_client = find_parent_pci_dev(clients[i]);
> +		pci_client = dev_is_pf(clients[i]) ?
> +				pci_dev_get(to_pci_dev(clients[i])) :
> +				find_parent_pci_dev(clients[i]);
>  		if (!pci_client) {
>  			if (verbose)
>  				dev_warn(clients[i],
> -- 
> 2.45.1
>
Logan Gunthorpe Oct. 22, 2024, 9:15 p.m. UTC | #2
On 2024-10-22 09:16, Bjorn Helgaas wrote:
> On Sun, Oct 20, 2024 at 10:21:29PM -0700, Vivek Kasireddy wrote:
>> Functions of the same PCI device (such as a PF and a VF) share the
>> same bus and have a common root port and typically, the PF provisions
>> resources for the VF. Therefore, they can be considered compatible
>> as far as P2P access is considered.
>>
>> Currently, although the distance (2) is correctly calculated for
>> functions of the same device, an ACS check failure prevents P2P DMA
>> access between them. Therefore, introduce a small function named
>> pci_devs_are_p2pdma_compatible() to determine if the provider and
>> client belong to the same device and facilitate P2P DMA between
>> them by not enforcing the ACS check.
>>
>> v2:
>> - Relax the enforcment of ACS check only for Intel GPU functions
>>   as they are P2PDMA compatible given the way the PF provisions
>>   the resources among multiple VFs.
> 
> I don't want version history in the commit log.  If the content is
> useful, just incorporate it here directly (without the version info),
> and put the version-to-version changelog below the "---".
> 
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: Logan Gunthorpe <logang@deltatee.com>
>> Cc: <linux-pci@vger.kernel.org>
>> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
>> ---
>>  drivers/pci/p2pdma.c | 17 +++++++++++++++--
>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
>> index 4f47a13cb500..a230e661f939 100644
>> --- a/drivers/pci/p2pdma.c
>> +++ b/drivers/pci/p2pdma.c
>> @@ -535,6 +535,17 @@ static unsigned long map_types_idx(struct pci_dev *client)
>>  	return (pci_domain_nr(client->bus) << 16) | pci_dev_id(client);
>>  }
>>  
>> +static bool pci_devs_are_p2pdma_compatible(struct pci_dev *provider,
>> +					   struct pci_dev *client)
>> +{
>> +	if (provider->vendor == PCI_VENDOR_ID_INTEL) {
>> +		if (pci_is_vga(provider) && pci_is_vga(client))
>> +			return pci_physfn(provider) == pci_physfn(client);
>> +	}

I'd echo many of Bjorn's concerns. In addition, I think the name of the
pci_devs_are_p2pdma_compatible() isn't quite right. Specifically this is
dealing with PCI functions within a single device that are known to
allow P2P traffic. So I think the name should probably reflect that.

Thanks,

Logan
Kasireddy, Vivek Oct. 24, 2024, 5:50 a.m. UTC | #3
Hi Logan,

> Subject: Re: [PATCH v2 1/5] PCI/P2PDMA: Don't enforce ACS check for
> functions of same device
> 
> 
> 
> On 2024-10-22 09:16, Bjorn Helgaas wrote:
> > On Sun, Oct 20, 2024 at 10:21:29PM -0700, Vivek Kasireddy wrote:
> >> Functions of the same PCI device (such as a PF and a VF) share the
> >> same bus and have a common root port and typically, the PF provisions
> >> resources for the VF. Therefore, they can be considered compatible
> >> as far as P2P access is considered.
> >>
> >> Currently, although the distance (2) is correctly calculated for
> >> functions of the same device, an ACS check failure prevents P2P DMA
> >> access between them. Therefore, introduce a small function named
> >> pci_devs_are_p2pdma_compatible() to determine if the provider and
> >> client belong to the same device and facilitate P2P DMA between
> >> them by not enforcing the ACS check.
> >>
> >> v2:
> >> - Relax the enforcment of ACS check only for Intel GPU functions
> >>   as they are P2PDMA compatible given the way the PF provisions
> >>   the resources among multiple VFs.
> >
> > I don't want version history in the commit log.  If the content is
> > useful, just incorporate it here directly (without the version info),
> > and put the version-to-version changelog below the "---".
> >
> >> Cc: Bjorn Helgaas <bhelgaas@google.com>
> >> Cc: Logan Gunthorpe <logang@deltatee.com>
> >> Cc: <linux-pci@vger.kernel.org>
> >> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> >> ---
> >>  drivers/pci/p2pdma.c | 17 +++++++++++++++--
> >>  1 file changed, 15 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> >> index 4f47a13cb500..a230e661f939 100644
> >> --- a/drivers/pci/p2pdma.c
> >> +++ b/drivers/pci/p2pdma.c
> >> @@ -535,6 +535,17 @@ static unsigned long map_types_idx(struct
> pci_dev *client)
> >>  	return (pci_domain_nr(client->bus) << 16) | pci_dev_id(client);
> >>  }
> >>
> >> +static bool pci_devs_are_p2pdma_compatible(struct pci_dev *provider,
> >> +					   struct pci_dev *client)
> >> +{
> >> +	if (provider->vendor == PCI_VENDOR_ID_INTEL) {
> >> +		if (pci_is_vga(provider) && pci_is_vga(client))
> >> +			return pci_physfn(provider) == pci_physfn(client);
> >> +	}
> 
> I'd echo many of Bjorn's concerns. In addition, I think the name of the
> pci_devs_are_p2pdma_compatible() isn't quite right. Specifically this is
> dealing with PCI functions within a single device that are known to
> allow P2P traffic. So I think the name should probably reflect that.
Would pci_devfns_support_p2pdma() be a more appropriate name?

Thanks,
Vivek

> 
> Thanks,
> 
> Logan
Kasireddy, Vivek Oct. 24, 2024, 5:58 a.m. UTC | #4
Hi Bjorn,

> Subject: Re: [PATCH v2 1/5] PCI/P2PDMA: Don't enforce ACS check for
> functions of same device
> 
> On Sun, Oct 20, 2024 at 10:21:29PM -0700, Vivek Kasireddy wrote:
> > Functions of the same PCI device (such as a PF and a VF) share the
> > same bus and have a common root port and typically, the PF provisions
> > resources for the VF. Therefore, they can be considered compatible
> > as far as P2P access is considered.
> >
> > Currently, although the distance (2) is correctly calculated for
> > functions of the same device, an ACS check failure prevents P2P DMA
> > access between them. Therefore, introduce a small function named
> > pci_devs_are_p2pdma_compatible() to determine if the provider and
> > client belong to the same device and facilitate P2P DMA between
> > them by not enforcing the ACS check.
> >
> > v2:
> > - Relax the enforcment of ACS check only for Intel GPU functions
> >   as they are P2PDMA compatible given the way the PF provisions
> >   the resources among multiple VFs.
> 
> I don't want version history in the commit log.  If the content is
> useful, just incorporate it here directly (without the version info),
> and put the version-to-version changelog below the "---".
Ok, noted; will follow your suggestion for the next versions.

> 
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: Logan Gunthorpe <logang@deltatee.com>
> > Cc: <linux-pci@vger.kernel.org>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> > ---
> >  drivers/pci/p2pdma.c | 17 +++++++++++++++--
> >  1 file changed, 15 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> > index 4f47a13cb500..a230e661f939 100644
> > --- a/drivers/pci/p2pdma.c
> > +++ b/drivers/pci/p2pdma.c
> > @@ -535,6 +535,17 @@ static unsigned long map_types_idx(struct pci_dev
> *client)
> >  	return (pci_domain_nr(client->bus) << 16) | pci_dev_id(client);
> >  }
> >
> > +static bool pci_devs_are_p2pdma_compatible(struct pci_dev *provider,
> > +					   struct pci_dev *client)
> > +{
> > +	if (provider->vendor == PCI_VENDOR_ID_INTEL) {
> > +		if (pci_is_vga(provider) && pci_is_vga(client))
> > +			return pci_physfn(provider) == pci_physfn(client);
> > +	}
> 
> This doesn't explain why this should be specific to Intel or VGA.  As
> far as I can tell, everything mentioned in the commit log is generic.
> 
> I see the previous comments
> (https://lore.kernel.org/all/eddb423c-945f-40c9-b904-
> 43ea8371f1c4@deltatee.com/),
> but none of that context was captured here.
Ok, I'll augment the commit message to include this context.

> 
> I'm not sure what you refer to by "PF provisions resources for the
> VF".  Isn't it *always* the case that the architected PCI resources
> (BARs) are configured by the PF?  It sounds like you're referring to
> something Intel GPU-specific beyond that?
What I meant to say is that since PF provisions the resources for the VF
in a typical scenario, they should be automatically P2PDMA compatible
particularly when the provider is the VF and PF is the client. However,
since this cannot be guaranteed on all the PCI devices out there for various
reasons, my objective is to start including the ones that can be tested and
are known to be compatible (Intel GPUs).

I'll capture these additional details in the next version.

Thanks,
Vivek

> 
> > +	return false;
> > +}
> > +
> >  /*
> >   * Calculate the P2PDMA mapping type and distance between two PCI
> devices.
> >   *
> > @@ -634,7 +645,7 @@ calc_map_type_and_dist(struct pci_dev *provider,
> struct pci_dev *client,
> >
> >  	*dist = dist_a + dist_b;
> >
> > -	if (!acs_cnt) {
> > +	if (!acs_cnt || pci_devs_are_p2pdma_compatible(provider, client)) {
> >  		map_type = PCI_P2PDMA_MAP_BUS_ADDR;
> >  		goto done;
> >  	}
> > @@ -696,7 +707,9 @@ int pci_p2pdma_distance_many(struct pci_dev
> *provider, struct device **clients,
> >  		return -1;
> >
> >  	for (i = 0; i < num_clients; i++) {
> > -		pci_client = find_parent_pci_dev(clients[i]);
> > +		pci_client = dev_is_pf(clients[i]) ?
> > +				pci_dev_get(to_pci_dev(clients[i])) :
> > +				find_parent_pci_dev(clients[i]);
> >  		if (!pci_client) {
> >  			if (verbose)
> >  				dev_warn(clients[i],
> > --
> > 2.45.1
> >
Logan Gunthorpe Oct. 24, 2024, 4:21 p.m. UTC | #5
On 2024-10-23 23:50, Kasireddy, Vivek wrote:
>> I'd echo many of Bjorn's concerns. In addition, I think the name of the
>> pci_devs_are_p2pdma_compatible() isn't quite right. Specifically this is
>> dealing with PCI functions within a single device that are known to
>> allow P2P traffic. So I think the name should probably reflect that.
> Would pci_devfns_support_p2pdma() be a more appropriate name?

That sounds better to me, thanks.

Logan
Bjorn Helgaas Oct. 24, 2024, 5:59 p.m. UTC | #6
On Thu, Oct 24, 2024 at 05:58:48AM +0000, Kasireddy, Vivek wrote:
> > Subject: Re: [PATCH v2 1/5] PCI/P2PDMA: Don't enforce ACS check for
> > functions of same device
> > 
> > On Sun, Oct 20, 2024 at 10:21:29PM -0700, Vivek Kasireddy wrote:
> > > Functions of the same PCI device (such as a PF and a VF) share the
> > > same bus and have a common root port and typically, the PF provisions
> > > resources for the VF. Therefore, they can be considered compatible
> > > as far as P2P access is considered.

I don't understand the "therefore they can be considered compatible"
conclusion.  The spec quote below seems like it addresses exactly this
situation: it says ACS can control peer-to-peer requests between VFs.

> ...
> > I'm not sure what you refer to by "PF provisions resources for the
> > VF".  Isn't it *always* the case that the architected PCI
> > resources (BARs) are configured by the PF?  It sounds like you're
> > referring to something Intel GPU-specific beyond that?
>
> What I meant to say is that since PF provisions the resources for
> the VF in a typical scenario,

Are you talking about BARs?  As far as I know, the PF BAR assignments
always (not just in typical scenarios) determine the VF BAR
assignments.  

Or are you referring to some other non-BAR resources?

> they should be automatically P2PDMA compatible particularly when the
> provider is the VF and PF is the client. However, since this cannot
> be guaranteed on all the PCI devices out there for various reasons,
> my objective is to start including the ones that can be tested and
> are known to be compatible (Intel GPUs).

Regardless of BAR or other VF resources, I don't think VFs are
automatically P2PDMA compatible.  For example, PCIe r6.0, sec 6.12.1.2
says:

  For ACS requirements, single-Function devices that are SR-IOV
  capable must be handled as if they were Multi-Function Devices.

  ...

  - ACS P2P Request Redirect: must be implemented by Functions that
    support peer-to-peer traffic with other Functions. This includes
    SR-IOV Virtual Functions (VFs).  ACS P2P Request Redirect is
    subject to interaction with the ACS P2P Egress Control and ACS
    Direct Translated P2P mechanisms (if implemented). Refer to
    Section 6.12.3 for more information.  When ACS P2P Request
    Redirect is enabled in a Multi-Function Device that is not an
    RCiEP, peer-to-peer Requests (between Functions of the device)
    must be redirected Upstream towards the RC.

Or do you mean something else by "P2PDMA compatible"?

Bjorn
Bjorn Helgaas Oct. 24, 2024, 6:01 p.m. UTC | #7
On Thu, Oct 24, 2024 at 10:21:17AM -0600, Logan Gunthorpe wrote:
> On 2024-10-23 23:50, Kasireddy, Vivek wrote:
> >> I'd echo many of Bjorn's concerns. In addition, I think the name of the
> >> pci_devs_are_p2pdma_compatible() isn't quite right. Specifically this is
> >> dealing with PCI functions within a single device that are known to
> >> allow P2P traffic. So I think the name should probably reflect that.
> >
> > Would pci_devfns_support_p2pdma() be a more appropriate name?
> 
> That sounds better to me, thanks.

This sounds similar to what's done in pci_dev_specific_acs_enabled().
Could this problem be solved by adding more device-specific quirks
there?

Bjorn
Kasireddy, Vivek Oct. 25, 2024, 6:57 a.m. UTC | #8
Hi Bjorn,

> Subject: Re: [PATCH v2 1/5] PCI/P2PDMA: Don't enforce ACS check for
> functions of same device
> 
> On Thu, Oct 24, 2024 at 05:58:48AM +0000, Kasireddy, Vivek wrote:
> > > Subject: Re: [PATCH v2 1/5] PCI/P2PDMA: Don't enforce ACS check for
> > > functions of same device
> > >
> > > On Sun, Oct 20, 2024 at 10:21:29PM -0700, Vivek Kasireddy wrote:
> > > > Functions of the same PCI device (such as a PF and a VF) share the
> > > > same bus and have a common root port and typically, the PF provisions
> > > > resources for the VF. Therefore, they can be considered compatible
> > > > as far as P2P access is considered.
> 
> I don't understand the "therefore they can be considered compatible"
> conclusion.  The spec quote below seems like it addresses exactly this
> situation: it says ACS can control peer-to-peer requests between VFs.
I am only referring to the specific case where the PF is trying to access (P2P)
a VF's resource that the PF itself has provisioned. Shouldn't this be considered
a valid access?

> 
> > ...
> > > I'm not sure what you refer to by "PF provisions resources for the
> > > VF".  Isn't it *always* the case that the architected PCI
> > > resources (BARs) are configured by the PF?  It sounds like you're
> > > referring to something Intel GPU-specific beyond that?
> >
> > What I meant to say is that since PF provisions the resources for
> > the VF in a typical scenario,
> 
> Are you talking about BARs?  As far as I know, the PF BAR assignments
> always (not just in typical scenarios) determine the VF BAR
> assignments.
Right, I am indeed talking about BARs.

> 
> Or are you referring to some other non-BAR resources?
> 
> > they should be automatically P2PDMA compatible particularly when the
> > provider is the VF and PF is the client. However, since this cannot
> > be guaranteed on all the PCI devices out there for various reasons,
> > my objective is to start including the ones that can be tested and
> > are known to be compatible (Intel GPUs).
> 
> Regardless of BAR or other VF resources, I don't think VFs are
> automatically P2PDMA compatible.
I agree that VFs in general are not automatically P2PDMA compatible
but a PF and a VF should be considered compatible particularly when the
provider is a VF and PF is the client.

> For example, PCIe r6.0, sec 6.12.1.2  says:
> 
>   For ACS requirements, single-Function devices that are SR-IOV
>   capable must be handled as if they were Multi-Function Devices.
> 
>   ...
> 
>   - ACS P2P Request Redirect: must be implemented by Functions that
>     support peer-to-peer traffic with other Functions. This includes
>     SR-IOV Virtual Functions (VFs).  ACS P2P Request Redirect is
>     subject to interaction with the ACS P2P Egress Control and ACS
>     Direct Translated P2P mechanisms (if implemented). Refer to
>     Section 6.12.3 for more information.  When ACS P2P Request
>     Redirect is enabled in a Multi-Function Device that is not an
>     RCiEP, peer-to-peer Requests (between Functions of the device)
>     must be redirected Upstream towards the RC.
> 
> Or do you mean something else by "P2PDMA compatible"?
I am no longer making any generic claims about devices' P2PDMA
compatibility. Instead, as mentioned above, I am only focused on the
interactions between a PF (client) and a VF (provider), particularly with
Intel GPUs. 

More specifically, I am trying to address a use-case where the VF needs to
share a buffer with the PF but is unsuccessful because pci_p2pdma_distance_many(
provider, client, 1, true) fails (due to ACS redirect being set) although
the buffer is located within a BAR resource that the PF has provisioned
and has full access to it. Shouldn't this be allowed?

Thanks,
Vivek

> 
> Bjorn
Bjorn Helgaas Oct. 30, 2024, 6:46 p.m. UTC | #9
On Fri, Oct 25, 2024 at 06:57:37AM +0000, Kasireddy, Vivek wrote:
> > Subject: Re: [PATCH v2 1/5] PCI/P2PDMA: Don't enforce ACS check for
> > functions of same device
> > 
> > On Thu, Oct 24, 2024 at 05:58:48AM +0000, Kasireddy, Vivek wrote:
> > > > Subject: Re: [PATCH v2 1/5] PCI/P2PDMA: Don't enforce ACS check for
> > > > functions of same device
> > > >
> > > > On Sun, Oct 20, 2024 at 10:21:29PM -0700, Vivek Kasireddy wrote:
> > > > > Functions of the same PCI device (such as a PF and a VF) share the
> > > > > same bus and have a common root port and typically, the PF provisions
> > > > > resources for the VF. Therefore, they can be considered compatible
> > > > > as far as P2P access is considered.
> > 
> > I don't understand the "therefore they can be considered compatible"
> > conclusion.  The spec quote below seems like it addresses exactly this
> > situation: it says ACS can control peer-to-peer requests between VFs.
>
> I am only referring to the specific case where the PF is trying to
> access (P2P) a VF's resource that the PF itself has provisioned.
> Shouldn't this be considered a valid access?
> 
> > > ...
> > > > I'm not sure what you refer to by "PF provisions resources for the
> > > > VF".  Isn't it *always* the case that the architected PCI
> > > > resources (BARs) are configured by the PF?  It sounds like you're
> > > > referring to something Intel GPU-specific beyond that?
> > >
> > > What I meant to say is that since PF provisions the resources for
> > > the VF in a typical scenario,
> > 
> > Are you talking about BARs?  As far as I know, the PF BAR assignments
> > always (not just in typical scenarios) determine the VF BAR
> > assignments.
>
> Right, I am indeed talking about BARs.
> 
> > Or are you referring to some other non-BAR resources?
> > 
> > > they should be automatically P2PDMA compatible particularly when the
> > > provider is the VF and PF is the client. However, since this cannot
> > > be guaranteed on all the PCI devices out there for various reasons,
> > > my objective is to start including the ones that can be tested and
> > > are known to be compatible (Intel GPUs).
> > 
> > Regardless of BAR or other VF resources, I don't think VFs are
> > automatically P2PDMA compatible.
>
> I agree that VFs in general are not automatically P2PDMA compatible
> but a PF and a VF should be considered compatible particularly when the
> provider is a VF and PF is the client.
> 
> > For example, PCIe r6.0, sec 6.12.1.2  says:
> > 
> >   For ACS requirements, single-Function devices that are SR-IOV
> >   capable must be handled as if they were Multi-Function Devices.
> > 
> >   ...
> > 
> >   - ACS P2P Request Redirect: must be implemented by Functions that
> >     support peer-to-peer traffic with other Functions. This includes
> >     SR-IOV Virtual Functions (VFs).  ACS P2P Request Redirect is
> >     subject to interaction with the ACS P2P Egress Control and ACS
> >     Direct Translated P2P mechanisms (if implemented). Refer to
> >     Section 6.12.3 for more information.  When ACS P2P Request
> >     Redirect is enabled in a Multi-Function Device that is not an
> >     RCiEP, peer-to-peer Requests (between Functions of the device)
> >     must be redirected Upstream towards the RC.
> > 
> > Or do you mean something else by "P2PDMA compatible"?
>
> I am no longer making any generic claims about devices' P2PDMA
> compatibility. Instead, as mentioned above, I am only focused on the
> interactions between a PF (client) and a VF (provider), particularly
> with Intel GPUs. 
> 
> More specifically, I am trying to address a use-case where the VF
> needs to share a buffer with the PF but is unsuccessful because
> pci_p2pdma_distance_many(provider, client, 1, true) fails (due to
> ACS redirect being set) although the buffer is located within a BAR
> resource that the PF has provisioned and has full access to it.
> Shouldn't this be allowed?

IIUC you want the PF to be able to initiate a transaction on the PCIe
link to access a VF BAR.  The address in that TLP will be inside the
VF BAR (and also inside the space defined by the VF BAR<n> and the
NumVFs value in the PF's SR-IOV Capability).

In the PCIe world, I don't think a TLP can "loop back" to another
function on the same device.  I think it has to go upstream at least
to the Port above the originating function.  The Port works like a
PCI-PCI bridge.  The TLP address will be inside a Port memory window,
so in the absence of ACS, the Port would reflect the TLP back down the
same link it came from.  I'm pretty sure an analyzer on the link would
see two distinct TLPs.

But as far as I can tell, when ACS P2P Request Redirect is enabled,
the spec requires that the Port forward the TLP upstream (regardless
of the TLP address) instead of reflecting it back to the downstream
link.

Do you read the spec differently?

Bjorn
Logan Gunthorpe Oct. 30, 2024, 9:20 p.m. UTC | #10
On 2024-10-30 12:46, Bjorn Helgaas wrote:
> On Fri, Oct 25, 2024 at 06:57:37AM +0000, Kasireddy, Vivek wrote:
> In the PCIe world, I don't think a TLP can "loop back" to another
> function on the same device.

I'm not sure if the spec says anything that specifically denies this.
But it seems to me that it would be possible for a multifunction device
to handle a transfer to a neighbouring function internally and not
actually involve the PCIe fabric. This seems like something we'd want to
support if and when such a device were to be created.

Logan
Bjorn Helgaas Oct. 30, 2024, 10:07 p.m. UTC | #11
On Wed, Oct 30, 2024 at 03:20:02PM -0600, Logan Gunthorpe wrote:
> On 2024-10-30 12:46, Bjorn Helgaas wrote:
> > On Fri, Oct 25, 2024 at 06:57:37AM +0000, Kasireddy, Vivek wrote:
> > In the PCIe world, I don't think a TLP can "loop back" to another
> > function on the same device.
> 
> I'm not sure if the spec says anything that specifically denies this.

I'm not a hardware guy and I don't know if there's a direct statement
about it, but if a Downstream Port supports ACS, it must support ACS
P2P Request Redirect (PCIe r6.0, sec 6.12.1.1), which specifically
applies to peer-to-peer TLPs.

If peer-to-peer TLPs appear on the link, the Downstream Port will see
them and act on them, e.g., either route them upstream (if P2P Request
Redirect is enabled) or back downstream.  I don't think the VF could
act on them directly via a loopback path because that would lead to
duplicate writes and duplicate Completions for reads.

> But it seems to me that it would be possible for a multifunction device
> to handle a transfer to a neighbouring function internally and not
> actually involve the PCIe fabric. This seems like something we'd want to
> support if and when such a device were to be created.

If peer-to-peer transactions are handled internally, an SR-IOV device
other than an RCiEP is required to support ACS with P2P Egress Control
(sec 7.7.11) and P2P Request Redirect (sec 7.7.11.2).

Bjorn
Kasireddy, Vivek Oct. 31, 2024, 6:59 a.m. UTC | #12
Hi Bjorn,

> Subject: Re: [PATCH v2 1/5] PCI/P2PDMA: Don't enforce ACS check for
> functions of same device
> 
> On Wed, Oct 30, 2024 at 03:20:02PM -0600, Logan Gunthorpe wrote:
> > On 2024-10-30 12:46, Bjorn Helgaas wrote:
> > > On Fri, Oct 25, 2024 at 06:57:37AM +0000, Kasireddy, Vivek wrote:
> > > In the PCIe world, I don't think a TLP can "loop back" to another
> > > function on the same device.
> >
> > I'm not sure if the spec says anything that specifically denies this.
> 
> I'm not a hardware guy and I don't know if there's a direct statement
> about it, but if a Downstream Port supports ACS, it must support ACS
> P2P Request Redirect (PCIe r6.0, sec 6.12.1.1), which specifically
> applies to peer-to-peer TLPs.
> 
> If peer-to-peer TLPs appear on the link, the Downstream Port will see
> them and act on them, e.g., either route them upstream (if P2P Request
> Redirect is enabled) or back downstream.  I don't think the VF could
> act on them directly via a loopback path because that would lead to
> duplicate writes and duplicate Completions for reads.
> 
> > But it seems to me that it would be possible for a multifunction device
> > to handle a transfer to a neighbouring function internally and not
> > actually involve the PCIe fabric. This seems like something we'd want to
> > support if and when such a device were to be created.
> 
> If peer-to-peer transactions are handled internally, an SR-IOV device
> other than an RCiEP is required to support ACS with P2P Egress Control
> (sec 7.7.11) and P2P Request Redirect (sec 7.7.11.2).
As Logan suggests, my use-case does not involve using the PCIe fabric to
accomplish the DMA access between PF and VF. Instead, the PF's (GPU) driver
handles VF's BAR addresses (associated with the buffer) by translating them
into a different (internal and local) address space before facilitating access.

To articulate further, in my use-case, there is a driver A (that is bound to VF)
that needs to share a buffer with driver B (associated with the PF). However,
driver A would like to know if B can access its buffer or not, so it calls
pci_p2pdma_distance(A, B, ...) to check, as both PF and VF are PCI devices.
But given that pci_p2pdma_distance_many() uses ACS as the main criterion
(in addition to the bridge whitelist and a specific CPU type), it determines
that the access is invalid (as ACS redirect is set).

IIUC, it appears that pci_p2pdma_distance_many() is not the right tool to
use to check for access validity in my use-case, as it assumes the provider
and client would always use PCIe fabric for DMA. I think it either needs
to be augmented to handle various situations or a new helper is needed.
What is your recommended solution for this issue?

Thanks,
Vivek

> 
> Bjorn
diff mbox series

Patch

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 4f47a13cb500..a230e661f939 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -535,6 +535,17 @@  static unsigned long map_types_idx(struct pci_dev *client)
 	return (pci_domain_nr(client->bus) << 16) | pci_dev_id(client);
 }
 
+static bool pci_devs_are_p2pdma_compatible(struct pci_dev *provider,
+					   struct pci_dev *client)
+{
+	if (provider->vendor == PCI_VENDOR_ID_INTEL) {
+		if (pci_is_vga(provider) && pci_is_vga(client))
+			return pci_physfn(provider) == pci_physfn(client);
+	}
+
+	return false;
+}
+
 /*
  * Calculate the P2PDMA mapping type and distance between two PCI devices.
  *
@@ -634,7 +645,7 @@  calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
 
 	*dist = dist_a + dist_b;
 
-	if (!acs_cnt) {
+	if (!acs_cnt || pci_devs_are_p2pdma_compatible(provider, client)) {
 		map_type = PCI_P2PDMA_MAP_BUS_ADDR;
 		goto done;
 	}
@@ -696,7 +707,9 @@  int pci_p2pdma_distance_many(struct pci_dev *provider, struct device **clients,
 		return -1;
 
 	for (i = 0; i < num_clients; i++) {
-		pci_client = find_parent_pci_dev(clients[i]);
+		pci_client = dev_is_pf(clients[i]) ?
+				pci_dev_get(to_pci_dev(clients[i])) :
+				find_parent_pci_dev(clients[i]);
 		if (!pci_client) {
 			if (verbose)
 				dev_warn(clients[i],