diff mbox

[V11,07/17] powerpc/pci: Don't unset pci resources for VFs

Message ID 1421288887-7765-8-git-send-email-weiyang@linux.vnet.ibm.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Wei Yang Jan. 15, 2015, 2:27 a.m. UTC
If we're going to reassign resources with flag PCI_REASSIGN_ALL_RSRC, all
resources will be cleaned out during device header fixup time and then get
reassigned by PCI core. However, the VF resources won't be reassigned and
thus, we shouldn't clean them out.

This patch adds a condition. If the pci_dev is a VF, skip the resource
unset process.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/pci-common.c |    4 ++++
 1 file changed, 4 insertions(+)

Comments

Benjamin Herrenschmidt Feb. 10, 2015, 12:36 a.m. UTC | #1
On Thu, 2015-01-15 at 10:27 +0800, Wei Yang wrote:
> If we're going to reassign resources with flag PCI_REASSIGN_ALL_RSRC, all
> resources will be cleaned out during device header fixup time and then get
> reassigned by PCI core. However, the VF resources won't be reassigned and
> thus, we shouldn't clean them out.
> 
> This patch adds a condition. If the pci_dev is a VF, skip the resource
> unset process.

I don't understand this, can you elaborate ? Why wouldn't we reassign
the IOV resource just like everything else ?

Ben.

> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/pci-common.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> index 37d512d..889f743 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -788,6 +788,10 @@ static void pcibios_fixup_resources(struct pci_dev *dev)
>  		       pci_name(dev));
>  		return;
>  	}
> +
> +	if (dev->is_virtfn)
> +		return;
> +
>  	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
>  		struct resource *res = dev->resource + i;
>  		struct pci_bus_region reg;


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wei Yang Feb. 10, 2015, 1:51 a.m. UTC | #2
On Tue, Feb 10, 2015 at 11:36:24AM +1100, Benjamin Herrenschmidt wrote:
>On Thu, 2015-01-15 at 10:27 +0800, Wei Yang wrote:
>> If we're going to reassign resources with flag PCI_REASSIGN_ALL_RSRC, all
>> resources will be cleaned out during device header fixup time and then get
>> reassigned by PCI core. However, the VF resources won't be reassigned and
>> thus, we shouldn't clean them out.
>> 
>> This patch adds a condition. If the pci_dev is a VF, skip the resource
>> unset process.
>
>I don't understand this, can you elaborate ? Why wouldn't we reassign
>the IOV resource just like everything else ?

Sure.

VFs work a little bit different from normal devices. On powernv platform, we
have PCI_REASSIGN_ALL_RSRC set, which means all resource retrieved from
hardware will be cleaned and re-assigned by kernel. While VF's resources are
calculated from PF's IOV BAR, in virtfn_add(). And after this, there is not
re-assign process for VFs.

>
>Ben.
>
>> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/kernel/pci-common.c |    4 ++++
>>  1 file changed, 4 insertions(+)
>> 
>> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
>> index 37d512d..889f743 100644
>> --- a/arch/powerpc/kernel/pci-common.c
>> +++ b/arch/powerpc/kernel/pci-common.c
>> @@ -788,6 +788,10 @@ static void pcibios_fixup_resources(struct pci_dev *dev)
>>  		       pci_name(dev));
>>  		return;
>>  	}
>> +
>> +	if (dev->is_virtfn)
>> +		return;
>> +
>>  	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
>>  		struct resource *res = dev->resource + i;
>>  		struct pci_bus_region reg;
>
Benjamin Herrenschmidt Feb. 10, 2015, 2:14 a.m. UTC | #3
On Tue, 2015-02-10 at 09:51 +0800, Wei Yang wrote:
> On Tue, Feb 10, 2015 at 11:36:24AM +1100, Benjamin Herrenschmidt wrote:
> >On Thu, 2015-01-15 at 10:27 +0800, Wei Yang wrote:
> >> If we're going to reassign resources with flag PCI_REASSIGN_ALL_RSRC, all
> >> resources will be cleaned out during device header fixup time and then get
> >> reassigned by PCI core. However, the VF resources won't be reassigned and
> >> thus, we shouldn't clean them out.
> >> 
> >> This patch adds a condition. If the pci_dev is a VF, skip the resource
> >> unset process.
> >
> >I don't understand this, can you elaborate ? Why wouldn't we reassign
> >the IOV resource just like everything else ?
> 
> Sure.
> 
> VFs work a little bit different from normal devices. On powernv platform, we
> have PCI_REASSIGN_ALL_RSRC set, which means all resource retrieved from
> hardware will be cleaned and re-assigned by kernel. While VF's resources are
> calculated from PF's IOV BAR, in virtfn_add(). And after this, there is not
> re-assign process for VFs.

I still don't undertand, you mean SR-IOV is assigned before we assign
everybody else ? That doesn't make sense to me...

Ben.

> >
> >Ben.
> >
> >> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
> >> ---
> >>  arch/powerpc/kernel/pci-common.c |    4 ++++
> >>  1 file changed, 4 insertions(+)
> >> 
> >> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> >> index 37d512d..889f743 100644
> >> --- a/arch/powerpc/kernel/pci-common.c
> >> +++ b/arch/powerpc/kernel/pci-common.c
> >> @@ -788,6 +788,10 @@ static void pcibios_fixup_resources(struct pci_dev *dev)
> >>  		       pci_name(dev));
> >>  		return;
> >>  	}
> >> +
> >> +	if (dev->is_virtfn)
> >> +		return;
> >> +
> >>  	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
> >>  		struct resource *res = dev->resource + i;
> >>  		struct pci_bus_region reg;
> >
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wei Yang Feb. 10, 2015, 6:25 a.m. UTC | #4
On Tue, Feb 10, 2015 at 01:14:11PM +1100, Benjamin Herrenschmidt wrote:
>On Tue, 2015-02-10 at 09:51 +0800, Wei Yang wrote:
>> On Tue, Feb 10, 2015 at 11:36:24AM +1100, Benjamin Herrenschmidt wrote:
>> >On Thu, 2015-01-15 at 10:27 +0800, Wei Yang wrote:
>> >> If we're going to reassign resources with flag PCI_REASSIGN_ALL_RSRC, all
>> >> resources will be cleaned out during device header fixup time and then get
>> >> reassigned by PCI core. However, the VF resources won't be reassigned and
>> >> thus, we shouldn't clean them out.
>> >> 
>> >> This patch adds a condition. If the pci_dev is a VF, skip the resource
>> >> unset process.
>> >
>> >I don't understand this, can you elaborate ? Why wouldn't we reassign
>> >the IOV resource just like everything else ?
>> 
>> Sure.
>> 
>> VFs work a little bit different from normal devices. On powernv platform, we
>> have PCI_REASSIGN_ALL_RSRC set, which means all resource retrieved from
>> hardware will be cleaned and re-assigned by kernel. While VF's resources are
>> calculated from PF's IOV BAR, in virtfn_add(). And after this, there is not
>> re-assign process for VFs.
>
>I still don't undertand, you mean SR-IOV is assigned before we assign
>everybody else ? That doesn't make sense to me...
>

PF's resource will be assigned first, including normal BARs and IOV BARs.

Then PF's driver will create VFs, in virtfn_add(). In this function, VF's
resources is calculated from its PF's IOV BAR.

If you reset VF's resource as PFs, no one will try to assign it again.

>Ben.
>
>> >
>> >Ben.
>> >
>> >> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
>> >> ---
>> >>  arch/powerpc/kernel/pci-common.c |    4 ++++
>> >>  1 file changed, 4 insertions(+)
>> >> 
>> >> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
>> >> index 37d512d..889f743 100644
>> >> --- a/arch/powerpc/kernel/pci-common.c
>> >> +++ b/arch/powerpc/kernel/pci-common.c
>> >> @@ -788,6 +788,10 @@ static void pcibios_fixup_resources(struct pci_dev *dev)
>> >>  		       pci_name(dev));
>> >>  		return;
>> >>  	}
>> >> +
>> >> +	if (dev->is_virtfn)
>> >> +		return;
>> >> +
>> >>  	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
>> >>  		struct resource *res = dev->resource + i;
>> >>  		struct pci_bus_region reg;
>> >
>> 
>
Benjamin Herrenschmidt Feb. 10, 2015, 8:14 a.m. UTC | #5
On Tue, 2015-02-10 at 14:25 +0800, Wei Yang wrote:
> PF's resource will be assigned first, including normal BARs and IOV
> BARs.
> 
> Then PF's driver will create VFs, in virtfn_add(). In this function,
> VF's
> resources is calculated from its PF's IOV BAR.
> 
> If you reset VF's resource as PFs, no one will try to assign it again.

So the problem is that the flag indicating VF is lost ? IE. We should
still mark them unset, but preserve that flag ?

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Feb. 20, 2015, 11:47 p.m. UTC | #6
On Tue, Feb 10, 2015 at 07:14:45PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2015-02-10 at 14:25 +0800, Wei Yang wrote:
> > PF's resource will be assigned first, including normal BARs and IOV
> > BARs.
> > 
> > Then PF's driver will create VFs, in virtfn_add(). In this function,
> > VF's
> > resources is calculated from its PF's IOV BAR.
> > 
> > If you reset VF's resource as PFs, no one will try to assign it again.
> 
> So the problem is that the flag indicating VF is lost ? IE. We should
> still mark them unset, but preserve that flag ?

I think the problem is that the normal path for PCI_REASSIGN_ALL_RSRC is
at boot-time, where we do this:

    pcibios_init
      pcibios_scan_phb
        pci_scan_child_bus
          ...
            pci_device_add
              pci_fixup_device(pci_fixup_header)
		pcibios_fixup_resources                       # header fixup
		  for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
		    dev->resource[i].start = 0
      pcibios_resource_survey
        pcibios_allocate_resources

and we assign dev->resource[] for everything in
pcibios_allocate_resources().

But VFs are enumerated later, when they are enabled by the PF driver after
boot, so we have this path:

    pci_enable_sriov
      sriov_enable
        virtfn_add(vf_id)
          for (i = 0; i < 6; i++)
            vf->resource[i].start = pf->resource[IOV + i].start + (size * vf_id)
          pci_device_add
            pci_fixup_device(pci_fixup_header)
              pcibios_fixup_resources                   # header fixup
                for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
                  vf->resource[i].start = 0

Here, we clear out vf->resource[0..5] in the header fixup, but we're not
going to call pcibios_allocate_resources() again to reassign them.

So I think the *intent* of PCI_REASSIGN_ALL_RSRC is preserved if
pcibios_fixup_resources() leaves the VF resources alone, because the VF
resources are completely determined by the PF resources, and the PF
resources have already been reassigned.

If my understanding is correct, I think the patch is reasonable, and I
would try to put some of this explanation into the changelog.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wei Yang March 2, 2015, 6:09 a.m. UTC | #7
On Fri, Feb 20, 2015 at 05:47:09PM -0600, Bjorn Helgaas wrote:
>On Tue, Feb 10, 2015 at 07:14:45PM +1100, Benjamin Herrenschmidt wrote:
>> On Tue, 2015-02-10 at 14:25 +0800, Wei Yang wrote:
>> > PF's resource will be assigned first, including normal BARs and IOV
>> > BARs.
>> > 
>> > Then PF's driver will create VFs, in virtfn_add(). In this function,
>> > VF's
>> > resources is calculated from its PF's IOV BAR.
>> > 
>> > If you reset VF's resource as PFs, no one will try to assign it again.
>> 
>> So the problem is that the flag indicating VF is lost ? IE. We should
>> still mark them unset, but preserve that flag ?
>
>I think the problem is that the normal path for PCI_REASSIGN_ALL_RSRC is
>at boot-time, where we do this:
>
>    pcibios_init
>      pcibios_scan_phb
>        pci_scan_child_bus
>          ...
>            pci_device_add
>              pci_fixup_device(pci_fixup_header)
>		pcibios_fixup_resources                       # header fixup
>		  for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
>		    dev->resource[i].start = 0
>      pcibios_resource_survey
>        pcibios_allocate_resources
>
>and we assign dev->resource[] for everything in
>pcibios_allocate_resources().
>
>But VFs are enumerated later, when they are enabled by the PF driver after
>boot, so we have this path:
>
>    pci_enable_sriov
>      sriov_enable
>        virtfn_add(vf_id)
>          for (i = 0; i < 6; i++)
>            vf->resource[i].start = pf->resource[IOV + i].start + (size * vf_id)
>          pci_device_add
>            pci_fixup_device(pci_fixup_header)
>              pcibios_fixup_resources                   # header fixup
>                for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
>                  vf->resource[i].start = 0
>
>Here, we clear out vf->resource[0..5] in the header fixup, but we're not
>going to call pcibios_allocate_resources() again to reassign them.
>
>So I think the *intent* of PCI_REASSIGN_ALL_RSRC is preserved if
>pcibios_fixup_resources() leaves the VF resources alone, because the VF
>resources are completely determined by the PF resources, and the PF
>resources have already been reassigned.
>
>If my understanding is correct, I think the patch is reasonable, and I
>would try to put some of this explanation into the changelog.

Yep, it is correct, thanks for your explanation.

I did a chat on IRC with Ben, I guess he has got the idea :-)

>
>Bjorn
diff mbox

Patch

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 37d512d..889f743 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -788,6 +788,10 @@  static void pcibios_fixup_resources(struct pci_dev *dev)
 		       pci_name(dev));
 		return;
 	}
+
+	if (dev->is_virtfn)
+		return;
+
 	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
 		struct resource *res = dev->resource + i;
 		struct pci_bus_region reg;