diff mbox

[v8,20/45] powerpc/powernv: Allocate PE# in reverse order

Message ID 1455680668-23298-21-git-send-email-gwshan@linux.vnet.ibm.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Gavin Shan Feb. 17, 2016, 3:44 a.m. UTC
PE number for one particular PE can be allocated dynamically or
reserved according to the consumed M64 (64-bits prefetchable)
segments of the PE. The M64 resources, and hence their segments
and PE number are assigned/reserved in ascending order. The PE
numbers are allocated dynamically in ascending order as well.
It's not a problem as the PE numbers are reserved and then
allocated all at once in fine order. However, it will introduce
conflicts when PCI hotplug is supported: the PE number to be
reserved for newly added PE might have been assigned.

To resolve above conflicts, this forces the PE number to be
allocated dynamically in reverse order. With this patch applied,
the PE numbers are reserved in ascending order, but allocated
dynamically in reverse order.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

Comments

Alexey Kardashevskiy April 19, 2016, 3:07 a.m. UTC | #1
On 02/17/2016 02:44 PM, Gavin Shan wrote:
> PE number for one particular PE can be allocated dynamically or
> reserved according to the consumed M64 (64-bits prefetchable)
> segments of the PE. The M64 resources, and hence their segments
> and PE number are assigned/reserved in ascending order. The PE
> numbers are allocated dynamically in ascending order as well.
> It's not a problem as the PE numbers are reserved and then
> allocated all at once in fine order. However, it will introduce
> conflicts when PCI hotplug is supported: the PE number to be
> reserved for newly added PE might have been assigned.
>
> To resolve above conflicts, this forces the PE number to be
> allocated dynamically in reverse order. With this patch applied,
> the PE numbers are reserved in ascending order, but allocated
> dynamically in reverse order.


The patch is probably is ok, the commit log is not - I do not follow it. 
Some PEs are reserved (for what? why does the absolute PE number matter? 
put it in the commit log), that means that the corresponding bits in 
pe_alloc[] should be set so when you will be allocating PEs for a just 
plugged device, you won't pick them and you will pick free ones, and the 
order should not matter. I would think that "reservation" happens once at 
the boot time so you set "used" bits for the reserved PEs then and after 
that the dynamic allocator will skip them.


>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
>   arch/powerpc/platforms/powernv/pci-ioda.c | 14 ++++++--------
>   1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index f182ca7..565725b 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -144,16 +144,14 @@ static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
>
>   static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
>   {
> -	unsigned long pe;
> +	unsigned long pe = phb->ioda.total_pe_num - 1;
>
> -	do {
> -		pe = find_next_zero_bit(phb->ioda.pe_alloc,
> -					phb->ioda.total_pe_num, 0);
> -		if (pe >= phb->ioda.total_pe_num)
> -			return NULL;
> -	} while(test_and_set_bit(pe, phb->ioda.pe_alloc));
> +	for (pe = phb->ioda.total_pe_num - 1; pe >= 0; pe--) {
> +		if (!test_and_set_bit(pe, phb->ioda.pe_alloc))
> +			return pnv_ioda_init_pe(phb, pe);
> +	}
>
> -	return pnv_ioda_init_pe(phb, pe);
> +	return NULL;
>   }
>
>   static void pnv_ioda_free_pe(struct pnv_ioda_pe *pe)
>
Gavin Shan April 20, 2016, 1:04 a.m. UTC | #2
On Tue, Apr 19, 2016 at 01:07:59PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>PE number for one particular PE can be allocated dynamically or
>>reserved according to the consumed M64 (64-bits prefetchable)
>>segments of the PE. The M64 resources, and hence their segments
>>and PE number are assigned/reserved in ascending order. The PE
>>numbers are allocated dynamically in ascending order as well.
>>It's not a problem as the PE numbers are reserved and then
>>allocated all at once in fine order. However, it will introduce
>>conflicts when PCI hotplug is supported: the PE number to be
>>reserved for newly added PE might have been assigned.
>>
>>To resolve above conflicts, this forces the PE number to be
>>allocated dynamically in reverse order. With this patch applied,
>>the PE numbers are reserved in ascending order, but allocated
>>dynamically in reverse order.
>
>
>The patch is probably is ok, the commit log is not - I do not follow it. Some
>PEs are reserved (for what? why does the absolute PE number matter? put it in
>the commit log), that means that the corresponding bits in pe_alloc[] should
>be set so when you will be allocating PEs for a just plugged device, you
>won't pick them and you will pick free ones, and the order should not matter.
>I would think that "reservation" happens once at the boot time so you set
>"used" bits for the reserved PEs then and after that the dynamic allocator
>will skip them.
>

I will enhance the commit log in next revision, perhaps just pick part of
below words: On PHB3, there are 16 M64 BARs in hardware. The last one is
split ovenly into 256 segments. Each segment can be associated/assigned
to fixed PE# (segment#x <-> PE#x) which is how the hardware was designed.
If one plugged PE has M64 (64-bits prefetchable memory) resources, its
PE# is equal to the segment#. Otherwise, the PE# is allocated dynamically
if the PE doesn't contain M64 resource.

The M64 resources are assigned from low to high end, meaning the reserved
PE# (according to the M64 segments) are grown from low to high end. It's
most likely to get a dynamically allocated PE# which should be reserved
because of M64 segment. It's the conflicts the patch tries to resolve.

The PE# reservation doesn't happen once at boot time because it's
unknow how many PEs and how much M64 resources will be hot added.

>
>>
>>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>---
>>  arch/powerpc/platforms/powernv/pci-ioda.c | 14 ++++++--------
>>  1 file changed, 6 insertions(+), 8 deletions(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>index f182ca7..565725b 100644
>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>@@ -144,16 +144,14 @@ static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
>>
>>  static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
>>  {
>>-	unsigned long pe;
>>+	unsigned long pe = phb->ioda.total_pe_num - 1;
>>
>>-	do {
>>-		pe = find_next_zero_bit(phb->ioda.pe_alloc,
>>-					phb->ioda.total_pe_num, 0);
>>-		if (pe >= phb->ioda.total_pe_num)
>>-			return NULL;
>>-	} while(test_and_set_bit(pe, phb->ioda.pe_alloc));
>>+	for (pe = phb->ioda.total_pe_num - 1; pe >= 0; pe--) {
>>+		if (!test_and_set_bit(pe, phb->ioda.pe_alloc))
>>+			return pnv_ioda_init_pe(phb, pe);
>>+	}
>>
>>-	return pnv_ioda_init_pe(phb, pe);
>>+	return NULL;
>>  }
>>
>>  static void pnv_ioda_free_pe(struct pnv_ioda_pe *pe)
>>
>
>
>-- 
>Alexey
>

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index f182ca7..565725b 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -144,16 +144,14 @@  static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
 
 static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
 {
-	unsigned long pe;
+	unsigned long pe = phb->ioda.total_pe_num - 1;
 
-	do {
-		pe = find_next_zero_bit(phb->ioda.pe_alloc,
-					phb->ioda.total_pe_num, 0);
-		if (pe >= phb->ioda.total_pe_num)
-			return NULL;
-	} while(test_and_set_bit(pe, phb->ioda.pe_alloc));
+	for (pe = phb->ioda.total_pe_num - 1; pe >= 0; pe--) {
+		if (!test_and_set_bit(pe, phb->ioda.pe_alloc))
+			return pnv_ioda_init_pe(phb, pe);
+	}
 
-	return pnv_ioda_init_pe(phb, pe);
+	return NULL;
 }
 
 static void pnv_ioda_free_pe(struct pnv_ioda_pe *pe)