Message ID | 1481738472-2671-1-git-send-email-gpiccoli@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
On Wed, Dec 14, 2016 at 04:01:12PM -0200, Guilherme G. Piccoli wrote:
>Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
>infrastructure") introduced a better IRQ spreading mechanism, taking
>account of the available NUMA nodes in the machine.
>
>Problem is that the algorithm of retrieving the nodemask iterates
>"linearly" based on the number of online nodes - some architectures
>present non-linear node distribution among the nodemask, like PowerPC.
>If this is the case, the algorithm lead to a wrong node count number
>and therefore to a bad/incomplete IRQ affinity distribution.
>
>For example, this problem were found in a machine with 128 CPUs and two
>nodes, namely nodes 0 and 8 (instead of 0 and 1, if it was linearly
>distributed). This led to a wrong affinity distribution which then led to
>a bad mq allocation for nvme driver.
>
>Finally, we take the opportunity to fix a comment regarding the affinity
>distribution when we have _more_ nodes than vectors.
>
>Fixes: 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading infrastructure")
>Reported-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
>Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
>Cc: stable@vger.kernel.org # v4.9+
>Cc: Christoph Hellwig <hch@lst.de>
>Cc: linuxppc-dev@lists.ozlabs.org
>Cc: linux-pci@vger.kernel.org
>---

Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

There is one picky comment as below, but you don't have to fix it :)

> kernel/irq/affinity.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
>index 9be9bda..464eaf0 100644
>--- a/kernel/irq/affinity.c
>+++ b/kernel/irq/affinity.c
>@@ -37,15 +37,15 @@ static void irq_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
>
> static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
> {
>-	int n, nodes;
>+	int n, nodes = 0;
>
> 	/* Calculate the number of nodes in the supplied affinity mask */
>-	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
>+	for_each_online_node(n)
> 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
> 			node_set(n, *nodemsk);
> 			nodes++;
> 		}
>-	}
>+

It'd be better to keep the brackets so that we needn't add them when adding
more code into the block next time.

> 	return nodes;
> }
>
>@@ -82,7 +82,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
> 	nodes = get_nodes_in_cpumask(cpu_online_mask, &nodemsk);
>
> 	/*
>-	 * If the number of nodes in the mask is less than or equal the
>+	 * If the number of nodes in the mask is greater than or equal the
> 	 * number of vectors we just spread the vectors across the nodes.
> 	 */
> 	if (affv <= nodes) {

Thanks,
Gavin
"Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes: > Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading > infrastructure") introduced a better IRQ spreading mechanism, taking > account of the available NUMA nodes in the machine. > > Problem is that the algorithm of retrieving the nodemask iterates > "linearly" based on the number of online nodes - some architectures > present non-linear node distribution among the nodemask, like PowerPC. > If this is the case, the algorithm lead to a wrong node count number > and therefore to a bad/incomplete IRQ affinity distribution. > > For example, this problem were found in a machine with 128 CPUs and two > nodes, namely nodes 0 and 8 (instead of 0 and 1, if it was linearly > distributed). This led to a wrong affinity distribution which then led to > a bad mq allocation for nvme driver. > > Finally, we take the opportunity to fix a comment regarding the affinity > distribution when we have _more_ nodes than vectors. Thanks for taking care of this so quickly, Guilherme. Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Looks fine:
Reviewed-by: Christoph Hellwig <hch@lst.de>
(but I agree with the bracing nitpick from Gavin)
On Thu, 15 Dec 2016, Gavin Shan wrote:
> > static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
> > {
> >-	int n, nodes;
> >+	int n, nodes = 0;
> >
> > 	/* Calculate the number of nodes in the supplied affinity mask */
> >-	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
> >+	for_each_online_node(n)
> > 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
> > 			node_set(n, *nodemsk);
> > 			nodes++;
> > 		}
> >-	}
> >+
>
> It'd be better to keep the brackets so that we needn't add them when adding
> more code into the block next time.

Removing the brackets is outright wrong. See:

  https://marc.info/?l=linux-kernel&m=147351236615103

I'll fix that up when applying the patch.

Thanks,

	tglx
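For reference, a minimal sketch of the loop with the brackets kept. This is presumably close to the form applied by the maintainer (the exact committed version is not shown in this thread); kernel style keeps the braces because the loop body, although a single statement, spans multiple lines:

```c
static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
{
	int n, nodes = 0;

	/* Calculate the number of nodes in the supplied affinity mask */
	for_each_online_node(n) {
		/*
		 * The outer braces stay: the body is one if-statement, but
		 * it spans several lines, and dropping the braces would let
		 * a later one-line addition silently fall outside the loop.
		 */
		if (cpumask_intersects(mask, cpumask_of_node(n))) {
			node_set(n, *nodemsk);
			nodes++;
		}
	}
	return nodes;
}
```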
On 15/12/16 05:01, Guilherme G. Piccoli wrote:
> Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
> infrastructure") introduced a better IRQ spreading mechanism, taking
> account of the available NUMA nodes in the machine.
>
> Problem is that the algorithm of retrieving the nodemask iterates
> "linearly" based on the number of online nodes - some architectures
> present non-linear node distribution among the nodemask, like PowerPC.
> If this is the case, the algorithm lead to a wrong node count number
> and therefore to a bad/incomplete IRQ affinity distribution.
>
> For example, this problem were found in a machine with 128 CPUs and two
> nodes, namely nodes 0 and 8 (instead of 0 and 1, if it was linearly
> distributed). This led to a wrong affinity distribution which then led to
> a bad mq allocation for nvme driver.
>
> Finally, we take the opportunity to fix a comment regarding the affinity
> distribution when we have _more_ nodes than vectors.

Very good catch!

Acked-by: Balbir Singh <bsingharora@gmail.com>
On 12/15/2016 07:36 AM, Thomas Gleixner wrote:
> On Thu, 15 Dec 2016, Gavin Shan wrote:
>>> static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
>>> {
>>> -	int n, nodes;
>>> +	int n, nodes = 0;
>>>
>>> 	/* Calculate the number of nodes in the supplied affinity mask */
>>> -	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
>>> +	for_each_online_node(n)
>>> 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
>>> 			node_set(n, *nodemsk);
>>> 			nodes++;
>>> 		}
>>> -	}
>>> +
>>
>> It'd be better to keep the brackets so that we needn't add them when adding
>> more code into the block next time.
>
> Removing the brackets is outright wrong. See:
> https://marc.info/?l=linux-kernel&m=147351236615103
>
> I'll fix that up when applying the patch.
>
> Thanks,
>
> tglx

Thank you all very much for the reviews and comments - lesson learned
about the brackets in multi-line if/for statements! Thanks for fixing
it, Thomas.

Cheers,

Guilherme
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 9be9bda..464eaf0 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -37,15 +37,15 @@ static void irq_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
 
 static int get_nodes_in_cpumask(const struct cpumask *mask, nodemask_t *nodemsk)
 {
-	int n, nodes;
+	int n, nodes = 0;
 
 	/* Calculate the number of nodes in the supplied affinity mask */
-	for (n = 0, nodes = 0; n < num_online_nodes(); n++) {
+	for_each_online_node(n)
 		if (cpumask_intersects(mask, cpumask_of_node(n))) {
 			node_set(n, *nodemsk);
 			nodes++;
 		}
-	}
+
 	return nodes;
 }
 
@@ -82,7 +82,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	nodes = get_nodes_in_cpumask(cpu_online_mask, &nodemsk);
 
 	/*
-	 * If the number of nodes in the mask is less than or equal the
+	 * If the number of nodes in the mask is greater than or equal the
 	 * number of vectors we just spread the vectors across the nodes.
 	 */
 	if (affv <= nodes) {
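To make the failure mode concrete, here is a small self-contained userspace sketch (hypothetical stand-ins for the kernel's node and cpumask helpers, not kernel code) modeling the reported machine, where the two online nodes have the non-linear IDs 0 and 8:

```c
#include <stdio.h>

/* Hypothetical model of the reported PowerPC box: two online nodes
 * with sparse IDs 0 and 8. */
static const int online_node_ids[] = { 0, 8 };
#define NUM_ONLINE_NODES 2

/* Stand-in for cpumask_intersects(mask, cpumask_of_node(n)):
 * only nodes 0 and 8 actually have CPUs. */
static int node_has_cpus(int n)
{
	return n == 0 || n == 8;
}

int main(void)
{
	int n, i, nodes;

	/* Buggy loop: visits node IDs 0 .. num_online_nodes()-1, i.e.
	 * only 0 and 1, so node 8 is never examined. */
	nodes = 0;
	for (n = 0; n < NUM_ONLINE_NODES; n++)
		if (node_has_cpus(n))
			nodes++;
	printf("linear iteration: %d node(s) found\n", nodes);      /* prints 1 */

	/* Fixed loop: walk the actual online node IDs, as
	 * for_each_online_node() does, so both nodes are counted. */
	nodes = 0;
	for (i = 0; i < NUM_ONLINE_NODES; i++)
		if (node_has_cpus(online_node_ids[i]))
			nodes++;
	printf("online-node iteration: %d node(s) found\n", nodes); /* prints 2 */

	return 0;
}
```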
Commit 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading
infrastructure") introduced a better IRQ spreading mechanism, taking
account of the available NUMA nodes in the machine.

The problem is that the algorithm for retrieving the nodemask iterates
"linearly" based on the number of online nodes - some architectures
present a non-linear node distribution among the nodemask, like PowerPC.
In that case, the algorithm leads to a wrong node count and therefore
to a bad/incomplete IRQ affinity distribution.

For example, this problem was found on a machine with 128 CPUs and two
nodes, namely nodes 0 and 8 (instead of 0 and 1, as they would be if
linearly distributed). This led to a wrong affinity distribution which
then led to a bad mq allocation for the nvme driver.

Finally, we take the opportunity to fix a comment regarding the affinity
distribution when we have _more_ nodes than vectors.

Fixes: 34c3d9819fda ("genirq/affinity: Provide smarter irq spreading infrastructure")
Reported-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.9+
Cc: Christoph Hellwig <hch@lst.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-pci@vger.kernel.org
---
 kernel/irq/affinity.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
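As an illustration of what the corrected comment describes: `affv <= nodes` is the "at least as many nodes as vectors" case, in which each vector can simply be bound to one node's CPUs. A simplified, hypothetical model (not the kernel's actual spreading code):

```c
#include <stdio.h>

int main(void)
{
	/* Hypothetical numbers for the reported machine: two vectors to
	 * spread (affv) and two online nodes with sparse IDs 0 and 8. */
	int affv = 2;
	int node_ids[] = { 0, 8 };
	int nodes = 2;
	int v;

	/*
	 * nodes >= vectors, i.e. affv <= nodes: just spread the vectors
	 * across the nodes, one node per vector -- hence the comment must
	 * read "greater than or equal", not "less than or equal".
	 */
	if (affv <= nodes)
		for (v = 0; v < affv; v++)
			printf("vector %d -> CPUs of node %d\n", v, node_ids[v]);

	return 0;
}
```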