Message ID | 20220923155542.1212814-5-vschneid@redhat.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | sched, net: NUMA-aware CPU spreading interface | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
On Fri, Sep 23, 2022 at 04:55:41PM +0100, Valentin Schneider wrote: > The recently introduced sched_numa_hop_mask() exposes cpumasks of CPUs > reachable within a given distance budget, but this means each successive > cpumask is a superset of the previous one. > > Code wanting to allocate one item per CPU (e.g. IRQs) at increasing > distances would thus need to allocate a temporary cpumask to note which > CPUs have already been visited. This can be prevented by leveraging > for_each_cpu_andnot() - package all that logic into one ugl^D fancy macro. > > Signed-off-by: Valentin Schneider <vschneid@redhat.com> > --- > include/linux/topology.h | 37 +++++++++++++++++++++++++++++++++++++ > 1 file changed, 37 insertions(+) > > diff --git a/include/linux/topology.h b/include/linux/topology.h > index 3e91ae6d0ad5..7aa7e6a4c739 100644 > --- a/include/linux/topology.h > +++ b/include/linux/topology.h > @@ -257,5 +257,42 @@ static inline const struct cpumask *sched_numa_hop_mask(int node, int hops) > } > #endif /* CONFIG_NUMA */ > > +/** > + * for_each_numa_hop_cpu - iterate over CPUs by increasing NUMA distance, > + * starting from a given node. > + * @cpu: the iteration variable. > + * @node: the NUMA node to start the search from. > + * > + * Requires rcu_lock to be held. > + * Careful: this is a double loop, 'break' won't work as expected. This warning concerns me not only because new iteration loop hides complexity and breaks 'break' (sic!), but also because it looks too specific. Why don't you split it, so instead: for_each_numa_hop_cpu(cpu, dev->priv.numa_node) { cpus[i] = cpu; if (++i == ncomp_eqs) goto spread_done; } in the following patch you would have something like this: for_each_node_hop(hop, node) { struct cpumask hop_cpus = sched_numa_hop_mask(node, hop); for_each_cpu_andnot(cpu, hop_cpus, ...) { cpus[i] = cpu; if (++i == ncomp_eqs) goto spread_done; } } It looks more bulky, but I believe there will be more users for for_each_node_hop() alone. On top of that, if you really like it, you can implement for_each_numa_hop_cpu() if you want. > + * Implementation notes: > + * > + * Providing it is valid, the mask returned by > + * sched_numa_hop_mask(node, hops+1) > + * is a superset of the one returned by > + * sched_numa_hop_mask(node, hops) > + * which may not be that useful for drivers that try to spread things out and > + * want to visit a CPU not more than once. > + * > + * To accommodate for that, we use for_each_cpu_andnot() to iterate over the cpus > + * of sched_numa_hop_mask(node, hops+1) with the CPUs of > + * sched_numa_hop_mask(node, hops) removed, IOW we only iterate over CPUs > + * a given distance away (rather than *up to* a given distance). > + * > + * hops=0 forces us to play silly games: we pass cpu_none_mask to > + * for_each_cpu_andnot(), which turns it into for_each_cpu(). > + */ > +#define for_each_numa_hop_cpu(cpu, node) \ > + for (struct { const struct cpumask *curr, *prev; int hops; } __v = \ > + { sched_numa_hop_mask(node, 0), NULL, 0 }; \ This anonymous structure is never used as structure. What for you define it? Why not just declare hops, prev and curr without packing them? Thanks, Yury > + !IS_ERR_OR_NULL(__v.curr); \ > + __v.hops++, \ > + __v.prev = __v.curr, \ > + __v.curr = sched_numa_hop_mask(node, __v.hops)) \ > + for_each_cpu_andnot(cpu, \ > + __v.curr, \ > + __v.hops ? __v.prev : cpu_none_mask) > > #endif /* _LINUX_TOPOLOGY_H */ > -- > 2.31.1
On 25/09/22 07:58, Yury Norov wrote: > On Fri, Sep 23, 2022 at 04:55:41PM +0100, Valentin Schneider wrote: >> +/** >> + * for_each_numa_hop_cpu - iterate over CPUs by increasing NUMA distance, >> + * starting from a given node. >> + * @cpu: the iteration variable. >> + * @node: the NUMA node to start the search from. >> + * >> + * Requires rcu_lock to be held. >> + * Careful: this is a double loop, 'break' won't work as expected. > > This warning concerns me not only because new iteration loop hides > complexity and breaks 'break' (sic!), but also because it looks too > specific. Why don't you split it, so instead: > > for_each_numa_hop_cpu(cpu, dev->priv.numa_node) { > cpus[i] = cpu; > if (++i == ncomp_eqs) > goto spread_done; > } > > in the following patch you would have something like this: > > for_each_node_hop(hop, node) { > struct cpumask hop_cpus = sched_numa_hop_mask(node, hop); > > for_each_cpu_andnot(cpu, hop_cpus, ...) { > cpus[i] = cpu; > if (++i == ncomp_eqs) > goto spread_done; > } > } > > It looks more bulky, but I believe there will be more users for > for_each_node_hop() alone. > > On top of that, if you really like it, you can implement > for_each_numa_hop_cpu() if you want. > IIUC you're suggesting to introduce an iterator for the cpumasks first, and then maybe add one on top for the individual cpus. I'm happy to do that, though I have to say I'm keen to keep the CPU iterator - IMO the complexity is justified if it is centralized in one location and saves us from boring old boilerplate code. >> + * Implementation notes: >> + * >> + * Providing it is valid, the mask returned by >> + * sched_numa_hop_mask(node, hops+1) >> + * is a superset of the one returned by >> + * sched_numa_hop_mask(node, hops) >> + * which may not be that useful for drivers that try to spread things out and >> + * want to visit a CPU not more than once. >> + * >> + * To accommodate for that, we use for_each_cpu_andnot() to iterate over the cpus >> + * of sched_numa_hop_mask(node, hops+1) with the CPUs of >> + * sched_numa_hop_mask(node, hops) removed, IOW we only iterate over CPUs >> + * a given distance away (rather than *up to* a given distance). >> + * >> + * hops=0 forces us to play silly games: we pass cpu_none_mask to >> + * for_each_cpu_andnot(), which turns it into for_each_cpu(). >> + */ >> +#define for_each_numa_hop_cpu(cpu, node) \ >> + for (struct { const struct cpumask *curr, *prev; int hops; } __v = \ >> + { sched_numa_hop_mask(node, 0), NULL, 0 }; \ > > This anonymous structure is never used as structure. What for you > define it? Why not just declare hops, prev and curr without packing > them? > I haven't found a way to do this that doesn't involve a struct - apparently you can't mix types in a for loop declaration clause. > Thanks, > Yury >
diff --git a/include/linux/topology.h b/include/linux/topology.h index 3e91ae6d0ad5..7aa7e6a4c739 100644 --- a/include/linux/topology.h +++ b/include/linux/topology.h @@ -257,5 +257,42 @@ static inline const struct cpumask *sched_numa_hop_mask(int node, int hops) } #endif /* CONFIG_NUMA */ +/** + * for_each_numa_hop_cpu - iterate over CPUs by increasing NUMA distance, + * starting from a given node. + * @cpu: the iteration variable. + * @node: the NUMA node to start the search from. + * + * Requires rcu_lock to be held. + * Careful: this is a double loop, 'break' won't work as expected. + * + * + * Implementation notes: + * + * Providing it is valid, the mask returned by + * sched_numa_hop_mask(node, hops+1) + * is a superset of the one returned by + * sched_numa_hop_mask(node, hops) + * which may not be that useful for drivers that try to spread things out and + * want to visit a CPU not more than once. + * + * To accommodate for that, we use for_each_cpu_andnot() to iterate over the cpus + * of sched_numa_hop_mask(node, hops+1) with the CPUs of + * sched_numa_hop_mask(node, hops) removed, IOW we only iterate over CPUs + * a given distance away (rather than *up to* a given distance). + * + * hops=0 forces us to play silly games: we pass cpu_none_mask to + * for_each_cpu_andnot(), which turns it into for_each_cpu(). + */ +#define for_each_numa_hop_cpu(cpu, node) \ + for (struct { const struct cpumask *curr, *prev; int hops; } __v = \ + { sched_numa_hop_mask(node, 0), NULL, 0 }; \ + !IS_ERR_OR_NULL(__v.curr); \ + __v.hops++, \ + __v.prev = __v.curr, \ + __v.curr = sched_numa_hop_mask(node, __v.hops)) \ + for_each_cpu_andnot(cpu, \ + __v.curr, \ + __v.hops ? __v.prev : cpu_none_mask) #endif /* _LINUX_TOPOLOGY_H */
The recently introduced sched_numa_hop_mask() exposes cpumasks of CPUs reachable within a given distance budget, but this means each successive cpumask is a superset of the previous one. Code wanting to allocate one item per CPU (e.g. IRQs) at increasing distances would thus need to allocate a temporary cpumask to note which CPUs have already been visited. This can be prevented by leveraging for_each_cpu_andnot() - package all that logic into one ugl^D fancy macro. Signed-off-by: Valentin Schneider <vschneid@redhat.com> --- include/linux/topology.h | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+)