| Message ID | 20240529090132.59434-2-roger.pau@citrix.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | x86/irq: fixes for CPU hot{,un}plug |
On 29.05.2024 11:01, Roger Pau Monne wrote:
> When adjusting move_cleanup_count to account for CPUs that are offline also
> adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
> those again creating and create an imbalance in move_cleanup_count.

I'm in trouble with "creating"; I can't seem to be able to guess what you may
have meant.

> Fixes: 472e0b74c5c4 ('x86/IRQ: deal with move cleanup count state in fixup_irqs()')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

With the above clarified (adjustment can be done while committing)
Reviewed-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -2572,6 +2572,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
>              desc->arch.move_cleanup_count -= cpumask_weight(affinity);
>              if ( !desc->arch.move_cleanup_count )
>                  release_old_vec(desc);
> +            else
> +                /*
> +                 * Adjust old_cpu_mask to account for the offline CPUs,
> +                 * otherwise further calls to fixup_irqs() could subtract those
> +                 * again and possibly underflow the counter.
> +                 */
> +                cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
> +                            &cpu_online_map);
>          }

While functionality-wise okay, imo it would be slightly better to use
"affinity" here as well, so that even without looking at context beyond
what's shown here there is a direct connection to the cpumask_weight()
call. I.e.

    cpumask_andnot(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
                   affinity);

Thoughts?

Jan
On Wed, May 29, 2024 at 02:40:51PM +0200, Jan Beulich wrote:
> On 29.05.2024 11:01, Roger Pau Monne wrote:
> > When adjusting move_cleanup_count to account for CPUs that are offline also
> > adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
> > those again creating and create an imbalance in move_cleanup_count.
>
> I'm in trouble with "creating"; I can't seem to be able to guess what you may
> have meant.

Oh, sorry, that's a typo.

I was meaning to point out that not removing the already subtracted
CPUs from the mask can lead to further calls to fixup_irqs()
subtracting them again and move_cleanup_count possibly underflowing.

Would you prefer to write it as:

"... could subtract those again and possibly underflow move_cleanup_count."

> > Fixes: 472e0b74c5c4 ('x86/IRQ: deal with move cleanup count state in fixup_irqs()')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> With the above clarified (adjustment can be done while committing)
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -2572,6 +2572,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
> >              desc->arch.move_cleanup_count -= cpumask_weight(affinity);
> >              if ( !desc->arch.move_cleanup_count )
> >                  release_old_vec(desc);
> > +            else
> > +                /*
> > +                 * Adjust old_cpu_mask to account for the offline CPUs,
> > +                 * otherwise further calls to fixup_irqs() could subtract those
> > +                 * again and possibly underflow the counter.
> > +                 */
> > +                cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
> > +                            &cpu_online_map);
> >          }
>
> While functionality-wise okay, imo it would be slightly better to use
> "affinity" here as well, so that even without looking at context beyond
> what's shown here there is a direct connection to the cpumask_weight()
> call. I.e.
>
>     cpumask_andnot(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
>                    affinity);
>
> Thoughts?

It was more straightforward for me to reason that removing the offline
CPUs is OK, but I can see that you might prefer to use 'affinity',
because that's the weight that's subtracted from move_cleanup_count.
Using either should lead to the same result if my understanding is
correct.

Thanks, Roger.
On 29.05.2024 17:15, Roger Pau Monné wrote:
> On Wed, May 29, 2024 at 02:40:51PM +0200, Jan Beulich wrote:
>> On 29.05.2024 11:01, Roger Pau Monne wrote:
>>> When adjusting move_cleanup_count to account for CPUs that are offline also
>>> adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
>>> those again creating and create an imbalance in move_cleanup_count.
>>
>> I'm in trouble with "creating"; I can't seem to be able to guess what you may
>> have meant.
>
> Oh, sorry, that's a typo.
>
> I was meaning to point out that not removing the already subtracted
> CPUs from the mask can lead to further calls to fixup_irqs()
> subtracting them again and move_cleanup_count possibly underflowing.
>
> Would you prefer to write it as:
>
> "... could subtract those again and possibly underflow move_cleanup_count."

Fine with me. Looks like simply deleting "creating" and keeping the rest
as it was would be okay too? Whatever you prefer in the end.

>>> --- a/xen/arch/x86/irq.c
>>> +++ b/xen/arch/x86/irq.c
>>> @@ -2572,6 +2572,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
>>>              desc->arch.move_cleanup_count -= cpumask_weight(affinity);
>>>              if ( !desc->arch.move_cleanup_count )
>>>                  release_old_vec(desc);
>>> +            else
>>> +                /*
>>> +                 * Adjust old_cpu_mask to account for the offline CPUs,
>>> +                 * otherwise further calls to fixup_irqs() could subtract those
>>> +                 * again and possibly underflow the counter.
>>> +                 */
>>> +                cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
>>> +                            &cpu_online_map);
>>>          }
>>
>> While functionality-wise okay, imo it would be slightly better to use
>> "affinity" here as well, so that even without looking at context beyond
>> what's shown here there is a direct connection to the cpumask_weight()
>> call. I.e.
>>
>>     cpumask_andnot(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
>>                    affinity);
>>
>> Thoughts?
>
> It was more straightforward for me to reason that removing the offline
> CPUs is OK, but I can see that you might prefer to use 'affinity',
> because that's the weight that's subtracted from move_cleanup_count.
> Using either should lead to the same result if my understanding is
> correct.

That was the conclusion I came to, or else I wouldn't have made the
suggestion. Unless you have a strong preference for the as-is form, I'd
indeed prefer the suggested alternative.

Jan
On Wed, May 29, 2024 at 05:27:06PM +0200, Jan Beulich wrote:
> On 29.05.2024 17:15, Roger Pau Monné wrote:
> > On Wed, May 29, 2024 at 02:40:51PM +0200, Jan Beulich wrote:
> >> On 29.05.2024 11:01, Roger Pau Monne wrote:
> >>> When adjusting move_cleanup_count to account for CPUs that are offline also
> >>> adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
> >>> those again creating and create an imbalance in move_cleanup_count.
> >>
> >> I'm in trouble with "creating"; I can't seem to be able to guess what you may
> >> have meant.
> >
> > Oh, sorry, that's a typo.
> >
> > I was meaning to point out that not removing the already subtracted
> > CPUs from the mask can lead to further calls to fixup_irqs()
> > subtracting them again and move_cleanup_count possibly underflowing.
> >
> > Would you prefer to write it as:
> >
> > "... could subtract those again and possibly underflow move_cleanup_count."
>
> Fine with me. Looks like simply deleting "creating" and keeping the rest
> as it was would be okay too? Whatever you prefer in the end.

Yes, whatever you think is clearer TBH, I don't really have a
preference.

Thanks, Roger.
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index c16205a9beb6..9716e00e873b 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2572,6 +2572,14 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
             desc->arch.move_cleanup_count -= cpumask_weight(affinity);
             if ( !desc->arch.move_cleanup_count )
                 release_old_vec(desc);
+            else
+                /*
+                 * Adjust old_cpu_mask to account for the offline CPUs,
+                 * otherwise further calls to fixup_irqs() could subtract those
+                 * again and possibly underflow the counter.
+                 */
+                cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
+                            &cpu_online_map);
         }
 
         if ( !desc->action || cpumask_subset(desc->affinity, mask) )
When adjusting move_cleanup_count to account for CPUs that are offline also
adjust old_cpu_mask, otherwise further calls to fixup_irqs() could subtract
those again creating and create an imbalance in move_cleanup_count.

Fixes: 472e0b74c5c4 ('x86/IRQ: deal with move cleanup count state in fixup_irqs()')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/irq.c | 8 ++++++++
 1 file changed, 8 insertions(+)