[v2,0/2] arm64: Only call into preempt_schedule() if need_resched()

Message ID 1543599271-14339-1-git-send-email-will.deacon@arm.com (mailing list archive)

Message

Will Deacon Nov. 30, 2018, 5:34 p.m. UTC
Hi all,

This is version two of the patches I originally posted here:

  http://lkml.kernel.org/r/1543347902-21170-1-git-send-email-will.deacon@arm.com

The only change since v1 is that __preempt_count_dec_and_test() now
reloads the need_resched flag if it initially saw that it was set. This
resolves the issue spotted by Peter, where an IRQ coming in during the
decrement can cause a reschedule to be missed.
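
[Editor's note: the reload described above can be modelled in userspace. The
sketch below is illustrative only -- the names and layout are assumptions
based on the cover letter, not the kernel's actual code, and it assumes the
little-endian layout arm64 Linux uses (count in the low 32 bits, an
*inverted* need-resched flag in the high 32 bits, so both are visible with
one 64-bit load). An `irq` callback stands in for an interrupt landing
between the load and the store.]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the thread_info word this series relies on. */
union preempt_state {
	uint64_t whole;
	struct {
		uint32_t count;		/* preemption disable depth */
		uint32_t need_resched;	/* inverted: 0 => resched needed */
	};
};

static union preempt_state ti;	/* stands in for current_thread_info() */

static void irq_sets_need_resched(void)
{
	ti.need_resched = 0;	/* inverted sense: 0 means "reschedule" */
}

/*
 * Sketch of the v2 __preempt_count_dec_and_test(): the store writes back
 * only the 32-bit count, and unless the decremented 64-bit value is
 * already all-zero, the word is reloaded to catch a need_resched set by
 * an IRQ arriving between the load and the store.
 */
static int dec_and_test(void (*irq)(void))
{
	uint64_t pc = ti.whole;		/* single 64-bit load */

	if (irq)
		irq();			/* interrupt lands in the window */

	ti.count = (uint32_t)--pc;	/* store only the count half */
	return pc == 0 || ti.whole == 0;	/* reload closes the race */
}
```

With the reload, the IRQ case returns true and the reschedule is not lost;
dropping the `ti.whole == 0` check reproduces the v1 problem Peter spotted.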

Feedback welcome.

Will

--->8

Will Deacon (2):
  preempt: Move PREEMPT_NEED_RESCHED definition into arch code
  arm64: preempt: Provide our own implementation of asm/preempt.h

 arch/arm64/include/asm/Kbuild        |  1 -
 arch/arm64/include/asm/preempt.h     | 88 ++++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/thread_info.h | 13 +++++-
 arch/s390/include/asm/preempt.h      |  2 +
 arch/x86/include/asm/preempt.h       |  3 ++
 include/linux/preempt.h              |  3 --
 6 files changed, 105 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/include/asm/preempt.h

Comments

Peter Zijlstra Dec. 6, 2018, 3:08 p.m. UTC | #1
On Fri, Nov 30, 2018 at 05:34:29PM +0000, Will Deacon wrote:
> Hi all,
> 
> This is version two of the patches I originally posted here:
> 
>   http://lkml.kernel.org/r/1543347902-21170-1-git-send-email-will.deacon@arm.com
> 
> The only change since v1 is that  __preempt_count_dec_and_test() now
> reloads the need_resched flag if it initially saw that it was set. This
> resolves the issue spotted by Peter, where an IRQ coming in during the
> decrement can cause a reschedule to be missed.

Yes, I think this one will work, so:

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

However, this leaves me wondering if the sequence is actually much
better than what you had?

I suppose there's a win due to cache locality -- you only have to load a
single line -- but I'm thinking that on pure instruction count, you're
not actually winning much.
Will Deacon Dec. 6, 2018, 7:18 p.m. UTC | #2
Hi Peter,

On Thu, Dec 06, 2018 at 04:08:50PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 30, 2018 at 05:34:29PM +0000, Will Deacon wrote:
> > This is version two of the patches I originally posted here:
> > 
> >   http://lkml.kernel.org/r/1543347902-21170-1-git-send-email-will.deacon@arm.com
> > 
> > The only change since v1 is that  __preempt_count_dec_and_test() now
> > reloads the need_resched flag if it initially saw that it was set. This
> > resolves the issue spotted by Peter, where an IRQ coming in during the
> > decrement can cause a reschedule to be missed.
> 
> Yes, I think this one will work, so:
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Thanks!

> However, this leaves me wondering if the sequence is actually much
> better than what you had?
> 
> I suppose there's a win due to cache locality -- you only have to load a
> single line -- but I'm thinking that on pure instruction count, you're
> not actually winning much.

The fast path is still slightly shorter in terms of executed instructions,
but you're right that the win is likely to be because everything hits in the
cache or the store buffer when we're not preempting, so we should run
through the code reasonably quickly and avoid the unconditional call to
preempt_schedule().

Will

--->8

// Before
  20:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
  24:   910003fd        mov     x29, sp
  28:   d5384101        mrs     x1, sp_el0
  2c:   b9401020        ldr     w0, [x1, #16]
  30:   51000400        sub     w0, w0, #0x1
  34:   b9001020        str     w0, [x1, #16]
  38:   350000a0        cbnz    w0, 4c <preempt_enable+0x2c>
  3c:   f9400020        ldr     x0, [x1]
  40:   721f001f        tst     w0, #0x2
  44:   54000040        b.eq    4c <preempt_enable+0x2c>  // b.none
  48:   94000000        bl      0 <preempt_schedule>
  4c:   a8c17bfd        ldp     x29, x30, [sp], #16
  50:   d65f03c0        ret

// After
  20:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
  24:   910003fd        mov     x29, sp
  28:   d5384101        mrs     x1, sp_el0
  2c:   f9400820        ldr     x0, [x1, #16]
  30:   d1000400        sub     x0, x0, #0x1
  34:   b9001020        str     w0, [x1, #16]
  38:   b5000080        cbnz    x0, 48 <preempt_enable+0x28>
  3c:   94000000        bl      0 <preempt_schedule>
  40:   a8c17bfd        ldp     x29, x30, [sp], #16
  44:   d65f03c0        ret
  48:   f9400820        ldr     x0, [x1, #16]
  4c:   b5ffffa0        cbnz    x0, 40 <preempt_enable+0x20>
  50:   94000000        bl      0 <preempt_schedule>
  54:   17fffffb        b       40 <preempt_enable+0x20>
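
[Editor's note: the fast path above hinges on the flag encoding -- a single
64-bit `sub`/`cbnz` answers "did the count hit zero *and* is a reschedule
pending?" in one go. The arithmetic can be sketched as follows; the bit-32
placement and inverted flag sense are taken from the series' description,
while the macro name here is illustrative.]

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: flag set at bit 32 => no reschedule pending. */
#define NEED_RESCHED_CLEAR_BIT (1ULL << 32)

/*
 * Mirrors the ldr/sub/cbnz triple in the "after" sequence: the 64-bit
 * decrement reaches exactly zero only when the count was 1 (last
 * preempt_disable() being undone) and the high-word flag was clear
 * (reschedule pending), i.e. only when preempt_schedule() must run.
 */
static int should_preempt(uint64_t pc)
{
	return (pc - 1) == 0;	/* cbnz falls through => call scheduler */
}
```

So the common cases -- nested disables (count > 1) or no pending reschedule
(bit 32 set) -- skip the call after one load and one subtract.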