diff mbox series

[v4,1/2] x86/delay: Fix the wrong asm constraint in `delay_loop()`

Message ID 20220301094608.118879-2-ammarfaizi2@gnuweeb.org (mailing list archive)
State New, archived
Headers show
Series Two x86 fixes | expand

Commit Message

Ammar Faizi March 1, 2022, 9:46 a.m. UTC
From: Ammar Faizi <ammarfaizi2@gnuweeb.org>

The asm constraint does not reflect that the asm statement can modify
the value of @loops. But the asm statement in delay_loop() does change
the @loops.

If by any chance the compiler inlines this function, it may clobber
random stuff (e.g. local variable, important temporary value in reg,
etc.).

Fortunately, delay_loop() is only called indirectly (so it can't
inline), and then the register it clobbers is %rax (which is by the
nature of the calling convention, it's a caller saved register), so it
didn't yield any bug.

^ That shouldn't be an excuse for using the wrong constraint anyway.

This changes "a" (as an input) to "+a" (as an input and output).

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jiri Hladky <hladky.jiri@googlemail.com>
Fixes: e01b70ef3eb3 ("x86: fix bug in arch/i386/lib/delay.c file, delay_loop function")
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
---

Fortunately, the constraint violation that's fixed by patch 1 doesn't
yield any bug due to the nature of System V ABI. Should we backport
this?

 arch/x86/lib/delay.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

David Laight March 1, 2022, 9:54 a.m. UTC | #1
From: Ammar Faizi
> Sent: 01 March 2022 09:46
> 
> The asm constraint does not reflect that the asm statement can modify
> the value of @loops. But the asm statement in delay_loop() does change
> the @loops.
> 
> If by any chance the compiler inlines this function, it may clobber
> random stuff (e.g. local variable, important temporary value in reg,
> etc.).
> 
> Fortunately, delay_loop() is only called indirectly (so it can't
> inline), and then the register it clobbers is %rax (which is by the
> nature of the calling convention, it's a caller saved register), so it
> didn't yield any bug.

Both the function pointers in that code need killing.
They only have two options (each) so conditional branches
will almost certainly always have been better.

I also wonder how well the comment
   The additional jump magic is needed to get the timing stable
   on all the CPU' we have to worry about.
applies to any modern cpu!
The code is unchanged since (at least) 2.6.27.
(It might have been moved from another file.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Alviro Iskandar Setiawan March 1, 2022, 11:33 a.m. UTC | #2
On Tue, Mar 1, 2022 at 4:46 PM Ammar Faizi wrote:
> Fortunately, the constraint violation that's fixed by patch 1 doesn't
> yield any bug due to the nature of System V ABI. Should we backport
> this?

hi sir, it might also be interesting to know that even if it never be
inlined, it's still potential to break.

for example this code (https://godbolt.org/z/xWMTxhTET)

  __attribute__((__noinline__)) static void x(int a)
  {
      asm("xorl\t%%r8d, %%r8d"::"a"(a));
  }

  extern int p(void);

  int f(void)
  {
      int ret = p();
      x(ret);
      return ret;
  }

translates to this asm

  x:
          movl    %edi, %eax
          xorl    %r8d, %r8d
          ret
  f:
          subq    $8, %rsp
          call    p
          movl    %eax, %r8d
          movl    %eax, %edi
          call    x
          movl    %r8d, %eax
          addq    $8, %rsp
          ret

See the %r8d? It should be clobbered by a function call too. But since
no one tells the compiler that we clobber %r8d, it assumes %r8d never
changes after that call. The compiler thinks x() is static and will
not clobber %r8d, even the ABI says %r8d will be clobbered by a
function call. So i think it should be backported to the stable
kernel, it's still a fix

-- Viro
Ammar Faizi March 3, 2022, 12:06 a.m. UTC | #3
On 3/1/22 6:33 PM, Alviro Iskandar Setiawan wrote:
> hi sir, it might also be interesting to know that even if it never be
> inlined, it's still potential to break.
> 
> for example this code (https://godbolt.org/z/xWMTxhTET)
> 
>    __attribute__((__noinline__)) static void x(int a)
>    {
>        asm("xorl\t%%r8d, %%r8d"::"a"(a));
>    }
> 
>    extern int p(void);
> 
>    int f(void)
>    {
>        int ret = p();
>        x(ret);
>        return ret;
>    }
> 
> translates to this asm
> 
>    x:
>            movl    %edi, %eax
>            xorl    %r8d, %r8d
>            ret
>    f:
>            subq    $8, %rsp
>            call    p
>            movl    %eax, %r8d
>            movl    %eax, %edi
>            call    x
>            movl    %r8d, %eax
>            addq    $8, %rsp
>            ret
> 
> See the %r8d? It should be clobbered by a function call too. But since
> no one tells the compiler that we clobber %r8d, it assumes %r8d never
> changes after that call. The compiler thinks x() is static and will
> not clobber %r8d, even the ABI says %r8d will be clobbered by a
> function call. So i think it should be backported to the stable
> kernel, it's still a fix

Thanks. I will add CC stable in the v5.
Ammar Faizi March 3, 2022, 12:14 a.m. UTC | #4
On 3/1/22 4:54 PM, David Laight wrote:
> Both the function pointers in that code need killing.
> They only have two options (each) so conditional branches
> will almost certainly always have been better.

Yes, I agree with simply using conditional branches to handle this
case. But to keep the changes minimal for the stable tree, let's fix
the obvious real bug first. Someone can refactor it later, but I
don't see that as an urgent thing to refactor.

> I also wonder how well the comment
>     The additional jump magic is needed to get the timing stable
>     on all the CPU' we have to worry about.
> applies to any modern cpu!
> The code is unchanged since (at least) 2.6.27.
> (It might have been moved from another file.)
Not sure about that...

Thanks for the feedback.
David Laight March 3, 2022, 12:35 a.m. UTC | #5
From: Alviro Iskandar Setiawan
> Sent: 01 March 2022 11:34
> 
> On Tue, Mar 1, 2022 at 4:46 PM Ammar Faizi wrote:
> > Fortunately, the constraint violation that's fixed by patch 1 doesn't
> > yield any bug due to the nature of System V ABI. Should we backport
> > this?
> 
> hi sir, it might also be interesting to know that even if it never be
> inlined, it's still potential to break.
> 
> for example this code (https://godbolt.org/z/xWMTxhTET)
> 
>   __attribute__((__noinline__)) static void x(int a)
>   {
>       asm("xorl\t%%r8d, %%r8d"::"a"(a));
>   }

But this code isn't doing that.
In your example the compiler has looked at the static function
and realised that is doesn't use r8 so it need not be saved
even though it is a volatile register.

In this code the compiler knows %ax is being used, it just
doesn't know it is changed - so could assume the value
is unchanged.

The only code that is likely to break is:

int f(int d)
{
	d += 10;
	__delay(d);
	return d;
}

Which might manage to return the value of %eax modified by the asm.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
diff mbox series

Patch

diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 65d15df6212d..0e65d00e2339 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -54,8 +54,8 @@  static void delay_loop(u64 __loops)
 		"	jnz 2b		\n"
 		"3:	dec %0		\n"
 
-		: /* we don't need output */
-		:"a" (loops)
+		: "+a" (loops)
+		:
 	);
 }