
BUG: FP registers leak across execve

Message ID 20181210221328.fdl2eoe5cqxudgfs@aurel32.net (mailing list archive)
State New, archived

Commit Message

Aurelien Jarno Dec. 10, 2018, 10:13 p.m. UTC
Hi all,

Debugging some glibc testsuite math failures, I have found out that most
of the time, the FP status register and the FP registers are not zeroed
as they should be. This can be tested with the attached code. The best way
to reproduce it is to execute the program from Python (I guess Perl or any
other interpreted language that supports FP computation should also work).
When running an FP computation before calling the program, the result of
the computation can be seen in f10.

The zeroing of the FP state happens in the flush_thread function in
kernel/process.c. It seems that the kernel restores that state only
if a context switch happens between flush_thread and the first FP
instruction of the executed program.

A possible workaround is to restore the FP registers in flush_thread,
but that's probably not the best way to do it:

--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -93,6 +93,7 @@ void flush_thread(void)
         *      fflags: accrued exceptions cleared
         */
        memset(&current->thread.fstate, 0, sizeof(current->thread.fstate));
+       fstate_restore(current, task_pt_regs(current));
 #endif
 }

Aurelien

Comments

Palmer Dabbelt Jan. 3, 2019, 11:36 p.m. UTC | #1
On Mon, 10 Dec 2018 14:13:28 PST (-0800), aurelien@aurel32.net wrote:
> Debugging some glibc testsuite math failures, I have found out that most
> of the time, the FP status register and the FP registers are not zeroed
> as they should be. [...]

Are you running this in QEMU?  IIRC there was a bug here and we might not have 
the right fix upstream yet.
Aurelien Jarno Jan. 4, 2019, 1:39 a.m. UTC | #2
On 2019-01-03 15:36, Palmer Dabbelt wrote:
> On Mon, 10 Dec 2018 14:13:28 PST (-0800), aurelien@aurel32.net wrote:
> > [...]
> 
> Are you running this in QEMU?  IIRC there was a bug here and we might not
> have the right fix upstream yet.

I can reproduce the issue in a QEMU 3.1 VM running a 4.20 kernel, but
also on an HFU board running the original kernel.
Benjamin Herrenschmidt Jan. 4, 2019, 3:12 a.m. UTC | #3
On Fri, 2019-01-04 at 02:39 +0100, Aurelien Jarno wrote:
> On 2019-01-03 15:36, Palmer Dabbelt wrote:
> > On Mon, 10 Dec 2018 14:13:28 PST (-0800), aurelien@aurel32.net wrote:
> > > [...]
> > 
> > Are you running this in QEMU?  IIRC there was a bug here and we might not
> > have the right fix upstream yet.
> 
> I can reproduce the issue in a QEMU 3.1 VM running a 4.20 kernel, but
> also on an HFU board running the original kernel.

You can do that, or force a restore in start_thread, I think. (powerpc
does all that stuff in start_thread rather than flush_thread, but that's
probably not that great an idea.)

Cheers,
Ben.
Palmer Dabbelt Jan. 22, 2019, 8:31 p.m. UTC | #4
On Thu, 03 Jan 2019 17:39:32 PST (-0800), aurelien@aurel32.net wrote:
> On 2019-01-03 15:36, Palmer Dabbelt wrote:
>> On Mon, 10 Dec 2018 14:13:28 PST (-0800), aurelien@aurel32.net wrote:
>> > [...]
>> >
>> > A possible workaround is to restore the FP registers in flush_thread,
>> > but that's probably not the best way to do it:
>> >
>> >
>> > --- a/arch/riscv/kernel/process.c
>> > +++ b/arch/riscv/kernel/process.c
>> > @@ -93,6 +93,7 @@ void flush_thread(void)
>> >          *      fflags: accrued exceptions cleared
>> >          */
>> >         memset(&current->thread.fstate, 0, sizeof(current->thread.fstate));
>> > +       fstate_restore(current, task_pt_regs(current));
>> >  #endif
>> >  }
>>
>> Are you running this in QEMU?  IIRC there was a bug here and we might not
>> have the right fix upstream yet.
>
> I can reproduce the issue in a QEMU 3.1 VM running a 4.20 kernel, but
> also on an HFU board running the original kernel.

OK, so at least it's not a QEMU bug.

I think this is the right place to put it, but we should also add

    regs->sstatus &= ~SR_FS;
    regs->sstatus |= SR_FS_INITIAL;

so the FPU's status register corresponds to the contents of the FPU.  I'm not
sure if this is strictly necessary, but it'll certainly help FP save/restore
performance.  I'm trying to construct a scenario where not setting
SR.FS=initial actually manifests in incorrect behavior and can't think of one,
but it does feel incorrect.

Patch

--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -93,6 +93,7 @@ void flush_thread(void)
         *      fflags: accrued exceptions cleared
         */
        memset(&current->thread.fstate, 0, sizeof(current->thread.fstate));
+       fstate_restore(current, task_pt_regs(current));
 #endif
 }