[v3] MIPS: Implement ieee754 NAN2008 emulation mode

Message ID	20240612-mips_ieee754_emul-v3-1-2c21b450abdb@flygoat.com (mailing list archive)
State	Accepted
Commit	59649de96f21dfb0518faa8feaa3d05c2d81b042
Headers	show Received: from fhigh3-smtp.messagingengine.com (fhigh3-smtp.messagingengine.com [103.168.172.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2A1F116C6AF; Wed, 12 Jun 2024 08:38:24 +0000 (UTC) Feedback-ID: ifd894703:Fastmail From: Jiaxun Yang <jiaxun.yang@flygoat.com> Date: Wed, 12 Jun 2024 09:38:19 +0100 Subject: [PATCH v3] MIPS: Implement ieee754 NAN2008 emulation mode Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Message-Id: <20240612-mips_ieee754_emul-v3-1-2c21b450abdb@flygoat.com> To: Jonathan Corbet <corbet@lwn.net>, Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org, =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= <philmd@linaro.org>, Jiaxun Yang <jiaxun.yang@flygoat.com>
Series	[v3] MIPS: Implement ieee754 NAN2008 emulation mode \| expand [v3] MIPS: Implement ieee754 NAN2008 emulation mode

Jiaxun Yang June 12, 2024, 8:38 a.m. UTC

Implement ieee754 NAN2008 emulation mode.

When this mode is enabled, kernel will accept ELF file
compiled for both NaN 2008 and NaN legacy, but if hardware
does not have capability to match ELF's NaN mode, __own_fpu
will fail for corresponding thread and fpuemu will then kick
in.

This mode trade performance for correctness, while maintaining
support for both NaN mode regardless of hardware capability.
It is useful for multilib installation that have both types
of binary exist in system.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
---
Changes in v3:
- Fix typo in commit message
- Collect R-b tags
- Link to v2: https://lore.kernel.org/r/20240511-mips_ieee754_emul-v2-1-af796ea21ef0@flygoat.com

Changes in v2:
- Fix a typo
- Link to v1: https://lore.kernel.org/r/20240507-mips_ieee754_emul-v1-1-1dc7c0d13cac@flygoat.com
---
 Documentation/admin-guide/kernel-parameters.txt |  4 +++-
 arch/mips/include/asm/fpu.h                     | 15 +++++++++++++++
 arch/mips/kernel/elf.c                          |  4 ++++
 arch/mips/kernel/fpu-probe.c                    |  9 ++++++++-
 4 files changed, 30 insertions(+), 2 deletions(-)


---
base-commit: 93a39e4766083050ca0ecd6a3548093a3b9eb60c
change-id: 20240507-mips_ieee754_emul-d17865fd9870

Best regards,

Thomas Bogendoerfer June 27, 2024, 10:58 a.m. UTC | #1

On Wed, Jun 12, 2024 at 09:38:19AM +0100, Jiaxun Yang wrote:
> +++ b/arch/mips/kernel/elf.c
> @@ -318,6 +318,10 @@ void mips_set_personality_nan(struct arch_elf_state *state)
>  	t->thread.fpu.fcr31 = c->fpu_csr31;
>  	switch (state->nan_2008) {
>  	case 0:
> +		if (!(c->fpu_msk31 & FPU_CSR_NAN2008))
> +			t->thread.fpu.fcr31 &= ~FPU_CSR_NAN2008;
> +		if (!(c->fpu_msk31 & FPU_CSR_ABS2008))
> +			t->thread.fpu.fcr31 &= ~FPU_CSR_ABS2008;

why is this needed ?

Thomas.

Jiaxun Yang June 27, 2024, 11:21 a.m. UTC | #2

在2024年6月27日六月 上午11:58，Thomas Bogendoerfer写道：
> On Wed, Jun 12, 2024 at 09:38:19AM +0100, Jiaxun Yang wrote:
>> +++ b/arch/mips/kernel/elf.c
>> @@ -318,6 +318,10 @@ void mips_set_personality_nan(struct arch_elf_state *state)
>>  	t->thread.fpu.fcr31 = c->fpu_csr31;
>>  	switch (state->nan_2008) {
>>  	case 0:
>> +		if (!(c->fpu_msk31 & FPU_CSR_NAN2008))
>> +			t->thread.fpu.fcr31 &= ~FPU_CSR_NAN2008;
>> +		if (!(c->fpu_msk31 & FPU_CSR_ABS2008))
>> +			t->thread.fpu.fcr31 &= ~FPU_CSR_ABS2008;
>
> why is this needed?

Because t->thread.fpu.fcr31 comes from c->fpu_csr31, in this case we the default
value of c->fpu_csr31 is read from hardware and we don't know what would that be.

Thanks
>
> Thomas.
>
> -- 
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea.                                                [ RFC1925, 2.3 ]

Maciej W. Rozycki June 27, 2024, 7:54 p.m. UTC | #3

On Thu, 27 Jun 2024, Jiaxun Yang wrote:

> >> @@ -318,6 +318,10 @@ void mips_set_personality_nan(struct arch_elf_state *state)
> >>  	t->thread.fpu.fcr31 = c->fpu_csr31;
> >>  	switch (state->nan_2008) {
> >>  	case 0:
> >> +		if (!(c->fpu_msk31 & FPU_CSR_NAN2008))
> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_NAN2008;
> >> +		if (!(c->fpu_msk31 & FPU_CSR_ABS2008))
> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_ABS2008;
> >
> > why is this needed?
> 
> Because t->thread.fpu.fcr31 comes from c->fpu_csr31, in this case we the default
> value of c->fpu_csr31 is read from hardware and we don't know what would that be.

 But it has always been like this.  What has changed with your patch that 
you need to mask the bit out now?

  Maciej

Jiaxun Yang June 28, 2024, 12:33 a.m. UTC | #4

在2024年6月27日六月 下午8:54，Maciej W. Rozycki写道：
> On Thu, 27 Jun 2024, Jiaxun Yang wrote:
>
>> >> @@ -318,6 +318,10 @@ void mips_set_personality_nan(struct arch_elf_state *state)
>> >>  	t->thread.fpu.fcr31 = c->fpu_csr31;
>> >>  	switch (state->nan_2008) {
>> >>  	case 0:
>> >> +		if (!(c->fpu_msk31 & FPU_CSR_NAN2008))
>> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_NAN2008;
>> >> +		if (!(c->fpu_msk31 & FPU_CSR_ABS2008))
>> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_ABS2008;
>> >
>> > why is this needed?
>> 
>> Because t->thread.fpu.fcr31 comes from c->fpu_csr31, in this case we the default
>> value of c->fpu_csr31 is read from hardware and we don't know what would that be.
>
>  But it has always been like this.  What has changed with your patch that 
> you need to mask the bit out now?

After this patch kernel's copy of t->thread.fpu.fcr31 can disagree with hardware.
When disagree happens, we trigger emulation.

Before that patch for nan legacy binary running on nan2008 CPU t->thread.fpu.fcr31
will still be nan2008 (for ieee754=relaxed) so that's not relevant.

Thanks
>
>   Maciej

Thomas Bogendoerfer July 9, 2024, 8:55 a.m. UTC | #5

On Fri, Jun 28, 2024 at 01:33:06AM +0100, Jiaxun Yang wrote:
> 
> 
> 在2024年6月27日六月 下午8:54，Maciej W. Rozycki写道：
> > On Thu, 27 Jun 2024, Jiaxun Yang wrote:
> >
> >> >> @@ -318,6 +318,10 @@ void mips_set_personality_nan(struct arch_elf_state *state)
> >> >>  	t->thread.fpu.fcr31 = c->fpu_csr31;
> >> >>  	switch (state->nan_2008) {
> >> >>  	case 0:
> >> >> +		if (!(c->fpu_msk31 & FPU_CSR_NAN2008))
> >> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_NAN2008;
> >> >> +		if (!(c->fpu_msk31 & FPU_CSR_ABS2008))
> >> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_ABS2008;
> >> >
> >> > why is this needed?
> >> 
> >> Because t->thread.fpu.fcr31 comes from c->fpu_csr31, in this case we the default
> >> value of c->fpu_csr31 is read from hardware and we don't know what would that be.
> >
> >  But it has always been like this.  What has changed with your patch that 
> > you need to mask the bit out now?
> 
> After this patch kernel's copy of t->thread.fpu.fcr31 can disagree with hardware.
> When disagree happens, we trigger emulation.
> 
> Before that patch for nan legacy binary running on nan2008 CPU t->thread.fpu.fcr31
> will still be nan2008 (for ieee754=relaxed) so that's not relevant.

I'm considering to apply your patch, how much testing/verification did
this patch see ? Do have some test binaries ?

Thomas.

Jiaxun Yang July 10, 2024, 5:34 a.m. UTC | #6

在2024年7月9日七月 下午4:55，Thomas Bogendoerfer写道：
> On Fri, Jun 28, 2024 at 01:33:06AM +0100, Jiaxun Yang wrote:
>> 
>> 
>> 在2024年6月27日六月 下午8:54，Maciej W. Rozycki写道：
>> > On Thu, 27 Jun 2024, Jiaxun Yang wrote:
>> >
>> >> >> @@ -318,6 +318,10 @@ void mips_set_personality_nan(struct arch_elf_state *state)
>> >> >>  	t->thread.fpu.fcr31 = c->fpu_csr31;
>> >> >>  	switch (state->nan_2008) {
>> >> >>  	case 0:
>> >> >> +		if (!(c->fpu_msk31 & FPU_CSR_NAN2008))
>> >> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_NAN2008;
>> >> >> +		if (!(c->fpu_msk31 & FPU_CSR_ABS2008))
>> >> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_ABS2008;
>> >> >
>> >> > why is this needed?
>> >> 
>> >> Because t->thread.fpu.fcr31 comes from c->fpu_csr31, in this case we the default
>> >> value of c->fpu_csr31 is read from hardware and we don't know what would that be.
>> >
>> >  But it has always been like this.  What has changed with your patch that 
>> > you need to mask the bit out now?
>> 
>> After this patch kernel's copy of t->thread.fpu.fcr31 can disagree with hardware.
>> When disagree happens, we trigger emulation.
>> 
>> Before that patch for nan legacy binary running on nan2008 CPU t->thread.fpu.fcr31
>> will still be nan2008 (for ieee754=relaxed) so that's not relevant.
>
> I'm considering to apply your patch, how much testing/verification did
> this patch see ? Do have some test binaries ?

It has been tested against Debian rootfs. There is no need to test againt special binary,
but you need NaN2008 hardware such as Loongson 3A4000.

Thanks
- Jiaxun
>
> Thomas.
>
> -- 
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea.                                                [ RFC1925, 2.3 ]

Thomas Bogendoerfer July 10, 2024, 7:36 a.m. UTC | #7

On Wed, Jul 10, 2024 at 01:34:41PM +0800, Jiaxun Yang wrote:
> 
> 
> 在2024年7月9日七月 下午4:55，Thomas Bogendoerfer写道：
> > On Fri, Jun 28, 2024 at 01:33:06AM +0100, Jiaxun Yang wrote:
> >> 
> >> 
> >> 在2024年6月27日六月 下午8:54，Maciej W. Rozycki写道：
> >> > On Thu, 27 Jun 2024, Jiaxun Yang wrote:
> >> >
> >> >> >> @@ -318,6 +318,10 @@ void mips_set_personality_nan(struct arch_elf_state *state)
> >> >> >>  	t->thread.fpu.fcr31 = c->fpu_csr31;
> >> >> >>  	switch (state->nan_2008) {
> >> >> >>  	case 0:
> >> >> >> +		if (!(c->fpu_msk31 & FPU_CSR_NAN2008))
> >> >> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_NAN2008;
> >> >> >> +		if (!(c->fpu_msk31 & FPU_CSR_ABS2008))
> >> >> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_ABS2008;
> >> >> >
> >> >> > why is this needed?
> >> >> 
> >> >> Because t->thread.fpu.fcr31 comes from c->fpu_csr31, in this case we the default
> >> >> value of c->fpu_csr31 is read from hardware and we don't know what would that be.
> >> >
> >> >  But it has always been like this.  What has changed with your patch that 
> >> > you need to mask the bit out now?
> >> 
> >> After this patch kernel's copy of t->thread.fpu.fcr31 can disagree with hardware.
> >> When disagree happens, we trigger emulation.
> >> 
> >> Before that patch for nan legacy binary running on nan2008 CPU t->thread.fpu.fcr31
> >> will still be nan2008 (for ieee754=relaxed) so that's not relevant.
> >
> > I'm considering to apply your patch, how much testing/verification did
> > this patch see ? Do have some test binaries ?
> 
> It has been tested against Debian rootfs. There is no need to test againt special binary,
> but you need NaN2008 hardware such as Loongson 3A4000.

that's just one case, what about NaN2008 binaries on a legacy MIPS CPU ?

Thomas.

Jiaxun Yang July 10, 2024, 8:16 a.m. UTC | #8

在2024年7月10日七月 下午3:36，Thomas Bogendoerfer写道：
[...]
>> It has been tested against Debian rootfs. There is no need to test againt special binary,
>> but you need NaN2008 hardware such as Loongson 3A4000.
>
> that's just one case, what about NaN2008 binaries on a legacy MIPS CPU ?

Never checked that, as we don't have any NaN2008 distro.

Will try and report.

>
> Thomas.
>
> -- 
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea.                                                [ RFC1925, 2.3 ]

Maciej W. Rozycki July 10, 2024, 9:21 a.m. UTC | #9

On Wed, 10 Jul 2024, Thomas Bogendoerfer wrote:

> > > I'm considering to apply your patch, how much testing/verification did
> > > this patch see ? Do have some test binaries ?
> > 
> > It has been tested against Debian rootfs. There is no need to test againt special binary,
> > but you need NaN2008 hardware such as Loongson 3A4000.
> 
> that's just one case, what about NaN2008 binaries on a legacy MIPS CPU ?

 It would be good to check with hard-float QEMU configured for writable 
FCSR.NAN2008 (which is one way original code was verified) that things 
have not regressed.  And also what happens if once our emulation has 
triggered for the unsupported FCSR.NAN2008 mode, an attempt is made to 
flip the mode bit via ptrace(2), e.g. under GDB, which I reckon our 
emulation permits for non-legacy CPUs (and which I think should not be 
allowed under the new setting).

  Maciej

Jiaxun Yang July 11, 2024, 1:47 a.m. UTC | #10

在2024年7月10日七月 下午3:36，Thomas Bogendoerfer写道：
> On Wed, Jul 10, 2024 at 01:34:41PM +0800, Jiaxun Yang wrote:
>> 
>> 
>> 在2024年7月9日七月 下午4:55，Thomas Bogendoerfer写道：
>> > On Fri, Jun 28, 2024 at 01:33:06AM +0100, Jiaxun Yang wrote:
>> >> 
>> >> 
>> >> 在2024年6月27日六月 下午8:54，Maciej W. Rozycki写道：
>> >> > On Thu, 27 Jun 2024, Jiaxun Yang wrote:
>> >> >
>> >> >> >> @@ -318,6 +318,10 @@ void mips_set_personality_nan(struct arch_elf_state *state)
>> >> >> >>  	t->thread.fpu.fcr31 = c->fpu_csr31;
>> >> >> >>  	switch (state->nan_2008) {
>> >> >> >>  	case 0:
>> >> >> >> +		if (!(c->fpu_msk31 & FPU_CSR_NAN2008))
>> >> >> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_NAN2008;
>> >> >> >> +		if (!(c->fpu_msk31 & FPU_CSR_ABS2008))
>> >> >> >> +			t->thread.fpu.fcr31 &= ~FPU_CSR_ABS2008;
>> >> >> >
>> >> >> > why is this needed?
>> >> >> 
>> >> >> Because t->thread.fpu.fcr31 comes from c->fpu_csr31, in this case we the default
>> >> >> value of c->fpu_csr31 is read from hardware and we don't know what would that be.
>> >> >
>> >> >  But it has always been like this.  What has changed with your patch that 
>> >> > you need to mask the bit out now?
>> >> 
>> >> After this patch kernel's copy of t->thread.fpu.fcr31 can disagree with hardware.
>> >> When disagree happens, we trigger emulation.
>> >> 
>> >> Before that patch for nan legacy binary running on nan2008 CPU t->thread.fpu.fcr31
>> >> will still be nan2008 (for ieee754=relaxed) so that's not relevant.
>> >
>> > I'm considering to apply your patch, how much testing/verification did
>> > this patch see ? Do have some test binaries ?
>> 
>> It has been tested against Debian rootfs. There is no need to test againt special binary,
>> but you need NaN2008 hardware such as Loongson 3A4000.
>
> that's just one case, what about NaN2008 binaries on a legacy MIPS CPU ?

Boot tested CIP United's NaN2008 Debian port, works good so far [1].

[1]: http://repo.oss.cipunited.com/debian/README.nan2008.txt

Thanks
- Jiaxun

>
> Thomas.
>
> -- 
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea.                                                [ RFC1925, 2.3 ]

Jiaxun Yang July 11, 2024, 1:53 a.m. UTC | #11

在2024年7月10日七月 下午5:21，Maciej W. Rozycki写道：
> On Wed, 10 Jul 2024, Thomas Bogendoerfer wrote:
>
>> > > I'm considering to apply your patch, how much testing/verification did
>> > > this patch see ? Do have some test binaries ?
>> > 
>> > It has been tested against Debian rootfs. There is no need to test againt special binary,
>> > but you need NaN2008 hardware such as Loongson 3A4000.
>> 
>> that's just one case, what about NaN2008 binaries on a legacy MIPS CPU ?
>
>  It would be good to check with hard-float QEMU configured for writable 
> FCSR.NAN2008 (which is one way original code was verified) that things 
> have not regressed.  And also what happens if once our emulation has 
> triggered for the unsupported FCSR.NAN2008 mode, an attempt is made to 
> flip the mode bit via ptrace(2), e.g. under GDB, which I reckon our 
> emulation permits for non-legacy CPUs (and which I think should not be 
> allowed under the new setting).

PTrace is working as expected (reflects emulated value).

The actual switchable NaN hardware (M5150, P5600) uses a dedicated Config7
bit rather than writable FCSR.NAN2008 to control NaN2008 mode. This is undocumented
and not present on some RTL releases. FCSR.NAN2008 is R/O as per The MIPS32 Instruction
Set Manual. This renders the purposed test pointless.

That being said, I'll catch some time later to test behaviour with purposed QEMU modification
but I think it's good to go now. 

Thanks
- Jiaxun

>
>   Maciej

Maciej W. Rozycki July 11, 2024, 10:20 a.m. UTC | #12

On Thu, 11 Jul 2024, Jiaxun Yang wrote:

> >> that's just one case, what about NaN2008 binaries on a legacy MIPS CPU ?
> >
> >  It would be good to check with hard-float QEMU configured for writable 
> > FCSR.NAN2008 (which is one way original code was verified) that things 
> > have not regressed.  And also what happens if once our emulation has 
> > triggered for the unsupported FCSR.NAN2008 mode, an attempt is made to 
> > flip the mode bit via ptrace(2), e.g. under GDB, which I reckon our 
> > emulation permits for non-legacy CPUs (and which I think should not be 
> > allowed under the new setting).
> 
> PTrace is working as expected (reflects emulated value).

 Yes, sure for reads, but how about *writing* to the bit?

> The actual switchable NaN hardware (M5150, P5600) uses a dedicated Config7
> bit rather than writable FCSR.NAN2008 to control NaN2008 mode. This is undocumented
> and not present on some RTL releases. FCSR.NAN2008 is R/O as per The MIPS32 Instruction
> Set Manual. This renders the purposed test pointless.

 Yes, for R6 and arguably R5, but not for R3.  Architecture specification 
revisions 3.50 through 5.02 define FCSR.NAN2008 (and also FCSR.ABS2008) as 
either R/O or R/W, at the implementer's discretion, so it is a conforming 
implementation to have these bits writable and our FPU emulator reflects 
it.  I won't go into the details here as to why the later revisions of the 
specification have been restricted to the R/O implementation only.

 NB architecture specification revisions 3.50 through 5.01 also have the 
FCSR.MAC2008 bit defined, removed altogether later on.

  Maciej

Jiaxun Yang July 12, 2024, 12:36 a.m. UTC | #13

在2024年7月11日七月 下午6:20，Maciej W. Rozycki写道：
> On Thu, 11 Jul 2024, Jiaxun Yang wrote:
>
>> >> that's just one case, what about NaN2008 binaries on a legacy MIPS CPU ?
>> >
>> >  It would be good to check with hard-float QEMU configured for writable 
>> > FCSR.NAN2008 (which is one way original code was verified) that things 
>> > have not regressed.  And also what happens if once our emulation has 
>> > triggered for the unsupported FCSR.NAN2008 mode, an attempt is made to 
>> > flip the mode bit via ptrace(2), e.g. under GDB, which I reckon our 
>> > emulation permits for non-legacy CPUs (and which I think should not be 
>> > allowed under the new setting).
>> 
>> PTrace is working as expected (reflects emulated value).
>
>  Yes, sure for reads, but how about *writing* to the bit?

Tested flipping nan2008 bits with ieee754=emulated with ptrace, it works on some extent.
(flipping the bit to unsupported value immediately triggered emulation).

>
>> The actual switchable NaN hardware (M5150, P5600) uses a dedicated Config7
>> bit rather than writable FCSR.NAN2008 to control NaN2008 mode. This is undocumented
>> and not present on some RTL releases. FCSR.NAN2008 is R/O as per The MIPS32 Instruction
>> Set Manual. This renders the purposed test pointless.
>
>  Yes, for R6 and arguably R5, but not for R3.  Architecture specification 
> revisions 3.50 through 5.02 define FCSR.NAN2008 (and also FCSR.ABS2008) as 
> either R/O or R/W, at the implementer's discretion, so it is a conforming 
> implementation to have these bits writable and our FPU emulator reflects 
> it.  I won't go into the details here as to why the later revisions of the 
> specification have been restricted to the R/O implementation only.
>
>  NB architecture specification revisions 3.50 through 5.01 also have the 
> FCSR.MAC2008 bit defined, removed altogether later on.

Thanks for the information, I don't have access to those manuals so I was unaware
of that. R/W NAN2008 is prohibited by AVP as well.

I briefly tested NaN2008 distro on QEMU modified with r/w NaN2008 bits in ieee754=
strict mode, it seems working fine.

Thanks
>
>   Maciej

Thomas Bogendoerfer July 12, 2024, 11:21 a.m. UTC | #14

On Wed, Jun 12, 2024 at 09:38:19AM +0100, Jiaxun Yang wrote:
> Implement ieee754 NAN2008 emulation mode.
> 
> When this mode is enabled, kernel will accept ELF file
> compiled for both NaN 2008 and NaN legacy, but if hardware
> does not have capability to match ELF's NaN mode, __own_fpu
> will fail for corresponding thread and fpuemu will then kick
> in.
> 
> This mode trade performance for correctness, while maintaining
> support for both NaN mode regardless of hardware capability.
> It is useful for multilib installation that have both types
> of binary exist in system.
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
> ---
> Changes in v3:
> - Fix typo in commit message
> - Collect R-b tags
> - Link to v2: https://lore.kernel.org/r/20240511-mips_ieee754_emul-v2-1-af796ea21ef0@flygoat.com
> 
> Changes in v2:
> - Fix a typo
> - Link to v1: https://lore.kernel.org/r/20240507-mips_ieee754_emul-v1-1-1dc7c0d13cac@flygoat.com
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  4 +++-
>  arch/mips/include/asm/fpu.h                     | 15 +++++++++++++++
>  arch/mips/kernel/elf.c                          |  4 ++++
>  arch/mips/kernel/fpu-probe.c                    |  9 ++++++++-
>  4 files changed, 30 insertions(+), 2 deletions(-)

applied to mips-next.

Thomas.

Maciej W. Rozycki July 12, 2024, 12:22 p.m. UTC | #15

On Fri, 12 Jul 2024, Jiaxun Yang wrote:

> >> >  It would be good to check with hard-float QEMU configured for writable 
> >> > FCSR.NAN2008 (which is one way original code was verified) that things 
> >> > have not regressed.  And also what happens if once our emulation has 
> >> > triggered for the unsupported FCSR.NAN2008 mode, an attempt is made to 
> >> > flip the mode bit via ptrace(2), e.g. under GDB, which I reckon our 
> >> > emulation permits for non-legacy CPUs (and which I think should not be 
> >> > allowed under the new setting).
> >> 
> >> PTrace is working as expected (reflects emulated value).
> >
> >  Yes, sure for reads, but how about *writing* to the bit?
> 
> Tested flipping nan2008 bits with ieee754=emulated with ptrace, it works on some extent.
> (flipping the bit to unsupported value immediately triggered emulation).

 What about the other way round?

 Anyway I think we need to prevent it from happening, matching runtime 
behaviour, i.e. if the program itself cannot flip FCSR.NAN2008 via CTC1, 
then ptrace(2) must not either.  And the same for the emulator in the 
"ieee754=emulated" mode (but not for the emulator modes where the flipping 
is currently permitted), as it would be a one-way switch.

 In other words we need to be consistent and the NaN mode of operation has 
to be strapped in "ieee754=emulated" mode according to ELF file header's 
EF_MIPS_NAN2008 bit for the duration of execution of a given program.

 Likewise FCSR.ABS2008.

> >> The actual switchable NaN hardware (M5150, P5600) uses a dedicated Config7
> >> bit rather than writable FCSR.NAN2008 to control NaN2008 mode. This is undocumented
> >> and not present on some RTL releases. FCSR.NAN2008 is R/O as per The MIPS32 Instruction
> >> Set Manual. This renders the purposed test pointless.
> >
> >  Yes, for R6 and arguably R5, but not for R3.  Architecture specification 
> > revisions 3.50 through 5.02 define FCSR.NAN2008 (and also FCSR.ABS2008) as 
> > either R/O or R/W, at the implementer's discretion, so it is a conforming 
> > implementation to have these bits writable and our FPU emulator reflects 
> > it.  I won't go into the details here as to why the later revisions of the 
> > specification have been restricted to the R/O implementation only.
> >
> >  NB architecture specification revisions 3.50 through 5.01 also have the 
> > FCSR.MAC2008 bit defined, removed altogether later on.
> 
> Thanks for the information, I don't have access to those manuals so I was unaware
> of that. R/W NAN2008 is prohibited by AVP as well.

 There is a mention in revision history at the end of the current document 
anyway, which in this case is perhaps more informative than individual 
documents themselves.

> I briefly tested NaN2008 distro on QEMU modified with r/w NaN2008 bits in ieee754=
> strict mode, it seems working fine.

 Good.

  Maciej

Jiaxun Yang July 14, 2024, 3:11 a.m. UTC | #16

在2024年7月12日七月 下午8:22，Maciej W. Rozycki写道：
> On Fri, 12 Jul 2024, Jiaxun Yang wrote:
>
>> >> >  It would be good to check with hard-float QEMU configured for writable 
>> >> > FCSR.NAN2008 (which is one way original code was verified) that things 
>> >> > have not regressed.  And also what happens if once our emulation has 
>> >> > triggered for the unsupported FCSR.NAN2008 mode, an attempt is made to 
>> >> > flip the mode bit via ptrace(2), e.g. under GDB, which I reckon our 
>> >> > emulation permits for non-legacy CPUs (and which I think should not be 
>> >> > allowed under the new setting).
>> >> 
>> >> PTrace is working as expected (reflects emulated value).
>> >
>> >  Yes, sure for reads, but how about *writing* to the bit?
>> 
>> Tested flipping nan2008 bits with ieee754=emulated with ptrace, it works on some extent.
>> (flipping the bit to unsupported value immediately triggered emulation).
>
>  What about the other way round?

It works on both side (NaN2008 binary with ptrace flipped back to legacy and legacy flipped
back to NaN2008).

>
>  Anyway I think we need to prevent it from happening, matching runtime 
> behaviour, i.e. if the program itself cannot flip FCSR.NAN2008 via CTC1, 
> then ptrace(2) must not either.  And the same for the emulator in the 
> "ieee754=emulated" mode (but not for the emulator modes where the flipping 
> is currently permitted), as it would be a one-way switch.

It is out of the scope of this patch I think. Maybe we need a prctl to set NaN2008 status.

We are unable to prevent user applications write NAN2008 bits for the "switchable
QEMU" as well. So I'd perfer leave it as is, and let this feature go into 6.11 so people
can start to use it.

This is actually a request from Debian MIPS team so they can get glibc tests run on
mismatched NaN hardware.

Thanks
>
>  In other words we need to be consistent and the NaN mode of operation has 
> to be strapped in "ieee754=emulated" mode according to ELF file header's 
> EF_MIPS_NAN2008 bit for the duration of execution of a given program.
>
>  Likewise FCSR.ABS2008.
>
[...]

Maciej W. Rozycki July 15, 2024, 12:15 p.m. UTC | #17

On Sun, 14 Jul 2024, Jiaxun Yang wrote:

> >> >  Yes, sure for reads, but how about *writing* to the bit?
> >> 
> >> Tested flipping nan2008 bits with ieee754=emulated with ptrace, it works on some extent.
> >> (flipping the bit to unsupported value immediately triggered emulation).
> >
> >  What about the other way round?
> 
> It works on both side (NaN2008 binary with ptrace flipped back to legacy and legacy flipped
> back to NaN2008).

 So this is clearly wrong for this scenario.

> >  Anyway I think we need to prevent it from happening, matching runtime 
> > behaviour, i.e. if the program itself cannot flip FCSR.NAN2008 via CTC1, 
> > then ptrace(2) must not either.  And the same for the emulator in the 
> > "ieee754=emulated" mode (but not for the emulator modes where the flipping 
> > is currently permitted), as it would be a one-way switch.
> 
> It is out of the scope of this patch I think. Maybe we need a prctl to 
> set NaN2008 status.

 I don't know what prctl(2) has to do with this.  If you don't implement 
this part, then your change will cause Linux to behave inconsistently and 
therefore I'll have to NAK it.

 It's not much to do anyway, as I have prepared `ptrace_setfcr31' already 
to handle masking correctly, so all you have to do is to set the mask as 
required for the right thing to happen.  I shouldn't have needed to point 
you at it though, as that code is easy to find.

> We are unable to prevent user applications write NAN2008 bits for the "switchable
> QEMU" as well. So I'd perfer leave it as is, and let this feature go into 6.11 so people
> can start to use it.

 This doesn't matter either, as your change only addresses the case where 
FCSR.NAN2008 isn't writable anyway, which is the sole reason you want to 
switch between native hard float support and emulation, doesn't it?

 In fact where FCSR.NAN2008 is writable your new mode has to be equivalent
to "ieee754=strict", because then there is no need to trigger emulation 
for either NaN mode.  Please do verify that this is the case.

> This is actually a request from Debian MIPS team so they can get glibc tests run on
> mismatched NaN hardware.

 That doesn't matter for us here (and I have a bad suspicion anyway), but 
the Debian team is of course free to do what they want here, the GNU GPL 
applies.

 And also they can always use the "nofpu" kernel parameter to run their 
verification.  I used it for mine back at ImgTec before 2008 NaN hardware 
was available, also to verify emulation, which I wrote too.  Perhaps that 
is also the right solution for Debian actually?

  Maciej

Jiaxun Yang July 15, 2024, 12:35 p.m. UTC | #18

在2024年7月15日七月 下午8:15，Maciej W. Rozycki写道：
[..]
>  I don't know what prctl(2) has to do with this.  If you don't implement 
> this part, then your change will cause Linux to behave inconsistently and 
> therefore I'll have to NAK it.

I think your concern was regarding user space application needs to set NaN2008 bits
at runtime?

In this case, the best interface to inform kernel about NaN2008 changes is prctl.

I may misinterpret your comments.

>
>  It's not much to do anyway, as I have prepared `ptrace_setfcr31' already 
> to handle masking correctly, so all you have to do is to set the mask as 
> required for the right thing to happen.  I shouldn't have needed to point 
> you at it though, as that code is easy to find.

I think I got your point, will try to implement it.

>
>> We are unable to prevent user applications write NAN2008 bits for the "switchable
>> QEMU" as well. So I'd perfer leave it as is, and let this feature go into 6.11 so people
>> can start to use it.
>
>  This doesn't matter either, as your change only addresses the case where 
> FCSR.NAN2008 isn't writable anyway, which is the sole reason you want to 
> switch between native hard float support and emulation, doesn't it?
>
>  In fact where FCSR.NAN2008 is writable your new mode has to be equivalent
> to "ieee754=strict", because then there is no need to trigger emulation 
> for either NaN mode.  Please do verify that this is the case.

This had been verified with perf math-emu counters to ensure no unnecessary emulation
is triggered.

>
>> This is actually a request from Debian MIPS team so they can get glibc tests run on
>> mismatched NaN hardware.
>
>  That doesn't matter for us here (and I have a bad suspicion anyway), but 
> the Debian team is of course free to do what they want here, the GNU GPL 
> applies.

We care about our downstream users, don't we?

There are some lags on Debian buildd queue for mips64el due to lack of high performance
hardware with huge memory.
They were about to source some Loongson 3A4000, which is NaN2008 only. But many packages
test cases are failing on it due to NaN2008.
They asked me to help and that was my solution. I sincerely want to get this change upstreamed
to cover some downstream use cases.

I don't know what theory do you have here, but that's all stories behind.

>
>  And also they can always use the "nofpu" kernel parameter to run their 
> verification.  I used it for mine back at ImgTec before 2008 NaN hardware 
> was available, also to verify emulation, which I wrote too.  Perhaps that 
> is also the right solution for Debian actually?

I'll suggest to them, thanks.

>
>   Maciej

Maciej W. Rozycki July 15, 2024, 2:01 p.m. UTC | #19

On Mon, 15 Jul 2024, Jiaxun Yang wrote:

> >  I don't know what prctl(2) has to do with this.  If you don't implement 
> > this part, then your change will cause Linux to behave inconsistently and 
> > therefore I'll have to NAK it.
> 
> I think your concern was regarding user space application needs to set NaN2008 bits
> at runtime?

 Nope, following the objective of your change: the EF_MIPS_NAN2008 ELF 
file header flag instructs the kernel to choose between hardware and 
emulated hard float and that's not supposed to change later on throughout 
the life of the program, because it's not something the program can do 
itself, because writes to FCSR.NAN2008 are ignored by hardware.  And it's 
not a functional regression, because flipping FCSR.NAN2008 isn't allowed 
by hardware concerned anyway, we just want to have it consistent including 
the debugger interface.

> >  It's not much to do anyway, as I have prepared `ptrace_setfcr31' already 
> > to handle masking correctly, so all you have to do is to set the mask as 
> > required for the right thing to happen.  I shouldn't have needed to point 
> > you at it though, as that code is easy to find.
> 
> I think I got your point, will try to implement it.

 Thank you.

> >  This doesn't matter either, as your change only addresses the case where 
> > FCSR.NAN2008 isn't writable anyway, which is the sole reason you want to 
> > switch between native hard float support and emulation, doesn't it?
> >
> >  In fact where FCSR.NAN2008 is writable your new mode has to be equivalent
> > to "ieee754=strict", because then there is no need to trigger emulation 
> > for either NaN mode.  Please do verify that this is the case.
> 
> This had been verified with perf math-emu counters to ensure no unnecessary emulation
> is triggered.

 Thanks.

> >  That doesn't matter for us here (and I have a bad suspicion anyway), but 
> > the Debian team is of course free to do what they want here, the GNU GPL 
> > applies.
> 
> We care about our downstream users, don't we?

 There is a balance for us to keep.  Requests made have to be reasonable
and code contributed has to be architected well and meet quality criteria.  
Every change carries its associated cost and especially with the limited 
manpower available we can't afford having a technical debt created.  Any 
unclean piece of code accepted will strike us back sooner or later.

> They asked me to help and that was my solution. I sincerely want to get this change upstreamed
> to cover some downstream use cases.

 If it's your own genuine from-scratch implementation, then I have more 
faith in it.

> I don't know what theory do you have here, but that's all stories behind.

 I have seen odd requests and code changes stemming from embarassing lack 
of understanding how things work with the MIPS architecture.

> >  And also they can always use the "nofpu" kernel parameter to run their 
> > verification.  I used it for mine back at ImgTec before 2008 NaN hardware 
> > was available, also to verify emulation, which I wrote too.  Perhaps that 
> > is also the right solution for Debian actually?
> 
> I'll suggest to them, thanks.

 I note that it's been like this since 2015 and it has been documented:

	ieee754=	[MIPS] Select IEEE Std 754 conformance mode
			Format: { strict | legacy | 2008 | relaxed }
			Default: strict
[...]
			The FPU emulator is always able to support both NaN
			encodings, so if no FPU hardware is present or it has
			been disabled with 'nofpu', then the settings of
			'legacy' and '2008' strap the emulator accordingly,
			'relaxed' straps the emulator for both legacy-NaN and
			2008-NaN, whereas 'strict' enables legacy-NaN only on
			legacy processors and both NaN encodings on MIPS32 or
			MIPS64 CPUs.

(see the part following the last comma in particular).  It usually makes 
sense to read documentation and I'd expect MIPS Debian port experts to do 
it from time to time.

  Maciej

YunQiang Su July 15, 2024, 4:38 p.m. UTC | #20

Maciej W. Rozycki <macro@orcam.me.uk> 于2024年7月15日周一 22:02写道：
>
> On Mon, 15 Jul 2024, Jiaxun Yang wrote:
>
> > >  I don't know what prctl(2) has to do with this.  If you don't implement
> > > this part, then your change will cause Linux to behave inconsistently and
> > > therefore I'll have to NAK it.
> >
> > I think your concern was regarding user space application needs to set NaN2008 bits
> > at runtime?
>
>  Nope, following the objective of your change: the EF_MIPS_NAN2008 ELF
> file header flag instructs the kernel to choose between hardware and
> emulated hard float and that's not supposed to change later on throughout
> the life of the program, because it's not something the program can do
> itself, because writes to FCSR.NAN2008 are ignored by hardware.  And it's
> not a functional regression, because flipping FCSR.NAN2008 isn't allowed
> by hardware concerned anyway, we just want to have it consistent including
> the debugger interface.
>
> > >  It's not much to do anyway, as I have prepared `ptrace_setfcr31' already
> > > to handle masking correctly, so all you have to do is to set the mask as
> > > required for the right thing to happen.  I shouldn't have needed to point
> > > you at it though, as that code is easy to find.
> >
> > I think I got your point, will try to implement it.
>
>  Thank you.
>
> > >  This doesn't matter either, as your change only addresses the case where
> > > FCSR.NAN2008 isn't writable anyway, which is the sole reason you want to
> > > switch between native hard float support and emulation, doesn't it?
> > >
> > >  In fact where FCSR.NAN2008 is writable your new mode has to be equivalent
> > > to "ieee754=strict", because then there is no need to trigger emulation
> > > for either NaN mode.  Please do verify that this is the case.
> >
> > This had been verified with perf math-emu counters to ensure no unnecessary emulation
> > is triggered.
>
>  Thanks.
>
> > >  That doesn't matter for us here (and I have a bad suspicion anyway), but
> > > the Debian team is of course free to do what they want here, the GNU GPL
> > > applies.
> >
> > We care about our downstream users, don't we?
>
>  There is a balance for us to keep.  Requests made have to be reasonable
> and code contributed has to be architected well and meet quality criteria.
> Every change carries its associated cost and especially with the limited
> manpower available we can't afford having a technical debt created.  Any
> unclean piece of code accepted will strike us back sooner or later.
>
> > They asked me to help and that was my solution. I sincerely want to get this change upstreamed
> > to cover some downstream use cases.
>
>  If it's your own genuine from-scratch implementation, then I have more
> faith in it.
>
> > I don't know what theory do you have here, but that's all stories behind.
>
>  I have seen odd requests and code changes stemming from embarassing lack
> of understanding how things work with the MIPS architecture.
>

I won't think that your reply makes any sense since you have a typo here.  ;)

There is nothing in the world perfect I guess, while we need to
continue our life
and there are always new problems needing to be solved in our life.

Once the problem happens, we need a solution for it.
I think that a solution is a good one if it
    won't break something more
    we cannot forecast it will break something more in future
    It won't be very hard to rollback or drop it in future

I don't think that it is a good behavioral pattern to say "let's
destroy the world
since it is not perfect".  Anyway we live in a world full of tradeoffs.

> > >  And also they can always use the "nofpu" kernel parameter to run their
> > > verification.  I used it for mine back at ImgTec before 2008 NaN hardware
> > > was available, also to verify emulation, which I wrote too.  Perhaps that
> > > is also the right solution for Debian actually?
> >
> > I'll suggest to them, thanks.
>
>  I note that it's been like this since 2015 and it has been documented:
>
>         ieee754=        [MIPS] Select IEEE Std 754 conformance mode
>                         Format: { strict | legacy | 2008 | relaxed }
>                         Default: strict
> [...]
>                         The FPU emulator is always able to support both NaN
>                         encodings, so if no FPU hardware is present or it has
>                         been disabled with 'nofpu', then the settings of
>                         'legacy' and '2008' strap the emulator accordingly,
>                         'relaxed' straps the emulator for both legacy-NaN and
>                         2008-NaN, whereas 'strict' enables legacy-NaN only on
>                         legacy processors and both NaN encodings on MIPS32 or
>                         MIPS64 CPUs.
>
> (see the part following the last comma in particular).  It usually makes
> sense to read documentation and I'd expect MIPS Debian port experts to do
> it from time to time.
>
>   Maciej
>

Thomas Bogendoerfer Aug. 22, 2024, 8:02 a.m. UTC | #21

On Mon, Jul 15, 2024 at 08:35:21PM +0800, Jiaxun Yang wrote:
> 在2024年7月15日七月 下午8:15，Maciej W. Rozycki写道：
> >  It's not much to do anyway, as I have prepared `ptrace_setfcr31' already 
> > to handle masking correctly, so all you have to do is to set the mask as 
> > required for the right thing to happen.  I shouldn't have needed to point 
> > you at it though, as that code is easy to find.
> 
> I think I got your point, will try to implement it.

any news about this ?

Thomas.

Jiaxun Yang Aug. 22, 2024, 9:20 a.m. UTC | #22

在2024年8月22日八月 上午9:02，Thomas Bogendoerfer写道：
> On Mon, Jul 15, 2024 at 08:35:21PM +0800, Jiaxun Yang wrote:
>> 在2024年7月15日七月 下午8:15，Maciej W. Rozycki写道：
>> >  It's not much to do anyway, as I have prepared `ptrace_setfcr31' already 
>> > to handle masking correctly, so all you have to do is to set the mask as 
>> > required for the right thing to happen.  I shouldn't have needed to point 
>> > you at it though, as that code is easy to find.
>> 
>> I think I got your point, will try to implement it.
>
> any news about this ?
Hi Thomas,

Sorry, I implemented it but was unsure that my test coverage was ideal as I was
unable to access my usual switchable NaN hardware (FPGA based on interAptiv MR3)
for testing.

Will post patch after gain access to that again. Hardware ETA 10 Sept.

Thanks
- Jiaxun

>
> Thomas.
>
> -- 
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea.                                                [ RFC1925, 2.3 ]

[v3] MIPS: Implement ieee754 NAN2008 emulation mode

Commit Message

Comments

Patch