fix fdiv instruction

Message ID	20180623022258.22158-1-programmingkidx@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org> From: John Arbuckle <programmingkidx@gmail.com> To: david@gibson.dropbear.id.au, qemu-devel@nongnu.org, qemu-ppc@nongnu.org Date: Fri, 22 Jun 2018 22:22:58 -0400 Message-Id: <20180623022258.22158-1-programmingkidx@gmail.com> Subject: [Qemu-devel] [PATCH] fix fdiv instruction Precedence: list Cc: John Arbuckle <programmingkidx@gmail.com> Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>

Programmingkid June 23, 2018, 2:22 a.m. UTC

When the fdiv instruction divides a finite number by zero,
the result actually depends on the FPSCR[ZE] bit. If this
bit is set, the return value is zero. If it is not set
the result should be either positive or negative infinity.
The sign of this result would depend on the sign of the
two inputs. What currently happens is only infinity is
returned even if the FPSCR[ZE] bit is set. This patch
fixes this problem by actually checking the FPSCR[ZE] bit
when deciding what the answer should be.

fdiv is suppose to only set the FPSCR's FPRF bits during a
division by zero situation when the FPSCR[ZE] is not set.
What currently happens is these bits are always set. This
patch fixes this problem by checking the FPSCR[ZE] bit to
decide if the FPRF bits should be set. 

https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
This document has the information on the fdiv. Page 133 has the information on what action is executed when a division by zero situation takes place. 

Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
---
 target/ppc/fpu_helper.c            | 16 ++++++++++++++++
 target/ppc/translate/fp-impl.inc.c | 28 +++++++++++++++++++++++++++-
 2 files changed, 43 insertions(+), 1 deletion(-)

no-reply@patchew.org June 23, 2018, 4:54 a.m. UTC | #1

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180623022258.22158-1-programmingkidx@gmail.com
Subject: [Qemu-devel] [PATCH] fix fdiv instruction

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/1529716721-28400-1-git-send-email-linux@roeck-us.net -> patchew/1529716721-28400-1-git-send-email-linux@roeck-us.net
 * [new tag]               patchew/20180623022258.22158-1-programmingkidx@gmail.com -> patchew/20180623022258.22158-1-programmingkidx@gmail.com
Switched to a new branch 'test'
9795f56c24 fix fdiv instruction

=== OUTPUT BEGIN ===
Checking PATCH 1/1: fix fdiv instruction...
ERROR: Macros with multiple statements should be enclosed in a do - while loop
#91: FILE: target/ppc/translate/fp-impl.inc.c:108:
+#define GEN_FLOAT_DIV(name, op2, inval, set_fprf, type)                       \
+_GEN_FLOAT_DIV(name, name, 0x3F, op2, inval, 0, set_fprf, type);              \
+_GEN_FLOAT_DIV(name##s, name, 0x3B, op2, inval, 1, set_fprf, type);

total: 1 errors, 0 warnings, 68 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Richard Henderson June 23, 2018, 4:17 p.m. UTC | #2

On 06/22/2018 07:22 PM, John Arbuckle wrote:
> When the fdiv instruction divides a finite number by zero,
> the result actually depends on the FPSCR[ZE] bit. If this
> bit is set, the return value is zero. If it is not set
> the result should be either positive or negative infinity.
> The sign of this result would depend on the sign of the
> two inputs. What currently happens is only infinity is
> returned even if the FPSCR[ZE] bit is set. This patch
> fixes this problem by actually checking the FPSCR[ZE] bit
> when deciding what the answer should be.
> 
> fdiv is suppose to only set the FPSCR's FPRF bits during a
> division by zero situation when the FPSCR[ZE] is not set.
> What currently happens is these bits are always set. This
> patch fixes this problem by checking the FPSCR[ZE] bit to
> decide if the FPRF bits should be set. 
> 
> https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
> This document has the information on the fdiv. Page 133 has the information on what action is executed when a division by zero situation takes place. 
> 
> Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
> ---
>  target/ppc/fpu_helper.c            | 16 ++++++++++++++++
>  target/ppc/translate/fp-impl.inc.c | 28 +++++++++++++++++++++++++++-
>  2 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 7714bfe0f9..de694604fb 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -658,6 +658,20 @@ uint64_t helper_fdiv(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
>      } else if (unlikely(float64_is_zero(farg1.d) && float64_is_zero(farg2.d))) {
>          /* Division of zero by zero */
>          farg1.ll = float_invalid_op_excp(env, POWERPC_EXCP_FP_VXZDZ, 1);
> +    } else if (arg2 == 0) {
> +        /* Division by zero */
> +        float_zero_divide_excp(env, GETPC());
> +        if (fpscr_ze) { /* if zero divide exception is enabled */
> +            farg1.ll = 0;
> +        } else {
> +            uint64_t sign = (farg1.ll ^ farg2.ll) >> 63;
> +            if (sign) { /* Negative sign bit */
> +                farg1.ll = 0xfff0000000000000; /* Negative Infinity */
> +            } else { /* Positive sign bit */
> +                farg1.ll = 0x7ff0000000000000; /* Positive Infinity */
> +            }
> +            helper_compute_fprf_float64(env, farg1.d);

I don't believe you.
(1) This is against IEEE spec,
(2) There is nothing about this zero result in the Power manual,
(3) I do not replicate this experimentally.

#include <signal.h>
#include <stdio.h>
#include <fenv.h>
#include <stdlib.h>

void handle(int sig, siginfo_t *info, void *x)
{
    ucontext_t *u = x;
    printf("%f\n", u->uc_mcontext.fp_regs[0]);
    exit(0);
}

int main()
{
  struct sigaction a = { .sa_sigaction = handle, .sa_flags = SA_SIGINFO };
  sigaction(SIGFPE, &a, NULL);
  feenableexcept(FE_ALL_EXCEPT);

  {
    register double f0 __asm__("32") = 2;
    register double f1 __asm__("33") = 1;
    register double f2 __asm__("34") = 0;
    __asm__ volatile ("fdiv %0,%1,%2" : "+f"(f0) : "f"(f1), "f"(f2));
  }
  return 1;
}

Produces the expected 2.0, i.e. the destination register is unmodified.
Without feenableexcept, of course, the normal infinity is produced.


r~

Programmingkid June 23, 2018, 8:17 p.m. UTC | #3

> On Jun 23, 2018, at 12:17 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On 06/22/2018 07:22 PM, John Arbuckle wrote:
>> When the fdiv instruction divides a finite number by zero,
>> the result actually depends on the FPSCR[ZE] bit. If this
>> bit is set, the return value is zero. If it is not set
>> the result should be either positive or negative infinity.
>> The sign of this result would depend on the sign of the
>> two inputs. What currently happens is only infinity is
>> returned even if the FPSCR[ZE] bit is set. This patch
>> fixes this problem by actually checking the FPSCR[ZE] bit
>> when deciding what the answer should be.
>> 
>> fdiv is suppose to only set the FPSCR's FPRF bits during a
>> division by zero situation when the FPSCR[ZE] is not set.
>> What currently happens is these bits are always set. This
>> patch fixes this problem by checking the FPSCR[ZE] bit to
>> decide if the FPRF bits should be set. 
>> 
>> https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
>> This document has the information on the fdiv. Page 133 has the information on what action is executed when a division by zero situation takes place. 
>> 
>> Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
>> ---
>> target/ppc/fpu_helper.c            | 16 ++++++++++++++++
>> target/ppc/translate/fp-impl.inc.c | 28 +++++++++++++++++++++++++++-
>> 2 files changed, 43 insertions(+), 1 deletion(-)
>> 
>> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
>> index 7714bfe0f9..de694604fb 100644
>> --- a/target/ppc/fpu_helper.c
>> +++ b/target/ppc/fpu_helper.c
>> @@ -658,6 +658,20 @@ uint64_t helper_fdiv(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
>>     } else if (unlikely(float64_is_zero(farg1.d) && float64_is_zero(farg2.d))) {
>>         /* Division of zero by zero */
>>         farg1.ll = float_invalid_op_excp(env, POWERPC_EXCP_FP_VXZDZ, 1);
>> +    } else if (arg2 == 0) {
>> +        /* Division by zero */
>> +        float_zero_divide_excp(env, GETPC());
>> +        if (fpscr_ze) { /* if zero divide exception is enabled */
>> +            farg1.ll = 0;
>> +        } else {
>> +            uint64_t sign = (farg1.ll ^ farg2.ll) >> 63;
>> +            if (sign) { /* Negative sign bit */
>> +                farg1.ll = 0xfff0000000000000; /* Negative Infinity */
>> +            } else { /* Positive sign bit */
>> +                farg1.ll = 0x7ff0000000000000; /* Positive Infinity */
>> +            }
>> +            helper_compute_fprf_float64(env, farg1.d);
> 
> I don't believe you.
> (1) This is against IEEE spec,

I'm trying to implement IBM PowerPC specs. 

> (2) There is nothing about this zero result in the Power manual,

This is for PowerPC. Power and PowerPC are cousins to each other rather than having a child-parent relationship. Yes there are a lot of similar instructions between them, this does not mean they are compatible with each other.  

> (3) I do not replicate this experimentally.

I used a G3 iMac and a G5 iMac to do my testing.

> 
> #include <signal.h>
> #include <stdio.h>
> #include <fenv.h>
> #include <stdlib.h>
> 
> void handle(int sig, siginfo_t *info, void *x)
> {
>    ucontext_t *u = x;
>    printf("%f\n", u->uc_mcontext.fp_regs[0]);
>    exit(0);
> }
> 
> int main()
> {
>  struct sigaction a = { .sa_sigaction = handle, .sa_flags = SA_SIGINFO };
>  sigaction(SIGFPE, &a, NULL);
>  feenableexcept(FE_ALL_EXCEPT);

This is C99 code. There are a lot of floating point bugs with this implementation. I suggest all future testing be done using PowerPC assembly language only.

> 
>  {
>    register double f0 __asm__("32") = 2;
>    register double f1 __asm__("33") = 1;
>    register double f2 __asm__("34") = 0;
>    __asm__ volatile ("fdiv %0,%1,%2" : "+f"(f0) : "f"(f1), "f"(f2));
>  }
>  return 1;
> }
> 
> Produces the expected 2.0, i.e. the destination register is unmodified.
> Without feenableexcept, of course, the normal infinity is produced.

What compiler did you use to compile this program?

What operating system did you run this program on?

What are the specs of the system you used to test this program on?

I made a complete floating point test program that I could send you if you want to see how I test things.

Thank you.

Richard Henderson June 24, 2018, 4:18 a.m. UTC | #4

On 06/23/2018 01:17 PM, Programmingkid wrote:
>>> https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
>>> This document has the information on the fdiv. Page 133 has the information on what action is executed when a division by zero situation takes place. 

Even in your referenced PDF, table 3-13, it says that frD is unmodified.

>>  struct sigaction a = { .sa_sigaction = handle, .sa_flags = SA_SIGINFO };
>>  sigaction(SIGFPE, &a, NULL);
>>  feenableexcept(FE_ALL_EXCEPT);
> 
> This is C99 code. There are a lot of floating point bugs with this implementation. I suggest all future testing be done using PowerPC assembly language only.

Um.. have you really ever seen an implementation that won't set ZE?

> What compiler did you use to compile this program?

gcc 7.2.

> What operating system did you run this program on?

CentOS 7, so kernel 3.10, glibc 2.17.

> What are the specs of the system you used to test this program on?

CHRP IBM,8231-E2B (Power7).


r~

Programmingkid June 24, 2018, 1:46 p.m. UTC | #5

> On Jun 24, 2018, at 12:18 AM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On 06/23/2018 01:17 PM, Programmingkid wrote:
>>>> https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
>>>> This document has the information on the fdiv. Page 133 has the information on what action is executed when a division by zero situation takes place. 
> 
> Even in your referenced PDF, table 3-13, it says that frD is unmodified.

Actually it says when FPSCR[ZE] is set is when frD is unmodified. When FPSCR[ZE] is not set it frD's sign is determined by an XOR of the signs of the operands. I have verified that this is what happens on real PowerPC 750 and 950 CPUs. 

> 
>>> struct sigaction a = { .sa_sigaction = handle, .sa_flags = SA_SIGINFO };
>>> sigaction(SIGFPE, &a, NULL);
>>> feenableexcept(FE_ALL_EXCEPT);
>> 
>> This is C99 code. There are a lot of floating point bugs with this implementation. I suggest all future testing be done using PowerPC assembly language only.
> 
> Um.. have you really ever seen an implementation that won't set ZE?

I do know that when I did try to use C99 floating point code I saw a lot of problems. Bypassing any library issues by directly setting the FPSCR is best for testing. Table 2-4 in the pdf has information on the FPSCR. 

> 
>> What compiler did you use to compile this program?
> 
> gcc 7.2.
> 
>> What operating system did you run this program on?
> 
> CentOS 7, so kernel 3.10, glibc 2.17.
> 
>> What are the specs of the system you used to test this program on?
> 
> CHRP IBM,8231-E2B (Power7).

I was just wondering are you able to run a PowerPC operating system in QEMU in KVM mode on this Power7 computer?

Thank you.

Richard Henderson June 24, 2018, 6:30 p.m. UTC | #6

On 06/24/2018 06:46 AM, Programmingkid wrote:
>> Even in your referenced PDF, table 3-13, it says that frD is unmodified.
> 
> Actually it says when FPSCR[ZE] is set is when frD is unmodified. When FPSCR[ZE] is not set it frD's sign is determined by an XOR of the signs of the operands. I have verified that this is what happens on real PowerPC 750 and 950 CPUs. 


Of course.  When ZE is not set, 1 / 0 -> inf (and -1 / 0 -> -inf, etc).
But ZE not set is not the topic of discussion, or the subject of your patch.


> I was just wondering are you able to run a PowerPC operating system in QEMU
> in KVM mode on this Power7 computer?

Sadly not.  KVM is not enabled in this setup.


r~

Programmingkid June 24, 2018, 6:38 p.m. UTC | #7

> On Jun 24, 2018, at 2:30 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On 06/24/2018 06:46 AM, Programmingkid wrote:
>>> Even in your referenced PDF, table 3-13, it says that frD is unmodified.
>> 
>> Actually it says when FPSCR[ZE] is set is when frD is unmodified. When FPSCR[ZE] is not set it frD's sign is determined by an XOR of the signs of the operands. I have verified that this is what happens on real PowerPC 750 and 950 CPUs. 
> 
> 
> Of course.  When ZE is not set, 1 / 0 -> inf (and -1 / 0 -> -inf, etc).
> But ZE not set is not the topic of discussion, or the subject of your patch.

I do believe the ZE is closely tied to the fdiv instruction, and that fixing the fdiv instruction does involve the ZE bit. 

>> I was just wondering are you able to run a PowerPC operating system in QEMU
>> in KVM mode on this Power7 computer?
> 
> Sadly not.  KVM is not enabled in this setup.

Ok. Thanks for letting me know. 

Attached is my floating point test program. It is made for PowerPC. Does this program run on your system? If so, could you send me the results?
/**************************************************************************************
 * File: main.c
 * Date: 4-30-2017
 * Description: Implement a test program for various floating point instructions.
 * Note: tests made to work with PowerPC G3 and G5 only.
 * Compiling on Mac OS X: use gcc-3.3 -force_cpusubtype_ALL
 * Note: fsqrt test will not work on PowerPC G3.
 * Version: 2
 **************************************************************************************/

#include <stdio.h>
#include <stdint.h>
#include <math.h>
#include <float.h>
#include <inttypes.h>

// Used to convert unsigned integer <--> double 
union Converter
{
    double d;   // double
    uint64_t i; // integer
};

typedef union Converter Converter;
 
/* Describes the name and description of each bit of the FPSCR */
struct fpscr_info
{
    char name[8];
    char description[100];
};

struct fpscr_info finfo[] = 
{
    "FX", "Floating-point exception summary",
    "FEX", "Floating-point enabled exception summary",
    "VX", "Floating-point invalid operation exception summary",
    "OX", "Floating-point overflow exception",
    "UX", "Floating-point underflow exception",
    "ZX", "Floating-point zero divide exception",
    "XX", "Floating-point inexact exception",
    "VXSNAN", "Floating-point invalid operation exception for SNaN",
    "VXISI", "Floating-point invalid operation exception for ∞ - ∞",
    "VXIDI", "Floating-point invalid operation exception for ∞/∞",
    "VXZDZ", "Floating-point invalid operation exception for 0/0",
    "VXIMZ", "Floating-point invalid operation exception for ∞ * 0",
    "VXVC", "Floating-point invalid operation exception for invalid compare",
    "FR", "Floating-point fraction rounded",
    "FI", "Floating-point fraction inexact",
    "FPRF", "Floating-point result class descriptor ",
    "FPRF", "Floating-point less than or negative",
    "FPRF", "Floating-point greater than or positive",
    "FPRF", "Floating-point equal or zero",
    "FPRF", "Floating-point unordered or NaN",
    "NO NAME", "Reserved - you shouldn't be seeing this",
    "VXSOFT", "Floating-point invalid operation exception for software request",
    "VXSQRT", "Floating-point invalid operation exception for invalid square root",
    "VXCVI", "Floating-point invalid operation exception for invalid integer convert",
    "VE", "Floating-point invalid operation exception enable",
    "OE", "IEEE floating-point overflow exception enable",
    "UE", "IEEE floating-point underflow exception enable",
    "ZE", "IEEE floating-point zero divide exception enable",
    "XE", "Floating-point inexact exception enable",
    "NI", "Floating-point non-IEEE mode",
    "RN", "Rounding bit 0",
    "RN", "Rounding bit 1",
};
 
// Prints all the FPSCR settings that are set in the input
void print_fpscr_settings(uint32_t fpscr)
{
    int i;
    for (i = 0; i < 32; i++) {
        if (((fpscr >> i) & 0x1) == 1) {
            /* right description = 31 - i  Oddity of IBM documentation */
            printf("bit %d: %s - %s\n", 31-i, finfo[31-i].name, finfo[31-i].description);
        }
    }
}


#define ZE 27
#define set_fpscr_bit(x) asm volatile ("mtfsb1 %0" : : "i"(x))

/* Keeps track of the number of tests that failed */
int failed_tests = 0;

// Reset the FPSCR
void reset_fpscr()
{
    asm volatile("mtfsb0 0");
    asm volatile("mtfsb0 1");
    asm volatile("mtfsb0 2");
    asm volatile("mtfsb0 3");
    asm volatile("mtfsb0 4");
    asm volatile("mtfsb0 5");
    asm volatile("mtfsb0 6");
    asm volatile("mtfsb0 7");
    asm volatile("mtfsb0 8");
    asm volatile("mtfsb0 9");
    asm volatile("mtfsb0 10");
    asm volatile("mtfsb0 11");
    asm volatile("mtfsb0 12");
    asm volatile("mtfsb0 13");
    asm volatile("mtfsb0 14");
    asm volatile("mtfsb0 15");
    asm volatile("mtfsb0 16");
    asm volatile("mtfsb0 17");
    asm volatile("mtfsb0 18");
    asm volatile("mtfsb0 19");
    asm volatile("mtfsb0 20");
    asm volatile("mtfsb0 21");
    asm volatile("mtfsb0 22");
    asm volatile("mtfsb0 23");
    asm volatile("mtfsb0 24");
    asm volatile("mtfsb0 25");
    asm volatile("mtfsb0 26");
    asm volatile("mtfsb0 27");
    asm volatile("mtfsb0 28");
    asm volatile("mtfsb0 29");
    asm volatile("mtfsb0 30");
    asm volatile("mtfsb0 31");
    
    /* Check if everything is alright */
    uint32_t fpscr;
    Converter c;
    asm volatile("mffs %0" : "=f"(c.d));
    fpscr = (uint32_t)c.i;
    if (fpscr != 0) {
        printf("Warning: fpscr not equal to zero: 0x%x\n", fpscr);
        print_fpscr_settings(fpscr);
    }
}

/*
 * The action to take if a test fails
 * Input one: message string
 * Input two: actual fpscr value
 * Input three: expected fpscr value
 * Input four: actual answer
 * Input five: expected answer
 */
 void test_failed(const char *message, uint32_t actual_fpscr, uint32_t expected_fpscr,
                  uint64_t actual_answer, uint64_t expected_answer)
 {
    printf("%s\n", message);
    printf("expected answer: 0x%" PRIx64 " (%f)\n", expected_answer, expected_answer);
    printf("  actual answer: 0x%" PRIx64 " (%f)\n", actual_answer, actual_answer);
    printf("expected fpscr: 0x%x\n", expected_fpscr);
    printf("  actual fpscr: 0x%x\n", actual_fpscr);
    
    /* Only print FPSCR bits if there is a difference */
    if (actual_fpscr != expected_fpscr) {
        printf("\nactual FPSCR bits set:\n");
        print_fpscr_settings(actual_fpscr);
        printf("\nexpected FPSCR bits set:\n");
        print_fpscr_settings(expected_fpscr);
    }
    printf("\n");
    failed_tests++;
 }

/*
 * Returns the value of the FPSCR
 * output: unsigned 32 bit integer
 */
uint32_t get_fpscr()
{
    asm volatile("mffs f0");
    asm volatile("stfd f0, 40(r1)");
    uint32_t return_value;
    asm volatile("lwz %0, 44(r1)" : "=r"(return_value));
    return return_value;
}


/* The fpscr and answer have to be right for a test to pass. */


/* Test the fadd instruction */
void test_fadd()
{
    Converter c;
    uint64_t expected_answer = 0x3ff3333333333334;
    uint32_t actual_fpscr, expected_fpscr = 0x82064000;
    reset_fpscr();    
    asm volatile("fadd %0, %1, %2" : "=f"(c.d) : "f"(0.4), "f"(0.8));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fadd test passed\n");
    } else {
        test_failed("fadd test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}
 
/* Test the fadds instruction */
void test_fadds()
{
    Converter c;
    uint64_t expected_answer = 0x407024d500000000;
    uint32_t actual_fpscr, expected_fpscr = 0x82064000;
    reset_fpscr();    
    asm volatile("fadds %0, %1, %2" : "=f"(c.d) : "f"(257.445), "f"(0.857));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fadds test passed\n");
    } else {
        test_failed("fadds test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fsub instruction */
void test_fsub()
{
    Converter c;
    uint64_t expected_answer = 0x40f2fd1deb11c6d2;
    uint32_t actual_fpscr, expected_fpscr = 0x4000;
    reset_fpscr();
    asm volatile("fsub %0, %1, %2" : "=f"(c.d) : "f"(123456.78), "f"(45678.91011));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fsub test passed\n");
    } else {
        test_failed("fsub test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fsubs instruction */
void test_fsubs()
{
    Converter c;
    double expected_answer = 0x40884e3d70a3d70a;
    uint32_t actual_fpscr, expected_fpscr = 0x4000;
    reset_fpscr();
    asm volatile("fsub %0, %1, %2" : "=f"(c.d) : "f"(1234.56), "f"(456.78));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fsubs test passed\n");
    } else {
        test_failed("fsubs test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test infinity - infinity */
void test_inf_minus_inf()
{
    Converter c;
    uint64_t expected_answer = 0x7ff8000000000000;
    uint32_t actual_fpscr, expected_fpscr = 0xa0811000;
    reset_fpscr();
    asm volatile("fsub %0, %1, %1" : "=f"(c.d) : "f"(INFINITY));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("inf - inf test passed\n");
    } else {
        test_failed("inf - inf test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test division by zero */
void test_division_by_zero()
{
    Converter c;
    uint64_t expected_answer = 0x0;
    uint32_t actual_fpscr, expected_fpscr = 0xc4000010;
    reset_fpscr();
    set_fpscr_bit(ZE);
    asm volatile("fdiv %0, %1, %2" : "=f"(c.d) : "f"(1.0), "f"(0.0));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("division by zero test passed\n");
    } else {
        test_failed("division by zero test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test zero divided by zero */
void test_zero_div_zero()
{
    Converter c;
    uint64_t expected_answer = 0x7ff8000000000000;
    uint32_t actual_fpscr, expected_fpscr = 0xa0211010;
    reset_fpscr();
    set_fpscr_bit(ZE);
    asm volatile("fdiv %0, %1, %1" : "=f"(c.d) : "f"(0.0), "f"(0.0));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("0/0 test passed\n");
    } else {
        test_failed("0/0 test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test infinity divided by infinity */
void test_inf_div_inf()
{
    Converter c;
    uint64_t expected_answer = 0x7ff8000000000000;
    uint32_t actual_fpscr, expected_fpscr = 0xa0411000;
    reset_fpscr();
    asm volatile("fdiv %0, %1, %1" : "=f"(c.d) : "f"(INFINITY));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("inf/inf test passed\n");
    } else {
        test_failed("inf/inf test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}


/* Test the fdiv instruction */
void test_fdiv()
{
    Converter c;
    uint64_t expected_answer = 0x40059f38ee13b48b;
    uint32_t actual_fpscr, expected_fpscr = 0x82064000;
    reset_fpscr();
    asm volatile("fdiv %0, %1, %2" : "=f"(c.d) : "f"(1234.56), "f"(456.78));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fdiv test passed\n");
    } else {
        test_failed("fdiv test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fdivs instruction */
void test_fdivs()
{
    Converter c;
    uint64_t expected_answer = 0x40059f38e0000000;
    uint32_t actual_fpscr, expected_fpscr = 0x82024000;
    reset_fpscr();
    asm volatile("fdivs %0, %1, %2" : "=f"(c.d) : "f"(1234.56), "f"(456.78));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fdivs test passed\n");
    } else {
        test_failed("fdivs test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fmul instruction */
void test_fmul()
{
    Converter c;
    uint64_t expected_answer = 0x40365c28f5c28f5c;
    uint32_t actual_fpscr, expected_fpscr = 0x82024000;
    reset_fpscr();
    asm volatile("fmul %0, %1, %2" : "=f"(c.d) : "f"(5.2), "f"(4.3));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fmul test passed\n");
    } else {
        test_failed("fmul test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fmuls instruction */
void test_fmuls()
{
    Converter c;
    uint64_t expected_answer = 0x412135a4a0000000;
    uint32_t actual_fpscr, expected_fpscr = 0x82024000;
    reset_fpscr();
    asm volatile("fmuls %0, %1, %2" : "=f"(c.d) : "f"(1234.56), "f"(456.78));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fmuls test passed\n");
    } else {
        test_failed("fmuls test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fmul instruction */
void test_inf_times_zero()
{
    Converter c;
    uint64_t expected_answer = 0x7ff8000000000000;
    uint32_t actual_fpscr, expected_fpscr = 0xa0111000;
    reset_fpscr();
    asm volatile("fmul %0, %1, %2" : "=f"(c.d) : "f"(INFINITY), "f"(0.0));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("inf * 0 test passed\n");
    } else {
        test_failed("inf * 0 test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fmadd instruction */
void test_fmadd()
{
    Converter c;
    uint64_t expected_answer = 0x4123fcaadfa43fe5;
    uint32_t actual_fpscr, expected_fpscr = 0x82024000;
    reset_fpscr();
    asm volatile("fmadd %0, %1, %2, %3" : "=f"(c.d) : "f"(1234.56), "f"(456.78), "f"(91011.12));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fmadd test passed\n");
    } else {
        test_failed("fmadd test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fmadds instruction */
void test_fmadds()
{
    Converter c;
    uint64_t expected_answer = 0x4123fcaae0000000;
    uint32_t actual_fpscr, expected_fpscr = 0x82064000;
    reset_fpscr();
    asm volatile("fmadds %0, %1, %2, %3" : "=f"(c.d) : "f"(1234.56), "f"(456.78), "f"(91011.12));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fmadds test passed\n");
    } else {
        test_failed("fmadds test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fsqrt instruction - This instruction doesn't work on the PowerPC 750 (G3).
   It does work on the PowerPC 970 (G5).  */
/*void test_fsqrt()
{
    double answer;
    reset_fpscr();
    asm volatile("fsqrt %0, %1" : "=f"(answer) : "f"(-1.0));
    if (get_fpscr() == 0xa0011200) {
        printf("fsqrt test passed\n");
    } else {
        test_failed("fsqrt test failed");
    }   
}*/

/* Test an overflow condition */
void test_overflow()
{
    // multiplying two really big numbers equals overflow
    Converter c;
    double really_big_input;
    uint64_t expected_answer = 0x7ff0000000000000;
    uint32_t actual_fpscr, expected_fpscr = 0x92025000;
    reset_fpscr();
    really_big_input = 1.7 * pow(10, 308);
    asm volatile("fmul %0, %1, %1" : "=f"(c.d) : "f"(really_big_input));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("overflow test passed\n");
    } else {
        test_failed("overflow test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test an underflow condition */
void test_underflow()
{
    Converter c;
    uint64_t expected_answer = 0x199999999999a;
    uint32_t actual_fpscr, expected_fpscr = 0x8a074000;
    reset_fpscr();
    asm volatile("fmadd %0, %1, %2, %3" : "=f"(c.d) : "f"(DBL_MIN), "f"(0.1), "f"(0.0));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("underflow test passed\n");
    } else {
        test_failed("underflow test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fctiw instruction */
void test_fctiw()
{
    Converter c;
    uint64_t expected_answer;
    uint32_t actual_fpscr, expected_fpscr;
    double frB;
    
    /* above maximum value test */
    expected_fpscr = 0xa0000100;
    expected_answer = 0x7fffffff;
    frB = pow(2, 32); // greater than 2^31 - 1
    reset_fpscr();
    asm volatile("fctiw %0, %1" : "=f"(c.d) : "f"(frB));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && (c.i & 0xffffffff) == expected_answer) {
        printf("fctiw above maximum value test passed\n");
    } else {
        test_failed("fctiw above maximum value test failed", actual_fpscr, expected_fpscr, (c.i & 0xffffffff), expected_answer);
    }
    
    /* below minimum value test*/
    expected_fpscr = 0xa0000100;
    expected_answer = 0x80000000;
    frB = -frB;  // less than -2^31
    reset_fpscr();
    asm volatile("fctiw %0, %1" : "=f"(c.d) : "f"(frB));    
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && (c.i & 0xffffffff) == expected_answer) {
        printf("fctiw below minimum value test passed\n");
    } else {
        test_failed("fctiw below minimum value test failed", actual_fpscr, expected_fpscr, (c.i & 0xffffffff), expected_answer);
    }

    /* float to integer test */
    expected_fpscr = 0x82060000;
    expected_answer = 0xd;
    reset_fpscr();
    asm volatile("fctiw %0, %1" : "=f"(c.d) : "f"(12.7));    
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && (c.i & 0xffffffff) == expected_answer) {
        printf("fctiw integer conversion test passed\n");
    } else {
        test_failed("fctiw integer conversion test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fctiwz instruction */
void test_fctiwz()
{
    Converter c;
    uint64_t expected_answer;
    uint32_t actual_fpscr, expected_fpscr;
    double frB;
    
    /* above maximum value test */
    expected_fpscr = 0xa0000100;
    expected_answer = 0x7fffffff;
    frB = pow(2, 32); // greater than 2^31 - 1
    reset_fpscr();
    asm volatile("fctiwz %0, %1" : "=f"(c.d) : "f"(frB));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && (c.i & 0xffffffff) == expected_answer) {
        printf("fctiwz above maximum value test passed\n");
    } else {
        test_failed("fctiwz above maximum value test failed", actual_fpscr, expected_fpscr, (c.i & 0xffffffff), expected_answer);
    }
    
    /* below minimum value test*/
    expected_fpscr = 0xa0000100;
    expected_answer = 0x80000000;
    frB = -frB;  // less than -2^31
    reset_fpscr();
    asm volatile("fctiwz %0, %1" : "=f"(c.d) : "f"(frB));    
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && (c.i & 0xffffffff) == expected_answer) {
        printf("fctiwz below minimum value test passed\n");
    } else {
        test_failed("fctiwz below minimum value test failed", actual_fpscr, expected_fpscr, (c.i & 0xffffffff), expected_answer);
    }

    /* float to integer test */
    expected_fpscr = 0x82060000;
    expected_answer = 0x1c;
    reset_fpscr();
    asm volatile("fctiw %0, %1" : "=f"(c.d) : "f"(27.98));    
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && (c.i & 0xffffffff) == expected_answer) {
        printf("fctiwz integer conversion test passed\n");
    } else {
        test_failed("fctiwz integer conversion test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the frsp instruction */
void test_frsp()
{
    Converter c;
    uint64_t expected_answer = 0x4271f71fc0000000;
    uint32_t actual_fpscr, expected_fpscr = 0x82064000;
    reset_fpscr();
    asm volatile("frsp %0, %1" : "=f"(c.d) : "f"(1234567891012.131415));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("frsp test passed\n");
    } else {
        test_failed("frsp test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Test the fnmsub instruction */
void test_fnmsub()
{
    Converter c;
    uint64_t expected_answer = 0xc11cdd3cc985f06e;
    uint32_t actual_fpscr, expected_fpscr = 0x82028000;
    reset_fpscr();
    asm volatile("fnmsub %0, %1, %2, %3" : "=f"(c.d) : "f"(1234.56), "f"(456.78), "f"(91011.12));
    actual_fpscr = get_fpscr();
    if (actual_fpscr == expected_fpscr && c.i == expected_answer) {
        printf("fnmsub test passed\n");
    } else {
        test_failed("fnmsub test failed", actual_fpscr, expected_fpscr, c.i, expected_answer);
    }
}

/* Report the results of all the tests */
void report_results()
{
    if (failed_tests == 1) {
        printf("\n=== Warning: %d test failed ===\n", failed_tests);
    } else if (failed_tests > 1) {
        printf("\n=== Warning: %d tests failed ===\n", failed_tests);
    } else {
        printf("\n=== All tests passed ===\n");
    }
}

int main (int argc, const char * argv[]) {

    test_fadd();
    test_fadds();
    test_fsub();
    test_fsubs();
    test_fmul();
    test_fmuls();
    test_fdiv();
    test_fdivs();
    test_fmadd();
    test_fmadds();    
    //test_fsqrt();
    test_inf_minus_inf();
    test_division_by_zero();
    test_zero_div_zero();
    test_inf_div_inf();
    test_inf_times_zero();
    test_overflow();
    test_underflow();
    test_fctiw();
    test_fctiwz();
    test_frsp();
    test_fnmsub();
    
    report_results();

    return 0;
}

David Gibson June 25, 2018, 12:46 a.m. UTC | #8

On Sat, Jun 23, 2018 at 04:17:08PM -0400, Programmingkid wrote:
> 
> > On Jun 23, 2018, at 12:17 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> > 
> > On 06/22/2018 07:22 PM, John Arbuckle wrote:
> >> When the fdiv instruction divides a finite number by zero,
> >> the result actually depends on the FPSCR[ZE] bit. If this
> >> bit is set, the return value is zero. If it is not set
> >> the result should be either positive or negative infinity.
> >> The sign of this result would depend on the sign of the
> >> two inputs. What currently happens is only infinity is
> >> returned even if the FPSCR[ZE] bit is set. This patch
> >> fixes this problem by actually checking the FPSCR[ZE] bit
> >> when deciding what the answer should be.
> >> 
> >> fdiv is suppose to only set the FPSCR's FPRF bits during a
> >> division by zero situation when the FPSCR[ZE] is not set.
> >> What currently happens is these bits are always set. This
> >> patch fixes this problem by checking the FPSCR[ZE] bit to
> >> decide if the FPRF bits should be set. 
> >> 
> >> https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
> >> This document has the information on the fdiv. Page 133 has the information on what action is executed when a division by zero situation takes place. 
> >> 
> >> Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
> >> ---
> >> target/ppc/fpu_helper.c            | 16 ++++++++++++++++
> >> target/ppc/translate/fp-impl.inc.c | 28 +++++++++++++++++++++++++++-
> >> 2 files changed, 43 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> >> index 7714bfe0f9..de694604fb 100644
> >> --- a/target/ppc/fpu_helper.c
> >> +++ b/target/ppc/fpu_helper.c
> >> @@ -658,6 +658,20 @@ uint64_t helper_fdiv(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
> >>     } else if (unlikely(float64_is_zero(farg1.d) && float64_is_zero(farg2.d))) {
> >>         /* Division of zero by zero */
> >>         farg1.ll = float_invalid_op_excp(env, POWERPC_EXCP_FP_VXZDZ, 1);
> >> +    } else if (arg2 == 0) {
> >> +        /* Division by zero */
> >> +        float_zero_divide_excp(env, GETPC());
> >> +        if (fpscr_ze) { /* if zero divide exception is enabled */
> >> +            farg1.ll = 0;
> >> +        } else {
> >> +            uint64_t sign = (farg1.ll ^ farg2.ll) >> 63;
> >> +            if (sign) { /* Negative sign bit */
> >> +                farg1.ll = 0xfff0000000000000; /* Negative Infinity */
> >> +            } else { /* Positive sign bit */
> >> +                farg1.ll = 0x7ff0000000000000; /* Positive Infinity */
> >> +            }
> >> +            helper_compute_fprf_float64(env, farg1.d);
> > 
> > I don't believe you.
> > (1) This is against IEEE spec,
> 
> I'm trying to implement IBM PowerPC specs. 

Which should have an IEEE compliant FPU.

> > (2) There is nothing about this zero result in the Power manual,
> 
> This is for PowerPC. Power and PowerPC are cousins to each other
> rather than having a child-parent relationship. Yes there are a lot
> of similar instructions between them, this does not mean they are
> compatible with each other.

The're definitely supposed to be compatible, at least for userspace
(PR=1) instructions.  The distinction between "POWER" and "PowerPC" is
kind of confusing, but basically every PowerPC or POWER chip from
POWER2 onwards should be backwards compatible for non-privileged
instructions.

David Gibson June 25, 2018, 12:46 a.m. UTC | #9

On Fri, Jun 22, 2018 at 10:22:58PM -0400, John Arbuckle wrote:
> When the fdiv instruction divides a finite number by zero,
> the result actually depends on the FPSCR[ZE] bit. If this
> bit is set, the return value is zero. If it is not set
> the result should be either positive or negative infinity.
> The sign of this result would depend on the sign of the
> two inputs. What currently happens is only infinity is
> returned even if the FPSCR[ZE] bit is set. This patch
> fixes this problem by actually checking the FPSCR[ZE] bit
> when deciding what the answer should be.
> 
> fdiv is suppose to only set the FPSCR's FPRF bits during a
> division by zero situation when the FPSCR[ZE] is not set.
> What currently happens is these bits are always set. This
> patch fixes this problem by checking the FPSCR[ZE] bit to
> decide if the FPRF bits should be set. 
> 
> https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
> This document has the information on the fdiv. Page 133 has the
> information on what action is executed when a division by zero
> situation takes place.

I'm looking at that table and there is definitely no mention of a 0
*result* in any situation.  So this patch is definitely wrong on that
point.  If there's something else this is fixing, then the commit
message needs to describe that.

> 
> Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
> ---
>  target/ppc/fpu_helper.c            | 16 ++++++++++++++++
>  target/ppc/translate/fp-impl.inc.c | 28 +++++++++++++++++++++++++++-
>  2 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 7714bfe0f9..de694604fb 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -658,6 +658,20 @@ uint64_t helper_fdiv(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
>      } else if (unlikely(float64_is_zero(farg1.d) && float64_is_zero(farg2.d))) {
>          /* Division of zero by zero */
>          farg1.ll = float_invalid_op_excp(env, POWERPC_EXCP_FP_VXZDZ, 1);
> +    } else if (arg2 == 0) {
> +        /* Division by zero */
> +        float_zero_divide_excp(env, GETPC());
> +        if (fpscr_ze) { /* if zero divide exception is enabled */
> +            farg1.ll = 0;
> +        } else {
> +            uint64_t sign = (farg1.ll ^ farg2.ll) >> 63;
> +            if (sign) { /* Negative sign bit */
> +                farg1.ll = 0xfff0000000000000; /* Negative Infinity */
> +            } else { /* Positive sign bit */
> +                farg1.ll = 0x7ff0000000000000; /* Positive Infinity */
> +            }
> +            helper_compute_fprf_float64(env, farg1.d);
> +        }
>      } else {
>          if (unlikely(float64_is_signaling_nan(farg1.d, &env->fp_status) ||
>                       float64_is_signaling_nan(farg2.d, &env->fp_status))) {
> @@ -665,6 +679,8 @@ uint64_t helper_fdiv(CPUPPCState *env, uint64_t arg1, uint64_t arg2)
>              float_invalid_op_excp(env, POWERPC_EXCP_FP_VXSNAN, 1);
>          }
>          farg1.d = float64_div(farg1.d, farg2.d, &env->fp_status);
> +        helper_compute_fprf_float64(env, farg1.d);
> +        helper_float_check_status(env);
>      }
>  
>      return farg1.ll;
> diff --git a/target/ppc/translate/fp-impl.inc.c b/target/ppc/translate/fp-impl.inc.c
> index 2fbd4d4f38..4e20bcceb4 100644
> --- a/target/ppc/translate/fp-impl.inc.c
> +++ b/target/ppc/translate/fp-impl.inc.c
> @@ -84,6 +84,32 @@ static void gen_f##name(DisasContext *ctx)                                    \
>  _GEN_FLOAT_AB(name, name, 0x3F, op2, inval, 0, set_fprf, type);               \
>  _GEN_FLOAT_AB(name##s, name, 0x3B, op2, inval, 1, set_fprf, type);
>  
> +
> +#define _GEN_FLOAT_DIV(name, op, op1, op2, inval, isfloat, set_fprf, type)    \
> +static void gen_f##name(DisasContext *ctx)                                    \
> +{                                                                             \
> +    if (unlikely(!ctx->fpu_enabled)) {                                        \
> +        gen_exception(ctx, POWERPC_EXCP_FPU);                                 \
> +        return;                                                               \
> +    }                                                                         \
> +    gen_reset_fpstatus();                                                     \
> +    gen_helper_f##op(cpu_fpr[rD(ctx->opcode)], cpu_env,                       \
> +                     cpu_fpr[rA(ctx->opcode)],                                \
> +                     cpu_fpr[rB(ctx->opcode)]);                               \
> +    if (isfloat) {                                                            \
> +        gen_helper_frsp(cpu_fpr[rD(ctx->opcode)], cpu_env,                    \
> +                        cpu_fpr[rD(ctx->opcode)]);                            \
> +    }                                                                         \
> +    if (unlikely(Rc(ctx->opcode) != 0)) {                                     \
> +        gen_set_cr1_from_fpscr(ctx);                                          \
> +    }                                                                         \
> +}
> +
> +#define GEN_FLOAT_DIV(name, op2, inval, set_fprf, type)                       \
> +_GEN_FLOAT_DIV(name, name, 0x3F, op2, inval, 0, set_fprf, type);              \
> +_GEN_FLOAT_DIV(name##s, name, 0x3B, op2, inval, 1, set_fprf, type);
> +
> +
>  #define _GEN_FLOAT_AC(name, op, op1, op2, inval, isfloat, set_fprf, type)     \
>  static void gen_f##name(DisasContext *ctx)                                    \
>  {                                                                             \
> @@ -149,7 +175,7 @@ static void gen_f##name(DisasContext *ctx)                                    \
>  /* fadd - fadds */
>  GEN_FLOAT_AB(add, 0x15, 0x000007C0, 1, PPC_FLOAT);
>  /* fdiv - fdivs */
> -GEN_FLOAT_AB(div, 0x12, 0x000007C0, 1, PPC_FLOAT);
> +GEN_FLOAT_DIV(div, 0x12, 0x000007C0, 1, PPC_FLOAT);
>  /* fmul - fmuls */
>  GEN_FLOAT_AC(mul, 0x19, 0x0000F800, 1, PPC_FLOAT);
>

Programmingkid June 25, 2018, 3:24 a.m. UTC | #10

On Jun 24, 2018, at 8:46 PM, David Gibson wrote:

> On Fri, Jun 22, 2018 at 10:22:58PM -0400, John Arbuckle wrote:
>> When the fdiv instruction divides a finite number by zero,
>> the result actually depends on the FPSCR[ZE] bit. If this
>> bit is set, the return value is zero. If it is not set
>> the result should be either positive or negative infinity.
>> The sign of this result would depend on the sign of the
>> two inputs. What currently happens is only infinity is
>> returned even if the FPSCR[ZE] bit is set. This patch
>> fixes this problem by actually checking the FPSCR[ZE] bit
>> when deciding what the answer should be.
>>
>> fdiv is suppose to only set the FPSCR's FPRF bits during a
>> division by zero situation when the FPSCR[ZE] is not set.
>> What currently happens is these bits are always set. This
>> patch fixes this problem by checking the FPSCR[ZE] bit to
>> decide if the FPRF bits should be set.
>>
>> https://www.pdfdrive.net/powerpc-microprocessor-family-the- 
>> programming-environments-for-32-e3087633.html
>> This document has the information on the fdiv. Page 133 has the
>> information on what action is executed when a division by zero
>> situation takes place.
>
> I'm looking at that table and there is definitely no mention of a 0
> *result* in any situation.  So this patch is definitely wrong on that
> point.  If there's something else this is fixing, then the commit
> message needs to describe that.

I did not find any mention of a zero result *yet*. There could be  
mention of something in another document. I will see what I can find.  
Right now I do have example code that I ran on a PowerPC 970 CPU and  
the results are as follows:

When dividing by zero:

if FPSCR[ZE] is enabled
	then answer = 0
if FPSCR[ZE] is disabled
	then answer = 0x7ff0000000000000

I think this feature may be undocumented.

Here is the example program I used. When I ran this program, the  
FPSCR is set to zero initially (all bits disabled). When the mtfsb1  
instruction is executed, the ZE bit is set. I ran this program twice.  
Once with the mtfsb1 line uncommented, and another time with the line  
commented. The result for the answer was different in each situation.

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

// Used to convert unsigned integer <--> double
union Converter
{
     double d;
     uint64_t i;
};
typedef union Converter Converter;

int main (int argc, const char * argv[]) {
     Converter answer;
     //asm volatile("mtfsb1 27"); /* Set ZE bit */
     asm volatile("fdiv %0, %1, %2" : "=f"(answer.d) : "f"(1.0),  
"f"(0.0));
     printf("answer = 0x%" PRIx64 "\n", answer.i);
     return 0;
}

David Gibson June 25, 2018, 3:25 a.m. UTC | #11

On Sun, Jun 24, 2018 at 11:24:17PM -0400, G 3 wrote:
> 
> On Jun 24, 2018, at 8:46 PM, David Gibson wrote:
> 
> > On Fri, Jun 22, 2018 at 10:22:58PM -0400, John Arbuckle wrote:
> > > When the fdiv instruction divides a finite number by zero,
> > > the result actually depends on the FPSCR[ZE] bit. If this
> > > bit is set, the return value is zero. If it is not set
> > > the result should be either positive or negative infinity.
> > > The sign of this result would depend on the sign of the
> > > two inputs. What currently happens is only infinity is
> > > returned even if the FPSCR[ZE] bit is set. This patch
> > > fixes this problem by actually checking the FPSCR[ZE] bit
> > > when deciding what the answer should be.
> > > 
> > > fdiv is suppose to only set the FPSCR's FPRF bits during a
> > > division by zero situation when the FPSCR[ZE] is not set.
> > > What currently happens is these bits are always set. This
> > > patch fixes this problem by checking the FPSCR[ZE] bit to
> > > decide if the FPRF bits should be set.
> > > 
> > > https://www.pdfdrive.net/powerpc-microprocessor-family-the-programming-environments-for-32-e3087633.html
> > > This document has the information on the fdiv. Page 133 has the
> > > information on what action is executed when a division by zero
> > > situation takes place.
> > 
> > I'm looking at that table and there is definitely no mention of a 0
> > *result* in any situation.  So this patch is definitely wrong on that
> > point.  If there's something else this is fixing, then the commit
> > message needs to describe that.
> 
> I did not find any mention of a zero result *yet*. There could be mention of
> something in another document. I will see what I can find. Right now I do
> have example code that I ran on a PowerPC 970 CPU and the results are as
> follows:
> 
> When dividing by zero:
> 
> if FPSCR[ZE] is enabled
> 	then answer = 0

Really?  Or is it just that the result register already contained zero
and is being left unchanged as the document says should happen?

> if FPSCR[ZE] is disabled
> 	then answer = 0x7ff0000000000000
> 
> I think this feature may be undocumented.
> 
> Here is the example program I used. When I ran this program, the FPSCR is
> set to zero initially (all bits disabled). When the mtfsb1 instruction is
> executed, the ZE bit is set. I ran this program twice. Once with the mtfsb1
> line uncommented, and another time with the line commented. The result for
> the answer was different in each situation.
> 
> #include <stdio.h>
> #include <stdint.h>
> #include <inttypes.h>
> 
> // Used to convert unsigned integer <--> double
> union Converter
> {
>     double d;
>     uint64_t i;
> };
> typedef union Converter Converter;
> 
> int main (int argc, const char * argv[]) {
>     Converter answer;
>     //asm volatile("mtfsb1 27"); /* Set ZE bit */
>     asm volatile("fdiv %0, %1, %2" : "=f"(answer.d) : "f"(1.0), "f"(0.0));
>     printf("answer = 0x%" PRIx64 "\n", answer.i);
>     return 0;
> }
> 
> 
>

Richard Henderson June 25, 2018, 3:47 a.m. UTC | #12

On 06/24/2018 11:38 AM, Programmingkid wrote:
> void test_division_by_zero()
> {
>     Converter c;
>     uint64_t expected_answer = 0x0;
>     uint32_t actual_fpscr, expected_fpscr = 0xc4000010;
>     reset_fpscr();
>     set_fpscr_bit(ZE);
>     asm volatile("fdiv %0, %1, %2" : "=f"(c.d) : "f"(1.0), "f"(0.0));

Try

    uint64_t expected_answer = 0xffff0000deadbeef;
    ...
    c.i = expected_answer;
    asm volatile("fdiv %0, %1, %2" : "+f"(c.d) : "f"(1.0), "f"(0.0));

to avoid depending on uninitialized data.  (This expected value is
an SNaN with a deadbeef marker Just to be Sure.)


r~

Programmingkid June 25, 2018, 3:23 p.m. UTC | #13

On Jun 24, 2018, at 11:47 PM, Richard Henderson wrote:

> On 06/24/2018 11:38 AM, Programmingkid wrote:
>> void test_division_by_zero()
>> {
>>     Converter c;
>>     uint64_t expected_answer = 0x0;
>>     uint32_t actual_fpscr, expected_fpscr = 0xc4000010;
>>     reset_fpscr();
>>     set_fpscr_bit(ZE);
>>     asm volatile("fdiv %0, %1, %2" : "=f"(c.d) : "f"(1.0), "f"(0.0));
>
> Try
>
>     uint64_t expected_answer = 0xffff0000deadbeef;
>     ...
>     c.i = expected_answer;
>     asm volatile("fdiv %0, %1, %2" : "+f"(c.d) : "f"(1.0), "f"(0.0));
>
> to avoid depending on uninitialized data.  (This expected value is
> an SNaN with a deadbeef marker Just to be Sure.)
>
>
> r~


Ok I made this program and tried it on my iMac G5 (PowerPC 970).

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

// Used to convert unsigned integer <--> double
union Converter
{
     double d;
     uint64_t i;
};
typedef union Converter Converter;

int main (int argc, const char * argv[]) {
     Converter answer;
     answer.i = 0xffff0000deadbeef;
     //asm volatile("mtfsb1 27"); /* Set ZE bit */
     asm volatile("fdiv %0, %1, %2" : "=f"(answer.d) : "f"(1.0),  
"f"(0.0));
     printf("answer = 0x%" PRIx64 "\n", answer.i);
     return 0;
}

The result was the same.

When the FPSCR[ZE] bit is set, the answer is 0x0.

When the FPSCR[ZE] bit is not set, the answer is 0x7ff0000000000000.

This seems to be an undocumented feature.

Richard Henderson June 25, 2018, 9:08 p.m. UTC | #14

On Mon, Jun 25, 2018, 08:23 G 3 <programmingkidx@gmail.com> wrote:

> >
> > Try
> >
> >     uint64_t expected_answer = 0xffff0000deadbeef;
> >     ...
> >     c.i = expected_answer;
> >     asm volatile("fdiv %0, %1, %2" : "+f"(c.d) : "f"(1.0), "f"(0.0));
> >
> > to avoid depending on uninitialized data.  (This expected value is
> > an SNaN with a deadbeef marker Just to be Sure.)
> >
> >
> > r~
>
>
> Ok I made this program and tried it on my iMac G5 (PowerPC 970).
>
> #include <stdio.h>
> #include <stdint.h>
> #include <inttypes.h>
>
> // Used to convert unsigned integer <--> double
> union Converter
> {
>      double d;
>      uint64_t i;
> };
> typedef union Converter Converter;
>
> int main (int argc, const char * argv[]) {
>      Converter answer;
>      answer.i = 0xffff0000deadbeef;
>      //asm volatile("mtfsb1 27"); /* Set ZE bit */
>      asm volatile("fdiv %0, %1, %2" : "=f"(answer.d) : "f"(1.0),
> "f"(0.0));
>

Need +f for inout operand.
This didn't test what you expected.

r~

Programmingkid June 25, 2018, 10:23 p.m. UTC | #15

> On Jun 25, 2018, at 5:08 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On Mon, Jun 25, 2018, 08:23 G 3 <programmingkidx@gmail.com> wrote:
> >
> > Try
> >
> >     uint64_t expected_answer = 0xffff0000deadbeef;
> >     ...
> >     c.i = expected_answer;
> >     asm volatile("fdiv %0, %1, %2" : "+f"(c.d) : "f"(1.0), "f"(0.0));
> >
> > to avoid depending on uninitialized data.  (This expected value is
> > an SNaN with a deadbeef marker Just to be Sure.)
> >
> >
> > r~
> 
> 
> Ok I made this program and tried it on my iMac G5 (PowerPC 970).
> 
> #include <stdio.h>
> #include <stdint.h>
> #include <inttypes.h>
> 
> // Used to convert unsigned integer <--> double
> union Converter
> {
>      double d;
>      uint64_t i;
> };
> typedef union Converter Converter;
> 
> int main (int argc, const char * argv[]) {
>      Converter answer;
>      answer.i = 0xffff0000deadbeef;
>      //asm volatile("mtfsb1 27"); /* Set ZE bit */
>      asm volatile("fdiv %0, %1, %2" : "=f"(answer.d) : "f"(1.0),  
> "f"(0.0));
> 
> Need +f for inout operand.
> This didn't test what you expected.

What do you mean by inout operand?

If you could send me some sample code I will test it out.

Richard Henderson June 26, 2018, 1:49 p.m. UTC | #16

On 06/25/2018 03:23 PM, Programmingkid wrote:
> 
>> On Jun 25, 2018, at 5:08 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
>>
>> On Mon, Jun 25, 2018, 08:23 G 3 <programmingkidx@gmail.com> wrote:
>>>
>>> Try
>>>
>>>     uint64_t expected_answer = 0xffff0000deadbeef;
>>>     ...
>>>     c.i = expected_answer;
>>>     asm volatile("fdiv %0, %1, %2" : "+f"(c.d) : "f"(1.0), "f"(0.0));
>>>
>>> to avoid depending on uninitialized data.  (This expected value is
>>> an SNaN with a deadbeef marker Just to be Sure.)
>>>
>>>
>>> r~
>>
>>
>> Ok I made this program and tried it on my iMac G5 (PowerPC 970).
>>
>> #include <stdio.h>
>> #include <stdint.h>
>> #include <inttypes.h>
>>
>> // Used to convert unsigned integer <--> double
>> union Converter
>> {
>>      double d;
>>      uint64_t i;
>> };
>> typedef union Converter Converter;
>>
>> int main (int argc, const char * argv[]) {
>>      Converter answer;
>>      answer.i = 0xffff0000deadbeef;
>>      //asm volatile("mtfsb1 27"); /* Set ZE bit */
>>      asm volatile("fdiv %0, %1, %2" : "=f"(answer.d) : "f"(1.0),  
>> "f"(0.0));
>>
>> Need +f for inout operand.
>> This didn't test what you expected.
> 
> What do you mean by inout operand?

An operand that is both input and output.

> If you could send me some sample code I will test it out.

I did, you just didn't read it properly.
Here it is again:

  asm volatile("fdiv %0, %1, %2"
               : "+f"(answer.d) : "f"(1.0), "f"(0.0));
                 ^^^^

r~

Programmingkid June 26, 2018, 7:50 p.m. UTC | #17

On Jun 26, 2018, at 9:49 AM, Richard Henderson wrote:

> On 06/25/2018 03:23 PM, Programmingkid wrote:
>>
>>> On Jun 25, 2018, at 5:08 PM, Richard Henderson  
>>> <richard.henderson@linaro.org> wrote:
>>>
>>> On Mon, Jun 25, 2018, 08:23 G 3 <programmingkidx@gmail.com> wrote:
>>>>
>>>> Try
>>>>
>>>>     uint64_t expected_answer = 0xffff0000deadbeef;
>>>>     ...
>>>>     c.i = expected_answer;
>>>>     asm volatile("fdiv %0, %1, %2" : "+f"(c.d) : "f"(1.0),  
>>>> "f"(0.0));
>>>>
>>>> to avoid depending on uninitialized data.  (This expected value is
>>>> an SNaN with a deadbeef marker Just to be Sure.)
>>>>
>>>>
>>>> r~
>>>
>>>
>>> Ok I made this program and tried it on my iMac G5 (PowerPC 970).
>>>
>>> #include <stdio.h>
>>> #include <stdint.h>
>>> #include <inttypes.h>
>>>
>>> // Used to convert unsigned integer <--> double
>>> union Converter
>>> {
>>>      double d;
>>>      uint64_t i;
>>> };
>>> typedef union Converter Converter;
>>>
>>> int main (int argc, const char * argv[]) {
>>>      Converter answer;
>>>      answer.i = 0xffff0000deadbeef;
>>>      //asm volatile("mtfsb1 27"); /* Set ZE bit */
>>>      asm volatile("fdiv %0, %1, %2" : "=f"(answer.d) : "f"(1.0),
>>> "f"(0.0));
>>>
>>> Need +f for inout operand.
>>> This didn't test what you expected.
>>
>> What do you mean by inout operand?
>
> An operand that is both input and output.
>
>> If you could send me some sample code I will test it out.
>
> I did, you just didn't read it properly.
> Here it is again:
>
>   asm volatile("fdiv %0, %1, %2"
>                : "+f"(answer.d) : "f"(1.0), "f"(0.0));
>                  ^^^^
>
> r~


I used the +f constraint and did see different results. Computer used  
was an iMac G5 (PowerPC 970).

If FPSCR[ZE] is set:
	answer = 0xffff0000deadbeef  (changed from 0x0)

if FPSCR[ZE] is not set:
	answer = 0x7ff0000000000000  (no change from before)


I then tried this program in QEMU without any of my patches applied.  
Here is the result:

If FPSCR[ZE] is set or not set, answer = 0x7ff0000000000000. This  
indicates to me that the fdiv instruction needs a little work. This  
is what I think should happen. If division by zero takes  place and  
the FPSCR[ZE] bit is set, then the value in the destination register  
should not be altered (rather than returning zero).


Here is the program used:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

// Used to convert unsigned integer <--> double
union Converter
{
     double d;
     uint64_t i;
};

typedef union Converter Converter;

int main (int argc, const char * argv[]) {

     Converter answer;
     answer.i = 0xffff0000deadbeef;
     //asm volatile("mtfsb1 27"); /* Set ZE bit */
     asm volatile("fdiv %0, %1, %2" : "+f"(answer.d) : "f"(1.0),  
"f"(0.0));
     printf("answer = 0x%" PRIx64 "\n", answer.i);
     return 0;
}

Richard Henderson June 26, 2018, 9:27 p.m. UTC | #18

On 06/26/2018 12:50 PM, G 3 wrote:
> 
> If FPSCR[ZE] is set or not set, answer = 0x7ff0000000000000. This indicates to
> me that the fdiv instruction needs a little work. This is what I think should
> happen. If division by zero takes  place and the FPSCR[ZE] bit is set, then the
> value in the destination register should not be altered (rather than returning
> zero).

I have not tested, but I suspect the same will be true for all other
floating-point exceptions.

E.g. try fmul of DBL_MAX * DBL_MAX with FPSCR[OE] set.

To my eye we would need to rearrange all of the fp operations:

(1) Remove helper_reset_fpstatus.
    Every fp operation should leave 0 in the exception_flags.

    Failure to do so indicates we're missing the post-operation
    processing of exceptions via float_check_status.  Which is
    in fact exactly the bug here for fdiv.  And based on a quick
    browse, also fmul, fsub, and fadd.

(2) float_check_status should be re-organized.

  (a) if status == 0, early exit,
  (b) otherwise, set_float_exception_flags(&env->fp_status, 0) immediately.

(3) I suspect that all of the exception special cases can be
    reordered such that we test them after the operation, as
    they should all be unlikely.

    A good example is target/tricore/fpu_helper.c, in which
    we test the exception flags, do special cases when we
    find e.g. float_flags_invalid set, and then process the
    exceptions that were raised.


r~

fix fdiv instruction

Commit Message

Comments

Patch