[v4,0/8] VSX MMA Implementation

Message ID 20220520135129.63664-1-lucas.araujo@eldorado.org.br (mailing list archive)

Lucas Mateus Martins Araujo e Castro May 20, 2022, 1:51 p.m. UTC
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Based-on: <20220517161522.36132-1-victor.colombo@eldorado.org.br>

This patch series implements the Matrix-Multiply Assist (MMA)
instructions from PowerISA 3.1.

These and the VDIV/VMOD implementation are the last new PowerISA 3.1
instructions left to be implemented.

The XVFGER instructions accumulate the exception status and, at the end,
set the FPSCR and take a Program interrupt on a trap-enabled exception.
Previous versions were based on Victor's rework of FPU exceptions, but
as that patch was rejected, this version works around the fact that
OX/UX/XX and invalid-operation exceptions are handled in different
functions by disabling all enable bits, then re-enabling them and
calling the mtfsf deferred exception helper.
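
For readers unfamiliar with the approach, the workaround above can be
sketched with a toy model. This is not the code from the series: the
ToyFpscr struct, the FLAG_* bit positions and the toy_* functions are
all hypothetical and exist only for illustration. The point it shows is
that, with the enable bits cleared, every element can set its status
bits without trapping, and a single check after the enables are
restored decides whether to trap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout for this sketch; NOT the real FPSCR bit positions. */
enum {
    FLAG_OX = 1u << 0,  /* overflow */
    FLAG_UX = 1u << 1,  /* underflow */
    FLAG_XX = 1u << 2,  /* inexact */
    FLAG_VX = 1u << 3,  /* invalid operation */
};

typedef struct {
    uint32_t status;  /* sticky exception bits */
    uint32_t enable;  /* trap-enable bits, same layout as status */
    bool trapped;     /* stands in for taking a Program interrupt */
} ToyFpscr;

/* Record the flags raised by one element operation; traps if enabled. */
static void toy_record(ToyFpscr *f, uint32_t flags)
{
    f->status |= flags;
    if (f->status & f->enable) {
        f->trapped = true;
    }
}

/*
 * Model of one GER-like instruction: clear the enables, accumulate the
 * status of every element, restore the enables, then check exactly once.
 */
static void toy_ger(ToyFpscr *f, const uint32_t *elem_flags, int n)
{
    uint32_t saved = f->enable;

    f->enable = 0;                 /* no element can trap from here on */
    for (int i = 0; i < n; i++) {
        toy_record(f, elem_flags[i]);
    }
    f->enable = saved;             /* re-enable ... */
    if (f->status & f->enable) {   /* ... and do the deferred check */
        f->trapped = true;
    }
}
```

In this model an enabled overflow raised by the second element still
lets the third element record its status bits before the trap is taken.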

Patch without review: 5

v4 changes:
    - Changed VSXGER16 accumulation to always use float32_sum and negate
      the elements according to the type of accumulation
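
A minimal sketch of the idea behind this change, using plain C floats
instead of softfloat. The function name and the two boolean parameters
are hypothetical, and the sketch assumes the usual pp/pn/np/nn reading
of the instruction suffixes (first letter: sign of the product, second
letter: sign of the accumulator). The operation is always an addition;
only the signs of the inputs change.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for one accumulator element of a rank-2 update.
 * p0 and p1 are the two already-computed products, acc is the old
 * accumulator value. Plain float is used here; the series uses softfloat.
 */
static float toy_ger16_acc(float p0, float p1, float acc,
                           bool neg_mul, bool neg_acc)
{
    float psum = p0 + p1;   /* sum of the two products */

    if (neg_mul) {
        psum = -psum;       /* np / nn variants */
    }
    if (neg_acc) {
        acc = -acc;         /* pn / nn variants */
    }
    return psum + acc;      /* always a sum, never a subtraction */
}
```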

v3 changes:
    - GER helpers now use ppc_acc_t instead of ppc_vsr_t for passing acc
    - Removed do_ger_XX3 and updated the decodetree to pass the masks in
      32-bit instructions
    - Removed unnecessary rounding mode function
    - Moved float32_neg to fpu_helper.c and renamed it bfp32_negate to
      make it clearer that it's a 32-bit version of the PowerISA
      bfp_NEGATE
    - Negated accumulation is now a subtraction
    - Changed exception handling by disabling all FPSCR enable bits so
      that all FPSCR bits (except FEX) are set correctly, then
      re-enabling them and calling do_fpscr_check_status to raise the
      exception accordingly and set FEX if necessary

v2 changes:
    - Changed VSXGER, VSXGER16 and XVIGER macros to functions
    - Set rounding mode in floating-point instructions based on RN
      before operations
    - Separated the accumulate and the with-saturation instructions
      into different helpers
    - Used FIELD, FIELD_EX32 and FIELD_DP32 for packing/unpacking masks
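
As an illustration of the mask packing in the last item, here is a
self-contained sketch. It does not use QEMU's FIELD, FIELD_EX32 or
FIELD_DP32 macros; the toy_* helpers below are hypothetical stand-ins
with the same deposit/extract idea, and the field names and bit layout
(XMSK in bits 0-3, YMSK in bits 4-7, PMSK in bits 8-15) are assumptions
made for the example, not taken from the patches.

```c
#include <stdint.h>

/* Assumed layout, for illustration only. */
enum {
    XMSK_SHIFT = 0, XMSK_LEN = 4,
    YMSK_SHIFT = 4, YMSK_LEN = 4,
    PMSK_SHIFT = 8, PMSK_LEN = 8,
};

/* Deposit the low 'len' bits of 'val' into 'word' at bit 'shift'. */
static uint32_t toy_field_dp32(uint32_t word, int shift, int len,
                               uint32_t val)
{
    uint32_t mask = ((1u << len) - 1u) << shift;

    return (word & ~mask) | ((val << shift) & mask);
}

/* Extract 'len' bits from 'word' starting at bit 'shift'. */
static uint32_t toy_field_ex32(uint32_t word, int shift, int len)
{
    return (word >> shift) & ((1u << len) - 1u);
}

/* Pack the three masks into one 32-bit value to hand to a helper. */
static uint32_t toy_pack_masks(uint32_t xmsk, uint32_t ymsk, uint32_t pmsk)
{
    uint32_t m = 0;

    m = toy_field_dp32(m, XMSK_SHIFT, XMSK_LEN, xmsk);
    m = toy_field_dp32(m, YMSK_SHIFT, YMSK_LEN, ymsk);
    m = toy_field_dp32(m, PMSK_SHIFT, PMSK_LEN, pmsk);
    return m;
}
```

Packing the masks into a single word keeps the helper signature small;
the helper then unpacks each field with the matching extract call.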

Joel Stanley (1):
  linux-user: Add PowerPC ISA 3.1 and MMA to hwcap

Lucas Mateus Castro (alqotel) (7):
  target/ppc: Implement xxm[tf]acc and xxsetaccz
  target/ppc: Implemented xvi*ger* instructions
  target/ppc: Implemented pmxvi*ger* instructions
  target/ppc: Implemented xvf*ger*
  target/ppc: Implemented xvf16ger*
  target/ppc: Implemented pmxvf*ger*
  target/ppc: Implemented [pm]xvbf16ger2*

 linux-user/elfload.c                |   4 +
 target/ppc/cpu.h                    |  13 ++
 target/ppc/fpu_helper.c             | 326 +++++++++++++++++++++++++++-
 target/ppc/helper.h                 |  33 +++
 target/ppc/insn32.decode            |  52 +++++
 target/ppc/insn64.decode            |  79 +++++++
 target/ppc/int_helper.c             | 130 +++++++++++
 target/ppc/internal.h               |  15 ++
 target/ppc/translate/vsx-impl.c.inc | 130 +++++++++++
 9 files changed, 780 insertions(+), 2 deletions(-)