From patchwork Wed Oct 5 14:37:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?V=C3=ADctor_Colombo?= X-Patchwork-Id: 12999309 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 81ED6C433FE for ; Wed, 5 Oct 2022 15:00:11 +0000 (UTC) Received: from localhost ([::1]:42168 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1og5sW-0003Jq-P7 for qemu-devel@archiver.kernel.org; Wed, 05 Oct 2022 11:00:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50820) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1og5Xf-0005uI-B3; Wed, 05 Oct 2022 10:38:37 -0400 Received: from [200.168.210.66] (port=55228 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1og5Xc-0004bx-4y; Wed, 05 Oct 2022 10:38:34 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Wed, 5 Oct 2022 11:37:23 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id 012C08002A8; Wed, 5 Oct 2022 11:37:22 -0300 (-03) From: =?utf-8?q?V=C3=ADctor_Colombo?= To: qemu-devel@nongnu.org, qemu-ppc@nongnu.org Cc: clg@kaod.org, danielhb413@gmail.com, david@gibson.dropbear.id.au, groug@kaod.org, richard.henderson@linaro.org, aurelien@aurel32.net, peter.maydell@linaro.org, alex.bennee@linaro.org, balaton@eik.bme.hu, victor.colombo@eldorado.org.br, matheus.ferst@eldorado.org.br, lucas.araujo@eldorado.org.br, leandro.lupori@eldorado.org.br, lucas.coutinho@eldorado.org.br Subject: [RFC PATCH 1/4] target/ppc: prepare 
instructions to work with caching last FP insn Date: Wed, 5 Oct 2022 11:37:16 -0300 Message-Id: <20221005143719.65241-2-victor.colombo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221005143719.65241-1-victor.colombo@eldorado.org.br> References: <20221005143719.65241-1-victor.colombo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 05 Oct 2022 14:37:23.0721 (UTC) FILETIME=[FB835790:01D8D8C7] X-Host-Lookup-Failed: Reverse DNS lookup failed for 200.168.210.66 (failed) Received-SPF: pass client-ip=200.168.210.66; envelope-from=victor.colombo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -10 X-Spam_score: -1.1 X-Spam_bar: - X-Spam_report: (-1.1 / 5.0 requ) BAYES_00=-1.9, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel"

When hardfpu is enabled for Power together with the instruction caching
feature, the cache must be cleared whenever an instruction is guaranteed
to execute in softfloat. If it is not cleared, a previously cached
instruction could be re-executed later and yield a different result than
a softfloat-only build would.

This patch introduces the base code for caching FP instructions. It also
adds a call to a macro that clears the cached instruction to every helper
that has not yet been 'migrated' to be hardfpu-compliant.

Each FP instruction that wants to use hardfpu will need its own caching
code in follow-up patches.

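Note for reviewers, not part of the patch: the sketch below tries to
illustrate why a softfloat-only helper has to clear the cache. It is a
simplified model under stated assumptions, not the QEMU code. All demo_*
names are hypothetical, and DEMO_CACHED_PENDING stands in for the
signature-specific types that later patches in the series add.

```c
#include <assert.h>

/* Hypothetical stand-ins; the patch itself only defines CACHED_FN_TYPE_NONE. */
enum {
    DEMO_CACHED_NONE,
    DEMO_CACHED_PENDING,    /* assumption: some hardfpu-aware insn was cached */
};

struct demo_env {
    int cached_type;        /* plays the role of env->cached_fn_type */
    int reexecuted;         /* demo only: how many cached insns were replayed */
};

/* Role of CACHE_FN_NONE: a softfloat-only helper drops any pending entry. */
static void demo_cache_none(struct demo_env *env)
{
    env->cached_type = DEMO_CACHED_NONE;
}

/* Role of helper_execute_fp_cached: replay the pending insn, then clear. */
static void demo_execute_cached(struct demo_env *env)
{
    switch (env->cached_type) {
    case DEMO_CACHED_NONE:
        /* last FP insn already ran in softfloat, nothing to replay */
        break;
    case DEMO_CACHED_PENDING:
        env->reexecuted++;
        break;
    default:
        assert(0 && "unknown cached function type");
    }
    env->cached_type = DEMO_CACHED_NONE;
}
```

In this model, a stale DEMO_CACHED_PENDING entry that survives a
softfloat-only helper would be replayed at the next flush point, which is
the situation the commit message warns about.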
Signed-off-by: Víctor Colombo --- target/ppc/cpu.h | 6 +++ target/ppc/excp_helper.c | 2 + target/ppc/fpu_helper.c | 65 ++++++++++++++++++++++++++++++ target/ppc/helper.h | 1 + target/ppc/translate/fp-impl.c.inc | 1 + 5 files changed, 75 insertions(+) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index 7f73e2ac81..1132d60162 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -1080,6 +1080,10 @@ struct ppc_radix_page_info { #define PPC_CPU_OPCODES_LEN 0x40 #define PPC_CPU_INDIRECT_OPCODES_LEN 0x20 +enum { + CACHED_FN_TYPE_NONE, +}; + struct CPUArchState { /* Most commonly used resources during translated code execution first */ target_ulong gpr[32]; /* general purpose registers */ @@ -1157,6 +1161,8 @@ struct CPUArchState { float_status fp_status; /* Floating point execution context */ target_ulong fpscr; /* Floating point status and control register */ + int cached_fn_type; + /* Internal devices resources */ ppc_tb_t *tb_env; /* Time base and decrementer */ ppc_dcr_t *dcr_env; /* Device control registers */ diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c index 214acf5ac4..4671b15386 100644 --- a/target/ppc/excp_helper.c +++ b/target/ppc/excp_helper.c @@ -1904,6 +1904,8 @@ void raise_exception_err_ra(CPUPPCState *env, uint32_t exception, { CPUState *cs = env_cpu(env); + helper_execute_fp_cached(env); + cs->exception_index = exception; env->error_code = error_code; cpu_loop_exit_restore(cs, raddr); diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index ae25f32d6e..6aaee37619 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -23,6 +23,13 @@ #include "internal.h" #include "fpu/softfloat.h" +#define CACHE_FN_NONE(env) \ + do { \ + assert(!(env->fp_status.float_exception_flags & \ + float_flag_inexact)); \ + env->cached_fn_type = CACHED_FN_TYPE_NONE; \ + } while (0) + static inline float128 float128_snan_to_qnan(float128 x) { float128 r; @@ -514,6 +521,22 @@ void helper_reset_fpstatus(CPUPPCState *env) 
set_float_exception_flags(0, &env->fp_status); } +void helper_execute_fp_cached(CPUPPCState *env) +{ + switch (env->cached_fn_type) { + case CACHED_FN_TYPE_NONE: + /* + * the last fp instruction was executed in softfloat + * so no need to execute it again + */ + break; + default: + g_assert_not_reached(); + } + + env->cached_fn_type = CACHED_FN_TYPE_NONE; +} + static void float_invalid_op_addsub(CPUPPCState *env, int flags, bool set_fpcc, uintptr_t retaddr) { @@ -527,6 +550,7 @@ static void float_invalid_op_addsub(CPUPPCState *env, int flags, /* fadd - fadd. */ float64 helper_fadd(CPUPPCState *env, float64 arg1, float64 arg2) { + CACHE_FN_NONE(env); float64 ret = float64_add(arg1, arg2, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -540,6 +564,7 @@ float64 helper_fadd(CPUPPCState *env, float64 arg1, float64 arg2) /* fadds - fadds. */ float64 helper_fadds(CPUPPCState *env, float64 arg1, float64 arg2) { + CACHE_FN_NONE(env); float64 ret = float64r32_add(arg1, arg2, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -552,6 +577,7 @@ float64 helper_fadds(CPUPPCState *env, float64 arg1, float64 arg2) /* fsub - fsub. */ float64 helper_fsub(CPUPPCState *env, float64 arg1, float64 arg2) { + CACHE_FN_NONE(env); float64 ret = float64_sub(arg1, arg2, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -565,6 +591,7 @@ float64 helper_fsub(CPUPPCState *env, float64 arg1, float64 arg2) /* fsubs - fsubs. */ float64 helper_fsubs(CPUPPCState *env, float64 arg1, float64 arg2) { + CACHE_FN_NONE(env); float64 ret = float64r32_sub(arg1, arg2, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -587,6 +614,7 @@ static void float_invalid_op_mul(CPUPPCState *env, int flags, /* fmul - fmul. 
*/ float64 helper_fmul(CPUPPCState *env, float64 arg1, float64 arg2) { + CACHE_FN_NONE(env); float64 ret = float64_mul(arg1, arg2, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -600,6 +628,7 @@ float64 helper_fmul(CPUPPCState *env, float64 arg1, float64 arg2) /* fmuls - fmuls. */ float64 helper_fmuls(CPUPPCState *env, float64 arg1, float64 arg2) { + CACHE_FN_NONE(env); float64 ret = float64r32_mul(arg1, arg2, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -624,6 +653,7 @@ static void float_invalid_op_div(CPUPPCState *env, int flags, /* fdiv - fdiv. */ float64 helper_fdiv(CPUPPCState *env, float64 arg1, float64 arg2) { + CACHE_FN_NONE(env); float64 ret = float64_div(arg1, arg2, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -640,6 +670,7 @@ float64 helper_fdiv(CPUPPCState *env, float64 arg1, float64 arg2) /* fdivs - fdivs. */ float64 helper_fdivs(CPUPPCState *env, float64 arg1, float64 arg2) { + CACHE_FN_NONE(env); float64 ret = float64r32_div(arg1, arg2, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -672,6 +703,7 @@ static uint64_t float_invalid_cvt(CPUPPCState *env, int flags, #define FPU_FCTI(op, cvt, nanval) \ uint64_t helper_##op(CPUPPCState *env, float64 arg) \ { \ + CACHE_FN_NONE(env); \ uint64_t ret = float64_to_##cvt(arg, &env->fp_status); \ int flags = get_float_exception_flags(&env->fp_status); \ if (unlikely(flags & float_flag_invalid)) { \ @@ -694,6 +726,8 @@ uint64_t helper_##op(CPUPPCState *env, uint64_t arg) \ { \ CPU_DoubleU farg; \ \ + CACHE_FN_NONE(env); \ + \ if (is_single) { \ float32 tmp = cvtr(arg, &env->fp_status); \ farg.d = float32_to_float64(tmp, &env->fp_status); \ @@ -715,6 +749,8 @@ static uint64_t do_fri(CPUPPCState *env, uint64_t arg, FloatRoundMode old_rounding_mode = get_float_rounding_mode(&env->fp_status); int flags; + CACHE_FN_NONE(env); + set_float_rounding_mode(rounding_mode, &env->fp_status); arg = 
float64_round_to_int(arg, &env->fp_status); set_float_rounding_mode(old_rounding_mode, &env->fp_status); @@ -764,6 +800,7 @@ static void float_invalid_op_madd(CPUPPCState *env, int flags, static float64 do_fmadd(CPUPPCState *env, float64 a, float64 b, float64 c, int madd_flags, uintptr_t retaddr) { + CACHE_FN_NONE(env); float64 ret = float64_muladd(a, b, c, madd_flags, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -776,6 +813,7 @@ static float64 do_fmadd(CPUPPCState *env, float64 a, float64 b, static uint64_t do_fmadds(CPUPPCState *env, float64 a, float64 b, float64 c, int madd_flags, uintptr_t retaddr) { + CACHE_FN_NONE(env); float64 ret = float64r32_muladd(a, b, c, madd_flags, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -817,6 +855,7 @@ static uint64_t do_frsp(CPUPPCState *env, uint64_t arg, uintptr_t retaddr) uint64_t helper_frsp(CPUPPCState *env, uint64_t arg) { + CACHE_FN_NONE(env); return do_frsp(env, arg, GETPC()); } @@ -833,6 +872,7 @@ static void float_invalid_op_sqrt(CPUPPCState *env, int flags, #define FPU_FSQRT(name, op) \ float64 helper_##name(CPUPPCState *env, float64 arg) \ { \ + CACHE_FN_NONE(env); \ float64 ret = op(arg, &env->fp_status); \ int flags = get_float_exception_flags(&env->fp_status); \ \ @@ -849,6 +889,7 @@ FPU_FSQRT(FSQRTS, float64r32_sqrt) /* fre - fre. */ float64 helper_fre(CPUPPCState *env, float64 arg) { + CACHE_FN_NONE(env); /* "Estimate" the reciprocal with actual division. */ float64 ret = float64_div(float64_one, arg, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -868,6 +909,7 @@ float64 helper_fre(CPUPPCState *env, float64 arg) /* fres - fres. */ uint64_t helper_fres(CPUPPCState *env, uint64_t arg) { + CACHE_FN_NONE(env); /* "Estimate" the reciprocal with actual division. 
*/ float64 ret = float64r32_div(float64_one, arg, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -887,6 +929,7 @@ uint64_t helper_fres(CPUPPCState *env, uint64_t arg) /* frsqrte - frsqrte. */ float64 helper_frsqrte(CPUPPCState *env, float64 arg) { + CACHE_FN_NONE(env); /* "Estimate" the reciprocal with actual division. */ float64 rets = float64_sqrt(arg, &env->fp_status); float64 retd = float64_div(float64_one, rets, &env->fp_status); @@ -906,6 +949,7 @@ float64 helper_frsqrte(CPUPPCState *env, float64 arg) /* frsqrtes - frsqrtes. */ float64 helper_frsqrtes(CPUPPCState *env, float64 arg) { + CACHE_FN_NONE(env); /* "Estimate" the reciprocal with actual division. */ float64 rets = float64_sqrt(arg, &env->fp_status); float64 retd = float64r32_div(float64_one, rets, &env->fp_status); @@ -1706,6 +1750,7 @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt, \ int i; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ \ for (i = 0; i < nels; i++) { \ float_status tstat = env->fp_status; \ @@ -1746,6 +1791,7 @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode, float_status tstat; helper_reset_fpstatus(env); + CACHE_FN_NONE(env); tstat = env->fp_status; if (unlikely(Rc(opcode) != 0)) { @@ -1853,6 +1899,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ int i; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ \ for (i = 0; i < nels; i++) { \ float_status tstat = env->fp_status; \ @@ -2684,6 +2731,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ int i; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ \ for (i = 0; i < nels; i++) { \ t.tfld = stp##_to_##ttp(xb->sfld, &env->fp_status); \ @@ -2711,6 +2759,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ int i; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ \ for (i = 0; i < nels; i++) { \ t.VsrW(2 * i) = stp##_to_##ttp(xb->VsrD(i), &env->fp_status); \ @@ -2750,6 +2799,7 @@ void helper_##op(CPUPPCState *env, 
uint32_t opcode, \ int i; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ \ for (i = 0; i < nels; i++) { \ t.tfld = stp##_to_##ttp(xb->sfld, &env->fp_status); \ @@ -2787,6 +2837,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ int i; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ \ for (i = 0; i < nels; i++) { \ t.tfld = stp##_to_##ttp(xb->sfld, 1, &env->fp_status); \ @@ -2836,6 +2887,7 @@ void helper_XSCVQPDP(CPUPPCState *env, uint32_t ro, ppc_vsr_t *xt, float_status tstat; helper_reset_fpstatus(env); + CACHE_FN_NONE(env); tstat = env->fp_status; if (ro != 0) { @@ -2862,6 +2914,8 @@ uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb) float_status tstat = env->fp_status; set_float_exception_flags(0, &tstat); + CACHE_FN_NONE(env); + sign = extract64(xb, 63, 1); exp = extract64(xb, 52, 11); frac = extract64(xb, 0, 52) | 0x10000000000000ULL; @@ -2897,6 +2951,7 @@ uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb) uint64_t helper_XSCVSPDPN(uint64_t xb) { + /* TODO: missing env for CACHE_FN_NONE(env); */ return helper_todouble(xb >> 32); } @@ -2919,6 +2974,8 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ \ helper_reset_fpstatus(env); \ \ + CACHE_FN_NONE(env); \ + \ for (i = 0; i < nels; i++) { \ t.tfld = stp##_to_##ttp##_round_to_zero(xb->sfld, &env->fp_status); \ flags = env->fp_status.float_exception_flags; \ @@ -2953,6 +3010,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ int flags; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ t.s128 = float128_to_##tp##_round_to_zero(xb->f128, &env->fp_status); \ flags = get_float_exception_flags(&env->fp_status); \ if (unlikely(flags & float_flag_invalid)) { \ @@ -2984,6 +3042,8 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ \ helper_reset_fpstatus(env); \ \ + CACHE_FN_NONE(env); \ + \ for (i = 0; i < nels; i++) { \ t.VsrW(2 * i) = stp##_to_##ttp##_round_to_zero(xb->VsrD(i), \ &env->fp_status); \ 
@@ -3021,6 +3081,7 @@ void helper_##op(CPUPPCState *env, uint32_t opcode, \ int flags; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ \ t.tfld = stp##_to_##ttp##_round_to_zero(xb->sfld, &env->fp_status); \ flags = get_float_exception_flags(&env->fp_status); \ @@ -3057,6 +3118,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ int i; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ \ for (i = 0; i < nels; i++) { \ t.tfld = stp##_to_##ttp(xb->sfld, &env->fp_status); \ @@ -3105,6 +3167,7 @@ VSX_CVT_INT_TO_FP2(xvcvuxdsp, uint64, float32) void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)\ { \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ xt->f128 = tp##_to_float128(xb->s128, &env->fp_status); \ helper_compute_fprf_float128(env, xt->f128); \ do_float_check_status(env, true, GETPC()); \ @@ -3128,6 +3191,8 @@ void helper_##op(CPUPPCState *env, uint32_t opcode, \ ppc_vsr_t t = *xt; \ \ helper_reset_fpstatus(env); \ + CACHE_FN_NONE(env); \ + \ t.tfld = stp##_to_##ttp(xb->sfld, &env->fp_status); \ helper_compute_fprf_##ttp(env, t.tfld); \ \ diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 57eee07256..88147b68a0 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -76,6 +76,7 @@ DEF_HELPER_FLAGS_2(brinc, TCG_CALL_NO_RWG_SE, tl, tl, tl) DEF_HELPER_1(float_check_status, void, env) DEF_HELPER_1(fpscr_check_status, void, env) DEF_HELPER_1(reset_fpstatus, void, env) +DEF_HELPER_1(execute_fp_cached, void, env) DEF_HELPER_2(compute_fprf_float64, void, env, i64) DEF_HELPER_3(store_fpscr, void, env, i64, i32) DEF_HELPER_2(fpscr_clrbit, void, env, i32) diff --git a/target/ppc/translate/fp-impl.c.inc b/target/ppc/translate/fp-impl.c.inc index 8d5cf0f982..10dbfb6edd 100644 --- a/target/ppc/translate/fp-impl.c.inc +++ b/target/ppc/translate/fp-impl.c.inc @@ -633,6 +633,7 @@ static bool trans_MFFS(DisasContext *ctx, arg_X_t_rc *a) REQUIRE_FPU(ctx); gen_reset_fpstatus(); + 
gen_helper_execute_fp_cached(cpu_env); fpscr = place_from_fpscr(a->rt, UINT64_MAX); if (a->rc) { gen_set_cr1_from_fpscr(ctx); From patchwork Wed Oct 5 14:37:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?V=C3=ADctor_Colombo?= X-Patchwork-Id: 12999384 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 85D89C4332F for ; Wed, 5 Oct 2022 15:30:44 +0000 (UTC) Received: from localhost ([::1]:53816 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1og6M7-0004XA-HE for qemu-devel@archiver.kernel.org; Wed, 05 Oct 2022 11:30:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43468) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1og5Xl-0005xL-B3; Wed, 05 Oct 2022 10:38:42 -0400 Received: from [200.168.210.66] (port=55228 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1og5Xj-0004bx-MA; Wed, 05 Oct 2022 10:38:41 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Wed, 5 Oct 2022 11:37:24 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id 9116A8003B3; Wed, 5 Oct 2022 11:37:23 -0300 (-03) From: =?utf-8?q?V=C3=ADctor_Colombo?= To: qemu-devel@nongnu.org, qemu-ppc@nongnu.org Cc: clg@kaod.org, danielhb413@gmail.com, david@gibson.dropbear.id.au, groug@kaod.org, richard.henderson@linaro.org, aurelien@aurel32.net, peter.maydell@linaro.org, alex.bennee@linaro.org, balaton@eik.bme.hu, victor.colombo@eldorado.org.br, matheus.ferst@eldorado.org.br, 
lucas.araujo@eldorado.org.br, leandro.lupori@eldorado.org.br, lucas.coutinho@eldorado.org.br Subject: [RFC PATCH 2/4] target/ppc: Implement instruction caching for fsqrt Date: Wed, 5 Oct 2022 11:37:17 -0300 Message-Id: <20221005143719.65241-3-victor.colombo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221005143719.65241-1-victor.colombo@eldorado.org.br> References: <20221005143719.65241-1-victor.colombo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 05 Oct 2022 14:37:24.0205 (UTC) FILETIME=[FBCD31D0:01D8D8C7] X-Host-Lookup-Failed: Reverse DNS lookup failed for 200.168.210.66 (failed) Received-SPF: pass client-ip=200.168.210.66; envelope-from=victor.colombo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -10 X-Spam_score: -1.1 X-Spam_bar: - X-Spam_report: (-1.1 / 5.0 requ) BAYES_00=-1.9, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel"

This patch adds the code needed to cache fsqrt for use with hardfpu on
Power. fsqrt is the first instruction to use the new instruction caching
mechanism.

The softfloat function behind fsqrt takes two arguments, an f64 and a
float_status, and returns an f64. The function pointer and its arguments
are cached in a new union in env, which will grow as instructions with
other signatures are added.

Hardfpu in QEMU is only used when the inexact flag is already set.
CACHE_FN_3 therefore checks whether FP_XX is set and, if so, sets
float_flag_inexact to enable the hardfpu path. When the cached
instruction is later re-executed, it runs with float_flag_inexact
cleared, which forces softfloat and updates the relevant flags correctly,
as happens today.

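Note for reviewers, not part of the patch: the sketch below models the
cache-then-replay idea on a toy integer square root, to show how the
per-instruction inexact bit can be recovered after the fast path has run.
It is an illustration under stated assumptions. All demo_* and DEMO_*
names are hypothetical, the toy op stands in for the softfloat function,
and the real fast path (host FPU) is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FLAG_INEXACT 0x1u  /* stand-in for float_flag_inexact */
#define DEMO_FPSCR_XX     0x1u  /* stand-in for FP_XX (sticky inexact) */
#define DEMO_FPSCR_FI     0x2u  /* stand-in for FP_FI (last insn inexact) */

struct demo_status { unsigned exception_flags; };

enum { DEMO_CACHED_NONE, DEMO_CACHED_U32_U32_STATUS };

struct demo_env {
    struct demo_status status;
    unsigned fpscr;
    int cached_type;
    struct {
        uint32_t (*fn)(uint32_t, struct demo_status *);
        uint32_t arg1;
        struct demo_status arg2;    /* snapshot of status, not a pointer */
    } cached;
};

/* Toy "softfloat" op: floor(sqrt(x)), raising inexact when x is no square. */
static uint32_t demo_soft_isqrt(uint32_t x, struct demo_status *s)
{
    uint64_t r = 0;
    while ((r + 1) * (r + 1) <= x) {
        r++;
    }
    if (r * r != x) {
        s->exception_flags |= DEMO_FLAG_INEXACT;
    }
    return (uint32_t)r;
}

/* Role of CACHE_FN_3 plus the helper body. */
static uint32_t demo_helper_isqrt(struct demo_env *env, uint32_t x)
{
    if (env->fpscr & DEMO_FPSCR_XX) {
        env->cached_type = DEMO_CACHED_U32_U32_STATUS;
        env->cached.fn = demo_soft_isqrt;
        env->cached.arg1 = x;
        env->cached.arg2 = env->status;     /* taken before inexact is forced */
        env->status.exception_flags |= DEMO_FLAG_INEXACT;
    } else {
        env->cached_type = DEMO_CACHED_NONE;
    }
    /* A real implementation could take the host-FPU path here. */
    return demo_soft_isqrt(x, &env->status);
}

/* Role of helper_execute_fp_cached: replay on the snapshot, update FI/XX. */
static void demo_execute_cached(struct demo_env *env)
{
    if (env->cached_type == DEMO_CACHED_U32_U32_STATUS) {
        assert(!(env->cached.arg2.exception_flags & DEMO_FLAG_INEXACT));
        env->cached.fn(env->cached.arg1, &env->cached.arg2);
        env->fpscr &= ~DEMO_FPSCR_FI;
        if (env->cached.arg2.exception_flags & DEMO_FLAG_INEXACT) {
            env->fpscr |= DEMO_FPSCR_FI | DEMO_FPSCR_XX;
        }
    }
    env->cached_type = DEMO_CACHED_NONE;
}
```

The snapshot is stored by value so that forcing the inexact flag in the
live status does not leak into the replay. In the patch the same effect
comes from copying env->fp_status into arg2 before the flag is OR'ed in.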
Signed-off-by: Víctor Colombo --- target/ppc/cpu.h | 11 +++++++++++ target/ppc/fpu_helper.c | 39 ++++++++++++++++++++++++++++++++++++++- 2 files changed, 49 insertions(+), 1 deletion(-) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index 1132d60162..b423e33a0c 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -1082,6 +1082,14 @@ struct ppc_radix_page_info { enum { CACHED_FN_TYPE_NONE, + CACHED_FN_TYPE_F64_F64_FSTATUS, + +}; + +struct cached_fn_f64_f64_fstatus { + float64 (*fn)(float64, float_status*); + float64 arg1; + float_status arg2; }; struct CPUArchState { @@ -1162,6 +1170,9 @@ struct CPUArchState { target_ulong fpscr; /* Floating point status and control register */ int cached_fn_type; + union { + struct cached_fn_f64_f64_fstatus f64_f64_fstatus; + } cached_fn; /* Internal devices resources */ ppc_tb_t *tb_env; /* Time base and decrementer */ diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index 6aaee37619..b68f12a1a9 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -30,6 +30,21 @@ env->cached_fn_type = CACHED_FN_TYPE_NONE; \ } while (0) +#define CACHE_FN_3(env, FN, ARG1, ARG2, FIELD, TYPE) \ + do { \ + if (env->fpscr & FP_XX) { \ + env->cached_fn_type = TYPE; \ + env->cached_fn.FIELD.fn = FN; \ + env->cached_fn.FIELD.arg1 = ARG1; \ + env->cached_fn.FIELD.arg2 = ARG2; \ + env->fp_status.float_exception_flags |= float_flag_inexact; \ + } else { \ + assert(!(env->fp_status.float_exception_flags & \ + float_flag_inexact)); \ + env->cached_fn_type = CACHED_FN_TYPE_NONE; \ + } \ + } while (0) + static inline float128 float128_snan_to_qnan(float128 x) { float128 r; @@ -530,6 +545,27 @@ void helper_execute_fp_cached(CPUPPCState *env) * so no need to execute it again */ break; + case CACHED_FN_TYPE_F64_F64_FSTATUS: + /* + * execute the cached insn. 
At this point, float_exception_flags + * should have FI not set, otherwise the result will not be correct + */ + assert((env->cached_fn.f64_f64_fstatus.arg2.float_exception_flags & + float_flag_inexact) == 0); + env->cached_fn.f64_f64_fstatus.fn( + env->cached_fn.f64_f64_fstatus.arg1, + &env->cached_fn.f64_f64_fstatus.arg2); + + env->fpscr &= ~FP_FI; + /* + * if the cached instruction resulted in FI being set + * then we update fpscr with this value + */ + if (env->cached_fn.f64_f64_fstatus.arg2.float_exception_flags & + float_flag_inexact) { + env->fpscr |= FP_FI | FP_XX; + } + break; default: g_assert_not_reached(); } @@ -872,7 +908,8 @@ static void float_invalid_op_sqrt(CPUPPCState *env, int flags, #define FPU_FSQRT(name, op) \ float64 helper_##name(CPUPPCState *env, float64 arg) \ { \ - CACHE_FN_NONE(env); \ + CACHE_FN_3(env, op, arg, env->fp_status, f64_f64_fstatus, \ + CACHED_FN_TYPE_F64_F64_FSTATUS); \ float64 ret = op(arg, &env->fp_status); \ int flags = get_float_exception_flags(&env->fp_status); \ \ From patchwork Wed Oct 5 14:37:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?V=C3=ADctor_Colombo?= X-Patchwork-Id: 12999343 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B3A6CC433FE for ; Wed, 5 Oct 2022 15:08:20 +0000 (UTC) Received: from localhost ([::1]:48088 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1og60R-00022o-Nv for qemu-devel@archiver.kernel.org; Wed, 05 Oct 2022 11:08:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44970) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 
1og5Xq-0005yq-Pm; Wed, 05 Oct 2022 10:38:47 -0400 Received: from [200.168.210.66] (port=55228 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1og5Xp-0004bx-0U; Wed, 05 Oct 2022 10:38:46 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Wed, 5 Oct 2022 11:37:24 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id 1A6208002A8; Wed, 5 Oct 2022 11:37:24 -0300 (-03) From: =?utf-8?q?V=C3=ADctor_Colombo?= To: qemu-devel@nongnu.org, qemu-ppc@nongnu.org Cc: clg@kaod.org, danielhb413@gmail.com, david@gibson.dropbear.id.au, groug@kaod.org, richard.henderson@linaro.org, aurelien@aurel32.net, peter.maydell@linaro.org, alex.bennee@linaro.org, balaton@eik.bme.hu, victor.colombo@eldorado.org.br, matheus.ferst@eldorado.org.br, lucas.araujo@eldorado.org.br, leandro.lupori@eldorado.org.br, lucas.coutinho@eldorado.org.br Subject: [RFC PATCH 3/4] target/ppc: Implement instruction caching for muladd Date: Wed, 5 Oct 2022 11:37:18 -0300 Message-Id: <20221005143719.65241-4-victor.colombo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221005143719.65241-1-victor.colombo@eldorado.org.br> References: <20221005143719.65241-1-victor.colombo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 05 Oct 2022 14:37:24.0847 (UTC) FILETIME=[FC2F27F0:01D8D8C7] X-Host-Lookup-Failed: Reverse DNS lookup failed for 200.168.210.66 (failed) Received-SPF: pass client-ip=200.168.210.66; envelope-from=victor.colombo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -10 X-Spam_score: -1.1 X-Spam_bar: - X-Spam_report: (-1.1 / 5.0 requ) BAYES_00=-1.9, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel"

This patch adds the code needed to cache muladd instructions for use with
hardfpu on Power.

The softfloat muladd function takes five arguments, three f64, one int
holding the muladd flags, and one float_status, and returns an f64. The
function pointer and its arguments are cached in the union in env, which
grows as instructions with other signatures are added.

Hardfpu in QEMU is only used when the inexact flag is already set.
CACHE_FN_5 therefore checks whether FP_XX is set and, if so, sets
float_flag_inexact to enable the hardfpu path. When the cached
instruction is later re-executed, it runs with float_flag_inexact
cleared, which forces softfloat and updates the relevant flags correctly,
as happens today.

Signed-off-by: Víctor Colombo --- target/ppc/cpu.h | 11 +++++++++++ target/ppc/fpu_helper.c | 34 ++++++++++++++++++++++++++++++++-- 2 files changed, 43 insertions(+), 2 deletions(-) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index b423e33a0c..87183de484 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -1083,6 +1083,7 @@ struct ppc_radix_page_info { enum { CACHED_FN_TYPE_NONE, CACHED_FN_TYPE_F64_F64_FSTATUS, + CACHED_FN_TYPE_F64_F64_F64_F64_I_FSTATUS, }; @@ -1092,6 +1093,15 @@ struct cached_fn_f64_f64_fstatus { float_status arg2; }; +struct cached_fn_f64_f64_f64_f64_i_fstatus { + float64 (*fn)(float64, float64, float64, int, float_status*); + float64 arg1; + float64 arg2; + float64 arg3; + int arg4; + float_status arg5; +}; + struct CPUArchState { /* Most commonly used resources during translated code execution first */ target_ulong gpr[32]; /* general purpose registers */ @@ -1172,6 +1182,7 @@ struct CPUArchState { int cached_fn_type; union { struct cached_fn_f64_f64_fstatus f64_f64_fstatus; + struct cached_fn_f64_f64_f64_f64_i_fstatus f64_f64_f64_f64_i_fstatus; } cached_fn; /* Internal devices resources */ diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index b68f12a1a9..3d06a0fc1a 100644 --- 
a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -45,6 +45,23 @@ } \ } while (0) +#define CACHE_FN_5(env, FN, ARG1, ARG2, ARG3, ARG4, FIELD, TYPE) \ + do { \ + if (env->fpscr & FP_XX) { \ + env->cached_fn_type = TYPE; \ + env->cached_fn.FIELD.fn = FN; \ + env->cached_fn.FIELD.arg1 = ARG1; \ + env->cached_fn.FIELD.arg2 = ARG2; \ + env->cached_fn.FIELD.arg3 = ARG3; \ + env->cached_fn.FIELD.arg4 = ARG4; \ + env->fp_status.float_exception_flags |= float_flag_inexact; \ + } else { \ + assert(!(env->fp_status.float_exception_flags & \ + float_flag_inexact)); \ + env->cached_fn_type = CACHED_FN_TYPE_NONE; \ + } \ + } while (0) + static inline float128 float128_snan_to_qnan(float128 x) { float128 r; @@ -566,6 +583,17 @@ void helper_execute_fp_cached(CPUPPCState *env) env->fpscr |= FP_FI | FP_XX; } break; + case CACHED_FN_TYPE_F64_F64_F64_F64_I_FSTATUS: + ; /* hack to allow declaration below */ + struct cached_fn_f64_f64_f64_f64_i_fstatus args = + env->cached_fn.f64_f64_f64_f64_i_fstatus; + assert(!(args.arg5.float_exception_flags & float_flag_inexact)); + args.fn(args.arg1, args.arg2, args.arg3, args.arg4, &args.arg5); + env->fpscr &= ~FP_FI; + if (args.arg5.float_exception_flags & float_flag_inexact) { + env->fpscr |= FP_FI | FP_XX; + } + break; default: g_assert_not_reached(); } @@ -836,7 +864,8 @@ static void float_invalid_op_madd(CPUPPCState *env, int flags, static float64 do_fmadd(CPUPPCState *env, float64 a, float64 b, float64 c, int madd_flags, uintptr_t retaddr) { - CACHE_FN_NONE(env); + CACHE_FN_5(env, float64_muladd, a, b, c, madd_flags, + f64_f64_f64_f64_i_fstatus, CACHED_FN_TYPE_F64_F64_F64_F64_I_FSTATUS); float64 ret = float64_muladd(a, b, c, madd_flags, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); @@ -849,7 +878,8 @@ static float64 do_fmadd(CPUPPCState *env, float64 a, float64 b, static uint64_t do_fmadds(CPUPPCState *env, float64 a, float64 b, float64 c, int madd_flags, uintptr_t retaddr) { - CACHE_FN_NONE(env); + 
CACHE_FN_5(env, float64r32_muladd, a, b, c, madd_flags, + f64_f64_f64_f64_i_fstatus, CACHED_FN_TYPE_F64_F64_F64_F64_I_FSTATUS); float64 ret = float64r32_muladd(a, b, c, madd_flags, &env->fp_status); int flags = get_float_exception_flags(&env->fp_status); From patchwork Wed Oct 5 14:37:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?V=C3=ADctor_Colombo?= X-Patchwork-Id: 12999339 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 48425C433F5 for ; Wed, 5 Oct 2022 15:02:21 +0000 (UTC) Received: from localhost ([::1]:40878 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1og5ue-0004Bj-7H for qemu-devel@archiver.kernel.org; Wed, 05 Oct 2022 11:02:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44974) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1og5Xx-00066m-Lz; Wed, 05 Oct 2022 10:38:53 -0400 Received: from [200.168.210.66] (port=55228 helo=outlook.eldorado.org.br) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1og5Xu-0004bx-UI; Wed, 05 Oct 2022 10:38:53 -0400 Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS secured channel with Microsoft SMTPSVC(8.5.9600.16384); Wed, 5 Oct 2022 11:37:25 -0300 Received: from eldorado.org.br (unknown [10.10.70.45]) by p9ibm (Postfix) with ESMTP id ADB6B8003B3; Wed, 5 Oct 2022 11:37:24 -0300 (-03) From: =?utf-8?q?V=C3=ADctor_Colombo?= To: qemu-devel@nongnu.org, qemu-ppc@nongnu.org Cc: clg@kaod.org, danielhb413@gmail.com, david@gibson.dropbear.id.au, groug@kaod.org, richard.henderson@linaro.org, aurelien@aurel32.net, 
peter.maydell@linaro.org, alex.bennee@linaro.org, balaton@eik.bme.hu, victor.colombo@eldorado.org.br, matheus.ferst@eldorado.org.br, lucas.araujo@eldorado.org.br, leandro.lupori@eldorado.org.br, lucas.coutinho@eldorado.org.br Subject: [RFC PATCH 4/4] target/ppc: Enable hardfpu for Power Date: Wed, 5 Oct 2022 11:37:19 -0300 Message-Id: <20221005143719.65241-5-victor.colombo@eldorado.org.br> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20221005143719.65241-1-victor.colombo@eldorado.org.br> References: <20221005143719.65241-1-victor.colombo@eldorado.org.br> MIME-Version: 1.0 X-OriginalArrivalTime: 05 Oct 2022 14:37:25.0363 (UTC) FILETIME=[FC7DE430:01D8D8C7] X-Host-Lookup-Failed: Reverse DNS lookup failed for 200.168.210.66 (failed) Received-SPF: pass client-ip=200.168.210.66; envelope-from=victor.colombo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -10 X-Spam_score: -1.1 X-Spam_bar: - X-Spam_report: (-1.1 / 5.0 requ) BAYES_00=-1.9, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Remove the build conditional from softfloat.c, allowing TARGET_PPC to use hardfpu. Signed-off-by: Víctor Colombo --- fpu/softfloat.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index c7454c3eb1..de94732f6a 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -220,11 +220,9 @@ GEN_INPUT_FLUSH3(float64_input_flush3, float64) * the use of hardfloat, since hardfloat relies on the inexact flag being * already set. 
 */
-#if defined(TARGET_PPC) || defined(__FAST_MATH__)
-# if defined(__FAST_MATH__)
-# warning disabling hardfloat due to -ffast-math: hardfloat requires an exact \
+#if defined(__FAST_MATH__)
+# warning disabling hardfloat due to -ffast-math: hardfloat requires an exact \
     IEEE implementation
-# endif
 # define QEMU_NO_HARDFLOAT 1
 # define QEMU_SOFTFLOAT_ATTR QEMU_FLATTEN
 #else