[PULL,11/48] target/i386: optimize CX handling in repeated string operations

Message ID 20250124094442.13207-12-pbonzini@redhat.com (mailing list archive)
State New
Series [PULL,01/48] rust: pl011: fix repr(C) for PL011Class

Commit Message

Paolo Bonzini Jan. 24, 2025, 9:44 a.m. UTC
In a repeated string operation, CX/ECX is decremented until it reaches
0 and never underflows.  Use this observation to avoid a deposit or
zero-extend operation if the address size of the operation is smaller
than MO_TL.
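
The observation can be checked outside of TCG.  The following is only
an illustrative sketch, not code from this patch: it models a 64-bit
target in plain C, and all of its function names are hypothetical.  It
compares the architectural write-back (deposit for a 16-bit count,
zero-extension for a 32-bit count) against a full-width subtraction
with at most one zero-extension, under the assumption that the masked
count is nonzero when the decrement happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask of the low `bits` bits (16, 32 or 64). */
static uint64_t low_mask(unsigned bits)
{
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/*
 * Reference model: decrement the count within its own width, then
 * apply x86 register-write rules.  A 16-bit write keeps the upper
 * bits (a deposit), a 32-bit write zero-extends, and a 64-bit write
 * replaces the whole register.
 */
static uint64_t dec_cx_reference(uint64_t rcx, unsigned bits)
{
    uint64_t mask = low_mask(bits);
    uint64_t field = (rcx - 1) & mask;

    if (bits == 16) {
        return (rcx & ~mask) | field;
    }
    return field;
}

/*
 * Shortcut model: subtract at full width and zero-extend only for a
 * 32-bit count.  Only valid when (rcx & mask) != 0, because then the
 * borrow cannot propagate past the count field.
 */
static uint64_t dec_cx_shortcut(uint64_t rcx, unsigned bits)
{
    uint64_t next = rcx - 1;

    if (bits == 32) {
        next = (uint32_t)next;
    }
    return next;
}

static bool cx_shortcut_matches(uint64_t rcx, unsigned bits)
{
    return dec_cx_reference(rcx, bits) == dec_cx_shortcut(rcx, bits);
}

/* Small self-check over a few nonzero counts with dirty upper bits. */
static void cx_shortcut_selfcheck(void)
{
    const uint64_t upper = UINT64_C(0xABCD000000000000);

    for (uint64_t count = 1; count <= 0xFFFF; count += 0x1111) {
        assert(cx_shortcut_matches(upper | count, 16));
        assert(cx_shortcut_matches(upper | count, 32));
        assert(cx_shortcut_matches(upper | count, 64));
    }
}
```

With a 16-bit count of zero the two models disagree (the borrow leaks
into the upper bits), which is why the shortcut depends on the
decrement being reached only when the masked count is nonzero.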

As in the previous patch, this patch is structured to include some
preparatory work for subsequent changes.  In particular, introducing
cx_next prepares for when ECX will be decremented *before* calling
fn(s, ot), and therefore cannot yet be written back to cpu_regs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20241215090613.89588-11-pbonzini@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Patch

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 7a3caf8b996..0a8f3c89514 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1339,6 +1339,7 @@  static void do_gen_rep(DisasContext *s, MemOp ot,
 {
     TCGLabel *done = gen_new_label();
     target_ulong cx_mask = MAKE_64BIT_MASK(0, 8 << s->aflag);
+    TCGv cx_next = tcg_temp_new();
     bool had_rf = s->flags & HF_RF_MASK;
 
     /*
@@ -1364,7 +1365,19 @@  static void do_gen_rep(DisasContext *s, MemOp ot,
     tcg_gen_brcondi_tl(TCG_COND_TSTEQ, cpu_regs[R_ECX], cx_mask, done);
 
     fn(s, ot);
-    gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
+
+    tcg_gen_subi_tl(cx_next, cpu_regs[R_ECX], 1);
+
+    /*
+     * Write back cx_next to CX/ECX/RCX.  There can be no carry, so zero
+     * extend if needed but do not do expensive deposit operations.
+     */
+#ifdef TARGET_X86_64
+    if (s->aflag == MO_32) {
+        tcg_gen_ext32u_tl(cx_next, cx_next);
+    }
+#endif
+    tcg_gen_mov_tl(cpu_regs[R_ECX], cx_next);
     gen_update_cc_op(s);
 
     /* Leave if REP condition fails.  */