Message ID | 20220902034412.8918-1-palmer@rivosinc.com (mailing list archive) |
---|---|
State | New, archived |
Series | RISC-V: Add support for Ztso |
On 9/2/22 04:44, Palmer Dabbelt wrote:
> -#define TCG_GUEST_DEFAULT_MO 0
> +/*
> + * RISC-V has two memory models: TSO is a bit weaker than Intel (MMIO and
> + * fetch), and WMO is approximately equivalent to Arm MCA. Rather than
> + * enforcing orderings on most accesses, just default to the target memory
> + * order.
> + */
> +#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
> +# define TCG_GUEST_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
> +#else
> +# define TCG_GUEST_DEFAULT_MO (0)
> +#endif

TCG_GUEST_DEFAULT_MO should be allowed to be variable. Since I've not tried that, it may
not work, but making sure that it does would be the first thing to do.

> --- a/tcg/i386/tcg-target.h
> +++ b/tcg/i386/tcg-target.h
> @@ -236,6 +236,7 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
>  #include "tcg/tcg-mo.h"
>
>  #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
> +#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1

Um, no. There's no need for this hackery...

> +#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
> +    /*
> +     * We only support Ztso on targets that themselves are already TSO, which
> +     * means there's no way to provide just RVWMO on those targets. Instead
> +     * just default to telling the guest that Ztso is enabled.
> +     */
> +    DEFINE_PROP_BOOL("ztso", RISCVCPU, cfg.ext_ztso, true),
> +#endif

... you can just as well define the property at runtime, with a runtime check on
TCG_TARGET_DEFAULT_MO.

Though, honestly, I've had patches to add the required barriers sitting around for the
last few releases, to better support things like x86 on aarch64. I should just finish
that up.

r~
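The runtime check suggested in this reply could look roughly like the sketch below. It is only an illustration: the MO_* constants are local stand-ins assumed to mirror the bit layout of QEMU's TCG_MO_* flags, and host_provides_tso() is a hypothetical helper name, not an existing QEMU function.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for QEMU's TCG_MO_* bits (assumed layout, illustration only). */
enum {
    MO_LD_LD = 0x01,  /* earlier load ordered before later load */
    MO_ST_LD = 0x02,  /* earlier store ordered before later load */
    MO_LD_ST = 0x04,  /* earlier load ordered before later store */
    MO_ST_ST = 0x08,  /* earlier store ordered before later store */
    MO_ALL   = 0x0f,
};

/* Orderings a TSO guest relies on: everything except store->load. */
#define MO_TSO (MO_ALL & ~MO_ST_LD)

/*
 * Hypothetical helper: true if a host whose default ordering mask is
 * host_mo already provides every ordering that TSO requires, so that
 * "ztso" could be offered without any extra barriers.
 */
static bool host_provides_tso(unsigned host_mo)
{
    assert((host_mo & ~MO_ALL) == 0);  /* reject unknown bits */
    return (host_mo & MO_TSO) == MO_TSO;
}
```

With a check of this shape, the property default can be computed when the CPU object is set up, and no per-host TCG_TARGET_SUPPORTS_MCTCG_RVTSO macro is needed.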
On Sat, 03 Sep 2022 17:47:54 PDT (-0700), richard.henderson@linaro.org wrote:
> On 9/2/22 04:44, Palmer Dabbelt wrote:
>> -#define TCG_GUEST_DEFAULT_MO 0
>> +/*
>> + * RISC-V has two memory models: TSO is a bit weaker than Intel (MMIO and
>> + * fetch), and WMO is approximately equivalent to Arm MCA. Rather than
>> + * enforcing orderings on most accesses, just default to the target memory
>> + * order.
>> + */
>> +#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
>> +# define TCG_GUEST_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
>> +#else
>> +# define TCG_GUEST_DEFAULT_MO (0)
>> +#endif
>
> TCG_GUEST_DEFAULT_MO should be allowed to be variable. Since I've not tried that, it may
> not work, but making sure that it does would be the first thing to do.
>
>> --- a/tcg/i386/tcg-target.h
>> +++ b/tcg/i386/tcg-target.h
>> @@ -236,6 +236,7 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
>>  #include "tcg/tcg-mo.h"
>>
>>  #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
>> +#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1
>
> Um, no. There's no need for this hackery...
>
>> +#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
>> +    /*
>> +     * We only support Ztso on targets that themselves are already TSO, which
>> +     * means there's no way to provide just RVWMO on those targets. Instead
>> +     * just default to telling the guest that Ztso is enabled.
>> +     */
>> +    DEFINE_PROP_BOOL("ztso", RISCVCPU, cfg.ext_ztso, true),
>> +#endif
>
> ... you can just as well define the property at runtime, with a runtime check on
> TCG_TARGET_DEFAULT_MO.
>
> Though, honestly, I've had patches to add the required barriers sitting around for the
> last few releases, to better support things like x86 on aarch64. I should just finish
> that up.

I can just do that for the RISC-V TSO support? Like the cover letter says, that was my
first thought; it's only when I found the comment saying not to do it that I went this way.

>
>
> r~
On 9/16/22 14:52, Palmer Dabbelt wrote:
>> Though, honestly, I've had patches to add the required barriers sitting around for the
>> last few releases, to better support things like x86 on aarch64. I should just finish
>> that up.
>
> I can just do that for the RISC-V TSO support? Like the cover letter says that was my
> first thought, it's only when I found the comment saying not to do it that I went this way.

My patches inject the barriers automatically by the tcg optimizer, rather than by hand,
which is what the comment was trying to discourage. Last version was

https://lore.kernel.org/qemu-devel/20210316220735.2048137-1-richard.henderson@linaro.org/

r~
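As a rough illustration of what "injecting the barriers automatically" means in terms of the ordering masks, the sketch below computes which orderings a guest expects that the host does not already give, and which of those matter ahead of a load or a store. It is not taken from the linked series: the MO_* constants are local stand-ins assumed to mirror QEMU's TCG_MO_* bits, and all three function names are hypothetical.

```c
#include <assert.h>

/* Local stand-ins for QEMU's TCG_MO_* bits (assumed layout, illustration only). */
enum {
    MO_LD_LD = 0x01,  /* earlier load ordered before later load */
    MO_ST_LD = 0x02,  /* earlier store ordered before later load */
    MO_LD_ST = 0x04,  /* earlier load ordered before later store */
    MO_ST_ST = 0x08,  /* earlier store ordered before later store */
    MO_ALL   = 0x0f,
};

/* Orderings the guest expects that the host does not provide by default. */
static unsigned missing_orderings(unsigned guest_mo, unsigned host_mo)
{
    assert(((guest_mo | host_mo) & ~MO_ALL) == 0);  /* reject unknown bits */
    return guest_mo & ~host_mo;
}

/* Barrier bits needed ahead of a guest load: orderings that end in a load. */
static unsigned barrier_before_load(unsigned guest_mo, unsigned host_mo)
{
    return missing_orderings(guest_mo, host_mo) & (MO_LD_LD | MO_ST_LD);
}

/* Barrier bits needed ahead of a guest store: orderings that end in a store. */
static unsigned barrier_before_store(unsigned guest_mo, unsigned host_mo)
{
    return missing_orderings(guest_mo, host_mo) & (MO_LD_ST | MO_ST_ST);
}
```

For a TSO guest (MO_ALL & ~MO_ST_LD) on a host that is itself TSO, both helpers return 0, so no barriers are generated. On a host with no default ordering, the load case needs MO_LD_LD and the store case needs MO_LD_ST | MO_ST_ST.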
On Sat, 17 Sep 2022 01:02:46 PDT (-0700), Richard Henderson wrote:
> On 9/16/22 14:52, Palmer Dabbelt wrote:
>>> Though, honestly, I've had patches to add the required barriers sitting around for the
>>> last few releases, to better support things like x86 on aarch64. I should just finish
>>> that up.
>>
>> I can just do that for the RISC-V TSO support? Like the cover letter says that was my
>> first thought, it's only when I found the comment saying not to do it that I went this way.
>
> My patches inject the barriers automatically by the tcg optimizer, rather than by hand,
> which is what the comment was trying to discourage. Last version was
>
> https://lore.kernel.org/qemu-devel/20210316220735.2048137-1-richard.henderson@linaro.org/

Thanks, I get it now.
* Palmer Dabbelt (palmer@rivosinc.com) wrote:
> Ztso, the RISC-V extension that provides the TSO memory model, was
> recently frozen. This provides support for Ztso on targets that are
> themselves TSO.
>
> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
>
> ---
>

> diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
> index 00fcbe297d..2a43d54fcd 100644
> --- a/tcg/i386/tcg-target.h
> +++ b/tcg/i386/tcg-target.h
> @@ -236,6 +236,7 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
>  #include "tcg/tcg-mo.h"
>
>  #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
> +#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1

Is x86's brand of memory ordering strong enough for Ztso?
I thought x86 had an optimisation where it was allowed to store forward
within the current CPU causing stores not to be quite strictly ordered.

Dave

>  #define TCG_TARGET_HAS_MEMORY_BSWAP have_movbe
>
> diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
> index 23e2063667..f423c124a0 100644
> --- a/tcg/s390x/tcg-target.h
> +++ b/tcg/s390x/tcg-target.h
> @@ -171,6 +171,7 @@ extern uint64_t s390_facilities[3];
>  #define TCG_TARGET_HAS_MEMORY_BSWAP 1
>
>  #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
> +#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1
>
>  static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
>                                              uintptr_t jmp_rw, uintptr_t addr)
> --
> 2.34.1
>
>
On Thu, 29 Sep 2022 12:16:48 PDT (-0700), dgilbert@redhat.com wrote:
> * Palmer Dabbelt (palmer@rivosinc.com) wrote:
>> Ztso, the RISC-V extension that provides the TSO memory model, was
>> recently frozen. This provides support for Ztso on targets that are
>> themselves TSO.
>>
>> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
>>
>> ---
>>
>
>> diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
>> index 00fcbe297d..2a43d54fcd 100644
>> --- a/tcg/i386/tcg-target.h
>> +++ b/tcg/i386/tcg-target.h
>> @@ -236,6 +236,7 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
>>  #include "tcg/tcg-mo.h"
>>
>>  #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
>> +#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1
>
> Is x86's brand of memory ordering strong enough for Ztso?
> I thought x86 had an optimisation where it was allowed to store forward
> within the current CPU causing stores not to be quite strictly ordered.

I'm actually not sure: my understanding of the Intel memory model was that
there's a bunch of subtle bits that don't match the various TSO
formalizations, but the RISC-V folks are pretty adamant that Intel is
exactly TSO. I've gotten yelled at enough times on this one that I kind of
just stopped caring, but that's not a good reason to have broken code so I'm
happy to go fix it.

That said, when putting together the v2 (which has TCG barriers in the
RISC-V front-end) I couldn't even really figure out how the TCG memory model
works in any formal capacity -- I essentially just added the fences
necessary for Ztso on RVWMO, but that's not a good proxy for Ztso on arm64
(and I guess not on x86, either). Also happy to go take a crack at that
one, but I'm not really a formal memory model person so it might not be the
best result.

>
> Dave
>
>>  #define TCG_TARGET_HAS_MEMORY_BSWAP have_movbe
>>
>> diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
>> index 23e2063667..f423c124a0 100644
>> --- a/tcg/s390x/tcg-target.h
>> +++ b/tcg/s390x/tcg-target.h
>> @@ -171,6 +171,7 @@ extern uint64_t s390_facilities[3];
>>  #define TCG_TARGET_HAS_MEMORY_BSWAP 1
>>
>>  #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
>> +#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1
>>
>>  static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
>>                                              uintptr_t jmp_rw, uintptr_t addr)
>> --
>> 2.34.1
>>
>>
* Palmer Dabbelt (palmer@rivosinc.com) wrote:
> On Thu, 29 Sep 2022 12:16:48 PDT (-0700), dgilbert@redhat.com wrote:
> > * Palmer Dabbelt (palmer@rivosinc.com) wrote:
> > > Ztso, the RISC-V extension that provides the TSO memory model, was
> > > recently frozen. This provides support for Ztso on targets that are
> > > themselves TSO.
> > >
> > > Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> > >
> > > ---
> > >
> >
> > > diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
> > > index 00fcbe297d..2a43d54fcd 100644
> > > --- a/tcg/i386/tcg-target.h
> > > +++ b/tcg/i386/tcg-target.h
> > > @@ -236,6 +236,7 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
> > >  #include "tcg/tcg-mo.h"
> > >
> > >  #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
> > > +#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1
> >
> > Is x86's brand of memory ordering strong enough for Ztso?
> > I thought x86 had an optimisation where it was allowed to store forward
> > within the current CPU causing stores not to be quite strictly ordered.
>
> I'm actually not sure: my understanding of the Intel memory model was that
> there's a bunch of subtle bits that don't match the various TSO
> formalizations, but the RISC-V folks are pretty adamant that Intel is
> exactly TSO. I've gotten yelled at enough times on this one that I kind of
> just stopped caring, but that's not a good reason to have broken code so I'm
> happy to go fix it.

Many people make that mistake, please refer them to the Intel docs; the big
'Intel 64 and IA-32 Architecture Software Developer's Manual, Combined Volumes:
1,2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4'; in the recent version I've got
(April 2022) section 8.2 covers memory ordering and
8.2.2 Memory Ordering in P6 and More Recent Processor Families
says on page 8-7 (page 3090 ish):

  In a multiple-processor system, the following ordering principles apply:
  ....
  Writes from an individual processor are NOT ordered with respect to the
  writes from other processors.
  ....
  Any two stores are seen in a consistent order by processors other than
  those performing the stores

then a bit further down, '8.2.3.5 Intra-Processor Forwarding Is Allowed'
has an example and says

  'The memory-ordering model allows concurrent stores by two processors to be seen in
  different orders by those two processors; specifically, each processor may perceive
  its own store occurring before that of the other.'

Having said that, I remember it's really difficult to trigger; it's ~10
years since I saw an example to trigger it, and can't remember it.

> That said, when putting together the v2 (which has TCG barriers in the
> RISC-V front-end) I couldn't even really figure out how the TCG memory model
> works in any formal capacity -- I essentially just added the fences
> necessary for Ztso on RVWMO, but that's not a good proxy for Ztso on arm64
> (and I guess not on x86, either). Also happy to go take a crack at that
> one, but I'm not really a formal memory model person so it might not be the
> best result.

Oh I don't know TCG's model, copying in Alex.

Dave

> >
> > Dave
> >
> > >  #define TCG_TARGET_HAS_MEMORY_BSWAP have_movbe
> > >
> > > diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
> > > index 23e2063667..f423c124a0 100644
> > > --- a/tcg/s390x/tcg-target.h
> > > +++ b/tcg/s390x/tcg-target.h
> > > @@ -171,6 +171,7 @@ extern uint64_t s390_facilities[3];
> > >  #define TCG_TARGET_HAS_MEMORY_BSWAP 1
> > >
> > >  #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
> > > +#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1
> > >
> > >  static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
> > >                                              uintptr_t jmp_rw, uintptr_t addr)
> > > --
> > > 2.34.1
> > >
> > >
>
> > > Is x86's brand of memory ordering strong enough for Ztso?
> > > I thought x86 had an optimisation where it was allowed to store forward
> > > within the current CPU causing stores not to be quite strictly ordered.

[...]

> then a bit further down, '8.2.3.5 Intra-Processor Forwarding Is Allowed'
> has an example and says
>
>   'The memory-ordering model allows concurrent stores by two processors to be seen in
>   different orders by those two processors; specifically, each processor may perceive
>   its own store occurring before that of the other.'
>
> Having said that, I remember it's really difficult to trigger; it's ~10
> years since I saw an example to trigger it, and can't remember it.

AFAICT, Ztso allows the forwarding in question too. Simulations with
the axiomatic formalization confirm this expectation:

RISCV intra-processor-forwarding
{
0:x5=1; 0:x6=x; 0:x8=y;
1:x5=1; 1:x6=y; 1:x8=x;
}
 P0          | P1          ;
 sw x5,0(x6) | sw x5,0(x6) ;
 lw x9,0(x6) | lw x9,0(x6) ;
 lw x7,0(x8) | lw x7,0(x8) ;
exists
(0:x7=0 /\ 1:x7=0 /\ 0:x9=1 /\ 1:x9=1)

Test intra-processor-forwarding Allowed
States 4
0:x7=0; 0:x9=1; 1:x7=0; 1:x9=1;
0:x7=0; 0:x9=1; 1:x7=1; 1:x9=1;
0:x7=1; 0:x9=1; 1:x7=0; 1:x9=1;
0:x7=1; 0:x9=1; 1:x7=1; 1:x9=1;
Ok
Witnesses
Positive: 1 Negative: 3
Condition exists (0:x7=0 /\ 1:x7=0 /\ 0:x9=1 /\ 1:x9=1)
Observation intra-processor-forwarding Sometimes 1 3
Time intra-processor-forwarding 0.00
Hash=518e4b9b2f0770c94918ac5d7e311ba5

  Andrea
* Andrea Parri (andrea@rivosinc.com) wrote:
> > > > Is x86's brand of memory ordering strong enough for Ztso?
> > > > I thought x86 had an optimisation where it was allowed to store forward
> > > > within the current CPU causing stores not to be quite strictly ordered.
>
> [...]
>
> > then a bit further down, '8.2.3.5 Intra-Processor Forwarding Is Allowed'
> > has an example and says
> >
> >   'The memory-ordering model allows concurrent stores by two processors to be seen in
> >   different orders by those two processors; specifically, each processor may perceive
> >   its own store occurring before that of the other.'
> >
> > Having said that, I remember it's really difficult to trigger; it's ~10
> > years since I saw an example to trigger it, and can't remember it.
>
> AFAICT, Ztso allows the forwarding in question too. Simulations with
> the axiomatic formalization confirm such expectation:

OK that seems to be what it says in:
https://five-embeddev.com/riscv-isa-manual/latest/ztso.html
  'In both of these memory models, it is the [Load Value Axiom] that allows
  a hart to forward a value from its store buffer to a subsequent (in program
  order) load -- that is to say that stores can be forwarded locally before
  they are visible to other harts'

> RISCV intra-processor-forwarding
> {
> 0:x5=1; 0:x6=x; 0:x8=y;
> 1:x5=1; 1:x6=y; 1:x8=x;
> }
>  P0          | P1          ;
>  sw x5,0(x6) | sw x5,0(x6) ;
>  lw x9,0(x6) | lw x9,0(x6) ;
>  lw x7,0(x8) | lw x7,0(x8) ;
> exists
> (0:x7=0 /\ 1:x7=0 /\ 0:x9=1 /\ 1:x9=1)

(I'm a bit fuzzy reading this...)
So is that the interesting case - where x7 is saying neither processor
saw the other processor's write yet, but they did see their own?

So from a qemu patch perspective, I think the important thing is that
the flag that's defined is named and commented in such a way that it's
obvious that local forwarding is allowed; we wouldn't want someone
emulating a stricter CPU (that doesn't allow local forwarding) to go and
use this flag as an indication that the host cpu is that strict.

Dave

> Test intra-processor-forwarding Allowed
> States 4
> 0:x7=0; 0:x9=1; 1:x7=0; 1:x9=1;
> 0:x7=0; 0:x9=1; 1:x7=1; 1:x9=1;
> 0:x7=1; 0:x9=1; 1:x7=0; 1:x9=1;
> 0:x7=1; 0:x9=1; 1:x7=1; 1:x9=1;
> Ok
> Witnesses
> Positive: 1 Negative: 3
> Condition exists (0:x7=0 /\ 1:x7=0 /\ 0:x9=1 /\ 1:x9=1)
> Observation intra-processor-forwarding Sometimes 1 3
> Time intra-processor-forwarding 0.00
> Hash=518e4b9b2f0770c94918ac5d7e311ba5
>
>   Andrea
>
> > AFAICT, Ztso allows the forwarding in question too. Simulations with
> > the axiomatic formalization confirm such expectation:
>
> OK that seems to be what it says in:
> https://five-embeddev.com/riscv-isa-manual/latest/ztso.html
>   'In both of these memory models, it is the [Load Value Axiom] that allows
>   a hart to forward a value from its store buffer to a subsequent (in program
>   order) load -- that is to say that stores can be forwarded locally before
>   they are visible to other harts'

Indeed, thanks for the remark.

> > RISCV intra-processor-forwarding
> > {
> > 0:x5=1; 0:x6=x; 0:x8=y;
> > 1:x5=1; 1:x6=y; 1:x8=x;
> > }
> >  P0          | P1          ;
> >  sw x5,0(x6) | sw x5,0(x6) ;
> >  lw x9,0(x6) | lw x9,0(x6) ;
> >  lw x7,0(x8) | lw x7,0(x8) ;
> > exists
> > (0:x7=0 /\ 1:x7=0 /\ 0:x9=1 /\ 1:x9=1)
>
> (I'm a bit fuzzy reading this...)
> So is that the interesting case - where x7 is saying neither processor
> saw the other processor's write yet, but they did see their own?

Right, it was inspired by the homonymous test in Intel's specs.

  Andrea
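For readers who find the litmus output hard to parse, the following sketch enumerates the same test by brute force. It is not the axiomatic model used for the simulation above, only a simplified operational picture that I am assuming for illustration: each hart has a one-entry store buffer, a load of the hart's own location may be satisfied from that buffer, and the buffer drains to memory at an arbitrary later point. The State type and the explore() and count_outcomes() names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* One snapshot of the two-hart machine running the litmus test. */
typedef struct {
    int mem[2];        /* shared memory: mem[0] is x, mem[1] is y */
    int pc[2];         /* next instruction (0..3) for each hart */
    bool buffered[2];  /* hart's store still sits in its store buffer */
    int r9[2];         /* value loaded from the hart's own location */
    int r7[2];         /* value loaded from the other hart's location */
} State;

/* seen[a][b][c][d]: outcome 0:x7=a, 1:x7=b, 0:x9=c, 1:x9=d is reachable. */
static bool seen[2][2][2][2];

/* Depth-first walk over every interleaving of instructions and drains. */
static void explore(State s)
{
    assert(s.pc[0] <= 3 && s.pc[1] <= 3);
    if (s.pc[0] == 3 && s.pc[1] == 3) {
        seen[s.r7[0]][s.r7[1]][s.r9[0]][s.r9[1]] = true;
        return;
    }
    for (int p = 0; p < 2; p++) {
        if (s.buffered[p]) {
            /* Drain the store buffer: the store becomes globally visible. */
            State n = s;
            n.mem[p] = 1;
            n.buffered[p] = false;
            explore(n);
        }
        if (s.pc[p] < 3) {
            State n = s;
            switch (s.pc[p]) {
            case 0:  /* sw: the store enters the local store buffer */
                n.buffered[p] = true;
                break;
            case 1:  /* lw own location: forward from the buffer if present */
                n.r9[p] = s.buffered[p] ? 1 : s.mem[p];
                break;
            default: /* lw other location: never in our buffer, read memory */
                n.r7[p] = s.mem[1 - p];
                break;
            }
            n.pc[p]++;
            explore(n);
        }
    }
}

/* Explore from the all-zero initial state and count distinct outcomes. */
static int count_outcomes(void)
{
    State init = {0};
    int count = 0;

    explore(init);
    for (int i = 0; i < 16; i++) {
        count += seen[(i >> 3) & 1][(i >> 2) & 1][(i >> 1) & 1][i & 1];
    }
    return count;
}
```

Under these assumptions count_outcomes() finds four outcomes, including the one in the "exists" clause where both harts read their own store (x9=1) but not yet the other hart's (x7=0), and x9 is 1 in every outcome. That matches the four states listed in the simulation output.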
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index ac6f82ebd0..d05b8c7c4a 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -919,6 +919,15 @@ static Property riscv_cpu_extensions[] = {
     DEFINE_PROP_BOOL("zhinx", RISCVCPU, cfg.ext_zhinx, false),
     DEFINE_PROP_BOOL("zhinxmin", RISCVCPU, cfg.ext_zhinxmin, false),
 
+#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
+    /*
+     * We only support Ztso on targets that themselves are already TSO, which
+     * means there's no way to provide just RVWMO on those targets. Instead
+     * just default to telling the guest that Ztso is enabled.
+     */
+    DEFINE_PROP_BOOL("ztso", RISCVCPU, cfg.ext_ztso, true),
+#endif
+
     /* Vendor-specific custom extensions */
     DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, false),
 
@@ -1094,6 +1103,9 @@ static void riscv_isa_string_ext(RISCVCPU *cpu, char **isa_str, int max_str_len)
         ISA_EDATA_ENTRY(zksed, ext_zksed),
         ISA_EDATA_ENTRY(zksh, ext_zksh),
         ISA_EDATA_ENTRY(zkt, ext_zkt),
+#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
+        ISA_EDATA_ENTRY(ztso, ext_ztso),
+#endif
         ISA_EDATA_ENTRY(zve32f, ext_zve32f),
         ISA_EDATA_ENTRY(zve64f, ext_zve64f),
         ISA_EDATA_ENTRY(zhinx, ext_zhinx),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 5c7acc055a..879e11a950 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -27,8 +27,19 @@
 #include "qom/object.h"
 #include "qemu/int128.h"
 #include "cpu_bits.h"
+#include "tcg-target.h"
 
-#define TCG_GUEST_DEFAULT_MO 0
+/*
+ * RISC-V has two memory models: TSO is a bit weaker than Intel (MMIO and
+ * fetch), and WMO is approximately equivalent to Arm MCA. Rather than
+ * enforcing orderings on most accesses, just default to the target memory
+ * order.
+ */
+#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
+# define TCG_GUEST_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
+#else
+# define TCG_GUEST_DEFAULT_MO (0)
+#endif
 
 /*
  * RISC-V-specific extra insn start words:
@@ -433,6 +444,9 @@ struct RISCVCPUConfig {
     bool ext_zve32f;
     bool ext_zve64f;
     bool ext_zmmul;
+#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
+    bool ext_ztso;
+#endif
     bool rvv_ta_all_1s;
 
     uint32_t mvendorid;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 63b04e8a94..00fd75b971 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -109,6 +109,9 @@ typedef struct DisasContext {
     /* PointerMasking extension */
     bool pm_mask_enabled;
     bool pm_base_enabled;
+#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
+    bool ztso;
+#endif
     /* TCG of the current insn_start */
     TCGOp *insn_start;
 } DisasContext;
@@ -1109,6 +1112,9 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     memset(ctx->ftemp, 0, sizeof(ctx->ftemp));
     ctx->pm_mask_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_MASK_ENABLED);
     ctx->pm_base_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_BASE_ENABLED);
+#ifdef TCG_TARGET_SUPPORTS_MCTCG_RVTSO
+    ctx->ztso = cpu->cfg.ext_ztso;
+#endif
 
     ctx->zero = tcg_constant_tl(0);
 }
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 00fcbe297d..2a43d54fcd 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -236,6 +236,7 @@ static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 #include "tcg/tcg-mo.h"
 
 #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
+#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP have_movbe
 
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 23e2063667..f423c124a0 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -171,6 +171,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
 #define TCG_TARGET_DEFAULT_MO (TCG_MO_ALL & ~TCG_MO_ST_LD)
+#define TCG_TARGET_SUPPORTS_MCTCG_RVTSO 1
 
 static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
                                             uintptr_t jmp_rw, uintptr_t addr)
Ztso, the RISC-V extension that provides the TSO memory model, was
recently frozen. This provides support for Ztso on targets that are
themselves TSO.

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

---

My first thought was to just add the TCG barriers to loads/stores and AMOs
as defined by Ztso, but after poking around a bit it seems that's
frowned upon by check_tcg_memory_orders_compatible(). I feel like the
indicated performance issues could probably be worked out, but this is
about the same amount of code and doesn't suffer from those performance
issues. That said, it just seems wrong to couple targets to a RISC-V
feature.

This is also essentially untested, aside from poking around in the
generated device tree to make sure "_ztso" shows up when enabled. I
don't think there's really any way to test it further, as we don't have
any TSO-enabled workloads and we were de facto providing TSO already on
x86 targets (which I'm assuming are what the vast majority of users are
running).
---
 target/riscv/cpu.c       | 12 ++++++++++++
 target/riscv/cpu.h       | 16 +++++++++++++++-
 target/riscv/translate.c |  6 ++++++
 tcg/i386/tcg-target.h    |  1 +
 tcg/s390x/tcg-target.h   |  1 +
 5 files changed, 35 insertions(+), 1 deletion(-)