Message ID | 20221005143719.65241-1-victor.colombo@eldorado.org.br (mailing list archive) |
---|---|
Series | Idea for using hardfloat in PPC |
On 10/5/22 07:37, Víctor Colombo wrote:
> However, the impact in performance was not as expected. In x86_64 I
> had a small 3% improvement, while in a Power9 machine there was a small
> performance loss, as can be seen below (100 executions).
>
> |        | min [s] | max [s] | avg [s] |
> | before | 122.309 | 123.459 | 122.747 |
> | after  | 123.906 | 125.016 | 124.373 |

I hope this is because you didn't handle the most common cases: add, sub,
mul, div.

The logic seems plausible, as far as it goes, and would work for the FR bit
as well, which afair isn't handled at all at the moment. I'll review
properly in a little while.

r~
Richard Henderson <richard.henderson@linaro.org> writes:

> On 10/5/22 07:37, Víctor Colombo wrote:
>> However, the impact in performance was not as expected. In x86_64 I
>> had a small 3% improvement, while in a Power9 machine there was a small
>> performance loss, as can be seen below (100 executions).
>>
>> |        | min [s] | max [s] | avg [s] |
>> | before | 122.309 | 123.459 | 122.747 |
>> | after  | 123.906 | 125.016 | 124.373 |
>
> I hope this is because you didn't handle the most common cases: add, sub,
> mul, div.
>
> The logic seems plausible, as far as it goes, and would work for the
> FR bit as well, which afair isn't handled at all at the moment. I'll
> review properly in a little while.

I wonder if this is something that could be generalised and pushed up
into the fpu stuff itself. We could after all cache the op and
decomposed parameters here in a generic way. The trick would be working
out how to do that without slowing down the current common case.

Is ppc unique in not persisting the inexact flag from previous
operations?

> r~
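To make the "cache the op and parameters" idea concrete, here is a rough sketch of lazy flag evaluation. It is not the code from the series and not QEMU's fpu API: `LazyFpState`, `lazy_mul_f32`, `lazy_read_fi` and `lazy_read_xx` are hypothetical names, only single-precision multiply is modelled, and the "slow path" is a simple exactness check through `double` rather than a softfloat call. It assumes finite, non-NaN inputs.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of this is QEMU's actual API. */
typedef enum { OP_NONE, OP_MUL } CachedOp;

typedef struct {
    CachedOp op;        /* last operation executed, cached for later */
    float a, b;         /* its operands, cached for later */
    bool xx;            /* sticky inexact flag */
    unsigned slow_runs; /* how often flags were recomputed (illustration only) */
} LazyFpState;

/*
 * Slow path: decide whether the cached operation was inexact.
 * The product of two 24-bit significands fits in a double's 53 bits,
 * so the double product is exact; if narrowing it to float changes
 * the value, the float result had to be rounded.
 */
static bool pending_inexact(LazyFpState *s)
{
    if (s->op != OP_MUL) {
        return false;
    }
    s->slow_runs++;
    double exact = (double)s->a * (double)s->b;
    return (double)(float)exact != exact;
}

/* Fast path: compute on the host FPU and only remember what was done. */
float lazy_mul_f32(LazyFpState *s, float a, float b)
{
    /*
     * The cached op is about to be overwritten. If the sticky flag is
     * still clear, the previous op must be folded in first or its
     * inexact status would be lost. Once xx is set, this is skipped.
     */
    if (!s->xx) {
        s->xx = pending_inexact(s);
    }
    s->op = OP_MUL;
    s->a = a;
    s->b = b;
    return a * b;
}

/* Per-operation inexact: derived on demand from the cached operation. */
bool lazy_read_fi(LazyFpState *s)
{
    bool fi = pending_inexact(s);
    s->xx |= fi;
    return fi;
}

/* Sticky inexact: resolve the pending operation only if still clear. */
bool lazy_read_xx(LazyFpState *s)
{
    if (!s->xx) {
        s->xx = pending_inexact(s);
    }
    return s->xx;
}
```

In this sketch every operation still pays for the slow path while the sticky flag is clear, which is one form of the "slowing down the common case" concern; the saving only appears once the sticky flag is set and the per-operation flag is read rarely.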
On 10/7/22 06:42, Alex Bennée wrote:
> Is ppc unique in not persisting the inexact flag from previous
> operations?

Better phrased as "having additional per-operation flags for inexact and
'rounded'", because ppc also has the standard IEEE sticky inexact flag.
But yes, as far as I know ppc is unique in this.

r~
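The difference between the two kinds of inexact flag can be shown with a small model. This is only a sketch: `GuestFpStatus` and `guest_mul_f32` are hypothetical names, the fields `fi` and `xx` stand in for the per-operation and sticky inexact bits, the 'rounded' (FR) bit is left out, and inexactness is detected with a plain exactness check through `double` that assumes finite, non-NaN inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guest status; field names are illustrative, not QEMU's. */
typedef struct {
    bool fi; /* per-operation inexact: rewritten by every operation */
    bool xx; /* sticky inexact: once set, stays set until cleared explicitly */
} GuestFpStatus;

/*
 * Multiply two floats and report whether the result had to be rounded.
 * The product of two 24-bit significands fits in a double's 53 bits, so
 * the double product is exact; any change after narrowing to float means
 * the float result is inexact.
 */
static float mul_check_inexact(float a, float b, bool *inexact)
{
    double exact = (double)a * (double)b;
    float rounded = (float)exact;
    *inexact = ((double)rounded != exact);
    return rounded;
}

float guest_mul_f32(GuestFpStatus *st, float a, float b)
{
    bool inexact;

    assert(st != NULL);
    float r = mul_check_inexact(a, b, &inexact);
    st->fi = inexact;   /* overwritten: reflects this operation only */
    st->xx |= inexact;  /* accumulated: remembers any earlier inexact result */
    return r;
}
```

After an inexact multiply such as `0.1f * 0.1f` followed by an exact one such as `2.0f * 4.0f`, `fi` is clear again while `xx` remains set. A target with only the sticky flag can skip flag work once that flag is set; a target that also exposes `fi` needs an answer for every operation.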