[v2,1/2] RISC-V: hwprobe: Add MISALIGNED_PERF key

Message ID 20240625165121.2160354-2-evan@rivosinc.com (mailing list archive)
State Superseded
Series RISC-V: hwprobe: Misaligned scalar perf fix and rename

Commit Message

Evan Green June 25, 2024, 4:51 p.m. UTC
RISCV_HWPROBE_KEY_CPUPERF_0 was mistakenly flagged as a bitmask in
hwprobe_key_is_bitmask(), when in reality it was an enum value. This
causes problems when used in conjunction with RISCV_HWPROBE_WHICH_CPUS,
since SLOW, FAST, and EMULATED have values whose bits overlap with
each other. If the caller asked for the set of CPUs that was SLOW or
EMULATED, the returned set would also include CPUs that were FAST.

Introduce a new hwprobe key, RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF,
which returns the same values in response to a direct query (with no
flags), but is properly handled as an enumerated value. As a result,
SLOW, FAST, and EMULATED are all correctly treated as distinct values
under the new key when queried with the WHICH_CPUS flag.
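
The overlap can be reproduced with a small standalone sketch. The value
constants are the ones from the uapi header in this patch; the two helper
functions are hypothetical and only assume that a bitmask key matches when
every queried bit is set in the CPU's value, while an enum key matches on
equality. This is an illustration, not the kernel's comparison code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in arch/riscv/include/uapi/asm/hwprobe.h. */
#define RISCV_HWPROBE_MISALIGNED_UNKNOWN	0
#define RISCV_HWPROBE_MISALIGNED_EMULATED	1
#define RISCV_HWPROBE_MISALIGNED_SLOW		2
#define RISCV_HWPROBE_MISALIGNED_FAST		3
#define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED	4

/* Hypothetical helper: bitmask semantics, all queried bits must be set. */
static inline bool matches_as_bitmask(uint64_t cpu_value, uint64_t wanted)
{
	return (cpu_value & wanted) == wanted;
}

/* Hypothetical helper: enum semantics, the values must be identical. */
static inline bool matches_as_enum(uint64_t cpu_value, uint64_t wanted)
{
	return cpu_value == wanted;
}
```

Since FAST is 3 (binary 11), it contains the bits of both EMULATED (01)
and SLOW (10), so a FAST CPU satisfies a bitmask-style query for either
one. Under equality matching it satisfies neither.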

Leave the old key in place to avoid disturbing applications which may
have already come to rely on the key, with or without its broken
behavior with respect to the WHICH_CPUS flag.

Fixes: e178bf146e4b ("RISC-V: hwprobe: Introduce which-cpus flag")
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

---

Changes in v2:
 - Clarified that the slow/fast distinction refers to misaligned word
   accesses. Previously the text just said misaligned accesses, leaving
   it ambiguous as to which type of access was measured.
 - Removed shifts in values (Andrew)
 - Renamed key to RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF (Palmer)

 Documentation/arch/riscv/hwprobe.rst  | 17 +++++++++++------
 arch/riscv/include/asm/hwprobe.h      |  2 +-
 arch/riscv/include/uapi/asm/hwprobe.h | 13 +++++++------
 arch/riscv/kernel/sys_hwprobe.c       |  1 +
 4 files changed, 20 insertions(+), 13 deletions(-)

Comments

Conor Dooley June 26, 2024, 2:36 p.m. UTC | #1
On Tue, Jun 25, 2024 at 09:51:20AM -0700, Evan Green wrote:
> Changes in v2:
>  - Clarified that the slow/fast distinction refers to misaligned word
>    accesses. Previously the text just said misaligned accesses, leaving
>    it ambiguous as to which type of access was measured.

I think if we are gonna be specific, we should be exactly specific as to
what we have tested and say 32-bit if that's what we're probing/testing
with. That'd be consistent with Jesse's proposed wording for vector.

Evan Green June 26, 2024, 3:55 p.m. UTC | #2
On Wed, Jun 26, 2024 at 7:36 AM Conor Dooley <conor@kernel.org> wrote:
>
> On Tue, Jun 25, 2024 at 09:51:20AM -0700, Evan Green wrote:
> > Changes in v2:
> >  - Clarified that the slow/fast distinction refers to misaligned word
> >    accesses. Previously the text just said misaligned accesses, leaving
> >    it ambiguous as to which type of access was measured.
>
> I think if we are gonna be specific, we should be exactly specific as to
> what we have tested and say 32-bit if that's what we're probing/testing
> with. That'd be consistent with Jesse's proposed wording for vector.

Sure. In this case it's really native word sized accesses. So something like:

* :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`: An enum value describing
the performance of misaligned scalar native word accesses on the selected set
of processors.

...

* :c:macro:`RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW`: Misaligned native word
sized accesses are slower than the equivalent quantity of byte accesses.
Misaligned accesses may be supported directly in hardware, or trapped and
emulated by software.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_SCALAR_FAST`: Misaligned native word
sized accesses are faster than the equivalent quantity of byte accesses.

I'm planning to leave the qualifiers off of UNKNOWN, EMULATED, and
UNSUPPORTED, as those likely apply to misaligned accesses of any size.
Let me know if you think we should tweak it further.
-Evan
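
For reference, the five enum values under discussion can be mapped to
short labels as in the sketch below. It uses the value names from this
version of the patch (without the SCALAR qualifier proposed in the reply
above); the function name and the label strings are made up for
illustration.

```c
#include <stdint.h>

/* Values as defined in arch/riscv/include/uapi/asm/hwprobe.h. */
#define RISCV_HWPROBE_MISALIGNED_UNKNOWN	0
#define RISCV_HWPROBE_MISALIGNED_EMULATED	1
#define RISCV_HWPROBE_MISALIGNED_SLOW		2
#define RISCV_HWPROBE_MISALIGNED_FAST		3
#define RISCV_HWPROBE_MISALIGNED_UNSUPPORTED	4

/* Hypothetical helper: label an enum value returned for the key. */
static inline const char *misaligned_perf_label(uint64_t value)
{
	switch (value) {
	case RISCV_HWPROBE_MISALIGNED_UNKNOWN:
		return "unknown";
	case RISCV_HWPROBE_MISALIGNED_EMULATED:
		return "emulated";	/* trapped and emulated in software */
	case RISCV_HWPROBE_MISALIGNED_SLOW:
		return "slow";		/* slower than equivalent byte accesses */
	case RISCV_HWPROBE_MISALIGNED_FAST:
		return "fast";		/* faster than equivalent byte accesses */
	case RISCV_HWPROBE_MISALIGNED_UNSUPPORTED:
		return "unsupported";	/* generates a misaligned address fault */
	default:
		return "unrecognized";	/* value not defined in this patch */
	}
}
```

Because the value is an enum, the switch compares it as a whole rather
than testing individual bits.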

Patch

diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst
index fc015b452ebf..c9f570b1ab60 100644
--- a/Documentation/arch/riscv/hwprobe.rst
+++ b/Documentation/arch/riscv/hwprobe.rst
@@ -207,8 +207,13 @@  The following keys are defined:
   * :c:macro:`RISCV_HWPROBE_EXT_ZVE64D`: The Vector sub-extension Zve64d is
     supported, as defined by version 1.0 of the RISC-V Vector extension manual.
 
-* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: A bitmask that contains performance
-  information about the selected set of processors.
+* :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated.  Returns similar values to
+  :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`, but the key was
+  mistakenly classified as a bitmask rather than a value.
+
+* :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`: An enum value describing
+  the performance of misaligned scalar word accesses on the selected set of
+  processors.
 
   * :c:macro:`RISCV_HWPROBE_MISALIGNED_UNKNOWN`: The performance of misaligned
     accesses is unknown.
@@ -217,12 +222,12 @@  The following keys are defined:
     emulated via software, either in or below the kernel.  These accesses are
     always extremely slow.
 
-  * :c:macro:`RISCV_HWPROBE_MISALIGNED_SLOW`: Misaligned accesses are slower
-    than equivalent byte accesses.  Misaligned accesses may be supported
+  * :c:macro:`RISCV_HWPROBE_MISALIGNED_SLOW`: Misaligned word accesses are
+    slower than equivalent byte accesses.  Misaligned accesses may be supported
     directly in hardware, or trapped and emulated by software.
 
-  * :c:macro:`RISCV_HWPROBE_MISALIGNED_FAST`: Misaligned accesses are faster
-    than equivalent byte accesses.
+  * :c:macro:`RISCV_HWPROBE_MISALIGNED_FAST`: Misaligned word accesses are
+    faster than equivalent byte accesses.
 
   * :c:macro:`RISCV_HWPROBE_MISALIGNED_UNSUPPORTED`: Misaligned accesses are
     not supported at all and will generate a misaligned address fault.
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 630507dff5ea..150a9877b0af 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -8,7 +8,7 @@ 
 
 #include <uapi/asm/hwprobe.h>
 
-#define RISCV_HWPROBE_MAX_KEY 6
+#define RISCV_HWPROBE_MAX_KEY 7
 
 static inline bool riscv_hwprobe_key_is_valid(__s64 key)
 {
diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
index 7b95fadbea2a..22073533cea8 100644
--- a/arch/riscv/include/uapi/asm/hwprobe.h
+++ b/arch/riscv/include/uapi/asm/hwprobe.h
@@ -66,13 +66,14 @@  struct riscv_hwprobe {
 #define		RISCV_HWPROBE_EXT_ZVE64F	(1ULL << 40)
 #define		RISCV_HWPROBE_EXT_ZVE64D	(1ULL << 41)
 #define RISCV_HWPROBE_KEY_CPUPERF_0	5
-#define		RISCV_HWPROBE_MISALIGNED_UNKNOWN	(0 << 0)
-#define		RISCV_HWPROBE_MISALIGNED_EMULATED	(1 << 0)
-#define		RISCV_HWPROBE_MISALIGNED_SLOW		(2 << 0)
-#define		RISCV_HWPROBE_MISALIGNED_FAST		(3 << 0)
-#define		RISCV_HWPROBE_MISALIGNED_UNSUPPORTED	(4 << 0)
-#define		RISCV_HWPROBE_MISALIGNED_MASK		(7 << 0)
+#define		RISCV_HWPROBE_MISALIGNED_UNKNOWN	0
+#define		RISCV_HWPROBE_MISALIGNED_EMULATED	1
+#define		RISCV_HWPROBE_MISALIGNED_SLOW		2
+#define		RISCV_HWPROBE_MISALIGNED_FAST		3
+#define		RISCV_HWPROBE_MISALIGNED_UNSUPPORTED	4
+#define		RISCV_HWPROBE_MISALIGNED_MASK		7
 #define RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE	6
+#define RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF	7
 /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */
 
 /* Flags */
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 83fcc939df67..991ceba67717 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -217,6 +217,7 @@  static void hwprobe_one_pair(struct riscv_hwprobe *pair,
 		break;
 
 	case RISCV_HWPROBE_KEY_CPUPERF_0:
+	case RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF:
 		pair->value = hwprobe_misaligned(cpus);
 		break;
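
A userspace caller might query the new key and fall back to the
deprecated one on older kernels, roughly as sketched below. This is a
minimal sketch under stated assumptions: the struct and macro names are
local stand-ins for the uapi definitions, both helper functions are
hypothetical, and the fallback relies on the documented hwprobe behavior
that an unrecognized key comes back with its key field set to -1. On
non-RISC-V builds the query function simply reports ENOSYS.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Local stand-in for struct riscv_hwprobe (__s64 key, __u64 value). */
struct hwprobe_pair {
	int64_t key;
	uint64_t value;
};

/* Key numbers and mask as defined in the uapi header in this patch. */
#define KEY_CPUPERF_0			5
#define KEY_MISALIGNED_SCALAR_PERF	7
#define MISALIGNED_MASK			7

/*
 * Hypothetical helper: pairs[0] was queried with the new key, pairs[1]
 * with the deprecated one. Prefer the new key when the kernel knew it.
 */
static inline uint64_t pick_misaligned_value(const struct hwprobe_pair pairs[2])
{
	if (pairs[0].key == KEY_MISALIGNED_SCALAR_PERF)
		return pairs[0].value;
	if (pairs[1].key == KEY_CPUPERF_0)
		return pairs[1].value & MISALIGNED_MASK;
	return 0;	/* RISCV_HWPROBE_MISALIGNED_UNKNOWN */
}

/* Hypothetical helper: returns 0 and fills *out, or -1 with errno set. */
static inline int query_misaligned_scalar_perf(uint64_t *out)
{
#if defined(__riscv) && defined(__NR_riscv_hwprobe)
	struct hwprobe_pair pairs[2] = {
		{ .key = KEY_MISALIGNED_SCALAR_PERF },
		{ .key = KEY_CPUPERF_0 },
	};

	/* cpusetsize 0 with a NULL cpu set selects all online CPUs. */
	if (syscall(__NR_riscv_hwprobe, pairs, 2UL, 0UL, NULL, 0UL) != 0)
		return -1;
	*out = pick_misaligned_value(pairs);
	return 0;
#else
	(void)out;
	errno = ENOSYS;
	return -1;
#endif
}
```

Keeping the selection logic in a separate helper makes the fallback
testable without a RISC-V system.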