[v6,00/11] Introduce In Field Scan driver

Message ID 20220506014035.1173578-1-tony.luck@intel.com (mailing list archive)

Luck, Tony May 6, 2022, 1:40 a.m. UTC
TL;DR this driver loads scan test files that can check whether silicon
in a CPU core is still running correctly. It is expected that these tests
would be run several times per day to catch problems as silicon ages.

Changes since v5:

Added various Reviewed tags. If anyone wants to take one or more
back in the light of changes listed below, please speak up.

Thomas Gleixner
---------------
03 "So checking for Intel Fam6 ANYMODEL and X86_FEATURE_CORE_CAPABILITIES is
 sufficient, no?"

Longer explanation in earlier e-mail ... but the family/model/stepping
check is needed. No change.

04 "Why is ifs_load_firmware() not returning an error to the caller?"

In most cases the return value isn't useful. But this did prompt a change
to make sure "echo 1 > reload" does give an error if the load fails.

05 "The above struct is nicely tabular. Can we have that here too please?"

Added <TAB>s to ifs_data structure to make it equally pretty.

06 "Setting the authenticated indicator _before_ actually doing the
authentication is just wrong. It does not matter in this case, but it's
still making my eyes bleed."

Moved indicator to after success has been checked.

06 "Why has this to be a smp function call? Just because it's convenient?
This is nothing urgent and no hotpath, so this really can use
queue_work_on()."

Even simpler is schedule_work_on(), since other changes mean that
the driver no longer allocates a work queue.


07 "Waiting for a second with preemption disabled? Seriously?"
   "Plus another half a second with preemption disabled. That's just insane."
   "That local_irq_disable() solves what?"
   "Why cpu_hotplug_disable()? Why is cpus_read_lock() not sufficient here?"
   "Why does this need GFP_NOWAIT?"
   "I put that into the wishful thinking realm"
   "The real question is why you try to rendezvous CPUs via work queues."
   "pseudo-code to use stomp_machine()"

PeterZ contributed "stop_core_cpuslocked()", a function neatly tailored
for this usage that works beautifully. See part 0003 of this new series.
As a result, all of the code that triggered the above comments is gone.

Tony Luck
---------
Noticed unnecessary casts from u8 to u32 in the checksum calculation
in load.c. Fixed.
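
For illustration only, the kind of cast this change removes can be
sketched in userspace C. The byte-wise sum and the names sketch_checksum()
and sketch_checksum_with_cast() are assumptions made for the example; the
actual checksum algorithm in load.c is not shown in this message. The
point is just that C's usual arithmetic conversions already widen a u8
operand when it is added to a u32 accumulator, so an explicit cast adds
nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the driver's function: byte-wise additive sum. */
static uint32_t sketch_checksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i < len; i++)
		sum += buf[i];	/* u8 is widened implicitly, no cast needed */

	return sum;
}

/* Same loop with the redundant cast, kept only for comparison. */
static uint32_t sketch_checksum_with_cast(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i < len; i++)
		sum += (uint32_t)buf[i];	/* cast changes nothing */

	return sum;
}
```

Both helpers return the same value for any input, which is why the cast
can be dropped without changing behavior.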

Changed the tracepoint to include the CPU number of the core being
tested (the tracepoint otherwise just tells you which CPU is executing
the driver code and calling "stop_core_cpuslocked()" to do the actual
work on the target CPUs).

Dropped the msec_to_tsc() function that was used to initialize
activate.delay. Just use a #define of 100000 cycles (two orders
of magnitude bigger than I saw for the slew between the two threads
executing the "stop_core_cpuslocked()" target function ... but not
too insane so if the threads do not sync, we give up quickly).
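
To put the 100000-cycle value in perspective, a quick conversion to
wall-clock time can be sketched as below. The 2 GHz TSC frequency used
in the note that follows and the names SKETCH_THREAD_WAIT_CYCLES and
sketch_cycles_to_ns() are assumptions for the example, not values or
identifiers taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Value from the text above; the macro name is hypothetical. */
#define SKETCH_THREAD_WAIT_CYCLES 100000ULL

/* Convert a cycle count to nanoseconds for a given TSC frequency in Hz. */
static uint64_t sketch_cycles_to_ns(uint64_t cycles, uint64_t tsc_hz)
{
	assert(tsc_hz != 0);

	/* Multiply first to keep precision; fine for small cycle counts. */
	return cycles * 1000000000ULL / tsc_hz;
}
```

With an assumed 2 GHz TSC,
sketch_cycles_to_ns(SKETCH_THREAD_WAIT_CYCLES, 2000000000ULL) gives
50000 ns, i.e. 50 microseconds.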



Jithu Joseph (7):
  x86/microcode/intel: Expose collect_cpu_info_early() for IFS
  platform/x86/intel/ifs: Read IFS firmware image
  platform/x86/intel/ifs: Check IFS Image sanity
  platform/x86/intel/ifs: Authenticate and copy to secured memory
  platform/x86/intel/ifs: Add scan test support
  platform/x86/intel/ifs: Add IFS sysfs interface
  platform/x86/intel/ifs: add ABI documentation for IFS

Peter Zijlstra (1):
  stop_machine: Add stop_core_cpuslocked() for per-core operations

Tony Luck (3):
  x86/msr-index: Define INTEGRITY_CAPABILITIES MSR
  platform/x86/intel/ifs: Add stub driver for In-Field Scan
  trace: platform/x86/intel/ifs: Add trace point to track Intel IFS
    operations

 .../ABI/testing/sysfs-platform-intel-ifs      |  39 +++
 MAINTAINERS                                   |   8 +
 arch/x86/include/asm/cpu.h                    |  18 ++
 arch/x86/include/asm/msr-index.h              |   7 +
 arch/x86/kernel/cpu/intel.c                   |  32 +++
 arch/x86/kernel/cpu/microcode/intel.c         |  59 +---
 drivers/platform/x86/intel/Kconfig            |   1 +
 drivers/platform/x86/intel/Makefile           |   1 +
 drivers/platform/x86/intel/ifs/Kconfig        |  13 +
 drivers/platform/x86/intel/ifs/Makefile       |   3 +
 drivers/platform/x86/intel/ifs/core.c         |  74 +++++
 drivers/platform/x86/intel/ifs/ifs.h          | 124 ++++++++
 drivers/platform/x86/intel/ifs/load.c         | 266 ++++++++++++++++++
 drivers/platform/x86/intel/ifs/runtest.c      | 255 +++++++++++++++++
 drivers/platform/x86/intel/ifs/sysfs.c        | 149 ++++++++++
 include/linux/stop_machine.h                  |  16 ++
 include/trace/events/intel_ifs.h              |  41 +++
 kernel/stop_machine.c                         |  19 ++
 18 files changed, 1073 insertions(+), 52 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-platform-intel-ifs
 create mode 100644 drivers/platform/x86/intel/ifs/Kconfig
 create mode 100644 drivers/platform/x86/intel/ifs/Makefile
 create mode 100644 drivers/platform/x86/intel/ifs/core.c
 create mode 100644 drivers/platform/x86/intel/ifs/ifs.h
 create mode 100644 drivers/platform/x86/intel/ifs/load.c
 create mode 100644 drivers/platform/x86/intel/ifs/runtest.c
 create mode 100644 drivers/platform/x86/intel/ifs/sysfs.c
 create mode 100644 include/trace/events/intel_ifs.h


base-commit: 672c0c5173427e6b3e2a9bbb7be51ceeec78093a