Message ID | 20221107225323.2733518-13-jithu.joseph@intel.com (mailing list archive) |
---|---|
State | Deferred, archived |
Series | IFS multi test image support and misc changes |
On 11/7/2022 2:53 PM, Jithu Joseph wrote:
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
> ---
>  drivers/platform/x86/intel/ifs/ifs.h     | 23 ++++++++++----
>  drivers/platform/x86/intel/ifs/core.c    |  1 +
>  drivers/platform/x86/intel/ifs/load.c    | 18 +++++++----
>  drivers/platform/x86/intel/ifs/runtest.c | 10 ++++---
>  drivers/platform/x86/intel/ifs/sysfs.c   | 38 ++++++++++++++++++++++++
>  5 files changed, 74 insertions(+), 16 deletions(-)
>

Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>

A nit below.

> +static ssize_t current_batch_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	struct ifs_data *ifsd = ifs_get_data(dev);
> +	int cur_batch;
> +	int rc;
> +
> +	rc = kstrtouint(buf, 0, &cur_batch);

Should this be kstrtoint() since cur_batch is defined as int? Or maybe it might be better to define cur_batch as unsigned int to avoid the negative input?

> +	if (rc < 0 || cur_batch > 0xff)
> +		return -EINVAL;
> +
> +	if (down_interruptible(&ifs_sem))
> +		return -EINTR;
> +
> +	ifsd->cur_batch = cur_batch;
> +

Sohil
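The nit above is about a signed/unsigned mismatch: kstrtouint() stores into an unsigned int, while cur_batch is declared int. As a rough userspace illustration of the "make it unsigned" option, the sketch below parses a batch number into an unsigned value and applies the same 0xff range check. It is not driver code: parse_batch() is a hypothetical helper, and it uses strtoul() from the C library in place of the kernel's kstrtouint().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace helper (not part of the patch): parse a batch
 * number into an unsigned value and reject anything above 0xff.
 * Base 0 accepts decimal, 0x-prefixed hex and 0-prefixed octal.
 */
static int parse_batch(const char *buf, unsigned int *batch)
{
	char *end;
	unsigned long val;

	assert(buf != NULL && batch != NULL);

	/* strtoul() accepts a minus sign and negates, so reject it here. */
	while (*buf == ' ' || *buf == '\t')
		buf++;
	if (*buf == '-')
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 0);
	if (errno != 0 || end == buf)
		return -EINVAL;

	/* Allow a single trailing newline, as written by "echo". */
	if (*end == '\n')
		end++;
	if (*end != '\0' || val > 0xff)
		return -EINVAL;

	*batch = (unsigned int)val;
	return 0;
}
```

With an unsigned destination, negative input never reaches the range check as a negative number, which is the point of the second suggestion in the review.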
Hi, On 11/7/22 23:53, Jithu Joseph wrote: > Initial implementation assumed a single IFS test image file with a > fixed name ff-mm-ss.scan. (where ff, mm, ss refers to family, model > and stepping of the core) > > Subsequently, it became evident that supporting more than one > test image file is needed to provide more comprehensive > test coverage. (Test coverage in this scenario refers to testing > more transistors in the core to identify faults) > > The other alternative of increasing the size of a single scan test image > file would not work as the upper bound is limited by the size of memory > area reserved by BIOS for loading IFS test image. > > Introduce "current_batch" file which accepts a number. Writing a > number to the current_batch file would load the test image file by name > ff-mm-ss-<xy>.scan, where <xy> is the number written to the > "current_batch" file in hex. Range check of the input is done to verify > it not greater than 0xff. > > For e.g if the scan test image comprises of 6 files, they would be named > as show below: > 06-8f-06-01.scan > 06-8f-06-02.scan > 06-8f-06-03.scan > 06-8f-06-04.scan > 06-8f-06-05.scan > 06-8f-06-06.scan > > And writing 3 to current_batch would result in loading 06-8f-06-03.scan > in the above e.g. The file can also be read to know the currently loaded > file. > > And testing a system looks like: > for each scan file > do > load the IFS test image file (write to the batch file) > for each core > do > test the core with this set of tests > done > done > > Qualify few error messages with the test image file suffix to > provide better context. 
> > Reviewed-by: Tony Luck <tony.luck@intel.com> > Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Thanks, patch looks good to me: Reviewed-by: Hans de Goede <hdegoede@redhat.com> Regards, Hans > --- > drivers/platform/x86/intel/ifs/ifs.h | 23 ++++++++++---- > drivers/platform/x86/intel/ifs/core.c | 1 + > drivers/platform/x86/intel/ifs/load.c | 18 +++++++---- > drivers/platform/x86/intel/ifs/runtest.c | 10 ++++--- > drivers/platform/x86/intel/ifs/sysfs.c | 38 ++++++++++++++++++++++++ > 5 files changed, 74 insertions(+), 16 deletions(-) > > diff --git a/drivers/platform/x86/intel/ifs/ifs.h b/drivers/platform/x86/intel/ifs/ifs.h > index 98ca91bdd5ca..390e508faf57 100644 > --- a/drivers/platform/x86/intel/ifs/ifs.h > +++ b/drivers/platform/x86/intel/ifs/ifs.h > @@ -33,13 +33,23 @@ > * The driver loads the tests into memory reserved BIOS local to each CPU > * socket in a two step process using writes to MSRs to first load the > * SHA hashes for the test. Then the tests themselves. Status MSRs provide > - * feedback on the success/failure of these steps. When a new test file > - * is installed it can be loaded by writing to the driver reload file:: > + * feedback on the success/failure of these steps. > * > - * # echo 1 > /sys/devices/virtual/misc/intel_ifs_0/reload > + * The test files are kept in a fixed location: /lib/firmware/intel/ifs_0/ > + * For e.g if there are 3 test files, they would be named in the following > + * fashion: > + * ff-mm-ss-01.scan > + * ff-mm-ss-02.scan > + * ff-mm-ss-03.scan > + * (where ff refers to family, mm indicates model and ss indicates stepping) > * > - * Similar to microcode, the current version of the scan tests is stored > - * in a fixed location: /lib/firmware/intel/ifs.0/family-model-stepping.scan > + * A different testfile can be loaded by writing the numerical portion > + * (e.g 1, 2 or 3 in the above scenario) into the curent_batch file. 
> + * To load ff-mm-ss-02.scan, the following command can be used:: > + * > + * # echo 2 > /sys/devices/virtual/misc/intel_ifs_0/current_batch > + * > + * The above file can also be read to know the currently loaded image. > * > * Running tests > * ------------- > @@ -207,6 +217,7 @@ struct ifs_data { > int status; > u64 scan_details; > int cur_batch; > + int test_num; > }; > > struct ifs_work { > @@ -227,7 +238,7 @@ static inline struct ifs_data *ifs_get_data(struct device *dev) > return &d->data; > } > > -void ifs_load_firmware(struct device *dev); > +int ifs_load_firmware(struct device *dev); > int do_core_test(int cpu, struct device *dev); > const struct attribute_group **ifs_get_groups(void); > > diff --git a/drivers/platform/x86/intel/ifs/core.c b/drivers/platform/x86/intel/ifs/core.c > index 5fb7f655c291..1f040837e8eb 100644 > --- a/drivers/platform/x86/intel/ifs/core.c > +++ b/drivers/platform/x86/intel/ifs/core.c > @@ -22,6 +22,7 @@ MODULE_DEVICE_TABLE(x86cpu, ifs_cpu_ids); > static struct ifs_device ifs_device = { > .data = { > .integrity_cap_bit = MSR_INTEGRITY_CAPS_PERIODIC_BIST_BIT, > + .test_num = 0, > }, > .misc = { > .name = "intel_ifs_0", > diff --git a/drivers/platform/x86/intel/ifs/load.c b/drivers/platform/x86/intel/ifs/load.c > index f361fd42a320..c6414958a691 100644 > --- a/drivers/platform/x86/intel/ifs/load.c > +++ b/drivers/platform/x86/intel/ifs/load.c > @@ -256,17 +256,18 @@ static int ifs_image_sanity_check(struct device *dev, const struct microcode_hea > > /* > * Load ifs image. Before loading ifs module, the ifs image must be located > - * in /lib/firmware/intel/ifs and named as {family/model/stepping}.{testname}. > + * in /lib/firmware/intel/ifs_x/ and named as family-model-stepping-02x.{testname}. 
> */ > -void ifs_load_firmware(struct device *dev) > +int ifs_load_firmware(struct device *dev) > { > struct ifs_data *ifsd = ifs_get_data(dev); > const struct firmware *fw; > - char scan_path[32]; > - int ret; > + char scan_path[64]; > + int ret = -EINVAL; > > - snprintf(scan_path, sizeof(scan_path), "intel/ifs/%02x-%02x-%02x.scan", > - boot_cpu_data.x86, boot_cpu_data.x86_model, boot_cpu_data.x86_stepping); > + snprintf(scan_path, sizeof(scan_path), "intel/ifs_%d/%02x-%02x-%02x-%02x.scan", > + ifsd->test_num, boot_cpu_data.x86, boot_cpu_data.x86_model, > + boot_cpu_data.x86_stepping, ifsd->cur_batch); > > ret = request_firmware_direct(&fw, scan_path, dev); > if (ret) { > @@ -282,8 +283,13 @@ void ifs_load_firmware(struct device *dev) > ifs_hash_ptr = (u64)(ifs_header_ptr + 1); > > ret = scan_chunks_sanity_check(dev); > + if (ret) > + dev_err(dev, "Load failure for batch: %02x\n", ifsd->cur_batch); > + > release: > release_firmware(fw); > done: > ifsd->loaded = (ret == 0); > + > + return ret; > } > diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c > index b2ca2bb4501f..0bfd8fcdd7e8 100644 > --- a/drivers/platform/x86/intel/ifs/runtest.c > +++ b/drivers/platform/x86/intel/ifs/runtest.c > @@ -78,14 +78,16 @@ static void message_not_tested(struct device *dev, int cpu, union ifs_status sta > > static void message_fail(struct device *dev, int cpu, union ifs_status status) > { > + struct ifs_data *ifsd = ifs_get_data(dev); > + > /* > * control_error is set when the microcode runs into a problem > * loading the image from the reserved BIOS memory, or it has > * been corrupted. Reloading the image may fix this issue. > */ > if (status.control_error) { > - dev_err(dev, "CPU(s) %*pbl: could not execute from loaded scan image\n", > - cpumask_pr_args(cpu_smt_mask(cpu))); > + dev_err(dev, "CPU(s) %*pbl: could not execute from loaded scan image. 
Batch: %02x version: 0x%x\n", > + cpumask_pr_args(cpu_smt_mask(cpu)), ifsd->cur_batch, ifsd->loaded_version); > } > > /* > @@ -96,8 +98,8 @@ static void message_fail(struct device *dev, int cpu, union ifs_status status) > * the core being tested. > */ > if (status.signature_error) { > - dev_err(dev, "CPU(s) %*pbl: test signature incorrect.\n", > - cpumask_pr_args(cpu_smt_mask(cpu))); > + dev_err(dev, "CPU(s) %*pbl: test signature incorrect. Batch: %02x version: 0x%x\n", > + cpumask_pr_args(cpu_smt_mask(cpu)), ifsd->cur_batch, ifsd->loaded_version); > } > } > > diff --git a/drivers/platform/x86/intel/ifs/sysfs.c b/drivers/platform/x86/intel/ifs/sysfs.c > index e077910c5d28..d2eeeb04d760 100644 > --- a/drivers/platform/x86/intel/ifs/sysfs.c > +++ b/drivers/platform/x86/intel/ifs/sysfs.c > @@ -87,6 +87,43 @@ static ssize_t run_test_store(struct device *dev, > > static DEVICE_ATTR_WO(run_test); > > +static ssize_t current_batch_store(struct device *dev, > + struct device_attribute *attr, > + const char *buf, size_t count) > +{ > + struct ifs_data *ifsd = ifs_get_data(dev); > + int cur_batch; > + int rc; > + > + rc = kstrtouint(buf, 0, &cur_batch); > + if (rc < 0 || cur_batch > 0xff) > + return -EINVAL; > + > + if (down_interruptible(&ifs_sem)) > + return -EINTR; > + > + ifsd->cur_batch = cur_batch; > + > + rc = ifs_load_firmware(dev); > + > + up(&ifs_sem); > + > + return (rc == 0) ? count : rc; > +} > + > +static ssize_t current_batch_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + struct ifs_data *ifsd = ifs_get_data(dev); > + > + if (!ifsd->loaded) > + return sysfs_emit(buf, "none\n"); > + else > + return sysfs_emit(buf, "0x%02x\n", ifsd->cur_batch); > +} > + > +static DEVICE_ATTR_RW(current_batch); > + > /* > * Display currently loaded IFS image version. 
> */ > @@ -108,6 +145,7 @@ static struct attribute *plat_ifs_attrs[] = { > &dev_attr_details.attr, > &dev_attr_status.attr, > &dev_attr_run_test.attr, > + &dev_attr_current_batch.attr, > &dev_attr_image_version.attr, > NULL > };
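To make the naming scheme in the patch concrete, here is a small userspace sketch that composes the firmware path with the same format string as the snprintf() call in load.c. The function name ifs_scan_path() and its parameters are illustrative assumptions; in the driver the values come from boot_cpu_data and struct ifs_data.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Userspace mirror of the snprintf() call in the patch. The function
 * and parameter names are illustrative only.
 * Returns 0 on success, -1 if the path did not fit in the buffer.
 */
static int ifs_scan_path(char *out, size_t len, int test_num,
			 unsigned int family, unsigned int model,
			 unsigned int stepping, unsigned int batch)
{
	int n;

	assert(out != NULL && len > 0);

	n = snprintf(out, len, "intel/ifs_%d/%02x-%02x-%02x-%02x.scan",
		     test_num, family, model, stepping, batch);

	/* snprintf() returns the untruncated length; report truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For family 0x06, model 0x8f, stepping 0x06 and batch 3, this yields intel/ifs_0/06-8f-06-03.scan, matching the example in the commit message.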
On Mon, Nov 07, 2022 at 02:53:21PM -0800, Jithu Joseph wrote: > Initial implementation assumed a single IFS test image file with a > fixed name ff-mm-ss.scan. (where ff, mm, ss refers to family, model > and stepping of the core) > > Subsequently, it became evident that supporting more than one > test image file is needed to provide more comprehensive > test coverage. (Test coverage in this scenario refers to testing > more transistors in the core to identify faults) > > The other alternative of increasing the size of a single scan test image > file would not work as the upper bound is limited by the size of memory > area reserved by BIOS for loading IFS test image. > > Introduce "current_batch" file which accepts a number. Writing a > number to the current_batch file would load the test image file by name > ff-mm-ss-<xy>.scan, where <xy> is the number written to the > "current_batch" file in hex. Range check of the input is done to verify > it not greater than 0xff. Dunno - sounds silly to me. Means one needs to go and look up which files are there and echo those batch numbers into sysfs and so on. What I would do is make it real trivial for the user so that latter can simply do: for f in $(ls /lib/firmware/intel/ifs_0/*.scan); do echo $f > /sys/devices/virtual/misc/intel_ifs_0/test_file done and simply supply the full filename. There will be no requirement on the naming - only on the filename length and it should be in that directory /lib/firmware/intel/ifs_0/ (btw, what's that appended "_0" supposed to mean there?) So the kernel would simply open it, sanity-check it, if it passes, it would run it - otherwise it would ignore it. A usability win-win.
On Saturday, 12 November 2022 08:26:28 PST Borislav Petkov wrote:
> > Introduce "current_batch" file which accepts a number. Writing a
> > number to the current_batch file would load the test image file by name
> > ff-mm-ss-<xy>.scan, where <xy> is the number written to the
> > "current_batch" file in hex. Range check of the input is done to verify
> > it not greater than 0xff.
>
> Dunno - sounds silly to me. Means one needs to go and look up which
> files are there and echo those batch numbers into sysfs and so on.

Not exactly. That's what this file is there for. It allows the algorithm to read the current batch file, add 1, then echo back. If the load succeeds, the batch exists; if not, then the algorithm should simply go back to 0.

That's what we're implementing here: https://github.com/opendcdiag/opendcdiag/pull/163

> What I would do is make it real trivial for the user so that latter can
> simply do:
>
> for f in $(ls /lib/firmware/intel/ifs_0/*.scan);
> do
>     echo $f > /sys/devices/virtual/misc/intel_ifs_0/test_file
> done
>
> and simply supply the full filename.

Unfortunately, there are other limitations that make such a simple algorithm not possible in the first place.

First, there's the question of the ability to see into /lib/firmware. I'm not a kernel dev but I'm told that request_firmware() only operates on the root container's filesystem view. We're expecting that the application may get deployed as a container (with full privileges so it can write to /sys, sure), so it won't be able to see the host system's /lib to know what files are available. It could "guess" at the file names, based on the current processor's family/model/stepping and a natural number, but that's sub-optimal.

Unless the driver were allowed to load any file named by the application, from its own view of the filesystem, permitting the firmware files to be distributed inside the container.
Second, for electrical reasons, we expect that certain processor generations will need a timeout between tests before testing can be done again on a given core, whether the same batch or the next one. This timeout can be on the order of many minutes, which is longer than any hyperscaler is willing to allocate for a system self-test hogging a core or the whole system, just waiting. For example, let's say that the timeout is 15 minutes and there are 4 batches: this means the whole testing procedure takes one hour, even though the actual downtime for each core was less than 1 second. This is lost revenue.

Instead, they wish the next available maintenance window to simply resume testing at the point where the last one stopped. These windows need not be scheduled; they can also be opportunistic, when the orchestrator determines the machine or a subset of one is going to be idle. That's what the algorithm in the pull request above implements: if the current_batch's result was "untested", it is attempted again, otherwise it tries the next one, rolling back to 0 if the loading failed. This removes the need to know anything about the timeout on the current processor or even whether there is one, or how many batches there are.

> So the kernel would simply open it, sanity-check it, if it passes, it
> would run it - otherwise it would ignore it.
>
> A usability win-win.
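The resume logic described in this message (retry on "untested", otherwise advance, roll back to 0 when the load fails) can be sketched as a small decision function. This is only an illustration under stated assumptions, not the code from the pull request: the enum, the load_fn callback and demo_load() are hypothetical stand-ins for reading the status file and writing to current_batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status values; the driver reports status as text in sysfs. */
enum ifs_status { IFS_PASS, IFS_FAIL, IFS_UNTESTED };

/* Hypothetical callback: returns true if the given batch loaded. */
typedef bool (*load_fn)(unsigned int batch);

/*
 * Pick the batch to test in the next maintenance window.
 * Returns the batch that is now loaded, or -1 if nothing could be loaded.
 */
static int select_next_batch(unsigned int current, enum ifs_status last,
			     load_fn load)
{
	unsigned int next;

	assert(load != NULL);

	/* The CPU declined to run the test: try the loaded batch again. */
	if (last == IFS_UNTESTED)
		return (int)current;

	/* Otherwise advance; a failed load means we ran past the last batch. */
	next = current + 1;
	if (load(next))
		return (int)next;

	/* Roll back to batch 0 and start the cycle over. */
	return load(0) ? 0 : -1;
}

/* Stand-in loader for illustration: pretend batches 0..3 exist. */
static bool demo_load(unsigned int batch)
{
	return batch < 4;
}
```

Note that the caller never needs to know the timeout or the number of batches; both are discovered through the status value and the load result.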
> Dunno - sounds silly to me. Means one needs to go and look up which > files are there and echo those batch numbers into sysfs and so on. > > What I would do is make it real trivial for the user so that latter can > simply do: > > for f in $(ls /lib/firmware/intel/ifs_0/*.scan); > do > echo $f > /sys/devices/virtual/misc/intel_ifs_0/test_file > done > > and simply supply the full filename. We tried the full file name in an earlier version. GregKH was unimpressed. But that was when we were trying to overload the meaning of the “reload” file. Is it different now? -Tony
On Sat, Nov 12, 2022 at 10:21:35AM -0800, Thiago Macieira wrote: > Not exactly. That's what this file is there for. It allows the algorithm to > read the current batch file, add 1, then echo back. If the load succeeds, the > the batch exists; if not, then the algorithm should simply go back to 0. This sounds to me like there's a special order in which those batches should be executed? I thought they're simply collections of test sequences which can be run in any order... > First, there's the question of the ability to see into /lib/firmware. I'm not a > kernel dev but I'm told that request_firmware() only operates on the root > container's filesystem view. We're expecting that the application may get > deployed as a container (with full privileges so it can write to /sys, sure), > so it won't be able to see the host system's /lib to know what files are > available. It could "guess" at the file names, based on the current processor's > family/model/stepping and a natural number, but that's sub-optimal. It is not about seeing - you simply give it the filename - request_firmware* does the "seeing". Either the file's there or it isn't. > Unless the driver were allowed to load any file named by the application, from > its own view of the filesystem, permitting the firmware files being distributed > inside the container. There's a reason I wrote: "There will be no requirement on the naming - only on the filename length and it should be in that directory /lib/firmware/intel/ifs_0/" Of course the driver should load only from that directory. > Second, for electrical reasons, we expect that certain processor generations > will need a timeout between tests before testing can be done again on a given > core, whether the same batch or the next one. This time out can be in the > order of many minutes, which is longer than any hyperscaler is willing to > allocate for a system self-test hogging a core or the whole system, just > waiting. 
For example, let's say that the timeout is 15 minutes and there are 4 > batches: this means the whole testing procedure takes one hour, even though > the actual downtime for each core was less than 1 second. This is lost > revenue. All that doesn't matter - if the CPU *must* wait 15 minutes between batches, then that should be enforced by the driver and not relied upon by userspace to DTRT. > Instead, they wish the next available maintenance window to simply resume > testing at the point where the last one stopped. These windows need not be > scheduled; they can also be opportunistic, when the orchestrator determines > the machine or a subset of one is going to be idle. That's what the algorithm > in the pull request above implements: if the current_batch's result was > "untested", it is attempted again, otherwise it tries the next one, rolling > back to 0 if the loading failed. This removes the need to know anything about > the timeout on the current processor or even whether there is one, or how many > batches there are.242 This all has nothing to do with whether you give it a number or a filename. How you glue your testing around it together is a userspace issue - all the kernel driver needs to be able to do is load the sequence and execute it. Echoing filenames into sysfs is no different from echoing numbers into it - former is simpler. If the CPU says it cannot execute the sequence currently, you have to think about how you retry that sequence. How you specify it doesn't matter.
On Sat, Nov 12, 2022 at 06:33:58PM +0000, Luck, Tony wrote: > We tried the full file name in an earlier version. GregKH was > unimpressed. But that was when we were trying to overload the meaning > of the “reload” file. I'm guessing the background of his lack of "impressiveness" was something along the lines of: "Attributes should be ASCII text files, preferably with only one value per file. It is noted that it may not be efficient to contain only one value per file, so it is socially acceptable to express an array of values of the same type." Question is, is a filename string one value per file or strings don't count? Suspend allows it: # echo "shutdown" > /sys/power/disk # echo "disk" > /sys/power/state Btw, do you guys really want to run this in production? As in, are you really that confident that all those test sequence executions really do not affect other CPU execution at all? Because if this is going to be run during downtime, as Thiago says, then you can just as well use debugfs for this. And then there's no need to cast any API in stone and so on. Hmmm.
On Sat, Nov 12, 2022 at 08:20:30PM +0100, Borislav Petkov wrote:
> On Sat, Nov 12, 2022 at 10:21:35AM -0800, Thiago Macieira wrote:
> > Not exactly. That's what this file is there for. It allows the algorithm to
> > read the current batch file, add 1, then echo back. If the load succeeds, the
> > the batch exists; if not, then the algorithm should simply go back to 0.
>
> This sounds to me like there's a special order in which those batches
> should be executed?

There is no special sequence required. The idea is that the user would like to run all tests, so simply going through them in order works. The user has no idea what the test content of each file is. If they want to repeat a specific batch on one core, they can skip straight to that batch; they don't need to go through the sequence.

>
> I thought they're simply collections of test sequences which can be run
> in any order...
>
> > First, there's the question of the ability to see into /lib/firmware. I'm not a
> > kernel dev but I'm told that request_firmware() only operates on the root
> > container's filesystem view. We're expecting that the application may get
> > deployed as a container (with full privileges so it can write to /sys, sure),
> > so it won't be able to see the host system's /lib to know what files are
> > available. It could "guess" at the file names, based on the current processor's
> > family/model/stepping and a natural number, but that's sub-optimal.
>
> It is not about seeing - you simply give it the filename -
> request_firmware* does the "seeing". Either the file's there or it
> isn't.

Here is a link to Tony's last post before the file format was changed to accommodate Greg's feedback:

https://lore.kernel.org/lkml/SJ1PR11MB6083263E8EEBF106B6B61B24FC969@SJ1PR11MB6083.namprd11.prod.outlook.com/

Cheers,
Ashok
> Btw, do you guys really want to run this in production? Absolutely yes. > As in, are you really that confident that all those test sequence > executions really do not affect other CPU execution at all? Tests may draw a lot of power, so prevent other cores going into turbo frequency states. But otherwise no effect. > Because if this is going to be run during downtime, as Thiago says, then > you can just as well use debugfs for this. And then there's no need to > cast any API in stone and so on. Did Thiago say “during downtime”? I think he talked about some users opportunistic use of scan tests. But that’s far from only during downtime. We fully expect CSPs to run these scans periodically on production machines. -Tony
On Saturday, 12 November 2022 11:20:30 PST Borislav Petkov wrote: > This sounds to me like there's a special order in which those batches > should be executed? > > I thought they're simply collections of test sequences which can be run > in any order... As Ashok replied, no, they are not ordered. But running them from first to last is the simplest algorithm to code. If we did support file names, then the directory order would be also as good as any (unsorted). > It is not about seeing - you simply give it the filename - > request_firmware* does the "seeing". Either the file's there or it > isn't. I meant knowing which files are there. If they don't form a specific pattern, then it's impossible to know what they have been named. And if they do form a specific pattern, what's the harm in just using the sequence number? It's much easier for the kernel to remember a single 8-bit number than a file name. It also allows for new techniques like deploying a single cpio or tarball with everything with an update to the kernel, without having userspace have to update to match. > There's a reason I wrote: > > "There will be no requirement on the naming - only on the filename > length and it should be in that directory /lib/firmware/intel/ifs_0/" > > Of course the driver should load only from that directory. Thank you for that explanation. But my argument was that the application driving this might be deployed as a container, as part of a container- orchestration and scheduling system, while the firmware files would already be pre-installed on the machine and updated with the regular firmware update mechanism. If the container can't see what files are there, it would have to resort to the guessing I mentioned above. CSPs could avoid this by bind- mounting /lib/firmware into the container, but we'd prefer not to add yet another constraint. 
> All that doesn't matter - if the CPU *must* wait 15 minutes between
> batches, then that should be enforced by the driver and not relied upon
> by userspace to DTRT.

It's enforced by the CPU today. How it determines when a test can be run is beside the point; the driver will ask it to run and the CPU will reply that it couldn't. That information is conveyed back to userspace by changing the "status" back to "untested".

> This all has nothing to do with whether you give it a number or a
> filename. How you glue your testing around it together is a userspace
> issue - all the kernel driver needs to be able to do is load the
> sequence and execute it.
>
> Echoing filenames into sysfs is no different from echoing numbers into
> it - former is simpler. If the CPU says it cannot execute the sequence
> currently, you have to think about how you retry that sequence. How you
> specify it doesn't matter.

Right, it's no different from echoing file names, but it's much simpler to increment a number than do a directory listing and sort the file names, so it can pick up from where it left off.

I mentioned this complication to explain why the userspace won't be able to simply echo each file name and expect things to have reached full coverage. Unfortunately, userspace needs to cope with the fact that there will be a timeout for certain generations. It's not what we'd have preferred; it's a hardware constraint we have to adapt to. With this constraint in mind, having a single number simplifies userspace. You can't do it with a three-line shell script, but we're not expecting that shell scripts are the means to use this feature in the first place.
On Saturday, 12 November 2022 15:32:47 PST Luck, Tony wrote: > > Because if this is going to be run during downtime, as Thiago says, then > > you can just as well use debugfs for this. And then there's no need to > > cast any API in stone and so on. > > Did Thiago say “during downtime”? I think > he talked about some users opportunistic > use of scan tests. But that’s far from only > during downtime. We fully expect CSPs to > run these scans periodically on production > machines. Let me clarify. I did not mean full system downtime for maintenance, but I did mean that there's a gap in consumer workload, for both threads of one or more cores. As Tony said, it should have little observable effect on any other core, meaning an IFS run can be scheduled *as* any other workload (albeit a privileged one) for a subset of the machine, while the rest of the system remains in production. This allows them a lot of flexibility and is the reason I am talking about containers, with the implied constraint that the container's view of the filesystem is narrower than the kernel's. There'll be some coordination required to get all cores to have run all tests, but it should be doable over a period of time, and I'm thinking days, not years. This should still be short enough to reveal if the system can detect a defect or wear-out before any real workload is impacted by it. If an issue is detected, the admin can decide whether to offline the core(s) reporting problems but keep the rest serving workloads and generating revenue, or offline the entire machine for full maintenance and to run more invasive and time-consuming tests.
On Sat, Nov 12, 2022 at 06:33:58PM +0000, Luck, Tony wrote: > > > Dunno - sounds silly to me. Means one needs to go and look up which > > files are there and echo those batch numbers into sysfs and so on. > > > > What I would do is make it real trivial for the user so that latter can > > simply do: > > > > for f in $(ls /lib/firmware/intel/ifs_0/*.scan); > > do > > echo $f > /sys/devices/virtual/misc/intel_ifs_0/test_file > > done > > > > and simply supply the full filename. > > We tried the full file name in an earlier version. GregKH was unimpressed. But that was when we were trying to overload the meaning of the “reload” file. > > Is it different now? No, please do not force the driver to resolve a filename path in the kernel.
Replying to both with one mail because it still feels like there's a misunderstanding.

On Sun, Nov 13, 2022 at 08:37:32AM +0100, gregkh@linuxfoundation.org wrote:
> No, please do not force the driver to resolve a filename path in the
> kernel.

No, I don't mean to do any filename path resolving - all I suggest is to echo into sysfs the full filename instead of the number. I.e., this:

for i in $(ls /lib/firmware/intel/ifs_0/*.scan);
do
    echo $(basename $i) > /sys/devices/virtual/misc/intel_ifs_0/current_batch
done

What you end up echoing inside is only the full filename - *not* the absolute filename - instead of a number. So those in a succession:

06-8f-06-00.scan
06-8f-06-01.scan
06-8f-06-02.scan
06-8f-06-03.scan
06-8f-06-04.scan
06-8f-06-05.scan

The advantage being, you don't need to remember which file sequence and which family/model/stepping.

For all Intel employees here on the thread, there's a world outside Intel and people do not talk (family model stepping) tuples like we do. All they wanna do is run their damn tests.

So instead of what the code does now:

+	snprintf(scan_path, sizeof(scan_path), "intel/ifs_%d/%02x-%02x-%02x-%02x.scan",
+		 ifsd->test_num, boot_cpu_data.x86, boot_cpu_data.x86_model,
+		 boot_cpu_data.x86_stepping, ifsd->cur_batch);

It would still use the *same* scan_path - /lib/firmware/intel/ifs_0/ - no one is proposing to give a full path name - it would only use the filename string - 06-8f-06-00.scan, for example - instead of the "0" in it to build that string.

And, of course, it would check the format of that string against family, model, stepping and sequence number (btw this way you drop your limitation of 256 for the sequence number which you don't really need either).

And then if the format passes, it would check the headers.

And only if those pass too, then it would load.
> Right, it's no different from echoing file names, but it's much
> simpler to increment a number than do a directory listing and sort the
> file names, so it can pick up from where it left off.

It is no different - you still need to remember where you are in the sequence wrt testing.

So if you want to deal with the timeout, that same script above will check the status and wait. Not rocket science.

As to this Thiago:

> You can't do it with a three-line shell script, but we're not
> expecting that shell scripts are the means to use this feature in the
> first place.

I consider it a serious design mistake to require a *special* tool. A special tool is *always* a burden. You need to build it, supply it, make sure it is installable on the target system and so on.

And I'm telling you this with my Linux distributor hat on. It is always a pain - trust me.

For example, there's a reason why you can still control ftrace from the command line and you don't need any tool. You *can* use a tool but you don't have to. IOW, the KISS philosophy.

IOW, I really think that designing our interfaces with user friendliness in mind should be of much higher importance. And requiring the user to remember or figure out anything she doesn't really need to, in order to run the test, is a burden.

Just look at microcode loading: early loading works automatically - you only need to install the blobs in /lib/firmware/intel-ucode/. The initrd builder figures out which to add to the image. And not even that is required - I have a script which adds *all* blobs to my image. It still loads the right one.

Late loading works also trivially:

echo 1 > /sys/devices/system/cpu/microcode/reload

And it goes and builds the filename from f/m/s and loads it from the hardcoded path - no filename resolving.

But it doesn't ask the user to give a f/m/s or a sequence number.

I sincerely hope that makes more sense.
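The format check proposed in this message (accept a bare filename, then verify it against family, model, stepping and a sequence number) could look roughly like the sketch below. It is a userspace illustration only, not driver code: scan_name_matches() is a hypothetical name, and the choice to accept only lowercase hex digits is an assumption made to match the %02x naming used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical check: does "name" look like ff-mm-ss-<seq>.scan for the
 * given family/model/stepping? On success the sequence number is stored
 * in *seq. Names with a directory component never match the prefix.
 */
static bool scan_name_matches(const char *name, unsigned int family,
			      unsigned int model, unsigned int stepping,
			      unsigned int *seq)
{
	char prefix[32];
	const char *p;
	char *end;
	size_t digits;
	unsigned long val;
	int n;

	assert(name != NULL && seq != NULL);

	/* Build the expected "ff-mm-ss-" prefix with the patch's %02x style. */
	n = snprintf(prefix, sizeof(prefix), "%02x-%02x-%02x-",
		     family, model, stepping);
	if (n < 0 || (size_t)n >= sizeof(prefix))
		return false;
	if (strncmp(name, prefix, (size_t)n) != 0)
		return false;

	/* The sequence number must be a run of lowercase hex digits. */
	p = name + n;
	digits = strspn(p, "0123456789abcdef");
	if (digits == 0)
		return false;

	errno = 0;
	val = strtoul(p, &end, 16);
	if (errno != 0 || end != p + digits || val > UINT_MAX)
		return false;

	/* Nothing may follow the ".scan" suffix. */
	if (strcmp(end, ".scan") != 0)
		return false;

	*seq = (unsigned int)val;
	return true;
}
```

Because the sequence is parsed as an arbitrary-length hex run, this variant is not limited to 0xff, which is the side benefit mentioned above.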
On Sun, Nov 13, 2022 at 12:48:40PM +0100, Borislav Petkov wrote: > Replying to both with one mail because it still feels like there's a > misunderstanding. > > On Sun, Nov 13, 2022 at 08:37:32AM +0100, gregkh@linuxfoundation.org wrote: > > No, please do not force the driver to resolve a filename path in the > > kernel. > > No, I don't mean to do any filename path resolving - all I suggest is to > echo into sysfs the full filename instead of the number. I.e., this: > > for i in $(ls /lib/firmware/intel/ifs_0/*.scan); > do > echo $i /sys/devices/virtual/misc/intel_ifs_0/current_batch > done > > What you end up echoing inside is only the full filename - *not* the > absolute filename - instead of a number. So those in a succession: > > 06-8f-06-00.scan > 06-8f-06-01.scan > 06-8f-06-02.scan > 06-8f-06-03.scan > 06-8f-06-04.scan > 06-8f-06-05.scan Do you expect the /lib/firmware/intel/ifs_0/ to contain *ONLY* files for this platform? For microcode we have everything in the public release included here. In the above proposal, you can *ONLY* put files for this platform unlike simply copying everything released and let the kernel pick the right one since it does the ff-mm-ss-*.scan lookup. Only the batch number is supplied from user space. > > The advantage being, you don't need to remember which file sequence and > which family/model/stepping. Even in the current implementation, user doesn't need to know f/m/s. That's something the driver selects automatically, just like what microcode does for reload. > > For all Intel employees here on the thread, there's a world outside > Intel and people do not talk (family model stepping) tuples like we do. > All they wanna do is run their damn tests. 
> > So instead of what the code does now: > > + snprintf(scan_path, sizeof(scan_path), "intel/ifs_%d/%02x-%02x-%02x-%02x.scan", > + ifsd->test_num, boot_cpu_data.x86, boot_cpu_data.x86_model, > + boot_cpu_data.x86_stepping, ifsd->cur_batch); > > It would still use the *same* scan_path - /lib/firmware/intel/ifs_0/ - > no one is proposing to give a full path name - it would only use the > filename string - 06-8f-06-00.scan, for example - instead of the "0" in > it to build that string. > > And, ofcourse it would check the format of that string against family, > model, stepping and sequence number (btw this way you drop your > limitation of 256 for the sequence number which you don't really need > either). > > And then if the format passes, it would check the headers. > > And only if those pass too, then it would load. Isn't it simple now? No need to check if user supplied the right f/m/s since its not a user input, kernel composes that automatically. > > Right, it's no different from echoing file names, but it's much > > simpler to increment a number than do a directory listing and sort the > > file names, so it can pick up from where it left off. > > It is no different - you still need to remember where you are in the > sequence wrt to testing. > > So if you want to deal with the timeout, that same script above will > check the status and wait. Not rocket science. > > As to this Thiago: > > > You can't do it with a three-line shell script, but we're not > > expecting that shell scripts are the means to use this feature in the > > first place. > > I consider it a serious design mistake of having to have a *special* > tool. A special tool is *always* a burden. You need to build it, supply > it, make sure it is installable on the target system and so on. > > And I'm telling you this with my Linux distributor hat on. It is always > a pain - trust me. > > For example, there's a reason why you can still control ftrace from the > command line and you don't need any tool. 
You *can* use a tool but you > don't have to. IOW, the KISS philosophy. The tool we have is not for a simple test run. That can easily be done with a shell script. The tool does a bit more like handling retries if the test was not scheduled. The fundamental pass/fail simply output is what the kernel provides. > > IOW, I really think that designing our interfaces with user friendliness > in mind should be of much more higher importance. And requiring the user > to remember or figure out anything she doesn't really need to, in order > to run the test is a burden. > > Just look at microcode loading: early loading works automatically - you > only need to install the blobs in /lib/firmware/intel-ucode/. The initrd > builder figures out which to add to the image. > > And not even that is required - I have a script which adds *all* blobs > to my image. It still loads the right one. > > Late loading works also trivially: > > echo 1 > /sys/devices/system/cpu/microcode/reload > > And it goes and builds the filename from f/m/s and loads it from the > hardcoded path - no filename resolving. > > But it doesn't ask the user to give a f/m/s or a sequence number. I don't think the current proposed interface expects a f/m/s. The entire IFS design was sort of mimicking the microcode interface. The V1 submission did pretty much that. But it required copying files from some place to /lib/firmware so they can echo 1 > reload. Instead of copying files over that *did* sound like a hack, the batch was introduced and that's the only thing needed from user input. The utility is more like icing, to run a simple test all you need is a simple script. It is not a baseline requirement. Cheers, Ashok
On Sun, Nov 13, 2022 at 07:15:03AM -0800, Ashok Raj wrote: > Do you expect the /lib/firmware/intel/ifs_0/ to contain *ONLY* files for > this platform? For microcode we have everything in the public release > included here. Same as microcode, as I said further down in my mail: "And, ofcourse it would check the format of that string against family, model, stepping and sequence number (btw this way you drop your limitation of 256 for the sequence number which you don't really need either)." > In the above proposal, you can *ONLY* put files for this platform > unlike simply copying everything released and let the kernel pick the > right one since it does the ff-mm-ss-*.scan lookup. Only the batch > number is supplied from user space. No, see above. You check the filename against the current f/m/s. Just like microcode. > Even in the current implementation, user doesn't need to know f/m/s. > That's something the driver selects automatically, just like what > microcode does for reload. Basically what I'm saying all this time. > Isn't it simple now? No need to check if user supplied the right f/m/s > since its not a user input, kernel composes that automatically. Let's see * try echoing a magic number into some sysfs file vs * simply try *all* files in a directory Latter is even simpler because you don't have to explain anything about sequence numbers - the user doesn't need to know. > The tool we have is not for a simple test run. That can easily be done with > a shell script. The tool does a bit more like handling retries if the test > was not scheduled. The fundamental pass/fail simply output is what the > kernel provides. Because a simple script can't handle retries based on the values read from that sysfs file? Yeah, right. > I don't think the current proposed interface expects a f/m/s. The entire > IFS design was sort of mimicking the microcode interface. Ashok, you prove for the nth time that you don't really read my emails. Sorry, try again.
On 11/13/2022 3:48 AM, Borislav Petkov wrote: > > So instead of what the code does now: > > + snprintf(scan_path, sizeof(scan_path), "intel/ifs_%d/%02x-%02x-%02x-%02x.scan", > + ifsd->test_num, boot_cpu_data.x86, boot_cpu_data.x86_model, > + boot_cpu_data.x86_stepping, ifsd->cur_batch); > > It would still use the *same* scan_path - /lib/firmware/intel/ifs_0/ - > no one is proposing to give a full path name - it would only use the > filename string - 06-8f-06-00.scan, for example - instead of the "0" in > it to build that string. > > And, ofcourse it would check the format of that string against family, > model, stepping and sequence number (btw this way you drop your > limitation of 256 for the sequence number which you don't really need > either). > > And then if the format passes, it would check the headers. Do you think it is better to restrict filename input to confirm to ff-mm-ss-xy.<test> format rather than accepting any string and treating it as a file-name and trying to load it, if it is present ? (Given that, before loading, We do intel_find_matching_signature(), which validates if the signature/pf entries in header confirms to the machine we are on before loading) We did accepting file-name as input before [1] (except for validating if the filename confirms to ff-mm-ss format) > > And only if those pass too, then it would load. > [1] https://lore.kernel.org/lkml/20220710160011.995800-1-jithu.joseph@intel.com/
On Sun, Nov 13, 2022 at 08:41:47AM -0800, Joseph, Jithu wrote:
> Do you think it is better to restrict filename input to confirm
> to ff-mm-ss-xy.<test> format rather than accepting any string and
> treating it as a file-name and trying to load it, if it is present ?

Yap, that way you pre-filter filenames. Yeah, the header checks *must*
absolutely happen too but it would be a simple first test.

Also, you can check for:

    ff-mm-ss-X.<test>

where X is a [0-9]+ of arbitrary length and this way you won't have the
artificial 256 limit you have now.

> (Given that, before loading, We do intel_find_matching_signature(),
> which validates if the signature/pf entries in header confirms to the
> machine we are on before loading)
>
> We did accepting file-name as input before [1] (except for validating
> if the filename confirms to ff-mm-ss format)

Yeah, except now you want to do multiple sets of scan files.

But yeah, you can extend the interface even more:

    echo <fname> /sysfs/file

* if f/m/s matches, you execute

  if still within the timeout, you return -EAGAIN from
  current_batch_store() to tell userspace, take a nap and try again.

* if the f/m/s doesn't match you return -EINVAL to say, wrong filename,
  try the next one.

And this way, the user script can simply look at the retvals and act
accordingly - it won't even need to parse dmesg or whatnot.

For example.
On Sun, Nov 13, 2022 at 04:58:52PM +0100, Borislav Petkov wrote: > On Sun, Nov 13, 2022 at 07:15:03AM -0800, Ashok Raj wrote: > > Do you expect the /lib/firmware/intel/ifs_0/ to contain *ONLY* files for > > this platform? For microcode we have everything in the public release > > included here. > > Same as microcode, as I said further down in my mail: > > "And, ofcourse it would check the format of that string against family, -----------------^^ It is user space or the kernel driver? > model, stepping and sequence number (btw this way you drop your > limitation of 256 for the sequence number which you don't really need > either)." > > > In the above proposal, you can *ONLY* put files for this platform > > unlike simply copying everything released and let the kernel pick the > > right one since it does the ff-mm-ss-*.scan lookup. Only the batch > > number is supplied from user space. > > No, see above. You check the filename against the current f/m/s. Just > like microcode. If it's Ok to ask a question. "You" above is the kernel? Microcode has no such functionality today right? User space never inputs a filename, only performs echo 1 > reload. If a file name composed by the kernel exists, then it checks the header validity before proceeding. Apologize if I misunderstood you. > > > Even in the current implementation, user doesn't need to know f/m/s. > > That's something the driver selects automatically, just like what > > microcode does for reload. > > Basically what I'm saying all this time. > > > Isn't it simple now? No need to check if user supplied the right f/m/s > > since its not a user input, kernel composes that automatically. > > Let's see > > * try echoing a magic number into some sysfs file > > vs > > * simply try *all* files in a directory > > Latter is even simpler because you don't have to explain anything about > sequence numbers - the user doesn't need to know. Ok, so can you take through how that is going to work.. 
    for i in *
    do
        echo $i > batch
    done

If some of the files are for a different fms, kernel will check the
format and ignore the input.

So some of the files will work, some will fail, and user space doesn't
care?

Please pardon if I misunderstood you.

> Ashok, you prove for the nth time that you don't really read my emails.
> Sorry, try again.

I absolutely read your emails Boris. But when I have a misunderstanding
it's an iterative process. Asking a question or responding to your email
doesn't mean I'm willfully ignoring you, or have any ill or malicious
intent.

You are reviewing the code and I'm simply discussing what each person
means. I hope its OK to have a dialog.

Cheers,
Ashok
On 11/13/2022 8:58 AM, Borislav Petkov wrote: > On Sun, Nov 13, 2022 at 08:41:47AM -0800, Joseph, Jithu wrote: >> Do you think it is better to restrict filename input to confirm >> to ff-mm-ss-xy.<test> format rather than accepting any string and >> treating it as a file-name and trying to load it, if it is present ? > > Yap, that way you pre-filter filenames. Yeah, the header checks *must* > absolutely happen too but it would be a simple first test. > > Also, you can check for: > > ff-mm-ss-X.<test> > > where X is a [0-9]+ of arbitrary length and this way won't have the > artificial 256 limit you have now. > Thanks for clarifying >> (Given that, before loading, We do intel_find_matching_signature(), >> which validates if the signature/pf entries in header confirms to the >> machine we are on before loading) >> >> We did accepting file-name as input before [1] (except for validating >> if the filename confirms to ff-mm-ss format) > > Yeah, except now you want to do multiple sets of scan files. The earlier change (which modified reload sysfs file from accepting a file-name instead of 1) was for allowing multiple test files (i.e no different from what we are doing now) Then we were told not to specify a filename via sysfs file (apologies for being repetitive) Jithu [1] https://lore.kernel.org/lkml/20220710160011.995800-1-jithu.joseph@intel.com/
On Sun, Nov 13, 2022 at 09:55:00AM -0800, Joseph, Jithu wrote: > Then we were told not to specify a filename via sysfs file (apologies > for being repetitive) Yeah, I'm unclear on why that is either and am hoping that Greg will clarify. He fears that there will be file path resolution which I'm not even thinking about.
On Sun, Nov 13, 2022 at 09:01:32AM -0800, Ashok Raj wrote: > If it's Ok to ask a question. "You" above is the kernel? Of course the kernel. If you think about it, it makes sense only for the kernel to do any checking. As it is enforcing that only the proper blobs are loaded. Just like microcode. Userspace is only doing the triggering of the actions. > Microcode has no such functionality today right? User space > never inputs a filename, only performs echo 1 > reload. Yes, because it is as user-friendly as possible. Users should not care about filenames. But microcode needs only a single file. If you have multiple files like IFS, you could just as well supply them and the kernel would check every aspect before loading them. > If a file name composed by the kernel exists, then it checks the header > validity before proceeding. Yes. > So some of the files will work, some will fail, and user space doesn't > care? See my reply to Jithu: https://lore.kernel.org/r/Y3EiKUzpShqwzEf6@zn.tnic > You are reviewing the code and I'm simply discussing what each person > means. I hope its OK to have a dialog. I say | Late loading works also trivially: | | echo 1 > /sys/devices/system/cpu/microcode/reload | | And it goes and builds the filename from f/m/s and loads it from the | hardcoded path - no filename resolving. | | But it doesn't ask the user to give a f/m/s or a sequence number. You reply with | I don't think the current proposed interface expects a f/m/s. The | entire IFS design was sort of mimicking the microcode interface." | | and you go on to explain what it used to do. I read what it used to do. So how does your reply have any relevance to what I'm saying? I go and give the full spiel on how it is important to support command line loading and how you don't really need a special tool, you say |The utility is more like icing, to run a simple test all you need is a |simple script. It is not a baseline requirement." 
which feels like you didn't read this part *at* *all*: | A special tool is *always* a burden. You need to build it, supply | it, make sure it is installable on the target system and so on. | | And I'm telling you this with my Linux distributor hat on. It is always | a pain - trust me. | | For example, there's a reason why you can still control ftrace from the | command line and you don't need any tool. You *can* use a tool but you | don't have to. IOW, the KISS philosophy. So now I ended up pasting practically the most of my text again. Why? Because your reply doesn't give me *any* signs that you actually read what I said.
On Sunday, 13 November 2022 08:58:17 PST Borislav Petkov wrote:
> * if f/m/s matches, you execute
>
>   if still within the timeout, you return -EAGAIN from
>   current_batch_store() to tell userspace, take a nap and try again.
>
> * if the f/m/s doesn't match you return -EINVAL to say, wrong filename,
>   try the next one.

But if it matches but is corrupt or the HW fails to accept it, you also
get an error. So you now need to differentiate a failure to load a
candidate file and an attempt to load a non-candidate.

I'm assuming that the kernel would provide different error conditions for
those. But handling those in shell scripting is very difficult: you'd need
to start a subshell and parse stderr. It's MUCH easier in C, of course,
but incrementing a number is magnitudes easier in C than performing a
globbing.

Either way, incrementing a number in shell is pretty easy too. The
simplest script with just the numbers would be:

    i=0
    while echo $i > /sys/devices/virtual/misc/intel_ifs_0/current_batch; do
        test_all_cpus; i=$((i + 1))
    done

It's four lines and does not need to know about where the scan files are
located, how they're named and if some files it may find are not to be
used.

But I've hidden a lot of complexity in the test_all_cpus shell function,
which would be common to either solution of how we specify the batch to
be loaded.

And this is part of my argument of why it's unlikely people will use
their shells to do this. That shell function is easily another 10 lines
of scripting, if it's meant to do its job properly. To make that easier,
we've developed two tools, one of them the OpenDCDiag tool I linked to,
but both just happen to be written in C and C++ instead of shell.

> For all Intel employees here on the thread, there's a world outside
> Intel and people do not talk (family model stepping) tuples like we do.
> All they wanna do is run their damn tests.

Indeed they do. I have personally been in contact with the few that will
represent over 90% of the deployment of this feature for the next few
years. They want this functionality to integrate with their existing
health-check scanning methodology.

This is where OpenDCDiag comes in, because it does integrate with their
workflows, like logging. For another example, it obeys the cpuset that
the parent process may have set with sched_setaffinity() or alike tools
(taskset(1) or schedtool(8)). I have zero clue how to do that with shell
scripting.

Which actually means I am the maintainer of the tool that is going to be
driving 99% or more of all scans (that's why I was cc'ed in the
submission). I am your user.

I'm not saying I am the only user. I definitely want to see the best
interface so that others could write tools too if they want to. And I
don't want there to be a kludge that we need to keep compatibility with
for a decade, or to set a bad precedent. But I am giving you the
constraints that I need to work under and the kernel interface to support
me. I am telling you this is very good right now and your proposal makes
it worse for me, not better, for little apparent gain.
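For readers following along, the numeric loop described above can be sketched a little more fully. This is only an illustration, not code from the patch series or from OpenDCDiag: the names load_batch, run_all_batches and IFS_BATCH_FILE are made up for this example, test_all_cpus is left as an empty stub, and the sketch assumes that a write to current_batch fails once no image exists for that batch number.

```shell
#!/bin/sh
# Sketch only: load_batch, run_all_batches and IFS_BATCH_FILE are
# hypothetical names chosen for this example.
IFS_BATCH_FILE=${IFS_BATCH_FILE:-/sys/devices/virtual/misc/intel_ifs_0/current_batch}

# Ask the driver to load batch number $1.
# Assumption: the write fails when no matching image can be loaded.
load_batch() {
    echo "$1" > "$IFS_BATCH_FILE"
}

# Stub standing in for the per-core test loop, which is not covered here.
test_all_cpus() {
    :
}

# Load batches 0, 1, 2, ... until a load fails, testing after each load.
# Prints the number of batches that loaded successfully.
run_all_batches() {
    i=0
    while load_batch "$i" 2>/dev/null; do
        test_all_cpus
        i=$((i + 1))
    done
    echo "$i"
}
```

Keeping the write in its own function means the loop itself never needs to know where the scan files live or how they are named, which is the property argued for above.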
On Sun, Nov 13, 2022 at 07:27:26PM +0100, Borislav Petkov wrote:
> On Sun, Nov 13, 2022 at 09:55:00AM -0800, Joseph, Jithu wrote:
> > Then we were told not to specify a filename via sysfs file (apologies
> > for being repetitive)
>
> Yeah, I'm unclear on why that is either and am hoping that Greg will
> clarify. He fears that there will be file path resolution which I'm not
> even thinking about.

Summarizing the competing proposals here:

Option 1 (patches as currently posted)

User writes the batch number to the sysfs file:

    # echo 4 > /sys/devices/virtual/misc/intel_ifs_0/current_batch

Driver turns that into a *partial* path (with test type,
family-model-stepping and batch number filled in):

    "intel/ifs_%d/%02x-%02x-%02x-%02x.scan"

Feeds that to request_firmware_direct() (which looks in /lib/firmware)

Option 2 (proposed by Boris)

User writes a filename to the sysfs file:

    # echo 06-8f-06-04.scan > /sys/devices/virtual/misc/intel_ifs_0/current_batch

Driver parses that:
    If family-model-stepping does not match current CPU, then fail with -EINVAL
    If filename doesn't end with a ".scan" suffix, also fail with -EINVAL
Otherwise proceeds in similar manner to above. Constructs partial
pathname (just fills in test type and filename):

    "intel/ifs_%d/%s"

Feeds that to request_firmware_direct() (which looks in /lib/firmware)

IMHO option 1 is following the microcode precedent of having the kernel
construct the filename based on the {x86,model,stepping} fields of
struct cpuinfo_x86.

I think option 2 isn't really doing the user any favors. Having them
feed all the *.scan files they find in /lib/firmware/intel/ifs_0 to the
driver to see which ones work becomes progressively worse in every CPU
generation. Any script/app running tests is likely to do the ff-mm-ss
filter itself ... so why have the kernel do it too?

-Tony
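As a rough userspace illustration of the naming scheme both options rely on, the sketch below mirrors the "%02x-%02x-%02x-%02x.scan" format in shell and shows the kind of ff-mm-ss filter a script might apply itself. The function names ifs_scan_name and ifs_matches_fms are hypothetical, invented for this example; the real composition and checking happen in the driver, and the family/model/stepping values would have to come from the caller.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the IFS driver or any shipped tool.

# ifs_scan_name FAMILY MODEL STEPPING BATCH
# Prints the image name using the same layout as the driver's format
# string: four two-digit lowercase hex fields followed by ".scan".
ifs_scan_name() {
    printf '%02x-%02x-%02x-%02x.scan\n' "$1" "$2" "$3" "$4"
}

# ifs_matches_fms NAME FAMILY MODEL STEPPING
# Succeeds when NAME starts with the given ff-mm-ss prefix and ends in
# ".scan". This is a loose prefix/suffix filter only; it does not
# validate the sequence-number field or look inside the file.
ifs_matches_fms() {
    prefix=$(printf '%02x-%02x-%02x-' "$2" "$3" "$4")
    case "$1" in
        "$prefix"*.scan) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `ifs_scan_name 6 143 6 3` yields the 06-8f-06-03.scan name used earlier in the thread. Under either option the header checks in the kernel remain the authoritative test.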
On Sunday, 13 November 2022 07:58:52 PST Borislav Petkov wrote:
> * simply try *all* files in a directory
By the way, we don't want that.
It's possible that different steppings of the same generation will have the
same test scan files, with the extended signature informing that they are valid
for this stepping too (see find_matching_signature()), because these files are
going to be pretty big, in the order of a hundred MB each. That means we will
either see symlinked or hardlinked files in the directory.
If you blindly try them all, you're going to spend twice or three times as
long as necessary to complete the scan. With the timeout in question for at
least Sapphire Rapids, we're talking about a difference measured in hours.
On Sunday, 13 November 2022 03:48:40 PST Borislav Petkov wrote: > > You can't do it with a three-line shell script, but we're not > > expecting that shell scripts are the means to use this feature in the > > first place. > > I consider it a serious design mistake of having to have a *special* > tool. A special tool is *always* a burden. You need to build it, supply > it, make sure it is installable on the target system and so on. > > And I'm telling you this with my Linux distributor hat on. It is always > a pain - trust me. > [cut] > IOW, I really think that designing our interfaces with user friendliness > in mind should be of much more higher importance. And requiring the user > to remember or figure out anything she doesn't really need to, in order > to run the test is a burden. We agree that it should be operable without a tool. If nothing else, we will run into situations where we need to debug what's happening and the tool is not going to work for those conditions. Having a direct access to the API right there in /sys is great and is one of the things that Linux excels at and differentiates itself from the competition with. However, I am saying that we shouldn't have to go out of our way to make the extreme corner case easy if it comes to the detriment of the 99% case. Anyway, we can update the tool to print "%02x-%02x-%02x-%02x.%s" instead of "%d". That's trivial to us. I just don't think it's a worthwhile change, because four of the five placeholders there are enforced by the kernel and therefore the kernel must check them again (maybe it does so anyway when it opens and validates the file). What REALLY matters to me is that the current_batch file be readable and I can get the last batch that was loaded, in a well-known format. Without this, we go from "inconvenience" to "major design change, must talk to customers and customers must adapt their workloads". Please help me out here.
On Sun, Nov 13, 2022 at 01:21:41PM -0800, Thiago Macieira wrote: > I'm assuming that the kernel would provide different error conditions for > those. But handling those in shell scripting is very difficult: you'd need to > start a subshell and parse stderr. You call that "very difficult"?! > And this is part of my argument of why it's unlikely people will use their > shells to do this. That shell function is easily another 10 lines of > scripting, if it's meant to do its job properly. To make that easier, we've > developed two tools, one of them the OpenDCDiag tool I linked to, but both > just happen to be written in C and C++ instead of shell. I've told you already that having a special tool is always more problematic. I guess you'll make your own experience distributing it but if you see pushback and people start complaining, you shouldn't wonder why.
On Sun, Nov 13, 2022 at 01:33:50PM -0800, Tony Luck wrote:
> IMHO option 1 is following the microcode precedent of having the kernel
> construct the filename based on the {x86,model,stepping} fields of
> struct cpuinfo_x86.

You can't follow the microcode precedent if you have more than one file.

> I think option 2 isn't really doing the user any favors. Having them
> feed all the *.scan files they find in /lib/firmware/intel/ifs_0 to the
> driver to see which ones work becomes progressively worse in every CPU
> generation. Any script/app running tests is likely to do the ff-mm-ss
> filter itself ... so why have the kernel do it too?

So you think that feeding a sequence number is more intuitive?

I guess I've given all my arguments here but you folks don't see things
this way. I guess time will show whether the sequence number thing was a
good idea.
On Sun, Nov 13, 2022 at 01:40:56PM -0800, Thiago Macieira wrote: > On Sunday, 13 November 2022 07:58:52 PST Borislav Petkov wrote: > > * simply try *all* files in a directory > > By the way, we don't want that. > > It's possible that different steppings of the same generation will have the > same test scan files, with the extended signature informing that they are valid > for this stepping too (see find_matching_signature())), because these files are > going to be pretty big, in the order of a hundred MB each. That means we will > either see symlinked or hardlinked files in the directory. > > If you blindly try them all, you're going to spend twice or three times as > long as necessary to complete the scan. With the timeout in question for at > least Sapphire Rapids, we're talking about a difference measured in hours. You either have files which are valid and which will get run on the CPU or those which will fail the header check. The header check happens in msecs. So I have no clue what "hours" you're talking about here.
On Sun, Nov 13, 2022 at 01:51:56PM -0800, Thiago Macieira wrote: > Anyway, we can update the tool to print "%02x-%02x-%02x-%02x.%s" instead of > "%d". That's trivial to us. I just don't think it's a worthwhile change, As I wrote to Tony, "I guess time will show whether the sequence number thing was a good idea." Just remember that changing a user-visible interface after the fact is a lot harder.
On Sun, Nov 13, 2022 at 12:48:40PM +0100, Borislav Petkov wrote: > Replying to both with one mail because it still feels like there's a > misunderstanding. > > On Sun, Nov 13, 2022 at 08:37:32AM +0100, gregkh@linuxfoundation.org wrote: > > No, please do not force the driver to resolve a filename path in the > > kernel. > > No, I don't mean to do any filename path resolving - all I suggest is to > echo into sysfs the full filename instead of the number. I.e., this: > > for i in $(ls /lib/firmware/intel/ifs_0/*.scan); > do > echo $i /sys/devices/virtual/misc/intel_ifs_0/current_batch > done Sorry, yes, that is fine, I was objecting to the previous "write any path/file to the sysfs entry and the kernel will parse it" that was happening in the original series. A filename, without a path, that always loads from the existing in-kernel firmware path locations is fine. thanks, greg k-h
Hi, On 11/14/22 00:05, Borislav Petkov wrote: > On Sun, Nov 13, 2022 at 01:51:56PM -0800, Thiago Macieira wrote: >> Anyway, we can update the tool to print "%02x-%02x-%02x-%02x.%s" instead of >> "%d". That's trivial to us. I just don't think it's a worthwhile change, > > As I wrote to Tony, "I guess time will show whether the sequence number > thing was a good idea." Just for the record: I've read the entire thread and I'm fine with doing things either way. If I understand this last email correctly then the plan is to move ahead with the patches as-is, with writing only the batch-number and have the kernel create the filename. This was and still is fine with me. Regards, Hans
On Mon, Nov 14, 2022 at 08:15:58AM +0100, gregkh@linuxfoundation.org wrote:
> On Sun, Nov 13, 2022 at 12:48:40PM +0100, Borislav Petkov wrote:
> > Replying to both with one mail because it still feels like there's a
> > misunderstanding.
> >
> > On Sun, Nov 13, 2022 at 08:37:32AM +0100, gregkh@linuxfoundation.org wrote:
> > > No, please do not force the driver to resolve a filename path in the
> > > kernel.
> >
> > No, I don't mean to do any filename path resolving - all I suggest is to
> > echo into sysfs the full filename instead of the number. I.e., this:
> >
> > for i in $(ls /lib/firmware/intel/ifs_0/*.scan);
> > do
> > echo $i /sys/devices/virtual/misc/intel_ifs_0/current_batch

Bug ... $i is a full path here. I think Boris meant:

    echo ${i##*/} > /sys/devices/virtual/misc/intel_ifs_0/current_batch

> > done
>
> Sorry, yes, that is fine, I was objecting to the previous "write any
> path/file to the sysfs entry and the kernel will parse it" that was
> happening in the original series. A filename, without a path, that
> always loads from the existing in-kernel firmware path locations is
> fine.

Just to set the record straight, the previous patch did *not* allow any
path/file. It seems that you misread that original patch. Whole thing is
here:

https://lore.kernel.org/all/20220708151938.986530-1-jithu.joseph@intel.com/

But the important bits are in the commit comment:

    Change the semantics of the "reload" file. Writing "1" keeps the legacy
    behavior to reload from the default "ff-mm-ss.scan" file, but now interpret
    other strings as a filename to be loaded from the /lib/firmware/intel/ifs
    directory.

and the code:

+ if (strchr(file_name, '/'))
+         goto done;

Is there some other function/macro that we should have used to check that
user input was a filename and not a pathname that would have made this
clearer?

I do agree that overloading semantics of the "reload" was icky. Changing
the name of the sysfs file and dropping the "1" means reload default has
made this a better interface.

-Tony
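The ${i##*/} expansion in the correction above strips everything up to the last "/" and leaves only the filename. A minimal sketch of listing scan images that way, without relying on ls, might look like the following; scan_basenames is a hypothetical helper name used only for this example, and the directory is passed in by the caller rather than assumed.

```shell
#!/bin/sh
# scan_basenames DIR
# Hypothetical helper: prints the name of every *.scan file in DIR,
# one per line, with the leading directory removed.
scan_basenames() {
    for path in "$1"/*.scan; do
        # If nothing matches, the glob is left as a literal string; skip it.
        [ -e "$path" ] || continue
        # ${path##*/} removes the longest prefix ending in "/".
        echo "${path##*/}"
    done
}
```

Under the filename-based proposal each printed name is what would be written to current_batch; the patches as posted take a batch number instead, so this only illustrates the shell idiom.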
On Mon, Nov 14, 2022 at 07:33:40AM -0800, Tony Luck wrote:
> Bug ... $i is a full path here. I think Boris meant:
Kinda.
I meant:
for i in $(ls /lib/firmware/intel/ifs_0/) ...
initially. Note the missing glob at the end. This gives the filenames
only, without the path. But then I added "*.scan" without testing it.
The "*.scan" thing would've been redundant too if we did what I was
suggesting...
On 11/13/22 07:58, Borislav Petkov wrote:
> On Sun, Nov 13, 2022 at 07:15:03AM -0800, Ashok Raj wrote:
>> Do you expect the /lib/firmware/intel/ifs_0/ to contain *ONLY* files for
>> this platform? For microcode we have everything in the public release
>> included here.
> Same as microcode, as I said further down in my mail:
>
> "And, ofcourse it would check the format of that string against family,
> model, stepping and sequence number (btw this way you drop your
> limitation of 256 for the sequence number which you don't really need
> either)."

Maybe dumb question, but what's the point of even checking the
filenames? They're meaningless.

Let's say we're on family=1,model=2,stepping=3. We try to load test #99:

    01-02-03-99.scan

The kernel goes and does the sscanf() and checks "01", "02", etc...
Everything is fine. The header checks on the .scan file are OK. Life is
good. No harm no foul.

Then, some dastardly user does this:

    mv 04-05-06-99.scan 01-02-03-99.scan

Taking an evil family=4,model=5,stepping=6 .scan file and trying to load
it. It will *pass* the sscanf() checks. But, will fail the metadata
checks. The kernel wasted the effort of requesting the file, but there's
no harm to anything.

So, what's the point of the sscanf() to check the *filename* other than
saving some potentially expensive request_firmware() calls?
> So, what's the point of the sscanf() to check the *filename* other than
> saving some potentially expensive request_firmware() calls?

Not much point. There are two subsequent checks. First the driver checks
the F-M-S in the header of the file ... so your dastardly user will be
thwarted by this check.

If an even more dastardly user edited the file to have the right F-M-S
(and recomputed the file checksum) then the driver would be fooled, but
microcode would see that the signed binary portion is not for this CPU
and reject it.

-Tony
On Mon, Nov 14, 2022 at 10:13:44AM -0800, Dave Hansen wrote:
> So, what's the point of the sscanf() to check the *filename* other than
> saving some potentially expensive request_firmware() calls?

The point is to do a first sanity check. The other tests are of course
mandatory.

Which brings me to another bastardly idea:

Let's say you have sequence numbers 0-6.

Now someone comes along and changes them all to x-y, where both x and y
are > 6. Or removes the sequence numbers completely.

Now you go and echo your sequence number into current_batch and nothing
happens.

But the sequences *are* there - just not named in a magic way.

Now *my* idea would work because you don't care what the filenames are.

But you don't want that and you want do some silly sequence numbers
which the user has to go and connect with what's in the directory.

If this were microcode and arch/x86/, I would've never accepted it.
> Now someone comes along and changes them all to x-y, where both x and y
> are > 6. Or removes the sequence numbers completely.

While there are system admins who might want to deliberately sabotage the
system they are responsible for ... let's not worry too much about them. They
have root access and can find a million other ways to break things.

-Tony
On Mon, Nov 14, 2022 at 07:07:47PM +0000, Luck, Tony wrote:
> > Now someone comes along and changes them all to x-y, where both x and y
> > are > 6. Or removes the sequence numbers completely.
>
> While there are system admins who might want to deliberately sabotage the
> system they are responsible for ... let's not worry too much about them. They
> have root access and can find a million other ways to break things.

It doesn't have to be a deliberate sabotage but just plain old
sloppiness. Or some wild renaming when handing in the directory into
guests with bind mounts or some weird setup or whatnot.

You're making these sequence numbers unnecessarily magical.

And they don't need to be.
> You're making these sequence numbers unnecessarily magical.
> And they don't need to be.

I see them as opaque tokens. User asks to load "test set 2" with
"echo 2 > current_batch" and the driver finds the file for the
ff-mm-ss that contains batch 2 of tests.

Numbers are plausible tokens - except for the sequence implication ...
you don't have to start at batch 1, and you don't need to go in any
particular order. But realistically people will run {1..N} in order,
and there's no harm if they think they have to do that.

For Thiago's use case, numbers are better than filenames because the
container running the test may not have access to the directory to
find the filenames.

But if this is the only roadblock to taking this patch series,
then we can switch to filenames and make Thiago find some way for
the container to be given the list of tests to run in the form of
filenames.

-Tony
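The token-to-file mapping described here can be sketched in userspace C. The format string is copied from the snprintf() call in the patch under review; everything else (the function name scan_file_for_batch(), passing family/model/stepping as arguments instead of reading boot_cpu_data, and the -1 error convention) is an assumption made for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical userspace sketch: build the firmware path for a batch token.
 * test_num selects the ifs_<N> directory; batch is the number written to
 * current_batch. Returns 0 on success, -1 if batch exceeds 0xff or the
 * buffer is too small.
 */
static int scan_file_for_batch(char *buf, size_t len, int test_num,
			       unsigned int family, unsigned int model,
			       unsigned int stepping, unsigned int batch)
{
	int n;

	if (batch > 0xff)
		return -1;

	/* Same format string as the snprintf() in the patch. */
	n = snprintf(buf, len, "intel/ifs_%d/%02x-%02x-%02x-%02x.scan",
		     test_num, family, model, stepping, batch);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

For a family 0x06, model 0x8f, stepping 0x06 part, token 2 maps to intel/ifs_0/06-8f-06-02.scan, so the caller never needs to see the directory contents.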
On Mon, Nov 14, 2022 at 07:38:35PM +0000, Luck, Tony wrote:
> But if this is the only roadblock to taking this patch series,
> then we can switch to filenames and make Thiago find some way for
> the container to be given the list of tests to run in the form of
> filenames.

I'm not putting any roadblocks as this is not the area I maintain. I'm
just saying that this whole scheme looks fragile to me. I.e., I'm
just playing the devil's advocate and pointing out the issues this
scheme could have. It might not but it could - we can't know of all the
possible future use cases.

So this is all your call as far as I'm concerned. Hans was fine either
way.

I'll take the next revision through tip as we agreed with Hans and after
the other, minor issues I've pointed out have been taken care of.
diff --git a/drivers/platform/x86/intel/ifs/ifs.h b/drivers/platform/x86/intel/ifs/ifs.h
index 98ca91bdd5ca..390e508faf57 100644
--- a/drivers/platform/x86/intel/ifs/ifs.h
+++ b/drivers/platform/x86/intel/ifs/ifs.h
@@ -33,13 +33,23 @@
  * The driver loads the tests into memory reserved BIOS local to each CPU
  * socket in a two step process using writes to MSRs to first load the
  * SHA hashes for the test. Then the tests themselves. Status MSRs provide
- * feedback on the success/failure of these steps. When a new test file
- * is installed it can be loaded by writing to the driver reload file::
+ * feedback on the success/failure of these steps.
  *
- *   # echo 1 > /sys/devices/virtual/misc/intel_ifs_0/reload
+ * The test files are kept in a fixed location: /lib/firmware/intel/ifs_0/
+ * For e.g if there are 3 test files, they would be named in the following
+ * fashion:
+ * ff-mm-ss-01.scan
+ * ff-mm-ss-02.scan
+ * ff-mm-ss-03.scan
+ * (where ff refers to family, mm indicates model and ss indicates stepping)
  *
- * Similar to microcode, the current version of the scan tests is stored
- * in a fixed location: /lib/firmware/intel/ifs.0/family-model-stepping.scan
+ * A different testfile can be loaded by writing the numerical portion
+ * (e.g 1, 2 or 3 in the above scenario) into the curent_batch file.
+ * To load ff-mm-ss-02.scan, the following command can be used::
+ *
+ *   # echo 2 > /sys/devices/virtual/misc/intel_ifs_0/current_batch
+ *
+ * The above file can also be read to know the currently loaded image.
  *
  * Running tests
  * -------------
@@ -207,6 +217,7 @@ struct ifs_data {
 	int	status;
 	u64	scan_details;
 	int	cur_batch;
+	int	test_num;
 };
 
 struct ifs_work {
@@ -227,7 +238,7 @@ static inline struct ifs_data *ifs_get_data(struct device *dev)
 	return &d->data;
 }
 
-void ifs_load_firmware(struct device *dev);
+int ifs_load_firmware(struct device *dev);
 int do_core_test(int cpu, struct device *dev);
 const struct attribute_group **ifs_get_groups(void);
 
diff --git a/drivers/platform/x86/intel/ifs/core.c b/drivers/platform/x86/intel/ifs/core.c
index 5fb7f655c291..1f040837e8eb 100644
--- a/drivers/platform/x86/intel/ifs/core.c
+++ b/drivers/platform/x86/intel/ifs/core.c
@@ -22,6 +22,7 @@ MODULE_DEVICE_TABLE(x86cpu, ifs_cpu_ids);
 static struct ifs_device ifs_device = {
 	.data = {
 		.integrity_cap_bit = MSR_INTEGRITY_CAPS_PERIODIC_BIST_BIT,
+		.test_num = 0,
 	},
 	.misc = {
 		.name = "intel_ifs_0",
diff --git a/drivers/platform/x86/intel/ifs/load.c b/drivers/platform/x86/intel/ifs/load.c
index f361fd42a320..c6414958a691 100644
--- a/drivers/platform/x86/intel/ifs/load.c
+++ b/drivers/platform/x86/intel/ifs/load.c
@@ -256,17 +256,18 @@ static int ifs_image_sanity_check(struct device *dev, const struct microcode_hea
 
 /*
  * Load ifs image. Before loading ifs module, the ifs image must be located
- * in /lib/firmware/intel/ifs and named as {family/model/stepping}.{testname}.
+ * in /lib/firmware/intel/ifs_x/ and named as family-model-stepping-02x.{testname}.
  */
-void ifs_load_firmware(struct device *dev)
+int ifs_load_firmware(struct device *dev)
 {
 	struct ifs_data *ifsd = ifs_get_data(dev);
 	const struct firmware *fw;
-	char scan_path[32];
-	int ret;
+	char scan_path[64];
+	int ret = -EINVAL;
 
-	snprintf(scan_path, sizeof(scan_path), "intel/ifs/%02x-%02x-%02x.scan",
-		 boot_cpu_data.x86, boot_cpu_data.x86_model, boot_cpu_data.x86_stepping);
+	snprintf(scan_path, sizeof(scan_path), "intel/ifs_%d/%02x-%02x-%02x-%02x.scan",
+		 ifsd->test_num, boot_cpu_data.x86, boot_cpu_data.x86_model,
+		 boot_cpu_data.x86_stepping, ifsd->cur_batch);
 
 	ret = request_firmware_direct(&fw, scan_path, dev);
 	if (ret) {
@@ -282,8 +283,13 @@ void ifs_load_firmware(struct device *dev)
 	ifs_hash_ptr = (u64)(ifs_header_ptr + 1);
 
 	ret = scan_chunks_sanity_check(dev);
+	if (ret)
+		dev_err(dev, "Load failure for batch: %02x\n", ifsd->cur_batch);
+
 release:
 	release_firmware(fw);
 done:
 	ifsd->loaded = (ret == 0);
+
+	return ret;
 }
diff --git a/drivers/platform/x86/intel/ifs/runtest.c b/drivers/platform/x86/intel/ifs/runtest.c
index b2ca2bb4501f..0bfd8fcdd7e8 100644
--- a/drivers/platform/x86/intel/ifs/runtest.c
+++ b/drivers/platform/x86/intel/ifs/runtest.c
@@ -78,14 +78,16 @@ static void message_not_tested(struct device *dev, int cpu, union ifs_status sta
 
 static void message_fail(struct device *dev, int cpu, union ifs_status status)
 {
+	struct ifs_data *ifsd = ifs_get_data(dev);
+
 	/*
 	 * control_error is set when the microcode runs into a problem
 	 * loading the image from the reserved BIOS memory, or it has
 	 * been corrupted. Reloading the image may fix this issue.
 	 */
 	if (status.control_error) {
-		dev_err(dev, "CPU(s) %*pbl: could not execute from loaded scan image\n",
-			cpumask_pr_args(cpu_smt_mask(cpu)));
+		dev_err(dev, "CPU(s) %*pbl: could not execute from loaded scan image. Batch: %02x version: 0x%x\n",
+			cpumask_pr_args(cpu_smt_mask(cpu)), ifsd->cur_batch, ifsd->loaded_version);
 	}
 
 	/*
@@ -96,8 +98,8 @@ static void message_fail(struct device *dev, int cpu, union ifs_status status)
 	 * the core being tested.
 	 */
 	if (status.signature_error) {
-		dev_err(dev, "CPU(s) %*pbl: test signature incorrect.\n",
-			cpumask_pr_args(cpu_smt_mask(cpu)));
+		dev_err(dev, "CPU(s) %*pbl: test signature incorrect. Batch: %02x version: 0x%x\n",
+			cpumask_pr_args(cpu_smt_mask(cpu)), ifsd->cur_batch, ifsd->loaded_version);
 	}
 }
 
diff --git a/drivers/platform/x86/intel/ifs/sysfs.c b/drivers/platform/x86/intel/ifs/sysfs.c
index e077910c5d28..d2eeeb04d760 100644
--- a/drivers/platform/x86/intel/ifs/sysfs.c
+++ b/drivers/platform/x86/intel/ifs/sysfs.c
@@ -87,6 +87,43 @@ static ssize_t run_test_store(struct device *dev,
 
 static DEVICE_ATTR_WO(run_test);
 
+static ssize_t current_batch_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct ifs_data *ifsd = ifs_get_data(dev);
+	int cur_batch;
+	int rc;
+
+	rc = kstrtouint(buf, 0, &cur_batch);
+	if (rc < 0 || cur_batch > 0xff)
+		return -EINVAL;
+
+	if (down_interruptible(&ifs_sem))
+		return -EINTR;
+
+	ifsd->cur_batch = cur_batch;
+
+	rc = ifs_load_firmware(dev);
+
+	up(&ifs_sem);
+
+	return (rc == 0) ? count : rc;
+}
+
+static ssize_t current_batch_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	struct ifs_data *ifsd = ifs_get_data(dev);
+
+	if (!ifsd->loaded)
+		return sysfs_emit(buf, "none\n");
+	else
+		return sysfs_emit(buf, "0x%02x\n", ifsd->cur_batch);
+}
+
+static DEVICE_ATTR_RW(current_batch);
+
 /*
  * Display currently loaded IFS image version.
  */
@@ -108,6 +145,7 @@ static struct attribute *plat_ifs_attrs[] = {
 	&dev_attr_details.attr,
 	&dev_attr_status.attr,
 	&dev_attr_run_test.attr,
+	&dev_attr_current_batch.attr,
 	&dev_attr_image_version.attr,
 	NULL
 };
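The input handling in current_batch_store() above (parse the number with base auto-detection, then reject anything greater than 0xff) can be tried out in userspace with the sketch below. It is an illustration only: parse_batch() is a hypothetical name, strtoul() stands in for the kernel's kstrtouint(), and parsing into an unsigned type with an explicit rejection of '-' is one possible way to keep negative input out, not necessarily what the next revision of the patch does.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of the parsing in current_batch_store().
 * Returns the batch number (0..0xff) or -1 on invalid input.
 */
static int parse_batch(const char *buf)
{
	unsigned long val;
	char *end;

	/* strtoul() accepts a leading '-' and negates, so reject it up front. */
	if (strchr(buf, '-') != NULL)
		return -1;

	errno = 0;
	val = strtoul(buf, &end, 0);	/* base 0: decimal, 0x hex, leading-0 octal */
	if (errno != 0 || end == buf)
		return -1;

	/* Allow the single trailing newline that echo appends. */
	if (*end == '\n')
		end++;
	if (*end != '\0' || val > 0xff)
		return -1;

	return (int)val;
}
```

Here "3\n" and "0x10" are accepted (as 3 and 16), while "256", "-1" and "abc" are rejected.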