Message ID | 20230915172818.761-3-shiju.jose@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | ACPI:RASF: Add support for ACPI RASF, ACPI RAS2 and configure scrubbers | expand |
On Fri, Sep 15, 2023 at 10:29 AM <shiju.jose@huawei.com> wrote: > > From: Shiju Jose <shiju.jose@huawei.com> > > Add sysfs documentation entries for the set of attributes those are > exposed in /sys/class/scrub/ by the scrub driver. These attributes > support configuring parameters of a scrub device. > > Signed-off-by: Shiju Jose <shiju.jose@huawei.com> > --- > .../ABI/testing/sysfs-class-scrub-configure | 82 +++++++++++++++++++ > 1 file changed, 82 insertions(+) > create mode 100644 Documentation/ABI/testing/sysfs-class-scrub-configure > > diff --git a/Documentation/ABI/testing/sysfs-class-scrub-configure b/Documentation/ABI/testing/sysfs-class-scrub-configure > new file mode 100644 > index 000000000000..347e2167dc62 > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-class-scrub-configure > @@ -0,0 +1,82 @@ > +What: /sys/class/scrub/ > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + The scrub/ class subdirectory belongs to the > + scrubber subsystem. > + > +What: /sys/class/scrub/scrubX/ > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + The /sys/class/scrub/scrub{0,1,2,3,...} directories This API (sysfs interface) is very specific to the ACPI interface defined for hardware patrol scrubber. I wonder can we have some interface that is more generic, for a couple of reasons: 1. I am not aware of any chip/platform hardware that implemented the hw ps part defined in ACPI RASF/RAS2 spec. So I am curious what the RAS experts from different hardware vendors think about this. For example, Tony and Dave from Intel, Jon and Vilas from AMD. Is there any hardware platform (if allowed to disclose) that implemented ACPI RASF/RAS2? If so, will vendors continue to support the control of patrol scrubber using the ACPI spec? If not (as Tony said in [1], will the vendor consider starting some future platform? If we are unlikely to get the vendor support, creating this ACPI specific sysfs API (and the driver implementations) in Linux seems to have limited meaning. > + correspond to each scrub device. > + > +What: /sys/class/scrub/scrubX/name > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + (RO) name of the memory scrub device > + > +What: /sys/class/scrub/scrubX/regionY/ 2. I believe the concept of "region" here is probably from PATROL_SCRUB defined in “APCI section 5.2.20.5. Parameter Block". It is indeed powerful: if a process's physical memory spans over multiple memory controllers, OS can in theory scrub chunks of the memory belonging to the process. However, from a previous discussion [1], "From a h/w perspective it might always be complex". IIUC, the address translation from physical address to channel address is hard to achieve, and probably that's one of the tech reasons the patrol scrub ACPI spec is not in practice implemented? So my take is, control at the granularity of the memory controller is probably a nice compromise. Both OS and userspace can get a pretty decent amount of flexibility, start/stop/adjust speed of the scrubbing on a memory controller; meanwhile it doesn't impose too much pain to hardware vendors when they provide these features in hardware. In terms of how these controls/features will be implemented, I imagine it could be implemented: * via hardware registers that directly or indirectly control on memory controllers * via ACPI if the situation changes in 10 years (and the RASF/RAS2/PCC drivers implemented in this patchset can be directly plugged into) * a kernel-thread that uses cpu read to detect memory errors, if hardware support is unavailable or not good enough Given these possible backends of scrubbing, I think a more generic sysfs API that covers and abstracts these backends will be more valuable right now. I haven’t thought thoroughly, but how about defining the top-level interface as something like “/sys/devices/system/memory_controller_scrubX/”, or “/sys/class/memory_controllerX/scrub”? [1] https://lore.kernel.org/linux-mm/SJ1PR11MB6083BF93E9A88E659CED5EC4FC3F9@SJ1PR11MB6083.namprd11.prod.outlook.com/T/#m13516ee35caa05b506080ae805bee14f9f958d43 > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + The /sys/class/scrub/scrubX/region{0,1,2,3,...} > + directories correspond to each scrub region under a scrub device. > + Scrub region is a physical address range for which scrub may be > + separately controlled. Regions may overlap in which case the > + scrubbing rate of the overlapped memory will be at least that > + expected due to each overlapping region. > + > +What: /sys/class/scrub/scrubX/regionY/addr_base > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + (RW) The base of the address range of the memory region > + to be patrol scrubbed. > + On reading, returns the base of the memory region for > + the actual address range(The platform calculates > + the nearest patrol scrub boundary address from where > + it can start scrubbing). > + > +What: /sys/class/scrub/scrubX/regionY/addr_size > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + (RW) The size of the address range to be patrol scrubbed. > + On reading, returns the size of the memory region for > + the actual address range. > + > +What: /sys/class/scrub/scrubX/regionY/enable > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + (WO) Start/Stop scrubbing the memory region. > + 1 - enable the memory scrubbing. > + 0 - disable the memory scrubbing.. > + > +What: /sys/class/scrub/scrubX/regionY/speed_available > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + (RO) Supported range for the partol speed(scrub rate) > + by the scrubber for a memory region. > + The unit of the scrub rate vary depends on the scrubber. > + > +What: /sys/class/scrub/scrubX/regionY/speed > +Date: September 2023 > +KernelVersion: 6.7 > +Contact: linux-kernel@vger.kernel.org > +Description: > + (RW) The partol speed(scrub rate) on the memory region specified and > + it must be with in the supported range by the scrubber. > + The unit of the scrub rate vary depends on the scrubber. > -- > 2.34.1 > >
On Thu, 21 Sep 2023 17:07:04 -0700 Jiaqi Yan <jiaqiyan@google.com> wrote: > On Fri, Sep 15, 2023 at 10:29 AM <shiju.jose@huawei.com> wrote: > > > > From: Shiju Jose <shiju.jose@huawei.com> > > > > Add sysfs documentation entries for the set of attributes those are > > exposed in /sys/class/scrub/ by the scrub driver. These attributes > > support configuring parameters of a scrub device. > > > > Signed-off-by: Shiju Jose <shiju.jose@huawei.com> > > --- > > .../ABI/testing/sysfs-class-scrub-configure | 82 +++++++++++++++++++ > > 1 file changed, 82 insertions(+) > > create mode 100644 Documentation/ABI/testing/sysfs-class-scrub-configure > > > > diff --git a/Documentation/ABI/testing/sysfs-class-scrub-configure b/Documentation/ABI/testing/sysfs-class-scrub-configure > > new file mode 100644 > > index 000000000000..347e2167dc62 > > --- /dev/null > > +++ b/Documentation/ABI/testing/sysfs-class-scrub-configure > > @@ -0,0 +1,82 @@ > > +What: /sys/class/scrub/ > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + The scrub/ class subdirectory belongs to the > > + scrubber subsystem. > > + > > +What: /sys/class/scrub/scrubX/ > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + The /sys/class/scrub/scrub{0,1,2,3,...} directories > > This API (sysfs interface) is very specific to the ACPI interface > defined for hardware patrol scrubber. I wonder can we have some > interface that is more generic, for a couple of reasons: Agreed that it makes sense to define a broad interface. We have some hardware focused drivers we can't share yet (IP rules until a release date in the near future) where this is a reasonable fit - but indeed there are others such as mapping this to DDR ECS where it isn't a great mapping. I'd love to come up with an interface that has the right blend of generality and flexibility. That is easiest done before we have any implementation merged. > > 1. I am not aware of any chip/platform hardware that implemented the > hw ps part defined in ACPI RASF/RAS2 spec. So I am curious what the > RAS experts from different hardware vendors think about this. For > example, Tony and Dave from Intel, Jon and Vilas from AMD. Is there > any hardware platform (if allowed to disclose) that implemented ACPI > RASF/RAS2? If so, will vendors continue to support the control of > patrol scrubber using the ACPI spec? If not (as Tony said in [1], will > the vendor consider starting some future platform? > > If we are unlikely to get the vendor support, creating this ACPI > specific sysfs API (and the driver implementations) in Linux seems to > have limited meaning. There is a bit of a chicken and egg problem here. Until there is reasonable support in kernel (or it looks like there will be), BIOS teams push back on a requirement to add the tables. I'd encourage no one to bother with RASF - RAS2 is much less ambiguous. > > > + correspond to each scrub device. > > + > > +What: /sys/class/scrub/scrubX/name > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + (RO) name of the memory scrub device > > + > > +What: /sys/class/scrub/scrubX/regionY/ > > 2. I believe the concept of "region" here is probably from > PATROL_SCRUB defined in “APCI section 5.2.20.5. Parameter Block". It > is indeed powerful: if a process's physical memory spans over multiple > memory controllers, OS can in theory scrub chunks of the memory > belonging to the process. However, from a previous discussion [1], > "From a h/w perspective it might always be complex". IIUC, the address > translation from physical address to channel address is hard to > achieve, and probably that's one of the tech reasons the patrol scrub > ACPI spec is not in practice implemented? Next bit is kind of an aside as I mostly agree with your conclusions ;) This obviously depends on your memory interleave. You want to present physical address ranges as single controllable regions - probably most interesting being stuff that maps to NUMA nodes. The short answer is that any firmware control will end up writing to all the memory controllers involved in a given PA range - firmware can easily establish which those are. A memory controller can support multiple scrub regions which map from a PA range to a particular set of RAM addresses - that's implementation specific. The memory controller is getting the host PA and can carry out appropriate hashing if it wants to. Many scrub solutions won't do this - in which case it's max one region per memory controller (mapped by firmware to one region per interleave set - assuming interleave sets take in whole memory controllers - which they normally do). I would expect existing systems (not design with this differentiated scrub in mind) to only support scrub control by PA range at the granularity of interleave sets. Note that this general question of PA based controls also maps to things like resource control (resctl) where it's only interesting to partition memory bandwidth such that the partition applies to the whole interleave set - that's done for ARM systems anyway by having the userspace interface pretend there is only one control, but write the settings to all actual controllers involved. Not sure what x86 does. Taking a few examples that I know of. All 4 socket server - with control of these as bios options ;). Assuming individual memory controllers don't allow scrub to be configured by PA range. 1. Full system memory interleave (people do this form of crazy) In that case, there is only one set of firmware controls that write to the interfaces of every memory controller. Depending on the interleave design that may still allow multiple regions. 2. Socket wide memory interleave. In that case, firmware controls need to effect all memory controllers in that socket if the requested 'region' maps to them. So 4 PA regions. 3. Die wide memory interleave. Finer granularity of control so perhaps 8 PA rgiones. 4. Finer granularity (there are reasons to do this for above mentioned bandwidth resource control which you can only do if not interleaving across multiple controllers). > > So my take is, control at the granularity of the memory controller is > probably a nice compromise. > Both OS and userspace can get a pretty > decent amount of flexibility, start/stop/adjust speed of the scrubbing > on a memory controller; meanwhile it doesn't impose too much pain to > hardware vendors when they provide these features in hardware. In > terms of how these controls/features will be implemented, I imagine it > could be implemented: > * via hardware registers that directly or indirectly control on memory > controllers > * via ACPI if the situation changes in 10 years (and the RASF/RAS2/PCC > drivers implemented in this patchset can be directly plugged into) > * a kernel-thread that uses cpu read to detect memory errors, if > hardware support is unavailable or not good enough > I more or less agree, but would tweak a little. 1) Allow for multiple regions per memory controller - that's needed for RASF etc anyway - because as far as they are concerned there is only one interface presented. 2) Tie the interface to interleave sets, not memory controllers. NUMA nodes often being a good stand in for those. Controlling memory controllers separately for parts of an interleave isn't something I'd think was very useful. This will probably get messy in the future though and the complexity could be pushed to a userspace tool - as long as enough info is available elsewhere in sysfs. So need those memory controller directories you propose to include info on the PA ranges that they are involved in backing (which to keep things horrible, can change at runtime via memory hotplug and remapping of host physical address ranges on CXL etc) > Given these possible backends of scrubbing, I think a more generic > sysfs API that covers and abstracts these backends will be more > valuable right now. I haven’t thought thoroughly, but how about > defining the top-level interface as something like > “/sys/devices/system/memory_controller_scrubX/”, or > “/sys/class/memory_controllerX/scrub”? No particular harm in the rename of the directory I guess. Though some of those 'memory_controllers' would be virtual as they wouldn't correspond to actual memory controllers but rather to sets of them. Jonathan > > [1] https://lore.kernel.org/linux-mm/SJ1PR11MB6083BF93E9A88E659CED5EC4FC3F9@SJ1PR11MB6083.namprd11.prod.outlook.com/T/#m13516ee35caa05b506080ae805bee14f9f958d43 > > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + The /sys/class/scrub/scrubX/region{0,1,2,3,...} > > + directories correspond to each scrub region under a scrub device. > > + Scrub region is a physical address range for which scrub may be > > + separately controlled. Regions may overlap in which case the > > + scrubbing rate of the overlapped memory will be at least that > > + expected due to each overlapping region. > > + > > +What: /sys/class/scrub/scrubX/regionY/addr_base > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + (RW) The base of the address range of the memory region > > + to be patrol scrubbed. > > + On reading, returns the base of the memory region for > > + the actual address range(The platform calculates > > + the nearest patrol scrub boundary address from where > > + it can start scrubbing). > > + > > +What: /sys/class/scrub/scrubX/regionY/addr_size > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + (RW) The size of the address range to be patrol scrubbed. > > + On reading, returns the size of the memory region for > > + the actual address range. > > + > > +What: /sys/class/scrub/scrubX/regionY/enable > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + (WO) Start/Stop scrubbing the memory region. > > + 1 - enable the memory scrubbing. > > + 0 - disable the memory scrubbing.. > > + > > +What: /sys/class/scrub/scrubX/regionY/speed_available > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + (RO) Supported range for the partol speed(scrub rate) > > + by the scrubber for a memory region. > > + The unit of the scrub rate vary depends on the scrubber. > > + > > +What: /sys/class/scrub/scrubX/regionY/speed > > +Date: September 2023 > > +KernelVersion: 6.7 > > +Contact: linux-kernel@vger.kernel.org > > +Description: > > + (RW) The partol speed(scrub rate) on the memory region specified and > > + it must be with in the supported range by the scrubber. > > + The unit of the scrub rate vary depends on the scrubber. > > -- > > 2.34.1 > > > > >
On Fri, Sep 22, 2023 at 3:20 AM Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote: > > On Thu, 21 Sep 2023 17:07:04 -0700 > Jiaqi Yan <jiaqiyan@google.com> wrote: > > > On Fri, Sep 15, 2023 at 10:29 AM <shiju.jose@huawei.com> wrote: > > > > > > From: Shiju Jose <shiju.jose@huawei.com> > > > > > > Add sysfs documentation entries for the set of attributes those are > > > exposed in /sys/class/scrub/ by the scrub driver. These attributes > > > support configuring parameters of a scrub device. > > > > > > Signed-off-by: Shiju Jose <shiju.jose@huawei.com> > > > --- > > > .../ABI/testing/sysfs-class-scrub-configure | 82 +++++++++++++++++++ > > > 1 file changed, 82 insertions(+) > > > create mode 100644 Documentation/ABI/testing/sysfs-class-scrub-configure > > > > > > diff --git a/Documentation/ABI/testing/sysfs-class-scrub-configure b/Documentation/ABI/testing/sysfs-class-scrub-configure > > > new file mode 100644 > > > index 000000000000..347e2167dc62 > > > --- /dev/null > > > +++ b/Documentation/ABI/testing/sysfs-class-scrub-configure > > > @@ -0,0 +1,82 @@ > > > +What: /sys/class/scrub/ > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + The scrub/ class subdirectory belongs to the > > > + scrubber subsystem. > > > + > > > +What: /sys/class/scrub/scrubX/ > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + The /sys/class/scrub/scrub{0,1,2,3,...} directories > > > > This API (sysfs interface) is very specific to the ACPI interface > > defined for hardware patrol scrubber. I wonder can we have some > > interface that is more generic, for a couple of reasons: > > Agreed that it makes sense to define a broad interface. We have > some hardware focused drivers we can't share yet (IP rules until > a release date in the near future) where this is a reasonable fit > - but indeed there are others such as mapping this to DDR ECS > where it isn't a great mapping. > > I'd love to come up with an interface that has the right blend > of generality and flexibility. That is easiest done before we have > any implementation merged. > > > > > 1. I am not aware of any chip/platform hardware that implemented the > > hw ps part defined in ACPI RASF/RAS2 spec. So I am curious what the > > RAS experts from different hardware vendors think about this. For > > example, Tony and Dave from Intel, Jon and Vilas from AMD. Is there > > any hardware platform (if allowed to disclose) that implemented ACPI > > RASF/RAS2? If so, will vendors continue to support the control of > > patrol scrubber using the ACPI spec? If not (as Tony said in [1], will > > the vendor consider starting some future platform? > > > > If we are unlikely to get the vendor support, creating this ACPI > > specific sysfs API (and the driver implementations) in Linux seems to > > have limited meaning. > > There is a bit of a chicken and egg problem here. Until there is > reasonable support in kernel (or it looks like there will be), > BIOS teams push back on a requirement to add the tables. > I'd encourage no one to bother with RASF - RAS2 is much less > ambiguous. Here mainly to re-ping folks from Intel (Tony and Dave) and AMD (Jon and Vilas) for your opinion on RAS2. > > > > > > + correspond to each scrub device. > > > + > > > +What: /sys/class/scrub/scrubX/name > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + (RO) name of the memory scrub device > > > + > > > +What: /sys/class/scrub/scrubX/regionY/ > > > > 2. I believe the concept of "region" here is probably from > > PATROL_SCRUB defined in “APCI section 5.2.20.5. Parameter Block". It > > is indeed powerful: if a process's physical memory spans over multiple > > memory controllers, OS can in theory scrub chunks of the memory > > belonging to the process. However, from a previous discussion [1], > > "From a h/w perspective it might always be complex". IIUC, the address > > translation from physical address to channel address is hard to > > achieve, and probably that's one of the tech reasons the patrol scrub > > ACPI spec is not in practice implemented? > > Next bit is kind of an aside as I mostly agree with your conclusions ;) > > This obviously depends on your memory interleave. You want to present > physical address ranges as single controllable regions - probably > most interesting being stuff that maps to NUMA nodes. The short > answer is that any firmware control will end up writing to all the > memory controllers involved in a given PA range - firmware can easily > establish which those are. > > A memory controller can support multiple scrub regions > which map from a PA range to a particular set of RAM addresses > - that's implementation specific. The memory controller is getting > the host PA and can carry out appropriate hashing if it wants to. > Many scrub solutions won't do this - in which case it's max one > region per memory controller (mapped by firmware to one region per > interleave set - assuming interleave sets take in whole memory > controllers - which they normally do). > > I would expect existing systems (not design with this differentiated > scrub in mind) to only support scrub control by PA range at the > granularity of interleave sets. > > Note that this general question of PA based controls also > maps to things like resource control (resctl) where it's only interesting > to partition memory bandwidth such that the partition applies to the > whole interleave set - that's done for ARM systems anyway by having > the userspace interface pretend there is only one control, but > write the settings to all actual controllers involved. Not sure what > x86 does. > > Taking a few examples that I know of. All 4 socket server - with > control of these as bios options ;). > Assuming individual memory controllers don't allow scrub to be > configured by PA range. > > 1. Full system memory interleave (people do this form of crazy) > In that case, there is only one set of firmware controls > that write to the interfaces of every memory controller. Depending > on the interleave design that may still allow multiple regions. > > 2. Socket wide memory interleave. In that case, firmware controls > need to effect all memory controllers in that socket if the > requested 'region' maps to them. So 4 PA regions. > > 3. Die wide memory interleave. Finer granularity of control > so perhaps 8 PA rgiones. > > 4. Finer granularity (there are reasons to do this for above mentioned > bandwidth resource control which you can only do if not interleaving > across multiple controllers). > > > > > > > So my take is, control at the granularity of the memory controller is > > probably a nice compromise. > > Both OS and userspace can get a pretty > > decent amount of flexibility, start/stop/adjust speed of the scrubbing > > on a memory controller; meanwhile it doesn't impose too much pain to > > hardware vendors when they provide these features in hardware. In > > terms of how these controls/features will be implemented, I imagine it > > could be implemented: > > * via hardware registers that directly or indirectly control on memory > > controllers > > * via ACPI if the situation changes in 10 years (and the RASF/RAS2/PCC > > drivers implemented in this patchset can be directly plugged into) > > * a kernel-thread that uses cpu read to detect memory errors, if > > hardware support is unavailable or not good enough > > > > I more or less agree, but would tweak a little. > > 1) Allow for multiple regions per memory controller - that's needed > for RASF etc anyway - because as far as they are concerned there > is only one interface presented. > 2) Tie the interface to interleave sets, not memory controllers. > NUMA nodes often being a good stand in for those. Does you mean /sys/devices/system/node/nodeX/scrub, where scrub is a virtual concept of scrubbing device that mapps to 1 or several physical scrubber backends. For example, starting/stopping the virtual device means issuing START/STOP cmd to all backends. And... > Controlling memory controllers separately for parts of an interleave > isn't something I'd think was very useful. This will probably get > messy in the future though and the complexity could be pushed to > a userspace tool - as long as enough info is available elsewhere > in sysfs. So need those memory controller directories you propose > to include info on the PA ranges that they are involved in backing is it acceptable if we don't provide PA range or region in the interface *for now* if it complicates things a lot? I could be wrong, but the user of scrubber seems would be ok with not being able to scrub an arbitrary physical address range. In contrast, not knowing the scrub results seems to be more annoying to users. So simply giving some progress indicator like how many bytes a scrubber has scrubbed. When we really need to support PA range or region, under the /sys/devices/system/node/nodeX/scrub interface, it basically uses NUMA node X's PA range. Then to scrub node memory in range [PA1, PA2), some driver that understand all backends (or can talk to all backends' drivers) needs to translate the PA into the address in backend's address space, for example, [PA1, PA2) is mapped to 2 device ranges [DA11, DA12) on backend_1 and [DA21, DA22) on backend_2. > (which to keep things horrible, can change at runtime via memory > hotplug and remapping of host physical address ranges on CXL etc) CXL memory locally attached to the host is probably more or less the same as normal physical memory. I wonder what it would be like for CXL memory remotely attached through a memory pool. Does it make sense that the controller/owner of the memory pool takes the responsibility of controlling the CXL memory controller to control scrubbing? Does the owner need to provide/mediate scrubbing support for other clients using the memory pool? Thanks, Jiaqi > > > Given these possible backends of scrubbing, I think a more generic > > sysfs API that covers and abstracts these backends will be more > > valuable right now. I haven’t thought thoroughly, but how about > > defining the top-level interface as something like > > “/sys/devices/system/memory_controller_scrubX/”, or > > “/sys/class/memory_controllerX/scrub”? > > No particular harm in the rename of the directory I guess. > Though some of those 'memory_controllers' would be virtual as they > wouldn't correspond to actual memory controllers but rather to > sets of them. > > Jonathan > > > > > [1] https://lore.kernel.org/linux-mm/SJ1PR11MB6083BF93E9A88E659CED5EC4FC3F9@SJ1PR11MB6083.namprd11.prod.outlook.com/T/#m13516ee35caa05b506080ae805bee14f9f958d43 > > > > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + The /sys/class/scrub/scrubX/region{0,1,2,3,...} > > > + directories correspond to each scrub region under a scrub device. > > > + Scrub region is a physical address range for which scrub may be > > > + separately controlled. Regions may overlap in which case the > > > + scrubbing rate of the overlapped memory will be at least that > > > + expected due to each overlapping region. > > > + > > > +What: /sys/class/scrub/scrubX/regionY/addr_base > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + (RW) The base of the address range of the memory region > > > + to be patrol scrubbed. > > > + On reading, returns the base of the memory region for > > > + the actual address range(The platform calculates > > > + the nearest patrol scrub boundary address from where > > > + it can start scrubbing). > > > + > > > +What: /sys/class/scrub/scrubX/regionY/addr_size > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + (RW) The size of the address range to be patrol scrubbed. > > > + On reading, returns the size of the memory region for > > > + the actual address range. > > > + > > > +What: /sys/class/scrub/scrubX/regionY/enable > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + (WO) Start/Stop scrubbing the memory region. > > > + 1 - enable the memory scrubbing. > > > + 0 - disable the memory scrubbing.. > > > + > > > +What: /sys/class/scrub/scrubX/regionY/speed_available > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + (RO) Supported range for the partol speed(scrub rate) > > > + by the scrubber for a memory region. > > > + The unit of the scrub rate vary depends on the scrubber. > > > + > > > +What: /sys/class/scrub/scrubX/regionY/speed > > > +Date: September 2023 > > > +KernelVersion: 6.7 > > > +Contact: linux-kernel@vger.kernel.org > > > +Description: > > > + (RW) The partol speed(scrub rate) on the memory region specified and > > > + it must be with in the supported range by the scrubber. > > > + The unit of the scrub rate vary depends on the scrubber. > > > -- > > > 2.34.1 > > > > > > > > > >
On Wed, 27 Sep 2023 22:25:52 -0700 Jiaqi Yan <jiaqiyan@google.com> wrote: > On Fri, Sep 22, 2023 at 3:20 AM Jonathan Cameron > <Jonathan.Cameron@huawei.com> wrote: > > > > On Thu, 21 Sep 2023 17:07:04 -0700 > > Jiaqi Yan <jiaqiyan@google.com> wrote: > > > > > On Fri, Sep 15, 2023 at 10:29 AM <shiju.jose@huawei.com> wrote: > > > > > > > > From: Shiju Jose <shiju.jose@huawei.com> > > > > > > > > Add sysfs documentation entries for the set of attributes those are > > > > exposed in /sys/class/scrub/ by the scrub driver. These attributes > > > > support configuring parameters of a scrub device. > > > > > > > > Signed-off-by: Shiju Jose <shiju.jose@huawei.com> > > > > --- > > > > .../ABI/testing/sysfs-class-scrub-configure | 82 +++++++++++++++++++ > > > > 1 file changed, 82 insertions(+) > > > > create mode 100644 Documentation/ABI/testing/sysfs-class-scrub-configure > > > > > > > > diff --git a/Documentation/ABI/testing/sysfs-class-scrub-configure b/Documentation/ABI/testing/sysfs-class-scrub-configure > > > > new file mode 100644 > > > > index 000000000000..347e2167dc62 > > > > --- /dev/null > > > > +++ b/Documentation/ABI/testing/sysfs-class-scrub-configure > > > > @@ -0,0 +1,82 @@ > > > > +What: /sys/class/scrub/ > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + The scrub/ class subdirectory belongs to the > > > > + scrubber subsystem. > > > > + > > > > +What: /sys/class/scrub/scrubX/ > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + The /sys/class/scrub/scrub{0,1,2,3,...} directories > > > > > > This API (sysfs interface) is very specific to the ACPI interface > > > defined for hardware patrol scrubber. I wonder can we have some > > > interface that is more generic, for a couple of reasons: > > > > Agreed that it makes sense to define a broad interface. We have > > some hardware focused drivers we can't share yet (IP rules until > > a release date in the near future) where this is a reasonable fit > > - but indeed there are others such as mapping this to DDR ECS > > where it isn't a great mapping. > > > > I'd love to come up with an interface that has the right blend > > of generality and flexibility. That is easiest done before we have > > any implementation merged. > > > > > > > > 1. I am not aware of any chip/platform hardware that implemented the > > > hw ps part defined in ACPI RASF/RAS2 spec. So I am curious what the > > > RAS experts from different hardware vendors think about this. For > > > example, Tony and Dave from Intel, Jon and Vilas from AMD. Is there > > > any hardware platform (if allowed to disclose) that implemented ACPI > > > RASF/RAS2? If so, will vendors continue to support the control of > > > patrol scrubber using the ACPI spec? If not (as Tony said in [1], will > > > the vendor consider starting some future platform? > > > > > > If we are unlikely to get the vendor support, creating this ACPI > > > specific sysfs API (and the driver implementations) in Linux seems to > > > have limited meaning. > > > > There is a bit of a chicken and egg problem here. Until there is > > reasonable support in kernel (or it looks like there will be), > > BIOS teams push back on a requirement to add the tables. > > I'd encourage no one to bother with RASF - RAS2 is much less > > ambiguous. > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD (Jon > and Vilas) for your opinion on RAS2. > > > > > > > > > > + correspond to each scrub device. > > > > + > > > > +What: /sys/class/scrub/scrubX/name > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + (RO) name of the memory scrub device > > > > + > > > > +What: /sys/class/scrub/scrubX/regionY/ > > > > > > 2. I believe the concept of "region" here is probably from > > > PATROL_SCRUB defined in “APCI section 5.2.20.5. Parameter Block". It > > > is indeed powerful: if a process's physical memory spans over multiple > > > memory controllers, OS can in theory scrub chunks of the memory > > > belonging to the process. However, from a previous discussion [1], > > > "From a h/w perspective it might always be complex". IIUC, the address > > > translation from physical address to channel address is hard to > > > achieve, and probably that's one of the tech reasons the patrol scrub > > > ACPI spec is not in practice implemented? > > > > Next bit is kind of an aside as I mostly agree with your conclusions ;) > > > > This obviously depends on your memory interleave. You want to present > > physical address ranges as single controllable regions - probably > > most interesting being stuff that maps to NUMA nodes. The short > > answer is that any firmware control will end up writing to all the > > memory controllers involved in a given PA range - firmware can easily > > establish which those are. > > > > A memory controller can support multiple scrub regions > > which map from a PA range to a particular set of RAM addresses > > - that's implementation specific. The memory controller is getting > > the host PA and can carry out appropriate hashing if it wants to. > > Many scrub solutions won't do this - in which case it's max one > > region per memory controller (mapped by firmware to one region per > > interleave set - assuming interleave sets take in whole memory > > controllers - which they normally do). > > > > I would expect existing systems (not design with this differentiated > > scrub in mind) to only support scrub control by PA range at the > > granularity of interleave sets. > > > > Note that this general question of PA based controls also > > maps to things like resource control (resctl) where it's only interesting > > to partition memory bandwidth such that the partition applies to the > > whole interleave set - that's done for ARM systems anyway by having > > the userspace interface pretend there is only one control, but > > write the settings to all actual controllers involved. Not sure what > > x86 does. > > > > Taking a few examples that I know of. All 4 socket server - with > > control of these as bios options ;). > > Assuming individual memory controllers don't allow scrub to be > > configured by PA range. > > > > 1. Full system memory interleave (people do this form of crazy) > > In that case, there is only one set of firmware controls > > that write to the interfaces of every memory controller. Depending > > on the interleave design that may still allow multiple regions. > > > > 2. Socket wide memory interleave. In that case, firmware controls > > need to effect all memory controllers in that socket if the > > requested 'region' maps to them. So 4 PA regions. > > > > 3. Die wide memory interleave. Finer granularity of control > > so perhaps 8 PA rgiones. > > > > 4. Finer granularity (there are reasons to do this for above mentioned > > bandwidth resource control which you can only do if not interleaving > > across multiple controllers). > > > > > > > > > > > > So my take is, control at the granularity of the memory controller is > > > probably a nice compromise. > > > Both OS and userspace can get a pretty > > > decent amount of flexibility, start/stop/adjust speed of the scrubbing > > > on a memory controller; meanwhile it doesn't impose too much pain to > > > hardware vendors when they provide these features in hardware. In > > > terms of how these controls/features will be implemented, I imagine it > > > could be implemented: > > > * via hardware registers that directly or indirectly control on memory > > > controllers > > > * via ACPI if the situation changes in 10 years (and the RASF/RAS2/PCC > > > drivers implemented in this patchset can be directly plugged into) > > > * a kernel-thread that uses cpu read to detect memory errors, if > > > hardware support is unavailable or not good enough > > > > > > > I more or less agree, but would tweak a little. > > > > 1) Allow for multiple regions per memory controller - that's needed > > for RASF etc anyway - because as far as they are concerned there > > is only one interface presented. > > 2) Tie the interface to interleave sets, not memory controllers. > > NUMA nodes often being a good stand in for those. > > Does you mean /sys/devices/system/node/nodeX/scrub, where scrub is a > virtual concept of scrubbing device that mapps to 1 or several > physical scrubber backends. > > For example, starting/stopping the virtual device means issuing > START/STOP cmd to all backends. And... That file location would work I think, though in some cases it may get complex with single memory controllers covering parts of different interleave sets. I guess mapping that mess is all a software problem. > > > Controlling memory controllers separately for parts of an interleave > > isn't something I'd think was very useful. This will probably get > > messy in the future though and the complexity could be pushed to > > a userspace tool - as long as enough info is available elsewhere > > in sysfs. So need those memory controller directories you propose > > to include info on the PA ranges that they are involved in backing > > is it acceptable if we don't provide PA range or region in the > interface *for now* if it complicates things a lot? I could be wrong, > but the user of scrubber seems would be ok with not being able to > scrub an arbitrary physical address range. In contrast, not knowing > the scrub results seems to be more annoying to users. So simply giving > some progress indicator like how many bytes a scrubber has scrubbed. Progress is usually a non starter given it tends to be continuous in the hardware controllers I've seen. So you control that it does it every X hours, but doesn't provide a progress indicator. I'd stil like PA range being there, but maybe RO fixed values. Should be easy enough to establish. Also, won't matter if multiple controllers identify same PA range, assumption would be control them all from userspace, or use the NUMA type mapping above. > > When we really need to support PA range or region, under the > /sys/devices/system/node/nodeX/scrub interface, it basically uses NUMA > node X's PA range. Then to scrub node memory in range [PA1, PA2), some > driver that understand all backends (or can talk to all backends' > drivers) needs to translate the PA into the address in backend's > address space, for example, [PA1, PA2) is mapped to 2 device ranges > [DA11, DA12) on backend_1 and [DA21, DA22) on backend_2. Agreed. > > > (which to keep things horrible, can change at runtime via memory > > hotplug and remapping of host physical address ranges on CXL etc) > > CXL memory locally attached to the host is probably more or less the > same as normal physical memory. I wonder what it would be like for CXL > memory remotely attached through a memory pool. Does it make sense > that the controller/owner of the memory pool takes the responsibility > of controlling the CXL memory controller to control scrubbing? Does > the owner need to provide/mediate scrubbing support for other clients > using the memory pool? Good and complex questions but lets leave those for a while ;) beyond comprehending the fact that we can't always map interleave to particular memory controllers as they may use only a subset of the memory on that device. +CC'd linux-cxl for info, not because I want that discussion to go deep at the moment. Jonathan > > Thanks, > Jiaqi > > > > > > Given these possible backends of scrubbing, I think a more generic > > > sysfs API that covers and abstracts these backends will be more > > > valuable right now. I haven’t thought thoroughly, but how about > > > defining the top-level interface as something like > > > “/sys/devices/system/memory_controller_scrubX/”, or > > > “/sys/class/memory_controllerX/scrub”? > > > > No particular harm in the rename of the directory I guess. > > Though some of those 'memory_controllers' would be virtual as they > > wouldn't correspond to actual memory controllers but rather to > > sets of them. > > > > Jonathan > > > > > > > > [1] https://lore.kernel.org/linux-mm/SJ1PR11MB6083BF93E9A88E659CED5EC4FC3F9@SJ1PR11MB6083.namprd11.prod.outlook.com/T/#m13516ee35caa05b506080ae805bee14f9f958d43 > > > > > > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + The /sys/class/scrub/scrubX/region{0,1,2,3,...} > > > > + directories correspond to each scrub region under a scrub device. > > > > + Scrub region is a physical address range for which scrub may be > > > > + separately controlled. Regions may overlap in which case the > > > > + scrubbing rate of the overlapped memory will be at least that > > > > + expected due to each overlapping region. > > > > + > > > > +What: /sys/class/scrub/scrubX/regionY/addr_base > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + (RW) The base of the address range of the memory region > > > > + to be patrol scrubbed. > > > > + On reading, returns the base of the memory region for > > > > + the actual address range(The platform calculates > > > > + the nearest patrol scrub boundary address from where > > > > + it can start scrubbing). > > > > + > > > > +What: /sys/class/scrub/scrubX/regionY/addr_size > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + (RW) The size of the address range to be patrol scrubbed. > > > > + On reading, returns the size of the memory region for > > > > + the actual address range. > > > > + > > > > +What: /sys/class/scrub/scrubX/regionY/enable > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + (WO) Start/Stop scrubbing the memory region. > > > > + 1 - enable the memory scrubbing. > > > > + 0 - disable the memory scrubbing.. > > > > + > > > > +What: /sys/class/scrub/scrubX/regionY/speed_available > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + (RO) Supported range for the partol speed(scrub rate) > > > > + by the scrubber for a memory region. > > > > + The unit of the scrub rate vary depends on the scrubber. > > > > + > > > > +What: /sys/class/scrub/scrubX/regionY/speed > > > > +Date: September 2023 > > > > +KernelVersion: 6.7 > > > > +Contact: linux-kernel@vger.kernel.org > > > > +Description: > > > > + (RW) The partol speed(scrub rate) on the memory region specified and > > > > + it must be with in the supported range by the scrubber. > > > > + The unit of the scrub rate vary depends on the scrubber. > > > > -- > > > > 2.34.1 > > > > > > > > > > > > > > > >
On Wed, 27 Sep 2023, Jiaqi Yan wrote: > > > 1. I am not aware of any chip/platform hardware that implemented the > > > hw ps part defined in ACPI RASF/RAS2 spec. So I am curious what the > > > RAS experts from different hardware vendors think about this. For > > > example, Tony and Dave from Intel, Jon and Vilas from AMD. Is there > > > any hardware platform (if allowed to disclose) that implemented ACPI > > > RASF/RAS2? If so, will vendors continue to support the control of > > > patrol scrubber using the ACPI spec? If not (as Tony said in [1], will > > > the vendor consider starting some future platform? > > > > > > If we are unlikely to get the vendor support, creating this ACPI > > > specific sysfs API (and the driver implementations) in Linux seems to > > > have limited meaning. > > > > There is a bit of a chicken and egg problem here. Until there is > > reasonable support in kernel (or it looks like there will be), > > BIOS teams push back on a requirement to add the tables. > > I'd encourage no one to bother with RASF - RAS2 is much less > > ambiguous. > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD (Jon > and Vilas) for your opinion on RAS2. > We'll need to know from vendors, ideally at minimum from both Intel and AMD, whether RAS2 is the long-term vision here. Nothing is set in stone, of course, but deciding whether RAS2 is the standard that we should be rallying around will help to guide future development including in the kernel. If RAS2 is insufficient for future use cases or we would need to support multiple implementations in the kernel for configuring the patrol scrubber depending on vendor, that's great feedback to have. I'd much rather focus on implementing something in the kernel that we have some clarity about the vendors supporting, especially when it comes with user visible interfaces, as opposed to something that may not be used long term. I think that's a fair ask and that vendor feedback is required here?
On Wed, 4 Oct 2023 20:18:12 -0700 (PDT) David Rientjes <rientjes@google.com> wrote: > On Wed, 27 Sep 2023, Jiaqi Yan wrote: > > > > > 1. I am not aware of any chip/platform hardware that implemented the > > > > hw ps part defined in ACPI RASF/RAS2 spec. So I am curious what the > > > > RAS experts from different hardware vendors think about this. For > > > > example, Tony and Dave from Intel, Jon and Vilas from AMD. Is there > > > > any hardware platform (if allowed to disclose) that implemented ACPI > > > > RASF/RAS2? If so, will vendors continue to support the control of > > > > patrol scrubber using the ACPI spec? If not (as Tony said in [1], will > > > > the vendor consider starting some future platform? > > > > > > > > If we are unlikely to get the vendor support, creating this ACPI > > > > specific sysfs API (and the driver implementations) in Linux seems to > > > > have limited meaning. > > > > > > There is a bit of a chicken and egg problem here. Until there is > > > reasonable support in kernel (or it looks like there will be), > > > BIOS teams push back on a requirement to add the tables. > > > I'd encourage no one to bother with RASF - RAS2 is much less > > > ambiguous. > > > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD (Jon > > and Vilas) for your opinion on RAS2. > > > > We'll need to know from vendors, ideally at minimum from both Intel and > AMD, whether RAS2 is the long-term vision here. Nothing is set in stone, > of course, but deciding whether RAS2 is the standard that we should be > rallying around will help to guide future development including in the > kernel. > > If RAS2 is insufficient for future use cases or we would need to support > multiple implementations in the kernel for configuring the patrol scrubber > depending on vendor, that's great feedback to have. > > I'd much rather focus on implementing something in the kernel that we have > some clarity about the vendors supporting, especially when it comes with > user visible interfaces, as opposed to something that may not be used long > term. I think that's a fair ask and that vendor feedback is required > here? Agreed and happy to have feedback from Intel and AMD + all the other CPU vendors who make use of ACPI + all the OEMs who add stuff well beyond what Intel and AMD tell them to :) I'll just note a lot of the ACPI support in the kernel covers stuff not used on mainstream x86 platforms because they are doing something custom and we didn't want 2 + X custom implementations... Some other interfaces for scrub control (beyond existing embedded ones) will surface in the next few months where RAS2 is not appropriate. Jonathan
[AMD Official Use Only - General] I do not believe AMD has implemented RASF/RAS2 at all. We are looking at it, but our initial impression is that it is insufficiently flexible for general use. (Not just for this feature, but for others in the future.) -Vilas -----Original Message----- From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Sent: Friday, October 6, 2023 9:02 AM To: David Rientjes <rientjes@google.com> Cc: Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; Grimm, Jon <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; Sridharan, Vilas <Vilas.Sridharan@amd.com>; linuxarm@huawei.com; shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation entries for set of scrub attributes Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. On Wed, 4 Oct 2023 20:18:12 -0700 (PDT) David Rientjes <rientjes@google.com> wrote: > On Wed, 27 Sep 2023, Jiaqi Yan wrote: > > > > > 1. I am not aware of any chip/platform hardware that implemented > > > > the hw ps part defined in ACPI RASF/RAS2 spec. So I am curious > > > > what the RAS experts from different hardware vendors think about > > > > this. For example, Tony and Dave from Intel, Jon and Vilas from > > > > AMD. Is there any hardware platform (if allowed to disclose) > > > > that implemented ACPI RASF/RAS2? If so, will vendors continue to > > > > support the control of patrol scrubber using the ACPI spec? If > > > > not (as Tony said in [1], will the vendor consider starting some future platform? > > > > > > > > If we are unlikely to get the vendor support, creating this ACPI > > > > specific sysfs API (and the driver implementations) in Linux > > > > seems to have limited meaning. > > > > > > There is a bit of a chicken and egg problem here. Until there is > > > reasonable support in kernel (or it looks like there will be), > > > BIOS teams push back on a requirement to add the tables. > > > I'd encourage no one to bother with RASF - RAS2 is much less > > > ambiguous. > > > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD > > (Jon and Vilas) for your opinion on RAS2. > > > > We'll need to know from vendors, ideally at minimum from both Intel > and AMD, whether RAS2 is the long-term vision here. Nothing is set in > stone, of course, but deciding whether RAS2 is the standard that we > should be rallying around will help to guide future development > including in the kernel. > > If RAS2 is insufficient for future use cases or we would need to > support multiple implementations in the kernel for configuring the > patrol scrubber depending on vendor, that's great feedback to have. > > I'd much rather focus on implementing something in the kernel that we > have some clarity about the vendors supporting, especially when it > comes with user visible interfaces, as opposed to something that may > not be used long term. I think that's a fair ask and that vendor > feedback is required here? Agreed and happy to have feedback from Intel and AMD + all the other CPU vendors who make use of ACPI + all the OEMs who add stuff well beyond what Intel and AMD tell them to :) I'll just note a lot of the ACPI support in the kernel covers stuff not used on mainstream x86 platforms because they are doing something custom and we didn't want 2 + X custom implementations... Some other interfaces for scrub control (beyond existing embedded ones) will surface in the next few months where RAS2 is not appropriate. Jonathan
On Fri, 6 Oct 2023 13:06:53 +0000 "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > [AMD Official Use Only - General] > > I do not believe AMD has implemented RASF/RAS2 at all. > > We are looking at it, but our initial impression is that it is insufficiently flexible for general use. (Not just for this feature, but for others in the future.) > > -Vilas Hi Vilas, So obvious question is - worth fixing? I'm not particularly keen to see 10+ different ways of meeting this requirement. Probably not too bad if that's 10+ drivers implementing the same userspace ABI, but definitely don't want 10 drivers and 10 ABIs. Jonathan > > -----Original Message----- > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > Sent: Friday, October 6, 2023 9:02 AM > To: David Rientjes <rientjes@google.com> > Cc: Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; Grimm, Jon <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; Sridharan, Vilas <Vilas.Sridharan@amd.com>; linuxarm@huawei.com; shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation entries for set of scrub attributes > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > On Wed, 4 Oct 2023 20:18:12 -0700 (PDT) > David Rientjes <rientjes@google.com> wrote: > > > On Wed, 27 Sep 2023, Jiaqi Yan wrote: > > > > > > > 1. I am not aware of any chip/platform hardware that implemented > > > > > the hw ps part defined in ACPI RASF/RAS2 spec. So I am curious > > > > > what the RAS experts from different hardware vendors think about > > > > > this. For example, Tony and Dave from Intel, Jon and Vilas from > > > > > AMD. Is there any hardware platform (if allowed to disclose) > > > > > that implemented ACPI RASF/RAS2? If so, will vendors continue to > > > > > support the control of patrol scrubber using the ACPI spec? If > > > > > not (as Tony said in [1], will the vendor consider starting some future platform? > > > > > > > > > > If we are unlikely to get the vendor support, creating this ACPI > > > > > specific sysfs API (and the driver implementations) in Linux > > > > > seems to have limited meaning. > > > > > > > > There is a bit of a chicken and egg problem here. Until there is > > > > reasonable support in kernel (or it looks like there will be), > > > > BIOS teams push back on a requirement to add the tables. > > > > I'd encourage no one to bother with RASF - RAS2 is much less > > > > ambiguous. > > > > > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD > > > (Jon and Vilas) for your opinion on RAS2. > > > > > > > We'll need to know from vendors, ideally at minimum from both Intel > > and AMD, whether RAS2 is the long-term vision here. Nothing is set in > > stone, of course, but deciding whether RAS2 is the standard that we > > should be rallying around will help to guide future development > > including in the kernel. > > > > If RAS2 is insufficient for future use cases or we would need to > > support multiple implementations in the kernel for configuring the > > patrol scrubber depending on vendor, that's great feedback to have. > > > > I'd much rather focus on implementing something in the kernel that we > > have some clarity about the vendors supporting, especially when it > > comes with user visible interfaces, as opposed to something that may > > not be used long term. I think that's a fair ask and that vendor > > feedback is required here? > > Agreed and happy to have feedback from Intel and AMD + all the other CPU vendors who make use of ACPI + all the OEMs who add stuff well beyond what Intel and AMD tell them to :) I'll just note a lot of the ACPI support in the kernel covers stuff not used on mainstream x86 platforms because they are doing something custom and we didn't want 2 + X custom implementations... > > Some other interfaces for scrub control (beyond existing embedded ones) will surface in the next few months where RAS2 is not appropriate. > > Jonathan > >
[AMD Official Use Only - General] + Leo and Yazen We looked at RASF and RAS2 again. We don't think RASF is worth fixing. Our preference is to coalesce around RAS2 because we think it can be extended in interesting ways. The patrol scrub function probably needs some changes to be more general across different types of hardware (there are some baked-in assumptions that don't always hold true). We will look at some spec changes to fix the patrol scrub function, and we are going to start thinking about other functions that can be added to RAS2. -Vilas -----Original Message----- From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Sent: Wednesday, October 11, 2023 12:36 PM To: Sridharan, Vilas <Vilas.Sridharan@amd.com> Cc: David Rientjes <rientjes@google.com>; Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; Grimm, Jon <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; linuxarm@huawei.com; shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation entries for set of scrub attributes Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. On Fri, 6 Oct 2023 13:06:53 +0000 "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > [AMD Official Use Only - General] > > I do not believe AMD has implemented RASF/RAS2 at all. > > We are looking at it, but our initial impression is that it is > insufficiently flexible for general use. (Not just for this feature, > but for others in the future.) > > -Vilas Hi Vilas, So obvious question is - worth fixing? I'm not particularly keen to see 10+ different ways of meeting this requirement. Probably not too bad if that's 10+ drivers implementing the same userspace ABI, but definitely don't want 10 drivers and 10 ABIs. Jonathan > > -----Original Message----- > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > Sent: Friday, October 6, 2023 9:02 AM > To: David Rientjes <rientjes@google.com> > Cc: Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; > Grimm, Jon <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; > Sridharan, Vilas <Vilas.Sridharan@amd.com>; linuxarm@huawei.com; > shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; > linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; > naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; > jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; > pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; > gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation > entries for set of scrub attributes > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > On Wed, 4 Oct 2023 20:18:12 -0700 (PDT) David Rientjes > <rientjes@google.com> wrote: > > > On Wed, 27 Sep 2023, Jiaqi Yan wrote: > > > > > > > 1. I am not aware of any chip/platform hardware that > > > > > implemented the hw ps part defined in ACPI RASF/RAS2 spec. So > > > > > I am curious what the RAS experts from different hardware > > > > > vendors think about this. For example, Tony and Dave from > > > > > Intel, Jon and Vilas from AMD. Is there any hardware platform > > > > > (if allowed to disclose) that implemented ACPI RASF/RAS2? If > > > > > so, will vendors continue to support the control of patrol > > > > > scrubber using the ACPI spec? If not (as Tony said in [1], will the vendor consider starting some future platform? > > > > > > > > > > If we are unlikely to get the vendor support, creating this > > > > > ACPI specific sysfs API (and the driver implementations) in > > > > > Linux seems to have limited meaning. > > > > > > > > There is a bit of a chicken and egg problem here. Until there is > > > > reasonable support in kernel (or it looks like there will be), > > > > BIOS teams push back on a requirement to add the tables. > > > > I'd encourage no one to bother with RASF - RAS2 is much less > > > > ambiguous. > > > > > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD > > > (Jon and Vilas) for your opinion on RAS2. > > > > > > > We'll need to know from vendors, ideally at minimum from both Intel > > and AMD, whether RAS2 is the long-term vision here. Nothing is set > > in stone, of course, but deciding whether RAS2 is the standard that > > we should be rallying around will help to guide future development > > including in the kernel. > > > > If RAS2 is insufficient for future use cases or we would need to > > support multiple implementations in the kernel for configuring the > > patrol scrubber depending on vendor, that's great feedback to have. > > > > I'd much rather focus on implementing something in the kernel that > > we have some clarity about the vendors supporting, especially when > > it comes with user visible interfaces, as opposed to something that > > may not be used long term. I think that's a fair ask and that > > vendor feedback is required here? > > Agreed and happy to have feedback from Intel and AMD + all the other CPU vendors who make use of ACPI + all the OEMs who add stuff well beyond what Intel and AMD tell them to :) I'll just note a lot of the ACPI support in the kernel covers stuff not used on mainstream x86 platforms because they are doing something custom and we didn't want 2 + X custom implementations... > > Some other interfaces for scrub control (beyond existing embedded ones) will surface in the next few months where RAS2 is not appropriate. > > Jonathan > >
On Thu, 12 Oct 2023 13:41:19 +0000 "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > [AMD Official Use Only - General] > > + Leo and Yazen Hi All. + Kangkang and Wanghuiqiang (Henson), > > We looked at RASF and RAS2 again. We don't think RASF is worth fixing. Our preference is to coalesce around RAS2 because we think it can be extended in interesting ways. Absolutely agree. I'm guessing RAS2 was previous go at fixing RASF though I haven't done the archaeology. > > The patrol scrub function probably needs some changes to be more general across different types of hardware (there are some baked-in assumptions that don't always hold true). Agreed. One aspect I'd love to see improved is expanded discoverability of what the hardware can do. > > We will look at some spec changes to fix the patrol scrub function, and we are going to start thinking about other functions that can be added to RAS2. Feel free to reach out if you want some early input on this. Are you thinking a code first proposal? If you think doing this through the standards body is a good idea then perhaps message back here so we know when to look for further proposals in mantis. Thanks, Jonathan > > -Vilas > > -----Original Message----- > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > Sent: Wednesday, October 11, 2023 12:36 PM > To: Sridharan, Vilas <Vilas.Sridharan@amd.com> > Cc: David Rientjes <rientjes@google.com>; Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; Grimm, Jon <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; linuxarm@huawei.com; shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation entries for set of scrub attributes > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > On Fri, 6 Oct 2023 13:06:53 +0000 > "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > > > [AMD Official Use Only - General] > > > > I do not believe AMD has implemented RASF/RAS2 at all. > > > > We are looking at it, but our initial impression is that it is > > insufficiently flexible for general use. (Not just for this feature, > > but for others in the future.) > > > > -Vilas > > Hi Vilas, > > So obvious question is - worth fixing? > > I'm not particularly keen to see 10+ different ways of meeting this requirement. > > Probably not too bad if that's 10+ drivers implementing the same userspace ABI, but definitely don't want 10 drivers and 10 ABIs. > > Jonathan > > > > > -----Original Message----- > > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > > Sent: Friday, October 6, 2023 9:02 AM > > To: David Rientjes <rientjes@google.com> > > Cc: Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; > > Grimm, Jon <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; > > Sridharan, Vilas <Vilas.Sridharan@amd.com>; linuxarm@huawei.com; > > shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; > > linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; > > naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; > > jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; > > pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; > > gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com > > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation > > entries for set of scrub attributes > > > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > > > > On Wed, 4 Oct 2023 20:18:12 -0700 (PDT) David Rientjes > > <rientjes@google.com> wrote: > > > > > On Wed, 27 Sep 2023, Jiaqi Yan wrote: > > > > > > > > > 1. I am not aware of any chip/platform hardware that > > > > > > implemented the hw ps part defined in ACPI RASF/RAS2 spec. So > > > > > > I am curious what the RAS experts from different hardware > > > > > > vendors think about this. For example, Tony and Dave from > > > > > > Intel, Jon and Vilas from AMD. Is there any hardware platform > > > > > > (if allowed to disclose) that implemented ACPI RASF/RAS2? If > > > > > > so, will vendors continue to support the control of patrol > > > > > > scrubber using the ACPI spec? If not (as Tony said in [1], will the vendor consider starting some future platform? > > > > > > > > > > > > If we are unlikely to get the vendor support, creating this > > > > > > ACPI specific sysfs API (and the driver implementations) in > > > > > > Linux seems to have limited meaning. > > > > > > > > > > There is a bit of a chicken and egg problem here. Until there is > > > > > reasonable support in kernel (or it looks like there will be), > > > > > BIOS teams push back on a requirement to add the tables. > > > > > I'd encourage no one to bother with RASF - RAS2 is much less > > > > > ambiguous. > > > > > > > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD > > > > (Jon and Vilas) for your opinion on RAS2. > > > > > > > > > > We'll need to know from vendors, ideally at minimum from both Intel > > > and AMD, whether RAS2 is the long-term vision here. Nothing is set > > > in stone, of course, but deciding whether RAS2 is the standard that > > > we should be rallying around will help to guide future development > > > including in the kernel. > > > > > > If RAS2 is insufficient for future use cases or we would need to > > > support multiple implementations in the kernel for configuring the > > > patrol scrubber depending on vendor, that's great feedback to have. > > > > > > I'd much rather focus on implementing something in the kernel that > > > we have some clarity about the vendors supporting, especially when > > > it comes with user visible interfaces, as opposed to something that > > > may not be used long term. I think that's a fair ask and that > > > vendor feedback is required here? > > > > Agreed and happy to have feedback from Intel and AMD + all the other CPU vendors who make use of ACPI + all the OEMs who add stuff well beyond what Intel and AMD tell them to :) I'll just note a lot of the ACPI support in the kernel covers stuff not used on mainstream x86 platforms because they are doing something custom and we didn't want 2 + X custom implementations... > > > > Some other interfaces for scrub control (beyond existing embedded ones) will surface in the next few months where RAS2 is not appropriate. > > > > Jonathan > > > > >
[AMD Official Use Only - General] > Are you thinking a code first proposal? If you think doing this through the standards body is a good idea then perhaps message back here so we know when to look for further proposals in mantis. I am not super familiar with what you mean by 'code first proposal', but we are thinking about crafting an ECN (or a set of ECNs) for ACPI, that will be made public through ACPI's normal process. -Vilas -----Original Message----- From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Sent: Thursday, October 12, 2023 11:02 AM To: Sridharan, Vilas <Vilas.Sridharan@amd.com> Cc: Duran, Leo <leo.duran@amd.com>; Ghannam, Yazen <Yazen.Ghannam@amd.com>; David Rientjes <rientjes@google.com>; Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; Grimm, Jon <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; linuxarm@huawei.com; shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com; kangkang.shen@futurewei.com; wanghuiqiang@huawei.com Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation entries for set of scrub attributes Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. On Thu, 12 Oct 2023 13:41:19 +0000 "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > [AMD Official Use Only - General] > > + Leo and Yazen Hi All. + Kangkang and Wanghuiqiang (Henson), > > We looked at RASF and RAS2 again. We don't think RASF is worth fixing. Our preference is to coalesce around RAS2 because we think it can be extended in interesting ways. Absolutely agree. I'm guessing RAS2 was previous go at fixing RASF though I haven't done the archaeology. > > The patrol scrub function probably needs some changes to be more general across different types of hardware (there are some baked-in assumptions that don't always hold true). Agreed. One aspect I'd love to see improved is expanded discoverability of what the hardware can do. > > We will look at some spec changes to fix the patrol scrub function, and we are going to start thinking about other functions that can be added to RAS2. Feel free to reach out if you want some early input on this. Are you thinking a code first proposal? If you think doing this through the standards body is a good idea then perhaps message back here so we know when to look for further proposals in mantis. Thanks, Jonathan > > -Vilas > > -----Original Message----- > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > Sent: Wednesday, October 11, 2023 12:36 PM > To: Sridharan, Vilas <Vilas.Sridharan@amd.com> > Cc: David Rientjes <rientjes@google.com>; Jiaqi Yan > <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; Grimm, Jon > <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; linuxarm@huawei.com; > shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; > linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; > naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; > jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; > pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; > gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation > entries for set of scrub attributes > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > On Fri, 6 Oct 2023 13:06:53 +0000 > "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > > > [AMD Official Use Only - General] > > > > I do not believe AMD has implemented RASF/RAS2 at all. > > > > We are looking at it, but our initial impression is that it is > > insufficiently flexible for general use. (Not just for this feature, > > but for others in the future.) > > > > -Vilas > > Hi Vilas, > > So obvious question is - worth fixing? > > I'm not particularly keen to see 10+ different ways of meeting this requirement. > > Probably not too bad if that's 10+ drivers implementing the same userspace ABI, but definitely don't want 10 drivers and 10 ABIs. > > Jonathan > > > > > -----Original Message----- > > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > > Sent: Friday, October 6, 2023 9:02 AM > > To: David Rientjes <rientjes@google.com> > > Cc: Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony > > <tony.luck@intel.com>; Grimm, Jon <Jon.Grimm@amd.com>; > > dave.hansen@linux.intel.com; Sridharan, Vilas > > <Vilas.Sridharan@amd.com>; linuxarm@huawei.com; > > shiju.jose@huawei.com; linux-acpi@vger.kernel.org; > > linux-mm@kvack.org; linux-kernel@vger.kernel.org; rafael@kernel.org; > > lenb@kernel.org; naoya.horiguchi@nec.com; james.morse@arm.com; > > david@redhat.com; jthoughton@google.com; somasundaram.a@hpe.com; > > erdemaktas@google.com; pgonda@google.com; duenwen@google.com; > > mike.malvestuto@intel.com; gthelen@google.com; > > tanxiaofei@huawei.com; prime.zeng@hisilicon.com > > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation > > entries for set of scrub attributes > > > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > > > > On Wed, 4 Oct 2023 20:18:12 -0700 (PDT) David Rientjes > > <rientjes@google.com> wrote: > > > > > On Wed, 27 Sep 2023, Jiaqi Yan wrote: > > > > > > > > > 1. I am not aware of any chip/platform hardware that > > > > > > implemented the hw ps part defined in ACPI RASF/RAS2 spec. > > > > > > So I am curious what the RAS experts from different hardware > > > > > > vendors think about this. For example, Tony and Dave from > > > > > > Intel, Jon and Vilas from AMD. Is there any hardware > > > > > > platform (if allowed to disclose) that implemented ACPI > > > > > > RASF/RAS2? If so, will vendors continue to support the > > > > > > control of patrol scrubber using the ACPI spec? If not (as Tony said in [1], will the vendor consider starting some future platform? > > > > > > > > > > > > If we are unlikely to get the vendor support, creating this > > > > > > ACPI specific sysfs API (and the driver implementations) in > > > > > > Linux seems to have limited meaning. > > > > > > > > > > There is a bit of a chicken and egg problem here. Until there > > > > > is reasonable support in kernel (or it looks like there will > > > > > be), BIOS teams push back on a requirement to add the tables. > > > > > I'd encourage no one to bother with RASF - RAS2 is much less > > > > > ambiguous. > > > > > > > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD > > > > (Jon and Vilas) for your opinion on RAS2. > > > > > > > > > > We'll need to know from vendors, ideally at minimum from both > > > Intel and AMD, whether RAS2 is the long-term vision here. Nothing > > > is set in stone, of course, but deciding whether RAS2 is the > > > standard that we should be rallying around will help to guide > > > future development including in the kernel. > > > > > > If RAS2 is insufficient for future use cases or we would need to > > > support multiple implementations in the kernel for configuring the > > > patrol scrubber depending on vendor, that's great feedback to have. > > > > > > I'd much rather focus on implementing something in the kernel that > > > we have some clarity about the vendors supporting, especially when > > > it comes with user visible interfaces, as opposed to something > > > that may not be used long term. I think that's a fair ask and > > > that vendor feedback is required here? > > > > Agreed and happy to have feedback from Intel and AMD + all the other CPU vendors who make use of ACPI + all the OEMs who add stuff well beyond what Intel and AMD tell them to :) I'll just note a lot of the ACPI support in the kernel covers stuff not used on mainstream x86 platforms because they are doing something custom and we didn't want 2 + X custom implementations... > > > > Some other interfaces for scrub control (beyond existing embedded ones) will surface in the next few months where RAS2 is not appropriate. > > > > Jonathan > > > > >
On Thu, 12 Oct 2023 15:44:18 +0000 "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > [AMD Official Use Only - General] > > > Are you thinking a code first proposal? If you think doing this through the standards body is a good idea then perhaps message back here so we know when to look for further proposals in mantis. > > I am not super familiar with what you mean by 'code first proposal', but we are thinking about crafting an ECN (or a set of ECNs) for ACPI, that will be made public through ACPI's normal process. > There are two ways to go about getting an ECN into the specification and which one is chosen affects the 'made it public' part of the ECN. One is public from the start and is done via a proposal submitted to the Specification Updates section of the tianocore bugzilla. This is referred to as "code first", but actually just means the request came from discussions initially had outside of the UEFI forum. They are still discussed in ASWG, but review also occurs in public on the bugzilla. https://bugzilla.tianocore.org/buglist.cgi?component=Specification%20Update&product=EDK2%20Code%20First&resolution=--- The other is the more traditional method of proposing in private. There the issue is that the review is limited to those who both engage closely with ASWG and those who can remember their mantis password. Before we post any software based on changes going via that route (as its covered by UEFI forum IP rules) we have to wait for a formal specification release. So basically the traditional method is typically slower and doesn't let us do helpful things like ask the kernel community to review the proposed changes. The code first route was added a few years ago to provide the options for companies that preferred the flexibility and openness it provides. As you can see from the link above, there is a lot of activity via the code first route these days. Jonathan > -Vilas > > -----Original Message----- > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > Sent: Thursday, October 12, 2023 11:02 AM > To: Sridharan, Vilas <Vilas.Sridharan@amd.com> > Cc: Duran, Leo <leo.duran@amd.com>; Ghannam, Yazen <Yazen.Ghannam@amd.com>; David Rientjes <rientjes@google.com>; Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; Grimm, Jon <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; linuxarm@huawei.com; shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com; kangkang.shen@futurewei.com; wanghuiqiang@huawei.com > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation entries for set of scrub attributes > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > On Thu, 12 Oct 2023 13:41:19 +0000 > "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > > > [AMD Official Use Only - General] > > > > + Leo and Yazen > > Hi All. > > + Kangkang and Wanghuiqiang (Henson), > > > > > We looked at RASF and RAS2 again. We don't think RASF is worth fixing. Our preference is to coalesce around RAS2 because we think it can be extended in interesting ways. > > Absolutely agree. I'm guessing RAS2 was previous go at fixing RASF though I haven't done the archaeology. > > > > > The patrol scrub function probably needs some changes to be more general across different types of hardware (there are some baked-in assumptions that don't always hold true). > > Agreed. One aspect I'd love to see improved is expanded discoverability of what the hardware can do. > > > > > We will look at some spec changes to fix the patrol scrub function, and we are going to start thinking about other functions that can be added to RAS2. > > Feel free to reach out if you want some early input on this. Are you thinking a code first proposal? If you think doing this through the standards body is a good idea then perhaps message back here so we know when to look for further proposals in mantis. > > Thanks, > > Jonathan > > > > -Vilas > > > > -----Original Message----- > > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > > Sent: Wednesday, October 11, 2023 12:36 PM > > To: Sridharan, Vilas <Vilas.Sridharan@amd.com> > > Cc: David Rientjes <rientjes@google.com>; Jiaqi Yan > > <jiaqiyan@google.com>; Luck, Tony <tony.luck@intel.com>; Grimm, Jon > > <Jon.Grimm@amd.com>; dave.hansen@linux.intel.com; linuxarm@huawei.com; > > shiju.jose@huawei.com; linux-acpi@vger.kernel.org; linux-mm@kvack.org; > > linux-kernel@vger.kernel.org; rafael@kernel.org; lenb@kernel.org; > > naoya.horiguchi@nec.com; james.morse@arm.com; david@redhat.com; > > jthoughton@google.com; somasundaram.a@hpe.com; erdemaktas@google.com; > > pgonda@google.com; duenwen@google.com; mike.malvestuto@intel.com; > > gthelen@google.com; tanxiaofei@huawei.com; prime.zeng@hisilicon.com > > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation > > entries for set of scrub attributes > > > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > > > > On Fri, 6 Oct 2023 13:06:53 +0000 > > "Sridharan, Vilas" <Vilas.Sridharan@amd.com> wrote: > > > > > [AMD Official Use Only - General] > > > > > > I do not believe AMD has implemented RASF/RAS2 at all. > > > > > > We are looking at it, but our initial impression is that it is > > > insufficiently flexible for general use. (Not just for this feature, > > > but for others in the future.) > > > > > > -Vilas > > > > Hi Vilas, > > > > So obvious question is - worth fixing? > > > > I'm not particularly keen to see 10+ different ways of meeting this requirement. > > > > Probably not too bad if that's 10+ drivers implementing the same userspace ABI, but definitely don't want 10 drivers and 10 ABIs. > > > > Jonathan > > > > > > > > -----Original Message----- > > > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > > > Sent: Friday, October 6, 2023 9:02 AM > > > To: David Rientjes <rientjes@google.com> > > > Cc: Jiaqi Yan <jiaqiyan@google.com>; Luck, Tony > > > <tony.luck@intel.com>; Grimm, Jon <Jon.Grimm@amd.com>; > > > dave.hansen@linux.intel.com; Sridharan, Vilas > > > <Vilas.Sridharan@amd.com>; linuxarm@huawei.com; > > > shiju.jose@huawei.com; linux-acpi@vger.kernel.org; > > > linux-mm@kvack.org; linux-kernel@vger.kernel.org; rafael@kernel.org; > > > lenb@kernel.org; naoya.horiguchi@nec.com; james.morse@arm.com; > > > david@redhat.com; jthoughton@google.com; somasundaram.a@hpe.com; > > > erdemaktas@google.com; pgonda@google.com; duenwen@google.com; > > > mike.malvestuto@intel.com; gthelen@google.com; > > > tanxiaofei@huawei.com; prime.zeng@hisilicon.com > > > Subject: Re: [RFC PATCH 2/9] memory: scrub: sysfs: Add Documentation > > > entries for set of scrub attributes > > > > > > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding. > > > > > > > > > On Wed, 4 Oct 2023 20:18:12 -0700 (PDT) David Rientjes > > > <rientjes@google.com> wrote: > > > > > > > On Wed, 27 Sep 2023, Jiaqi Yan wrote: > > > > > > > > > > > 1. I am not aware of any chip/platform hardware that > > > > > > > implemented the hw ps part defined in ACPI RASF/RAS2 spec. > > > > > > > So I am curious what the RAS experts from different hardware > > > > > > > vendors think about this. For example, Tony and Dave from > > > > > > > Intel, Jon and Vilas from AMD. Is there any hardware > > > > > > > platform (if allowed to disclose) that implemented ACPI > > > > > > > RASF/RAS2? If so, will vendors continue to support the > > > > > > > control of patrol scrubber using the ACPI spec? If not (as Tony said in [1], will the vendor consider starting some future platform? > > > > > > > > > > > > > > If we are unlikely to get the vendor support, creating this > > > > > > > ACPI specific sysfs API (and the driver implementations) in > > > > > > > Linux seems to have limited meaning. > > > > > > > > > > > > There is a bit of a chicken and egg problem here. Until there > > > > > > is reasonable support in kernel (or it looks like there will > > > > > > be), BIOS teams push back on a requirement to add the tables. > > > > > > I'd encourage no one to bother with RASF - RAS2 is much less > > > > > > ambiguous. > > > > > > > > > > Here mainly to re-ping folks from Intel (Tony and Dave) and AMD > > > > > (Jon and Vilas) for your opinion on RAS2. > > > > > > > > > > > > > We'll need to know from vendors, ideally at minimum from both > > > > Intel and AMD, whether RAS2 is the long-term vision here. Nothing > > > > is set in stone, of course, but deciding whether RAS2 is the > > > > standard that we should be rallying around will help to guide > > > > future development including in the kernel. > > > > > > > > If RAS2 is insufficient for future use cases or we would need to > > > > support multiple implementations in the kernel for configuring the > > > > patrol scrubber depending on vendor, that's great feedback to have. > > > > > > > > I'd much rather focus on implementing something in the kernel that > > > > we have some clarity about the vendors supporting, especially when > > > > it comes with user visible interfaces, as opposed to something > > > > that may not be used long term. I think that's a fair ask and > > > > that vendor feedback is required here? > > > > > > Agreed and happy to have feedback from Intel and AMD + all the other CPU vendors who make use of ACPI + all the OEMs who add stuff well beyond what Intel and AMD tell them to :) I'll just note a lot of the ACPI support in the kernel covers stuff not used on mainstream x86 platforms because they are doing something custom and we didn't want 2 + X custom implementations... > > > > > > Some other interfaces for scrub control (beyond existing embedded ones) will surface in the next few months where RAS2 is not appropriate. > > > > > > Jonathan > > > > > > > > >
diff --git a/Documentation/ABI/testing/sysfs-class-scrub-configure b/Documentation/ABI/testing/sysfs-class-scrub-configure new file mode 100644 index 000000000000..347e2167dc62 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-class-scrub-configure @@ -0,0 +1,82 @@ +What: /sys/class/scrub/ +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + The scrub/ class subdirectory belongs to the + scrubber subsystem. + +What: /sys/class/scrub/scrubX/ +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + The /sys/class/scrub/scrub{0,1,2,3,...} directories + correspond to each scrub device. + +What: /sys/class/scrub/scrubX/name +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + (RO) name of the memory scrub device + +What: /sys/class/scrub/scrubX/regionY/ +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + The /sys/class/scrub/scrubX/region{0,1,2,3,...} + directories correspond to each scrub region under a scrub device. + Scrub region is a physical address range for which scrub may be + separately controlled. Regions may overlap in which case the + scrubbing rate of the overlapped memory will be at least that + expected due to each overlapping region. + +What: /sys/class/scrub/scrubX/regionY/addr_base +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + (RW) The base of the address range of the memory region + to be patrol scrubbed. + On reading, returns the base of the memory region for + the actual address range(The platform calculates + the nearest patrol scrub boundary address from where + it can start scrubbing). + +What: /sys/class/scrub/scrubX/regionY/addr_size +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + (RW) The size of the address range to be patrol scrubbed. + On reading, returns the size of the memory region for + the actual address range. + +What: /sys/class/scrub/scrubX/regionY/enable +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + (WO) Start/Stop scrubbing the memory region. + 1 - enable the memory scrubbing. + 0 - disable the memory scrubbing.. + +What: /sys/class/scrub/scrubX/regionY/speed_available +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + (RO) Supported range for the partol speed(scrub rate) + by the scrubber for a memory region. + The unit of the scrub rate vary depends on the scrubber. + +What: /sys/class/scrub/scrubX/regionY/speed +Date: September 2023 +KernelVersion: 6.7 +Contact: linux-kernel@vger.kernel.org +Description: + (RW) The partol speed(scrub rate) on the memory region specified and + it must be with in the supported range by the scrubber. + The unit of the scrub rate vary depends on the scrubber.