Message ID | 20230406233926.1670094-1-rrendec@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | arch_topology: Pre-allocate cacheinfo from primary CPU | expand |
Hello Radu, Some additional points: 1- For the record, Will made a comment about adding weak functions (cf. https://lore.kernel.org/all/20230327121734.GB31342@willie-the-truck/) but I don't see how it could be done otherwise ... 2- The patch-set needs to be rebased on top of v6.3-rc6, otherwise there is a merge conflict. 3- When trying the patch-set on an ACPI platform with no PPTT, it seems that fetch_cache_info() is not called from init_cpu_topology() because parse_acpi_topology() returns an error code. This result in a 'sleeping function called from invalid context' message. The following made it work for me: --- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -838,7 +838,6 @@ void __init init_cpu_topology(void) * don't use partial information. */ reset_cpu_topology(); - return; } for_each_possible_cpu(cpu) { With 2 and 3 addressed: Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Also maybe wait for Sudeep to have a look before sending a v4, Regards, Pierre On 4/7/23 01:39, Radu Rendec wrote: > Commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU") > tries to build the cacheinfo from the primary CPU prior to secondary > CPUs boot, if the DT/ACPI description contains cache information. > However, if such information is not present, it still reverts to the old > behavior, which allocates the cacheinfo memory on each secondary CPU. On > RT kernels, this triggers a "BUG: sleeping function called from invalid > context" because the allocation is done before preemption is first > enabled on the secondary CPU. > > The solution is to add cache information to DT/ACPI, but at least on > arm64 systems this can be avoided by leveraging automatic detection > (through the CLIDR_EL1 register), which is already implemented but > currently doesn't work on RT kernels for the reason described above. > > This patch series attempts to enable automatic detection for RT kernels > when no DT/ACPI cache information is available, by pre-allocating > cacheinfo memory on the primary CPU. > > The first patch adds an architecture independent infrastructure that > allows architecture specific code to take an early guess at the number > of cache leaves of the secodary CPUs, while it runs in preemptible > context on the primary CPU. At the same time, it gives architecture > specific code the opportunity to go back later, while it runs on the > secondary CPU, and reallocate the cacheinfo memory if the initial guess > proves to be wrong. > > The second patch leverages the infrastructure implemented in the first > patch and enables early cache depth detection for arm64. > > The patch series is based on an RFC patch that was posted to the > linux-arm-kernel mailing list and discussed with a smaller audience: > https://lore.kernel.org/all/20230323224242.31142-1-rrendec@redhat.com/ > > Changes to v2: > * Address minor coding style issue (unbalanced braces). > * Move cacheinfo reallocation logic from detect_cache_attributes() to a > new function to improve code readability. > * Minor fix to cacheinfo reallocation logic to avoid a new detection of > the cache level if/when detect_cache_attributes() is called again. > > Radu Rendec (2): > cacheinfo: Add arch specific early level initializer > cacheinfo: Add arm64 early level initializer implementation > > arch/arm64/kernel/cacheinfo.c | 32 +++++++++++---- > drivers/base/cacheinfo.c | 75 +++++++++++++++++++++++++---------- > include/linux/cacheinfo.h | 2 + > 3 files changed, 79 insertions(+), 30 deletions(-) >
Hello Pierre, On Tue, 2023-04-11 at 14:36 +0200, Pierre Gondois wrote: > Hello Radu, > Some additional points: > 1- > For the record, Will made a comment about adding weak functions > (cf. https://lore.kernel.org/all/20230327121734.GB31342@willie-the-truck/) > but I don't see how it could be done otherwise ... In that comment, Will suggested using static inline functions in a header. It would probably work but for the sake of consistency with init_cache_level() I would argue it's better to use a weak function in this particular case. > 2- > The patch-set needs to be rebased on top of v6.3-rc6, > otherwise there is a merge conflict. Fair enough. I worked off of linux-rt-devel for obvious reasons, and forgot to rebase. It's a trivial conflict, which both "git rebase" and "git am -3" can fix automatically. I will keep this in mind for v4. > 3- > When trying the patch-set on an ACPI platform with no PPTT, it seems that > fetch_cache_info() is not called from init_cpu_topology() because > parse_acpi_topology() returns an error code. This result in a > 'sleeping function called from invalid context' message. The following made > it work for me: > > --- a/drivers/base/arch_topology.c > +++ b/drivers/base/arch_topology.c > @@ -838,7 +838,6 @@ void __init init_cpu_topology(void) > * don't use partial information. > */ > reset_cpu_topology(); > - return; > } > > for_each_possible_cpu(cpu) { Good catch! I think this calls for a dedicated patch in the series to do just that and explain why the "return" statement is being removed. > With 2 and 3 addressed: > Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Thanks again for reviewing these patches and for all your input! > Also maybe wait for Sudeep to have a look before sending a v4, Sure. Looking at other patches, he seems to respond pretty quickly, so I'll wait until tomorrow and then send v4 if I don't hear back. Best regards, Radu > On 4/7/23 01:39, Radu Rendec wrote: > > Commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU") > > tries to build the cacheinfo from the primary CPU prior to secondary > > CPUs boot, if the DT/ACPI description contains cache information. > > However, if such information is not present, it still reverts to the old > > behavior, which allocates the cacheinfo memory on each secondary CPU. On > > RT kernels, this triggers a "BUG: sleeping function called from invalid > > context" because the allocation is done before preemption is first > > enabled on the secondary CPU. > > > > The solution is to add cache information to DT/ACPI, but at least on > > arm64 systems this can be avoided by leveraging automatic detection > > (through the CLIDR_EL1 register), which is already implemented but > > currently doesn't work on RT kernels for the reason described above. > > > > This patch series attempts to enable automatic detection for RT kernels > > when no DT/ACPI cache information is available, by pre-allocating > > cacheinfo memory on the primary CPU. > > > > The first patch adds an architecture independent infrastructure that > > allows architecture specific code to take an early guess at the number > > of cache leaves of the secodary CPUs, while it runs in preemptible > > context on the primary CPU. At the same time, it gives architecture > > specific code the opportunity to go back later, while it runs on the > > secondary CPU, and reallocate the cacheinfo memory if the initial guess > > proves to be wrong. > > > > The second patch leverages the infrastructure implemented in the first > > patch and enables early cache depth detection for arm64. > > > > The patch series is based on an RFC patch that was posted to the > > linux-arm-kernel mailing list and discussed with a smaller audience: > > https://lore.kernel.org/all/20230323224242.31142-1-rrendec@redhat.com/ > > > > Changes to v2: > > * Address minor coding style issue (unbalanced braces). > > * Move cacheinfo reallocation logic from detect_cache_attributes() to a > > new function to improve code readability. > > * Minor fix to cacheinfo reallocation logic to avoid a new detection of > > the cache level if/when detect_cache_attributes() is called again. > > > > Radu Rendec (2): > > cacheinfo: Add arch specific early level initializer > > cacheinfo: Add arm64 early level initializer implementation > > > > arch/arm64/kernel/cacheinfo.c | 32 +++++++++++---- > > drivers/base/cacheinfo.c | 75 +++++++++++++++++++++++++---------- > > include/linux/cacheinfo.h | 2 + > > 3 files changed, 79 insertions(+), 30 deletions(-) > > >