Message ID | 20240108082727.420817-16-zhao1.liu@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Support smp.clusters for x86 in QEMU |
On 1/8/2024 4:27 PM, Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@intel.com>
>
> The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> the number of sharing threads directly.
>
> From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> means [1]:
>
> The number of logical processors sharing this cache is the value of
> this field incremented by 1. To determine which logical processors are
> sharing a cache, determine a Share Id for each processor as follows:
>
> ShareId = LocalApicId >> log2(NumSharingCache+1)
>
> Logical processors with the same ShareId then share a cache. If
> NumSharingCache+1 is not a power of two, round it up to the next power
> of two.
>
> From the description above, the calculation of this field should be same
> as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
> APIC ID to calculate this field.
>
> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
> Information

this patch can be dropped because we have next patch.

> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> Reviewed-by: Babu Moger <babu.moger@amd.com>
> Tested-by: Babu Moger <babu.moger@amd.com>
> Tested-by: Yongwei Ma <yongwei.ma@intel.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> Changes since v3:
> * Rewrite the subject. (Babu)
> * Delete the original "comment/help" expression, as this behavior is
> confirmed for AMD CPUs. (Babu)
> * Rename "num_apic_ids" (v3) to "num_sharing_cache" to match spec
> definition. (Babu)
>
> Changes since v1:
> * Rename "l3_threads" to "num_apic_ids" in
> encode_cache_cpuid8000001d(). (Yanan)
> * Add the description of the original commit and add Cc.
> ---
> target/i386/cpu.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b23e8190dc68..8a4d72f6f760 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -483,7 +483,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>                                        uint32_t *eax, uint32_t *ebx,
>                                        uint32_t *ecx, uint32_t *edx)
>  {
> -    uint32_t l3_threads;
> +    uint32_t num_sharing_cache;
>      assert(cache->size == cache->line_size * cache->associativity *
>                            cache->partitions * cache->sets);
>
> @@ -492,13 +492,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>
>      /* L3 is shared among multiple cores */
>      if (cache->level == 3) {
> -        l3_threads = topo_info->modules_per_die *
> -                     topo_info->cores_per_module *
> -                     topo_info->threads_per_core;
> -        *eax |= (l3_threads - 1) << 14;
> +        num_sharing_cache = 1 << apicid_die_offset(topo_info);
>      } else {
> -        *eax |= ((topo_info->threads_per_core - 1) << 14);
> +        num_sharing_cache = 1 << apicid_core_offset(topo_info);
>      }
> +    *eax |= (num_sharing_cache - 1) << 14;
>
>      assert(cache->line_size > 0);
>      assert(cache->partitions > 0);
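As a reading aid for the APM text quoted in the commit message, the ShareId rule can be sketched from the guest's point of view as below. This is only an illustration under stated assumptions: the helper names (`ceil_log2_u32`, `num_sharing_cache_field`, `share_id`) are hypothetical and are not part of QEMU or of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest n such that (1 << n) >= count; count must be non-zero.
 * This implements "round it up to the next power of two" as a shift. */
static inline unsigned ceil_log2_u32(uint32_t count)
{
    unsigned shift = 0;

    assert(count > 0);
    while (((uint64_t)1 << shift) < count) {
        shift++;
    }
    return shift;
}

/* Extract NumSharingCache from CPUID[0x8000001D].EAX[bits 25:14]. */
static inline uint32_t num_sharing_cache_field(uint32_t eax)
{
    return (eax >> 14) & 0xfff;
}

/* ShareId = LocalApicId >> log2(NumSharingCache + 1), with the sharer
 * count rounded up to a power of two when it is not one already. */
static inline uint32_t share_id(uint32_t local_apic_id, uint32_t eax)
{
    uint32_t sharers = num_sharing_cache_field(eax) + 1;

    return local_apic_id >> ceil_log2_u32(sharers);
}
```

For instance, a field value of 5 (six sharers) is rounded up to eight, so `share_id()` shifts the APIC ID right by 3: APIC IDs 0 through 7 share one cache and APIC ID 8 starts the next ShareId.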
Hi Xiaoyao,

On Sun, Jan 14, 2024 at 10:42:41PM +0800, Xiaoyao Li wrote:
> Date: Sun, 14 Jan 2024 22:42:41 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v7 15/16] i386: Use offsets get NumSharingCache for
>  CPUID[0x8000001D].EAX[bits 25:14]
>
> On 1/8/2024 4:27 PM, Zhao Liu wrote:
> > From: Zhao Liu <zhao1.liu@intel.com>
> >
> > The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> > for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> > the number of sharing threads directly.
> >
> > From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> > means [1]:
> >
> > The number of logical processors sharing this cache is the value of
> > this field incremented by 1. To determine which logical processors are
> > sharing a cache, determine a Share Id for each processor as follows:
> >
> > ShareId = LocalApicId >> log2(NumSharingCache+1)
> >
> > Logical processors with the same ShareId then share a cache. If
> > NumSharingCache+1 is not a power of two, round it up to the next power
> > of two.
> >
> > From the description above, the calculation of this field should be same
> > as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
> > APIC ID to calculate this field.
> >
> > [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
> > Information
>
> this patch can be dropped because we have next patch.

This patch is mainly used to explicitly emphasize the change in the
encoding method and compliance with the AMD spec... I didn't test on an
AMD machine, so the more granular patch would make it easier for the
community to review and test.

Thanks,
Zhao
On 1/15/2024 11:48 AM, Zhao Liu wrote:
> Hi Xiaoyao,
>
> On Sun, Jan 14, 2024 at 10:42:41PM +0800, Xiaoyao Li wrote:
>> Date: Sun, 14 Jan 2024 22:42:41 +0800
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>> Subject: Re: [PATCH v7 15/16] i386: Use offsets get NumSharingCache for
>>  CPUID[0x8000001D].EAX[bits 25:14]
>>
>> On 1/8/2024 4:27 PM, Zhao Liu wrote:
>>> From: Zhao Liu <zhao1.liu@intel.com>
>>>
>>> The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
>>> for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
>>> the number of sharing threads directly.
>>>
>>> From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
>>> means [1]:
>>>
>>> The number of logical processors sharing this cache is the value of
>>> this field incremented by 1. To determine which logical processors are
>>> sharing a cache, determine a Share Id for each processor as follows:
>>>
>>> ShareId = LocalApicId >> log2(NumSharingCache+1)
>>>
>>> Logical processors with the same ShareId then share a cache. If
>>> NumSharingCache+1 is not a power of two, round it up to the next power
>>> of two.
>>>
>>> From the description above, the calculation of this field should be same
>>> as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
>>> APIC ID to calculate this field.
>>>
>>> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
>>> Information
>>
>> this patch can be dropped because we have next patch.
>
> This patch is mainly used to explicitly emphasize the change in the
> encoding method and compliance with the AMD spec... I didn't test on an
> AMD machine, so the more granular patch would make it easier for the
> community to review and test.

then please move this patch ahead, e.g., after patch 2.
Hi Xiaoyao,

On Mon, Jan 15, 2024 at 12:27:43PM +0800, Xiaoyao Li wrote:
> Date: Mon, 15 Jan 2024 12:27:43 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v7 15/16] i386: Use offsets get NumSharingCache for
>  CPUID[0x8000001D].EAX[bits 25:14]
>
> On 1/15/2024 11:48 AM, Zhao Liu wrote:
> > Hi Xiaoyao,
> >
> > On Sun, Jan 14, 2024 at 10:42:41PM +0800, Xiaoyao Li wrote:
> > > Date: Sun, 14 Jan 2024 22:42:41 +0800
> > > From: Xiaoyao Li <xiaoyao.li@intel.com>
> > > Subject: Re: [PATCH v7 15/16] i386: Use offsets get NumSharingCache for
> > >  CPUID[0x8000001D].EAX[bits 25:14]
> > >
> > > On 1/8/2024 4:27 PM, Zhao Liu wrote:
> > > > From: Zhao Liu <zhao1.liu@intel.com>
> > > >
> > > > The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> > > > for cpuid 0x8000001D") adds the cache topology for AMD CPU by encoding
> > > > the number of sharing threads directly.
> > > >
> > > > From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> > > > means [1]:
> > > >
> > > > The number of logical processors sharing this cache is the value of
> > > > this field incremented by 1. To determine which logical processors are
> > > > sharing a cache, determine a Share Id for each processor as follows:
> > > >
> > > > ShareId = LocalApicId >> log2(NumSharingCache+1)
> > > >
> > > > Logical processors with the same ShareId then share a cache. If
> > > > NumSharingCache+1 is not a power of two, round it up to the next power
> > > > of two.
> > > >
> > > > From the description above, the calculation of this field should be same
> > > > as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
> > > > APIC ID to calculate this field.
> > > >
> > > > [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
> > > > Information
> > >
> > > this patch can be dropped because we have next patch.
> >
> > This patch is mainly used to explicitly emphasize the change in the
> > encoding method and compliance with the AMD spec... I didn't test on an
> > AMD machine, so the more granular patch would make it easier for the
> > community to review and test.
>
> then please move this patch ahead, e.g., after patch 2.
>

OK. Thanks!

-Zhao
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b23e8190dc68..8a4d72f6f760 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -483,7 +483,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
                                        uint32_t *eax, uint32_t *ebx,
                                        uint32_t *ecx, uint32_t *edx)
 {
-    uint32_t l3_threads;
+    uint32_t num_sharing_cache;
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
@@ -492,13 +492,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     /* L3 is shared among multiple cores */
     if (cache->level == 3) {
-        l3_threads = topo_info->modules_per_die *
-                     topo_info->cores_per_module *
-                     topo_info->threads_per_core;
-        *eax |= (l3_threads - 1) << 14;
+        num_sharing_cache = 1 << apicid_die_offset(topo_info);
     } else {
-        *eax |= ((topo_info->threads_per_core - 1) << 14);
+        num_sharing_cache = 1 << apicid_core_offset(topo_info);
     }
+    *eax |= (num_sharing_cache - 1) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
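To see where the two encodings in the diff can diverge, here is a small standalone sketch. It is not QEMU code. `struct topo_sketch` and every helper below are hypothetical stand-ins. The sketch assumes that each topology level occupies ceil(log2(count)) bits of the APIC ID, and it ignores the module level for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified topology; the module level is left out. */
struct topo_sketch {
    uint32_t threads_per_core;
    uint32_t cores_per_die;
};

/* Assumed APIC ID field width for a level: ceil(log2(count)). */
static inline unsigned bit_width_for_count(uint32_t count)
{
    unsigned width = 0;

    assert(count > 0);
    while (((uint64_t)1 << width) < count) {
        width++;
    }
    return width;
}

/* Stand-ins for the core and die offsets used by the patch. */
static inline unsigned sketch_core_offset(const struct topo_sketch *t)
{
    return bit_width_for_count(t->threads_per_core);
}

static inline unsigned sketch_die_offset(const struct topo_sketch *t)
{
    return sketch_core_offset(t) + bit_width_for_count(t->cores_per_die);
}

/* Old L3 encoding: number of threads in the die, minus one. */
static inline uint32_t l3_field_by_count(const struct topo_sketch *t)
{
    return t->cores_per_die * t->threads_per_core - 1;
}

/* New L3 encoding: derived from the die offset in the APIC ID. */
static inline uint32_t l3_field_by_offset(const struct topo_sketch *t)
{
    return ((uint32_t)1 << sketch_die_offset(t)) - 1;
}

/* Shift a guest derives from the field, following the APM rounding rule. */
static inline unsigned share_shift(uint32_t field)
{
    return bit_width_for_count(field + 1);
}
```

With power-of-two counts such as 2 threads and 4 cores, both encodings give 7. With 3 threads and 5 cores, the count-based field is 14 (shift 4) while the offset-based field is 31 (shift 5). Under the assumed layout the highest APIC ID in the die is (4 << 2) | 2 = 18, so a shift of 4 would place APIC IDs 16 to 18 in a different ShareId from the rest of the die, whereas a shift of 5 keeps the whole die together.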