Message ID | 2d42decac5194c2c8d897b0424f0dcf3@honor.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: simplify zone_idx() | expand |
Hi, On Thu, Apr 10, 2025 at 12:03:00PM +0000, gaoxu wrote: > store zone_idx directly in struct zone to simplify and optimize zone_idx() Do you see an actual speed up somewhere? > Signed-off-by: gao xu <gaoxu2@honor.com> > --- > include/linux/mmzone.h | 3 ++- > mm/mm_init.c | 1 + > 2 files changed, 3 insertions(+), 1 deletion(-) > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index 4c95fcc9e..7b14f577d 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -941,6 +941,7 @@ struct zone { > #endif > > const char *name; > + enum zone_type zone_idx; > > #ifdef CONFIG_MEMORY_ISOLATION > /* > @@ -1536,7 +1537,7 @@ static inline int local_memory_node(int node_id) { return node_id; }; > /* > * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the ZONE_NORMAL zone, etc. > */ > -#define zone_idx(zone) ((zone) - (zone)->zone_pgdat->node_zones) > +#define zone_idx(zone) ((zone)->zone_idx) > > #ifdef CONFIG_ZONE_DEVICE > static inline bool zone_is_zone_device(struct zone *zone) > diff --git a/mm/mm_init.c b/mm/mm_init.c > index 9659689b8..a7f7264f1 100644 > --- a/mm/mm_init.c > +++ b/mm/mm_init.c > @@ -1425,6 +1425,7 @@ static void __meminit zone_init_internals(struct zone *zone, enum zone_type idx, > atomic_long_set(&zone->managed_pages, remaining_pages); > zone_set_nid(zone, nid); > zone->name = zone_names[idx]; > + zone->zone_idx = idx; > zone->zone_pgdat = NODE_DATA(nid); > spin_lock_init(&zone->lock); > zone_seqlock_init(zone); > -- > 2.17.1
On Fri, Apr 11, 2025 at 2:42 AM Mike Rapoport <rppt@kernel.org> wrote: > > Hi, > > On Thu, Apr 10, 2025 at 12:03:00PM +0000, gaoxu wrote: > > store zone_idx directly in struct zone to simplify and optimize zone_idx() > > Do you see an actual speed up somewhere? +1. Curious if there's data indicating zone_idx is a hot path. > > > Signed-off-by: gao xu <gaoxu2@honor.com> > > --- > > include/linux/mmzone.h | 3 ++- > > mm/mm_init.c | 1 + > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > > index 4c95fcc9e..7b14f577d 100644 > > --- a/include/linux/mmzone.h > > +++ b/include/linux/mmzone.h > > @@ -941,6 +941,7 @@ struct zone { > > #endif > > > > const char *name; > > + enum zone_type zone_idx; > > > > #ifdef CONFIG_MEMORY_ISOLATION > > /* > > @@ -1536,7 +1537,7 @@ static inline int local_memory_node(int node_id) { return node_id; }; > > /* > > * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the ZONE_NORMAL zone, etc. > > */ > > -#define zone_idx(zone) ((zone) - (zone)->zone_pgdat->node_zones) > > +#define zone_idx(zone) ((zone)->zone_idx) > > > > #ifdef CONFIG_ZONE_DEVICE > > static inline bool zone_is_zone_device(struct zone *zone) > > diff --git a/mm/mm_init.c b/mm/mm_init.c > > index 9659689b8..a7f7264f1 100644 > > --- a/mm/mm_init.c > > +++ b/mm/mm_init.c > > @@ -1425,6 +1425,7 @@ static void __meminit zone_init_internals(struct zone *zone, enum zone_type idx, > > atomic_long_set(&zone->managed_pages, remaining_pages); > > zone_set_nid(zone, nid); > > zone->name = zone_names[idx]; > > + zone->zone_idx = idx; > > zone->zone_pgdat = NODE_DATA(nid); > > spin_lock_init(&zone->lock); > > zone_seqlock_init(zone); > > -- > > 2.17.1 > > -- > Sincerely yours, > Mike. Thanks Barry
> > On Fri, Apr 11, 2025 at 2:42 AM Mike Rapoport <rppt@kernel.org> wrote: > > > > Hi, > > > > On Thu, Apr 10, 2025 at 12:03:00PM +0000, gaoxu wrote: > > > store zone_idx directly in struct zone to simplify and optimize zone_idx() > > > > Do you see an actual speed up somewhere? Almost negligible. my simple code tests showed the patch provides an average improvement of ~0.02%. Thus, in the Android 15-6.6 kernel, I confidently retained the original zone_idx function. (https://android-review.googlesource.com/c/kernel/common/+/3578322/2/mm/page_alloc.c#770) This patch only eliminates 2-3 assembly instructions, making it challenging to observe measurable performance benefits. However, since the zone struct includes CACHELINE_PADDING (reserving unused space), adding a new member variable does not alter the size of zone. This makes the patch effectively zero-cost while achieving a cleaner implementation of zone_idx. > > +1. Curious if there's data indicating zone_idx is a hot path. There are several functions in the memory management code that are frequently executed and will call zone_idx: rmqueue()->wakeup_kswapd()->zone_idx() alloc_pages_bulk_noprof()->__count_zid_vm_events()->zone_idx() The patch (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.com/) will add new hotspot paths, with the details as follows: __zone_watermark_ok()->zone_is_suitable()->zone_idx() zone_watermark_fast()->zone_is_suitable()->zone_idx() get_page_from_freelist()->zone_is_suitable()->zone_idx() __free_one_page()->zone_max_order()->zone_idx() Although The patch (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.com/) has not yet merged into the Linux mainline; it is already included in Android 15-6.6. > > > > > > Signed-off-by: gao xu <gaoxu2@honor.com> > > > --- > > > include/linux/mmzone.h | 3 ++- > > > mm/mm_init.c | 1 + > > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > > > index 4c95fcc9e..7b14f577d 100644 > > > --- a/include/linux/mmzone.h > > > +++ b/include/linux/mmzone.h > > > @@ -941,6 +941,7 @@ struct zone { > > > #endif > > > > > > const char *name; > > > + enum zone_type zone_idx; > > > > > > #ifdef CONFIG_MEMORY_ISOLATION > > > /* > > > @@ -1536,7 +1537,7 @@ static inline int local_memory_node(int node_id) > { return node_id; }; > > > /* > > > * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the ZONE_NORMAL > zone, etc. > > > */ > > > -#define zone_idx(zone) ((zone) - > (zone)->zone_pgdat->node_zones) > > > +#define zone_idx(zone) ((zone)->zone_idx) > > > > > > #ifdef CONFIG_ZONE_DEVICE > > > static inline bool zone_is_zone_device(struct zone *zone) > > > diff --git a/mm/mm_init.c b/mm/mm_init.c > > > index 9659689b8..a7f7264f1 100644 > > > --- a/mm/mm_init.c > > > +++ b/mm/mm_init.c > > > @@ -1425,6 +1425,7 @@ static void __meminit zone_init_internals(struct > zone *zone, enum zone_type idx, > > > atomic_long_set(&zone->managed_pages, remaining_pages); > > > zone_set_nid(zone, nid); > > > zone->name = zone_names[idx]; > > > + zone->zone_idx = idx; > > > zone->zone_pgdat = NODE_DATA(nid); > > > spin_lock_init(&zone->lock); > > > zone_seqlock_init(zone); > > > -- > > > 2.17.1 > > > > -- > > Sincerely yours, > > Mike. > > Thanks > Barry
On Sat, Apr 12, 2025 at 8:34 PM gaoxu <gaoxu2@honor.com> wrote: > > > > > On Fri, Apr 11, 2025 at 2:42 AM Mike Rapoport <rppt@kernel.org> wrote: > > > > > > Hi, > > > > > > On Thu, Apr 10, 2025 at 12:03:00PM +0000, gaoxu wrote: > > > > store zone_idx directly in struct zone to simplify and optimize zone_idx() > > > > > > Do you see an actual speed up somewhere? > Almost negligible. my simple code tests showed the patch provides an average improvement of ~0.02%. > Thus, in the Android 15-6.6 kernel, I confidently retained the original zone_idx function. > (https://android-review.googlesource.com/c/kernel/common/+/3578322/2/mm/page_alloc.c#770) > > This patch only eliminates 2-3 assembly instructions, making it challenging to > observe measurable performance benefits. > However, since the zone struct includes CACHELINE_PADDING (reserving unused space), > adding a new member variable does not alter the size of zone. This makes the patch > effectively zero-cost while achieving a cleaner implementation of zone_idx. The struct zone contains many CONFIG_ options to include or exclude certain fields. If we're confident that our new zone_idx doesn't increase cacheline usage for all those cases, this seems like a worthwhile cleanup. Have you checked the numbers? > > > > +1. Curious if there's data indicating zone_idx is a hot path. > There are several functions in the memory management code that are frequently > executed and will call zone_idx: > rmqueue()->wakeup_kswapd()->zone_idx() > alloc_pages_bulk_noprof()->__count_zid_vm_events()->zone_idx() > > The patch (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.com/) > will add new hotspot paths, with the details as follows: > __zone_watermark_ok()->zone_is_suitable()->zone_idx() > zone_watermark_fast()->zone_is_suitable()->zone_idx() > get_page_from_freelist()->zone_is_suitable()->zone_idx() > __free_one_page()->zone_max_order()->zone_idx() > > Although The patch (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.com/) > has not yet merged into the Linux mainline; it is already included in Android 15-6.6. > > > > > > > > > > Signed-off-by: gao xu <gaoxu2@honor.com> > > > > --- > > > > include/linux/mmzone.h | 3 ++- > > > > mm/mm_init.c | 1 + > > > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > > > > index 4c95fcc9e..7b14f577d 100644 > > > > --- a/include/linux/mmzone.h > > > > +++ b/include/linux/mmzone.h > > > > @@ -941,6 +941,7 @@ struct zone { > > > > #endif > > > > > > > > const char *name; > > > > + enum zone_type zone_idx; > > > > > > > > #ifdef CONFIG_MEMORY_ISOLATION > > > > /* > > > > @@ -1536,7 +1537,7 @@ static inline int local_memory_node(int node_id) > > { return node_id; }; > > > > /* > > > > * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the ZONE_NORMAL > > zone, etc. > > > > */ > > > > -#define zone_idx(zone) ((zone) - > > (zone)->zone_pgdat->node_zones) > > > > +#define zone_idx(zone) ((zone)->zone_idx) > > > > > > > > #ifdef CONFIG_ZONE_DEVICE > > > > static inline bool zone_is_zone_device(struct zone *zone) > > > > diff --git a/mm/mm_init.c b/mm/mm_init.c > > > > index 9659689b8..a7f7264f1 100644 > > > > --- a/mm/mm_init.c > > > > +++ b/mm/mm_init.c > > > > @@ -1425,6 +1425,7 @@ static void __meminit zone_init_internals(struct > > zone *zone, enum zone_type idx, > > > > atomic_long_set(&zone->managed_pages, remaining_pages); > > > > zone_set_nid(zone, nid); > > > > zone->name = zone_names[idx]; > > > > + zone->zone_idx = idx; > > > > zone->zone_pgdat = NODE_DATA(nid); > > > > spin_lock_init(&zone->lock); > > > > zone_seqlock_init(zone); > > > > -- > > > > 2.17.1 > > > > > > -- > > > Sincerely yours, > > > Mike. > > Thanks Barry
> > On Sat, Apr 12, 2025 at 8:34 PM gaoxu <gaoxu2@honor.com> wrote: > > > > > > > > On Fri, Apr 11, 2025 at 2:42 AM Mike Rapoport <rppt@kernel.org> wrote: > > > > > > > > Hi, > > > > > > > > On Thu, Apr 10, 2025 at 12:03:00PM +0000, gaoxu wrote: > > > > > store zone_idx directly in struct zone to simplify and optimize > > > > > zone_idx() > > > > > > > > Do you see an actual speed up somewhere? > > Almost negligible. my simple code tests showed the patch provides an average > improvement of ~0.02%. > > Thus, in the Android 15-6.6 kernel, I confidently retained the original zone_idx > function. > > (https://android-review.googlesource.com/c/kernel/common/+/3578322/2/m > > m/page_alloc.c#770) > > > > This patch only eliminates 2-3 assembly instructions, making it > > challenging to observe measurable performance benefits. > > However, since the zone struct includes CACHELINE_PADDING (reserving > > unused space), adding a new member variable does not alter the size of > > zone. This makes the patch effectively zero-cost while achieving a cleaner > implementation of zone_idx. > > The struct zone contains many CONFIG_ options to include or exclude certain > fields. > If we're confident that our new zone_idx doesn't increase cacheline usage for all > those cases, this seems like a worthwhile cleanup. Have you checked the > numbers? The zone info obtained through T32 in the Android 15-6.6 system(arm64): (struct zone) struct (1664 bytes, [0] unsigned long [4] _watermark, [32] unsigned long watermark_boost, [40] unsigned long nr_reserved_highatomic, [48] long [5] lowmem_reserve, [88] struct pglist_data * zone_pgdat, [96] struct per_cpu_pages * per_cpu_pageset, [104] struct per_cpu_zonestat * per_cpu_zonestats, [112] int pageset_high, [116] int pageset_batch, [120] unsigned long zone_start_pfn, [128] atomic_long_t managed_pages, [136] unsigned long spanned_pages, [144] unsigned long present_pages, [152] unsigned long present_early_pages, [160] unsigned long cma_pages, [168] const char * name, [176] unsigned long nr_isolate_pageblock, [184] seqlock_t span_seqlock, [192] int order, [196] int initialized, [256] struct cacheline_padding _pad1_, [256] struct free_area [11] free_area, [1400] unsigned long flags, [1408] spinlock_t lock, [1472] struct cacheline_padding _pad2_, [1472] unsigned long percpu_drift_mark, [1480] unsigned long compact_cached_free_pfn, [1488] unsigned long [2] compact_cached_migrate_pfn, [1504] unsigned long compact_init_migrate_pfn, [1512] unsigned long compact_init_free_pfn, [1520] unsigned int compact_considered, [1524] unsigned int compact_defer_shift, [1528] int compact_order_failed, [1532] bool compact_blockskip_flush, [1533] bool contiguous, [1536] struct cacheline_padding _pad3_, [1536] atomic_long_t [11] vm_stat, [1624] atomic_long_t [0] vm_numa_event) 1) It can be observed that there are 56B of free space in CACHELINE_PADDING(pad1); 2) Before the variables in CACHELINE_PADDING(pad1), there are two CONFIGs that are not enabled in Android 15-6.6: #ifdef CONFIG_NUMA int node; #endif #ifndef CONFIG_SPARSEMEM unsigned long *pageblock_flags; #endif /* CONFIG_SPARSEMEM */ These two CONFIGs occupy 16B. 3) Compared to the latest Linux code, two new variables, unsigned long nr_free_highatomic and int pageset_high_max, occupy an additional 16B; Based on the above analysis, there are still 24B of free space before CACHELINE_PADDING(pad1). (If I misunderstand, please point it out in a timely manner. Thank you!) It would be more appropriate to place the newly added variable zone_idx before initialized. > > > > > > > +1. Curious if there's data indicating zone_idx is a hot path. > > There are several functions in the memory management code that are > > frequently executed and will call zone_idx: > > rmqueue()->wakeup_kswapd()->zone_idx() > > alloc_pages_bulk_noprof()->__count_zid_vm_events()->zone_idx() > > > > The patch > > (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.co > > m/) will add new hotspot paths, with the details as follows: > > __zone_watermark_ok()->zone_is_suitable()->zone_idx() > > zone_watermark_fast()->zone_is_suitable()->zone_idx() > > get_page_from_freelist()->zone_is_suitable()->zone_idx() > > __free_one_page()->zone_max_order()->zone_idx() > > > > Although The patch > > (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.co > > m/) has not yet merged into the Linux mainline; it is already included > > in Android 15-6.6. > > > > > > > > > > > > > > Signed-off-by: gao xu <gaoxu2@honor.com> > > > > > --- > > > > > include/linux/mmzone.h | 3 ++- > > > > > mm/mm_init.c | 1 + > > > > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > > > > > index 4c95fcc9e..7b14f577d 100644 > > > > > --- a/include/linux/mmzone.h > > > > > +++ b/include/linux/mmzone.h > > > > > @@ -941,6 +941,7 @@ struct zone { #endif > > > > > > > > > > const char *name; > > > > > + enum zone_type zone_idx; > > > > > > > > > > #ifdef CONFIG_MEMORY_ISOLATION > > > > > /* > > > > > @@ -1536,7 +1537,7 @@ static inline int local_memory_node(int > > > > > node_id) > > > { return node_id; }; > > > > > /* > > > > > * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the > > > > > ZONE_NORMAL > > > zone, etc. > > > > > */ > > > > > -#define zone_idx(zone) ((zone) - > > > (zone)->zone_pgdat->node_zones) > > > > > +#define zone_idx(zone) ((zone)->zone_idx) > > > > > > > > > > #ifdef CONFIG_ZONE_DEVICE > > > > > static inline bool zone_is_zone_device(struct zone *zone) diff > > > > > --git a/mm/mm_init.c b/mm/mm_init.c index 9659689b8..a7f7264f1 > > > > > 100644 > > > > > --- a/mm/mm_init.c > > > > > +++ b/mm/mm_init.c > > > > > @@ -1425,6 +1425,7 @@ static void __meminit > > > > > zone_init_internals(struct > > > zone *zone, enum zone_type idx, > > > > > atomic_long_set(&zone->managed_pages, remaining_pages); > > > > > zone_set_nid(zone, nid); > > > > > zone->name = zone_names[idx]; > > > > > + zone->zone_idx = idx; > > > > > zone->zone_pgdat = NODE_DATA(nid); > > > > > spin_lock_init(&zone->lock); > > > > > zone_seqlock_init(zone); > > > > > -- > > > > > 2.17.1 > > > > > > > > -- > > > > Sincerely yours, > > > > Mike. > > > > > Thanks > Barry
On Sat, Apr 12, 2025 at 10:06 PM gaoxu <gaoxu2@honor.com> wrote: > > > > > On Sat, Apr 12, 2025 at 8:34 PM gaoxu <gaoxu2@honor.com> wrote: > > > > > > > > > > > On Fri, Apr 11, 2025 at 2:42 AM Mike Rapoport <rppt@kernel.org> wrote: > > > > > > > > > > Hi, > > > > > > > > > > On Thu, Apr 10, 2025 at 12:03:00PM +0000, gaoxu wrote: > > > > > > store zone_idx directly in struct zone to simplify and optimize > > > > > > zone_idx() > > > > > > > > > > Do you see an actual speed up somewhere? > > > Almost negligible. my simple code tests showed the patch provides an average > > improvement of ~0.02%. > > > Thus, in the Android 15-6.6 kernel, I confidently retained the original zone_idx > > function. > > > (https://android-review.googlesource.com/c/kernel/common/+/3578322/2/m > > > m/page_alloc.c#770) > > > > > > This patch only eliminates 2-3 assembly instructions, making it > > > challenging to observe measurable performance benefits. > > > However, since the zone struct includes CACHELINE_PADDING (reserving > > > unused space), adding a new member variable does not alter the size of > > > zone. This makes the patch effectively zero-cost while achieving a cleaner > > implementation of zone_idx. > > > > The struct zone contains many CONFIG_ options to include or exclude certain > > fields. > > If we're confident that our new zone_idx doesn't increase cacheline usage for all > > those cases, this seems like a worthwhile cleanup. Have you checked the > > numbers? > > The zone info obtained through T32 in the Android 15-6.6 system(arm64): > (struct zone) struct (1664 bytes, > [0] unsigned long [4] _watermark, > [32] unsigned long watermark_boost, > [40] unsigned long nr_reserved_highatomic, > [48] long [5] lowmem_reserve, > [88] struct pglist_data * zone_pgdat, > [96] struct per_cpu_pages * per_cpu_pageset, > [104] struct per_cpu_zonestat * per_cpu_zonestats, > [112] int pageset_high, > [116] int pageset_batch, > [120] unsigned long zone_start_pfn, > [128] atomic_long_t managed_pages, > [136] unsigned long spanned_pages, > [144] unsigned long present_pages, > [152] unsigned long present_early_pages, > [160] unsigned long cma_pages, > [168] const char * name, > [176] unsigned long nr_isolate_pageblock, > [184] seqlock_t span_seqlock, > [192] int order, > [196] int initialized, > [256] struct cacheline_padding _pad1_, > [256] struct free_area [11] free_area, > [1400] unsigned long flags, > [1408] spinlock_t lock, > [1472] struct cacheline_padding _pad2_, > [1472] unsigned long percpu_drift_mark, > [1480] unsigned long compact_cached_free_pfn, > [1488] unsigned long [2] compact_cached_migrate_pfn, > [1504] unsigned long compact_init_migrate_pfn, > [1512] unsigned long compact_init_free_pfn, > [1520] unsigned int compact_considered, > [1524] unsigned int compact_defer_shift, > [1528] int compact_order_failed, > [1532] bool compact_blockskip_flush, > [1533] bool contiguous, > [1536] struct cacheline_padding _pad3_, > [1536] atomic_long_t [11] vm_stat, > [1624] atomic_long_t [0] vm_numa_event) > > 1) It can be observed that there are 56B of free space in CACHELINE_PADDING(pad1); > 2) Before the variables in CACHELINE_PADDING(pad1), there are two CONFIGs that are not enabled in Android 15-6.6: > #ifdef CONFIG_NUMA > int node; > #endif > > #ifndef CONFIG_SPARSEMEM > unsigned long *pageblock_flags; > #endif /* CONFIG_SPARSEMEM */ > These two CONFIGs occupy 16B. > 3) Compared to the latest Linux code, two new variables, unsigned long nr_free_highatomic and int pageset_high_max, occupy an additional 16B; > Based on the above analysis, there are still 24B of free space before CACHELINE_PADDING(pad1). > (If I misunderstand, please point it out in a timely manner. Thank you!) > > It would be more appropriate to place the newly added variable zone_idx before initialized. I don't have a strong opinion on whether we need `zone_idx`—I'm okay with having it or not. If you'd like to add it, feel free to send out a v2 noting that it doesn't increase the struct size. If no one objects, it might be a nice cleanup. > > > > > > > > > > +1. Curious if there's data indicating zone_idx is a hot path. > > > There are several functions in the memory management code that are > > > frequently executed and will call zone_idx: > > > rmqueue()->wakeup_kswapd()->zone_idx() > > > alloc_pages_bulk_noprof()->__count_zid_vm_events()->zone_idx() > > > > > > The patch > > > (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.co > > > m/) will add new hotspot paths, with the details as follows: > > > __zone_watermark_ok()->zone_is_suitable()->zone_idx() > > > zone_watermark_fast()->zone_is_suitable()->zone_idx() > > > get_page_from_freelist()->zone_is_suitable()->zone_idx() > > > __free_one_page()->zone_max_order()->zone_idx() > > > > > > Although The patch > > > (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.co > > > m/) has not yet merged into the Linux mainline; it is already included > > > in Android 15-6.6. > > > > > > > > > > > > > > > > > > Signed-off-by: gao xu <gaoxu2@honor.com> > > > > > > --- > > > > > > include/linux/mmzone.h | 3 ++- > > > > > > mm/mm_init.c | 1 + > > > > > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > > > > > > index 4c95fcc9e..7b14f577d 100644 > > > > > > --- a/include/linux/mmzone.h > > > > > > +++ b/include/linux/mmzone.h > > > > > > @@ -941,6 +941,7 @@ struct zone { #endif > > > > > > > > > > > > const char *name; > > > > > > + enum zone_type zone_idx; > > > > > > > > > > > > #ifdef CONFIG_MEMORY_ISOLATION > > > > > > /* > > > > > > @@ -1536,7 +1537,7 @@ static inline int local_memory_node(int > > > > > > node_id) > > > > { return node_id; }; > > > > > > /* > > > > > > * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the > > > > > > ZONE_NORMAL > > > > zone, etc. > > > > > > */ > > > > > > -#define zone_idx(zone) ((zone) - > > > > (zone)->zone_pgdat->node_zones) > > > > > > +#define zone_idx(zone) ((zone)->zone_idx) > > > > > > > > > > > > #ifdef CONFIG_ZONE_DEVICE > > > > > > static inline bool zone_is_zone_device(struct zone *zone) diff > > > > > > --git a/mm/mm_init.c b/mm/mm_init.c index 9659689b8..a7f7264f1 > > > > > > 100644 > > > > > > --- a/mm/mm_init.c > > > > > > +++ b/mm/mm_init.c > > > > > > @@ -1425,6 +1425,7 @@ static void __meminit > > > > > > zone_init_internals(struct > > > > zone *zone, enum zone_type idx, > > > > > > atomic_long_set(&zone->managed_pages, remaining_pages); > > > > > > zone_set_nid(zone, nid); > > > > > > zone->name = zone_names[idx]; > > > > > > + zone->zone_idx = idx; > > > > > > zone->zone_pgdat = NODE_DATA(nid); > > > > > > spin_lock_init(&zone->lock); > > > > > > zone_seqlock_init(zone); > > > > > > -- > > > > > > 2.17.1 > > > > > > > > > > -- > > > > > Sincerely yours, > > > > > Mike. > > > > > > Thanks Barry
On Mon, Apr 14, 2025 at 09:57:26AM +1200, Barry Song wrote: > On Sat, Apr 12, 2025 at 10:06 PM gaoxu <gaoxu2@honor.com> wrote: > > The zone info obtained through T32 in the Android 15-6.6 system(arm64): > > (struct zone) struct (1664 bytes, > > I don't have a strong opinion on whether we need `zone_idx`—I'm okay > with having it or not. If you'd like to add it, feel free to send out > a v2 noting that it doesn't increase the struct size. If no one > objects, it might be a nice cleanup. Plus it's already 1664 bytes! And we have, what, 4 zones per NUMA node? Growing it doesn't feel like a big deal. Although "saves two assembly instructions" is also not exactly a big win. If it saved a cacheline reference, that might be more interesting, but it seems like it's more likely to introduce a cacheline reference than save one. Maybe just not worth doing?
> > On Mon, Apr 14, 2025 at 09:57:26AM +1200, Barry Song wrote: > > On Sat, Apr 12, 2025 at 10:06 PM gaoxu <gaoxu2@honor.com> wrote: > > > The zone info obtained through T32 in the Android 15-6.6 system(arm64): > > > (struct zone) struct (1664 bytes, > > > > I don't have a strong opinion on whether we need `zone_idx`—I'm okay > > with having it or not. If you'd like to add it, feel free to send out > > a v2 noting that it doesn't increase the struct size. If no one > > objects, it might be a nice cleanup. > > Plus it's already 1664 bytes! And we have, what, 4 zones per NUMA node? > Growing it doesn't feel like a big deal. Although "saves two assembly > instructions" is also not exactly a big win. If it saved a cacheline reference, > that might be more interesting, but it seems like it's more likely to introduce a > cacheline reference than save one. Maybe just not worth doing? Zone, zone_pgdat, and node_zones are all considered hot data; most of the time, they reside in the cache. In contrast, zone_idx in the patch is not hot data, and executing ((zone)->zone_idx) will add a new cache line. Am I understanding this correctly? If the heat of the zone_idx function increases, this modification will become worthwhile. For example, this patch will increase the heat of the zone_idx function. The patch (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.com/) will add new hotspot paths, with the details as follows: __zone_watermark_ok()->zone_is_suitable()->zone_idx() zone_watermark_fast()->zone_is_suitable()->zone_idx() get_page_from_freelist()->zone_is_suitable()->zone_idx() __free_one_page()->zone_max_order()->zone_idx()
On Tue, Apr 15, 2025 at 12:34:27PM +0000, gaoxu wrote: > > Growing it doesn't feel like a big deal. Although "saves two assembly > > instructions" is also not exactly a big win. If it saved a cacheline reference, > > that might be more interesting, but it seems like it's more likely to introduce a > > cacheline reference than save one. Maybe just not worth doing? > > Zone, zone_pgdat, and node_zones are all considered hot data; most of the time, > they reside in the cache. In contrast, zone_idx in the patch is not hot data, > and executing ((zone)->zone_idx) will add a new cache line. > Am I understanding this correctly? CPUs are not limited in the number of instructions they can execute today. For example the ARM X4 can execute 8 instructions per clock. Most of the time most of the CPU is idle, waiting on cache misses. A cache miss (all the way to memory) is about 100ns, so if said CPU is clocked at 3.4GHz, that's 2700 instructions that could be executed instead of waiting for that cacheline. Other top-end CPUs have similar numbers; you can find weak CPUs out there which have smaller ratios, but we don't tend to optimise for low-end CPUs. Therefore it is more important to avoid cacheline misses than it is to reduce the number of instructions executed. You haven't measured any improvement from your patch, so I think we should defaut to not changing anything.
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 4c95fcc9e..7b14f577d 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -941,6 +941,7 @@ struct zone { #endif const char *name; + enum zone_type zone_idx; #ifdef CONFIG_MEMORY_ISOLATION /* @@ -1536,7 +1537,7 @@ static inline int local_memory_node(int node_id) { return node_id; }; /* * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the ZONE_NORMAL zone, etc. */ -#define zone_idx(zone) ((zone) - (zone)->zone_pgdat->node_zones) +#define zone_idx(zone) ((zone)->zone_idx) #ifdef CONFIG_ZONE_DEVICE static inline bool zone_is_zone_device(struct zone *zone) diff --git a/mm/mm_init.c b/mm/mm_init.c index 9659689b8..a7f7264f1 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1425,6 +1425,7 @@ static void __meminit zone_init_internals(struct zone *zone, enum zone_type idx, atomic_long_set(&zone->managed_pages, remaining_pages); zone_set_nid(zone, nid); zone->name = zone_names[idx]; + zone->zone_idx = idx; zone->zone_pgdat = NODE_DATA(nid); spin_lock_init(&zone->lock); zone_seqlock_init(zone);
store zone_idx directly in struct zone to simplify and optimize zone_idx() Signed-off-by: gao xu <gaoxu2@honor.com> --- include/linux/mmzone.h | 3 ++- mm/mm_init.c | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)