[2/3] mm/page_alloc: Add trace event for per-zone lowmem reserve setup

Message ID 20250303073537.2264323-3-liumartin@google.com (mailing list archive)
State New
Series Add tracepoints for lowmem reserves, watermarks and totalreserve_pages

Commit Message

Martin Liu March 3, 2025, 7:35 a.m. UTC
This commit introduces the `mm_setup_per_zone_lowmem_reserve` trace
event, which provides detailed insights into the kernel's per-zone lowmem
reserve configuration.

The trace event provides precise timestamps, allowing developers to:

1. Correlate lowmem reserve changes with specific kernel events, and
diagnose unexpected kswapd or direct reclaim behavior triggered by
dynamic changes in the lowmem reserve.

2. Identify memory allocation failures that occur due to an insufficient
lowmem reserve, by precisely correlating allocation attempts with
reserve adjustments.

Signed-off-by: Martin Liu <liumartin@google.com>
---
 include/trace/events/kmem.h | 27 +++++++++++++++++++++++++++
 mm/page_alloc.c             |  2 ++
 2 files changed, 29 insertions(+)
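
To illustrate the correlation use case described in the commit message, here is a minimal user-space sketch that parses the payload of one event line. It assumes the line ends with the format string given in the patch's TP_printk(); the struct `lowmem_reserve_event` and the function `parse_lowmem_reserve_event()` are hypothetical names introduced for this example and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for this sketch; not part of the patch. */
struct lowmem_reserve_event {
	int node_id;
	char zone[16];
	char upper_zone[16];
	long lowmem_reserve_pages;
};

/*
 * Parse the payload of one mm_setup_per_zone_lowmem_reserve line, using
 * the format string from the patch's TP_printk(). Returns 0 on success,
 * -1 if the payload is missing or does not match.
 */
static int parse_lowmem_reserve_event(const char *line,
				      struct lowmem_reserve_event *ev)
{
	const char *payload;

	assert(line != NULL && ev != NULL);

	/* Skip whatever prefix (task, CPU, timestamp) precedes the payload. */
	payload = strstr(line, "node_id=");
	if (!payload)
		return -1;

	/* %15s leaves room for the terminating NUL in the 16-byte buffers. */
	if (sscanf(payload,
		   "node_id=%d zone name=%15s upper_zone name=%15s lowmem_reserve_pages=%ld",
		   &ev->node_id, ev->zone, ev->upper_zone,
		   &ev->lowmem_reserve_pages) != 4)
		return -1;

	return 0;
}
```

A caller could feed this function each line read from the trace buffer and pair the parsed values with the timestamp in the line prefix. Since the event is defined in include/trace/events/kmem.h, it would presumably be grouped under the kmem event directory in tracefs once the patch is applied.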

Comments

Kalesh Singh March 3, 2025, 8:18 a.m. UTC | #1
On Sun, Mar 2, 2025 at 11:36 PM Martin Liu <liumartin@google.com> wrote:
>
> This commit introduces the `mm_setup_per_zone_lowmem_reserve` trace
> event, which provides detailed insights into the kernel's per-zone lowmem
> reserve configuration.
>
> The trace event provides precise timestamps, allowing developers to:
>
> 1. Correlate lowmem reserve changes with specific kernel events, and
> diagnose unexpected kswapd or direct reclaim behavior triggered by
> dynamic changes in the lowmem reserve.
>
> 2. Identify memory allocation failures that occur due to an insufficient
> lowmem reserve, by precisely correlating allocation attempts with
> reserve adjustments.
>
> Signed-off-by: Martin Liu <liumartin@google.com>
> ---
>  include/trace/events/kmem.h | 27 +++++++++++++++++++++++++++
>  mm/page_alloc.c             |  2 ++
>  2 files changed, 29 insertions(+)
>
> diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> index 5fd392dae503..9623e68d4d26 100644
> --- a/include/trace/events/kmem.h
> +++ b/include/trace/events/kmem.h
> @@ -375,6 +375,33 @@ TRACE_EVENT(mm_setup_per_zone_wmarks,
>                   __entry->watermark_promo)
>  );
>
> +TRACE_EVENT(mm_setup_per_zone_lowmem_reserve,
> +
> +       TP_PROTO(struct zone *zone, struct zone *upper_zone, long lowmem_reserve),
> +
> +       TP_ARGS(zone, upper_zone, lowmem_reserve),
> +
> +       TP_STRUCT__entry(
> +               __field(int, node_id)
> +               __string(name, zone->name)
> +               __string(upper_name, upper_zone->name)
> +               __field(long, lowmem_reserve)
> +       ),
> +
> +       TP_fast_assign(
> +               __entry->node_id = zone->zone_pgdat->node_id;
> +               __assign_str(name);
> +               __assign_str(upper_name);
> +               __entry->lowmem_reserve = lowmem_reserve;
> +       ),
> +
> +       TP_printk("node_id=%d zone name=%s upper_zone name=%s lowmem_reserve_pages=%ld",
> +                 __entry->node_id,
> +                 __get_str(name),
> +                 __get_str(upper_name),
> +                 __entry->lowmem_reserve)
> +);
> +
>  /*
>   * Required for uniquely and securely identifying mm in rss_stat tracepoint.
>   */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 50893061db66..48623a2bf1ac 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5857,6 +5857,8 @@ static void setup_per_zone_lowmem_reserve(void)
>                                         zone->lowmem_reserve[j] = 0;
>                                 else
>                                         zone->lowmem_reserve[j] = managed_pages / ratio;
> +                               trace_mm_setup_per_zone_lowmem_reserve(zone, upper_zone,
> +                                                                                                          zone->lowmem_reserve[j]);

Hi Martin,

Please use 8-character width tabs for indentation.

-- Kalesh
>                         }
>                 }
>         }
> --
> 2.48.1.711.g2feabab25a-goog
>
>
Martin Liu March 3, 2025, 10:11 a.m. UTC | #2
On Mon, Mar 03, 2025 at 12:18:46AM -0800, Kalesh Singh wrote:
> On Sun, Mar 2, 2025 at 11:36 PM Martin Liu <liumartin@google.com> wrote:
> >
> > This commit introduces the `mm_setup_per_zone_lowmem_reserve` trace
> > event, which provides detailed insights into the kernel's per-zone lowmem
> > reserve configuration.
> >
> > The trace event provides precise timestamps, allowing developers to:
> >
> > 1. Correlate lowmem reserve changes with specific kernel events, and
> > diagnose unexpected kswapd or direct reclaim behavior triggered by
> > dynamic changes in the lowmem reserve.
> >
> > 2. Identify memory allocation failures that occur due to an insufficient
> > lowmem reserve, by precisely correlating allocation attempts with
> > reserve adjustments.
> >
> > Signed-off-by: Martin Liu <liumartin@google.com>
> > ---
> >  include/trace/events/kmem.h | 27 +++++++++++++++++++++++++++
> >  mm/page_alloc.c             |  2 ++
> >  2 files changed, 29 insertions(+)
> >
> > diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
> > index 5fd392dae503..9623e68d4d26 100644
> > --- a/include/trace/events/kmem.h
> > +++ b/include/trace/events/kmem.h
> > @@ -375,6 +375,33 @@ TRACE_EVENT(mm_setup_per_zone_wmarks,
> >                   __entry->watermark_promo)
> >  );
> >
> > +TRACE_EVENT(mm_setup_per_zone_lowmem_reserve,
> > +
> > +       TP_PROTO(struct zone *zone, struct zone *upper_zone, long lowmem_reserve),
> > +
> > +       TP_ARGS(zone, upper_zone, lowmem_reserve),
> > +
> > +       TP_STRUCT__entry(
> > +               __field(int, node_id)
> > +               __string(name, zone->name)
> > +               __string(upper_name, upper_zone->name)
> > +               __field(long, lowmem_reserve)
> > +       ),
> > +
> > +       TP_fast_assign(
> > +               __entry->node_id = zone->zone_pgdat->node_id;
> > +               __assign_str(name);
> > +               __assign_str(upper_name);
> > +               __entry->lowmem_reserve = lowmem_reserve;
> > +       ),
> > +
> > +       TP_printk("node_id=%d zone name=%s upper_zone name=%s lowmem_reserve_pages=%ld",
> > +                 __entry->node_id,
> > +                 __get_str(name),
> > +                 __get_str(upper_name),
> > +                 __entry->lowmem_reserve)
> > +);
> > +
> >  /*
> >   * Required for uniquely and securely identifying mm in rss_stat tracepoint.
> >   */
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 50893061db66..48623a2bf1ac 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5857,6 +5857,8 @@ static void setup_per_zone_lowmem_reserve(void)
> >                                         zone->lowmem_reserve[j] = 0;
> >                                 else
> >                                         zone->lowmem_reserve[j] = managed_pages / ratio;
> > +                               trace_mm_setup_per_zone_lowmem_reserve(zone, upper_zone,
> > +                                                                                                          zone->lowmem_reserve[j]);
> 
> Hi Martin,
> 
> Please use 8-character width tabs for indentation.

Hi Kalesh,

Yes, thank you for the reminder. I will address this once I receive
further feedback :)

> 
> -- Kalesh
> >                         }
> >                 }
> >         }
> > --
> > 2.48.1.711.g2feabab25a-goog
> >
> >

Patch

diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index 5fd392dae503..9623e68d4d26 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -375,6 +375,33 @@  TRACE_EVENT(mm_setup_per_zone_wmarks,
 		  __entry->watermark_promo)
 );
 
+TRACE_EVENT(mm_setup_per_zone_lowmem_reserve,
+
+	TP_PROTO(struct zone *zone, struct zone *upper_zone, long lowmem_reserve),
+
+	TP_ARGS(zone, upper_zone, lowmem_reserve),
+
+	TP_STRUCT__entry(
+		__field(int, node_id)
+		__string(name, zone->name)
+		__string(upper_name, upper_zone->name)
+		__field(long, lowmem_reserve)
+	),
+
+	TP_fast_assign(
+		__entry->node_id = zone->zone_pgdat->node_id;
+		__assign_str(name);
+		__assign_str(upper_name);
+		__entry->lowmem_reserve = lowmem_reserve;
+	),
+
+	TP_printk("node_id=%d zone name=%s upper_zone name=%s lowmem_reserve_pages=%ld",
+		  __entry->node_id,
+		  __get_str(name),
+		  __get_str(upper_name),
+		  __entry->lowmem_reserve)
+);
+
 /*
  * Required for uniquely and securely identifying mm in rss_stat tracepoint.
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 50893061db66..48623a2bf1ac 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5857,6 +5857,8 @@  static void setup_per_zone_lowmem_reserve(void)
 					zone->lowmem_reserve[j] = 0;
 				else
 					zone->lowmem_reserve[j] = managed_pages / ratio;
+				trace_mm_setup_per_zone_lowmem_reserve(zone, upper_zone,
+													   zone->lowmem_reserve[j]);
 			}
 		}
 	}
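
The mm/page_alloc.c hunk above places the tracepoint inside the innermost loop, so one event would be emitted per (zone, upper_zone) pair. A rough user-space model of that placement follows. Only the `managed_pages / ratio` assignment, the zeroing branch, and the trace call site come from the hunk; the accumulation of upper-zone pages, the `clear` condition, the three-zone layout, and every `model_*` name are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_ZONES 3	/* hypothetical zone count for this sketch */

/* Hypothetical, simplified stand-in for struct zone. */
struct model_zone {
	const char *name;
	unsigned long managed_pages;
	long lowmem_reserve[NR_ZONES];
};

/* Stand-in for the tracepoint: prints with the patch's TP_printk() format. */
static void model_trace(int node_id, const struct model_zone *zone,
			const struct model_zone *upper_zone,
			long lowmem_reserve)
{
	printf("node_id=%d zone name=%s upper_zone name=%s lowmem_reserve_pages=%ld\n",
	       node_id, zone->name, upper_zone->name, lowmem_reserve);
}

/*
 * Model of the reserve setup for one node. For each zone i, walk the
 * higher zones j, accumulate their managed pages (assumed behaviour),
 * and derive the reserve as in the hunk: managed_pages / ratio, or 0.
 */
static void model_setup_lowmem_reserve(int node_id,
				       struct model_zone zones[NR_ZONES],
				       const int ratio[NR_ZONES])
{
	for (int i = 0; i < NR_ZONES - 1; i++) {
		struct model_zone *zone = &zones[i];
		/* Assumed condition: no ratio or an empty zone clears the reserve. */
		bool clear = !ratio[i] || !zone->managed_pages;
		unsigned long managed_pages = 0;

		assert(ratio[i] >= 0);

		for (int j = i + 1; j < NR_ZONES; j++) {
			struct model_zone *upper_zone = &zones[j];

			managed_pages += upper_zone->managed_pages;

			if (clear)
				zone->lowmem_reserve[j] = 0;
			else
				zone->lowmem_reserve[j] =
					managed_pages / (unsigned long)ratio[i];

			/* Same position as the trace call added by the patch. */
			model_trace(node_id, zone, upper_zone,
				    zone->lowmem_reserve[j]);
		}
	}
}
```

Under these assumptions, a node with three zones would produce three events per recalculation: two for the lowest zone and one for the middle zone.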