Message ID | 1646172610-129397-2-git-send-email-jdamato@fastly.com |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | page_pool: Add stats counters |
On 01 Mar 14:10, Joe Damato wrote:
>Add per-pool statistics counters for the allocation path of a page pool.
>These stats are incremented in softirq context, so no locking or per-cpu
>variables are needed.
>
>This code is disabled by default and a kernel config option is provided for
>users who wish to enable them.
>

Sorry for the late review Joe,

Why disabled by default, if your benchmarks showed no diff?

IMHO, if we believe in this, we should have it enabled by default.

>The statistics added are:
>	- fast: successful fast path allocations
>	- slow: slow path order-0 allocations
>	- slow_high_order: slow path high order allocations
>	- empty: ptr ring is empty, so a slow path allocation was forced
>	- refill: an allocation which triggered a refill of the cache
>	- waive: pages obtained from the ptr ring that cannot be added to
>	  the cache due to a NUMA mismatch
>

Let's have this documented under the kernel documentation:
https://docs.kernel.org/networking/page_pool.html

I would also mention the kconfig and any user knobs or APIs introduced in
this series.
On Tue, Mar 1, 2022 at 3:50 PM Saeed Mahameed <saeed@kernel.org> wrote:
>
> On 01 Mar 14:10, Joe Damato wrote:
> >Add per-pool statistics counters for the allocation path of a page pool.
> >These stats are incremented in softirq context, so no locking or per-cpu
> >variables are needed.
> >
> >This code is disabled by default and a kernel config option is provided for
> >users who wish to enable them.
> >
>
> Sorry for the late review Joe,

No worries, thanks for taking a look.

> Why disabled by default, if your benchmarks showed no diff?
>
> IMHO, if we believe in this, we should have it enabled by default.

I think keeping it disabled by default makes sense for three reasons:
  - The benchmarks on my hardware don't show a difference, but less
    powerful hardware may be more heavily impacted.
  - The new code uses more memory when enabled, to store the stats.
  - These stats are useful for debugging and performance investigations,
    but generally speaking I think the vast majority of everyday kernel
    users won't be looking at this data.

Advanced users who need this information (and are willing to pay the
cost in memory and potentially CPU) can enable the code relatively
easily, so I think keeping it defaulted to off makes sense.

> >The statistics added are:
> >	- fast: successful fast path allocations
> >	- slow: slow path order-0 allocations
> >	- slow_high_order: slow path high order allocations
> >	- empty: ptr ring is empty, so a slow path allocation was forced
> >	- refill: an allocation which triggered a refill of the cache
> >	- waive: pages obtained from the ptr ring that cannot be added to
> >	  the cache due to a NUMA mismatch
> >
>
> Let's have this documented under the kernel documentation:
> https://docs.kernel.org/networking/page_pool.html
>
> I would also mention the kconfig and any user knobs or APIs introduced in
> this series.

Sure, I can add a doc commit in the v9 that explains the kernel config
option, the API, and the fields of the stats structures.

Thanks,
Joe
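For context, the opt-in Joe describes is the CONFIG_PAGE_POOL_STATS Kconfig
symbol added by the patch at the end of this thread; on a tree with the
series applied, enabling it is a one-line .config change (PAGE_POOL itself
has no prompt and is selected by the drivers that use it):

	CONFIG_PAGE_POOL_STATS=y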
On 01 Mar 17:51, Joe Damato wrote:
>On Tue, Mar 1, 2022 at 3:50 PM Saeed Mahameed <saeed@kernel.org> wrote:
>>
>> On 01 Mar 14:10, Joe Damato wrote:
>> >Add per-pool statistics counters for the allocation path of a page pool.
>> >These stats are incremented in softirq context, so no locking or per-cpu
>> >variables are needed.
>> >
>> >This code is disabled by default and a kernel config option is provided for
>> >users who wish to enable them.
>> >
>>
>> Sorry for the late review Joe,
>
>No worries, thanks for taking a look.
>
>> Why disabled by default, if your benchmarks showed no diff?
>>
>> IMHO, if we believe in this, we should have it enabled by default.
>
>I think keeping it disabled by default makes sense for three reasons:
>  - The benchmarks on my hardware don't show a difference, but less
>    powerful hardware may be more heavily impacted.
>  - The new code uses more memory when enabled, to store the stats.
>  - These stats are useful for debugging and performance investigations,
>    but generally speaking I think the vast majority of everyday kernel
>    users won't be looking at this data.
>
>Advanced users who need this information (and are willing to pay the
>cost in memory and potentially CPU) can enable the code relatively
>easily, so I think keeping it defaulted to off makes sense.

I partially agree, since we have other means to detect whether a page_pool
is effective or not without these stats. But here is my .02$: the
difference in performance between when the page_pool is effective and when
it isn't is huge, and these counters are useful on production systems when
the page pool is under stress.

Simply put, the benefits of the page_pool outweigh the overhead of counting
(if it is even measurable), so these counters should've been added a long
time ago.

If you are aiming for debug-only counters, then you should've introduced
this feature as a static key (like tracepoints do), so it can be enabled on
the fly and the overhead is paid only when enabled for the debug period.

Anyway, not a huge deal :).
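For illustration, a minimal sketch of the static-key alternative Saeed
describes; this is not part of the posted series, the key name and toggle
helper are hypothetical, and the alloc_stats field would have to exist
unconditionally (the storage is always compiled in, but the increment sits
behind a jump label that costs only a NOP until enabled):

	#include <linux/jump_label.h>

	DEFINE_STATIC_KEY_FALSE(page_pool_stats_key);	/* hypothetical key */

	/* Patched out of the hot path until the key is flipped on */
	#define alloc_stat_inc(pool, __stat)				\
		do {							\
			if (static_branch_unlikely(&page_pool_stats_key)) \
				(pool)->alloc_stats.__stat++;		\
		} while (0)

	/* Hypothetical debugfs/module-param hook to toggle stats at runtime */
	static void page_pool_stats_set(bool on)
	{
		if (on)
			static_branch_enable(&page_pool_stats_key);
		else
			static_branch_disable(&page_pool_stats_key);
	}

The tradeoff relative to the posted Kconfig approach is memory: the stats
storage is always present, but the CPU cost is paid only while the key is
enabled.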
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 97c3c19..1f27e8a4 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -84,6 +84,19 @@ struct page_pool_params {
 	void *init_arg;
 };
 
+#ifdef CONFIG_PAGE_POOL_STATS
+struct page_pool_alloc_stats {
+	u64 fast; /* fast path allocations */
+	u64 slow; /* slow-path order 0 allocations */
+	u64 slow_high_order; /* slow-path high order allocations */
+	u64 empty; /* failed refills due to empty ptr ring, forcing
+		    * slow path allocation
+		    */
+	u64 refill; /* allocations via successful refill */
+	u64 waive; /* failed refills due to numa zone mismatch */
+};
+#endif
+
 struct page_pool {
 	struct page_pool_params p;
 
@@ -96,6 +109,11 @@ struct page_pool {
 	unsigned int frag_offset;
 	struct page *frag_page;
 	long frag_users;
+
+#ifdef CONFIG_PAGE_POOL_STATS
+	/* these stats are incremented while in softirq context */
+	struct page_pool_alloc_stats alloc_stats;
+#endif
 	u32 xdp_mem_id;
 
 	/*
diff --git a/net/Kconfig b/net/Kconfig
index 8a1f9d0..6b78f69 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -434,6 +434,19 @@ config NET_DEVLINK
 config PAGE_POOL
 	bool
 
+config PAGE_POOL_STATS
+	default n
+	bool "Page pool stats"
+	depends on PAGE_POOL
+	help
+	  Enable page pool statistics to track page allocation and recycling
+	  in page pools. This option incurs additional CPU cost in allocation
+	  and recycle paths and additional memory cost to store the statistics.
+	  These statistics are only available if this option is enabled and if
+	  the driver using the page pool supports exporting this data.
+
+	  If unsure, say N.
+
 config FAILOVER
 	tristate "Generic failover module"
 	help
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index e25d359..0fa4b76 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -26,6 +26,13 @@
 
 #define BIAS_MAX	LONG_MAX
 
+#ifdef CONFIG_PAGE_POOL_STATS
+/* alloc_stat_inc is intended to be used in softirq context */
+#define alloc_stat_inc(pool, __stat)	(pool->alloc_stats.__stat++)
+#else
+#define alloc_stat_inc(pool, __stat)
+#endif
+
 static int page_pool_init(struct page_pool *pool,
 			  const struct page_pool_params *params)
 {
@@ -117,8 +124,10 @@ static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
 	int pref_nid; /* preferred NUMA node */
 
 	/* Quicker fallback, avoid locks when ring is empty */
-	if (__ptr_ring_empty(r))
+	if (__ptr_ring_empty(r)) {
+		alloc_stat_inc(pool, empty);
 		return NULL;
+	}
 
 	/* Softirq guarantee CPU and thus NUMA node is stable. This,
 	 * assumes CPU refilling driver RX-ring will also run RX-NAPI.
@@ -145,14 +154,17 @@ static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
 			 * This limit stress on page buddy alloactor.
 			 */
 			page_pool_return_page(pool, page);
+			alloc_stat_inc(pool, waive);
 			page = NULL;
 			break;
 		}
 	} while (pool->alloc.count < PP_ALLOC_CACHE_REFILL);
 
 	/* Return last page */
-	if (likely(pool->alloc.count > 0))
+	if (likely(pool->alloc.count > 0)) {
 		page = pool->alloc.cache[--pool->alloc.count];
+		alloc_stat_inc(pool, refill);
+	}
 
 	return page;
 }
@@ -166,6 +178,7 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
 	if (likely(pool->alloc.count)) {
 		/* Fast-path */
 		page = pool->alloc.cache[--pool->alloc.count];
+		alloc_stat_inc(pool, fast);
 	} else {
 		page = page_pool_refill_alloc_cache(pool);
 	}
@@ -239,6 +252,7 @@ static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
 		return NULL;
 	}
 
+	alloc_stat_inc(pool, slow_high_order);
 	page_pool_set_pp_info(pool, page);
 
 	/* Track how many pages are held 'in-flight' */
@@ -293,10 +307,12 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	}
 
 	/* Return last page */
-	if (likely(pool->alloc.count > 0))
+	if (likely(pool->alloc.count > 0)) {
 		page = pool->alloc.cache[--pool->alloc.count];
-	else
+		alloc_stat_inc(pool, slow);
+	} else {
 		page = NULL;
+	}
 
 	/* When page just alloc'ed is should/must have refcnt 1. */
 	return page;
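In this version the counters are plain fields of struct page_pool, so a
driver that owns the pool can read them directly, for example from its
ethtool stats callback. A minimal sketch, assuming CONFIG_PAGE_POOL_STATS=y;
the helper name is hypothetical:

	#include <net/page_pool.h>

	#ifdef CONFIG_PAGE_POOL_STATS
	/* Hypothetical helper: total successful allocations served by 'pool',
	 * i.e. the fast-path counter plus both slow-path counters.
	 */
	static u64 my_drv_pool_alloc_total(const struct page_pool *pool)
	{
		const struct page_pool_alloc_stats *s = &pool->alloc_stats;

		return s->fast + s->slow + s->slow_high_order;
	}
	#endif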