Message ID | 20231123235949.421106-1-kent.overstreet@linux.dev (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | rhashtable: Better error message on allocation failure | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
From: Kent Overstreet > Sent: 24 November 2023 00:00 > > Memory allocation failures print backtraces by default, but when we're > running out of a rhashtable worker the backtrace is useless - it doesn't > tell us which hashtable the allocation failure was for. > > This adds a dedicated warning that prints out functions from the > rhashtable params, which will be a bit more useful. > > Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> > Cc: Thomas Graf <tgraf@suug.ch> > Cc: Herbert Xu <herbert@gondor.apana.org.au> > Acked-by: Herbert Xu <herbert@gondor.apana.org.au> > --- > lib/rhashtable.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/lib/rhashtable.c b/lib/rhashtable.c > index 6ae2ba8e06a2..d3fce9c8989a 100644 > --- a/lib/rhashtable.c > +++ b/lib/rhashtable.c > @@ -360,9 +360,14 @@ static int rhashtable_rehash_alloc(struct rhashtable *ht, > > ASSERT_RHT_MUTEX(ht); > > - new_tbl = bucket_table_alloc(ht, size, GFP_KERNEL); > - if (new_tbl == NULL) > + new_tbl = bucket_table_alloc(ht, size, GFP_KERNEL|__GFP_NOWARN); > + if (new_tbl == NULL) { > + WARN("rhashtable bucket table allocation failure for %ps", Won't WARN() be a panic on systems with PANICK_ON_WARN set? > + (void *) ht->p.hashfn ?: > + (void *) ht->p.obj_hashfn ?: > + (void *) ht->p.obj_cmpfn); That layout is horrid (and I bet checkpatch complains). You only actually need one (void *) cast on the RH value: ht->p.hashfn ?: ht->p.obj_hashfn ?: (void *)ht->p.obj_cmpfn David > return -ENOMEM; > + } > > err = rhashtable_rehash_attach(ht, old_tbl, new_tbl); > if (err) > -- > 2.42.0 > - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Sat, 25 Nov 2023 15:23:49 +0000 David Laight wrote: > > + new_tbl = bucket_table_alloc(ht, size, GFP_KERNEL|__GFP_NOWARN); > > + if (new_tbl == NULL) { > > + WARN("rhashtable bucket table allocation failure for %ps", > > Won't WARN() be a panic on systems with PANICK_ON_WARN set? Yes, that's problematic :( Let's leave out the GFP_NOWARN and add a pr_warn() instead of the WARN()?
On Tue, Nov 28, 2023 at 05:35:36PM -0800, Jakub Kicinski wrote: > On Sat, 25 Nov 2023 15:23:49 +0000 David Laight wrote: > > > + new_tbl = bucket_table_alloc(ht, size, GFP_KERNEL|__GFP_NOWARN); > > > + if (new_tbl == NULL) { > > > + WARN("rhashtable bucket table allocation failure for %ps", > > > > Won't WARN() be a panic on systems with PANICK_ON_WARN set? > > Yes, that's problematic :( > Let's leave out the GFP_NOWARN and add a pr_warn() instead of > the WARN()? pr_warn() instead of WARN() is fine, but the stack trace from warn_alloc() will be entirely useless. Perhaps if we had a GFP flag to just suppress the backtrace in warn_alloc() - we could even stash a backtrace in the rhashtable at rhashtable_init() time, if we want to print out a more useful one.
On Tue, 28 Nov 2023 20:57:05 -0500 Kent Overstreet wrote: > > Yes, that's problematic :( > > Let's leave out the GFP_NOWARN and add a pr_warn() instead of > > the WARN()? > > pr_warn() instead of WARN() is fine, but the stack trace from > warn_alloc() will be entirely useless. > > Perhaps if we had a GFP flag to just suppress the backtrace in > warn_alloc() - we could even stash a backtrace in the rhashtable at > rhashtable_init() time, if we want to print out a more useful one. Interesting idea, up to you how far down the rabbit hole you're willing to go, really :) Stating the obvious but would be good to add to the commit message, if you decide to implement this, how many rht instances there are on a sample system, IOW how much memory we expect the stacks to burn.
From: Jakub Kicinski > Sent: 29 November 2023 02:15 > > On Tue, 28 Nov 2023 20:57:05 -0500 Kent Overstreet wrote: > > > Yes, that's problematic :( > > > Let's leave out the GFP_NOWARN and add a pr_warn() instead of > > > the WARN()? > > > > pr_warn() instead of WARN() is fine, but the stack trace from > > warn_alloc() will be entirely useless. > > > > Perhaps if we had a GFP flag to just suppress the backtrace in > > warn_alloc() - we could even stash a backtrace in the rhashtable at > > rhashtable_init() time, if we want to print out a more useful one. > > Interesting idea, up to you how far down the rabbit hole you're > willing to go, really :) > > Stating the obvious but would be good to add to the commit message, > if you decide to implement this, how many rht instances there are > on a sample system, IOW how much memory we expect the stacks to burn. It's not really memory, just 'junk' in the console buffer. But completely supressing the traceback from warn_alloc() (et al) would give absolutely no indication of what failed. You wouldn't even know it was a call from rhashtable(). Actually I'd have thought the traceback would show where the hash table was being allocated - which would (mostly) tell you which one. (Unless they were being allocated by a worker thread - which seems unlikely.) IIRC the traceback includes the cpu registers (and code??) they probably are noise for 'controlled' errors like kmalloc failing. Is the whole extra trace really worthwhile at all? How often does kmalloc() really fail? Although if the code recovers by using a smaller table then maybe that might be worth a trace instead of the one from kmalloc(). David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
diff --git a/lib/rhashtable.c b/lib/rhashtable.c index 6ae2ba8e06a2..d3fce9c8989a 100644 --- a/lib/rhashtable.c +++ b/lib/rhashtable.c @@ -360,9 +360,14 @@ static int rhashtable_rehash_alloc(struct rhashtable *ht, ASSERT_RHT_MUTEX(ht); - new_tbl = bucket_table_alloc(ht, size, GFP_KERNEL); - if (new_tbl == NULL) + new_tbl = bucket_table_alloc(ht, size, GFP_KERNEL|__GFP_NOWARN); + if (new_tbl == NULL) { + WARN("rhashtable bucket table allocation failure for %ps", + (void *) ht->p.hashfn ?: + (void *) ht->p.obj_hashfn ?: + (void *) ht->p.obj_cmpfn); return -ENOMEM; + } err = rhashtable_rehash_attach(ht, old_tbl, new_tbl); if (err)