Message ID | 20240307192704.37213-1-tony.luck@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v3] x86/mce: Dynamically size space for machine check records | expand |
On 3/7/2024 13:27, Tony Luck wrote: > Systems with a large number of CPUs may generate a large > number of machine check records when things go seriously > wrong. But Linux has a fixed buffer that can only capture > a few dozen errors. > > Allocate space based on the number of CPUs (with a minimum > value based on the historical fixed buffer that could store > 80 records). > > Signed-off-by: Tony Luck <tony.luck@intel.com> > Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> > --- > > Changes since V2; Link: https://lore.kernel.org/all/20240307000256.34352-1-tony.luck@intel.com/ > > Boris: Eliminate "out:" label in mce_gen_pool_create() > > Sohil: Added Reviewed-by tag > > arch/x86/kernel/cpu/mce/genpool.c | 28 +++++++++++++++++++--------- > 1 file changed, 19 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c > index fbe8b61c3413..cadf28662a70 100644 > --- a/arch/x86/kernel/cpu/mce/genpool.c > +++ b/arch/x86/kernel/cpu/mce/genpool.c > @@ -16,14 +16,14 @@ > * used to save error information organized in a lock-less list. > * > * This memory pool is only to be used to save MCE records in MCE context. > - * MCE events are rare, so a fixed size memory pool should be enough. Use > - * 2 pages to save MCE events for now (~80 MCE records at most). > + * MCE events are rare, so a fixed size memory pool should be enough. > + * Allocate on a sliding scale based on number of CPUs. > */ > -#define MCE_POOLSZ (2 * PAGE_SIZE) > +#define MCE_MIN_ENTRIES 80 > +#define MCE_PER_CPU 2 > > static struct gen_pool *mce_evt_pool; > static LLIST_HEAD(mce_event_llist); > -static char gen_pool_buf[MCE_POOLSZ]; > > /* > * Compare the record "t" with each of the records on list "l" to see if > @@ -118,22 +118,32 @@ int mce_gen_pool_add(struct mce *mce) > > static int mce_gen_pool_create(void) > { > + int mce_numrecords, mce_poolsz, order; > struct gen_pool *tmpp; > int ret = -ENOMEM; > + void *mce_pool; > > - tmpp = gen_pool_create(ilog2(sizeof(struct mce_evt_llist)), -1); > + order = order_base_2(sizeof(struct mce_evt_llist)); > + tmpp = gen_pool_create(order, -1); > if (!tmpp) > - goto out; > + return ret; > > - ret = gen_pool_add(tmpp, (unsigned long)gen_pool_buf, MCE_POOLSZ, -1); > + mce_numrecords = max(MCE_MIN_ENTRIES, num_possible_cpus() * MCE_PER_CPU); > + mce_poolsz = mce_numrecords * (1 << order); > + mce_pool = kmalloc(mce_poolsz, GFP_KERNEL); > + if (!mce_pool) { > + gen_pool_destroy(tmpp); > + return ret; > + } > + ret = gen_pool_add(tmpp, (unsigned long)mce_pool, mce_poolsz, -1); > if (ret) { > gen_pool_destroy(tmpp); > - goto out; > + kfree(mce_pool); > + return ret; > } > > mce_evt_pool = tmpp; > > -out: > return ret; > } > > > base-commit: d206a76d7d2726f3b096037f2079ce0bd3ba329b LGTM! Reviewed-by: Avadhut Naik <avadhut.naik@amd.com>
On Thu, Mar 07, 2024 at 11:27:04AM -0800, Tony Luck wrote: > - ret = gen_pool_add(tmpp, (unsigned long)gen_pool_buf, MCE_POOLSZ, -1); > + mce_numrecords = max(MCE_MIN_ENTRIES, num_possible_cpus() * MCE_PER_CPU); > + mce_poolsz = mce_numrecords * (1 << order); So, on a big fat machine with 8K CPUs that's, what 8192 * 2 * (1 << 8) = ~4M buffer? Well, if you have a fat machine, you get fat buffers too.
diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c index fbe8b61c3413..cadf28662a70 100644 --- a/arch/x86/kernel/cpu/mce/genpool.c +++ b/arch/x86/kernel/cpu/mce/genpool.c @@ -16,14 +16,14 @@ * used to save error information organized in a lock-less list. * * This memory pool is only to be used to save MCE records in MCE context. - * MCE events are rare, so a fixed size memory pool should be enough. Use - * 2 pages to save MCE events for now (~80 MCE records at most). + * MCE events are rare, so a fixed size memory pool should be enough. + * Allocate on a sliding scale based on number of CPUs. */ -#define MCE_POOLSZ (2 * PAGE_SIZE) +#define MCE_MIN_ENTRIES 80 +#define MCE_PER_CPU 2 static struct gen_pool *mce_evt_pool; static LLIST_HEAD(mce_event_llist); -static char gen_pool_buf[MCE_POOLSZ]; /* * Compare the record "t" with each of the records on list "l" to see if @@ -118,22 +118,32 @@ int mce_gen_pool_add(struct mce *mce) static int mce_gen_pool_create(void) { + int mce_numrecords, mce_poolsz, order; struct gen_pool *tmpp; int ret = -ENOMEM; + void *mce_pool; - tmpp = gen_pool_create(ilog2(sizeof(struct mce_evt_llist)), -1); + order = order_base_2(sizeof(struct mce_evt_llist)); + tmpp = gen_pool_create(order, -1); if (!tmpp) - goto out; + return ret; - ret = gen_pool_add(tmpp, (unsigned long)gen_pool_buf, MCE_POOLSZ, -1); + mce_numrecords = max(MCE_MIN_ENTRIES, num_possible_cpus() * MCE_PER_CPU); + mce_poolsz = mce_numrecords * (1 << order); + mce_pool = kmalloc(mce_poolsz, GFP_KERNEL); + if (!mce_pool) { + gen_pool_destroy(tmpp); + return ret; + } + ret = gen_pool_add(tmpp, (unsigned long)mce_pool, mce_poolsz, -1); if (ret) { gen_pool_destroy(tmpp); - goto out; + kfree(mce_pool); + return ret; } mce_evt_pool = tmpp; -out: return ret; }