Message ID | 20250205004615.1253389-1-hyesoo.yu@samsung.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] mm: slub: call WARN() instead of pr_err on slab_fix. | expand |
On Wed, 5 Feb 2025, Hyesoo Yu wrote: > If a slab object is corrupted or an error occurs in its internal > value, continuing after restoration may cause other side effects. > At this point, it is difficult to debug because the problem occurred > in the past. It is better to use WARN() instead of pr_err to catch > errors at the point of issue because WARN() could trigger panic for > system debugging when panic_on_warn is enabled. WARN() should be > called prior to fixing the value because when a panic is triggered by WARN(), > it allows us to check corrupted data. > I think this makes sense, but it doesn't document why the other changes are being made, like moving the setting of *freelist to NULL. This is presumably something that you want in the crash dump when kernel.panic_on_warn is enabled. Probably best to call that out, but to also indicate what you're relying on in the crash dump to make forward progress on in diagnosing the issue. > Changes in v2: > - Replace direct calling with BUG_ON with the use of WARN in slab_fix. > > Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> > --- > mm/slub.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/mm/slub.c b/mm/slub.c > index 1f50129dcfb3..ea956cb4b8be 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) > va_start(args, fmt); > vaf.fmt = fmt; > vaf.va = &args; > - pr_err("FIX %s: %pV\n", s->name, &vaf); > + WARN(1, "FIX %s: %pV\n", s->name, &vaf); > va_end(args); > } > > @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, > if ((s->flags & SLAB_CONSISTENCY_CHECKS) && > !check_valid_pointer(s, slab, nextfree) && freelist) { > object_err(s, slab, *freelist, "Freechain corrupt"); > - *freelist = NULL; > slab_fix(s, "Isolate corrupted freechain"); > + *freelist = NULL; > return true; > } > > @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > set_freepointer(s, object, NULL); > } else { > slab_err(s, slab, "Freepointer corrupt"); > + slab_fix(s, "Freelist cleared"); > slab->freelist = NULL; > slab->inuse = slab->objects; > - slab_fix(s, "Freelist cleared"); > return 0; > } > break; > @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > if (slab->objects != max_objects) { > slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", > slab->objects, max_objects); > - slab->objects = max_objects; > slab_fix(s, "Number of objects adjusted"); > + slab->objects = max_objects; > } > if (slab->inuse != slab->objects - nr) { > slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", > slab->inuse, slab->objects - nr); > - slab->inuse = slab->objects - nr; > slab_fix(s, "Object count adjusted"); > + slab->inuse = slab->objects - nr; > } > return search == NULL; > } > -- > 2.48.0 > >
On 2/5/25 18:10, David Rientjes wrote: > On Wed, 5 Feb 2025, Hyesoo Yu wrote: > >> If a slab object is corrupted or an error occurs in its internal >> value, continuing after restoration may cause other side effects. >> At this point, it is difficult to debug because the problem occurred >> in the past. It is better to use WARN() instead of pr_err to catch >> errors at the point of issue because WARN() could trigger panic for >> system debugging when panic_on_warn is enabled. WARN() should be >> called prior to fixing the value because when a panic is triggered by WARN(), >> it allows us to check corrupted data. >> > > I think this makes sense, but it doesn't document why the other changes > are being made, like moving the setting of *freelist to NULL. This is > presumably something that you want in the crash dump when > kernel.panic_on_warn is enabled. Probably best to call that out, but to > also indicate what you're relying on in the crash dump to make forward > progress on in diagnosing the issue. Well the last sentence of the changelog above says exactly that, no? >> Changes in v2: >> - Replace direct calling with BUG_ON with the use of WARN in slab_fix. >> >> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> >> --- >> mm/slub.c | 10 +++++----- >> 1 file changed, 5 insertions(+), 5 deletions(-) >> >> diff --git a/mm/slub.c b/mm/slub.c >> index 1f50129dcfb3..ea956cb4b8be 100644 >> --- a/mm/slub.c >> +++ b/mm/slub.c >> @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) >> va_start(args, fmt); >> vaf.fmt = fmt; >> vaf.va = &args; >> - pr_err("FIX %s: %pV\n", s->name, &vaf); >> + WARN(1, "FIX %s: %pV\n", s->name, &vaf); >> va_end(args); >> } >> >> @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, >> if ((s->flags & SLAB_CONSISTENCY_CHECKS) && >> !check_valid_pointer(s, slab, nextfree) && freelist) { >> object_err(s, slab, *freelist, "Freechain corrupt"); >> - *freelist = NULL; >> slab_fix(s, "Isolate corrupted freechain"); >> + *freelist = NULL; >> return true; >> } >> >> @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) >> set_freepointer(s, object, NULL); >> } else { >> slab_err(s, slab, "Freepointer corrupt"); >> + slab_fix(s, "Freelist cleared"); >> slab->freelist = NULL; >> slab->inuse = slab->objects; >> - slab_fix(s, "Freelist cleared"); >> return 0; >> } >> break; >> @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) >> if (slab->objects != max_objects) { >> slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", >> slab->objects, max_objects); >> - slab->objects = max_objects; >> slab_fix(s, "Number of objects adjusted"); >> + slab->objects = max_objects; >> } >> if (slab->inuse != slab->objects - nr) { >> slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", >> slab->inuse, slab->objects - nr); >> - slab->inuse = slab->objects - nr; >> slab_fix(s, "Object count adjusted"); >> + slab->inuse = slab->objects - nr; >> } >> return search == NULL; >> } >> -- >> 2.48.0 >> >>
On Wed, 5 Feb 2025, Vlastimil Babka wrote: > On 2/5/25 18:10, David Rientjes wrote: > > On Wed, 5 Feb 2025, Hyesoo Yu wrote: > > > >> If a slab object is corrupted or an error occurs in its internal > >> value, continuing after restoration may cause other side effects. > >> At this point, it is difficult to debug because the problem occurred > >> in the past. It is better to use WARN() instead of pr_err to catch > >> errors at the point of issue because WARN() could trigger panic for > >> system debugging when panic_on_warn is enabled. WARN() should be > >> called prior to fixing the value because when a panic is triggered by WARN(), > >> it allows us to check corrupted data. > >> > > > > I think this makes sense, but it doesn't document why the other changes > > are being made, like moving the setting of *freelist to NULL. This is > > presumably something that you want in the crash dump when > > kernel.panic_on_warn is enabled. Probably best to call that out, but to > > also indicate what you're relying on in the crash dump to make forward > > progress on in diagnosing the issue. > > Well the last sentence of the changelog above says exactly that, no? > Sorry, I should have been more clear. It's unclear in the code why choosing WARN() here is helpful given the stack would be known. It makes sense to enable kernel.panic_on_warn this way for debugging purposes, but thought it should also carry a comment in the code on the rationale (and the state we're trying to capture in a crash dump) so a future change doesn't go and unravel this for us again. > >> Changes in v2: > >> - Replace direct calling with BUG_ON with the use of WARN in slab_fix. > >> > >> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> > >> --- > >> mm/slub.c | 10 +++++----- > >> 1 file changed, 5 insertions(+), 5 deletions(-) > >> > >> diff --git a/mm/slub.c b/mm/slub.c > >> index 1f50129dcfb3..ea956cb4b8be 100644 > >> --- a/mm/slub.c > >> +++ b/mm/slub.c > >> @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) > >> va_start(args, fmt); > >> vaf.fmt = fmt; > >> vaf.va = &args; > >> - pr_err("FIX %s: %pV\n", s->name, &vaf); > >> + WARN(1, "FIX %s: %pV\n", s->name, &vaf); > >> va_end(args); > >> } > >> > >> @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, > >> if ((s->flags & SLAB_CONSISTENCY_CHECKS) && > >> !check_valid_pointer(s, slab, nextfree) && freelist) { > >> object_err(s, slab, *freelist, "Freechain corrupt"); > >> - *freelist = NULL; > >> slab_fix(s, "Isolate corrupted freechain"); > >> + *freelist = NULL; > >> return true; > >> } > >> > >> @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > >> set_freepointer(s, object, NULL); > >> } else { > >> slab_err(s, slab, "Freepointer corrupt"); > >> + slab_fix(s, "Freelist cleared"); > >> slab->freelist = NULL; > >> slab->inuse = slab->objects; > >> - slab_fix(s, "Freelist cleared"); > >> return 0; > >> } > >> break; > >> @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > >> if (slab->objects != max_objects) { > >> slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", > >> slab->objects, max_objects); > >> - slab->objects = max_objects; > >> slab_fix(s, "Number of objects adjusted"); > >> + slab->objects = max_objects; > >> } > >> if (slab->inuse != slab->objects - nr) { > >> slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", > >> slab->inuse, slab->objects - nr); > >> - slab->inuse = slab->objects - nr; > >> slab_fix(s, "Object count adjusted"); > >> + slab->inuse = slab->objects - nr; > >> } > >> return search == NULL; > >> } > >> -- > >> 2.48.0 > >> > >> > >
On Wed, Feb 05, 2025 at 06:57:14PM -0800, David Rientjes wrote: > On Wed, 5 Feb 2025, Vlastimil Babka wrote: > > > On 2/5/25 18:10, David Rientjes wrote: > > > On Wed, 5 Feb 2025, Hyesoo Yu wrote: > > > > > >> If a slab object is corrupted or an error occurs in its internal > > >> value, continuing after restoration may cause other side effects. > > >> At this point, it is difficult to debug because the problem occurred > > >> in the past. It is better to use WARN() instead of pr_err to catch > > >> errors at the point of issue because WARN() could trigger panic for > > >> system debugging when panic_on_warn is enabled. WARN() should be > > >> called prior to fixing the value because when a panic is triggered by WARN(), > > >> it allows us to check corrupted data. > > >> > > > > > > I think this makes sense, but it doesn't document why the other changes > > > are being made, like moving the setting of *freelist to NULL. This is > > > presumably something that you want in the crash dump when > > > kernel.panic_on_warn is enabled. Probably best to call that out, but to > > > also indicate what you're relying on in the crash dump to make forward > > > progress on in diagnosing the issue. > > > > Well the last sentence of the changelog above says exactly that, no? > > > > Sorry, I should have been more clear. It's unclear in the code why > choosing WARN() here is helpful given the stack would be known. It makes > sense to enable kernel.panic_on_warn this way for debugging purposes, but > thought it should also carry a comment in the code on the rationale (and > the state we're trying to capture in a crash dump) so a future change > doesn't go and unravel this for us again. > What do you think of this comment ? I will add it in patch v3. /* * slab_fix indicates that the value would be restored even if an error occurs. * Or, it is possible to trigger a panic without restoring using WARN() if panic_on_warn * is enabled. This can obtain a crash dump at the point of issue to debug. * It is advisable not to restore the data before calling slab_fix() to check for corrupted * data in the crash dump. */ Thanks, Regards. > > >> Changes in v2: > > >> - Replace direct calling with BUG_ON with the use of WARN in slab_fix. > > >> > > >> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> > > >> --- > > >> mm/slub.c | 10 +++++----- > > >> 1 file changed, 5 insertions(+), 5 deletions(-) > > >> > > >> diff --git a/mm/slub.c b/mm/slub.c > > >> index 1f50129dcfb3..ea956cb4b8be 100644 > > >> --- a/mm/slub.c > > >> +++ b/mm/slub.c > > >> @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) > > >> va_start(args, fmt); > > >> vaf.fmt = fmt; > > >> vaf.va = &args; > > >> - pr_err("FIX %s: %pV\n", s->name, &vaf); > > >> + WARN(1, "FIX %s: %pV\n", s->name, &vaf); > > >> va_end(args); > > >> } > > >> > > >> @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, > > >> if ((s->flags & SLAB_CONSISTENCY_CHECKS) && > > >> !check_valid_pointer(s, slab, nextfree) && freelist) { > > >> object_err(s, slab, *freelist, "Freechain corrupt"); > > >> - *freelist = NULL; > > >> slab_fix(s, "Isolate corrupted freechain"); > > >> + *freelist = NULL; > > >> return true; > > >> } > > >> > > >> @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > > >> set_freepointer(s, object, NULL); > > >> } else { > > >> slab_err(s, slab, "Freepointer corrupt"); > > >> + slab_fix(s, "Freelist cleared"); > > >> slab->freelist = NULL; > > >> slab->inuse = slab->objects; > > >> - slab_fix(s, "Freelist cleared"); > > >> return 0; > > >> } > > >> break; > > >> @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > > >> if (slab->objects != max_objects) { > > >> slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", > > >> slab->objects, max_objects); > > >> - slab->objects = max_objects; > > >> slab_fix(s, "Number of objects adjusted"); > > >> + slab->objects = max_objects; > > >> } > > >> if (slab->inuse != slab->objects - nr) { > > >> slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", > > >> slab->inuse, slab->objects - nr); > > >> - slab->inuse = slab->objects - nr; > > >> slab_fix(s, "Object count adjusted"); > > >> + slab->inuse = slab->objects - nr; > > >> } > > >> return search == NULL; > > >> } > > >> -- > > >> 2.48.0 > > >> > > >> > > > > >
On 2/5/25 01:46, Hyesoo Yu wrote: > If a slab object is corrupted or an error occurs in its internal > value, continuing after restoration may cause other side effects. > At this point, it is difficult to debug because the problem occurred > in the past. It is better to use WARN() instead of pr_err to catch > errors at the point of issue because WARN() could trigger panic for > system debugging when panic_on_warn is enabled. WARN() should be > called prior to fixing the value because when a panic is triggered by WARN(), > it allows us to check corrupted data. > > Changes in v2: > - Replace direct calling with BUG_ON with the use of WARN in slab_fix. > > Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> Hi and thanks for the patch, I wonder if it would be better not to change slab_fix() but rather slab_err() and object_err(). It wouldn't then require to rearrange the fixup code. Also I think some error reporting paths don't go through slab_fix() and we still would like them to become a WARN too. Basically it would mean the last line in slab_err() would be a WARN and we'd drop the dump_stack() as that's redundant. Same in object_err() (no dump_stack() there). It would be a bit noisier as a result, but hopefully acceptable. The slab specific debugging info would still be printed before the WARN hits (and potentially results in a panic) so anyone investigating the crash dump would have that information. Hm but I see some places print stuff after slab_err(). slab_pad_check() and list_slab_objects(). We could create slab_err_start() and slab_err_end() for those, and slab_err() would just call both at once. > --- > mm/slub.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/mm/slub.c b/mm/slub.c > index 1f50129dcfb3..ea956cb4b8be 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) > va_start(args, fmt); > vaf.fmt = fmt; > vaf.va = &args; > - pr_err("FIX %s: %pV\n", s->name, &vaf); > + WARN(1, "FIX %s: %pV\n", s->name, &vaf); > va_end(args); > } > > @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, > if ((s->flags & SLAB_CONSISTENCY_CHECKS) && > !check_valid_pointer(s, slab, nextfree) && freelist) { > object_err(s, slab, *freelist, "Freechain corrupt"); > - *freelist = NULL; > slab_fix(s, "Isolate corrupted freechain"); > + *freelist = NULL; > return true; > } > > @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > set_freepointer(s, object, NULL); > } else { > slab_err(s, slab, "Freepointer corrupt"); > + slab_fix(s, "Freelist cleared"); > slab->freelist = NULL; > slab->inuse = slab->objects; > - slab_fix(s, "Freelist cleared"); > return 0; > } > break; > @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > if (slab->objects != max_objects) { > slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", > slab->objects, max_objects); > - slab->objects = max_objects; > slab_fix(s, "Number of objects adjusted"); > + slab->objects = max_objects; > } > if (slab->inuse != slab->objects - nr) { > slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", > slab->inuse, slab->objects - nr); > - slab->inuse = slab->objects - nr; > slab_fix(s, "Object count adjusted"); > + slab->inuse = slab->objects - nr; > } > return search == NULL; > }
On Thu, 6 Feb 2025, Hyesoo Yu wrote: > What do you think of this comment ? I will add it in patch v3. > > /* > * slab_fix indicates that the value would be restored even if an error occurs. > * Or, it is possible to trigger a panic without restoring using WARN() if panic_on_warn > * is enabled. This can obtain a crash dump at the point of issue to debug. > * It is advisable not to restore the data before calling slab_fix() to check for corrupted > * data in the crash dump. > */ > Looks good, you might consider it to be a bit more brief, but otherwise looks to clearly document the intent here. Was thinking of something like: /* * If panic_on_warn is enabled to generate a crash dump, be careful to not * restore data before this point. */ WARN() But I don't feel strongly. Thanks! > Thanks, > Regards. > > > > >> Changes in v2: > > > >> - Replace direct calling with BUG_ON with the use of WARN in slab_fix. > > > >> > > > >> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> > > > >> --- > > > >> mm/slub.c | 10 +++++----- > > > >> 1 file changed, 5 insertions(+), 5 deletions(-) > > > >> > > > >> diff --git a/mm/slub.c b/mm/slub.c > > > >> index 1f50129dcfb3..ea956cb4b8be 100644 > > > >> --- a/mm/slub.c > > > >> +++ b/mm/slub.c > > > >> @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) > > > >> va_start(args, fmt); > > > >> vaf.fmt = fmt; > > > >> vaf.va = &args; > > > >> - pr_err("FIX %s: %pV\n", s->name, &vaf); > > > >> + WARN(1, "FIX %s: %pV\n", s->name, &vaf); > > > >> va_end(args); > > > >> } > > > >> > > > >> @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, > > > >> if ((s->flags & SLAB_CONSISTENCY_CHECKS) && > > > >> !check_valid_pointer(s, slab, nextfree) && freelist) { > > > >> object_err(s, slab, *freelist, "Freechain corrupt"); > > > >> - *freelist = NULL; > > > >> slab_fix(s, "Isolate corrupted freechain"); > > > >> + *freelist = NULL; > > > >> return true; > > > >> } > > > >> > > > >> @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > > > >> set_freepointer(s, object, NULL); > > > >> } else { > > > >> slab_err(s, slab, "Freepointer corrupt"); > > > >> + slab_fix(s, "Freelist cleared"); > > > >> slab->freelist = NULL; > > > >> slab->inuse = slab->objects; > > > >> - slab_fix(s, "Freelist cleared"); > > > >> return 0; > > > >> } > > > >> break; > > > >> @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > > > >> if (slab->objects != max_objects) { > > > >> slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", > > > >> slab->objects, max_objects); > > > >> - slab->objects = max_objects; > > > >> slab_fix(s, "Number of objects adjusted"); > > > >> + slab->objects = max_objects; > > > >> } > > > >> if (slab->inuse != slab->objects - nr) { > > > >> slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", > > > >> slab->inuse, slab->objects - nr); > > > >> - slab->inuse = slab->objects - nr; > > > >> slab_fix(s, "Object count adjusted"); > > > >> + slab->inuse = slab->objects - nr; > > > >> } > > > >> return search == NULL; > > > >> } > > > >> -- > > > >> 2.48.0 > > > >> > > > >> > > > > > > > > >
On Thu, Feb 06, 2025 at 12:35:22PM +0100, Vlastimil Babka wrote: > On 2/5/25 01:46, Hyesoo Yu wrote: > > If a slab object is corrupted or an error occurs in its internal > > value, continuing after restoration may cause other side effects. > > At this point, it is difficult to debug because the problem occurred > > in the past. It is better to use WARN() instead of pr_err to catch > > errors at the point of issue because WARN() could trigger panic for > > system debugging when panic_on_warn is enabled. WARN() should be > > called prior to fixing the value because when a panic is triggered by WARN(), > > it allows us to check corrupted data. > > > > Changes in v2: > > - Replace direct calling with BUG_ON with the use of WARN in slab_fix. > > > > Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> > > Hi and thanks for the patch, > > I wonder if it would be better not to change slab_fix() but rather > slab_err() and object_err(). It wouldn't then require to rearrange the fixup > code. Also I think some error reporting paths don't go through slab_fix() > and we still would like them to become a WARN too. > > Basically it would mean the last line in slab_err() would be a WARN and we'd > drop the dump_stack() as that's redundant. Same in object_err() (no > dump_stack() there). It would be a bit noisier as a result, but hopefully > acceptable. The slab specific debugging info would still be printed before > the WARN hits (and potentially results in a panic) so anyone investigating > the crash dump would have that information. > > Hm but I see some places print stuff after slab_err(). slab_pad_check() and > list_slab_objects(). We could create slab_err_start() and slab_err_end() for > those, and slab_err() would just call both at once. > Thank you so much for your review. That's a great suggestion. Following your suggestion, it seems possible to use WARN on all error reporting paths. For some places print stuff after slab_err, print them as follows, +static void __slab_err(struct slab *slab) +{ + print_slab_info(slab); + add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE); + + WARN_ON(1); +} + static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab, const char *fmt, ...) { vsnprintf(buf, sizeof(buf), fmt, args); va_end(args); slab_bug(s, "%s", buf); - print_slab_info(slab); - dump_stack(); - add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE); + __slab_err(slab); } @@ -1316,11 +1322,13 @@ slab_pad_check(struct kmem_cache *s, struct slab *slab) while (end > fault && end[-1] == POISON_INUSE) end--; - slab_err(s, slab, "Padding overwritten. 0x%p-0x%p @offset=%tu", - fault, end - 1, fault - start); + slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu", + fault, end - 1, fault - start); print_section(KERN_ERR, "Padding ", pad, remainder); restore_bytes(s, "slab padding", POISON_INUSE, fault, end); + + __slab_err(slab); } > > --- > > mm/slub.c | 10 +++++----- > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > diff --git a/mm/slub.c b/mm/slub.c > > index 1f50129dcfb3..ea956cb4b8be 100644 > > --- a/mm/slub.c > > +++ b/mm/slub.c > > @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) > > va_start(args, fmt); > > vaf.fmt = fmt; > > vaf.va = &args; > > - pr_err("FIX %s: %pV\n", s->name, &vaf); > > + WARN(1, "FIX %s: %pV\n", s->name, &vaf); > > va_end(args); > > } > > > > @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, > > if ((s->flags & SLAB_CONSISTENCY_CHECKS) && > > !check_valid_pointer(s, slab, nextfree) && freelist) { > > object_err(s, slab, *freelist, "Freechain corrupt"); > > - *freelist = NULL; > > slab_fix(s, "Isolate corrupted freechain"); > > + *freelist = NULL; > > return true; > > } > > > > @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > > set_freepointer(s, object, NULL); > > } else { > > slab_err(s, slab, "Freepointer corrupt"); > > + slab_fix(s, "Freelist cleared"); > > slab->freelist = NULL; > > slab->inuse = slab->objects; > > - slab_fix(s, "Freelist cleared"); > > return 0; > > } > > break; > > @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > > if (slab->objects != max_objects) { > > slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", > > slab->objects, max_objects); > > - slab->objects = max_objects; > > slab_fix(s, "Number of objects adjusted"); > > + slab->objects = max_objects; > > } > > if (slab->inuse != slab->objects - nr) { > > slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", > > slab->inuse, slab->objects - nr); > > - slab->inuse = slab->objects - nr; > > slab_fix(s, "Object count adjusted"); > > + slab->inuse = slab->objects - nr; > > } > > return search == NULL; > > } > >
On 2/7/25 04:28, Hyesoo Yu wrote: > On Thu, Feb 06, 2025 at 12:35:22PM +0100, Vlastimil Babka wrote: >> On 2/5/25 01:46, Hyesoo Yu wrote: >> > If a slab object is corrupted or an error occurs in its internal >> > value, continuing after restoration may cause other side effects. >> > At this point, it is difficult to debug because the problem occurred >> > in the past. It is better to use WARN() instead of pr_err to catch >> > errors at the point of issue because WARN() could trigger panic for >> > system debugging when panic_on_warn is enabled. WARN() should be >> > called prior to fixing the value because when a panic is triggered by WARN(), >> > it allows us to check corrupted data. >> > >> > Changes in v2: >> > - Replace direct calling with BUG_ON with the use of WARN in slab_fix. >> > >> > Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> >> >> Hi and thanks for the patch, >> >> I wonder if it would be better not to change slab_fix() but rather >> slab_err() and object_err(). It wouldn't then require to rearrange the fixup >> code. Also I think some error reporting paths don't go through slab_fix() >> and we still would like them to become a WARN too. >> >> Basically it would mean the last line in slab_err() would be a WARN and we'd >> drop the dump_stack() as that's redundant. Same in object_err() (no >> dump_stack() there). It would be a bit noisier as a result, but hopefully >> acceptable. The slab specific debugging info would still be printed before >> the WARN hits (and potentially results in a panic) so anyone investigating >> the crash dump would have that information. >> >> Hm but I see some places print stuff after slab_err(). slab_pad_check() and >> list_slab_objects(). We could create slab_err_start() and slab_err_end() for >> those, and slab_err() would just call both at once. >> > > Thank you so much for your review. > > That's a great suggestion. Following your suggestion, it seems possible to use > WARN on all error reporting paths. For some places print stuff after slab_err, > print them as follows, > > +static void __slab_err(struct slab *slab) > +{ > + print_slab_info(slab); > + add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE); > + > + WARN_ON(1); > +} > + > static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab, > const char *fmt, ...) > { > vsnprintf(buf, sizeof(buf), fmt, args); > va_end(args); > slab_bug(s, "%s", buf); > - print_slab_info(slab); > - dump_stack(); > - add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE); > + __slab_err(slab); > } > > @@ -1316,11 +1322,13 @@ slab_pad_check(struct kmem_cache *s, struct slab *slab) > while (end > fault && end[-1] == POISON_INUSE) > end--; > > - slab_err(s, slab, "Padding overwritten. 0x%p-0x%p @offset=%tu", > - fault, end - 1, fault - start); > + slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu", > + fault, end - 1, fault - start); > print_section(KERN_ERR, "Padding ", pad, remainder); > > restore_bytes(s, "slab padding", POISON_INUSE, fault, end); > + > + __slab_err(slab); Yeah but move that above restore_bytes()? BTW I think we could also do this? --- a/mm/slub.c +++ b/mm/slub.c @@ -1162,7 +1162,7 @@ void skip_orig_size_check(struct kmem_cache *s, const void *object) set_orig_size(s, (void *)object, s->object_size); } -static void slab_bug(struct kmem_cache *s, char *fmt, ...) +static void slab_bug(struct kmem_cache *s, const char *fmt, ...) { struct va_format vaf; va_list args; @@ -1263,15 +1263,11 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab, const char *fmt, ...) { va_list args; - char buf[100]; if (slab_add_kunit_errors()) return; - va_start(args, fmt); - vsnprintf(buf, sizeof(buf), fmt, args); - va_end(args); - slab_bug(s, "%s", buf); + slab_bug(s, fmt, args); > } > >> > --- >> > mm/slub.c | 10 +++++----- >> > 1 file changed, 5 insertions(+), 5 deletions(-) >> > >> > diff --git a/mm/slub.c b/mm/slub.c >> > index 1f50129dcfb3..ea956cb4b8be 100644 >> > --- a/mm/slub.c >> > +++ b/mm/slub.c >> > @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) >> > va_start(args, fmt); >> > vaf.fmt = fmt; >> > vaf.va = &args; >> > - pr_err("FIX %s: %pV\n", s->name, &vaf); >> > + WARN(1, "FIX %s: %pV\n", s->name, &vaf); >> > va_end(args); >> > } >> > >> > @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, >> > if ((s->flags & SLAB_CONSISTENCY_CHECKS) && >> > !check_valid_pointer(s, slab, nextfree) && freelist) { >> > object_err(s, slab, *freelist, "Freechain corrupt"); >> > - *freelist = NULL; >> > slab_fix(s, "Isolate corrupted freechain"); >> > + *freelist = NULL; >> > return true; >> > } >> > >> > @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) >> > set_freepointer(s, object, NULL); >> > } else { >> > slab_err(s, slab, "Freepointer corrupt"); >> > + slab_fix(s, "Freelist cleared"); >> > slab->freelist = NULL; >> > slab->inuse = slab->objects; >> > - slab_fix(s, "Freelist cleared"); >> > return 0; >> > } >> > break; >> > @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) >> > if (slab->objects != max_objects) { >> > slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", >> > slab->objects, max_objects); >> > - slab->objects = max_objects; >> > slab_fix(s, "Number of objects adjusted"); >> > + slab->objects = max_objects; >> > } >> > if (slab->inuse != slab->objects - nr) { >> > slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", >> > slab->inuse, slab->objects - nr); >> > - slab->inuse = slab->objects - nr; >> > slab_fix(s, "Object count adjusted"); >> > + slab->inuse = slab->objects - nr; >> > } >> > return search == NULL; >> > } >> >> > >
On Fri, Feb 07, 2025 at 10:08:20AM +0100, Vlastimil Babka wrote: > On 2/7/25 04:28, Hyesoo Yu wrote: > > On Thu, Feb 06, 2025 at 12:35:22PM +0100, Vlastimil Babka wrote: > >> On 2/5/25 01:46, Hyesoo Yu wrote: > >> > If a slab object is corrupted or an error occurs in its internal > >> > value, continuing after restoration may cause other side effects. > >> > At this point, it is difficult to debug because the problem occurred > >> > in the past. It is better to use WARN() instead of pr_err to catch > >> > errors at the point of issue because WARN() could trigger panic for > >> > system debugging when panic_on_warn is enabled. WARN() should be > >> > called prior to fixing the value because when a panic is triggered by WARN(), > >> > it allows us to check corrupted data. > >> > > >> > Changes in v2: > >> > - Replace direct calling with BUG_ON with the use of WARN in slab_fix. > >> > > >> > Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> > >> > >> Hi and thanks for the patch, > >> > >> I wonder if it would be better not to change slab_fix() but rather > >> slab_err() and object_err(). It wouldn't then require to rearrange the fixup > >> code. Also I think some error reporting paths don't go through slab_fix() > >> and we still would like them to become a WARN too. > >> > >> Basically it would mean the last line in slab_err() would be a WARN and we'd > >> drop the dump_stack() as that's redundant. Same in object_err() (no > >> dump_stack() there). It would be a bit noisier as a result, but hopefully > >> acceptable. The slab specific debugging info would still be printed before > >> the WARN hits (and potentially results in a panic) so anyone investigating > >> the crash dump would have that information. > >> > >> Hm but I see some places print stuff after slab_err(). slab_pad_check() and > >> list_slab_objects(). We could create slab_err_start() and slab_err_end() for > >> those, and slab_err() would just call both at once. > >> > > > > Thank you so much for your review. > > > > That's a great suggestion. Following your suggestion, it seems possible to use > > WARN on all error reporting paths. For some places print stuff after slab_err, > > print them as follows, > > > > +static void __slab_err(struct slab *slab) > > +{ > > + print_slab_info(slab); > > + add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE); > > + > > + WARN_ON(1); > > +} > > + > > static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab, > > const char *fmt, ...) > > { > > vsnprintf(buf, sizeof(buf), fmt, args); > > va_end(args); > > slab_bug(s, "%s", buf); > > - print_slab_info(slab); > > - dump_stack(); > > - add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE); > > + __slab_err(slab); > > } > > > > @@ -1316,11 +1322,13 @@ slab_pad_check(struct kmem_cache *s, struct slab *slab) > > while (end > fault && end[-1] == POISON_INUSE) > > end--; > > > > - slab_err(s, slab, "Padding overwritten. 0x%p-0x%p @offset=%tu", > > - fault, end - 1, fault - start); > > + slab_bug(s, "Padding overwritten. 0x%p-0x%p @offset=%tu", > > + fault, end - 1, fault - start); > > print_section(KERN_ERR, "Padding ", pad, remainder); > > > > restore_bytes(s, "slab padding", POISON_INUSE, fault, end); > > + > > + __slab_err(slab); > > Yeah but move that above restore_bytes()? > Yes, I will. > BTW I think we could also do this? > > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -1162,7 +1162,7 @@ void skip_orig_size_check(struct kmem_cache *s, const void *object) > set_orig_size(s, (void *)object, s->object_size); > } > > -static void slab_bug(struct kmem_cache *s, char *fmt, ...) > +static void slab_bug(struct kmem_cache *s, const char *fmt, ...) > { > struct va_format vaf; > va_list args; > @@ -1263,15 +1263,11 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab, > const char *fmt, ...) > { > va_list args; > - char buf[100]; > > if (slab_add_kunit_errors()) > return; > > - va_start(args, fmt); > - vsnprintf(buf, sizeof(buf), fmt, args); > - va_end(args); > - slab_bug(s, "%s", buf); > + slab_bug(s, fmt, args); > > I guess the args is not initialized. Do you want it to be modified like this ? Thanks, Regards. -static void slab_bug(struct kmem_cache *s, char *fmt, ...) +static void slab_bug(struct kmem_cache *s, const char *fmt, ...) { struct va_format vaf; va_list args; @@ -1137,15 +1137,14 @@ static __printf(3, 4) void slab_err(struct kmem_cache *s, struct slab *slab, const char *fmt, ...) { va_list args; - char buf[100]; if (slab_add_kunit_errors()) return; va_start(args, fmt); - vsnprintf(buf, sizeof(buf), fmt, args); + slab_bug(s, fmt, args); va_end(args); - slab_bug(s, "%s", buf); > > } > > > >> > --- > >> > mm/slub.c | 10 +++++----- > >> > 1 file changed, 5 insertions(+), 5 deletions(-) > >> > > >> > diff --git a/mm/slub.c b/mm/slub.c > >> > index 1f50129dcfb3..ea956cb4b8be 100644 > >> > --- a/mm/slub.c > >> > +++ b/mm/slub.c > >> > @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) > >> > va_start(args, fmt); > >> > vaf.fmt = fmt; > >> > vaf.va = &args; > >> > - pr_err("FIX %s: %pV\n", s->name, &vaf); > >> > + WARN(1, "FIX %s: %pV\n", s->name, &vaf); > >> > va_end(args); > >> > } > >> > > >> > @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, > >> > if ((s->flags & SLAB_CONSISTENCY_CHECKS) && > >> > !check_valid_pointer(s, slab, nextfree) && freelist) { > >> > object_err(s, slab, *freelist, "Freechain corrupt"); > >> > - *freelist = NULL; > >> > slab_fix(s, "Isolate corrupted freechain"); > >> > + *freelist = NULL; > >> > return true; > >> > } > >> > > >> > @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > >> > set_freepointer(s, object, NULL); > >> > } else { > >> > slab_err(s, slab, "Freepointer corrupt"); > >> > + slab_fix(s, "Freelist cleared"); > >> > slab->freelist = NULL; > >> > slab->inuse = slab->objects; > >> > - slab_fix(s, "Freelist cleared"); > >> > return 0; > >> > } > >> > break; > >> > @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) > >> > if (slab->objects != max_objects) { > >> > slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", > >> > slab->objects, max_objects); > >> > - slab->objects = max_objects; > >> > slab_fix(s, "Number of objects adjusted"); > >> > + slab->objects = max_objects; > >> > } > >> > if (slab->inuse != slab->objects - nr) { > >> > slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", > >> > slab->inuse, slab->objects - nr); > >> > - slab->inuse = slab->objects - nr; > >> > slab_fix(s, "Object count adjusted"); > >> > + slab->inuse = slab->objects - nr; > >> > } > >> > return search == NULL; > >> > } > >> > >> > > > > > >
diff --git a/mm/slub.c b/mm/slub.c index 1f50129dcfb3..ea956cb4b8be 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1043,7 +1043,7 @@ static void slab_fix(struct kmem_cache *s, char *fmt, ...) va_start(args, fmt); vaf.fmt = fmt; vaf.va = &args; - pr_err("FIX %s: %pV\n", s->name, &vaf); + WARN(1, "FIX %s: %pV\n", s->name, &vaf); va_end(args); } @@ -1106,8 +1106,8 @@ static bool freelist_corrupted(struct kmem_cache *s, struct slab *slab, if ((s->flags & SLAB_CONSISTENCY_CHECKS) && !check_valid_pointer(s, slab, nextfree) && freelist) { object_err(s, slab, *freelist, "Freechain corrupt"); - *freelist = NULL; slab_fix(s, "Isolate corrupted freechain"); + *freelist = NULL; return true; } @@ -1445,9 +1445,9 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) set_freepointer(s, object, NULL); } else { slab_err(s, slab, "Freepointer corrupt"); + slab_fix(s, "Freelist cleared"); slab->freelist = NULL; slab->inuse = slab->objects; - slab_fix(s, "Freelist cleared"); return 0; } break; @@ -1464,14 +1464,14 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search) if (slab->objects != max_objects) { slab_err(s, slab, "Wrong number of objects. Found %d but should be %d", slab->objects, max_objects); - slab->objects = max_objects; slab_fix(s, "Number of objects adjusted"); + slab->objects = max_objects; } if (slab->inuse != slab->objects - nr) { slab_err(s, slab, "Wrong object count. Counter is %d but counted were %d", slab->inuse, slab->objects - nr); - slab->inuse = slab->objects - nr; slab_fix(s, "Object count adjusted"); + slab->inuse = slab->objects - nr; } return search == NULL; }
If a slab object is corrupted or an error occurs in its internal value, continuing after restoration may cause other side effects. At this point, it is difficult to debug because the problem occurred in the past. It is better to use WARN() instead of pr_err to catch errors at the point of issue because WARN() could trigger panic for system debugging when panic_on_warn is enabled. WARN() should be called prior to fixing the value because when a panic is triggered by WARN(), it allows us to check corrupted data. Changes in v2: - Replace direct calling with BUG_ON with the use of WARN in slab_fix. Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> --- mm/slub.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)