Message ID | 20231020202649.1802700-1-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [for-4.19] xen/xmalloc: XMEM_POOL_POISON improvements | expand |
Hi Andrew, On 20/10/2023 21:26, Andrew Cooper wrote: > When in use, the spew: > > (XEN) Assertion '!memchr_inv(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE)' failed at common/xmalloc_tlsf.c:246 > > is unweidly and meaningless to non-Xen developers. Therefore: > > * Switch to IS_ENABLED(). There's no need for full #ifdef-ary. > * Pull memchr_inv() out into the if(), and provide a better error message. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: George Dunlap <George.Dunlap@eu.citrix.com> > CC: Jan Beulich <JBeulich@suse.com> > CC: Stefano Stabellini <sstabellini@kernel.org> > CC: Wei Liu <wl@xen.org> > CC: Julien Grall <julien@xen.org> > CC: Roger Pau Monné <roger.pau@citrix.com> > > Observations from the debugging of: > https://github.com/Dasharo/dasharo-issues/issues/488 > > There's a further bug. XMEM_POOL_POISON can be enabled in release builds, > where the ASSERT() gets dropped. > > However, upping to a BUG() can't provide any helpful message out to the user. > > I tried modifying BUG() to take an optional message, but xen/bug.h needs > untangling substantially before that will work, and I don't have time right now. Do we actually care about the registers values? If not, we can either use: printk(...); BUG(); Or panic(...); This would allow us to use XMEM_POOL_POISON in prod build before BUG() can accept string. > --- > xen/common/xmalloc_tlsf.c | 15 +++++++-------- > 1 file changed, 7 insertions(+), 8 deletions(-) > > diff --git a/xen/common/xmalloc_tlsf.c b/xen/common/xmalloc_tlsf.c > index 349b31cb4cc1..13305cd87c2f 100644 > --- a/xen/common/xmalloc_tlsf.c > +++ b/xen/common/xmalloc_tlsf.c > @@ -249,11 +249,11 @@ static inline void EXTRACT_BLOCK(struct bhdr *b, struct xmem_pool *p, int fl, > } > b->ptr.free_ptr = (struct free_ptr) {NULL, NULL}; > > -#ifdef CONFIG_XMEM_POOL_POISON > - if ( (b->size & BLOCK_SIZE_MASK) > MIN_BLOCK_SIZE ) > - ASSERT(!memchr_inv(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, > - (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE)); > -#endif /* CONFIG_XMEM_POOL_POISON */ > + if ( IS_ENABLED(CONFIG_XMEM_POOL_POISON) && > + (b->size & BLOCK_SIZE_MASK) > MIN_BLOCK_SIZE && > + memchr_inv(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, > + (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE) ) > + ASSERT(!"XMEM Pool corruption found"); > } > > /** > @@ -261,11 +261,10 @@ static inline void EXTRACT_BLOCK(struct bhdr *b, struct xmem_pool *p, int fl, > */ > static inline void INSERT_BLOCK(struct bhdr *b, struct xmem_pool *p, int fl, int sl) > { > -#ifdef CONFIG_XMEM_POOL_POISON > - if ( (b->size & BLOCK_SIZE_MASK) > MIN_BLOCK_SIZE ) > + if ( IS_ENABLED(CONFIG_XMEM_POOL_POISON) && > + (b->size & BLOCK_SIZE_MASK) > MIN_BLOCK_SIZE ) > memset(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, > (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE); > -#endif /* CONFIG_XMEM_POOL_POISON */ > > b->ptr.free_ptr = (struct free_ptr) {NULL, p->matrix[fl][sl]}; > if ( p->matrix[fl][sl] ) Cheers,
On 20.10.2023 22:26, Andrew Cooper wrote: > When in use, the spew: > > (XEN) Assertion '!memchr_inv(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE)' failed at common/xmalloc_tlsf.c:246 > > is unweidly and meaningless to non-Xen developers. Just to mention it (no objection to the changes done): While I agree with unwieldy (and I'd add "of limited use"), an assertion message not necessarily being meaningful to non-Xen developers is imo not really a criteria - that's to be expected in various cases. > Therefore: > > * Switch to IS_ENABLED(). There's no need for full #ifdef-ary. > * Pull memchr_inv() out into the if(), and provide a better error message. Thing is - this "better" error message now ends up being less specific, and hence could mean also other kinds of corruption. > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: George Dunlap <George.Dunlap@eu.citrix.com> > CC: Jan Beulich <JBeulich@suse.com> > CC: Stefano Stabellini <sstabellini@kernel.org> > CC: Wei Liu <wl@xen.org> > CC: Julien Grall <julien@xen.org> > CC: Roger Pau Monné <roger.pau@citrix.com> > > Observations from the debugging of: > https://github.com/Dasharo/dasharo-issues/issues/488 > > There's a further bug. XMEM_POOL_POISON can be enabled in release builds, > where the ASSERT() gets dropped. Just to mention it - this limits usefulness, but doesn't make the option entirely useless. The poisoning itself also allows certain cases of use- after-free to be caught. The assertion really is "only" about write-after- free. (Maybe the improved error message could indicate so?) > However, upping to a BUG() can't provide any helpful message out to the user. > > I tried modifying BUG() to take an optional message, but xen/bug.h needs > untangling substantially before that will work, and I don't have time right now. I agree with Julien's suggestion of using panic() in the meantime, as a possible alternative. Question though is whether it's better to halt the system right away, as opposed to e.g. permitting orderly shutdown to cover the case where the corruption ends up not being "deadly". Jan
On Mon, Oct 23, 2023 at 10:04:21AM +0200, Jan Beulich wrote: > On 20.10.2023 22:26, Andrew Cooper wrote: > > However, upping to a BUG() can't provide any helpful message out to the user. > > > > I tried modifying BUG() to take an optional message, but xen/bug.h needs > > untangling substantially before that will work, and I don't have time right now. > > I agree with Julien's suggestion of using panic() in the meantime, as a > possible alternative. We might care about the stack trace, so would be helpful to print it, maybe WARN + panic? > Question though is whether it's better to halt the > system right away, as opposed to e.g. permitting orderly shutdown to cover > the case where the corruption ends up not being "deadly". Hm, won't this be risky, as we could then possibly corrupt data on disk for example? Thanks, Roger.
On 23.10.2023 10:28, Roger Pau Monné wrote: > On Mon, Oct 23, 2023 at 10:04:21AM +0200, Jan Beulich wrote: >> On 20.10.2023 22:26, Andrew Cooper wrote: >>> However, upping to a BUG() can't provide any helpful message out to the user. >>> >>> I tried modifying BUG() to take an optional message, but xen/bug.h needs >>> untangling substantially before that will work, and I don't have time right now. >> >> I agree with Julien's suggestion of using panic() in the meantime, as a >> possible alternative. > > We might care about the stack trace, so would be helpful to print it, > maybe WARN + panic? Perhaps, yes. >> Question though is whether it's better to halt the >> system right away, as opposed to e.g. permitting orderly shutdown to cover >> the case where the corruption ends up not being "deadly". > > Hm, won't this be risky, as we could then possibly corrupt data on disk > for example? It is risky - you never really know which of the two options is better. Jan
On 22/10/2023 5:52 pm, Julien Grall wrote: > Hi Andrew, > > On 20/10/2023 21:26, Andrew Cooper wrote: >> When in use, the spew: >> >> (XEN) Assertion '!memchr_inv(b->ptr.buffer + MIN_BLOCK_SIZE, >> POISON_BYTE, (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE)' failed at >> common/xmalloc_tlsf.c:246 >> >> is unweidly and meaningless to non-Xen developers. Therefore: >> >> * Switch to IS_ENABLED(). There's no need for full #ifdef-ary. >> * Pull memchr_inv() out into the if(), and provide a better error >> message. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> --- >> CC: George Dunlap <George.Dunlap@eu.citrix.com> >> CC: Jan Beulich <JBeulich@suse.com> >> CC: Stefano Stabellini <sstabellini@kernel.org> >> CC: Wei Liu <wl@xen.org> >> CC: Julien Grall <julien@xen.org> >> CC: Roger Pau Monné <roger.pau@citrix.com> >> >> Observations from the debugging of: >> https://github.com/Dasharo/dasharo-issues/issues/488 >> >> There's a further bug. XMEM_POOL_POISON can be enabled in release >> builds, >> where the ASSERT() gets dropped. >> >> However, upping to a BUG() can't provide any helpful message out to >> the user. >> >> I tried modifying BUG() to take an optional message, but xen/bug.h needs >> untangling substantially before that will work, and I don't have time >> right now. > > Do we actually care about the registers values? If not, we can either > use: > > printk(...); > BUG(); > > Or > > panic(...); > > This would allow us to use XMEM_POOL_POISON in prod build before BUG() > can accept string. We care about the backtrace, so panic() on it's own isn't good enough. But printk();BUG(); should be good enough so I'll swap to that. ~Andrew
diff --git a/xen/common/xmalloc_tlsf.c b/xen/common/xmalloc_tlsf.c index 349b31cb4cc1..13305cd87c2f 100644 --- a/xen/common/xmalloc_tlsf.c +++ b/xen/common/xmalloc_tlsf.c @@ -249,11 +249,11 @@ static inline void EXTRACT_BLOCK(struct bhdr *b, struct xmem_pool *p, int fl, } b->ptr.free_ptr = (struct free_ptr) {NULL, NULL}; -#ifdef CONFIG_XMEM_POOL_POISON - if ( (b->size & BLOCK_SIZE_MASK) > MIN_BLOCK_SIZE ) - ASSERT(!memchr_inv(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, - (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE)); -#endif /* CONFIG_XMEM_POOL_POISON */ + if ( IS_ENABLED(CONFIG_XMEM_POOL_POISON) && + (b->size & BLOCK_SIZE_MASK) > MIN_BLOCK_SIZE && + memchr_inv(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, + (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE) ) + ASSERT(!"XMEM Pool corruption found"); } /** @@ -261,11 +261,10 @@ static inline void EXTRACT_BLOCK(struct bhdr *b, struct xmem_pool *p, int fl, */ static inline void INSERT_BLOCK(struct bhdr *b, struct xmem_pool *p, int fl, int sl) { -#ifdef CONFIG_XMEM_POOL_POISON - if ( (b->size & BLOCK_SIZE_MASK) > MIN_BLOCK_SIZE ) + if ( IS_ENABLED(CONFIG_XMEM_POOL_POISON) && + (b->size & BLOCK_SIZE_MASK) > MIN_BLOCK_SIZE ) memset(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE); -#endif /* CONFIG_XMEM_POOL_POISON */ b->ptr.free_ptr = (struct free_ptr) {NULL, p->matrix[fl][sl]}; if ( p->matrix[fl][sl] )
When in use, the spew: (XEN) Assertion '!memchr_inv(b->ptr.buffer + MIN_BLOCK_SIZE, POISON_BYTE, (b->size & BLOCK_SIZE_MASK) - MIN_BLOCK_SIZE)' failed at common/xmalloc_tlsf.c:246 is unweidly and meaningless to non-Xen developers. Therefore: * Switch to IS_ENABLED(). There's no need for full #ifdef-ary. * Pull memchr_inv() out into the if(), and provide a better error message. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: George Dunlap <George.Dunlap@eu.citrix.com> CC: Jan Beulich <JBeulich@suse.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Wei Liu <wl@xen.org> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> Observations from the debugging of: https://github.com/Dasharo/dasharo-issues/issues/488 There's a further bug. XMEM_POOL_POISON can be enabled in release builds, where the ASSERT() gets dropped. However, upping to a BUG() can't provide any helpful message out to the user. I tried modifying BUG() to take an optional message, but xen/bug.h needs untangling substantially before that will work, and I don't have time right now. --- xen/common/xmalloc_tlsf.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)