
mm: use compare-exchange operation to set KASAN page tag

Message ID 20220113031434.464992-1-pcc@google.com (mailing list archive)
State New

Commit Message

Peter Collingbourne Jan. 13, 2022, 3:14 a.m. UTC
It has been reported that the tag setting operation on newly-allocated
pages can cause the page flags to be corrupted when performed
concurrently with other flag updates as a result of the use of
non-atomic operations. Fix the problem by using a compare-exchange
loop to update the tag.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0cec63101
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Cc: stable@vger.kernel.org
---
 include/linux/mm.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

Comments

Matthew Wilcox Jan. 13, 2022, 3:32 a.m. UTC | #1
On Wed, Jan 12, 2022 at 07:14:34PM -0800, Peter Collingbourne wrote:
> It has been reported that the tag setting operation on newly-allocated
> pages can cause the page flags to be corrupted when performed
> concurrently with other flag updates as a result of the use of
> non-atomic operations. Fix the problem by using a compare-exchange
> loop to update the tag.

I really dislike it that kasan has taken some page flags for its use.
I would much prefer it if kasan used some page_ext_flags.  It's somewhat
slower to access them, and they take up a bit of extra space (unless
you already have CONFIG_PAGE_EXTENSION enabled).  But page flags are a
really scarce resource and kasan has taken 9.

Andrey Konovalov Jan. 14, 2022, 9:58 p.m. UTC | #2
On Thu, Jan 13, 2022 at 6:14 AM Peter Collingbourne <pcc@google.com> wrote:
>
> It has been reported that the tag setting operation on newly-allocated
> pages can cause the page flags to be corrupted when performed
> concurrently with other flag updates as a result of the use of
> non-atomic operations.

Is it known how exactly this race happens? Why are flags for a newly
allocated page being accessed concurrently?

> Fix the problem by using a compare-exchange
> loop to update the tag.
>
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0cec63101
> Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
> Cc: stable@vger.kernel.org
> ---
>  include/linux/mm.h | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index c768a7c81b0b..b544b0a9f537 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1531,11 +1531,17 @@ static inline u8 page_kasan_tag(const struct page *page)
>
>  static inline void page_kasan_tag_set(struct page *page, u8 tag)
>  {
> -       if (kasan_enabled()) {
> -               tag ^= 0xff;
> -               page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
> -               page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
> -       }
> +       unsigned long old_flags, flags;
> +
> +       if (!kasan_enabled())
> +               return;
> +
> +       tag ^= 0xff;
> +       do {
> +               old_flags = flags = page->flags;

I guess this should be at least READ_ONCE(page->flags) if we care
about concurrency.

> +               flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
> +               flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
> +       } while (unlikely(cmpxchg(&page->flags, old_flags, flags) != old_flags));
>  }
>
>  static inline void page_kasan_tag_reset(struct page *page)
> --
> 2.34.1.575.g55b058a8bb-goog
>

Peter Collingbourne Jan. 18, 2022, 10:38 p.m. UTC | #3
On Fri, Jan 14, 2022 at 1:58 PM Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
>  On Thu, Jan 13, 2022 at 6:14 AM Peter Collingbourne <pcc@google.com> wrote:
> >
> > It has been reported that the tag setting operation on newly-allocated
> > pages can cause the page flags to be corrupted when performed
> > concurrently with other flag updates as a result of the use of
> > non-atomic operations.
>
> Is it known how exactly this race happens? Why are flags for a newly
> allocated page being accessed concurrently?

In the report that we received, the race resulted in a crash in
kswapd. This may just be a symptom of the problem though.

I haven't closely audited all of the callers to page_kasan_tag_set()
to check whether they may be operating on already-visible pages, but
at least it doesn't appear to be unanticipated that there may be other
threads accessing the page flags concurrently with a call to
page_kasan_tag_set() (see the calls to smp_wmb() in
arch/arm64/kernel/mte.c, arch/arm64/mm/copypage.c and
arch/arm64/mm/mteswap.c).
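
For illustration only (this sketch is not part of the patch or of the
report): one interleaving that would lose an update with the old
non-atomic code can be replayed deterministically in user space.
TAG_SHIFT, OTHER_FLAG and replay_lost_update() are hypothetical names,
the bit positions are made up, the tag inversion is left out, and the
two read-modify-write steps of the old code are collapsed into one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout; the real KASAN_TAG_PGSHIFT depends on the config. */
#define TAG_MASK   UINT64_C(0xff)
#define TAG_SHIFT  56
#define OTHER_FLAG (UINT64_C(1) << 3)	/* stand-in for any unrelated flag */

/*
 * Replays one unlucky interleaving of a plain read-modify-write tag
 * update ("CPU A") with a concurrent flag update ("CPU B").
 */
static uint64_t replay_lost_update(uint64_t flags, uint8_t tag)
{
	/* CPU A: load the flags and compute the new value locally. */
	uint64_t a_tmp = flags;

	a_tmp &= ~(TAG_MASK << TAG_SHIFT);
	a_tmp |= ((uint64_t)tag & TAG_MASK) << TAG_SHIFT;

	/* CPU B: sets an unrelated flag in memory in the meantime. */
	flags |= OTHER_FLAG;

	/* CPU A: stores its stale copy, overwriting CPU B's update. */
	flags = a_tmp;
	return flags;
}

/* Usage: the tag is set, but the flag set by "CPU B" is gone. */
static void lost_update_example(void)
{
	uint64_t out = replay_lost_update(0, 0xab);

	assert(((out >> TAG_SHIFT) & TAG_MASK) == 0xab);
	assert((out & OTHER_FLAG) == 0);
}
```

A compare-exchange loop avoids this outcome because the final store
only succeeds if the flags word still holds the value that was loaded.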

> > Fix the problem by using a compare-exchange
> > loop to update the tag.
> >
> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0cec63101
> > Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
> > Cc: stable@vger.kernel.org
> > ---
> >  include/linux/mm.h | 16 +++++++++++-----
> >  1 file changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index c768a7c81b0b..b544b0a9f537 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1531,11 +1531,17 @@ static inline u8 page_kasan_tag(const struct page *page)
> >
> >  static inline void page_kasan_tag_set(struct page *page, u8 tag)
> >  {
> > -       if (kasan_enabled()) {
> > -               tag ^= 0xff;
> > -               page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
> > -               page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
> > -       }
> > +       unsigned long old_flags, flags;
> > +
> > +       if (!kasan_enabled())
> > +               return;
> > +
> > +       tag ^= 0xff;
> > +       do {
> > +               old_flags = flags = page->flags;
>
> I guess this should be at least READ_ONCE(page->flags) if we care
> about concurrency.

Makes sense. I copied this code from page_cpupid_xchg_last() in
mm/mmzone.c which has the same problem. I'll send a patch to fix that
one as well.

Peter
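
The READ_ONCE() suggestion from the review can be sketched in user
space roughly as follows. This is not the kernel code and not a
revised version of the patch: READ_ONCE_U64 is a hypothetical stand-in
built on a volatile access, __sync_val_compare_and_swap() is a
GCC/Clang builtin used here in place of cmpxchg(), TAG_SHIFT is a
made-up bit position, and the tag inversion (tag ^= 0xff) is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK  UINT64_C(0xff)
#define TAG_SHIFT 56	/* hypothetical; not the kernel's KASAN_TAG_PGSHIFT */

/* Stand-in for READ_ONCE(): a volatile load the compiler cannot split or repeat. */
#define READ_ONCE_U64(x) (*(const volatile uint64_t *)&(x))

static void tag_set(uint64_t *flagsp, uint8_t tag)
{
	uint64_t old_flags, flags;

	do {
		/* One snapshot feeds both the expected and the new value. */
		old_flags = flags = READ_ONCE_U64(*flagsp);
		flags &= ~(TAG_MASK << TAG_SHIFT);
		flags |= ((uint64_t)tag & TAG_MASK) << TAG_SHIFT;
		/* Like cmpxchg(), the builtin returns the value found in memory. */
	} while (__sync_val_compare_and_swap(flagsp, old_flags, flags) != old_flags);
}

/* Usage: unrelated bits survive, only the tag field changes. */
static void tag_set_example(void)
{
	uint64_t f = UINT64_C(0x8);

	tag_set(&f, 0x5a);
	assert(f == ((UINT64_C(0x5a) << TAG_SHIFT) | UINT64_C(0x8)));
}
```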

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c768a7c81b0b..b544b0a9f537 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1531,11 +1531,17 @@  static inline u8 page_kasan_tag(const struct page *page)
 
 static inline void page_kasan_tag_set(struct page *page, u8 tag)
 {
-	if (kasan_enabled()) {
-		tag ^= 0xff;
-		page->flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
-		page->flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
-	}
+	unsigned long old_flags, flags;
+
+	if (!kasan_enabled())
+		return;
+
+	tag ^= 0xff;
+	do {
+		old_flags = flags = page->flags;
+		flags &= ~(KASAN_TAG_MASK << KASAN_TAG_PGSHIFT);
+		flags |= (tag & KASAN_TAG_MASK) << KASAN_TAG_PGSHIFT;
+	} while (unlikely(cmpxchg(&page->flags, old_flags, flags) != old_flags));
 }
 
 static inline void page_kasan_tag_reset(struct page *page)
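
The "tag ^= 0xff" step in the patch means the tag is stored inverted in
the flags word. A minimal user-space sketch of that encoding follows,
assuming the read side undoes the inversion the same way (the body of
page_kasan_tag() is not shown in this diff). flags_with_tag(),
tag_from_flags() and TAG_SHIFT are hypothetical names and values.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK  UINT64_C(0xff)
#define TAG_SHIFT 56	/* hypothetical bit position */

/* Store the tag inverted in the tag field, leaving other bits alone. */
static uint64_t flags_with_tag(uint64_t flags, uint8_t tag)
{
	tag ^= 0xff;
	flags &= ~(TAG_MASK << TAG_SHIFT);
	flags |= ((uint64_t)tag & TAG_MASK) << TAG_SHIFT;
	return flags;
}

/* Assumed read side: extract the field and undo the inversion. */
static uint8_t tag_from_flags(uint64_t flags)
{
	uint8_t tag = (uint8_t)((flags >> TAG_SHIFT) & TAG_MASK);

	return tag ^ 0xff;
}

/* Usage: a zeroed tag field decodes as 0xff, and tags round-trip. */
static void tag_roundtrip_example(void)
{
	assert(tag_from_flags(0) == 0xff);
	assert(tag_from_flags(flags_with_tag(0x8, 0x5a)) == 0x5a);
	assert((flags_with_tag(0x8, 0x5a) & 0x8) == 0x8);
}
```

Under this encoding, a flags word whose tag field has never been
written decodes as tag 0xff.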