Message ID | 20230420210945.2313627-1-pcc@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | arm64: Also reset KASAN tag if page is not PG_mte_tagged | expand |
On Thu, Apr 20, 2023 at 02:09:45PM -0700, Peter Collingbourne wrote: > Consider the following sequence of events: > > 1) A page in a PROT_READ|PROT_WRITE VMA is faulted. > 2) Page migration allocates a page with the KASAN allocator, > causing it to receive a non-match-all tag, and uses it > to replace the page faulted in 1. > 3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1. Ah, so there is no race here, it's simply because the page allocation for migration has a non-match-all kasan tag in page->flags. How do we handle the non-migration case with mprotect()? IIRC post_alloc_hook() always resets the page->flags since GFP_HIGHUSER_MOVABLE has the __GFP_SKIP_KASAN_UNPOISON flag. > As a result of step 3, we are left with a non-match-all tag for a page > with tags accessible to userspace, which can lead to the same kind of > tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in > copy_highpage with HW tags only") intended to fix. > > The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED > is that they cannot have a non-match-all tag. As a result of step 2, the > invariant is broken. This means that the fix in the referenced commit > was incomplete and we also need to reset the tag for pages without > PG_mte_tagged. > > Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags") This commit was reverted in 20794545c146 (arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"). It looks a bit strange to fix it up. > diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c > index 4aadcfb01754..a7bb20055ce0 100644 > --- a/arch/arm64/mm/copypage.c > +++ b/arch/arm64/mm/copypage.c > @@ -21,9 +21,10 @@ void copy_highpage(struct page *to, struct page *from) > > copy_page(kto, kfrom); > > + if (kasan_hw_tags_enabled()) > + page_kasan_tag_reset(to); > + > if (system_supports_mte() && page_mte_tagged(from)) { > - if (kasan_hw_tags_enabled()) > - page_kasan_tag_reset(to); This should work but can we not do this at allocation time like we do for the source page and remove any page_kasan_tag_reset() here altogether?
On Fri, Apr 21, 2023 at 5:24 AM Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Thu, Apr 20, 2023 at 02:09:45PM -0700, Peter Collingbourne wrote: > > Consider the following sequence of events: > > > > 1) A page in a PROT_READ|PROT_WRITE VMA is faulted. > > 2) Page migration allocates a page with the KASAN allocator, > > causing it to receive a non-match-all tag, and uses it > > to replace the page faulted in 1. > > 3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1. > > Ah, so there is no race here, it's simply because the page allocation > for migration has a non-match-all kasan tag in page->flags. > > How do we handle the non-migration case with mprotect()? IIRC > post_alloc_hook() always resets the page->flags since > GFP_HIGHUSER_MOVABLE has the __GFP_SKIP_KASAN_UNPOISON flag. Yes, that's how it normally works. > > As a result of step 3, we are left with a non-match-all tag for a page > > with tags accessible to userspace, which can lead to the same kind of > > tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in > > copy_highpage with HW tags only") intended to fix. > > > > The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED > > is that they cannot have a non-match-all tag. As a result of step 2, the > > invariant is broken. This means that the fix in the referenced commit > > was incomplete and we also need to reset the tag for pages without > > PG_mte_tagged. > > > > Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags") > > This commit was reverted in 20794545c146 (arm64: kasan: Revert "arm64: > mte: reset the page tag in page->flags"). It looks a bit strange to fix > it up. It does seem strange but I think it is correct because that is when the bug (resetting tag only if PG_mte_tagged) was introduced. The revert preserved the bug because it did not account for the migration case, which means that it didn't account for migration+mprotect either. > > diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c > > index 4aadcfb01754..a7bb20055ce0 100644 > > --- a/arch/arm64/mm/copypage.c > > +++ b/arch/arm64/mm/copypage.c > > @@ -21,9 +21,10 @@ void copy_highpage(struct page *to, struct page *from) > > > > copy_page(kto, kfrom); > > > > + if (kasan_hw_tags_enabled()) > > + page_kasan_tag_reset(to); > > + > > if (system_supports_mte() && page_mte_tagged(from)) { > > - if (kasan_hw_tags_enabled()) > > - page_kasan_tag_reset(to); > > This should work but can we not do this at allocation time like we do > for the source page and remove any page_kasan_tag_reset() here > altogether? That would be difficult because of the number of different ways that the page can be allocated. That's why we also decided to reset it here in commit e74a68468062. Peter
On Fri, Apr 21, 2023 at 10:20 AM Peter Collingbourne <pcc@google.com> wrote: > > On Fri, Apr 21, 2023 at 5:24 AM Catalin Marinas <catalin.marinas@arm.com> wrote: > > > > On Thu, Apr 20, 2023 at 02:09:45PM -0700, Peter Collingbourne wrote: > > > Consider the following sequence of events: > > > > > > 1) A page in a PROT_READ|PROT_WRITE VMA is faulted. > > > 2) Page migration allocates a page with the KASAN allocator, > > > causing it to receive a non-match-all tag, and uses it > > > to replace the page faulted in 1. > > > 3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1. > > > > Ah, so there is no race here, it's simply because the page allocation > > for migration has a non-match-all kasan tag in page->flags. > > > > How do we handle the non-migration case with mprotect()? IIRC > > post_alloc_hook() always resets the page->flags since > > GFP_HIGHUSER_MOVABLE has the __GFP_SKIP_KASAN_UNPOISON flag. > > Yes, that's how it normally works. > > > > As a result of step 3, we are left with a non-match-all tag for a page > > > with tags accessible to userspace, which can lead to the same kind of > > > tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in > > > copy_highpage with HW tags only") intended to fix. > > > > > > The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED > > > is that they cannot have a non-match-all tag. As a result of step 2, the > > > invariant is broken. This means that the fix in the referenced commit > > > was incomplete and we also need to reset the tag for pages without > > > PG_mte_tagged. > > > > > > Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags") > > > > This commit was reverted in 20794545c146 (arm64: kasan: Revert "arm64: > > mte: reset the page tag in page->flags"). It looks a bit strange to fix > > it up. > > It does seem strange but I think it is correct because that is when > the bug (resetting tag only if PG_mte_tagged) was introduced. The > revert preserved the bug because it did not account for the migration > case, which means that it didn't account for migration+mprotect > either. > > > > diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c > > > index 4aadcfb01754..a7bb20055ce0 100644 > > > --- a/arch/arm64/mm/copypage.c > > > +++ b/arch/arm64/mm/copypage.c > > > @@ -21,9 +21,10 @@ void copy_highpage(struct page *to, struct page *from) > > > > > > copy_page(kto, kfrom); > > > > > > + if (kasan_hw_tags_enabled()) > > > + page_kasan_tag_reset(to); > > > + > > > if (system_supports_mte() && page_mte_tagged(from)) { > > > - if (kasan_hw_tags_enabled()) > > > - page_kasan_tag_reset(to); > > > > This should work but can we not do this at allocation time like we do > > for the source page and remove any page_kasan_tag_reset() here > > altogether? > > That would be difficult because of the number of different ways that > the page can be allocated. That's why we also decided to reset it here > in commit e74a68468062. Ping. Peter
On Thu, Apr 20, 2023 at 02:09:45PM -0700, Peter Collingbourne wrote: > Consider the following sequence of events: > > 1) A page in a PROT_READ|PROT_WRITE VMA is faulted. > 2) Page migration allocates a page with the KASAN allocator, > causing it to receive a non-match-all tag, and uses it > to replace the page faulted in 1. > 3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1. > > As a result of step 3, we are left with a non-match-all tag for a page > with tags accessible to userspace, which can lead to the same kind of > tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in > copy_highpage with HW tags only") intended to fix. > > The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED > is that they cannot have a non-match-all tag. As a result of step 2, the > invariant is broken. This means that the fix in the referenced commit > was incomplete and we also need to reset the tag for pages without > PG_mte_tagged. > > Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags") > Cc: <stable@vger.kernel.org> # 5.15 > Link: https://linux-review.googlesource.com/id/I7409cdd41acbcb215c2a7417c1e50d37b875beff > Signed-off-by: Peter Collingbourne <pcc@google.com> Sorry, forgot to reply: Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
On Thu, 20 Apr 2023 14:09:45 -0700, Peter Collingbourne wrote: > Consider the following sequence of events: > > 1) A page in a PROT_READ|PROT_WRITE VMA is faulted. > 2) Page migration allocates a page with the KASAN allocator, > causing it to receive a non-match-all tag, and uses it > to replace the page faulted in 1. > 3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1. > > [...] Applied to arm64 (for-next/fixes), thanks! [1/1] arm64: Also reset KASAN tag if page is not PG_mte_tagged https://git.kernel.org/arm64/c/2efbafb91e12 Cheers,
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c index 4aadcfb01754..a7bb20055ce0 100644 --- a/arch/arm64/mm/copypage.c +++ b/arch/arm64/mm/copypage.c @@ -21,9 +21,10 @@ void copy_highpage(struct page *to, struct page *from) copy_page(kto, kfrom); + if (kasan_hw_tags_enabled()) + page_kasan_tag_reset(to); + if (system_supports_mte() && page_mte_tagged(from)) { - if (kasan_hw_tags_enabled()) - page_kasan_tag_reset(to); /* It's a new page, shouldn't have been tagged yet */ WARN_ON_ONCE(!try_page_mte_tagging(to)); mte_copy_page_tags(kto, kfrom);
Consider the following sequence of events: 1) A page in a PROT_READ|PROT_WRITE VMA is faulted. 2) Page migration allocates a page with the KASAN allocator, causing it to receive a non-match-all tag, and uses it to replace the page faulted in 1. 3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1. As a result of step 3, we are left with a non-match-all tag for a page with tags accessible to userspace, which can lead to the same kind of tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in copy_highpage with HW tags only") intended to fix. The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED is that they cannot have a non-match-all tag. As a result of step 2, the invariant is broken. This means that the fix in the referenced commit was incomplete and we also need to reset the tag for pages without PG_mte_tagged. Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags") Cc: <stable@vger.kernel.org> # 5.15 Link: https://linux-review.googlesource.com/id/I7409cdd41acbcb215c2a7417c1e50d37b875beff Signed-off-by: Peter Collingbourne <pcc@google.com> --- arch/arm64/mm/copypage.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)