Message ID | 20220916135953.1320601-4-keescook@chromium.org (mailing list archive)
---|---
State | New
Series | x86/dumpstack: Inline copy_from_user_nmi()
On Fri, Sep 16, 2022 at 06:59:57AM -0700, Kees Cook wrote:
> The check_object_size() checks under CONFIG_HARDENED_USERCOPY need to be
> more defensive against running from interrupt context. Use a best-effort
> check for VMAP areas when running in interrupt context

I had something more like this in mind:

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 096d48aa3437..2b7c52e76856 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -215,7 +215,7 @@ extern struct vm_struct *__get_vm_area_caller(unsigned long size,
 void free_vm_area(struct vm_struct *area);
 extern struct vm_struct *remove_vm_area(const void *addr);
 extern struct vm_struct *find_vm_area(const void *addr);
-struct vmap_area *find_vmap_area(unsigned long addr);
+struct vmap_area *find_vmap_area_try(unsigned long addr);
 
 static inline bool is_vm_area_hugepages(const void *addr)
 {
diff --git a/mm/usercopy.c b/mm/usercopy.c
index c1ee15a98633..e0fb605c1b38 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -173,7 +173,11 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
 	}
 
 	if (is_vmalloc_addr(ptr)) {
-		struct vmap_area *area = find_vmap_area(addr);
+		struct vmap_area *area = find_vmap_area_try(addr);
+
+		/* We may be in NMI context */
+		if (area == ERR_PTR(-EAGAIN))
+			return;
 
 		if (!area)
 			usercopy_abort("vmalloc", "no area", to_user, 0, n);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index dd6cdb201195..2ea76cb56d4b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1829,7 +1829,7 @@ static void free_unmap_vmap_area(struct vmap_area *va)
 	free_vmap_area_noflush(va);
 }
 
-struct vmap_area *find_vmap_area(unsigned long addr)
+static struct vmap_area *find_vmap_area(unsigned long addr)
 {
 	struct vmap_area *va;
 
@@ -1840,6 +1840,18 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 	return va;
 }
 
+struct vmap_area *find_vmap_area_try(unsigned long addr)
+{
+	struct vmap_area *va;
+
+	if (!spin_lock(&vmap_area_lock))
+		return ERR_PTR(-EAGAIN);
+	va = __find_vmap_area(addr, &vmap_area_root);
+	spin_unlock(&vmap_area_lock);
+
+	return va;
+}
+
 /*** Per cpu kva allocator ***/
 
 /*
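One note on reading the sketch above: kernel spin_lock() returns void, so the failable "if (!spin_lock(...))" is shorthand rather than compilable code. A minimal compilable form of the same best-effort idea uses spin_trylock(), which is where the thread ends up below:

	if (!spin_trylock(&vmap_area_lock))
		return ERR_PTR(-EAGAIN);
	va = __find_vmap_area(addr, &vmap_area_root);
	spin_unlock(&vmap_area_lock);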
On Fri, Sep 16, 2022 at 03:46:07PM +0100, Matthew Wilcox wrote:
> On Fri, Sep 16, 2022 at 06:59:57AM -0700, Kees Cook wrote:
> > The check_object_size() checks under CONFIG_HARDENED_USERCOPY need to be
> > more defensive against running from interrupt context. Use a best-effort
> > check for VMAP areas when running in interrupt context
>
> I had something more like this in mind:

Yeah, I like -EAGAIN. I'd like to keep the interrupt test to choose lock
vs trylock, otherwise it's trivial to bypass the hardening test by having
all the other CPUs beating on the spinlock.

Thanks!

-Kees
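To make that concern concrete, a hypothetical scenario (not posted code; it assumes the failable lock in the earlier sketch behaves like spin_trylock()):

/*
 * CPUs 0..n-1 (attacker threads)       CPU n (process-context usercopy)
 * ------------------------------       --------------------------------
 * loop on paths that take              check_heap_object()
 * vmap_area_lock (e.g. vmalloc/          find_vmap_area_try()
 * vfree), keeping it almost               spin_trylock() fails
 * always held                             -> "could not check", so the
 *                                            usercopy bounds check is
 *                                            silently skipped
 */

Gating on in_interrupt() confines the best-effort path to contexts where blocking on the lock could deadlock (IRQ/NMI); process-context callers still take the lock, so contention only delays the check instead of defeating it.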
On Fri, Sep 16, 2022 at 06:59:57AM -0700, Kees Cook wrote:
> The check_object_size() checks under CONFIG_HARDENED_USERCOPY need to be
> more defensive against running from interrupt context. Use a best-effort
> check for VMAP areas when running in interrupt context
>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Link: https://lore.kernel.org/linux-mm/YyQ2CSdIJdvQPSPO@casper.infradead.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Yu Zhao <yuzhao@google.com>
> Cc: dev@der-flo.net
> Cc: linux-mm@kvack.org
> Cc: linux-hardening@vger.kernel.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  include/linux/vmalloc.h |  1 +
>  mm/usercopy.c           | 11 ++++++++++-
>  mm/vmalloc.c            | 11 +++++++++++
>  3 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 096d48aa3437..c8a00f181a11 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -216,6 +216,7 @@ void free_vm_area(struct vm_struct *area);
>  extern struct vm_struct *remove_vm_area(const void *addr);
>  extern struct vm_struct *find_vm_area(const void *addr);
>  struct vmap_area *find_vmap_area(unsigned long addr);
> +struct vmap_area *find_vmap_area_try(unsigned long addr);
>
>  static inline bool is_vm_area_hugepages(const void *addr)
>  {
> diff --git a/mm/usercopy.c b/mm/usercopy.c
> index c1ee15a98633..4a371099ac64 100644
> --- a/mm/usercopy.c
> +++ b/mm/usercopy.c
> @@ -173,7 +173,16 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
>  	}
>
>  	if (is_vmalloc_addr(ptr)) {
> -		struct vmap_area *area = find_vmap_area(addr);
> +		struct vmap_area *area;
> +
> +		if (unlikely(in_interrupt())) {
> +			area = find_vmap_area_try(addr);
> +			/* Give up under interrupt to avoid deadlocks. */
> +			if (!area)
> +				return;
> +		} else {
> +			area = find_vmap_area(addr);
> +		}
>
>  		if (!area)
>  			usercopy_abort("vmalloc", "no area", to_user, 0, n);
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index dd6cdb201195..f14f1902c2f6 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1840,6 +1840,17 @@ struct vmap_area *find_vmap_area(unsigned long addr)
>  	return va;
>  }
>
> +struct vmap_area *find_vmap_area_try(unsigned long addr)
>
I think it makes sense to add a comment explaining why we have such a
function and from which context it is supposed to be used. Maybe also
rename it to something with an "atomic" prefix.

Thanks.

--
Uladzislau Rezki
On Fri, Sep 16, 2022 at 08:09:16AM -0700, Kees Cook wrote:
> On Fri, Sep 16, 2022 at 03:46:07PM +0100, Matthew Wilcox wrote:
> > On Fri, Sep 16, 2022 at 06:59:57AM -0700, Kees Cook wrote:
> > > The check_object_size() checks under CONFIG_HARDENED_USERCOPY need to be
> > > more defensive against running from interrupt context. Use a best-effort
> > > check for VMAP areas when running in interrupt context
> >
> > I had something more like this in mind:
>
> Yeah, I like -EAGAIN. I'd like to keep the interrupt test to choose lock
> vs trylock, otherwise it's trivial to bypass the hardening test by having
> all the other CPUs beating on the spinlock.

I was thinking about this:

+++ b/mm/vmalloc.c
@@ -1844,12 +1844,19 @@
 {
 	struct vmap_area *va;
 
-	if (!spin_lock(&vmap_area_lock))
-		return ERR_PTR(-EAGAIN);
+	/*
+	 * It's safe to walk the rbtree under the RCU lock, but we may
+	 * incorrectly find no vmap_area if the tree is being modified.
+	 */
+	rcu_read_lock();
 	va = __find_vmap_area(addr, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	if (!va && in_interrupt())
+		va = ERR_PTR(-EAGAIN);
+	rcu_read_unlock();
 
-	return va;
+	if (va)
+		return va;
+	return find_vmap_area(addr);
 }
 
 /*** Per cpu kva allocator ***/

... but I don't think that works since vmap_areas aren't freed by RCU,
and I think they're reused without going through an RCU cycle.

So here's attempt #4, which actually compiles, and is, I think, what you
had in mind.

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 096d48aa3437..2b7c52e76856 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -215,7 +215,7 @@ extern struct vm_struct *__get_vm_area_caller(unsigned long size,
 void free_vm_area(struct vm_struct *area);
 extern struct vm_struct *remove_vm_area(const void *addr);
 extern struct vm_struct *find_vm_area(const void *addr);
-struct vmap_area *find_vmap_area(unsigned long addr);
+struct vmap_area *find_vmap_area_try(unsigned long addr);
 
 static inline bool is_vm_area_hugepages(const void *addr)
 {
diff --git a/mm/usercopy.c b/mm/usercopy.c
index c1ee15a98633..e0fb605c1b38 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -173,7 +173,11 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
 	}
 
 	if (is_vmalloc_addr(ptr)) {
-		struct vmap_area *area = find_vmap_area(addr);
+		struct vmap_area *area = find_vmap_area_try(addr);
+
+		/* We may be in NMI context */
+		if (area == ERR_PTR(-EAGAIN))
+			return;
 
 		if (!area)
 			usercopy_abort("vmalloc", "no area", to_user, 0, n);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index dd6cdb201195..c47b3b5d1c2d 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1829,7 +1829,7 @@ static void free_unmap_vmap_area(struct vmap_area *va)
 	free_vmap_area_noflush(va);
 }
 
-struct vmap_area *find_vmap_area(unsigned long addr)
+static struct vmap_area *find_vmap_area(unsigned long addr)
 {
 	struct vmap_area *va;
 
@@ -1840,6 +1840,26 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 	return va;
 }
 
+/*
+ * The vmap_area_lock is not interrupt-safe, and we can end up here from
+ * NMI context, so it's not worth even trying to make it IRQ-safe.
+ */
+struct vmap_area *find_vmap_area_try(unsigned long addr)
+{
+	struct vmap_area *va;
+
+	if (in_interrupt()) {
+		if (!spin_trylock(&vmap_area_lock))
+			return ERR_PTR(-EAGAIN);
+	} else {
+		spin_lock(&vmap_area_lock);
+	}
+	va = __find_vmap_area(addr, &vmap_area_root);
+	spin_unlock(&vmap_area_lock);
+
+	return va;
+}
+
 /*** Per cpu kva allocator ***/
 
 /*
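To spell out why the rcu_read_lock() walk in the first hunk above is unsafe: vmap_areas go straight back to their cache and can be reused immediately, so a lockless reader may dereference a recycled node. A hypothetical sketch of what safe lockless walking would require; the struct and helpers below are illustrative stand-ins, not mm/vmalloc.c code:

#include <linux/rcupdate.h>
#include <linux/slab.h>

/* Stand-in for struct vmap_area, with an rcu_head added (hypothetical). */
struct vmap_area_sketch {
	unsigned long va_start;
	unsigned long va_end;
	struct rcu_head rcu;
};

static void vmap_area_sketch_free_cb(struct rcu_head *head)
{
	struct vmap_area_sketch *va =
		container_of(head, struct vmap_area_sketch, rcu);

	kfree(va);	/* the real code frees into a kmem_cache */
}

/*
 * Freeing is deferred past a grace period, so a concurrent
 * rcu_read_lock() walker can never see the memory recycled.
 */
static void vmap_area_sketch_free(struct vmap_area_sketch *va)
{
	call_rcu(&va->rcu, vmap_area_sketch_free_cb);
}

Since vmalloc does none of this, the thread falls back to taking the lock, with trylock semantics only in atomic context.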
On Fri, Sep 16, 2022 at 08:15:24PM +0100, Matthew Wilcox wrote:
> On Fri, Sep 16, 2022 at 08:09:16AM -0700, Kees Cook wrote:
> > On Fri, Sep 16, 2022 at 03:46:07PM +0100, Matthew Wilcox wrote:
> > > On Fri, Sep 16, 2022 at 06:59:57AM -0700, Kees Cook wrote:
> > > > The check_object_size() checks under CONFIG_HARDENED_USERCOPY need to be
> > > > more defensive against running from interrupt context. Use a best-effort
> > > > check for VMAP areas when running in interrupt context
> > >
> > > I had something more like this in mind:
> >
> > Yeah, I like -EAGAIN. I'd like to keep the interrupt test to choose lock
> > vs trylock, otherwise it's trivial to bypass the hardening test by having
> > all the other CPUs beating on the spinlock.
>
> I was thinking about this:
>
> +++ b/mm/vmalloc.c
> @@ -1844,12 +1844,19 @@
>  {
>  	struct vmap_area *va;
>
> -	if (!spin_lock(&vmap_area_lock))
> -		return ERR_PTR(-EAGAIN);
> +	/*
> +	 * It's safe to walk the rbtree under the RCU lock, but we may
> +	 * incorrectly find no vmap_area if the tree is being modified.
> +	 */
> +	rcu_read_lock();
>  	va = __find_vmap_area(addr, &vmap_area_root);
> -	spin_unlock(&vmap_area_lock);
> +	if (!va && in_interrupt())
> +		va = ERR_PTR(-EAGAIN);
> +	rcu_read_unlock();
>
> -	return va;
> +	if (va)
> +		return va;
> +	return find_vmap_area(addr);
>  }
>
>  /*** Per cpu kva allocator ***/
>
> ... but I don't think that works since vmap_areas aren't freed by RCU,
> and I think they're reused without going through an RCU cycle.
>
Right you are. They would have to be freed via the RCU core, so wrapping
the walk in rcu_*() is useless here.

> So here's attempt #4, which actually compiles, and is, I think, what you
> had in mind.
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 096d48aa3437..2b7c52e76856 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -215,7 +215,7 @@ extern struct vm_struct *__get_vm_area_caller(unsigned long size,
>  void free_vm_area(struct vm_struct *area);
>  extern struct vm_struct *remove_vm_area(const void *addr);
>  extern struct vm_struct *find_vm_area(const void *addr);
> -struct vmap_area *find_vmap_area(unsigned long addr);
> +struct vmap_area *find_vmap_area_try(unsigned long addr);
>
>  static inline bool is_vm_area_hugepages(const void *addr)
>  {
> diff --git a/mm/usercopy.c b/mm/usercopy.c
> index c1ee15a98633..e0fb605c1b38 100644
> --- a/mm/usercopy.c
> +++ b/mm/usercopy.c
> @@ -173,7 +173,11 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
>  	}
>
>  	if (is_vmalloc_addr(ptr)) {
> -		struct vmap_area *area = find_vmap_area(addr);
> +		struct vmap_area *area = find_vmap_area_try(addr);
> +
> +		/* We may be in NMI context */
> +		if (area == ERR_PTR(-EAGAIN))
> +			return;
>
>  		if (!area)
>  			usercopy_abort("vmalloc", "no area", to_user, 0, n);
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index dd6cdb201195..c47b3b5d1c2d 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1829,7 +1829,7 @@ static void free_unmap_vmap_area(struct vmap_area *va)
>  	free_vmap_area_noflush(va);
>  }
>
> -struct vmap_area *find_vmap_area(unsigned long addr)
> +static struct vmap_area *find_vmap_area(unsigned long addr)
>  {
>  	struct vmap_area *va;
>
> @@ -1840,6 +1840,26 @@ struct vmap_area *find_vmap_area(unsigned long addr)
>  	return va;
>  }
>
> +/*
> + * The vmap_area_lock is not interrupt-safe, and we can end up here from
> + * NMI context, so it's not worth even trying to make it IRQ-safe.
> + */
> +struct vmap_area *find_vmap_area_try(unsigned long addr)
> +{
> +	struct vmap_area *va;
> +
> +	if (in_interrupt()) {
> +		if (!spin_trylock(&vmap_area_lock))
> +			return ERR_PTR(-EAGAIN);
> +	} else {
> +		spin_lock(&vmap_area_lock);
> +	}
> +	va = __find_vmap_area(addr, &vmap_area_root);
> +	spin_unlock(&vmap_area_lock);
> +
> +	return va;
> +}
> +
If we look at it the other way around: there is a user that tries to
access the tree from IRQ context. Can we just extend find_vmap_area()
itself to check in_interrupt() and use a trylock in that case?

--
Uladzislau Rezki
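A sketch of what that suggestion could look like folded into find_vmap_area() itself (hypothetical; neither posted patch does this):

struct vmap_area *find_vmap_area(unsigned long addr)
{
	struct vmap_area *va;

	/*
	 * Hypothetical merged variant: atomic-context callers get
	 * best-effort semantics, everyone else blocks as before.
	 */
	if (in_interrupt()) {
		if (!spin_trylock(&vmap_area_lock))
			return ERR_PTR(-EAGAIN);
	} else {
		spin_lock(&vmap_area_lock);
	}
	va = __find_vmap_area(addr, &vmap_area_root);
	spin_unlock(&vmap_area_lock);

	return va;
}

The trade-off is that every existing caller of find_vmap_area() would then have to handle ERR_PTR(-EAGAIN), whereas a separate _try() (or "atomic") variant keeps the best-effort contract opt-in.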
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 096d48aa3437..c8a00f181a11 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -216,6 +216,7 @@ void free_vm_area(struct vm_struct *area);
 extern struct vm_struct *remove_vm_area(const void *addr);
 extern struct vm_struct *find_vm_area(const void *addr);
 struct vmap_area *find_vmap_area(unsigned long addr);
+struct vmap_area *find_vmap_area_try(unsigned long addr);
 
 static inline bool is_vm_area_hugepages(const void *addr)
 {
diff --git a/mm/usercopy.c b/mm/usercopy.c
index c1ee15a98633..4a371099ac64 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -173,7 +173,16 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
 	}
 
 	if (is_vmalloc_addr(ptr)) {
-		struct vmap_area *area = find_vmap_area(addr);
+		struct vmap_area *area;
+
+		if (unlikely(in_interrupt())) {
+			area = find_vmap_area_try(addr);
+			/* Give up under interrupt to avoid deadlocks. */
+			if (!area)
+				return;
+		} else {
+			area = find_vmap_area(addr);
+		}
 
 		if (!area)
 			usercopy_abort("vmalloc", "no area", to_user, 0, n);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index dd6cdb201195..f14f1902c2f6 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1840,6 +1840,17 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 	return va;
 }
 
+struct vmap_area *find_vmap_area_try(unsigned long addr)
+{
+	struct vmap_area *va = NULL;
+
+	if (spin_trylock(&vmap_area_lock)) {
+		va = __find_vmap_area(addr, &vmap_area_root);
+		spin_unlock(&vmap_area_lock);
+	}
+	return va;
+}
+
 /*** Per cpu kva allocator ***/
 
 /*
The check_object_size() checks under CONFIG_HARDENED_USERCOPY need to be
more defensive against running from interrupt context. Use a best-effort
check for VMAP areas when running in interrupt context.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/linux-mm/YyQ2CSdIJdvQPSPO@casper.infradead.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yu Zhao <yuzhao@google.com>
Cc: dev@der-flo.net
Cc: linux-mm@kvack.org
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/vmalloc.h |  1 +
 mm/usercopy.c           | 11 ++++++++++-
 mm/vmalloc.c            | 11 +++++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)
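For context, a sketch of the call chain this hardens (inferred from the series title and the hardened-usercopy hooks; exact helper names vary by kernel version):

  NMI handler (e.g. perf copying user stack/callchain data)
    copy_from_user_nmi()
      __copy_from_user_inatomic()
        check_object_size()        /* CONFIG_HARDENED_USERCOPY */
          check_heap_object()
            find_vmap_area()       /* takes vmap_area_lock: deadlocks if
                                      the NMI landed on a CPU that already
                                      holds it, hence the best-effort
                                      lookup above */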