From patchwork Thu Jun 6 16:48:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Potapenko X-Patchwork-Id: 10981071 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 20C2A92A for ; Fri, 7 Jun 2019 08:55:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 112CC28956 for ; Fri, 7 Jun 2019 08:55:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0453A289B1; Fri, 7 Jun 2019 08:55:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.5 required=2.0 tests=BAYES_00,DATE_IN_PAST_12_24, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE, USER_IN_DEF_DKIM_WL autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9F63C28A03 for ; Fri, 7 Jun 2019 08:55:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 242B56B000C; Fri, 7 Jun 2019 04:55:41 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 1F2C86B000E; Fri, 7 Jun 2019 04:55:41 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0BB5C6B0266; Fri, 7 Jun 2019 04:55:41 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-yw1-f69.google.com (mail-yw1-f69.google.com [209.85.161.69]) by kanga.kvack.org (Postfix) with ESMTP id D7DB96B000C for ; Fri, 7 Jun 2019 04:55:40 -0400 (EDT) Received: by mail-yw1-f69.google.com with SMTP id y205so1273276ywy.19 for ; Fri, 07 Jun 2019 01:55:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=EgCMZEXKSY8pRE3AOo7BTYLSTDaJL2WRNvKvyETJfZQ=; b=G3uJOoqqxkYDOg9b5/RJsTVUyp+Enbrajed4akz7ztpa5oFwxszBr4WSrfF9wjslG5 zcKOTQIf8Lpyo7ydsWkOddjvB++k8e0wGbcxEOUlAvsaBxTjmriWwO7RM+hUJek3biHi OofclxzcLKImfaAPo35Ge8d3FPYIwhIvM7LqXcBbGuqoAyga4n0L9pXRehI8U7xlbHHP jrusDAImaalDhgaYYnCqcNSNq7p0m9ppsvEYeODXRBZVDpYgVFOxx9UMMwoygbFhZp7C 6ap+xN+taQ2vJzA2OGqisEZKNg5lrLlFPXSlSyxh8BRN4iutk5kMmPdMPHL/kPG4mZV2 a8WA== X-Gm-Message-State: APjAAAVY/hgfuw1xd+zG505XmA5NyDC+yLxD7lTUtoPi0uLcRwq0+AKv RBx8OHtAqo/cWZraFWc+XdaZ+SSBNU9aersasX7t/jHzBspkeY6aShPfYs6uxHngnuXM7Ar4E4l amu7O6qDF7Dc3FUwlNNyLbYGbVOL+pV7Gp8Bz0er9NV70ppEgRsDbRIwbi1LHRgbhqQ== X-Received: by 2002:a81:7882:: with SMTP id t124mr24500925ywc.92.1559897740462; Fri, 07 Jun 2019 01:55:40 -0700 (PDT) X-Received: by 2002:a81:7882:: with SMTP id t124mr24500909ywc.92.1559897739415; Fri, 07 Jun 2019 01:55:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1559897739; cv=none; d=google.com; s=arc-20160816; b=MtfxQvHIwD6Ug5tksTkyrTbl8e1UZhfb6wi0Y5MWrGfSg+fK+kB6Ca4eqQ5MAI2Dz2 zHq/8bPS+hqn/K2URlWjzLjXCrbceliQrNYWq6dGEXy6kC9NhA83xkHJrDX/oyMV8omC VL4ZDxkVpLiJgHqbqWOaFip3j1Ck+xi0jS9v6FTYsqEHysekJVOVsCW5V0SJuIlLpiLZ AOceaZSofUCUBp3uf/yLdp9yemr+sAvWMFse2zYlKbG2kg0psKkQZ+hRlVZF7uDC31n/ s5QRqhulmAbfagSwvbhp8xaui9XlcaF2WI2c+obcdnZaKLjm+MLw3nBuaf9mbNTagTJF 8oZQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:dkim-signature; bh=EgCMZEXKSY8pRE3AOo7BTYLSTDaJL2WRNvKvyETJfZQ=; b=azJ7+RVxM04A+m2Idgnl3JYeDwCtpNwyp9HIq1ev7YGoZ08RvJReyjSLXB2zP0fIjp EQylSrL19sBZ5bipa9oLn3+dhHswFFtpZx0cu6YqKd6IUfARyE6SPzXzljGvc8CTS5Jl Z/X+bYZ4B1dAKG/BAvFe3NrbC0QQQYOl4A3u2GHeqpf8HlynDqTmRrwoVWAdE6YqBUJO 70Aqgr1prx9ESknuynOO9//iD2xf9v1vd1VyRsJSvRJlOGWJecAKcD/7EE2+XaRWLPLb cV72hNRzpZ6jdzrzBmOAu2k6JgORaFVJeOSCuGil2qb7x9G0PjDJtHj8lQzjN5UWEPxW Ab4Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=VyeCN5UI; spf=pass (google.com: domain of 3iib6xaykck0tyvqretbbtyr.pbzyvahk-zzxinpx.bet@flex--glider.bounces.google.com designates 209.85.220.73 as permitted sender) smtp.mailfrom=3iib6XAYKCK0TYVQReTbbTYR.PbZYVahk-ZZXiNPX.beT@flex--glider.bounces.google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from mail-sor-f73.google.com (mail-sor-f73.google.com. [209.85.220.73]) by mx.google.com with SMTPS id x1sor587679ywg.17.2019.06.07.01.55.39 for (Google Transport Security); Fri, 07 Jun 2019 01:55:39 -0700 (PDT) Received-SPF: pass (google.com: domain of 3iib6xaykck0tyvqretbbtyr.pbzyvahk-zzxinpx.bet@flex--glider.bounces.google.com designates 209.85.220.73 as permitted sender) client-ip=209.85.220.73; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=VyeCN5UI; spf=pass (google.com: domain of 3iib6xaykck0tyvqretbbtyr.pbzyvahk-zzxinpx.bet@flex--glider.bounces.google.com designates 209.85.220.73 as permitted sender) smtp.mailfrom=3iib6XAYKCK0TYVQReTbbTYR.PbZYVahk-ZZXiNPX.beT@flex--glider.bounces.google.com; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=EgCMZEXKSY8pRE3AOo7BTYLSTDaJL2WRNvKvyETJfZQ=; b=VyeCN5UIUrkDBRBK3BoweEa80+GLp635dVwpbW1YrOa1Y27Yf/o/8KPqzUPMdfuaPZ 0AU39EK4VFQvacRNBLD8oBPfnqNtZMxSDDsNAHi1VWdpzqnPEysW4fWvrw+TRtKcnzKf JkGglAWis84z1BeVXrhX113J3QKKYn62aBsc7kZe8UG3mU3pnn9gtr+l40J4ZbKJ5Swm tT59Q/wDEupO215dOJZ1phIhpK1RzMc5M86Ekp7loO01cnr4zcHSVv2rOkxo/F+ytG2x 6jmMCv9qrtGlXpnQZuCgmiQhQlWxBSFm9NWTIV/t5EGeYbkOq33/LZhQBRW0W7BlK7BV S8Ug== X-Google-Smtp-Source: APXvYqz17bitiEZO3ROxxuY/29NsfPMotYv87xwVJ4U2/PxIi9TIbfQ2bg+0DXpqQnaLlnk9S9+5wyrhgjk= X-Received: by 2002:a81:23c5:: with SMTP id j188mr25689805ywj.277.1559897738920; Fri, 07 Jun 2019 01:55:38 -0700 (PDT) Date: Thu, 6 Jun 2019 18:48:43 +0200 In-Reply-To: <20190606164845.179427-1-glider@google.com> Message-Id: <20190606164845.179427-2-glider@google.com> Mime-Version: 1.0 References: <20190606164845.179427-1-glider@google.com> X-Mailer: git-send-email 2.22.0.rc1.311.g5d7573a151-goog Subject: [PATCH v6 1/3] mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options From: Alexander Potapenko To: Andrew Morton , Christoph Lameter , Kees Cook Cc: Alexander Potapenko , Masahiro Yamada , Michal Hocko , James Morris , "Serge E. Hallyn" , Nick Desaulniers , Kostya Serebryany , Dmitry Vyukov , Sandeep Patil , Laura Abbott , Randy Dunlap , Jann Horn , Mark Rutland , Marco Elver , linux-mm@kvack.org, linux-security-module@vger.kernel.org, kernel-hardening@lists.openwall.com X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP The new options are needed to prevent possible information leaks and make control-flow bugs that depend on uninitialized values more deterministic. init_on_alloc=1 makes the kernel initialize newly allocated pages and heap objects with zeroes. Initialization is done at allocation time at the places where checks for __GFP_ZERO are performed. init_on_free=1 makes the kernel initialize freed pages and heap objects with zeroes upon their deletion. This helps to ensure sensitive data doesn't leak via use-after-free accesses. Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator returns zeroed memory. The two exceptions are slab caches with constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never zero-initialized to preserve their semantics. Both init_on_alloc and init_on_free default to zero, but those defaults can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. Slowdown for the new features compared to init_on_free=0, init_on_alloc=0: hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%) hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%) Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%) Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%) Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%) Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%) The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline is within the standard error. The new features are also going to pave the way for hardware memory tagging (e.g. arm64's MTE), which will require both on_alloc and on_free hooks to set the tags for heap objects. With MTE, tagging will have the same cost as memory initialization. Although init_on_free is rather costly, there are paranoid use-cases where in-memory data lifetime is desired to be minimized. There are various arguments for/against the realism of the associated threat models, but given that we'll need the infrastructre for MTE anyway, and there are people who want wipe-on-free behavior no matter what the performance cost, it seems reasonable to include it in this series. Signed-off-by: Alexander Potapenko To: Andrew Morton To: Christoph Lameter To: Kees Cook Cc: Masahiro Yamada Cc: Michal Hocko Cc: James Morris Cc: "Serge E. Hallyn" Cc: Nick Desaulniers Cc: Kostya Serebryany Cc: Dmitry Vyukov Cc: Sandeep Patil Cc: Laura Abbott Cc: Randy Dunlap Cc: Jann Horn Cc: Mark Rutland Cc: Marco Elver Cc: linux-mm@kvack.org Cc: linux-security-module@vger.kernel.org Cc: kernel-hardening@lists.openwall.com Acked-by: Kees Cook --- v2: - unconditionally initialize pages in kernel_init_free_pages() - comment from Randy Dunlap: drop 'default false' lines from Kconfig.hardening v3: - don't call kernel_init_free_pages() from memblock_free_pages() - adopted some Kees' comments for the patch description v4: - use NULL instead of 0 in slab_alloc_node() (found by kbuild test robot) - don't write to NULL object in slab_alloc_node() (found by Android testing) v5: - adjusted documentation wording as suggested by Kees - disable SLAB_POISON if auto-initialization is on - don't wipe RCU cache allocations made without __GFP_ZERO - dropped SLOB support --- .../admin-guide/kernel-parameters.txt | 9 +++ drivers/infiniband/core/uverbs_ioctl.c | 2 +- include/linux/mm.h | 22 +++++++ kernel/kexec_core.c | 2 +- mm/dmapool.c | 2 +- mm/page_alloc.c | 63 ++++++++++++++++--- mm/slab.c | 16 ++++- mm/slab.h | 19 ++++++ mm/slub.c | 33 ++++++++-- net/core/sock.c | 2 +- security/Kconfig.hardening | 29 +++++++++ 11 files changed, 180 insertions(+), 19 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 138f6664b2e2..84ee1121a2b9 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -1673,6 +1673,15 @@ initrd= [BOOT] Specify the location of the initial ramdisk + init_on_alloc= [MM] Fill newly allocated pages and heap objects with + zeroes. + Format: 0 | 1 + Default set by CONFIG_INIT_ON_ALLOC_DEFAULT_ON. + + init_on_free= [MM] Fill freed pages and heap objects with zeroes. + Format: 0 | 1 + Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON. + init_pkru= [x86] Specify the default memory protection keys rights register contents for all processes. 0x55555554 by default (disallow access to all but pkey 0). Can diff --git a/drivers/infiniband/core/uverbs_ioctl.c b/drivers/infiniband/core/uverbs_ioctl.c index 829b0c6944d8..61758201d9b2 100644 --- a/drivers/infiniband/core/uverbs_ioctl.c +++ b/drivers/infiniband/core/uverbs_ioctl.c @@ -127,7 +127,7 @@ __malloc void *_uverbs_alloc(struct uverbs_attr_bundle *bundle, size_t size, res = (void *)pbundle->internal_buffer + pbundle->internal_used; pbundle->internal_used = ALIGN(new_used, sizeof(*pbundle->internal_buffer)); - if (flags & __GFP_ZERO) + if (want_init_on_alloc(flags)) memset(res, 0, size); return res; } diff --git a/include/linux/mm.h b/include/linux/mm.h index 0e8834ac32b7..7733a341c0c4 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2685,6 +2685,28 @@ static inline void kernel_poison_pages(struct page *page, int numpages, int enable) { } #endif +#ifdef CONFIG_INIT_ON_ALLOC_DEFAULT_ON +DECLARE_STATIC_KEY_TRUE(init_on_alloc); +#else +DECLARE_STATIC_KEY_FALSE(init_on_alloc); +#endif +static inline bool want_init_on_alloc(gfp_t flags) +{ + if (static_branch_unlikely(&init_on_alloc)) + return true; + return flags & __GFP_ZERO; +} + +#ifdef CONFIG_INIT_ON_FREE_DEFAULT_ON +DECLARE_STATIC_KEY_TRUE(init_on_free); +#else +DECLARE_STATIC_KEY_FALSE(init_on_free); +#endif +static inline bool want_init_on_free(void) +{ + return static_branch_unlikely(&init_on_free); +} + extern bool _debug_pagealloc_enabled; static inline bool debug_pagealloc_enabled(void) diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index fd5c95ff9251..2f75dd0d0d81 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -315,7 +315,7 @@ static struct page *kimage_alloc_pages(gfp_t gfp_mask, unsigned int order) arch_kexec_post_alloc_pages(page_address(pages), count, gfp_mask); - if (gfp_mask & __GFP_ZERO) + if (want_init_on_alloc(gfp_mask)) for (i = 0; i < count; i++) clear_highpage(pages + i); } diff --git a/mm/dmapool.c b/mm/dmapool.c index 76a160083506..493d151067cb 100644 --- a/mm/dmapool.c +++ b/mm/dmapool.c @@ -381,7 +381,7 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags, #endif spin_unlock_irqrestore(&pool->lock, flags); - if (mem_flags & __GFP_ZERO) + if (want_init_on_alloc(mem_flags)) memset(retval, 0, pool->size); return retval; diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d66bc8abe0af..50a3b104a491 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -136,6 +136,48 @@ unsigned long totalcma_pages __read_mostly; int percpu_pagelist_fraction; gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK; +#ifdef CONFIG_INIT_ON_ALLOC_DEFAULT_ON +DEFINE_STATIC_KEY_TRUE(init_on_alloc); +#else +DEFINE_STATIC_KEY_FALSE(init_on_alloc); +#endif +#ifdef CONFIG_INIT_ON_FREE_DEFAULT_ON +DEFINE_STATIC_KEY_TRUE(init_on_free); +#else +DEFINE_STATIC_KEY_FALSE(init_on_free); +#endif + +static int __init early_init_on_alloc(char *buf) +{ + int ret; + bool bool_result; + + if (!buf) + return -EINVAL; + ret = kstrtobool(buf, &bool_result); + if (bool_result) + static_branch_enable(&init_on_alloc); + else + static_branch_disable(&init_on_alloc); + return ret; +} +early_param("init_on_alloc", early_init_on_alloc); + +static int __init early_init_on_free(char *buf) +{ + int ret; + bool bool_result; + + if (!buf) + return -EINVAL; + ret = kstrtobool(buf, &bool_result); + if (bool_result) + static_branch_enable(&init_on_free); + else + static_branch_disable(&init_on_free); + return ret; +} +early_param("init_on_free", early_init_on_free); /* * A cached value of the page's pageblock's migratetype, used when the page is @@ -1090,6 +1132,14 @@ static int free_tail_pages_check(struct page *head_page, struct page *page) return ret; } +static void kernel_init_free_pages(struct page *page, int numpages) +{ + int i; + + for (i = 0; i < numpages; i++) + clear_highpage(page + i); +} + static __always_inline bool free_pages_prepare(struct page *page, unsigned int order, bool check_free) { @@ -1142,6 +1192,8 @@ static __always_inline bool free_pages_prepare(struct page *page, } arch_free_page(page, order); kernel_poison_pages(page, 1 << order, 0); + if (want_init_on_free()) + kernel_init_free_pages(page, 1 << order); if (debug_pagealloc_enabled()) kernel_map_pages(page, 1 << order, 0); @@ -2020,8 +2072,8 @@ static inline int check_new_page(struct page *page) static inline bool free_pages_prezeroed(void) { - return IS_ENABLED(CONFIG_PAGE_POISONING_ZERO) && - page_poisoning_enabled(); + return (IS_ENABLED(CONFIG_PAGE_POISONING_ZERO) && + page_poisoning_enabled()) || want_init_on_free(); } #ifdef CONFIG_DEBUG_VM @@ -2075,13 +2127,10 @@ inline void post_alloc_hook(struct page *page, unsigned int order, static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags, unsigned int alloc_flags) { - int i; - post_alloc_hook(page, order, gfp_flags); - if (!free_pages_prezeroed() && (gfp_flags & __GFP_ZERO)) - for (i = 0; i < (1 << order); i++) - clear_highpage(page + i); + if (!free_pages_prezeroed() && want_init_on_alloc(gfp_flags)) + kernel_init_free_pages(page, 1 << order); if (order && (gfp_flags & __GFP_COMP)) prep_compound_page(page, order); diff --git a/mm/slab.c b/mm/slab.c index f7117ad9b3a3..98a89d7c922d 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -1830,6 +1830,14 @@ static bool set_objfreelist_slab_cache(struct kmem_cache *cachep, cachep->num = 0; + /* + * If slab auto-initialization on free is enabled, store the freelist + * off-slab, so that its contents don't end up in one of the allocated + * objects. + */ + if (unlikely(slab_want_init_on_free(cachep))) + return false; + if (cachep->ctor || flags & SLAB_TYPESAFE_BY_RCU) return false; @@ -3263,7 +3271,7 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid, local_irq_restore(save_flags); ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller); - if (unlikely(flags & __GFP_ZERO) && ptr) + if (unlikely(slab_want_init_on_alloc(flags, cachep)) && ptr) memset(ptr, 0, cachep->object_size); slab_post_alloc_hook(cachep, flags, 1, &ptr); @@ -3320,7 +3328,7 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller) objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller); prefetchw(objp); - if (unlikely(flags & __GFP_ZERO) && objp) + if (unlikely(slab_want_init_on_alloc(flags, cachep)) && objp) memset(objp, 0, cachep->object_size); slab_post_alloc_hook(cachep, flags, 1, &objp); @@ -3441,6 +3449,8 @@ void ___cache_free(struct kmem_cache *cachep, void *objp, struct array_cache *ac = cpu_cache_get(cachep); check_irq_off(); + if (unlikely(slab_want_init_on_free(cachep))) + memset(objp, 0, cachep->object_size); kmemleak_free_recursive(objp, cachep->flags); objp = cache_free_debugcheck(cachep, objp, caller); @@ -3528,7 +3538,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, cache_alloc_debugcheck_after_bulk(s, flags, size, p, _RET_IP_); /* Clear memory outside IRQ disabled section */ - if (unlikely(flags & __GFP_ZERO)) + if (unlikely(slab_want_init_on_alloc(flags, s))) for (i = 0; i < size; i++) memset(p[i], 0, s->object_size); diff --git a/mm/slab.h b/mm/slab.h index 43ac818b8592..31032d488b29 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -524,4 +524,23 @@ static inline int cache_random_seq_create(struct kmem_cache *cachep, static inline void cache_random_seq_destroy(struct kmem_cache *cachep) { } #endif /* CONFIG_SLAB_FREELIST_RANDOM */ +static inline bool slab_want_init_on_alloc(gfp_t flags, struct kmem_cache *c) +{ + if (static_branch_unlikely(&init_on_alloc)) { + if (c->ctor) + return false; + if (c->flags & SLAB_TYPESAFE_BY_RCU) + return flags & __GFP_ZERO; + return true; + } + return flags & __GFP_ZERO; +} + +static inline bool slab_want_init_on_free(struct kmem_cache *c) +{ + if (static_branch_unlikely(&init_on_free)) + return !(c->ctor || (c->flags & SLAB_TYPESAFE_BY_RCU)); + return false; +} + #endif /* MM_SLAB_H */ diff --git a/mm/slub.c b/mm/slub.c index cd04dbd2b5d0..9c4a8b9a955c 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1279,6 +1279,12 @@ static int __init setup_slub_debug(char *str) if (*str == ',') slub_debug_slabs = str + 1; out: + if ((static_branch_unlikely(&init_on_alloc) || + static_branch_unlikely(&init_on_free)) && + (slub_debug & SLAB_POISON)) { + pr_warn("disabling SLAB_POISON: can't be used together with memory auto-initialization\n"); + slub_debug &= ~SLAB_POISON; + } return 1; } @@ -1424,6 +1430,19 @@ static __always_inline bool slab_free_hook(struct kmem_cache *s, void *x) static inline bool slab_free_freelist_hook(struct kmem_cache *s, void **head, void **tail) { + + void *object; + void *next = *head; + void *old_tail = *tail ? *tail : *head; + + if (slab_want_init_on_free(s)) + do { + object = next; + next = get_freepointer(s, object); + memset(object, 0, s->size); + set_freepointer(s, object, next); + } while (object != old_tail); + /* * Compiler cannot detect this function can be removed if slab_free_hook() * evaluates to nothing. Thus, catch all relevant config debug options here. @@ -1433,9 +1452,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s, defined(CONFIG_DEBUG_OBJECTS_FREE) || \ defined(CONFIG_KASAN) - void *object; - void *next = *head; - void *old_tail = *tail ? *tail : *head; + next = *head; /* Head and tail of the reconstructed freelist */ *head = NULL; @@ -2741,8 +2758,14 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s, prefetch_freepointer(s, next_object); stat(s, ALLOC_FASTPATH); } + /* + * If the object has been wiped upon free, make sure it's fully + * initialized by zeroing out freelist pointer. + */ + if (unlikely(slab_want_init_on_free(s)) && object) + *(void **)object = NULL; - if (unlikely(gfpflags & __GFP_ZERO) && object) + if (unlikely(slab_want_init_on_alloc(gfpflags, s)) && object) memset(object, 0, s->object_size); slab_post_alloc_hook(s, gfpflags, 1, &object); @@ -3163,7 +3186,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, local_irq_enable(); /* Clear memory outside IRQ disabled fastpath loop */ - if (unlikely(flags & __GFP_ZERO)) { + if (unlikely(slab_want_init_on_alloc(flags, s))) { int j; for (j = 0; j < i; j++) diff --git a/net/core/sock.c b/net/core/sock.c index 75b1c950b49f..9ceb90c875bc 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1602,7 +1602,7 @@ static struct sock *sk_prot_alloc(struct proto *prot, gfp_t priority, sk = kmem_cache_alloc(slab, priority & ~__GFP_ZERO); if (!sk) return sk; - if (priority & __GFP_ZERO) + if (want_init_on_alloc(priority)) sk_prot_clear_nulls(sk, prot->obj_size); } else sk = kmalloc(prot->obj_size, priority); diff --git a/security/Kconfig.hardening b/security/Kconfig.hardening index c6cb2d9b2905..a1ffe2eb4d5f 100644 --- a/security/Kconfig.hardening +++ b/security/Kconfig.hardening @@ -160,6 +160,35 @@ config STACKLEAK_RUNTIME_DISABLE runtime to control kernel stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK. +config INIT_ON_ALLOC_DEFAULT_ON + bool "Enable heap memory zeroing on allocation by default" + help + This has the effect of setting "init_on_alloc=1" on the kernel + command line. This can be disabled with "init_on_alloc=0". + When "init_on_alloc" is enabled, all page allocator and slab + allocator memory will be zeroed when allocated, eliminating + many kinds of "uninitialized heap memory" flaws, especially + heap content exposures. The performance impact varies by + workload, but most cases see <1% impact. Some synthetic + workloads have measured as high as 7%. + +config INIT_ON_FREE_DEFAULT_ON + bool "Enable heap memory zeroing on free by default" + help + This has the effect of setting "init_on_free=1" on the kernel + command line. This can be disabled with "init_on_free=0". + Similar to "init_on_alloc", when "init_on_free" is enabled, + all page allocator and slab allocator memory will be zeroed + when freed, eliminating many kinds of "uninitialized heap memory" + flaws, especially heap content exposures. The primary difference + with "init_on_free" is that data lifetime in memory is reduced, + as anything freed is wiped immediately, making live forensics or + cold boot memory attacks unable to recover freed memory contents. + The performance impact varies by workload, but is more expensive + than "init_on_alloc" due to the negative cache effects of + touching "cold" memory areas. Most cases see 3-5% impact. Some + synthetic workloads have measured as high as 8%. + endmenu endmenu