From patchwork Fri Jun 1 23:34:01 2018
X-Patchwork-Submitter: Anton Eidelman
X-Patchwork-Id: 10444423
References: <55be03eb-3d0d-d43d-b0a4-669341e6d9ab@redhat.com> <20180601205837.GB29651@bombadil.infradead.org>
From: Anton Eidelman <anton@lightbitslabs.com>
Date: Fri, 1 Jun 2018 16:34:01 -0700
Subject: Re: HARDENED_USERCOPY will BUG on multiple slub objects coalesced into an sk_buff fragment
To: Kees Cook
Cc: Matthew Wilcox, Laura Abbott, Linux-MM, linux-hardened@lists.openwall.com

Hi all,

I do not yet have a reproducer reliable enough to recommend; I'll keep digging.

The page already belongs to a slub when the fragment is constructed in
__skb_fill_page_desc() -- see the instrumentation I used below. When
usercopy triggers, .coals shows values of 2/3 for 128/192 bytes
respectively.

The question is how the RX sk_buff ends up having a data fragment in a
PageSlab page. Some network drivers use netdev_alloc_frag(), so their
pages indeed come from the page_frag allocator. Others (mellanox, intel)
just alloc_page() when filling their RX descriptors. In both cases the
pages will be refcounted properly.

I suspect my kernel TCP traffic, which uses kernel_sendpage() for bio
pages AND slub pages.

Thanks a lot!
Anton

The instrumentation is as follows...
@@ -316,7 +317,8 @@ struct skb_frag_struct {
 	} page;
 #if (BITS_PER_LONG > 32) || (PAGE_SIZE >= 65536)
 	__u32 page_offset;
-	__u32 size;
+	__u16 size;
+	__u16 coals;
 #else
 	__u16 page_offset;
 	__u16 size;
@@ -1850,9 +1852,11 @@ static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
 	 */
 	frag->page.p = page;
 	frag->page_offset = off;
+	frag->coals = 0;
 	skb_frag_size_set(frag, size);

 	page = compound_head(page);
+	WARN_ON(PageSlab(page) && (page->slab_cache->size < size)); // does NOT trigger
 	if (page_is_pfmemalloc(page))
 		skb->pfmemalloc = true;
 }
@@ -2849,10 +2853,14 @@ static inline bool skb_can_coalesce(struct sk_buff *skb, int i,
 				    const struct page *page, int off)
 {
 	if (i) {
-		const struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i - 1];
+		struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i - 1];

-		return page == skb_frag_page(frag) &&
+		bool ret = page == skb_frag_page(frag) &&
 		       off == frag->page_offset + skb_frag_size(frag);
+		if (unlikely(ret))
+			if (PageSlab(compound_head((struct page *)page)))
+				frag->coals++;
+		return ret;
 	}
 	return false;
 }

On Fri, Jun 1, 2018 at 2:55 PM, Kees Cook wrote:
> On Fri, Jun 1, 2018 at 1:58 PM, Matthew Wilcox wrote:
> > On Fri, Jun 01, 2018 at 01:49:38PM -0700, Kees Cook wrote:
> >> On Fri, Jun 1, 2018 at 12:02 PM, Laura Abbott wrote:
> >> > (cc-ing some interested people)
> >> >
> >> > On 05/31/2018 05:03 PM, Anton Eidelman wrote:
> >> >> Here's a rare issue I reproduce on 4.12.10 (centos config): full log
> >> >> sample below.
> >>
> >> Thanks for digging into this! Do you have any specific reproducer for
> >> this? If so, I'd love to try a bisection, as I'm surprised this has
> >> only now surfaced: hardened usercopy was introduced in 4.8 ...
> >> > >> >> An innocent process (dhcpclient) is about to receive a datagram, but > >> >> during skb_copy_datagram_iter() usercopy triggers a BUG in: > >> >> usercopy.c:check_heap_object() -> slub.c:__check_heap_object(), > because > >> >> the sk_buff fragment being copied crosses the 64-byte slub object > boundary. > >> >> > >> >> Example __check_heap_object() context: > >> >> n=128 << usually 128, sometimes 192. > >> >> object_size=64 > >> >> s->size=64 > >> >> page_address(page)=0xffff880233f7c000 > >> >> ptr=0xffff880233f7c540 > >> >> > >> >> My take on the root cause: > >> >> When adding data to an skb, new data is appended to the current > >> >> fragment if the new chunk immediately follows the last one: by simply > >> >> increasing the frag->size, skb_frag_size_add(). > >> >> See include/linux/skbuff.h:skb_can_coalesce() callers. > >> > >> Oooh, sneaky: > >> return page == skb_frag_page(frag) && > >> off == frag->page_offset + skb_frag_size(frag); > >> > >> Originally I was thinking that slab red-zoning would get triggered > >> too, but I see the above is checking to see if these are precisely > >> neighboring allocations, I think. > >> > >> But then ... how does freeing actually work? I'm really not sure how > >> this seeming layering violation could be safe in other areas? > > > > I'm confused ... I thought skb frags came from the page_frag allocator, > > not the slab allocator. But then why would the slab hardening trigger? > > Well that would certainly make more sense (well, the sense about > alloc/free). Having it overlap with a slab allocation, though, that's > quite bad. Perhaps this is a very odd use-after-free case? I.e. freed > page got allocated to slab, and when it got copied out, usercopy found > it spanned a slub object? > > [ 655.602500] usercopy: kernel memory exposure attempt detected from > ffff88022a31aa00 (kmalloc-64) (192 bytes) > > This wouldn't be the first time usercopy triggered due to a memory > corruption... 
>
> -Kees
>
> --
> Kees Cook
> Pixel Security

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a098d95..7cd744c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include

 /* The interface for checksum offload between the stack and networking drivers