From patchwork Wed Feb 14 15:07:41 2018
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 10219217
Date: Wed, 14 Feb 2018 15:07:41 +0000
From: Will Deacon
To: Mark Rutland
Subject: Re: arm64/v4.16-rc1: KASAN: use-after-free Read in finish_task_switch
Message-ID: <20180214150739.GH2992@arm.com>
References: <20180214120254.qq4w4s42ecxio7lu@lakrids.cambridge.arm.com>
In-Reply-To: <20180214120254.qq4w4s42ecxio7lu@lakrids.cambridge.arm.com>
Cc: peterz@infradead.org, mathieu.desnoyers@efficios.com,
	linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	mingo@kernel.org

Hi Mark,

Cheers for the report. These things tend to be a pain to debug, but
I've had a go.

On Wed, Feb 14, 2018 at 12:02:54PM +0000, Mark Rutland wrote:
> As a heads-up, I hit the splat below when fuzzing v4.16-rc1 on arm64.
>
> Evidently, we get to finish_task_switch() with rq->prev_mm != NULL,
> despite rq->prev_mm having been freed. KASAN spots the dereference of
> mm->membarrier_state in membarrier_mm_sync_core_before_usermode(mm),
> but AFAICT the underlying issue is independent of the membarrier code,
> and we could get a splat on the subsequent mmdrop(mm).
>
> I've seen this once in ~2500 CPU hours of fuzzing, so it looks pretty
> difficult to hit, and I have no reproducer so far.
>
> Syzkaller report below, mirrored with Syzkaller log at [1]. If I hit
> this again, I'll upload new info there.

The interesting thing here is on the exit path:

> Freed by task 10882:
>  save_stack mm/kasan/kasan.c:447 [inline]
>  set_track mm/kasan/kasan.c:459 [inline]
>  __kasan_slab_free+0x114/0x220 mm/kasan/kasan.c:520
>  kasan_slab_free+0x10/0x18 mm/kasan/kasan.c:527
>  slab_free_hook mm/slub.c:1393 [inline]
>  slab_free_freelist_hook mm/slub.c:1414 [inline]
>  slab_free mm/slub.c:2968 [inline]
>  kmem_cache_free+0x88/0x270 mm/slub.c:2990
>  __mmdrop+0x164/0x248 kernel/fork.c:604

   ^^ This should never run, because there's an mmgrab() about 8 lines
      above the mmput() in exit_mm.

>  mmdrop+0x50/0x60 kernel/fork.c:615
>  __mmput kernel/fork.c:981 [inline]
>  mmput+0x270/0x338 kernel/fork.c:992
>  exit_mm kernel/exit.c:544 [inline]

Looking at exit_mm:

	mmgrab(mm);
	BUG_ON(mm != current->active_mm);
	/* more a memory barrier than a real lock */
	task_lock(current);
	current->mm = NULL;
	up_read(&mm->mmap_sem);
	enter_lazy_tlb(mm, current);
	task_unlock(current);
	mm_update_next_owner(mm);
	mmput(mm);

The comment itself already rings some alarm bells: our spin_lock (as
used by task_lock) has ACQUIRE semantics, so the mmgrab (which is
unordered, being a plain atomic_inc) can be reordered with respect to
the assignment of NULL to current->mm.

If the exit()ing task had recently migrated from another CPU, then that
CPU could concurrently run context_switch() and take this path:

	if (!prev->mm) {
		prev->active_mm = NULL;
		rq->prev_mm = oldmm;
	}

which then means finish_task_switch will call mmdrop():

	struct mm_struct *mm = rq->prev_mm;
	[...]
	if (mm) {
		membarrier_mm_sync_core_before_usermode(mm);
		mmdrop(mm);
	}

Note that KASAN will be ok at this point, but it explains how the
exit_mm path ends up freeing the mm. Then, when the exit()ing CPU
calls context_switch, *it* will explode accessing the freed mm.
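To make the window concrete, here is a rough timeline of the race as
described above (my annotation, not from the original report; the CPU
labels are illustrative):

	/*
	 * Exiting CPU (exit_mm)        Old CPU (context_switch)
	 * ---------------------        ------------------------
	 * mmgrab(mm);  // atomic_inc: may be reordered past the
	 *              // ACQUIRE of task_lock(), i.e. take effect
	 *              // only *after* the store below
	 * current->mm = NULL;
	 *                              if (!prev->mm) {   // sees NULL
	 *                                      prev->active_mm = NULL;
	 *                                      rq->prev_mm = oldmm;
	 *                              }
	 *                              finish_task_switch():
	 *                                      mmdrop(mm); // drops a ref
	 * ...
	 * mmput(mm);   // last ref gone: this is the __mmdrop()
	 *              // in the free stack above
	 * context_switch();            // explodes on the freed mm
	 */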
Easiest way to fix this is by guaranteeing the barrier semantics in the
exit path. Patch below. I guess we'll have to wait another 2500 hours
to see if it works :)

Will

--->8

diff --git a/kernel/exit.c b/kernel/exit.c
index 995453d9fb55..f91e8d56b03f 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -534,8 +534,9 @@ static void exit_mm(void)
 	}
 	mmgrab(mm);
 	BUG_ON(mm != current->active_mm);
-	/* more a memory barrier than a real lock */
 	task_lock(current);
+	/* Ensure we've grabbed the mm before setting current->mm to NULL */
+	smp_mb__after_spinlock();
 	current->mm = NULL;
 	up_read(&mm->mmap_sem);
 	enter_lazy_tlb(mm, current);
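For what it's worth, smp_mb__after_spinlock() exists for exactly this
kind of pattern: spin_lock() is only an ACQUIRE, so a preceding
atomic_inc() is allowed to slip into the critical section, whereas
lock + smp_mb__after_spinlock() together provide full barrier
semantics. A minimal sketch of the ordering the patch relies on (my
annotation, not part of the patch):

	mmgrab(mm);			/* A: take the extra mm_count ref   */
	task_lock(current);		/* ACQUIRE only: A could move below */
	smp_mb__after_spinlock();	/* upgrade to a full barrier        */
	current->mm = NULL;		/* B: guaranteed ordered after A    */

With the barrier in place, a CPU that observes B (prev->mm == NULL) in
context_switch() can no longer do so before A has taken effect, so the
lazy-tlb mmdrop() in finish_task_switch() cannot drop the final
reference while exit_mm() still needs the mm.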