From patchwork Thu Jul 7 15:32:24 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vlastimil Babka
X-Patchwork-Id: 12909852
From: Vlastimil Babka
To: stable@vger.kernel.org
Cc: Jann Horn, Christoph Lameter, David Rientjes, Joonsoo Kim,
 Pekka Enberg, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Muchun Song,
 linux-mm@kvack.org, Vlastimil Babka
Subject: [PATCH 4.9-stable] mm/slub: add missing TID updates on slab deactivation
Date: Thu, 7 Jul 2022 17:32:24 +0200
Message-Id: <20220707153224.24260-1-vbabka@suse.cz>
X-Mailer: git-send-email 2.36.1
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Jann Horn

commit eeaa345e128515135ccb864c04482180c08e3259 upstream.

The fastpath in slab_alloc_node() assumes that c->slab is stable as long as
the TID stays the same. However, two places in __slab_alloc() currently
don't update the TID when deactivating the CPU slab.

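The TID assumption above can be illustrated with a small userspace sketch.
This is not kernel code: `struct cpu_pair_model`, `model_next_tid()`,
`model_cmpxchg_double()`, `stale_update_succeeds()` and the step of 1 are
hypothetical stand-ins chosen for illustration, and the compare-and-exchange
is a plain, non-atomic function. It only shows that a sampled (freelist, TID)
pair is rejected once the TID has been advanced, and accepted when it has
not, even though the CPU slab changed in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: a step of 1; the kernel's real step is configuration-dependent. */
#define MODEL_TID_STEP 1UL

/* Hypothetical model of the per-CPU freelist/TID pair. */
struct cpu_pair_model {
	void *freelist;
	unsigned long tid;
};

static unsigned long model_next_tid(unsigned long tid)
{
	return tid + MODEL_TID_STEP;
}

/*
 * Non-atomic stand-in for a paired compare-and-exchange: both fields are
 * updated only if both still hold the values the caller sampled earlier.
 */
static bool model_cmpxchg_double(struct cpu_pair_model *c,
				 void *old_fl, unsigned long old_tid,
				 void *new_fl, unsigned long new_tid)
{
	assert(c != NULL);
	if (c->freelist != old_fl || c->tid != old_tid)
		return false;
	c->freelist = new_fl;
	c->tid = new_tid;
	return true;
}

/*
 * Sample the pair, let a modelled deactivation happen in between, then
 * attempt the update with the stale sample. Returns whether it went through.
 */
bool stale_update_succeeds(bool deactivation_bumps_tid)
{
	int object;
	struct cpu_pair_model c = { .freelist = NULL, .tid = 0 };

	void *seen_fl = c.freelist;
	unsigned long seen_tid = c.tid;

	/* The deactivation leaves the freelist NULL; only the TID can reveal it. */
	if (deactivation_bumps_tid)
		c.tid = model_next_tid(c.tid);

	return model_cmpxchg_double(&c, seen_fl, seen_tid,
				    &object, model_next_tid(seen_tid));
}
```

In this model, stale_update_succeeds(false) reports that the stale update is
accepted, while stale_update_succeeds(true) reports that it is rejected.
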
If multiple operations race the right way, this could lead to an object
getting lost; or, in an even more unlikely situation, it could even lead to
an object being freed onto the wrong slab's freelist, messing up the
`inuse` counter and eventually causing a page to be freed to the page
allocator while it still contains slab objects.

(I haven't actually tested these cases though, this is just based on
looking at the code. Writing testcases for this stuff seems like it'd be
a pain...)

The race leading to state inconsistency is (all operations on the same CPU
and kmem_cache):

 - task A: begin do_slab_free():
    - read TID
    - read pcpu freelist (==NULL)
    - check `slab == c->slab` (true)
 - [PREEMPT A->B]
 - task B: begin slab_alloc_node():
    - fastpath fails (`c->freelist` is NULL)
    - enter __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
    - take local_lock_irqsave()
    - read c->freelist as NULL
    - get_freelist() returns NULL
    - write `c->slab = NULL`
    - drop local_unlock_irqrestore()
    - goto new_slab
    - slub_percpu_partial() is NULL
    - get_partial() returns NULL
    - slub_put_cpu_ptr() (enables preemption)
 - [PREEMPT B->A]
 - task A: finish do_slab_free():
    - this_cpu_cmpxchg_double() succeeds
    - [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]

From there, the object on c->freelist will get lost if task B is allowed to
continue from here: It will proceed to the retry_load_slab label,
set c->slab, then jump to load_freelist, which clobbers c->freelist.

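The interleaving above can be replayed deterministically in userspace. The
following is only a sketch under stated assumptions: `struct cpu_model`,
`model_cmpxchg_double()` and `race_corrupts_state()` are hypothetical names,
preemption is modelled by plain program order, the TID step is taken to be
1, and the compare-and-exchange is not atomic. It shows that task A's stale
sample is accepted when task B clears c->slab without touching the TID, and
rejected when B also advances the TID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-CPU state; field names mirror the text. */
struct cpu_model {
	void *slab;		/* current CPU slab, NULL once deactivated */
	void *freelist;		/* lockless per-CPU freelist */
	unsigned long tid;	/* transaction id paired with the freelist */
};

/* Non-atomic stand-in for the paired freelist/TID compare-and-exchange. */
static bool model_cmpxchg_double(struct cpu_model *c,
				 void *old_fl, unsigned long old_tid,
				 void *new_fl, unsigned long new_tid)
{
	assert(c != NULL);
	if (c->freelist != old_fl || c->tid != old_tid)
		return false;
	c->freelist = new_fl;
	c->tid = new_tid;
	return true;
}

/*
 * Replay the race in program order. With bump_tid false, task B clears
 * c->slab and leaves the TID alone; with bump_tid true it also advances
 * the TID. Returns true if the run ends in the corrupt state.
 */
bool race_corrupts_state(bool bump_tid)
{
	int slab_storage, object_storage;
	void *slab = &slab_storage;
	void *object = &object_storage;
	struct cpu_model c = { .slab = slab, .freelist = NULL, .tid = 0 };

	/* task A: begin the free fastpath */
	unsigned long tid = c.tid;		/* read TID */
	void *fl = c.freelist;			/* read pcpu freelist (NULL) */
	bool same_slab = (slab == c.slab);	/* check passes */

	/* [PREEMPT A->B] task B: deactivate the CPU slab */
	c.slab = NULL;
	if (bump_tid)
		c.tid += 1;	/* assumption: step of 1 in this model */

	/* [PREEMPT B->A] task A: finish the free fastpath */
	bool stored = same_slab &&
		      model_cmpxchg_double(&c, fl, tid, object, tid + 1);

	/* Corrupt state: no CPU slab, yet an object on the per-CPU freelist. */
	return stored && c.slab == NULL && c.freelist != NULL;
}
```

In this model, race_corrupts_state(false) ends with c.slab == NULL and a
non-NULL c.freelist, while race_corrupts_state(true) leaves the freelist
untouched because the stale TID no longer matches.
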
But if we instead continue as follows, we get worse corruption:

 - task A: run __slab_free() on object from other struct slab:
    - CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial)
 - task A: run slab_alloc_node() with NUMA node constraint:
    - fastpath fails (c->slab is NULL)
    - call __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
    - c->slab is NULL: goto new_slab
    - slub_percpu_partial() is non-NULL
    - set c->slab to slub_percpu_partial(c)
    - [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects
      from slab-2]
    - goto redo
    - node_match() fails
    - goto deactivate_slab
    - existing c->freelist is passed into deactivate_slab()
    - inuse count of slab-1 is decremented to account for object from
      slab-2

At this point, the inuse count of slab-1 is 1 lower than it should be.
This means that if we free all allocated objects in slab-1 except for one,
SLUB will think that slab-1 is completely unused, and may free its page,
leading to use-after-free.

Fixes: c17dda40a6a4e ("slub: Separate out kmem_cache_cpu processing from deactivate_slab")
Fixes: 03e404af26dc2 ("slub: fast release on full slab")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn
Acked-by: Christoph Lameter
Acked-by: David Rientjes
Reviewed-by: Muchun Song
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka
Link: https://lore.kernel.org/r/20220608182205.2945720-1-jannh@google.com
---
 mm/slub.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 0b13135fd571..c07c5fa6adcd 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2556,6 +2556,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 			deactivate_slab(s, page, c->freelist);
 			c->page = NULL;
 			c->freelist = NULL;
+			c->tid = next_tid(c->tid);
 			goto new_slab;
 		}
 	}
@@ -2569,6 +2570,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		deactivate_slab(s, page, c->freelist);
 		c->page = NULL;
 		c->freelist = NULL;
+		c->tid = next_tid(c->tid);
 		goto new_slab;
 	}
 
@@ -2581,6 +2583,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 
 	if (!freelist) {
 		c->page = NULL;
+		c->tid = next_tid(c->tid);
 		stat(s, DEACTIVATE_BYPASS);
 		goto new_slab;
 	}
@@ -2605,6 +2608,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		c->partial = page->next;
 		stat(s, CPU_PARTIAL_ALLOC);
 		c->freelist = NULL;
+		c->tid = next_tid(c->tid);
 		goto redo;
 	}
 
@@ -2627,6 +2631,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 	deactivate_slab(s, page, get_freepointer(s, freelist));
 	c->page = NULL;
 	c->freelist = NULL;
+	c->tid = next_tid(c->tid);
 	return freelist;
 }
 