From patchwork Thu Nov 22 19:52:37 2018
X-Patchwork-Submitter: Sasha Levin
X-Patchwork-Id: 10694761
From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Vitaly Wool, Vitaly Wool, Jongseok Kim, Andrew Morton,
 Linus Torvalds, Sasha Levin, linux-mm@kvack.org
Subject: [PATCH AUTOSEL 4.19 33/36] z3fold: fix possible reclaim races
Date: Thu, 22 Nov 2018 14:52:37 -0500
Message-Id: <20181122195240.13123-33-sashal@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181122195240.13123-1-sashal@kernel.org>
References: <20181122195240.13123-1-sashal@kernel.org>

From: Vitaly Wool

[ Upstream commit ca0246bb97c23da9d267c2107c07fb77e38205c9 ]

Reclaim and free can race on an object, which is basically fine, but
for reclaim to be able to map a "freed" object we need to encode the
object length in the handle. handle_to_chunks() is therefore
introduced to extract the object length from a handle and use it
during mapping.

Moreover, to avoid racing on a z3fold "headless" page release, we
should not try to free that page in z3fold_free() if the reclaim bit
is set. Also, in the unlikely case of trying to reclaim a page that is
being freed, we should not proceed with that page.

While at it, fix the page accounting in the reclaim function.
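As background for the encoding described above, here is a minimal
standalone sketch of the handle layout: the buddy id lives in bits 0-1
(BUDDY_MASK) and, for LAST buddies, the object length in chunks is
packed into the page-offset bits above BUDDY_SHIFT. This is not kernel
code; the constants assume 4K pages, and encode(), decode_chunks() and
the main() harness are hypothetical names for illustration only.

/*
 * Standalone sketch of the handle layout introduced by this patch.
 * PAGE_MASK, BUDDY_MASK and BUDDY_SHIFT mirror the kernel definitions
 * (assuming 4K pages); encode()/decode_chunks()/main() are a
 * hypothetical test harness, not kernel code.
 */
#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#define BUDDY_MASK	0x3UL
#define BUDDY_SHIFT	2

/* Pack the buddy id into bits 0-1 and the object length (in chunks,
 * only meaningful for LAST buddies) into the bits above BUDDY_SHIFT. */
static unsigned long encode(unsigned long page_addr, unsigned long bud,
			    unsigned long chunks)
{
	unsigned long handle = page_addr & PAGE_MASK;

	handle |= bud & BUDDY_MASK;
	handle |= chunks << BUDDY_SHIFT;
	return handle;
}

/* Mirrors handle_to_chunks(): recover the length from the handle alone,
 * without dereferencing the z3fold header, which may already be gone. */
static unsigned short decode_chunks(unsigned long handle)
{
	return (handle & ~PAGE_MASK) >> BUDDY_SHIFT;
}

int main(void)
{
	unsigned long h = encode(0x7f0000aa1000UL, 3 /* LAST */, 37);

	printf("buddy=%lu chunks=%u\n", h & BUDDY_MASK, decode_chunks(h));
	return 0;
}

The point of the layout is that reclaim can map a LAST object from the
handle alone, even after free has zeroed last_chunks in the header.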
This patch supersedes "[PATCH] z3fold: fix reclaim lock-ups".

Link: http://lkml.kernel.org/r/20181105162225.74e8837d03583a9b707cf559@gmail.com
Signed-off-by: Vitaly Wool
Signed-off-by: Jongseok Kim
Reported-by: Jongseok Kim
Reviewed-by: Snild Dolkow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin
---
 mm/z3fold.c | 101 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 62 insertions(+), 39 deletions(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index 4b366d181f35..aee9b0b8d907 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -99,6 +99,7 @@ struct z3fold_header {
 #define NCHUNKS		((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT)
 
 #define BUDDY_MASK	(0x3)
+#define BUDDY_SHIFT	2
 
 /**
  * struct z3fold_pool - stores metadata for each z3fold pool
@@ -145,7 +146,7 @@ enum z3fold_page_flags {
 	MIDDLE_CHUNK_MAPPED,
 	NEEDS_COMPACTING,
 	PAGE_STALE,
-	UNDER_RECLAIM
+	PAGE_CLAIMED, /* by either reclaim or free */
 };
 
 /*****************
@@ -174,7 +175,7 @@ static struct z3fold_header *init_z3fold_page(struct page *page,
 	clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
 	clear_bit(NEEDS_COMPACTING, &page->private);
 	clear_bit(PAGE_STALE, &page->private);
-	clear_bit(UNDER_RECLAIM, &page->private);
+	clear_bit(PAGE_CLAIMED, &page->private);
 
 	spin_lock_init(&zhdr->page_lock);
 	kref_init(&zhdr->refcount);
@@ -223,8 +224,11 @@ static unsigned long encode_handle(struct z3fold_header *zhdr, enum buddy bud)
 	unsigned long handle;
 
 	handle = (unsigned long)zhdr;
-	if (bud != HEADLESS)
-		handle += (bud + zhdr->first_num) & BUDDY_MASK;
+	if (bud != HEADLESS) {
+		handle |= (bud + zhdr->first_num) & BUDDY_MASK;
+		if (bud == LAST)
+			handle |= (zhdr->last_chunks << BUDDY_SHIFT);
+	}
 	return handle;
 }
 
@@ -234,6 +238,12 @@ static struct z3fold_header *handle_to_z3fold_header(unsigned long handle)
 	return (struct z3fold_header *)(handle & PAGE_MASK);
 }
 
+/* only for LAST bud, returns zero otherwise */
+static unsigned short handle_to_chunks(unsigned long handle)
+{
+	return (handle & ~PAGE_MASK) >> BUDDY_SHIFT;
+}
+
 /*
  * (handle & BUDDY_MASK) < zhdr->first_num is possible in encode_handle
  * but that doesn't matter. because the masking will result in the
@@ -720,37 +730,39 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 	page = virt_to_page(zhdr);
 
 	if (test_bit(PAGE_HEADLESS, &page->private)) {
-		/* HEADLESS page stored */
-		bud = HEADLESS;
-	} else {
-		z3fold_page_lock(zhdr);
-		bud = handle_to_buddy(handle);
-
-		switch (bud) {
-		case FIRST:
-			zhdr->first_chunks = 0;
-			break;
-		case MIDDLE:
-			zhdr->middle_chunks = 0;
-			zhdr->start_middle = 0;
-			break;
-		case LAST:
-			zhdr->last_chunks = 0;
-			break;
-		default:
-			pr_err("%s: unknown bud %d\n", __func__, bud);
-			WARN_ON(1);
-			z3fold_page_unlock(zhdr);
-			return;
+		/* if a headless page is under reclaim, just leave.
+		 * NB: we use test_and_set_bit for a reason: if the bit
+		 * has not been set before, we release this page
+		 * immediately so we don't care about its value any more.
+		 */
+		if (!test_and_set_bit(PAGE_CLAIMED, &page->private)) {
+			spin_lock(&pool->lock);
+			list_del(&page->lru);
+			spin_unlock(&pool->lock);
+			free_z3fold_page(page);
+			atomic64_dec(&pool->pages_nr);
 		}
+		return;
 	}
 
-	if (bud == HEADLESS) {
-		spin_lock(&pool->lock);
-		list_del(&page->lru);
-		spin_unlock(&pool->lock);
-		free_z3fold_page(page);
-		atomic64_dec(&pool->pages_nr);
+	/* Non-headless case */
+	z3fold_page_lock(zhdr);
+	bud = handle_to_buddy(handle);
+
+	switch (bud) {
+	case FIRST:
+		zhdr->first_chunks = 0;
+		break;
+	case MIDDLE:
+		zhdr->middle_chunks = 0;
+		break;
+	case LAST:
+		zhdr->last_chunks = 0;
+		break;
+	default:
+		pr_err("%s: unknown bud %d\n", __func__, bud);
+		WARN_ON(1);
+		z3fold_page_unlock(zhdr);
 		return;
 	}
 
@@ -758,7 +770,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 		atomic64_dec(&pool->pages_nr);
 		return;
 	}
-	if (test_bit(UNDER_RECLAIM, &page->private)) {
+	if (test_bit(PAGE_CLAIMED, &page->private)) {
 		z3fold_page_unlock(zhdr);
 		return;
 	}
@@ -836,20 +848,30 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 		}
 		list_for_each_prev(pos, &pool->lru) {
 			page = list_entry(pos, struct page, lru);
+
+			/* this bit could have been set by free, in which case
+			 * we pass over to the next page in the pool.
+			 */
+			if (test_and_set_bit(PAGE_CLAIMED, &page->private))
+				continue;
+
+			zhdr = page_address(page);
 			if (test_bit(PAGE_HEADLESS, &page->private))
-				/* candidate found */
 				break;
 
-			zhdr = page_address(page);
-			if (!z3fold_page_trylock(zhdr))
+			if (!z3fold_page_trylock(zhdr)) {
+				zhdr = NULL;
 				continue; /* can't evict at this point */
+			}
 			kref_get(&zhdr->refcount);
 			list_del_init(&zhdr->buddy);
 			zhdr->cpu = -1;
-			set_bit(UNDER_RECLAIM, &page->private);
 			break;
 		}
 
+		if (!zhdr)
+			break;
+
 		list_del_init(&page->lru);
 		spin_unlock(&pool->lock);
@@ -898,6 +920,7 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 		if (test_bit(PAGE_HEADLESS, &page->private)) {
 			if (ret == 0) {
 				free_z3fold_page(page);
+				atomic64_dec(&pool->pages_nr);
 				return 0;
 			}
 			spin_lock(&pool->lock);
@@ -905,7 +928,7 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 			spin_unlock(&pool->lock);
 		} else {
 			z3fold_page_lock(zhdr);
-			clear_bit(UNDER_RECLAIM, &page->private);
+			clear_bit(PAGE_CLAIMED, &page->private);
 			if (kref_put(&zhdr->refcount,
 					release_z3fold_page_locked)) {
 				atomic64_dec(&pool->pages_nr);
@@ -964,7 +987,7 @@ static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
 		set_bit(MIDDLE_CHUNK_MAPPED, &page->private);
 		break;
 	case LAST:
-		addr += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT);
+		addr += PAGE_SIZE - (handle_to_chunks(handle) << CHUNK_SHIFT);
 		break;
 	default:
 		pr_err("unknown buddy id %d\n", buddy);
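As a side note on the arbitration scheme used above: PAGE_CLAIMED acts
as a one-shot claim, and whichever of free or reclaim sets it first
owns the page while the loser backs off. The sketch below models that
protocol in userspace; C11 atomic_flag stands in for the kernel's
test_and_set_bit(), and struct fake_page together with free_path() and
reclaim_path() are hypothetical stand-ins, not kernel API.

/*
 * Userspace model of the PAGE_CLAIMED protocol: the first path to set
 * the claim bit owns the page; the other path backs off. C11 atomics
 * stand in for test_and_set_bit(); all names here are hypothetical.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_page {
	atomic_flag claimed;		/* models the PAGE_CLAIMED bit */
};

static bool claim(struct fake_page *page)
{
	/* true only for the first caller, like !test_and_set_bit() */
	return !atomic_flag_test_and_set(&page->claimed);
}

static void free_path(struct fake_page *page)
{
	if (claim(page))
		printf("free: releasing the page\n");
	else
		printf("free: page claimed by reclaim, leaving it\n");
}

static void reclaim_path(struct fake_page *page)
{
	if (claim(page))
		printf("reclaim: evicting the page\n");
	else
		printf("reclaim: page claimed by free, skipping it\n");
}

int main(void)
{
	struct fake_page page = { .claimed = ATOMIC_FLAG_INIT };

	free_path(&page);	/* first caller wins the claim */
	reclaim_path(&page);	/* sees the claim and backs off */
	return 0;
}

Because the test-and-set is atomic, exactly one path frees or evicts
the page, which is what closes the double-free window the patch
describes for headless pages.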