From patchwork Wed Apr 10 09:14:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Miaohe Lin
X-Patchwork-Id: 13623931
From: Miaohe Lin
Subject: [PATCH] fork: defer linking file vma until vma is fully initialized
Date: Wed, 10 Apr 2024 17:14:41 +0800
Message-ID: <20240410091441.3539905-1-linmiaohe@huawei.com>
X-Mailer: git-send-email 2.33.0
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Thorvald reported a WARNING [1]. The root cause is the race below:

 CPU 1                                  CPU 2
 fork                                   hugetlbfs_fallocate
  dup_mmap                               hugetlbfs_punch_hole
   i_mmap_lock_write(mapping);
   vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree.
   i_mmap_unlock_write(mapping);
   hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem!
                                         i_mmap_lock_write(mapping);
                                         hugetlb_vmdelete_list
                                          vma_interval_tree_foreach
                                           hugetlb_vma_trylock_write -- Vma_lock is cleared.
   tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem!
                                           hugetlb_vma_unlock_write -- Vma_lock is assigned!!!
                                         i_mmap_unlock_write(mapping);

hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside the
i_mmap_rwsem lock, while the vma lock can be in use at the same time. Fix
this by deferring the linking of the file vma until the vma is fully
initialized. These vmas must be initialized before they can be used.

Reported-by: Thorvald Natvig
Closes: https://lore.kernel.org/linux-mm/20240129161735.6gmjsswx62o4pbja@revolver/T/ [1]
Fixes: 8d9bfb260814 ("hugetlb: add vma based lock for pmd sharing")
Signed-off-by: Miaohe Lin
Reviewed-by: Jane Chu
---
 kernel/fork.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 84de5faa8c9a..99076dbe27d8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -714,6 +714,23 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 		} else if (anon_vma_fork(tmp, mpnt))
 			goto fail_nomem_anon_vma_fork;
 		vm_flags_clear(tmp, VM_LOCKED_MASK);
+		/*
+		 * Copy/update hugetlb private vma information.
+		 */
+		if (is_vm_hugetlb_page(tmp))
+			hugetlb_dup_vma_private(tmp);
+
+		/*
+		 * Link the vma into the MT. After using __mt_dup(), memory
+		 * allocation is not necessary here, so it cannot fail.
+		 */
+		vma_iter_bulk_store(&vmi, tmp);
+
+		mm->map_count++;
+
+		if (tmp->vm_ops && tmp->vm_ops->open)
+			tmp->vm_ops->open(tmp);
+
 		file = tmp->vm_file;
 		if (file) {
 			struct address_space *mapping = file->f_mapping;
@@ -730,25 +747,9 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 			i_mmap_unlock_write(mapping);
 		}
 
-		/*
-		 * Copy/update hugetlb private vma information.
-		 */
-		if (is_vm_hugetlb_page(tmp))
-			hugetlb_dup_vma_private(tmp);
-
-		/*
-		 * Link the vma into the MT. After using __mt_dup(), memory
-		 * allocation is not necessary here, so it cannot fail.
-		 */
-		vma_iter_bulk_store(&vmi, tmp);
-
-		mm->map_count++;
 		if (!(tmp->vm_flags & VM_WIPEONFORK))
 			retval = copy_page_range(tmp, mpnt);
 
-		if (tmp->vm_ops && tmp->vm_ops->open)
-			tmp->vm_ops->open(tmp);
-
 		if (retval) {
 			mpnt = vma_next(&vmi);
 			goto loop_out;
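
Note (illustration only, not part of the patch): the ordering this change
enforces can be sketched in user space as "finish private initialization
first, then publish under the shared lock". Every name below (struct item,
struct registry, item_init, registry_publish, registry_walk,
demo_init_then_publish) is a hypothetical stand-in and not a kernel API; the
spinning atomic_bool only loosely models i_mmap_rwsem, and the per-item flag
only loosely models the hugetlb vma lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct item_lock {
	atomic_bool held;
};

struct item {
	struct item_lock *lock;	/* loosely models the per-vma hugetlb lock */
	struct item *next;
};

struct registry {
	atomic_bool mutex;	/* loosely models i_mmap_rwsem */
	struct item *head;	/* loosely models the i_mmap interval tree */
};

static void registry_lock(struct registry *r)
{
	/* Spin until we flip the flag from false to true. */
	while (atomic_exchange_explicit(&r->mutex, true, memory_order_acquire))
		;
}

static void registry_unlock(struct registry *r)
{
	atomic_store_explicit(&r->mutex, false, memory_order_release);
}

/* Step 1: set up all private state while the item is still unreachable. */
static bool item_init(struct item *it)
{
	it->lock = malloc(sizeof(*it->lock));
	if (!it->lock)
		return false;
	atomic_init(&it->lock->held, false);
	it->next = NULL;
	return true;
}

/* Step 2: only now make the item visible to walkers. */
static void registry_publish(struct registry *r, struct item *it)
{
	registry_lock(r);
	it->next = r->head;
	r->head = it;
	registry_unlock(r);
}

/*
 * A walker, in the spirit of the hole-punch path: it trylocks each item's
 * private lock and releases the same lock it took. Returns the number of
 * items it managed to lock.
 */
static int registry_walk(struct registry *r)
{
	int locked = 0;

	registry_lock(r);
	for (struct item *it = r->head; it; it = it->next) {
		struct item_lock *l = it->lock;

		/* Holds because publication happens after item_init(). */
		assert(l != NULL);
		if (!atomic_exchange(&l->held, true)) {
			locked++;
			atomic_store(&l->held, false);
		}
	}
	registry_unlock(r);
	return locked;
}

/* Initialize, publish, then walk; returns the walker's count or -1. */
static int demo_init_then_publish(void)
{
	struct registry r;
	struct item it;
	int n;

	atomic_init(&r.mutex, false);
	r.head = NULL;

	if (!item_init(&it))
		return -1;
	registry_publish(&r, &it);
	n = registry_walk(&r);
	free(it.lock);
	return n;
}
```

Calling demo_init_then_publish() walks a registry holding one fully
initialized item. Because item_init() runs before registry_publish(), a
walker can never observe an item whose private lock is missing or is about
to be replaced, which is the property the reordering in dup_mmap() is meant
to restore.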