From patchwork Wed Apr 3 19:33:07 2019
X-Patchwork-Submitter: Jerome Glisse
X-Patchwork-Id: 10884393
From: jglisse@redhat.com
To: linux-mm@kvack.org, Andrew Morton
Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, Ralph Campbell,
    John Hubbard, Dan Williams
Subject: [PATCH v3 01/12] mm/hmm: select mmu notifier when selecting HMM v2
Date: Wed, 3 Apr 2019 15:33:07 -0400
Message-Id: <20190403193318.16478-2-jglisse@redhat.com>
In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com>
References: <20190403193318.16478-1-jglisse@redhat.com>

From: Jérôme Glisse

To avoid random config build issues, select the mmu notifier when HMM is
selected. In any case, when HMM is selected it will be by users that also
want the mmu notifier.
Changes since v1:
    - remove "select MMU_NOTIFIER" from HMM_MIRROR as it selects HMM, which
      now selects MMU_NOTIFIER

Signed-off-by: Jérôme Glisse
Acked-by: Balbir Singh
Cc: Ralph Campbell
Cc: Andrew Morton
Cc: John Hubbard
Cc: Dan Williams
---
 mm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 25c71eb8a7db..2e6d24d783f7 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -694,12 +694,12 @@ config DEV_PAGEMAP_OPS
 
 config HMM
 	bool
+	select MMU_NOTIFIER
 	select MIGRATE_VMA_HELPER
 
 config HMM_MIRROR
 	bool "HMM mirror CPU page table into a device page table"
 	depends on ARCH_HAS_HMM
-	select MMU_NOTIFIER
 	select HMM
 	help
 	  Select HMM_MIRROR if you want to mirror range of the CPU page table of a

From patchwork Wed Apr 3 19:33:08 2019
X-Patchwork-Submitter: Jerome Glisse
X-Patchwork-Id: 10884389
From: jglisse@redhat.com
To: linux-mm@kvack.org, Andrew Morton
Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, John Hubbard, Dan Williams
Subject: [PATCH v3 02/12] mm/hmm: use reference counting for HMM struct v3
Date: Wed, 3 Apr 2019 15:33:08 -0400
Message-Id: <20190403193318.16478-3-jglisse@redhat.com>
In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com>
References: <20190403193318.16478-1-jglisse@redhat.com>

From: Jérôme Glisse

Every time I read the code to check that the HMM structure does not vanish
before it should, thanks to the many locks protecting its removal, I get a
headache. Switch to reference counting instead; it is much easier to follow
and harder to break. This also removes some code that is no longer needed
with refcounting.
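For readers less familiar with the pattern, the lifetime rule adopted here is
the usual kref one: a lookup only succeeds if it can still take a reference,
and the object is freed when the last reference is dropped. Below is a
minimal editorial sketch of that pattern (not part of the patch; the struct
and names are made up, the real helpers are mm_get_hmm()/hmm_put() in the
diff that follows):

    #include <linux/kref.h>
    #include <linux/slab.h>

    /* Sketch only: simplified object using the same kref lifetime pattern. */
    struct obj {
            struct kref kref;
            /* ... payload ... */
    };

    static void obj_free(struct kref *kref)
    {
            struct obj *obj = container_of(kref, struct obj, kref);

            kfree(obj);     /* runs only once the last reference is gone */
    }

    /* Lookup: succeeds only while the object is still alive (refcount > 0). */
    static struct obj *obj_get(struct obj *obj)
    {
            if (obj && kref_get_unless_zero(&obj->kref))
                    return obj;
            return NULL;
    }

    static void obj_put(struct obj *obj)
    {
            kref_put(&obj->kref, obj_free);
    }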
Changes since v2:
    - Renamed hmm_register() to hmm_get_or_create() and updated comments
      accordingly
Changes since v1:
    - removed a bunch of useless checks (if the API is used with bogus
      arguments it is better to fail loudly so users fix their code)
    - s/hmm_get/mm_get_hmm/

Signed-off-by: Jérôme Glisse
Reviewed-by: Ralph Campbell
Cc: John Hubbard
Cc: Andrew Morton
Cc: Dan Williams
---
 include/linux/hmm.h |   2 +
 mm/hmm.c            | 190 ++++++++++++++++++++++++++++----------------
 2 files changed, 124 insertions(+), 68 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index ad50b7b4f141..716fc61fa6d4 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -131,6 +131,7 @@ enum hmm_pfn_value_e {
 /*
  * struct hmm_range - track invalidation lock on virtual address range
  *
+ * @hmm: the core HMM structure this range is active against
  * @vma: the vm area struct for the range
  * @list: all range lock are on a list
  * @start: range virtual start address (inclusive)
@@ -142,6 +143,7 @@ enum hmm_pfn_value_e {
  * @valid: pfns array did not change since it has been fill by an HMM function
  */
 struct hmm_range {
+	struct hmm		*hmm;
 	struct vm_area_struct	*vma;
 	struct list_head	list;
 	unsigned long		start;
diff --git a/mm/hmm.c b/mm/hmm.c
index fe1cd87e49ac..919d78fd21c5 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -50,6 +50,7 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
  */
 struct hmm {
 	struct mm_struct	*mm;
+	struct kref		kref;
 	spinlock_t		lock;
 	struct list_head	ranges;
 	struct list_head	mirrors;
@@ -57,24 +58,33 @@ struct hmm {
 	struct rw_semaphore	mirrors_sem;
 };
 
-/*
- * hmm_register - register HMM against an mm (HMM internal)
+static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
+{
+	struct hmm *hmm = READ_ONCE(mm->hmm);
+
+	if (hmm && kref_get_unless_zero(&hmm->kref))
+		return hmm;
+
+	return NULL;
+}
+
+/**
+ * hmm_get_or_create - register HMM against an mm (HMM internal)
  *
  * @mm: mm struct to attach to
+ * Returns: returns an HMM object, either by referencing the existing
+ *          (per-process) object, or by creating a new one.
  *
- * This is not intended to be used directly by device drivers. It allocates an
- * HMM struct if mm does not have one, and initializes it.
+ * This is not intended to be used directly by device drivers. If mm already
+ * has an HMM struct then it get a reference on it and returns it. Otherwise
+ * it allocates an HMM struct, initializes it, associate it with the mm and
+ * returns it.
  */
-static struct hmm *hmm_register(struct mm_struct *mm)
+static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 {
-	struct hmm *hmm = READ_ONCE(mm->hmm);
+	struct hmm *hmm = mm_get_hmm(mm);
 	bool cleanup = false;
 
-	/*
-	 * The hmm struct can only be freed once the mm_struct goes away,
-	 * hence we should always have pre-allocated an new hmm struct
-	 * above.
-	 */
 	if (hmm)
 		return hmm;
 
@@ -86,6 +96,7 @@ static struct hmm *hmm_register(struct mm_struct *mm)
 	hmm->mmu_notifier.ops = NULL;
 	INIT_LIST_HEAD(&hmm->ranges);
 	spin_lock_init(&hmm->lock);
+	kref_init(&hmm->kref);
 	hmm->mm = mm;
 
 	spin_lock(&mm->page_table_lock);
@@ -106,7 +117,7 @@ static struct hmm *hmm_register(struct mm_struct *mm)
 	if (__mmu_notifier_register(&hmm->mmu_notifier, mm))
 		goto error_mm;
 
-	return mm->hmm;
+	return hmm;
 
 error_mm:
 	spin_lock(&mm->page_table_lock);
@@ -118,9 +129,41 @@ static struct hmm *hmm_register(struct mm_struct *mm)
 	return NULL;
 }
 
+static void hmm_free(struct kref *kref)
+{
+	struct hmm *hmm = container_of(kref, struct hmm, kref);
+	struct mm_struct *mm = hmm->mm;
+
+	mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm);
+
+	spin_lock(&mm->page_table_lock);
+	if (mm->hmm == hmm)
+		mm->hmm = NULL;
+	spin_unlock(&mm->page_table_lock);
+
+	kfree(hmm);
+}
+
+static inline void hmm_put(struct hmm *hmm)
+{
+	kref_put(&hmm->kref, hmm_free);
+}
+
 void hmm_mm_destroy(struct mm_struct *mm)
 {
-	kfree(mm->hmm);
+	struct hmm *hmm;
+
+	spin_lock(&mm->page_table_lock);
+	hmm = mm_get_hmm(mm);
+	mm->hmm = NULL;
+	if (hmm) {
+		hmm->mm = NULL;
+		spin_unlock(&mm->page_table_lock);
+		hmm_put(hmm);
+		return;
+	}
+
+	spin_unlock(&mm->page_table_lock);
 }
 
 static int hmm_invalidate_range(struct hmm *hmm, bool device,
@@ -165,7 +208,7 @@ static int hmm_invalidate_range(struct hmm *hmm, bool device,
 static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct hmm_mirror *mirror;
-	struct hmm *hmm = mm->hmm;
+	struct hmm *hmm = mm_get_hmm(mm);
 
 	down_write(&hmm->mirrors_sem);
 	mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
@@ -186,13 +229,16 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 					  struct hmm_mirror, list);
 	}
 	up_write(&hmm->mirrors_sem);
+
+	hmm_put(hmm);
 }
 
 static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 			const struct mmu_notifier_range *range)
 {
+	struct hmm *hmm = mm_get_hmm(range->mm);
 	struct hmm_update update;
-	struct hmm *hmm = range->mm->hmm;
+	int ret;
 
 	VM_BUG_ON(!hmm);
 
@@ -200,14 +246,16 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	update.end = range->end;
 	update.event = HMM_UPDATE_INVALIDATE;
 	update.blockable = range->blockable;
-	return hmm_invalidate_range(hmm, true, &update);
+	ret = hmm_invalidate_range(hmm, true, &update);
+	hmm_put(hmm);
+	return ret;
 }
 
 static void hmm_invalidate_range_end(struct mmu_notifier *mn,
 			const struct mmu_notifier_range *range)
 {
+	struct hmm *hmm = mm_get_hmm(range->mm);
 	struct hmm_update update;
-	struct hmm *hmm = range->mm->hmm;
 
 	VM_BUG_ON(!hmm);
 
@@ -216,6 +264,7 @@ static void hmm_invalidate_range_end(struct mmu_notifier *mn,
 	update.event = HMM_UPDATE_INVALIDATE;
 	update.blockable = true;
 	hmm_invalidate_range(hmm, false, &update);
+	hmm_put(hmm);
 }
 
 static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
@@ -241,24 +290,13 @@ int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
 	if (!mm || !mirror || !mirror->ops)
 		return -EINVAL;
 
-again:
-	mirror->hmm = hmm_register(mm);
+	mirror->hmm = hmm_get_or_create(mm);
 	if (!mirror->hmm)
 		return -ENOMEM;
 
 	down_write(&mirror->hmm->mirrors_sem);
-	if (mirror->hmm->mm == NULL) {
-		/*
-		 * A racing hmm_mirror_unregister() is about to destroy the hmm
-		 * struct. Try again to allocate a new one.
-		 */
-		up_write(&mirror->hmm->mirrors_sem);
-		mirror->hmm = NULL;
-		goto again;
-	} else {
-		list_add(&mirror->list, &mirror->hmm->mirrors);
-		up_write(&mirror->hmm->mirrors_sem);
-	}
+	list_add(&mirror->list, &mirror->hmm->mirrors);
+	up_write(&mirror->hmm->mirrors_sem);
 
 	return 0;
 }
@@ -273,33 +311,18 @@ EXPORT_SYMBOL(hmm_mirror_register);
  */
 void hmm_mirror_unregister(struct hmm_mirror *mirror)
 {
-	bool should_unregister = false;
-	struct mm_struct *mm;
-	struct hmm *hmm;
+	struct hmm *hmm = READ_ONCE(mirror->hmm);
 
-	if (mirror->hmm == NULL)
+	if (hmm == NULL)
 		return;
 
-	hmm = mirror->hmm;
 	down_write(&hmm->mirrors_sem);
 	list_del_init(&mirror->list);
-	should_unregister = list_empty(&hmm->mirrors);
+	/* To protect us against double unregister ... */
 	mirror->hmm = NULL;
-	mm = hmm->mm;
-	hmm->mm = NULL;
 	up_write(&hmm->mirrors_sem);
 
-	if (!should_unregister || mm == NULL)
-		return;
-
-	mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm);
-
-	spin_lock(&mm->page_table_lock);
-	if (mm->hmm == hmm)
-		mm->hmm = NULL;
-	spin_unlock(&mm->page_table_lock);
-
-	kfree(hmm);
+	hmm_put(hmm);
 }
 EXPORT_SYMBOL(hmm_mirror_unregister);
 
@@ -708,23 +731,29 @@ int hmm_vma_get_pfns(struct hmm_range *range)
 	struct mm_walk mm_walk;
 	struct hmm *hmm;
 
+	range->hmm = NULL;
+
 	/* Sanity check, this really should not happen ! */
 	if (range->start < vma->vm_start || range->start >= vma->vm_end)
 		return -EINVAL;
 	if (range->end < vma->vm_start || range->end > vma->vm_end)
 		return -EINVAL;
 
-	hmm = hmm_register(vma->vm_mm);
+	hmm = hmm_get_or_create(vma->vm_mm);
 	if (!hmm)
 		return -ENOMEM;
-	/* Caller must have registered a mirror, via hmm_mirror_register() ! */
-	if (!hmm->mmu_notifier.ops)
+
+	/* Check if hmm_mm_destroy() was call. */
+	if (hmm->mm == NULL) {
+		hmm_put(hmm);
 		return -EINVAL;
+	}
 
 	/* FIXME support hugetlb fs */
 	if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
			vma_is_dax(vma)) {
 		hmm_pfns_special(range);
+		hmm_put(hmm);
 		return -EINVAL;
 	}
 
@@ -736,6 +765,7 @@ int hmm_vma_get_pfns(struct hmm_range *range)
 	 * operations such has atomic access would not work.
 	 */
 		hmm_pfns_clear(range, range->pfns, range->start, range->end);
+		hmm_put(hmm);
 		return -EPERM;
 	}
 
@@ -758,6 +788,12 @@ int hmm_vma_get_pfns(struct hmm_range *range)
 	mm_walk.pte_hole = hmm_vma_walk_hole;
 
 	walk_page_range(range->start, range->end, &mm_walk);
+	/*
+	 * Transfer hmm reference to the range struct it will be drop inside
+	 * the hmm_vma_range_done() function (which _must_ be call if this
+	 * function return 0).
+	 */
+	range->hmm = hmm;
 	return 0;
 }
 EXPORT_SYMBOL(hmm_vma_get_pfns);
@@ -802,25 +838,27 @@ EXPORT_SYMBOL(hmm_vma_get_pfns);
  */
 bool hmm_vma_range_done(struct hmm_range *range)
 {
-	unsigned long npages = (range->end - range->start) >> PAGE_SHIFT;
-	struct hmm *hmm;
+	bool ret = false;
 
-	if (range->end <= range->start) {
+	/* Sanity check this really should not happen. */
+	if (range->hmm == NULL || range->end <= range->start) {
 		BUG();
 		return false;
 	}
 
-	hmm = hmm_register(range->vma->vm_mm);
-	if (!hmm) {
-		memset(range->pfns, 0, sizeof(*range->pfns) * npages);
-		return false;
-	}
-
-	spin_lock(&hmm->lock);
+	spin_lock(&range->hmm->lock);
 	list_del_rcu(&range->list);
-	spin_unlock(&hmm->lock);
+	ret = range->valid;
+	spin_unlock(&range->hmm->lock);
 
-	return range->valid;
+	/* Is the mm still alive ? */
+	if (range->hmm->mm == NULL)
+		ret = false;
+
+	/* Drop reference taken by hmm_vma_fault() or hmm_vma_get_pfns() */
+	hmm_put(range->hmm);
+	range->hmm = NULL;
+	return ret;
 }
 EXPORT_SYMBOL(hmm_vma_range_done);
 
@@ -880,25 +918,31 @@ int hmm_vma_fault(struct hmm_range *range, bool block)
 	struct hmm *hmm;
 	int ret;
 
+	range->hmm = NULL;
+
 	/* Sanity check, this really should not happen ! */
 	if (range->start < vma->vm_start || range->start >= vma->vm_end)
 		return -EINVAL;
 	if (range->end < vma->vm_start || range->end > vma->vm_end)
 		return -EINVAL;
 
-	hmm = hmm_register(vma->vm_mm);
+	hmm = hmm_get_or_create(vma->vm_mm);
 	if (!hmm) {
 		hmm_pfns_clear(range, range->pfns, range->start, range->end);
 		return -ENOMEM;
 	}
-	/* Caller must have registered a mirror using hmm_mirror_register() */
-	if (!hmm->mmu_notifier.ops)
+
+	/* Check if hmm_mm_destroy() was call. */
+	if (hmm->mm == NULL) {
+		hmm_put(hmm);
 		return -EINVAL;
+	}
 
 	/* FIXME support hugetlb fs */
 	if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) ||
			vma_is_dax(vma)) {
 		hmm_pfns_special(range);
+		hmm_put(hmm);
 		return -EINVAL;
 	}
 
@@ -910,6 +954,7 @@ int hmm_vma_fault(struct hmm_range *range, bool block)
 	 * operations such has atomic access would not work.
 	 */
 		hmm_pfns_clear(range, range->pfns, range->start, range->end);
+		hmm_put(hmm);
 		return -EPERM;
 	}
 
@@ -945,7 +990,16 @@ int hmm_vma_fault(struct hmm_range *range, bool block)
 		hmm_pfns_clear(range, &range->pfns[i], hmm_vma_walk.last,
 			       range->end);
 		hmm_vma_range_done(range);
+		hmm_put(hmm);
+	} else {
+		/*
+		 * Transfer hmm reference to the range struct it will be drop
+		 * inside the hmm_vma_range_done() function (which _must_ be
+		 * call if this function return 0).
+		 */
+		range->hmm = hmm;
 	}
+
 	return ret;
 }
 EXPORT_SYMBOL(hmm_vma_fault);

From patchwork Wed Apr 3 19:33:09 2019
X-Patchwork-Submitter: Jerome Glisse
X-Patchwork-Id: 10884391
From: jglisse@redhat.com
To: linux-mm@kvack.org, Andrew Morton
Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, Dan Williams
Subject: [PATCH v3 03/12] mm/hmm: do not erase snapshot when a range is invalidated
Date: Wed, 3 Apr 2019 15:33:09 -0400
Message-Id: <20190403193318.16478-4-jglisse@redhat.com>
In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com>
References: <20190403193318.16478-1-jglisse@redhat.com>

From: Jérôme Glisse

Users of HMM might be using the snapshot information to do preparatory
steps, like DMA mapping pages to a device, before checking for invalidation
through hmm_vma_range_done(), so do not erase that information; assume users
will do the right thing.
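As an illustration (an editorial sketch, not part of the patch), the
driver-side flow this enables looks roughly like the following;
my_dma_map_pfns(), my_commit_device_page_table() and the driver lock helpers
are hypothetical, and the caller is assumed to hold mmap_sem:

    #include <linux/hmm.h>

    /* Sketch only: hypothetical driver helpers, error handling trimmed. */
    static int my_mirror_range(struct hmm_range *range)
    {
            int ret;

    again:
            ret = hmm_vma_get_pfns(range);          /* caller holds mmap_sem */
            if (ret)
                    return ret;

            /* Preparatory step done on the snapshot, e.g. DMA mapping pages. */
            my_dma_map_pfns(range);                 /* hypothetical helper */

            my_driver_update_lock();                /* hypothetical lock */
            if (!hmm_vma_range_done(range)) {
                    /* Invalidated meanwhile: the snapshot was not erased,
                     * so the preparatory work is kept and we just retry. */
                    my_driver_update_unlock();
                    goto again;
            }
            my_commit_device_page_table(range);     /* hypothetical helper */
            my_driver_update_unlock();
            return 0;
    }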
Signed-off-by: Jérôme Glisse
Reviewed-by: Ralph Campbell
Reviewed-by: John Hubbard
Cc: Andrew Morton
Cc: Dan Williams
---
 mm/hmm.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 919d78fd21c5..84e0577a912a 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -174,16 +174,10 @@ static int hmm_invalidate_range(struct hmm *hmm, bool device,
 
 	spin_lock(&hmm->lock);
 	list_for_each_entry(range, &hmm->ranges, list) {
-		unsigned long addr, idx, npages;
-
 		if (update->end < range->start ||
 		    update->start >= range->end)
 			continue;
 
 		range->valid = false;
-		addr = max(update->start, range->start);
-		idx = (addr - range->start) >> PAGE_SHIFT;
-		npages = (min(range->end, update->end) - addr) >> PAGE_SHIFT;
-		memset(&range->pfns[idx], 0, sizeof(*range->pfns) * npages);
 	}
 	spin_unlock(&hmm->lock);

From patchwork Wed Apr 3 19:33:10 2019
X-Patchwork-Submitter: Jerome Glisse
X-Patchwork-Id: 10884395
From: jglisse@redhat.com
To: linux-mm@kvack.org, Andrew Morton
Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, Dan Williams
Subject: [PATCH v3 04/12] mm/hmm: improve and rename hmm_vma_get_pfns() to hmm_range_snapshot() v2
Date: Wed, 3 Apr 2019 15:33:10 -0400
Message-Id: <20190403193318.16478-5-jglisse@redhat.com>
In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com>
References: <20190403193318.16478-1-jglisse@redhat.com>

From: Jérôme Glisse

Rename for consistency between code, comments and documentation. Also
improve the comments on all the possible return values. Improve the function
by returning the number of populated entries in the pfns array.

Changes since v1:
    - updated documentation
    - reformatted some comments

Signed-off-by: Jérôme Glisse
Reviewed-by: Ralph Campbell
Reviewed-by: John Hubbard
Reviewed-by: Ira Weiny
Cc: Andrew Morton
Cc: Dan Williams
---
 Documentation/vm/hmm.rst | 26 ++++++++++++++++++--------
 include/linux/hmm.h      |  4 ++--
 mm/hmm.c                 | 31 +++++++++++++++++--------------
 3 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 44205f0b671f..d9b27bdadd1b 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -189,11 +189,7 @@ the driver callback returns.
 When the device driver wants to populate a range of virtual addresses, it can
 use either::
 
-  int hmm_vma_get_pfns(struct vm_area_struct *vma,
-                      struct hmm_range *range,
-                      unsigned long start,
-                      unsigned long end,
-                      hmm_pfn_t *pfns);
+  long hmm_range_snapshot(struct hmm_range *range);
   int hmm_vma_fault(struct vm_area_struct *vma,
                    struct hmm_range *range,
                    unsigned long start,
@@ -202,7 +198,7 @@ When the device driver wants to populate a range of virtual addresses, it can
                    bool write,
                    bool block);
 
-The first one (hmm_vma_get_pfns()) will only fetch present CPU page table
+The first one (hmm_range_snapshot()) will only fetch present CPU page table
 entries and will not trigger a page fault on missing or non-present entries.
 The second one does trigger a page fault on missing or read-only entry if the
 write parameter is true. Page faults use the generic mm page fault code path
@@ -220,19 +216,33 @@ Locking with the update() callback is the most important aspect the driver must
      {
       struct hmm_range range;
       ...
+
+      range.start = ...;
+      range.end = ...;
+      range.pfns = ...;
+      range.flags = ...;
+      range.values = ...;
+      range.pfn_shift = ...;
+
      again:
-      ret = hmm_vma_get_pfns(vma, &range, start, end, pfns);
-      if (ret)
+      down_read(&mm->mmap_sem);
+      range.vma = ...;
+      ret = hmm_range_snapshot(&range);
+      if (ret) {
+          up_read(&mm->mmap_sem);
           return ret;
+      }
 
       take_lock(driver->update);
       if (!hmm_vma_range_done(vma, &range)) {
           release_lock(driver->update);
+          up_read(&mm->mmap_sem);
           goto again;
       }
 
       // Use pfns array content to update device page table
 
       release_lock(driver->update);
+      up_read(&mm->mmap_sem);
       return 0;
      }
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 716fc61fa6d4..32206b0b1bfd 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -365,11 +365,11 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror);
  * table invalidation serializes on it.
  *
  * YOU MUST CALL hmm_vma_range_done() ONCE AND ONLY ONCE EACH TIME YOU CALL
- * hmm_vma_get_pfns() WITHOUT ERROR !
+ * hmm_range_snapshot() WITHOUT ERROR !
  *
  * IF YOU DO NOT FOLLOW THE ABOVE RULE THE SNAPSHOT CONTENT MIGHT BE INVALID !
  */
-int hmm_vma_get_pfns(struct hmm_range *range);
+long hmm_range_snapshot(struct hmm_range *range);
 bool hmm_vma_range_done(struct hmm_range *range);
 
diff --git a/mm/hmm.c b/mm/hmm.c
index 84e0577a912a..bd957a9f10d1 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -702,23 +702,25 @@ static void hmm_pfns_special(struct hmm_range *range)
 }
 
 /*
- * hmm_vma_get_pfns() - snapshot CPU page table for a range of virtual addresses
- * @range: range being snapshotted
- * Returns: -EINVAL if invalid argument, -ENOMEM out of memory, -EPERM invalid
- * vma permission, 0 success
+ * hmm_range_snapshot() - snapshot CPU page table for a range
+ * @range: range
+ * Returns: number of valid pages in range->pfns[] (from range start
+ *          address). This may be zero. If the return value is negative,
+ *          then one of the following values may be returned:
+ *
+ *           -EINVAL  invalid arguments or mm or virtual address are in an
+ *                    invalid vma (ie either hugetlbfs or device file vma).
+ *           -EPERM   For example, asking for write, when the range is
+ *                    read-only
+ *           -EAGAIN  Caller needs to retry
+ *           -EFAULT  Either no valid vma exists for this range, or it is
+ *                    illegal to access the range
  *
  * This snapshots the CPU page table for a range of virtual addresses. Snapshot
  * validity is tracked by range struct. See hmm_vma_range_done() for further
  * information.
- *
- * The range struct is initialized here. It tracks the CPU page table, but only
- * if the function returns success (0), in which case the caller must then call
- * hmm_vma_range_done() to stop CPU page table update tracking on this range.
- *
- * NOT CALLING hmm_vma_range_done() IF FUNCTION RETURNS 0 WILL LEAD TO SERIOUS
- * MEMORY CORRUPTION ! YOU HAVE BEEN WARNED !
  */
-int hmm_vma_get_pfns(struct hmm_range *range)
+long hmm_range_snapshot(struct hmm_range *range)
 {
 	struct vm_area_struct *vma = range->vma;
 	struct hmm_vma_walk hmm_vma_walk;
@@ -772,6 +774,7 @@ int hmm_vma_get_pfns(struct hmm_range *range)
 	hmm_vma_walk.fault = false;
 	hmm_vma_walk.range = range;
 	mm_walk.private = &hmm_vma_walk;
+	hmm_vma_walk.last = range->start;
 
 	mm_walk.vma = vma;
 	mm_walk.mm = vma->vm_mm;
@@ -788,9 +791,9 @@ int hmm_vma_get_pfns(struct hmm_range *range)
 	 * function return 0).
 	 */
 	range->hmm = hmm;
-	return 0;
+	return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT;
 }
-EXPORT_SYMBOL(hmm_vma_get_pfns);
+EXPORT_SYMBOL(hmm_range_snapshot);
 
 /*
  * hmm_vma_range_done() - stop tracking change to CPU page table over a range
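A short usage note on the new return value (editorial sketch, not from the
patch): since hmm_range_snapshot() now reports how many pfns entries were
populated instead of plain 0, a caller that only wants complete snapshots can
compare the result with the size of the range; the retry policy below is
purely hypothetical:

    #include <linux/hmm.h>

    /* Sketch only: accept nothing less than a snapshot of the whole range. */
    static long my_snapshot_whole_range(struct hmm_range *range)
    {
            unsigned long npages = (range->end - range->start) >> PAGE_SHIFT;
            long ret;

            ret = hmm_range_snapshot(range);        /* caller holds mmap_sem */
            if (ret < 0)
                    return ret;     /* -EINVAL, -EPERM, -EAGAIN, -EFAULT, ... */

            if (ret != npages) {
                    /* Hypothetical policy: partial snapshot, give up. The
                     * range is still tracked, so it must be released. */
                    hmm_vma_range_done(range);
                    return -EAGAIN;
            }
            return 0;               /* caller later calls hmm_vma_range_done() */
    }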
From patchwork Wed Apr 3 19:33:11 2019
X-Patchwork-Submitter: Jerome Glisse
X-Patchwork-Id: 10884397
From: jglisse@redhat.com
To: linux-mm@kvack.org, Andrew Morton
Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, John Hubbard, Dan Williams
Subject: [PATCH v3 05/12] mm/hmm: improve and rename hmm_vma_fault() to hmm_range_fault() v3
Date: Wed, 3 Apr 2019 15:33:11 -0400
Message-Id: <20190403193318.16478-6-jglisse@redhat.com>
In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com>
References: <20190403193318.16478-1-jglisse@redhat.com>

From: Jérôme Glisse

Minor optimization around hmm_pte_need_fault(). Rename for consistency
between code, comments and documentation. Also improve the comments on all
the possible return values. Improve the function by returning the number of
populated entries in the pfns array.

Changes since v2:
    - updated commit message
Changes since v1:
    - updated documentation
    - reformatted some comments

Signed-off-by: Jérôme Glisse
Reviewed-by: Ralph Campbell
Cc: Andrew Morton
Cc: John Hubbard
Cc: Dan Williams
---
 Documentation/vm/hmm.rst |  8 +---
 include/linux/hmm.h      | 13 +++++-
 mm/hmm.c                 | 91 +++++++++++++++++-----------------------
 3 files changed, 52 insertions(+), 60 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index d9b27bdadd1b..61f073215a8d 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -190,13 +190,7 @@ When the device driver wants to populate a range of virtual addresses, it can
 use either::
 
   long hmm_range_snapshot(struct hmm_range *range);
-  int hmm_vma_fault(struct vm_area_struct *vma,
-                    struct hmm_range *range,
-                    unsigned long start,
-                    unsigned long end,
-                    hmm_pfn_t *pfns,
-                    bool write,
-                    bool block);
+  long hmm_range_fault(struct hmm_range *range, bool block);
 
 The first one (hmm_range_snapshot()) will only fetch present CPU page table
 entries and will not trigger a page fault on missing or non-present entries.
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 32206b0b1bfd..e9afd23c2eac 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -391,7 +391,18 @@ bool hmm_vma_range_done(struct hmm_range *range);
  *
  * See the function description in mm/hmm.c for further documentation.
  */
-int hmm_vma_fault(struct hmm_range *range, bool block);
+long hmm_range_fault(struct hmm_range *range, bool block);
+
+/* This is a temporary helper to avoid merge conflict between trees. */
+static inline int hmm_vma_fault(struct hmm_range *range, bool block)
+{
+	long ret = hmm_range_fault(range, block);
+	if (ret == -EBUSY)
+		ret = -EAGAIN;
+	else if (ret == -EAGAIN)
+		ret = -EBUSY;
+	return ret < 0 ? ret : 0;
+}
 
 /* Below are for HMM internal use only! Not to be used by device driver! */
 void hmm_mm_destroy(struct mm_struct *mm);
diff --git a/mm/hmm.c b/mm/hmm.c
index bd957a9f10d1..b7e4034d96e1 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -340,13 +340,13 @@ static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
 	flags |= write_fault ? FAULT_FLAG_WRITE : 0;
 	ret = handle_mm_fault(vma, addr, flags);
 	if (ret & VM_FAULT_RETRY)
-		return -EBUSY;
+		return -EAGAIN;
 	if (ret & VM_FAULT_ERROR) {
 		*pfn = range->values[HMM_PFN_ERROR];
 		return -EFAULT;
 	}
 
-	return -EAGAIN;
+	return -EBUSY;
 }
 
 static int hmm_pfns_bad(unsigned long addr,
@@ -372,7 +372,7 @@ static int hmm_pfns_bad(unsigned long addr,
  * @fault: should we fault or not ?
  * @write_fault: write fault ?
  * @walk: mm_walk structure
- * Returns: 0 on success, -EAGAIN after page fault, or page fault error
+ * Returns: 0 on success, -EBUSY after page fault, or page fault error
  *
 * This function will be called whenever pmd_none() or pte_none() returns true,
 * or whenever there is no page directory covering the virtual address range.
@@ -395,12 +395,12 @@ static int hmm_vma_walk_hole_(unsigned long addr, unsigned long end,
 
 			ret = hmm_vma_do_fault(walk, addr, write_fault,
 					       &pfns[i]);
-			if (ret != -EAGAIN)
+			if (ret != -EBUSY)
 				return ret;
 		}
 	}
 
-	return (fault || write_fault) ? -EAGAIN : 0;
+	return (fault || write_fault) ? -EBUSY : 0;
 }
 
 static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
@@ -531,11 +531,11 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	uint64_t orig_pfn = *pfn;
 
 	*pfn = range->values[HMM_PFN_NONE];
-	cpu_flags = pte_to_hmm_pfn_flags(range, pte);
-	hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
-			   &fault, &write_fault);
+	fault = write_fault = false;
 
 	if (pte_none(pte)) {
+		hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0,
+				   &fault, &write_fault);
 		if (fault || write_fault)
 			goto fault;
 		return 0;
@@ -574,7 +574,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			hmm_vma_walk->last = addr;
 			migration_entry_wait(vma->vm_mm, pmdp, addr);
-			return -EAGAIN;
+			return -EBUSY;
 		}
 		return 0;
 	}
@@ -582,6 +582,10 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 		/* Report error for everything else */
 		*pfn = range->values[HMM_PFN_ERROR];
 		return -EFAULT;
+	} else {
+		cpu_flags = pte_to_hmm_pfn_flags(range, pte);
+		hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
+				   &fault, &write_fault);
 	}
 
 	if (fault || write_fault)
@@ -632,7 +636,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 		if (fault || write_fault) {
 			hmm_vma_walk->last = addr;
 			pmd_migration_entry_wait(vma->vm_mm, pmdp);
-			return -EAGAIN;
+			return -EBUSY;
 		}
 		return 0;
 	} else if (!pmd_present(pmd))
@@ -860,53 +864,34 @@ bool hmm_vma_range_done(struct hmm_range *range)
 EXPORT_SYMBOL(hmm_vma_range_done);
 
 /*
- * hmm_vma_fault() - try to fault some address in a virtual address range
+ * hmm_range_fault() - try to fault some address in a virtual address range
  * @range: range being faulted
 * @block: allow blocking on fault (if true it sleeps and do not drop mmap_sem)
- * Returns: 0 success, error otherwise (-EAGAIN means mmap_sem have been drop)
+ * Returns: number of valid pages in range->pfns[] (from range start
+ *          address). This may be zero. If the return value is negative,
+ *          then one of the following values may be returned:
+ *
+ *           -EINVAL  invalid arguments or mm or virtual address are in an
+ *                    invalid vma (ie either hugetlbfs or device file vma).
+ *           -ENOMEM: Out of memory.
+ *           -EPERM:  Invalid permission (for instance asking for write and
+ *                    range is read only).
+ *           -EAGAIN: If you need to retry and mmap_sem was drop. This can only
+ *                    happens if block argument is false.
+ *           -EBUSY:  If the the range is being invalidated and you should wait
+ *                    for invalidation to finish.
+ *           -EFAULT: Invalid (ie either no valid vma or it is illegal to access
+ *                    that range), number of valid pages in range->pfns[] (from
+ *                    range start address).
  *
  * This is similar to a regular CPU page fault except that it will not trigger
- * any memory migration if the memory being faulted is not accessible by CPUs.
+ * any memory migration if the memory being faulted is not accessible by CPUs
+ * and caller does not ask for migration.
  *
  * On error, for one virtual address in the range, the function will mark the
 * corresponding HMM pfn entry with an error flag.
- * - * Expected use pattern: - * retry: - * down_read(&mm->mmap_sem); - * // Find vma and address device wants to fault, initialize hmm_pfn_t - * // array accordingly - * ret = hmm_vma_fault(range, write, block); - * switch (ret) { - * case -EAGAIN: - * hmm_vma_range_done(range); - * // You might want to rate limit or yield to play nicely, you may - * // also commit any valid pfn in the array assuming that you are - * // getting true from hmm_vma_range_monitor_end() - * goto retry; - * case 0: - * break; - * case -ENOMEM: - * case -EINVAL: - * case -EPERM: - * default: - * // Handle error ! - * up_read(&mm->mmap_sem) - * return; - * } - * // Take device driver lock that serialize device page table update - * driver_lock_device_page_table_update(); - * hmm_vma_range_done(range); - * // Commit pfns we got from hmm_vma_fault() - * driver_unlock_device_page_table_update(); - * up_read(&mm->mmap_sem) - * - * YOU MUST CALL hmm_vma_range_done() AFTER THIS FUNCTION RETURN SUCCESS (0) - * BEFORE FREEING THE range struct OR YOU WILL HAVE SERIOUS MEMORY CORRUPTION ! - * - * YOU HAVE BEEN WARNED ! */ -int hmm_vma_fault(struct hmm_range *range, bool block) +long hmm_range_fault(struct hmm_range *range, bool block) { struct vm_area_struct *vma = range->vma; unsigned long start = range->start; @@ -978,7 +963,8 @@ int hmm_vma_fault(struct hmm_range *range, bool block) do { ret = walk_page_range(start, range->end, &mm_walk); start = hmm_vma_walk.last; - } while (ret == -EAGAIN); + /* Keep trying while the range is valid. */ + } while (ret == -EBUSY && range->valid); if (ret) { unsigned long i; @@ -988,6 +974,7 @@ int hmm_vma_fault(struct hmm_range *range, bool block) range->end); hmm_vma_range_done(range); hmm_put(hmm); + return ret; } else { /* * Transfer hmm reference to the range struct it will be drop @@ -997,9 +984,9 @@ int hmm_vma_fault(struct hmm_range *range, bool block) range->hmm = hmm; } - return ret; + return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; } -EXPORT_SYMBOL(hmm_vma_fault); +EXPORT_SYMBOL(hmm_range_fault); #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
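[Illustrative sketch, not part of the patch: how a caller might consume the new long return value of hmm_range_fault(), based on the return codes documented above. dummy_mirror_range() and dummy_commit_pfns() are invented names.]

/* Placeholder for driver logic that programs the device page table. */
static void dummy_commit_pfns(struct hmm_range *range, long npages)
{
}

static long dummy_mirror_range(struct hmm_range *range, bool block)
{
	long npages = hmm_range_fault(range, block);

	if (npages < 0) {
		/*
		 * -EAGAIN: mmap_sem was dropped (only possible when block is
		 *          false); retake it and retry.
		 * -EBUSY:  the range is being invalidated; wait for the
		 *          invalidation to finish before retrying.
		 * Other negative values (-EINVAL, -ENOMEM, -EPERM, -EFAULT)
		 * are hard errors.
		 */
		return npages;
	}

	/*
	 * On success the first npages entries of range->pfns[] are valid.
	 * The snapshot must still be ended, and committed only if it is
	 * still valid (under the driver's update lock in real code).
	 */
	if (hmm_vma_range_done(range))
		dummy_commit_pfns(range, npages);
	return npages;
}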
From patchwork Wed Apr 3 19:33:12 2019
X-Patchwork-Submitter: Jerome Glisse
X-Patchwork-Id: 10884399
From: jglisse@redhat.com
To: linux-mm@kvack.org, Andrew Morton
Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, John Hubbard, Dan Williams, Dan Carpenter, Matthew Wilcox
Subject: [PATCH v3 06/12] mm/hmm: improve driver API to work and wait over a range v3
Date: Wed, 3 Apr 2019 15:33:12 -0400
Message-Id: <20190403193318.16478-7-jglisse@redhat.com>
In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com>
References: <20190403193318.16478-1-jglisse@redhat.com>

From: Jérôme Glisse

A common use case for HMM mirror is a user trying to mirror a range that gets invalidated by some core mm event before they can program the hardware. Instead of having the user retry the mirror right away, provide a completion mechanism so they can wait for any active invalidation affecting the range.

This also changes how hmm_range_snapshot() and hmm_range_fault() work by not relying on the vma, so that we can drop the mmap_sem while waiting and look up the vma again on retry.

Changes since v2:
- Updated documentation to match the new API.
- Added more comments in the old API temporary wrapper.
- Consolidated documentation in hmm.rst to avoid it getting out of sync.
Changes since v1: - squashed: Dan Carpenter: potential deadlock in nonblocking code Signed-off-by: Jérôme Glisse Reviewed-by: Ralph Campbell Cc: Andrew Morton Cc: John Hubbard Cc: Dan Williams Cc: Dan Carpenter Cc: Matthew Wilcox --- Documentation/vm/hmm.rst | 25 +- include/linux/hmm.h | 145 ++++++++--- mm/hmm.c | 531 +++++++++++++++++++-------------------- 3 files changed, 387 insertions(+), 314 deletions(-) diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst index 61f073215a8d..945d5fb6d14a 100644 --- a/Documentation/vm/hmm.rst +++ b/Documentation/vm/hmm.rst @@ -217,17 +217,33 @@ Locking with the update() callback is the most important aspect the driver must range.flags = ...; range.values = ...; range.pfn_shift = ...; + hmm_range_register(&range); + + /* + * Just wait for range to be valid, safe to ignore return value as we + * will use the return value of hmm_range_snapshot() below under the + * mmap_sem to ascertain the validity of the range. + */ + hmm_range_wait_until_valid(&range, TIMEOUT_IN_MSEC); again: down_read(&mm->mmap_sem); - range.vma = ...; ret = hmm_range_snapshot(&range); if (ret) { up_read(&mm->mmap_sem); + if (ret == -EAGAIN) { + /* + * No need to check hmm_range_wait_until_valid() return value + * on retry we will get proper error with hmm_range_snapshot() + */ + hmm_range_wait_until_valid(&range, TIMEOUT_IN_MSEC); + goto again; + } + hmm_mirror_unregister(&range); return ret; } take_lock(driver->update); - if (!hmm_vma_range_done(vma, &range)) { + if (!range.valid) { release_lock(driver->update); up_read(&mm->mmap_sem); goto again; @@ -235,14 +251,15 @@ Locking with the update() callback is the most important aspect the driver must // Use pfns array content to update device page table + hmm_mirror_unregister(&range); release_lock(driver->update); up_read(&mm->mmap_sem); return 0; } The driver->update lock is the same lock that the driver takes inside its -update() callback. That lock must be held before hmm_vma_range_done() to avoid -any race with a concurrent CPU page table update. +update() callback. That lock must be held before checking the range.valid +field to avoid any race with a concurrent CPU page table update. HMM implements all this on top of the mmu_notifier API because we wanted a simpler API and also to be able to perform optimizations latter on like doing diff --git a/include/linux/hmm.h b/include/linux/hmm.h index e9afd23c2eac..ec4bfa91648f 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -77,8 +77,34 @@ #include #include #include +#include -struct hmm; + +/* + * struct hmm - HMM per mm struct + * + * @mm: mm struct this HMM struct is bound to + * @lock: lock protecting ranges list + * @ranges: list of range being snapshotted + * @mirrors: list of mirrors for this mm + * @mmu_notifier: mmu notifier to track updates to CPU page table + * @mirrors_sem: read/write semaphore protecting the mirrors list + * @wq: wait queue for user waiting on a range invalidation + * @notifiers: count of active mmu notifiers + * @dead: is the mm dead ? 
+ */ +struct hmm { + struct mm_struct *mm; + struct kref kref; + struct mutex lock; + struct list_head ranges; + struct list_head mirrors; + struct mmu_notifier mmu_notifier; + struct rw_semaphore mirrors_sem; + wait_queue_head_t wq; + long notifiers; + bool dead; +}; /* * hmm_pfn_flag_e - HMM flag enums @@ -155,6 +181,38 @@ struct hmm_range { bool valid; }; +/* + * hmm_range_wait_until_valid() - wait for range to be valid + * @range: range affected by invalidation to wait on + * @timeout: time out for wait in ms (ie abort wait after that period of time) + * Returns: true if the range is valid, false otherwise. + */ +static inline bool hmm_range_wait_until_valid(struct hmm_range *range, + unsigned long timeout) +{ + /* Check if mm is dead ? */ + if (range->hmm == NULL || range->hmm->dead || range->hmm->mm == NULL) { + range->valid = false; + return false; + } + if (range->valid) + return true; + wait_event_timeout(range->hmm->wq, range->valid || range->hmm->dead, + msecs_to_jiffies(timeout)); + /* Return current valid status just in case we get lucky */ + return range->valid; +} + +/* + * hmm_range_valid() - test if a range is valid or not + * @range: range + * Returns: true if the range is valid, false otherwise. + */ +static inline bool hmm_range_valid(struct hmm_range *range) +{ + return range->valid; +} + /* * hmm_pfn_to_page() - return struct page pointed to by a valid HMM pfn * @range: range use to decode HMM pfn value @@ -357,51 +415,66 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror); /* - * To snapshot the CPU page table, call hmm_vma_get_pfns(), then take a device - * driver lock that serializes device page table updates, then call - * hmm_vma_range_done(), to check if the snapshot is still valid. The same - * device driver page table update lock must also be used in the - * hmm_mirror_ops.sync_cpu_device_pagetables() callback, so that CPU page - * table invalidation serializes on it. - * - * YOU MUST CALL hmm_vma_range_done() ONCE AND ONLY ONCE EACH TIME YOU CALL - * hmm_range_snapshot() WITHOUT ERROR ! - * - * IF YOU DO NOT FOLLOW THE ABOVE RULE THE SNAPSHOT CONTENT MIGHT BE INVALID ! + * Please see Documentation/vm/hmm.rst for how to use the range API. */ +int hmm_range_register(struct hmm_range *range, + struct mm_struct *mm, + unsigned long start, + unsigned long end); +void hmm_range_unregister(struct hmm_range *range); long hmm_range_snapshot(struct hmm_range *range); -bool hmm_vma_range_done(struct hmm_range *range); - +long hmm_range_fault(struct hmm_range *range, bool block); /* - * Fault memory on behalf of device driver. Unlike handle_mm_fault(), this will - * not migrate any device memory back to system memory. The HMM pfn array will - * be updated with the fault result and current snapshot of the CPU page table - * for the range. - * - * The mmap_sem must be taken in read mode before entering and it might be - * dropped by the function if the block argument is false. In that case, the - * function returns -EAGAIN. - * - * Return value does not reflect if the fault was successful for every single - * address or not. Therefore, the caller must to inspect the HMM pfn array to - * determine fault status for each address. - * - * Trying to fault inside an invalid vma will result in -EINVAL. + * HMM_RANGE_DEFAULT_TIMEOUT - default timeout (ms) when waiting for a range * - * See the function description in mm/hmm.c for further documentation. 
+ * When waiting for mmu notifiers we need some kind of time out otherwise we + * could potentialy wait for ever, 1000ms ie 1s sounds like a long time to + * wait already. */ -long hmm_range_fault(struct hmm_range *range, bool block); +#define HMM_RANGE_DEFAULT_TIMEOUT 1000 + +/* This is a temporary helper to avoid merge conflict between trees. */ +static inline bool hmm_vma_range_done(struct hmm_range *range) +{ + bool ret = hmm_range_valid(range); + + hmm_range_unregister(range); + return ret; +} /* This is a temporary helper to avoid merge conflict between trees. */ static inline int hmm_vma_fault(struct hmm_range *range, bool block) { - long ret = hmm_range_fault(range, block); - if (ret == -EBUSY) - ret = -EAGAIN; - else if (ret == -EAGAIN) - ret = -EBUSY; - return ret < 0 ? ret : 0; + long ret; + + ret = hmm_range_register(range, range->vma->vm_mm, + range->start, range->end); + if (ret) + return (int)ret; + + if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) { + /* + * The mmap_sem was taken by driver we release it here and + * returns -EAGAIN which correspond to mmap_sem have been + * drop in the old API. + */ + up_read(&range->vma->vm_mm->mmap_sem); + return -EAGAIN; + } + + ret = hmm_range_fault(range, block); + if (ret <= 0) { + if (ret == -EBUSY || !ret) { + /* Same as above drop mmap_sem to match old API. */ + up_read(&range->vma->vm_mm->mmap_sem); + ret = -EBUSY; + } else if (ret == -EAGAIN) + ret = -EBUSY; + hmm_range_unregister(range); + return ret; + } + return 0; } /* Below are for HMM internal use only! Not to be used by device driver! */ diff --git a/mm/hmm.c b/mm/hmm.c index b7e4034d96e1..3e07f32b94f8 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -38,26 +38,6 @@ #if IS_ENABLED(CONFIG_HMM_MIRROR) static const struct mmu_notifier_ops hmm_mmu_notifier_ops; -/* - * struct hmm - HMM per mm struct - * - * @mm: mm struct this HMM struct is bound to - * @lock: lock protecting ranges list - * @ranges: list of range being snapshotted - * @mirrors: list of mirrors for this mm - * @mmu_notifier: mmu notifier to track updates to CPU page table - * @mirrors_sem: read/write semaphore protecting the mirrors list - */ -struct hmm { - struct mm_struct *mm; - struct kref kref; - spinlock_t lock; - struct list_head ranges; - struct list_head mirrors; - struct mmu_notifier mmu_notifier; - struct rw_semaphore mirrors_sem; -}; - static inline struct hmm *mm_get_hmm(struct mm_struct *mm) { struct hmm *hmm = READ_ONCE(mm->hmm); @@ -91,12 +71,15 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm) hmm = kmalloc(sizeof(*hmm), GFP_KERNEL); if (!hmm) return NULL; + init_waitqueue_head(&hmm->wq); INIT_LIST_HEAD(&hmm->mirrors); init_rwsem(&hmm->mirrors_sem); hmm->mmu_notifier.ops = NULL; INIT_LIST_HEAD(&hmm->ranges); - spin_lock_init(&hmm->lock); + mutex_init(&hmm->lock); kref_init(&hmm->kref); + hmm->notifiers = 0; + hmm->dead = false; hmm->mm = mm; spin_lock(&mm->page_table_lock); @@ -158,6 +141,7 @@ void hmm_mm_destroy(struct mm_struct *mm) mm->hmm = NULL; if (hmm) { hmm->mm = NULL; + hmm->dead = true; spin_unlock(&mm->page_table_lock); hmm_put(hmm); return; @@ -166,43 +150,22 @@ void hmm_mm_destroy(struct mm_struct *mm) spin_unlock(&mm->page_table_lock); } -static int hmm_invalidate_range(struct hmm *hmm, bool device, - const struct hmm_update *update) +static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) { + struct hmm *hmm = mm_get_hmm(mm); struct hmm_mirror *mirror; struct hmm_range *range; - spin_lock(&hmm->lock); - list_for_each_entry(range, 
&hmm->ranges, list) { - if (update->end < range->start || update->start >= range->end) - continue; + /* Report this HMM as dying. */ + hmm->dead = true; + /* Wake-up everyone waiting on any range. */ + mutex_lock(&hmm->lock); + list_for_each_entry(range, &hmm->ranges, list) { range->valid = false; } - spin_unlock(&hmm->lock); - - if (!device) - return 0; - - down_read(&hmm->mirrors_sem); - list_for_each_entry(mirror, &hmm->mirrors, list) { - int ret; - - ret = mirror->ops->sync_cpu_device_pagetables(mirror, update); - if (!update->blockable && ret == -EAGAIN) { - up_read(&hmm->mirrors_sem); - return -EAGAIN; - } - } - up_read(&hmm->mirrors_sem); - - return 0; -} - -static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) -{ - struct hmm_mirror *mirror; - struct hmm *hmm = mm_get_hmm(mm); + wake_up_all(&hmm->wq); + mutex_unlock(&hmm->lock); down_write(&hmm->mirrors_sem); mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror, @@ -228,36 +191,80 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm) } static int hmm_invalidate_range_start(struct mmu_notifier *mn, - const struct mmu_notifier_range *range) + const struct mmu_notifier_range *nrange) { - struct hmm *hmm = mm_get_hmm(range->mm); + struct hmm *hmm = mm_get_hmm(nrange->mm); + struct hmm_mirror *mirror; struct hmm_update update; - int ret; + struct hmm_range *range; + int ret = 0; VM_BUG_ON(!hmm); - update.start = range->start; - update.end = range->end; + update.start = nrange->start; + update.end = nrange->end; update.event = HMM_UPDATE_INVALIDATE; - update.blockable = range->blockable; - ret = hmm_invalidate_range(hmm, true, &update); + update.blockable = nrange->blockable; + + if (nrange->blockable) + mutex_lock(&hmm->lock); + else if (!mutex_trylock(&hmm->lock)) { + ret = -EAGAIN; + goto out; + } + hmm->notifiers++; + list_for_each_entry(range, &hmm->ranges, list) { + if (update.end < range->start || update.start >= range->end) + continue; + + range->valid = false; + } + mutex_unlock(&hmm->lock); + + if (nrange->blockable) + down_read(&hmm->mirrors_sem); + else if (!down_read_trylock(&hmm->mirrors_sem)) { + ret = -EAGAIN; + goto out; + } + list_for_each_entry(mirror, &hmm->mirrors, list) { + int ret; + + ret = mirror->ops->sync_cpu_device_pagetables(mirror, &update); + if (!update.blockable && ret == -EAGAIN) { + up_read(&hmm->mirrors_sem); + ret = -EAGAIN; + goto out; + } + } + up_read(&hmm->mirrors_sem); + +out: hmm_put(hmm); return ret; } static void hmm_invalidate_range_end(struct mmu_notifier *mn, - const struct mmu_notifier_range *range) + const struct mmu_notifier_range *nrange) { - struct hmm *hmm = mm_get_hmm(range->mm); - struct hmm_update update; + struct hmm *hmm = mm_get_hmm(nrange->mm); VM_BUG_ON(!hmm); - update.start = range->start; - update.end = range->end; - update.event = HMM_UPDATE_INVALIDATE; - update.blockable = true; - hmm_invalidate_range(hmm, false, &update); + mutex_lock(&hmm->lock); + hmm->notifiers--; + if (!hmm->notifiers) { + struct hmm_range *range; + + list_for_each_entry(range, &hmm->ranges, list) { + if (range->valid) + continue; + range->valid = true; + } + wake_up_all(&hmm->wq); + } + mutex_unlock(&hmm->lock); + hmm_put(hmm); } @@ -409,7 +416,6 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk, { struct hmm_range *range = hmm_vma_walk->range; - *fault = *write_fault = false; if (!hmm_vma_walk->fault) return; @@ -448,10 +454,11 @@ static void hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk, return; } + 
*fault = *write_fault = false; for (i = 0; i < npages; ++i) { hmm_pte_need_fault(hmm_vma_walk, pfns[i], cpu_flags, fault, write_fault); - if ((*fault) || (*write_fault)) + if ((*write_fault)) return; } } @@ -706,162 +713,155 @@ static void hmm_pfns_special(struct hmm_range *range) } /* - * hmm_range_snapshot() - snapshot CPU page table for a range + * hmm_range_register() - start tracking change to CPU page table over a range * @range: range - * Returns: number of valid pages in range->pfns[] (from range start - * address). This may be zero. If the return value is negative, - * then one of the following values may be returned: + * @mm: the mm struct for the range of virtual address + * @start: start virtual address (inclusive) + * @end: end virtual address (exclusive) + * Returns 0 on success, -EFAULT if the address space is no longer valid * - * -EINVAL invalid arguments or mm or virtual address are in an - * invalid vma (ie either hugetlbfs or device file vma). - * -EPERM For example, asking for write, when the range is - * read-only - * -EAGAIN Caller needs to retry - * -EFAULT Either no valid vma exists for this range, or it is - * illegal to access the range - * - * This snapshots the CPU page table for a range of virtual addresses. Snapshot - * validity is tracked by range struct. See hmm_vma_range_done() for further - * information. + * Track updates to the CPU page table see include/linux/hmm.h */ -long hmm_range_snapshot(struct hmm_range *range) +int hmm_range_register(struct hmm_range *range, + struct mm_struct *mm, + unsigned long start, + unsigned long end) { - struct vm_area_struct *vma = range->vma; - struct hmm_vma_walk hmm_vma_walk; - struct mm_walk mm_walk; - struct hmm *hmm; - + range->start = start & PAGE_MASK; + range->end = end & PAGE_MASK; + range->valid = false; range->hmm = NULL; - /* Sanity check, this really should not happen ! */ - if (range->start < vma->vm_start || range->start >= vma->vm_end) - return -EINVAL; - if (range->end < vma->vm_start || range->end > vma->vm_end) + if (range->start >= range->end) return -EINVAL; - hmm = hmm_get_or_create(vma->vm_mm); - if (!hmm) - return -ENOMEM; + range->start = start; + range->end = end; + + range->hmm = hmm_get_or_create(mm); + if (!range->hmm) + return -EFAULT; /* Check if hmm_mm_destroy() was call. */ - if (hmm->mm == NULL) { - hmm_put(hmm); - return -EINVAL; + if (range->hmm->mm == NULL || range->hmm->dead) { + hmm_put(range->hmm); + return -EFAULT; } - /* FIXME support hugetlb fs */ - if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) || - vma_is_dax(vma)) { - hmm_pfns_special(range); - hmm_put(hmm); - return -EINVAL; - } + /* Initialize range to track CPU page table update */ + mutex_lock(&range->hmm->lock); - if (!(vma->vm_flags & VM_READ)) { - /* - * If vma do not allow read access, then assume that it does - * not allow write access, either. Architecture that allow - * write without read access are not supported by HMM, because - * operations such has atomic access would not work. 
- */ - hmm_pfns_clear(range, range->pfns, range->start, range->end); - hmm_put(hmm); - return -EPERM; - } + list_add_rcu(&range->list, &range->hmm->ranges); - /* Initialize range to track CPU page table update */ - spin_lock(&hmm->lock); - range->valid = true; - list_add_rcu(&range->list, &hmm->ranges); - spin_unlock(&hmm->lock); - - hmm_vma_walk.fault = false; - hmm_vma_walk.range = range; - mm_walk.private = &hmm_vma_walk; - hmm_vma_walk.last = range->start; - - mm_walk.vma = vma; - mm_walk.mm = vma->vm_mm; - mm_walk.pte_entry = NULL; - mm_walk.test_walk = NULL; - mm_walk.hugetlb_entry = NULL; - mm_walk.pmd_entry = hmm_vma_walk_pmd; - mm_walk.pte_hole = hmm_vma_walk_hole; - - walk_page_range(range->start, range->end, &mm_walk); /* - * Transfer hmm reference to the range struct it will be drop inside - * the hmm_vma_range_done() function (which _must_ be call if this - * function return 0). + * If there are any concurrent notifiers we have to wait for them for + * the range to be valid (see hmm_range_wait_until_valid()). */ - range->hmm = hmm; - return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; + if (!range->hmm->notifiers) + range->valid = true; + mutex_unlock(&range->hmm->lock); + + return 0; } -EXPORT_SYMBOL(hmm_range_snapshot); +EXPORT_SYMBOL(hmm_range_register); /* - * hmm_vma_range_done() - stop tracking change to CPU page table over a range - * @range: range being tracked - * Returns: false if range data has been invalidated, true otherwise + * hmm_range_unregister() - stop tracking change to CPU page table over a range + * @range: range * * Range struct is used to track updates to the CPU page table after a call to - * either hmm_vma_get_pfns() or hmm_vma_fault(). Once the device driver is done - * using the data, or wants to lock updates to the data it got from those - * functions, it must call the hmm_vma_range_done() function, which will then - * stop tracking CPU page table updates. - * - * Note that device driver must still implement general CPU page table update - * tracking either by using hmm_mirror (see hmm_mirror_register()) or by using - * the mmu_notifier API directly. - * - * CPU page table update tracking done through hmm_range is only temporary and - * to be used while trying to duplicate CPU page table contents for a range of - * virtual addresses. - * - * There are two ways to use this : - * again: - * hmm_vma_get_pfns(range); or hmm_vma_fault(...); - * trans = device_build_page_table_update_transaction(pfns); - * device_page_table_lock(); - * if (!hmm_vma_range_done(range)) { - * device_page_table_unlock(); - * goto again; - * } - * device_commit_transaction(trans); - * device_page_table_unlock(); - * - * Or: - * hmm_vma_get_pfns(range); or hmm_vma_fault(...); - * device_page_table_lock(); - * hmm_vma_range_done(range); - * device_update_page_table(range->pfns); - * device_page_table_unlock(); + * hmm_range_register(). See include/linux/hmm.h for how to use it. */ -bool hmm_vma_range_done(struct hmm_range *range) +void hmm_range_unregister(struct hmm_range *range) { - bool ret = false; - /* Sanity check this really should not happen. */ - if (range->hmm == NULL || range->end <= range->start) { - BUG(); - return false; - } + if (range->hmm == NULL || range->end <= range->start) + return; - spin_lock(&range->hmm->lock); + mutex_lock(&range->hmm->lock); list_del_rcu(&range->list); - ret = range->valid; - spin_unlock(&range->hmm->lock); + mutex_unlock(&range->hmm->lock); - /* Is the mm still alive ? 
*/ - if (range->hmm->mm == NULL) - ret = false; - - /* Drop reference taken by hmm_vma_fault() or hmm_vma_get_pfns() */ + /* Drop reference taken by hmm_range_register() */ + range->valid = false; hmm_put(range->hmm); range->hmm = NULL; - return ret; } -EXPORT_SYMBOL(hmm_vma_range_done); +EXPORT_SYMBOL(hmm_range_unregister); + +/* + * hmm_range_snapshot() - snapshot CPU page table for a range + * @range: range + * Returns: -EINVAL if invalid argument, -ENOMEM out of memory, -EPERM invalid + * permission (for instance asking for write and range is read only), + * -EAGAIN if you need to retry, -EFAULT invalid (ie either no valid + * vma or it is illegal to access that range), number of valid pages + * in range->pfns[] (from range start address). + * + * This snapshots the CPU page table for a range of virtual addresses. Snapshot + * validity is tracked by range struct. See in include/linux/hmm.h for example + * on how to use. + */ +long hmm_range_snapshot(struct hmm_range *range) +{ + unsigned long start = range->start, end; + struct hmm_vma_walk hmm_vma_walk; + struct hmm *hmm = range->hmm; + struct vm_area_struct *vma; + struct mm_walk mm_walk; + + /* Check if hmm_mm_destroy() was call. */ + if (hmm->mm == NULL || hmm->dead) + return -EFAULT; + + do { + /* If range is no longer valid force retry. */ + if (!range->valid) + return -EAGAIN; + + vma = find_vma(hmm->mm, start); + if (vma == NULL || (vma->vm_flags & VM_SPECIAL)) + return -EFAULT; + + /* FIXME support hugetlb fs/dax */ + if (is_vm_hugetlb_page(vma) || vma_is_dax(vma)) { + hmm_pfns_special(range); + return -EINVAL; + } + + if (!(vma->vm_flags & VM_READ)) { + /* + * If vma do not allow read access, then assume that it + * does not allow write access, either. HMM does not + * support architecture that allow write without read. + */ + hmm_pfns_clear(range, range->pfns, + range->start, range->end); + return -EPERM; + } + + range->vma = vma; + hmm_vma_walk.last = start; + hmm_vma_walk.fault = false; + hmm_vma_walk.range = range; + mm_walk.private = &hmm_vma_walk; + end = min(range->end, vma->vm_end); + + mm_walk.vma = vma; + mm_walk.mm = vma->vm_mm; + mm_walk.pte_entry = NULL; + mm_walk.test_walk = NULL; + mm_walk.hugetlb_entry = NULL; + mm_walk.pmd_entry = hmm_vma_walk_pmd; + mm_walk.pte_hole = hmm_vma_walk_hole; + + walk_page_range(start, end, &mm_walk); + start = end; + } while (start < range->end); + + return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; +} +EXPORT_SYMBOL(hmm_range_snapshot); /* * hmm_range_fault() - try to fault some address in a virtual address range @@ -893,96 +893,79 @@ EXPORT_SYMBOL(hmm_vma_range_done); */ long hmm_range_fault(struct hmm_range *range, bool block) { - struct vm_area_struct *vma = range->vma; - unsigned long start = range->start; + unsigned long start = range->start, end; struct hmm_vma_walk hmm_vma_walk; + struct hmm *hmm = range->hmm; + struct vm_area_struct *vma; struct mm_walk mm_walk; - struct hmm *hmm; int ret; - range->hmm = NULL; - - /* Sanity check, this really should not happen ! */ - if (range->start < vma->vm_start || range->start >= vma->vm_end) - return -EINVAL; - if (range->end < vma->vm_start || range->end > vma->vm_end) - return -EINVAL; + /* Check if hmm_mm_destroy() was call. */ + if (hmm->mm == NULL || hmm->dead) + return -EFAULT; - hmm = hmm_get_or_create(vma->vm_mm); - if (!hmm) { - hmm_pfns_clear(range, range->pfns, range->start, range->end); - return -ENOMEM; - } + do { + /* If range is no longer valid force retry. 
*/ + if (!range->valid) { + up_read(&hmm->mm->mmap_sem); + return -EAGAIN; + } - /* Check if hmm_mm_destroy() was call. */ - if (hmm->mm == NULL) { - hmm_put(hmm); - return -EINVAL; - } + vma = find_vma(hmm->mm, start); + if (vma == NULL || (vma->vm_flags & VM_SPECIAL)) + return -EFAULT; - /* FIXME support hugetlb fs */ - if (is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_SPECIAL) || - vma_is_dax(vma)) { - hmm_pfns_special(range); - hmm_put(hmm); - return -EINVAL; - } + /* FIXME support hugetlb fs/dax */ + if (is_vm_hugetlb_page(vma) || vma_is_dax(vma)) { + hmm_pfns_special(range); + return -EINVAL; + } - if (!(vma->vm_flags & VM_READ)) { - /* - * If vma do not allow read access, then assume that it does - * not allow write access, either. Architecture that allow - * write without read access are not supported by HMM, because - * operations such has atomic access would not work. - */ - hmm_pfns_clear(range, range->pfns, range->start, range->end); - hmm_put(hmm); - return -EPERM; - } + if (!(vma->vm_flags & VM_READ)) { + /* + * If vma do not allow read access, then assume that it + * does not allow write access, either. HMM does not + * support architecture that allow write without read. + */ + hmm_pfns_clear(range, range->pfns, + range->start, range->end); + return -EPERM; + } - /* Initialize range to track CPU page table update */ - spin_lock(&hmm->lock); - range->valid = true; - list_add_rcu(&range->list, &hmm->ranges); - spin_unlock(&hmm->lock); - - hmm_vma_walk.fault = true; - hmm_vma_walk.block = block; - hmm_vma_walk.range = range; - mm_walk.private = &hmm_vma_walk; - hmm_vma_walk.last = range->start; - - mm_walk.vma = vma; - mm_walk.mm = vma->vm_mm; - mm_walk.pte_entry = NULL; - mm_walk.test_walk = NULL; - mm_walk.hugetlb_entry = NULL; - mm_walk.pmd_entry = hmm_vma_walk_pmd; - mm_walk.pte_hole = hmm_vma_walk_hole; + range->vma = vma; + hmm_vma_walk.last = start; + hmm_vma_walk.fault = true; + hmm_vma_walk.block = block; + hmm_vma_walk.range = range; + mm_walk.private = &hmm_vma_walk; + end = min(range->end, vma->vm_end); + + mm_walk.vma = vma; + mm_walk.mm = vma->vm_mm; + mm_walk.pte_entry = NULL; + mm_walk.test_walk = NULL; + mm_walk.hugetlb_entry = NULL; + mm_walk.pmd_entry = hmm_vma_walk_pmd; + mm_walk.pte_hole = hmm_vma_walk_hole; + + do { + ret = walk_page_range(start, end, &mm_walk); + start = hmm_vma_walk.last; + + /* Keep trying while the range is valid. */ + } while (ret == -EBUSY && range->valid); + + if (ret) { + unsigned long i; + + i = (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; + hmm_pfns_clear(range, &range->pfns[i], + hmm_vma_walk.last, range->end); + return ret; + } + start = end; - do { - ret = walk_page_range(start, range->end, &mm_walk); - start = hmm_vma_walk.last; - /* Keep trying while the range is valid. */ - } while (ret == -EBUSY && range->valid); - - if (ret) { - unsigned long i; - - i = (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; - hmm_pfns_clear(range, &range->pfns[i], hmm_vma_walk.last, - range->end); - hmm_vma_range_done(range); - hmm_put(hmm); - return ret; - } else { - /* - * Transfer hmm reference to the range struct it will be drop - * inside the hmm_vma_range_done() function (which _must_ be - * call if this function return 0). 
- */ - range->hmm = hmm; - } + } while (start < range->end); return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; }
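[Illustrative sketch, not part of the patch: a driver-side view of the register/wait/snapshot flow introduced by this patch, using the hmm_range_register() signature added in include/linux/hmm.h. dummy_update_lock()/dummy_update_unlock() stand in for the driver lock that is also taken in the sync_cpu_device_pagetables() callback.]

static long dummy_snapshot_range(struct mm_struct *mm, struct hmm_range *range)
{
	long ret;

	ret = hmm_range_register(range, mm, range->start, range->end);
	if (ret)
		return ret;

	/*
	 * The wait result can be ignored here: if it times out the snapshot
	 * below returns -EAGAIN and we wait and retry.
	 */
	hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT);

again:
	down_read(&mm->mmap_sem);
	ret = hmm_range_snapshot(range);
	if (ret < 0) {
		up_read(&mm->mmap_sem);
		if (ret == -EAGAIN) {
			hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT);
			goto again;
		}
		hmm_range_unregister(range);
		return ret;
	}

	dummy_update_lock();	/* same lock as the update callback takes */
	if (!hmm_range_valid(range)) {
		dummy_update_unlock();
		up_read(&mm->mmap_sem);
		goto again;
	}
	/* range->pfns[] is stable here: commit it to the device page table. */
	dummy_update_unlock();
	up_read(&mm->mmap_sem);
	hmm_range_unregister(range);
	return 0;
}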
From patchwork Wed Apr 3 19:33:13 2019
X-Patchwork-Submitter: Jerome Glisse
X-Patchwork-Id: 10884401
From: jglisse@redhat.com
To: linux-mm@kvack.org, Andrew Morton
Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, John Hubbard, Dan Williams
Subject: [PATCH v3 07/12] mm/hmm: add default fault flags to avoid the need to pre-fill pfns arrays v2
Date: Wed, 3 Apr 2019 15:33:13 -0400
Message-Id: <20190403193318.16478-8-jglisse@redhat.com>
In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com>
References: <20190403193318.16478-1-jglisse@redhat.com>

From: Jérôme Glisse

The HMM mirror API can be used in two fashions. In the first, the HMM user coalesces multiple page faults into one request and sets flags per pfn for those faults. In the second, the HMM user wants to pre-fault a range with specific flags. For the latter it is wasteful to make the user pre-fill the pfn array with a default flags value.
This patch adds a default flags value allowing user to set them for a range without having to pre-fill the pfn array. Changes since v1: - Added documentation. - Added comments in the old API wrapper to explain what is going on. Signed-off-by: Jérôme Glisse Reviewed-by: Ralph Campbell Cc: Andrew Morton Cc: John Hubbard Cc: Dan Williams --- Documentation/vm/hmm.rst | 35 +++++++++++++++++++++++++++++++++++ include/linux/hmm.h | 13 +++++++++++++ mm/hmm.c | 12 ++++++++++++ 3 files changed, 60 insertions(+) diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst index 945d5fb6d14a..ec1efa32af3c 100644 --- a/Documentation/vm/hmm.rst +++ b/Documentation/vm/hmm.rst @@ -276,6 +276,41 @@ report commands as executed is serialized (there is no point in doing this concurrently). +Leverage default_flags and pfn_flags_mask +========================================= + +The hmm_range struct has 2 fields default_flags and pfn_flags_mask that allows +to set fault or snapshot policy for a whole range instead of having to set them +for each entries in the range. + +For instance if the device flags for device entries are: + VALID (1 << 63) + WRITE (1 << 62) + +Now let say that device driver wants to fault with at least read a range then +it does set: + range->default_flags = (1 << 63) + range->pfn_flags_mask = 0; + +and calls hmm_range_fault() as described above. This will fill fault all page +in the range with at least read permission. + +Now let say driver wants to do the same except for one page in the range for +which its want to have write. Now driver set: + range->default_flags = (1 << 63); + range->pfn_flags_mask = (1 << 62); + range->pfns[index_of_write] = (1 << 62); + +With this HMM will fault in all page with at least read (ie valid) and for the +address == range->start + (index_of_write << PAGE_SHIFT) it will fault with +write permission ie if the CPU pte does not have write permission set then HMM +will call handle_mm_fault(). + +Note that HMM will populate the pfns array with write permission for any entry +that have write permission within the CPU pte no matter what are the values set +in default_flags or pfn_flags_mask. + + Represent and manage device memory from core kernel point of view ================================================================= diff --git a/include/linux/hmm.h b/include/linux/hmm.h index ec4bfa91648f..dee2f8953b2e 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -165,6 +165,8 @@ enum hmm_pfn_value_e { * @pfns: array of pfns (big enough for the range) * @flags: pfn flags to match device driver page table * @values: pfn value for some special case (none, special, error, ...) + * @default_flags: default flags for the range (write, read, ... see hmm doc) + * @pfn_flags_mask: allows to mask pfn flags so that only default_flags matter * @pfn_shifts: pfn shift value (should be <= PAGE_SHIFT) * @valid: pfns array did not change since it has been fill by an HMM function */ @@ -177,6 +179,8 @@ struct hmm_range { uint64_t *pfns; const uint64_t *flags; const uint64_t *values; + uint64_t default_flags; + uint64_t pfn_flags_mask; uint8_t pfn_shift; bool valid; }; @@ -448,6 +452,15 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block) { long ret; + /* + * With the old API the driver must set each individual entries with + * the requested flags (valid, write, ...). So here we set the mask to + * keep intact the entries provided by the driver and zero out the + * default_flags. 
+ */ + range->default_flags = 0; + range->pfn_flags_mask = -1UL; + ret = hmm_range_register(range, range->vma->vm_mm, range->start, range->end); if (ret) diff --git a/mm/hmm.c b/mm/hmm.c index 3e07f32b94f8..0e21d3594ab6 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -419,6 +419,18 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk, if (!hmm_vma_walk->fault) return; + /* + * So we not only consider the individual per page request we also + * consider the default flags requested for the range. The API can + * be use in 2 fashions. The first one where the HMM user coalesce + * multiple page fault into one request and set flags per pfns for + * of those faults. The second one where the HMM user want to pre- + * fault a range with specific flags. For the latter one it is a + * waste to have the user pre-fill the pfn arrays with a default + * flags value. + */ + pfns = (pfns & range->pfn_flags_mask) | range->default_flags; + /* We aren't ask to do anything ... */ + if (!(pfns & range->flags[HMM_PFN_VALID])) return;
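[Illustrative sketch, not part of the patch: the default_flags / pfn_flags_mask usage described in the hmm.rst hunk above. The DUMMY_PFN_* values mirror the documentation's example bit layout and are assumptions, not real HMM flag definitions.]

#define DUMMY_PFN_VALID	(1ULL << 63)
#define DUMMY_PFN_WRITE	(1ULL << 62)

static void dummy_setup_fault_flags(struct hmm_range *range,
				    unsigned long index_of_write)
{
	/*
	 * Fault every page of the range with at least read permission.
	 * With pfn_flags_mask set to 0 the per-entry pfns[] values would
	 * be ignored completely and only default_flags would matter.
	 */
	range->default_flags = DUMMY_PFN_VALID;

	/*
	 * Let the per-entry WRITE bit through the mask and request write
	 * permission for one specific page; every other page stays
	 * read-only as per default_flags.
	 */
	range->pfn_flags_mask = DUMMY_PFN_WRITE;
	range->pfns[index_of_write] = DUMMY_PFN_WRITE;
}

A subsequent hmm_range_fault() then faults the whole range with read permission and that single page with write permission, matching the pfns = (pfns & pfn_flags_mask) | default_flags computation added above.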
From patchwork Wed Apr 3 19:33:14 2019
X-Patchwork-Submitter: Jerome Glisse
X-Patchwork-Id: 10884403
From: jglisse@redhat.com
To: linux-mm@kvack.org, Andrew Morton
Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, John Hubbard, Dan Williams, Arnd Bergmann
Subject: [PATCH v3 08/12] mm/hmm: mirror hugetlbfs (snapshoting, faulting and DMA mapping) v3
Date: Wed, 3 Apr 2019 15:33:14 -0400
Message-Id: <20190403193318.16478-9-jglisse@redhat.com>
In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com>
References: <20190403193318.16478-1-jglisse@redhat.com>

From: Jérôme Glisse

HMM mirror is a device driver helper to mirror a range of virtual addresses. It means that the process jobs running on the device can access the same virtual addresses as the CPU threads of that process. This patch adds support for hugetlbfs mappings (ie ranges of virtual addresses that are mmaps of a hugetlbfs file).

Changes since v2:
- Use hmm_range_page_size() where we can.
Changes since v1: - improved commit message - squashed: Arnd Bergmann: fix unused variable warnings Signed-off-by: Jérôme Glisse Reviewed-by: Ralph Campbell Reviewed-by: Ira Weiny Cc: Andrew Morton Cc: John Hubbard Cc: Dan Williams Cc: Arnd Bergmann --- include/linux/hmm.h | 27 +++++++++- mm/hmm.c | 123 +++++++++++++++++++++++++++++++++++++++----- 2 files changed, 134 insertions(+), 16 deletions(-) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index dee2f8953b2e..e5834082de60 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -181,10 +181,31 @@ struct hmm_range { const uint64_t *values; uint64_t default_flags; uint64_t pfn_flags_mask; + uint8_t page_shift; uint8_t pfn_shift; bool valid; }; +/* + * hmm_range_page_shift() - return the page shift for the range + * @range: range being queried + * Returns: page shift (page size = 1 << page shift) for the range + */ +static inline unsigned hmm_range_page_shift(const struct hmm_range *range) +{ + return range->page_shift; +} + +/* + * hmm_range_page_size() - return the page size for the range + * @range: range being queried + * Returns: page size for the range in bytes + */ +static inline unsigned long hmm_range_page_size(const struct hmm_range *range) +{ + return 1UL << hmm_range_page_shift(range); +} + /* * hmm_range_wait_until_valid() - wait for range to be valid * @range: range affected by invalidation to wait on @@ -424,7 +445,8 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror); int hmm_range_register(struct hmm_range *range, struct mm_struct *mm, unsigned long start, - unsigned long end); + unsigned long end, + unsigned page_shift); void hmm_range_unregister(struct hmm_range *range); long hmm_range_snapshot(struct hmm_range *range); long hmm_range_fault(struct hmm_range *range, bool block); @@ -462,7 +484,8 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block) range->pfn_flags_mask = -1UL; ret = hmm_range_register(range, range->vma->vm_mm, - range->start, range->end); + range->start, range->end, + PAGE_SHIFT); if (ret) return (int)ret; diff --git a/mm/hmm.c b/mm/hmm.c index 0e21d3594ab6..9140cee24d36 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -391,11 +391,13 @@ static int hmm_vma_walk_hole_(unsigned long addr, unsigned long end, struct hmm_vma_walk *hmm_vma_walk = walk->private; struct hmm_range *range = hmm_vma_walk->range; uint64_t *pfns = range->pfns; - unsigned long i; + unsigned long i, page_size; hmm_vma_walk->last = addr; - i = (addr - range->start) >> PAGE_SHIFT; - for (; addr < end; addr += PAGE_SIZE, i++) { + page_size = hmm_range_page_size(range); + i = (addr - range->start) >> range->page_shift; + + for (; addr < end; addr += page_size, i++) { pfns[i] = range->values[HMM_PFN_NONE]; if (fault || write_fault) { int ret; @@ -707,6 +709,69 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, return 0; } +static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, + unsigned long start, unsigned long end, + struct mm_walk *walk) +{ +#ifdef CONFIG_HUGETLB_PAGE + unsigned long addr = start, i, pfn, mask, size, pfn_inc; + struct hmm_vma_walk *hmm_vma_walk = walk->private; + struct hmm_range *range = hmm_vma_walk->range; + struct vm_area_struct *vma = walk->vma; + struct hstate *h = hstate_vma(vma); + uint64_t orig_pfn, cpu_flags; + bool fault, write_fault; + spinlock_t *ptl; + pte_t entry; + int ret = 0; + + size = 1UL << huge_page_shift(h); + mask = size - 1; + if (range->page_shift != PAGE_SHIFT) { + /* Make sure we are looking at full page. 
*/ + if (start & mask) + return -EINVAL; + if (end < (start + size)) + return -EINVAL; + pfn_inc = size >> PAGE_SHIFT; + } else { + pfn_inc = 1; + size = PAGE_SIZE; + } + + + ptl = huge_pte_lock(hstate_vma(walk->vma), walk->mm, pte); + entry = huge_ptep_get(pte); + + i = (start - range->start) >> range->page_shift; + orig_pfn = range->pfns[i]; + range->pfns[i] = range->values[HMM_PFN_NONE]; + cpu_flags = pte_to_hmm_pfn_flags(range, entry); + fault = write_fault = false; + hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags, + &fault, &write_fault); + if (fault || write_fault) { + ret = -ENOENT; + goto unlock; + } + + pfn = pte_pfn(entry) + (start & mask); + for (; addr < end; addr += size, i++, pfn += pfn_inc) + range->pfns[i] = hmm_pfn_from_pfn(range, pfn) | cpu_flags; + hmm_vma_walk->last = end; + +unlock: + spin_unlock(ptl); + + if (ret == -ENOENT) + return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk); + + return ret; +#else /* CONFIG_HUGETLB_PAGE */ + return -EINVAL; +#endif +} + static void hmm_pfns_clear(struct hmm_range *range, uint64_t *pfns, unsigned long addr, @@ -730,6 +795,7 @@ static void hmm_pfns_special(struct hmm_range *range) * @mm: the mm struct for the range of virtual address * @start: start virtual address (inclusive) * @end: end virtual address (exclusive) + * @page_shift: expect page shift for the range * Returns 0 on success, -EFAULT if the address space is no longer valid * * Track updates to the CPU page table see include/linux/hmm.h @@ -737,16 +803,20 @@ static void hmm_pfns_special(struct hmm_range *range) int hmm_range_register(struct hmm_range *range, struct mm_struct *mm, unsigned long start, - unsigned long end) + unsigned long end, + unsigned page_shift) { - range->start = start & PAGE_MASK; - range->end = end & PAGE_MASK; + unsigned long mask = ((1UL << page_shift) - 1UL); + range->valid = false; range->hmm = NULL; - if (range->start >= range->end) + if ((start & mask) || (end & mask)) + return -EINVAL; + if (start >= end) return -EINVAL; + range->page_shift = page_shift; range->start = start; range->end = end; @@ -816,6 +886,7 @@ EXPORT_SYMBOL(hmm_range_unregister); */ long hmm_range_snapshot(struct hmm_range *range) { + const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP; unsigned long start = range->start, end; struct hmm_vma_walk hmm_vma_walk; struct hmm *hmm = range->hmm; @@ -832,15 +903,26 @@ long hmm_range_snapshot(struct hmm_range *range) return -EAGAIN; vma = find_vma(hmm->mm, start); - if (vma == NULL || (vma->vm_flags & VM_SPECIAL)) + if (vma == NULL || (vma->vm_flags & device_vma)) return -EFAULT; - /* FIXME support hugetlb fs/dax */ - if (is_vm_hugetlb_page(vma) || vma_is_dax(vma)) { + /* FIXME support dax */ + if (vma_is_dax(vma)) { hmm_pfns_special(range); return -EINVAL; } + if (is_vm_hugetlb_page(vma)) { + struct hstate *h = hstate_vma(vma); + + if (huge_page_shift(h) != range->page_shift && + range->page_shift != PAGE_SHIFT) + return -EINVAL; + } else { + if (range->page_shift != PAGE_SHIFT) + return -EINVAL; + } + if (!(vma->vm_flags & VM_READ)) { /* * If vma do not allow read access, then assume that it @@ -866,6 +948,7 @@ long hmm_range_snapshot(struct hmm_range *range) mm_walk.hugetlb_entry = NULL; mm_walk.pmd_entry = hmm_vma_walk_pmd; mm_walk.pte_hole = hmm_vma_walk_hole; + mm_walk.hugetlb_entry = hmm_vma_walk_hugetlb_entry; walk_page_range(start, end, &mm_walk); start = end; @@ -884,7 +967,7 @@ EXPORT_SYMBOL(hmm_range_snapshot); * then one of the following values may be returned: * * -EINVAL invalid 
arguments or mm or virtual address are in an - * invalid vma (ie either hugetlbfs or device file vma). + * invalid vma (for instance device file vma). * -ENOMEM: Out of memory. * -EPERM: Invalid permission (for instance asking for write and * range is read only). @@ -905,6 +988,7 @@ EXPORT_SYMBOL(hmm_range_snapshot); */ long hmm_range_fault(struct hmm_range *range, bool block) { + const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP; unsigned long start = range->start, end; struct hmm_vma_walk hmm_vma_walk; struct hmm *hmm = range->hmm; @@ -924,15 +1008,25 @@ long hmm_range_fault(struct hmm_range *range, bool block) } vma = find_vma(hmm->mm, start); - if (vma == NULL || (vma->vm_flags & VM_SPECIAL)) + if (vma == NULL || (vma->vm_flags & device_vma)) return -EFAULT; - /* FIXME support hugetlb fs/dax */ - if (is_vm_hugetlb_page(vma) || vma_is_dax(vma)) { + /* FIXME support dax */ + if (vma_is_dax(vma)) { hmm_pfns_special(range); return -EINVAL; } + if (is_vm_hugetlb_page(vma)) { + if (huge_page_shift(hstate_vma(vma)) != + range->page_shift && + range->page_shift != PAGE_SHIFT) + return -EINVAL; + } else { + if (range->page_shift != PAGE_SHIFT) + return -EINVAL; + } + if (!(vma->vm_flags & VM_READ)) { /* * If vma do not allow read access, then assume that it @@ -959,6 +1053,7 @@ long hmm_range_fault(struct hmm_range *range, bool block) mm_walk.hugetlb_entry = NULL; mm_walk.pmd_entry = hmm_vma_walk_pmd; mm_walk.pte_hole = hmm_vma_walk_hole; + mm_walk.hugetlb_entry = hmm_vma_walk_hugetlb_entry; do { ret = walk_page_range(start, end, &mm_walk); From patchwork Wed Apr 3 19:33:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerome Glisse X-Patchwork-Id: 10884405 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0B82F1708 for ; Wed, 3 Apr 2019 19:34:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E6446289C9 for ; Wed, 3 Apr 2019 19:33:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DA688289E2; Wed, 3 Apr 2019 19:33:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0C1D6289C9 for ; Wed, 3 Apr 2019 19:33:59 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DFCE96B0272; Wed, 3 Apr 2019 15:33:42 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id DAFD56B0274; Wed, 3 Apr 2019 15:33:42 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BFF426B0275; Wed, 3 Apr 2019 15:33:42 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qt1-f197.google.com (mail-qt1-f197.google.com [209.85.160.197]) by kanga.kvack.org (Postfix) with ESMTP id 977486B0272 for ; Wed, 3 Apr 2019 15:33:42 -0400 (EDT) Received: by mail-qt1-f197.google.com with SMTP id k13so93072qtc.23 for ; Wed, 03 Apr 2019 12:33:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
From: jglisse@redhat.com To: linux-mm@kvack.org, Andrew Morton Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, Dan Williams, John Hubbard, Arnd Bergmann Subject: [PATCH v3 09/12] mm/hmm: allow to mirror vma of a file on a DAX backed filesystem v3 Date: Wed, 3 Apr 2019 15:33:15 -0400 Message-Id: <20190403193318.16478-10-jglisse@redhat.com> In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com> References: <20190403193318.16478-1-jglisse@redhat.com>
From: Jérôme Glisse HMM mirror is a device driver helper for mirroring a range of virtual addresses: process jobs running on the device can access the same virtual addresses as the CPU threads of that process. This patch adds support for mirroring mappings of files that live on a DAX block device (ie ranges of virtual addresses that are an mmap of a file in a filesystem on a DAX block device). There is no reason not to support that case when mirroring virtual addresses onto a device. Note that unlike the GUP code we do not take a page reference, hence when we back off we have nothing to undo. Changes since v2: - Added comments about get_dev_pagemap() optimization. Changes since v1: - improved commit message - squashed: Arnd Bergmann: fix unused variable warning in hmm_vma_walk_pud Signed-off-by: Jérôme Glisse Reviewed-by: Ralph Campbell Cc: Andrew Morton Cc: Dan Williams Cc: John Hubbard Cc: Arnd Bergmann --- mm/hmm.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 117 insertions(+), 21 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index 9140cee24d36..39bc77d7e6e3 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -329,6 +329,7 @@ EXPORT_SYMBOL(hmm_mirror_unregister); struct hmm_vma_walk { struct hmm_range *range; + struct dev_pagemap *pgmap; unsigned long last; bool fault; bool block; @@ -503,6 +504,15 @@ static inline uint64_t pmd_to_hmm_pfn_flags(struct hmm_range *range, pmd_t pmd) range->flags[HMM_PFN_VALID]; } +static inline uint64_t pud_to_hmm_pfn_flags(struct hmm_range *range, pud_t pud) +{ + if (!pud_present(pud)) + return 0; + return pud_write(pud) ?
range->flags[HMM_PFN_VALID] | + range->flags[HMM_PFN_WRITE] : + range->flags[HMM_PFN_VALID]; +} + static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr, unsigned long end, @@ -524,8 +534,19 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk); pfn = pmd_pfn(pmd) + pte_index(addr); - for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) + for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) { + if (pmd_devmap(pmd)) { + hmm_vma_walk->pgmap = get_dev_pagemap(pfn, + hmm_vma_walk->pgmap); + if (unlikely(!hmm_vma_walk->pgmap)) + return -EBUSY; + } pfns[i] = hmm_pfn_from_pfn(range, pfn) | cpu_flags; + } + if (hmm_vma_walk->pgmap) { + put_dev_pagemap(hmm_vma_walk->pgmap); + hmm_vma_walk->pgmap = NULL; + } hmm_vma_walk->last = end; return 0; } @@ -612,10 +633,24 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, if (fault || write_fault) goto fault; + if (pte_devmap(pte)) { + hmm_vma_walk->pgmap = get_dev_pagemap(pte_pfn(pte), + hmm_vma_walk->pgmap); + if (unlikely(!hmm_vma_walk->pgmap)) + return -EBUSY; + } else if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pte_special(pte)) { + *pfn = range->values[HMM_PFN_SPECIAL]; + return -EFAULT; + } + *pfn = hmm_pfn_from_pfn(range, pte_pfn(pte)) | cpu_flags; return 0; fault: + if (hmm_vma_walk->pgmap) { + put_dev_pagemap(hmm_vma_walk->pgmap); + hmm_vma_walk->pgmap = NULL; + } pte_unmap(ptep); /* Fault any virtual address we were asked to fault */ return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk); @@ -703,12 +738,89 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, return r; } } + if (hmm_vma_walk->pgmap) { + /* + * We do put_dev_pagemap() here and not in hmm_vma_handle_pte() + * so that we can leverage get_dev_pagemap() optimization which + * will not re-take a reference on a pgmap if we already have + * one. 
+ */ + put_dev_pagemap(hmm_vma_walk->pgmap); + hmm_vma_walk->pgmap = NULL; + } pte_unmap(ptep - 1); hmm_vma_walk->last = addr; return 0; } +static int hmm_vma_walk_pud(pud_t *pudp, + unsigned long start, + unsigned long end, + struct mm_walk *walk) +{ + struct hmm_vma_walk *hmm_vma_walk = walk->private; + struct hmm_range *range = hmm_vma_walk->range; + unsigned long addr = start, next; + pmd_t *pmdp; + pud_t pud; + int ret; + +again: + pud = READ_ONCE(*pudp); + if (pud_none(pud)) + return hmm_vma_walk_hole(start, end, walk); + + if (pud_huge(pud) && pud_devmap(pud)) { + unsigned long i, npages, pfn; + uint64_t *pfns, cpu_flags; + bool fault, write_fault; + + if (!pud_present(pud)) + return hmm_vma_walk_hole(start, end, walk); + + i = (addr - range->start) >> PAGE_SHIFT; + npages = (end - addr) >> PAGE_SHIFT; + pfns = &range->pfns[i]; + + cpu_flags = pud_to_hmm_pfn_flags(range, pud); + hmm_range_need_fault(hmm_vma_walk, pfns, npages, + cpu_flags, &fault, &write_fault); + if (fault || write_fault) + return hmm_vma_walk_hole_(addr, end, fault, + write_fault, walk); + + pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT); + for (i = 0; i < npages; ++i, ++pfn) { + hmm_vma_walk->pgmap = get_dev_pagemap(pfn, + hmm_vma_walk->pgmap); + if (unlikely(!hmm_vma_walk->pgmap)) + return -EBUSY; + pfns[i] = hmm_pfn_from_pfn(range, pfn) | cpu_flags; + } + if (hmm_vma_walk->pgmap) { + put_dev_pagemap(hmm_vma_walk->pgmap); + hmm_vma_walk->pgmap = NULL; + } + hmm_vma_walk->last = end; + return 0; + } + + split_huge_pud(walk->vma, pudp, addr); + if (pud_none(*pudp)) + goto again; + + pmdp = pmd_offset(pudp, addr); + do { + next = pmd_addr_end(addr, end); + ret = hmm_vma_walk_pmd(pmdp, addr, next, walk); + if (ret) + return ret; + } while (pmdp++, addr = next, addr != end); + + return 0; +} + static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, unsigned long start, unsigned long end, struct mm_walk *walk) @@ -781,14 +893,6 @@ static void hmm_pfns_clear(struct hmm_range *range, *pfns = range->values[HMM_PFN_NONE]; } -static void hmm_pfns_special(struct hmm_range *range) -{ - unsigned long addr = range->start, i = 0; - - for (; addr < range->end; addr += PAGE_SIZE, i++) - range->pfns[i] = range->values[HMM_PFN_SPECIAL]; -} - /* * hmm_range_register() - start tracking change to CPU page table over a range * @range: range @@ -906,12 +1010,6 @@ long hmm_range_snapshot(struct hmm_range *range) if (vma == NULL || (vma->vm_flags & device_vma)) return -EFAULT; - /* FIXME support dax */ - if (vma_is_dax(vma)) { - hmm_pfns_special(range); - return -EINVAL; - } - if (is_vm_hugetlb_page(vma)) { struct hstate *h = hstate_vma(vma); @@ -935,6 +1033,7 @@ long hmm_range_snapshot(struct hmm_range *range) } range->vma = vma; + hmm_vma_walk.pgmap = NULL; hmm_vma_walk.last = start; hmm_vma_walk.fault = false; hmm_vma_walk.range = range; @@ -946,6 +1045,7 @@ long hmm_range_snapshot(struct hmm_range *range) mm_walk.pte_entry = NULL; mm_walk.test_walk = NULL; mm_walk.hugetlb_entry = NULL; + mm_walk.pud_entry = hmm_vma_walk_pud; mm_walk.pmd_entry = hmm_vma_walk_pmd; mm_walk.pte_hole = hmm_vma_walk_hole; mm_walk.hugetlb_entry = hmm_vma_walk_hugetlb_entry; @@ -1011,12 +1111,6 @@ long hmm_range_fault(struct hmm_range *range, bool block) if (vma == NULL || (vma->vm_flags & device_vma)) return -EFAULT; - /* FIXME support dax */ - if (vma_is_dax(vma)) { - hmm_pfns_special(range); - return -EINVAL; - } - if (is_vm_hugetlb_page(vma)) { if (huge_page_shift(hstate_vma(vma)) != range->page_shift && @@ -1039,6 +1133,7 @@ long 
hmm_range_fault(struct hmm_range *range, bool block) } range->vma = vma; + hmm_vma_walk.pgmap = NULL; hmm_vma_walk.last = start; hmm_vma_walk.fault = true; hmm_vma_walk.block = block; @@ -1051,6 +1146,7 @@ long hmm_range_fault(struct hmm_range *range, bool block) mm_walk.pte_entry = NULL; mm_walk.test_walk = NULL; mm_walk.hugetlb_entry = NULL; + mm_walk.pud_entry = hmm_vma_walk_pud; mm_walk.pmd_entry = hmm_vma_walk_pmd; mm_walk.pte_hole = hmm_vma_walk_hole; mm_walk.hugetlb_entry = hmm_vma_walk_hugetlb_entry;
From patchwork Wed Apr 3 19:33:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerome Glisse X-Patchwork-Id: 10884407
From: jglisse@redhat.com To: linux-mm@kvack.org, Andrew Morton Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, Ralph Campbell, John Hubbard, Dan Williams, Ira Weiny Subject: [PATCH v3 10/12] mm/hmm: add helpers to test if mm is still alive or not Date: Wed, 3 Apr 2019 15:33:16 -0400 Message-Id: <20190403193318.16478-11-jglisse@redhat.com> In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com> References: <20190403193318.16478-1-jglisse@redhat.com>
From: Jérôme Glisse A device driver can have a kernel thread or worker doing work against a process mm, and it is useful for those to test whether the mm is dead or alive so they can avoid doing useless work. Add a helper to test that, so a driver can bail out early if a process is dying. Note that the helper does not perform any lock synchronization and thus is just a hint, ie a process might be dying but the helper might still report it as alive. All HMM functions are safe to use in that case as HMM internally protects itself with locks. If a driver uses this helper with non-HMM functions it should ascertain that it is safe to do so. Signed-off-by: Jérôme Glisse Cc: Ralph Campbell Cc: Andrew Morton Cc: John Hubbard Cc: Dan Williams Cc: Ira Weiny --- include/linux/hmm.h | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index e5834082de60..a79fcc6681f5 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -438,6 +438,30 @@ struct hmm_mirror { int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm); void hmm_mirror_unregister(struct hmm_mirror *mirror); +/* + * hmm_mirror_mm_is_alive() - test if the mm is still alive + * @mirror: the HMM mirror of the mm to check + * Returns: false if the mm is dead, true otherwise + * + * This is only an optimization and it is not always accurate: the process + * might already be dying while HMM has not yet been informed of it, in which + * case the helper still reports the mm as alive. It is only intended to be + * used to optimize out cases where the driver is about to do something time + * consuming and it would be better to skip it if the mm is dead. + */ +static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror) +{ + struct mm_struct *mm; + + if (!mirror || !mirror->hmm) + return false; + mm = READ_ONCE(mirror->hmm->mm); + if (mirror->hmm->dead || !mm) + return false; + + return true; +} + /* * Please see Documentation/vm/hmm.rst for how to use the range API.
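To illustrate how a driver might consume this new helper, here is a minimal sketch (not part of the patch itself); the dmirror_worker structure, its fields and dmirror_do_work() are hypothetical names made up for this example, and the actual device work is elided:

#include <linux/hmm.h>
#include <linux/workqueue.h>

/* Hypothetical driver context embedding the HMM mirror it registered. */
struct dmirror_worker {
	struct hmm_mirror mirror;	/* registered with hmm_mirror_register() */
	struct work_struct work;
};

static void dmirror_do_work(struct work_struct *work)
{
	struct dmirror_worker *dw = container_of(work, struct dmirror_worker, work);

	/*
	 * Optimization only: if the mirrored mm is already dead there is no
	 * point in doing more device work for it. A stale "alive" answer is
	 * harmless, the mmu notifier callbacks still provide the real
	 * synchronization.
	 */
	if (!hmm_mirror_mm_is_alive(&dw->mirror))
		return;

	/* ... do the actual (potentially expensive) device work here ... */
}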
From patchwork Wed Apr 3 19:33:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerome Glisse X-Patchwork-Id: 10884409
From: jglisse@redhat.com To: linux-mm@kvack.org, Andrew Morton Cc: linux-kernel@vger.kernel.org, Jérôme Glisse, Ralph Campbell, John Hubbard, Dan Williams, Souptick Joarder Subject: [PATCH v3 11/12] mm/hmm: add an helper function that fault pages and map them to a device v3 Date: Wed, 3 Apr 2019 15:33:17 -0400 Message-Id: <20190403193318.16478-12-jglisse@redhat.com> In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com> References: <20190403193318.16478-1-jglisse@redhat.com>
From: Jérôme Glisse This is an all-in-one helper that faults pages in a range and maps them to a device, so that every single device driver does not have to re-implement this common pattern. It is taken from ODP RDMA in preparation for the ODP RDMA conversion, and will be used by nouveau and other drivers. Changes since v2: - Improved function comment for kernel documentation.
Changes since v1: - improved commit message Signed-off-by: Jérôme Glisse Cc: Andrew Morton Cc: Ralph Campbell Cc: John Hubbard Cc: Dan Williams Cc: Souptick Joarder --- include/linux/hmm.h | 9 +++ mm/hmm.c | 152 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 161 insertions(+) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index a79fcc6681f5..f81fe2c0f343 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -474,6 +474,15 @@ int hmm_range_register(struct hmm_range *range, void hmm_range_unregister(struct hmm_range *range); long hmm_range_snapshot(struct hmm_range *range); long hmm_range_fault(struct hmm_range *range, bool block); +long hmm_range_dma_map(struct hmm_range *range, + struct device *device, + dma_addr_t *daddrs, + bool block); +long hmm_range_dma_unmap(struct hmm_range *range, + struct vm_area_struct *vma, + struct device *device, + dma_addr_t *daddrs, + bool dirty); /* * HMM_RANGE_DEFAULT_TIMEOUT - default timeout (ms) when waiting for a range diff --git a/mm/hmm.c b/mm/hmm.c index 39bc77d7e6e3..82fded7273d8 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -30,6 +30,7 @@ #include #include #include +#include #include #include @@ -1173,6 +1174,157 @@ long hmm_range_fault(struct hmm_range *range, bool block) return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; } EXPORT_SYMBOL(hmm_range_fault); + +/** + * hmm_range_dma_map() - hmm_range_fault() and dma map page all in one. + * @range: range being faulted + * @device: device against to dma map page to + * @daddrs: dma address of mapped pages + * @block: allow blocking on fault (if true it sleeps and do not drop mmap_sem) + * Returns: number of pages mapped on success, -EAGAIN if mmap_sem have been + * drop and you need to try again, some other error value otherwise + * + * Note same usage pattern as hmm_range_fault(). + */ +long hmm_range_dma_map(struct hmm_range *range, + struct device *device, + dma_addr_t *daddrs, + bool block) +{ + unsigned long i, npages, mapped; + long ret; + + ret = hmm_range_fault(range, block); + if (ret <= 0) + return ret ? ret : -EBUSY; + + npages = (range->end - range->start) >> PAGE_SHIFT; + for (i = 0, mapped = 0; i < npages; ++i) { + enum dma_data_direction dir = DMA_FROM_DEVICE; + struct page *page; + + /* + * FIXME need to update DMA API to provide invalid DMA address + * value instead of a function to test dma address value. This + * would remove lot of dumb code duplicated accross many arch. + * + * For now setting it to 0 here is good enough as the pfns[] + * value is what is use to check what is valid and what isn't. + */ + daddrs[i] = 0; + + page = hmm_pfn_to_page(range, range->pfns[i]); + if (page == NULL) + continue; + + /* Check if range is being invalidated */ + if (!range->valid) { + ret = -EBUSY; + goto unmap; + } + + /* If it is read and write than map bi-directional. */ + if (range->pfns[i] & range->values[HMM_PFN_WRITE]) + dir = DMA_BIDIRECTIONAL; + + daddrs[i] = dma_map_page(device, page, 0, PAGE_SIZE, dir); + if (dma_mapping_error(device, daddrs[i])) { + ret = -EFAULT; + goto unmap; + } + + mapped++; + } + + return mapped; + +unmap: + for (npages = i, i = 0; (i < npages) && mapped; ++i) { + enum dma_data_direction dir = DMA_FROM_DEVICE; + struct page *page; + + page = hmm_pfn_to_page(range, range->pfns[i]); + if (page == NULL) + continue; + + if (dma_mapping_error(device, daddrs[i])) + continue; + + /* If it is read and write than map bi-directional. 
*/ + if (range->pfns[i] & range->values[HMM_PFN_WRITE]) + dir = DMA_BIDIRECTIONAL; + + dma_unmap_page(device, daddrs[i], PAGE_SIZE, dir); + mapped--; + } + + return ret; +} +EXPORT_SYMBOL(hmm_range_dma_map); + +/** + * hmm_range_dma_unmap() - unmap range of that was map with hmm_range_dma_map() + * @range: range being unmapped + * @vma: the vma against which the range (optional) + * @device: device against which dma map was done + * @daddrs: dma address of mapped pages + * @dirty: dirty page if it had the write flag set + * Returns: number of page unmapped on success, -EINVAL otherwise + * + * Note that caller MUST abide by mmu notifier or use HMM mirror and abide + * to the sync_cpu_device_pagetables() callback so that it is safe here to + * call set_page_dirty(). Caller must also take appropriate locks to avoid + * concurrent mmu notifier or sync_cpu_device_pagetables() to make progress. + */ +long hmm_range_dma_unmap(struct hmm_range *range, + struct vm_area_struct *vma, + struct device *device, + dma_addr_t *daddrs, + bool dirty) +{ + unsigned long i, npages; + long cpages = 0; + + /* Sanity check. */ + if (range->end <= range->start) + return -EINVAL; + if (!daddrs) + return -EINVAL; + if (!range->pfns) + return -EINVAL; + + npages = (range->end - range->start) >> PAGE_SHIFT; + for (i = 0; i < npages; ++i) { + enum dma_data_direction dir = DMA_FROM_DEVICE; + struct page *page; + + page = hmm_pfn_to_page(range, range->pfns[i]); + if (page == NULL) + continue; + + /* If it is read and write than map bi-directional. */ + if (range->pfns[i] & range->values[HMM_PFN_WRITE]) { + dir = DMA_BIDIRECTIONAL; + + /* + * See comments in function description on why it is + * safe here to call set_page_dirty() + */ + if (dirty) + set_page_dirty(page); + } + + /* Unmap and clear pfns/dma address */ + dma_unmap_page(device, daddrs[i], PAGE_SIZE, dir); + range->pfns[i] = range->values[HMM_PFN_NONE]; + /* FIXME see comments in hmm_vma_dma_map() */ + daddrs[i] = 0; + cpages++; + } + + return cpages; +} +EXPORT_SYMBOL(hmm_range_dma_unmap); #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */ From patchwork Wed Apr 3 19:33:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerome Glisse X-Patchwork-Id: 10884411 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D295F1708 for ; Wed, 3 Apr 2019 19:34:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BBFEA2875F for ; Wed, 3 Apr 2019 19:34:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AFEB0289D0; Wed, 3 Apr 2019 19:34:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D8FBD289C9 for ; Wed, 3 Apr 2019 19:34:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B3FF76B0276; Wed, 3 Apr 2019 15:33:46 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id AD22A6B0277; Wed, 3 Apr 2019 15:33:46 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org 
[209.132.183.28]) by mx.google.com with ESMTPS id g185si2214509qkf.107.2019.04.03.12.33.45 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Apr 2019 12:33:45 -0700 (PDT) Received-SPF: pass (google.com: domain of jglisse@redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; Authentication-Results: mx.google.com; spf=pass (google.com: domain of jglisse@redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=jglisse@redhat.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A5D6CC049598; Wed, 3 Apr 2019 19:33:44 +0000 (UTC) Received: from localhost.localdomain.com (ovpn-125-190.rdu2.redhat.com [10.10.125.190]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8FD0A6012C; Wed, 3 Apr 2019 19:33:43 +0000 (UTC) From: jglisse@redhat.com To: linux-mm@kvack.org, Andrew Morton Cc: linux-kernel@vger.kernel.org, =?utf-8?b?SsOpcsO0bWUgR2xpc3Nl?= , Ralph Campbell , John Hubbard , Dan Williams , Ira Weiny Subject: [PATCH v3 12/12] mm/hmm: convert various hmm_pfn_* to device_entry which is a better name Date: Wed, 3 Apr 2019 15:33:18 -0400 Message-Id: <20190403193318.16478-13-jglisse@redhat.com> In-Reply-To: <20190403193318.16478-1-jglisse@redhat.com> References: <20190403193318.16478-1-jglisse@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Wed, 03 Apr 2019 19:33:44 +0000 (UTC) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Jérôme Glisse Convert hmm_pfn_* to device_entry_* as here we are dealing with device driver specific entry format and hmm provide helpers to allow differents components (including HMM) to create/parse device entry. We keep wrapper with the old name so that we can convert driver to use the new API in stages in each device driver tree. This will get remove once all driver are converted. Signed-off-by: Jérôme Glisse Cc: Andrew Morton Cc: Ralph Campbell Cc: John Hubbard Cc: Dan Williams Cc: Ira Weiny --- include/linux/hmm.h | 93 +++++++++++++++++++++++++++++++-------------- mm/hmm.c | 19 +++++---- 2 files changed, 75 insertions(+), 37 deletions(-) diff --git a/include/linux/hmm.h b/include/linux/hmm.h index f81fe2c0f343..51ec27a84668 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -239,36 +239,36 @@ static inline bool hmm_range_valid(struct hmm_range *range) } /* - * hmm_pfn_to_page() - return struct page pointed to by a valid HMM pfn - * @range: range use to decode HMM pfn value - * @pfn: HMM pfn value to get corresponding struct page from - * Returns: struct page pointer if pfn is a valid HMM pfn, NULL otherwise + * hmm_device_entry_to_page() - return struct page pointed to by a device entry + * @range: range use to decode device entry value + * @entry: device entry value to get corresponding struct page from + * Returns: struct page pointer if entry is a valid, NULL otherwise * - * If the HMM pfn is valid (ie valid flag set) then return the struct page - * matching the pfn value stored in the HMM pfn. Otherwise return NULL. 
+ * If the device entry is valid (ie valid flag set) then return the struct page + * matching the entry value. Otherwise return NULL. */ -static inline struct page *hmm_pfn_to_page(const struct hmm_range *range, - uint64_t pfn) +static inline struct page *hmm_device_entry_to_page(const struct hmm_range *range, + uint64_t entry) { - if (pfn == range->values[HMM_PFN_NONE]) + if (entry == range->values[HMM_PFN_NONE]) return NULL; - if (pfn == range->values[HMM_PFN_ERROR]) + if (entry == range->values[HMM_PFN_ERROR]) return NULL; - if (pfn == range->values[HMM_PFN_SPECIAL]) + if (entry == range->values[HMM_PFN_SPECIAL]) return NULL; - if (!(pfn & range->flags[HMM_PFN_VALID])) + if (!(entry & range->flags[HMM_PFN_VALID])) return NULL; - return pfn_to_page(pfn >> range->pfn_shift); + return pfn_to_page(entry >> range->pfn_shift); } /* - * hmm_pfn_to_pfn() - return pfn value store in a HMM pfn - * @range: range use to decode HMM pfn value - * @pfn: HMM pfn value to extract pfn from - * Returns: pfn value if HMM pfn is valid, -1UL otherwise + * hmm_device_entry_to_pfn() - return pfn value store in a device entry + * @range: range use to decode device entry value + * @entry: device entry to extract pfn from + * Returns: pfn value if device entry is valid, -1UL otherwise */ -static inline unsigned long hmm_pfn_to_pfn(const struct hmm_range *range, - uint64_t pfn) +static inline unsigned long +hmm_device_entry_to_pfn(const struct hmm_range *range, uint64_t pfn) { if (pfn == range->values[HMM_PFN_NONE]) return -1UL; @@ -282,31 +282,66 @@ static inline unsigned long hmm_pfn_to_pfn(const struct hmm_range *range, } /* - * hmm_pfn_from_page() - create a valid HMM pfn value from struct page + * hmm_device_entry_from_page() - create a valid device entry for a page * @range: range use to encode HMM pfn value - * @page: struct page pointer for which to create the HMM pfn - * Returns: valid HMM pfn for the page + * @page: page for which to create the device entry + * Returns: valid device entry for the page */ -static inline uint64_t hmm_pfn_from_page(const struct hmm_range *range, - struct page *page) +static inline uint64_t hmm_device_entry_from_page(const struct hmm_range *range, + struct page *page) { return (page_to_pfn(page) << range->pfn_shift) | range->flags[HMM_PFN_VALID]; } /* - * hmm_pfn_from_pfn() - create a valid HMM pfn value from pfn + * hmm_device_entry_from_pfn() - create a valid device entry value from pfn * @range: range use to encode HMM pfn value - * @pfn: pfn value for which to create the HMM pfn - * Returns: valid HMM pfn for the pfn + * @pfn: pfn value for which to create the device entry + * Returns: valid device entry for the pfn */ -static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range, - unsigned long pfn) +static inline uint64_t hmm_device_entry_from_pfn(const struct hmm_range *range, + unsigned long pfn) { return (pfn << range->pfn_shift) | range->flags[HMM_PFN_VALID]; } +/* + * Old API: + * hmm_pfn_to_page() + * hmm_pfn_to_pfn() + * hmm_pfn_from_page() + * hmm_pfn_from_pfn() + * + * This are the OLD API please use new API, it is here to avoid cross-tree + * merge painfullness ie we convert things to new API in stages. 
+ */ +static inline struct page *hmm_pfn_to_page(const struct hmm_range *range, + uint64_t pfn) +{ + return hmm_device_entry_to_page(range, pfn); +} + +static inline unsigned long hmm_pfn_to_pfn(const struct hmm_range *range, + uint64_t pfn) +{ + return hmm_device_entry_to_pfn(range, pfn); +} + +static inline uint64_t hmm_pfn_from_page(const struct hmm_range *range, + struct page *page) +{ + return hmm_device_entry_from_page(range, page); +} + +static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range, + unsigned long pfn) +{ + return hmm_device_entry_from_pfn(range, pfn); +} + + #if IS_ENABLED(CONFIG_HMM_MIRROR) /* diff --git a/mm/hmm.c b/mm/hmm.c index 82fded7273d8..75d2ea906efb 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -542,7 +542,7 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, if (unlikely(!hmm_vma_walk->pgmap)) return -EBUSY; } - pfns[i] = hmm_pfn_from_pfn(range, pfn) | cpu_flags; + pfns[i] = hmm_device_entry_from_pfn(range, pfn) | cpu_flags; } if (hmm_vma_walk->pgmap) { put_dev_pagemap(hmm_vma_walk->pgmap); @@ -606,7 +606,8 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, &fault, &write_fault); if (fault || write_fault) goto fault; - *pfn = hmm_pfn_from_pfn(range, swp_offset(entry)); + *pfn = hmm_device_entry_from_pfn(range, + swp_offset(entry)); *pfn |= cpu_flags; return 0; } @@ -644,7 +645,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, return -EFAULT; } - *pfn = hmm_pfn_from_pfn(range, pte_pfn(pte)) | cpu_flags; + *pfn = hmm_device_entry_from_pfn(range, pte_pfn(pte)) | cpu_flags; return 0; fault: @@ -797,7 +798,8 @@ static int hmm_vma_walk_pud(pud_t *pudp, hmm_vma_walk->pgmap); if (unlikely(!hmm_vma_walk->pgmap)) return -EBUSY; - pfns[i] = hmm_pfn_from_pfn(range, pfn) | cpu_flags; + pfns[i] = hmm_device_entry_from_pfn(range, pfn) | + cpu_flags; } if (hmm_vma_walk->pgmap) { put_dev_pagemap(hmm_vma_walk->pgmap); @@ -870,7 +872,8 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, pfn = pte_pfn(entry) + (start & mask); for (; addr < end; addr += size, i++, pfn += pfn_inc) - range->pfns[i] = hmm_pfn_from_pfn(range, pfn) | cpu_flags; + range->pfns[i] = hmm_device_entry_from_pfn(range, pfn) | + cpu_flags; hmm_vma_walk->last = end; unlock: @@ -1213,7 +1216,7 @@ long hmm_range_dma_map(struct hmm_range *range, */ daddrs[i] = 0; - page = hmm_pfn_to_page(range, range->pfns[i]); + page = hmm_device_entry_to_page(range, range->pfns[i]); if (page == NULL) continue; @@ -1243,7 +1246,7 @@ long hmm_range_dma_map(struct hmm_range *range, enum dma_data_direction dir = DMA_FROM_DEVICE; struct page *page; - page = hmm_pfn_to_page(range, range->pfns[i]); + page = hmm_device_entry_to_page(range, range->pfns[i]); if (page == NULL) continue; @@ -1298,7 +1301,7 @@ long hmm_range_dma_unmap(struct hmm_range *range, enum dma_data_direction dir = DMA_FROM_DEVICE; struct page *page; - page = hmm_pfn_to_page(range, range->pfns[i]); + page = hmm_device_entry_to_page(range, range->pfns[i]); if (page == NULL) continue;
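As a usage illustration (not taken from the series), here is a minimal sketch of how a driver might combine hmm_range_register(), hmm_range_snapshot() and the renamed hmm_device_entry_to_page() helper. The function name snapshot_and_walk() is hypothetical, the caller is assumed to have registered an hmm_mirror for mm and to have allocated range->pfns plus the flags/values tables, and invalidation/retry handling (hmm_range_wait_until_valid(), -EAGAIN/-EBUSY) is omitted for brevity:

#include <linux/hmm.h>
#include <linux/mm.h>

static long snapshot_and_walk(struct hmm_range *range, struct mm_struct *mm,
			      unsigned long start, unsigned long end)
{
	unsigned long i, npages = (end - start) >> PAGE_SHIFT;
	long ret;

	/* Register the range with the default CPU page size. */
	ret = hmm_range_register(range, mm, start, end, PAGE_SHIFT);
	if (ret)
		return ret;

	/* Snapshot the CPU page table; caller deals with retries. */
	down_read(&mm->mmap_sem);
	ret = hmm_range_snapshot(range);
	up_read(&mm->mmap_sem);
	if (ret < 0)
		goto out;

	for (i = 0; i < npages; ++i) {
		struct page *page;

		/* New name, same semantic as the old hmm_pfn_to_page(). */
		page = hmm_device_entry_to_page(range, range->pfns[i]);
		if (!page)
			continue;
		/* ... program the device page table with this page ... */
	}
out:
	hmm_range_unregister(range);
	return ret;
}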