Message ID | 20231114154945.490401-1-ryan.roberts@arm.com (mailing list archive) |
---|---|
State | New |
Series | [v1] mm: More ptep_get() conversion |
On Tue, Nov 14, 2023 at 03:49:45PM +0000, Ryan Roberts wrote:
> Commit c33c794828f2 ("mm: ptep_get() conversion") converted all
> (non-arch) call sites to use ptep_get() instead of doing a direct
> dereference of the pte. Full rationale can be found in that commit's
> log.
>
> Since then, three new call sites have snuck in, which directly
> dereference the pte, so let's fix those up.
>
> Unfortunately there is no reliable automated mechanism to catch these;
> I'm relying on a combination of Coccinelle (which throws up a lot of
> false positives) and some compiler magic to force a compiler error on
> dereference (While this approach finds dereferences, it also yields a
> non-booting kernel so can't be committed).

Well ... let's see what we can come up with.

struct raw_pte {
	pte_t pte;
};

static inline pte_t ptep_get(struct raw_pte *rpte)
{
	return rpte->pte;
}

Probably quite a lot of churn to put that into place, but better than
a never-ending treadmill of fixing the places that people overlooked?
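To make the wrapper suggestion above concrete, here is a minimal user-space sketch. It is an illustration only, not kernel code: the pte_t definition is modeled on the arm64 typedef quoted elsewhere in the thread, pte_none() is reduced to a zero check, and slot_is_empty() is a hypothetical caller invented for this example. The generic kernel ptep_get() reads through READ_ONCE(); a plain load stands in for that here.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the kernel types (assumption: arm64-like layout). */
typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;

/* Wrapper type from the suggestion: page-table slots become struct raw_pte. */
struct raw_pte {
	pte_t pte;
};

/* Accessor: the intended way to read a slot. The kernel's generic
 * ptep_get() uses READ_ONCE(); a plain load is used in this sketch. */
static inline pte_t ptep_get(const struct raw_pte *rpte)
{
	return rpte->pte;
}

/* Simplified stand-in for the kernel's pte_none(). */
static inline int pte_none(pte_t pte)
{
	return pte.pte == 0;
}

/* Hypothetical caller. Writing pte_none(*slot) here would not compile,
 * because *slot is a struct raw_pte rather than a pte_t. */
static int slot_is_empty(const struct raw_pte *slot)
{
	return pte_none(ptep_get(slot));
}
```

With this shape, a stray `*slot` passed to a function expecting pte_t is a type error, but `slot->pte` still compiles, so the wrapper discourages direct reads rather than preventing them.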
On 14/11/2023 16:30, Matthew Wilcox wrote:
> On Tue, Nov 14, 2023 at 03:49:45PM +0000, Ryan Roberts wrote:
>> Commit c33c794828f2 ("mm: ptep_get() conversion") converted all
>> (non-arch) call sites to use ptep_get() instead of doing a direct
>> dereference of the pte. Full rationale can be found in that commit's
>> log.
>>
>> Since then, three new call sites have snuck in, which directly
>> dereference the pte, so let's fix those up.
>>
>> Unfortunately there is no reliable automated mechanism to catch these;
>> I'm relying on a combination of Coccinelle (which throws up a lot of
>> false positives) and some compiler magic to force a compiler error on
>> dereference (While this approach finds dereferences, it also yields a
>> non-booting kernel so can't be committed).
>
> Well ... let's see what we can come up with.
>
> struct raw_pte {
> 	pte_t pte;
> };

pte_t is already a wrapper around the real value, at least on arm64:

typedef struct {
	pteval_t pte;
} pte_t;

So doesn't adding an extra wrapper just suggest that next year we will end
up adding a third, then a fourth...? Fundamentally people can still just
do pte->pte to dereference.

The approach I took with the compiler magic I describe above was to pass
around:

typedef void* pte_handle_t;

which is just a pointer to pte_t, but you can't deref without an explicit
cast. So then I insert the explicit casts in the 5 or 6 places in the
arm64 arch code that they are required and it mostly just works. (I have
the core patch which is pretty small, then do find/replace on "pte_t *" ->
"pte_handle_t" and it just works). But it's a LOT of churn in the non-arch
code, and leaves the other arches broken, many of which are dereferencing
all over the place - it would be a huge effort to fix them all up.

>
> static inline pte_t ptep_get(struct raw_pte *rpte)
> {
> 	return rpte->pte;
> }
>
> Probably quite a lot of churn to put that into place, but better than
> a never-ending treadmill of fixing the places that people overlooked?

Yes and no... agree it would be nice to automatically guard against it, but I didn't want to spend the next 6 months of my life fixing up all the other arches...
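The opaque-handle idea described in this reply can be sketched in user space as follows. This is an illustration under assumptions, not the actual experimental patch (which is not shown in the thread): the pte_t definition mirrors the arm64 typedef quoted above, pte_none() is reduced to a zero check, and handle_is_empty() is a hypothetical caller.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the kernel types (assumption: arm64-like layout). */
typedef uint64_t pteval_t;
typedef struct { pteval_t pte; } pte_t;

/* Opaque handle as described: really a pointer to pte_t, but typed as
 * void * so that it cannot be dereferenced without an explicit cast. */
typedef void *pte_handle_t;

/* The explicit cast is confined to the accessor, which plays the role
 * of the handful of arch-code sites mentioned in the reply. */
static inline pte_t ptep_get(pte_handle_t ptep)
{
	return *(pte_t *)ptep;
}

/* Simplified stand-in for the kernel's pte_none(). */
static inline int pte_none(pte_t pte)
{
	return pte.pte == 0;
}

/* Hypothetical caller. pte_none(*ptep) is rejected by the compiler,
 * since dereferencing a void pointer does not yield a usable value. */
static int handle_is_empty(pte_handle_t ptep)
{
	return pte_none(ptep_get(ptep));
}
```

The trade-off is the one described in the reply: every "pte_t *" parameter in core code has to become "pte_handle_t", and any architecture that still dereferences directly stops building until it is converted.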
diff --git a/mm/filemap.c b/mm/filemap.c
index 9710f43a89ac..32eedf3afd45 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3443,7 +3443,7 @@ static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
 		 * handled in the specific fault path, and it'll prohibit the
 		 * fault-around logic.
 		 */
-		if (!pte_none(vmf->pte[count]))
+		if (!pte_none(ptep_get(&vmf->pte[count])))
 			goto skip;
 
 		count++;
diff --git a/mm/ksm.c b/mm/ksm.c
index 7efcc68ccc6e..6a831009b4cb 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -468,7 +468,7 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
 			page = pfn_swap_entry_to_page(entry);
 	}
 	/* return 1 if the page is an normal ksm page or KSM-placed zero page */
-	ret = (page && PageKsm(page)) || is_ksm_zero_pte(*pte);
+	ret = (page && PageKsm(page)) || is_ksm_zero_pte(ptent);
 	pte_unmap_unlock(pte, ptl);
 	return ret;
 }
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 96d9eae5c7cc..0b6ca553bebe 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -312,7 +312,7 @@ static int mfill_atomic_pte_poison(pmd_t *dst_pmd,
 
 	ret = -EEXIST;
 	/* Refuse to overwrite any PTE, even a PTE marker (e.g. UFFD WP). */
-	if (!pte_none(*dst_pte))
+	if (!pte_none(ptep_get(dst_pte)))
 		goto out_unlock;
 
 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
Commit c33c794828f2 ("mm: ptep_get() conversion") converted all
(non-arch) call sites to use ptep_get() instead of doing a direct
dereference of the pte. Full rationale can be found in that commit's
log.

Since then, three new call sites have snuck in, which directly
dereference the pte, so let's fix those up.

Unfortunately there is no reliable automated mechanism to catch these;
I'm relying on a combination of Coccinelle (which throws up a lot of
false positives) and some compiler magic to force a compiler error on
dereference (While this approach finds dereferences, it also yields a
non-booting kernel so can't be committed).

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 mm/filemap.c     | 2 +-
 mm/ksm.c         | 2 +-
 mm/userfaultfd.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

base-commit: b85ea95d086471afb4ad062012a4d73cd328fa86
--
2.25.1