Message ID | 20250217190836.435039-1-willy@infradead.org (mailing list archive) |
---|---|
Series | Add folio_mk_pte() and simplify mk_pte() |
On 17.02.25 20:08, Matthew Wilcox (Oracle) wrote:
> The intent is to add folio_mk_pte() to remove the conversion from folio
> to page necessary to call mk_pte(). Eventually we might end up removing
> mk_pte(), but that's not what's being proposed today.
>
> I didn't want to add folio_mk_pte() to each architecture, and I didn't
> want to lose any optimisations that architectures have from their own
> implementation of mk_pte(). Fortunately, most architectures have by
> now turned their mk_pte() into a fairly bland variant of pfn_pte(),
> but s390 is different.
>
> So patch 1 hoists the optimisation of calling pte_mkdirty() from s390
> to generic code. I'd appreciate some eyes on this from mm people who
> understand this better than I do. I originally had
>
> -	if (write)
> +	if (write || folio_test_dirty(folio))
> 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>
> and I think that broke COW under some circumstances that 01.org could
> reproduce and I couldn't.

If it's an anon folio that logic would be broken, yes (anon CoW).

We do have can_change_pte_writable() that tells you when it is safe to
upgrade write permissions for a PTE.

Looking at can_change_pte_writable(), I don't know if filesystems with
writenotify might have a problem when setting the PTE dirty and allowing
for write access, just because the folio is dirty.

So I assume that it would break fs-level CoW indeed.
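
The anon CoW concern can be illustrated with a toy userspace model. This is only a sketch, not kernel code: the `toy_*` types, `map_rejected()`, `map_conservative()` and `breaks_anon_cow()` are hypothetical names made up for illustration, and the `may_write` check merely stands in for what maybe_mkwrite() does when the VMA allows writes. It models a read fault only; a real write fault on a shared anon folio would go through the CoW path first.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel's folio/vma/pte types. */
struct toy_folio {
	bool anon;       /* anonymous memory */
	bool dirty;      /* folio-level dirty flag */
	bool exclusive;  /* owned by one process only (not shared after fork) */
};

struct toy_vma {
	bool may_write;  /* models VM_WRITE being set on the VMA */
};

struct toy_pte {
	bool write;
	bool dirty;
};

/* The variant quoted above: a dirty folio is treated like a write fault. */
static struct toy_pte map_rejected(const struct toy_folio *folio,
				   const struct toy_vma *vma, bool write_fault)
{
	struct toy_pte pte = { .write = false, .dirty = false };

	if (write_fault || folio->dirty) {
		pte.dirty = true;
		pte.write = vma->may_write; /* write permission follows the VMA */
	}
	return pte;
}

/* Only an actual write fault upgrades the PTE to dirty and writable. */
static struct toy_pte map_conservative(const struct toy_folio *folio,
				       const struct toy_vma *vma, bool write_fault)
{
	struct toy_pte pte = { .write = false, .dirty = false };

	(void)folio; /* the folio's dirty flag is deliberately ignored here */
	if (write_fault) {
		pte.dirty = true;
		pte.write = vma->may_write;
	}
	return pte;
}

/* A writable PTE pointing at a non-exclusive anon folio bypasses CoW. */
static bool breaks_anon_cow(struct toy_pte pte, const struct toy_folio *folio)
{
	return pte.write && folio->anon && !folio->exclusive;
}
```

In this model, a read fault on a dirty anon folio that is still shared after fork() ends up writable under map_rejected(), so a later store would modify memory the other process can still see. map_conservative() leaves the PTE read-only, so the store would fault and could be handled as CoW. The writenotify case raised above is not modelled, since it is unclear from the discussion whether it actually misbehaves.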