Message ID | 20250317141700.3701581-1-kevin.brodsky@arm.com (mailing list archive) |
---|---|
Series | Always call constructor for kernel page tables |
On 17/03/2025 14:16, Kevin Brodsky wrote:
> The complications in those special pgtable allocators beg the question:
> does it really make sense to treat efi_mm and init_mm differently in
> e.g. apply_to_pte_range()? Maybe what we really need is a way to tell if
> an mm corresponds to user memory or not, and never use split locks for
> non-user mm's. Feedback and suggestions welcome!

The difference in treatment is whether or not the ptl is taken, right? So the
real question is when calling apply_to_pte_range() for efi_mm, is there already
a higher level serialization mechanism that prevents racy accesses? For init_mm,
I think this is handled implicitly because there is no way for user space to
cause apply_to_pte_range() for an arbitrary piece of kernel memory. Although I
can't even see where apply_to_page_range() is called for efi_mm.

FWIW, contpte.c has mm_is_user() which is used by arm64.

Thanks,
Ryan
On 17/03/2025 16:30, Ryan Roberts wrote:
> On 17/03/2025 14:16, Kevin Brodsky wrote:
>> The complications in those special pgtable allocators beg the question:
>> does it really make sense to treat efi_mm and init_mm differently in
>> e.g. apply_to_pte_range()? Maybe what we really need is a way to tell if
>> an mm corresponds to user memory or not, and never use split locks for
>> non-user mm's. Feedback and suggestions welcome!
>
> The difference in treatment is whether or not the ptl is taken, right? So the
> real question is when calling apply_to_pte_range() for efi_mm, is there already
> a higher level serialization mechanism that prevents racy accesses? For init_mm,
> I think this is handled implicitly because there is no way for user space to
> cause apply_to_pte_range() for an arbitrary piece of kernel memory. Although I
> can't even see where apply_to_page_range() is called for efi_mm.

The commit I mentioned above, 61444cde9170 ("ARM: 8591/1: mm: use fully
constructed struct pages for EFI pgd allocations"), shows that
apply_to_page_range() is called from efi_set_mapping_permissions(), and
this indeed hasn't changed. It is itself called from efi_virtmap_init().
I would expect that no locking at all is necessary here, since the
mapping has just been created and surely isn't used yet.

Now the question is where exactly init_mm is special-cased in this
manner. I can see that walk_page_range() does something similar, there
may be more cases. And the other question is whether those functions are
ever used on special mm's, aside from efi_set_mapping_permissions().

> FWIW, contpte.c has mm_is_user() which is used by arm64.

Interesting! But not pretty, that's basically checking that the mm is
not &init_mm or &efi_mm... which wouldn't work for a generic
implementation. It feels like adding some attribute to mm_struct
wouldn't hurt. It looks like we've run out of MMF_* flags though :/

- Kevin
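
To make the two options discussed above concrete, here is a minimal userspace
sketch, not kernel code. It contrasts a pointer-identity check (in the spirit
of what the thread says arm64's mm_is_user() does) with a per-mm attribute,
and shows a walker that only takes the page table lock for user mm's. The
struct layout, the is_user field, the ptl_held counter and all function names
below are hypothetical stand-ins chosen for illustration; none of them is an
existing kernel interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct mm_struct (assumption). */
struct mm_struct {
	bool is_user;	/* hypothetical attribute, not an existing kernel field */
	int ptl_held;	/* models how many times the page table lock is held */
};

/* Stand-ins for the two special kernel mm's named in the thread. */
static struct mm_struct init_mm = { .is_user = false };
static struct mm_struct efi_mm = { .is_user = false };

/* Identity check: every special mm has to be listed explicitly. */
static bool mm_is_user_by_identity(const struct mm_struct *mm)
{
	return mm != &init_mm && mm != &efi_mm;
}

/* Attribute check: generic code needs no knowledge of specific mm's. */
static bool mm_is_user_by_flag(const struct mm_struct *mm)
{
	return mm->is_user;
}

/*
 * Models a page table walker that takes the (split) page table lock only
 * for user mm's. Returns true if the lock was taken during the walk.
 */
static bool walk_pte_range(struct mm_struct *mm)
{
	bool locked = mm_is_user_by_flag(mm);

	if (locked)
		mm->ptl_held++;		/* stands in for spin_lock(ptl) */

	/* ... the PTEs would be visited here ... */

	if (locked)
		mm->ptl_held--;		/* stands in for spin_unlock(ptl) */

	return locked;
}
```

With the attribute approach, adding another special mm only requires leaving
its flag unset, whereas the identity check has to be extended for each new
special mm, which is why it does not generalise outside arch code.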