Message ID | 20200703153718.16973-8-catalin.marinas@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | arm64: Memory Tagging Extension user-space support |
On 03.07.20 17:36, Catalin Marinas wrote:
> When a huge page is split into normal pages, part of the head page flags
> are transferred to the tail pages. However, the PG_arch_* flags are not
> part of the preserved set.
>
> PG_arch_1 is currently used by the arch code to handle cache maintenance
> for user space (either for I-D cache coherency or for D-cache aliases
> consistent with the kernel mapping). Since splitting a huge page does
> not change the physical or virtual address of a mapping, additional
> cache maintenance for the tail pages is unnecessary. Preserving the
> PG_arch_1 flag from the head page in the tail pages would not break the
> current use-cases.

^ is fairly arm64 specific, no? (I remember that the semantics are
different e.g., on s390x). Did you check if this is actually safe to do
on other architectures? Maybe rephrase the description to make this
clearer.

>
> PG_arch_2 is currently used for arm64 MTE support to mark pages that
> have valid tags. The absence of such flag causes the arm64 set_pte_at()
> to clear the tags in order to avoid stale tags exposed to user or the
> swapping out hooks to ignore the tags. Not preserving PG_arch_2 on huge
> page splitting leads to tag corruption in the tail pages.

"currently"? I don't think so - isn't it follow-up patches in this series?

>
> To avoid the above and for consistency between the two PG_arch_* flags,
> preserve both PG_arch_1 and PG_arch_2 in __split_huge_page_tail().
>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> Notes:
>     New in v6.
>
>  mm/huge_memory.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 78c84bee7e29..22b3236a6dd8 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2364,6 +2364,10 @@ static void __split_huge_page_tail(struct page *head, int tail,
>  			 (1L << PG_workingset) |
>  			 (1L << PG_locked) |
>  			 (1L << PG_unevictable) |
> +			 (1L << PG_arch_1) |
> +#ifdef CONFIG_64BIT
> +			 (1L << PG_arch_2) |
> +#endif
>  			 (1L << PG_dirty)));
>
>  	/* ->mapping in first tail page is compound_mapcount */
>
On Mon, Jul 06, 2020 at 04:16:13PM +0200, David Hildenbrand wrote:
> On 03.07.20 17:36, Catalin Marinas wrote:
> > When a huge page is split into normal pages, part of the head page flags
> > are transferred to the tail pages. However, the PG_arch_* flags are not
> > part of the preserved set.
> >
> > PG_arch_1 is currently used by the arch code to handle cache maintenance
> > for user space (either for I-D cache coherency or for D-cache aliases
> > consistent with the kernel mapping). Since splitting a huge page does
> > not change the physical or virtual address of a mapping, additional
> > cache maintenance for the tail pages is unnecessary. Preserving the
> > PG_arch_1 flag from the head page in the tail pages would not break the
> > current use-cases.
>
> ^ is fairly arm64 specific, no? (I remember that the semantics are
> different e.g., on s390x).

Not entirely arm64 specific. Apart from s390 and x86, I think all the
other architectures use this flag for cache maintenance (I guess they
followed the cachetlb.rst suggestion). My understanding of the s390 and
x86 is that transferring this flag from the head of a compound page to
the tail pages should not cause any issue. We don't even document
anywhere that this flag is meant to disappear on huge page splitting. I
guess no-one noticed because clearing it is relatively benign.

But if there are concerns, I'm happy to guard it with something like
__ARCH_WANT_PG_ARCH_HEAD_TAIL (I need to think of a more suggestive
name).

> > have valid tags. The absence of such flag causes the arm64 set_pte_at()
> > to clear the tags in order to avoid stale tags exposed to user or the
> > swapping out hooks to ignore the tags. Not preserving PG_arch_2 on huge
> > page splitting leads to tag corruption in the tail pages.
>
> "currently"? I don't think so - isn't it follow-up patches in this series?

True. It used to be correct before reordering the patches prior to
posting.
On 06.07.20 18:30, Catalin Marinas wrote:
> On Mon, Jul 06, 2020 at 04:16:13PM +0200, David Hildenbrand wrote:
>> On 03.07.20 17:36, Catalin Marinas wrote:
>>> When a huge page is split into normal pages, part of the head page flags
>>> are transferred to the tail pages. However, the PG_arch_* flags are not
>>> part of the preserved set.
>>>
>>> PG_arch_1 is currently used by the arch code to handle cache maintenance
>>> for user space (either for I-D cache coherency or for D-cache aliases
>>> consistent with the kernel mapping). Since splitting a huge page does
>>> not change the physical or virtual address of a mapping, additional
>>> cache maintenance for the tail pages is unnecessary. Preserving the
>>> PG_arch_1 flag from the head page in the tail pages would not break the
>>> current use-cases.
>>
>> ^ is fairly arm64 specific, no? (I remember that the semantics are
>> different e.g., on s390x).
>
> Not entirely arm64 specific. Apart from s390 and x86, I think all the
> other architectures use this flag for cache maintenance (I guess they
> followed the cachetlb.rst suggestion). My understanding of the s390 and
> x86 is that transferring this flag from the head of a compound page to
> the tail pages should not cause any issue. We don't even document
> anywhere that this flag is meant to disappear on huge page splitting. I
> guess no-one noticed because clearing it is relatively benign.

On s390x, PG_arch_1 indicates (s390/kernel/uv.c:arch_make_page_accessible())
- kernel page tables
- for hugetlbfs pages, that storage keys are initialized for that page
  (IIRC KVM only)
- a user space page might be encrypted/secure (KVM only)

The latter does not support hugetlbfs/THP. KVM does not support THP. So
on s390x the bit should never be set in that context and, therefore,
also won't be affected by this change.

>
> But if there are concerns, I'm happy to guard it with something like
> __ARCH_WANT_PG_ARCH_HEAD_TAIL (I need to think of a more suggestive
> name).

I guess we can avoid that if we properly check+document all users.
(ignoring x86 and s390x behavior here might be dangerous, although my
gut feeling is that it's ok for both)
On Mon, Jul 06, 2020 at 07:56:43PM +0200, David Hildenbrand wrote:
> On 06.07.20 18:30, Catalin Marinas wrote:
> > On Mon, Jul 06, 2020 at 04:16:13PM +0200, David Hildenbrand wrote:
> >> On 03.07.20 17:36, Catalin Marinas wrote:
> >>> When a huge page is split into normal pages, part of the head page flags
> >>> are transferred to the tail pages. However, the PG_arch_* flags are not
> >>> part of the preserved set.
> >>>
> >>> PG_arch_1 is currently used by the arch code to handle cache maintenance
> >>> for user space (either for I-D cache coherency or for D-cache aliases
> >>> consistent with the kernel mapping). Since splitting a huge page does
> >>> not change the physical or virtual address of a mapping, additional
> >>> cache maintenance for the tail pages is unnecessary. Preserving the
> >>> PG_arch_1 flag from the head page in the tail pages would not break the
> >>> current use-cases.
> >>
> >> ^ is fairly arm64 specific, no? (I remember that the semantics are
> >> different e.g., on s390x).
> >
> > Not entirely arm64 specific. Apart from s390 and x86, I think all the
> > other architectures use this flag for cache maintenance (I guess they
> > followed the cachetlb.rst suggestion). My understanding of the s390 and
> > x86 is that transferring this flag from the head of a compound page to
> > the tail pages should not cause any issue. We don't even document
> > anywhere that this flag is meant to disappear on huge page splitting. I
> > guess no-one noticed because clearing it is relatively benign.
>
> On s390x, PG_arch_1 indicates (s390/kernel/uv.c:arch_make_page_accessible())
> - kernel page tables
> - for hugetlbfs pages, that storage keys are initialized for that page
>   (IIRC KVM only)
> - a user space page might be encrypted/secure (KVM only)
>
> The latter does not support hugetlbfs/THP. KVM does not support THP. So
> on s390x the bit should never be set in that context and, therefore,
> also won't be affected by this change.

Thanks for checking.

> > But if there are concerns, I'm happy to guard it with something like
> > __ARCH_WANT_PG_ARCH_HEAD_TAIL (I need to think of a more suggestive
> > name).
>
> I guess we can avoid that if we properly check+document all users.
> (ignoring x86 and s390x behavior here might be dangerous, although my
> gut feeling is that it's ok for both)

I'll post an independent patch for PG_arch_1 to get consensus among
architectures. The PG_arch_2 introduced by the MTE patches can have the
new behaviour since it would only be used by arm64 initially.
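
For readers following the guard idea raised in the thread, here is a minimal user-space sketch of how an architecture opt-in could shape the mask. It is not code from the series: only the placeholder name __ARCH_WANT_PG_ARCH_HEAD_TAIL comes from the discussion, while TOY_PG_arch_1, TOY_ARCH_SPLIT_MASK, struct toy_page and toy_copy_arch_flags() are hypothetical names invented for illustration, and the bit position is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bit position (assumption); the real PG_arch_1 lives in the kernel's page-flags header. */
#define TOY_PG_arch_1 5

/*
 * Placeholder opt-in symbol from the discussion. An architecture that
 * wants the flag propagated would define it in one of its headers.
 * It is left undefined here, so the mask below evaluates to 0.
 */
#ifdef __ARCH_WANT_PG_ARCH_HEAD_TAIL
#define TOY_ARCH_SPLIT_MASK (1UL << TOY_PG_arch_1)
#else
#define TOY_ARCH_SPLIT_MASK 0UL
#endif

/* Hypothetical stand-in for struct page, reduced to its flags word. */
struct toy_page {
	unsigned long flags;
};

/* Copy the arch flag from head to tail only if the architecture opted in. */
static void toy_copy_arch_flags(const struct toy_page *head,
				struct toy_page *tail)
{
	assert(head != NULL && tail != NULL);
	tail->flags |= head->flags & TOY_ARCH_SPLIT_MASK;
}
```

With the symbol undefined, the tail's flags stay untouched; defining it before the mask definition makes the tail inherit the arch bit. The thread ends up preferring to audit and document all users instead of adding such a guard.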
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 78c84bee7e29..22b3236a6dd8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2364,6 +2364,10 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			 (1L << PG_workingset) |
 			 (1L << PG_locked) |
 			 (1L << PG_unevictable) |
+			 (1L << PG_arch_1) |
+#ifdef CONFIG_64BIT
+			 (1L << PG_arch_2) |
+#endif
 			 (1L << PG_dirty)));
 
 	/* ->mapping in first tail page is compound_mapcount */
When a huge page is split into normal pages, part of the head page flags
are transferred to the tail pages. However, the PG_arch_* flags are not
part of the preserved set.

PG_arch_1 is currently used by the arch code to handle cache maintenance
for user space (either for I-D cache coherency or for D-cache aliases
consistent with the kernel mapping). Since splitting a huge page does
not change the physical or virtual address of a mapping, additional
cache maintenance for the tail pages is unnecessary. Preserving the
PG_arch_1 flag from the head page in the tail pages would not break the
current use-cases.

PG_arch_2 is currently used for arm64 MTE support to mark pages that
have valid tags. The absence of such flag causes the arm64 set_pte_at()
to clear the tags in order to avoid stale tags exposed to user or the
swapping out hooks to ignore the tags. Not preserving PG_arch_2 on huge
page splitting leads to tag corruption in the tail pages.

To avoid the above and for consistency between the two PG_arch_* flags,
preserve both PG_arch_1 and PG_arch_2 in __split_huge_page_tail().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---

Notes:
    New in v6.

 mm/huge_memory.c | 4 ++++
 1 file changed, 4 insertions(+)
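
The effect described in the commit message can be illustrated with a small user-space model of the mask-and-copy step. This is a sketch under stated assumptions, not the kernel code: the TOY_* bit positions are arbitrary, struct toy_page and toy_split_tail() are hypothetical names, and only a subset of the preserved flags is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bit positions (assumption); the kernel defines the real enum pageflags. */
enum toy_pageflags {
	TOY_PG_locked,
	TOY_PG_dirty,
	TOY_PG_workingset,
	TOY_PG_unevictable,
	TOY_PG_arch_1,
	TOY_PG_arch_2,
};

/* Hypothetical stand-in for struct page, reduced to its flags word. */
struct toy_page {
	unsigned long flags;
};

/* Subset of the flags carried over to tail pages before the patch. */
#define TOY_SPLIT_MASK_OLD ((1UL << TOY_PG_workingset) |	\
			    (1UL << TOY_PG_locked) |		\
			    (1UL << TOY_PG_unevictable) |	\
			    (1UL << TOY_PG_dirty))

/* Same set with the two arch flags added, as the patch proposes. */
#define TOY_SPLIT_MASK_NEW (TOY_SPLIT_MASK_OLD |		\
			    (1UL << TOY_PG_arch_1) |		\
			    (1UL << TOY_PG_arch_2))

/* Transfer only the head flags selected by mask into the tail page. */
static void toy_split_tail(const struct toy_page *head,
			   struct toy_page *tail, unsigned long mask)
{
	assert(head != NULL && tail != NULL);
	tail->flags |= head->flags & mask;
}
```

In this model, a head page with TOY_PG_dirty and TOY_PG_arch_2 set yields a tail page with only TOY_PG_dirty under the old mask, which corresponds to the lost "valid tags" marker the commit message warns about. Under the new mask the tail keeps both bits.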