Message ID | 20240204082627.3892816-3-tongtiangen@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | minor improvements for x86 mce processing | expand |
On Sun, Feb 04, 2024 at 04:26:26PM +0800, Tong Tiangen wrote: > diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c > index bca780fa5e57..b2cce1b6c96d 100644 > --- a/arch/x86/kernel/cpu/mce/severity.c > +++ b/arch/x86/kernel/cpu/mce/severity.c > @@ -292,11 +292,11 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs) > case EX_TYPE_UACCESS: > if (!copy_user) > return IN_KERNEL; > + fallthrough; > + case EX_TYPE_DEFAULT_MCE_SAFE: > m->kflags |= MCE_IN_KERNEL_COPYIN; > fallthrough; I knew something was still bugging me here and this is still wrong. Let's imagine this flow: copy_mc_to_user() - note *src is kernel memory |-> copy_mc_enhanced_fast_string or copy_mc_fragile - it's the same thing |-> -#MC, exception type EX_TYPE_DEFAULT_MCE_SAFE |-> error_context(): case EX_TYPE_DEFAULT_MCE_SAFE: m->kflags |= MCE_IN_KERNEL_COPYIN; MCE_IN_KERNEL_COPYIN does kill_me_never(): pr_err("Kernel accessed poison in user space at %llx\n", p->mce_addr); but that's reading from kernel memory! IOW, I *think* that switch statement should be this: switch (fixup_type) { case EX_TYPE_UACCESS: case EX_TYPE_DEFAULT_MCE_SAFE: if (!copy_user) return IN_KERNEL; m->kflags |= MCE_IN_KERNEL_COPYIN; fallthrough; case EX_TYPE_FAULT_MCE_SAFE: m->kflags |= MCE_IN_KERNEL_RECOV; return IN_KERNEL_RECOV; default: return IN_KERNEL; } Provided I'm not missing a case and provided is_copy_from_user() really detects all cases properly. And then patch 3 is wrong because we only can handle "copy in" - not just any copy. Thx.
在 2024/2/7 20:29, Borislav Petkov 写道: > On Sun, Feb 04, 2024 at 04:26:26PM +0800, Tong Tiangen wrote: >> diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c >> index bca780fa5e57..b2cce1b6c96d 100644 >> --- a/arch/x86/kernel/cpu/mce/severity.c >> +++ b/arch/x86/kernel/cpu/mce/severity.c >> @@ -292,11 +292,11 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs) >> case EX_TYPE_UACCESS: >> if (!copy_user) >> return IN_KERNEL; >> + fallthrough; >> + case EX_TYPE_DEFAULT_MCE_SAFE: >> m->kflags |= MCE_IN_KERNEL_COPYIN; >> fallthrough; > > I knew something was still bugging me here and this is still wrong. > > Let's imagine this flow: > > copy_mc_to_user() - note *src is kernel memory > |-> copy_mc_enhanced_fast_string or copy_mc_fragile - it's the same thing > |-> -#MC, exception type EX_TYPE_DEFAULT_MCE_SAFE > |-> error_context(): > case EX_TYPE_DEFAULT_MCE_SAFE: > m->kflags |= MCE_IN_KERNEL_COPYIN; > > MCE_IN_KERNEL_COPYIN does kill_me_never(): > > pr_err("Kernel accessed poison in user space at %llx\n", p->mce_addr); > > but that's reading from kernel memory! > > IOW, I *think* that switch statement should be this: > > switch (fixup_type) { > case EX_TYPE_UACCESS: > case EX_TYPE_DEFAULT_MCE_SAFE: > if (!copy_user) > return IN_KERNEL; > > m->kflags |= MCE_IN_KERNEL_COPYIN; > fallthrough; > > case EX_TYPE_FAULT_MCE_SAFE: > m->kflags |= MCE_IN_KERNEL_RECOV; > return IN_KERNEL_RECOV; > > default: > return IN_KERNEL; > } > > Provided I'm not missing a case and provided is_copy_from_user() really > detects all cases properly. > > And then patch 3 is wrong because we only can handle "copy in" - not > just any copy. That makes sense. I'll check that point out later. Many thanks. Tong. > > Thx. >
在 2024/2/7 20:29, Borislav Petkov 写道: > On Sun, Feb 04, 2024 at 04:26:26PM +0800, Tong Tiangen wrote: >> diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c >> index bca780fa5e57..b2cce1b6c96d 100644 >> --- a/arch/x86/kernel/cpu/mce/severity.c >> +++ b/arch/x86/kernel/cpu/mce/severity.c >> @@ -292,11 +292,11 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs) >> case EX_TYPE_UACCESS: >> if (!copy_user) >> return IN_KERNEL; >> + fallthrough; >> + case EX_TYPE_DEFAULT_MCE_SAFE: >> m->kflags |= MCE_IN_KERNEL_COPYIN; >> fallthrough; > > I knew something was still bugging me here and this is still wrong. > > Let's imagine this flow: > > copy_mc_to_user() - note *src is kernel memory > |-> copy_mc_enhanced_fast_string or copy_mc_fragile - it's the same thing > |-> -#MC, exception type EX_TYPE_DEFAULT_MCE_SAFE > |-> error_context(): > case EX_TYPE_DEFAULT_MCE_SAFE: > m->kflags |= MCE_IN_KERNEL_COPYIN; > > MCE_IN_KERNEL_COPYIN does kill_me_never(): > > pr_err("Kernel accessed poison in user space at %llx\n", p->mce_addr); > > but that's reading from kernel memory! Hi: 1. The copy_mc_to_kernel() is used in the coredump, KSM, and COW scenarios, in these scenarios, the src mem stores the user data and the kernel use kernel address to access the src mem(using kmap()). 2. the src mem of copy_mc_to_user() is currently only used by the DAX: dax_iomap_iter() -> dax_copy_to_iter() -> _copy_mc_to_iter -> copy_to_user_iter_mc() -> copy_mc_to_user() DAX is also used to store user data,such as pmem,pmem uses the kernel address to access src mem(memremap_pages()): pmem_attach_disk() -> devm_memremap_pages() -> memremap_pages() 3. EX_TYPE_DEFAULT_MCE_SAFE is only used in copy_mc_to_user()/copy_mc_to_kernel()。 4. Therefore, for EX_TYPE_DEFAULT_MCE_SAFE, the memory page where the hardware error occurs stores user data, these page can be securely isolated. This is different from UACCESS, which can be securely isolated only COPYIN(the src mem is user data) is checked. Based on the above understanding, I think the original logic should be fine, except for the pr_err() in kill_me_never(). Thanks. Tong. > > IOW, I *think* that switch statement should be this: > > switch (fixup_type) { > case EX_TYPE_UACCESS: > case EX_TYPE_DEFAULT_MCE_SAFE: > if (!copy_user) > return IN_KERNEL; > > m->kflags |= MCE_IN_KERNEL_COPYIN; > fallthrough; > > case EX_TYPE_FAULT_MCE_SAFE: > m->kflags |= MCE_IN_KERNEL_RECOV; > return IN_KERNEL_RECOV; > > default: > return IN_KERNEL; > } > > Provided I'm not missing a case and provided is_copy_from_user() really > detects all cases properly. > > And then patch 3 is wrong because we only can handle "copy in" - not > just any copy. > > Thx. >
Hi Petkov: Kindly ping... Thanks, Tong. 在 2024/2/18 18:08, Tong Tiangen 写道: > > > 在 2024/2/7 20:29, Borislav Petkov 写道: >> On Sun, Feb 04, 2024 at 04:26:26PM +0800, Tong Tiangen wrote: >>> diff --git a/arch/x86/kernel/cpu/mce/severity.c >>> b/arch/x86/kernel/cpu/mce/severity.c >>> index bca780fa5e57..b2cce1b6c96d 100644 >>> --- a/arch/x86/kernel/cpu/mce/severity.c >>> +++ b/arch/x86/kernel/cpu/mce/severity.c >>> @@ -292,11 +292,11 @@ static noinstr int error_context(struct mce *m, >>> struct pt_regs *regs) >>> case EX_TYPE_UACCESS: >>> if (!copy_user) >>> return IN_KERNEL; >>> + fallthrough; >>> + case EX_TYPE_DEFAULT_MCE_SAFE: >>> m->kflags |= MCE_IN_KERNEL_COPYIN; >>> fallthrough; >> >> I knew something was still bugging me here and this is still wrong. >> >> Let's imagine this flow: >> >> copy_mc_to_user() - note *src is kernel memory >> |-> copy_mc_enhanced_fast_string or copy_mc_fragile - it's the same thing >> |-> -#MC, exception type EX_TYPE_DEFAULT_MCE_SAFE >> |-> error_context(): >> case EX_TYPE_DEFAULT_MCE_SAFE: >> m->kflags |= MCE_IN_KERNEL_COPYIN; >> >> MCE_IN_KERNEL_COPYIN does kill_me_never(): >> >> pr_err("Kernel accessed poison in user space at %llx\n", >> p->mce_addr); >> >> but that's reading from kernel memory! > > Hi: > > 1. The copy_mc_to_kernel() is used in the coredump, KSM, and COW > scenarios, in these scenarios, the src mem stores the user data and the > kernel use kernel address to access the src mem(using kmap()). > > 2. the src mem of copy_mc_to_user() is currently only used by the DAX: > > dax_iomap_iter() > -> dax_copy_to_iter() > -> _copy_mc_to_iter > -> copy_to_user_iter_mc() > -> copy_mc_to_user() > > DAX is also used to store user data,such as pmem,pmem uses the kernel > address to access src mem(memremap_pages()): > > pmem_attach_disk() > -> devm_memremap_pages() > -> memremap_pages() > > 3. EX_TYPE_DEFAULT_MCE_SAFE is only used in > copy_mc_to_user()/copy_mc_to_kernel()。 > > 4. Therefore, for EX_TYPE_DEFAULT_MCE_SAFE, the memory page where the > hardware error occurs stores user data, these page can be securely > isolated. This is different from UACCESS, which can be securely isolated > only COPYIN(the src mem is user data) is checked. > > Based on the above understanding, I think the original logic should be > fine, except for the pr_err() in kill_me_never(). > > Thanks. > Tong. > >> >> IOW, I *think* that switch statement should be this: >> >> switch (fixup_type) { >> case EX_TYPE_UACCESS: >> case EX_TYPE_DEFAULT_MCE_SAFE: >> if (!copy_user) >> return IN_KERNEL; >> >> m->kflags |= MCE_IN_KERNEL_COPYIN; >> fallthrough; >> >> case EX_TYPE_FAULT_MCE_SAFE: >> m->kflags |= MCE_IN_KERNEL_RECOV; >> return IN_KERNEL_RECOV; >> >> default: >> return IN_KERNEL; >> } >> >> Provided I'm not missing a case and provided is_copy_from_user() really >> detects all cases properly. >> >> And then patch 3 is wrong because we only can handle "copy in" - not >> just any copy. >> >> Thx. >>
On Sun, Feb 18, 2024 at 06:08:14PM +0800, Tong Tiangen wrote: > 1. The copy_mc_to_kernel() is used in the coredump, KSM, and COW > scenarios, in these scenarios, the src mem stores the user data and the > kernel use kernel address to access the src mem(using kmap()). > > 2. the src mem of copy_mc_to_user() is currently only used by the DAX: You mean just because it currently is used somewhere which probably is ok - no clue what DAX does - and even if the source address is still *kernel* memory and even at the danger that someone else might use it in the future and think the handling on a potential #MC is ok, you're still arguing that this is the right thing to do perhaps because it fits your use case?! Sorry Tiangen, not gonna happen.
在 2024/3/28 6:05, Borislav Petkov 写道: > On Sun, Feb 18, 2024 at 06:08:14PM +0800, Tong Tiangen wrote: >> 1. The copy_mc_to_kernel() is used in the coredump, KSM, and COW >> scenarios, in these scenarios, the src mem stores the user data and the >> kernel use kernel address to access the src mem(using kmap()). >> >> 2. the src mem of copy_mc_to_user() is currently only used by the DAX: > > You mean just because it currently is used somewhere which probably is > ok - no clue what DAX does - and even if the source address is still > *kernel* memory and even at the danger that someone else might use it in > the future and think the handling on a potential #MC is ok, you're still > arguing that this is the right thing to do perhaps because it fits your > use case?! > > Sorry Tiangen, not gonna happen. > I left the office last week and felt sorry for the lateness of the reply. You are right. Our current processing is based on "experience" rather than interface constraints. is_copy_from_user() determines whether a user is a "copy user" based on fault_in_kernel_space(). Therefore, it returns false for copy_mc_to_kernel()/copy_mc_to_user(). As a result, MCE_IN_KERNEL_COPYIN cannot be set in error_context(). Comprehensive consideration of all factors, it is better to manually call memory_failure_queue() to handle this problem case by case. Finally, do we consider accepting only the patch 1/3 ? Thanks, Tong.
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c index bca780fa5e57..b2cce1b6c96d 100644 --- a/arch/x86/kernel/cpu/mce/severity.c +++ b/arch/x86/kernel/cpu/mce/severity.c @@ -292,11 +292,11 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs) case EX_TYPE_UACCESS: if (!copy_user) return IN_KERNEL; + fallthrough; + case EX_TYPE_DEFAULT_MCE_SAFE: m->kflags |= MCE_IN_KERNEL_COPYIN; fallthrough; - case EX_TYPE_FAULT_MCE_SAFE: - case EX_TYPE_DEFAULT_MCE_SAFE: m->kflags |= MCE_IN_KERNEL_RECOV; return IN_KERNEL_RECOV; diff --git a/mm/ksm.c b/mm/ksm.c index 8c001819cf10..ba9d324ea1c6 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -3084,7 +3084,6 @@ struct folio *ksm_might_need_to_copy(struct folio *folio, if (copy_mc_user_highpage(folio_page(new_folio, 0), page, addr, vma)) { folio_put(new_folio); - memory_failure_queue(folio_pfn(folio), 0); return ERR_PTR(-EHWPOISON); } folio_set_dirty(new_folio); diff --git a/mm/memory.c b/mm/memory.c index 8d14ba440929..ee06a8f766ab 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2846,10 +2846,8 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src, unsigned long addr = vmf->address; if (likely(src)) { - if (copy_mc_user_highpage(dst, src, addr, vma)) { - memory_failure_queue(page_to_pfn(src), 0); + if (copy_mc_user_highpage(dst, src, addr, vma)) return -EHWPOISON; - } return 0; } @@ -6179,10 +6177,8 @@ static int copy_user_gigantic_page(struct folio *dst, struct folio *src, cond_resched(); if (copy_mc_user_highpage(dst_page, src_page, - addr + i*PAGE_SIZE, vma)) { - memory_failure_queue(page_to_pfn(src_page), 0); + addr + i*PAGE_SIZE, vma)) return -EHWPOISON; - } } return 0; } @@ -6199,10 +6195,9 @@ static int copy_subpage(unsigned long addr, int idx, void *arg) struct page *dst = nth_page(copy_arg->dst, idx); struct page *src = nth_page(copy_arg->src, idx); - if (copy_mc_user_highpage(dst, src, addr, copy_arg->vma)) { - memory_failure_queue(page_to_pfn(src), 0); + if (copy_mc_user_highpage(dst, src, addr, copy_arg->vma)) return -EHWPOISON; - } + return 0; }