Message ID | 20220704161753.4033684-1-atomlin@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] module: kallsyms: Ensure preemption in add_kallsyms() with PREEMPT_RT | expand |
Hey Aaron, thanks again! On Mon, Jul 04, 2022 at 05:17:53PM +0100, Aaron Tomlin wrote: > To disable preemption in the context of add_kallsyms() is incorrect. Why, what broke? Did this used to work? Was the commit in question a regression then? Clarifying all this will help a lot. The commit log is better but yet doesn't make it easy for me to tell if I should send this to Linus as a fix in the rc series. Luis > Before kallsyms-specific data is prepared/or set-up, we ensure that > the unformed module is known to be unique i.e. does not already exist > (see load_module()). Therefore, we can fix this by using the common RCU > primitive as this section of code can be safely preempted. > > Reported-by: Steven Rostedt <rostedt@goodmis.org> > Fixes: 08126db5ff73 ("module: kallsyms: Fix suspicious rcu usage") > Signed-off-by: Aaron Tomlin <atomlin@redhat.com> > --- > kernel/module/kallsyms.c | 24 ++++++++++++------------ > 1 file changed, 12 insertions(+), 12 deletions(-) > > diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c > index 3e11523bc6f6..0b6fd82d5898 100644 > --- a/kernel/module/kallsyms.c > +++ b/kernel/module/kallsyms.c > @@ -174,14 +174,14 @@ void add_kallsyms(struct module *mod, const struct load_info *info) > mod->kallsyms = (void __rcu *)mod->init_layout.base + > info->mod_kallsyms_init_off; > > - preempt_disable(); > + rcu_read_lock(); > /* The following is safe since this pointer cannot change */ > - rcu_dereference_sched(mod->kallsyms)->symtab = (void *)symsec->sh_addr; > - rcu_dereference_sched(mod->kallsyms)->num_symtab = symsec->sh_size / sizeof(Elf_Sym); > + rcu_dereference(mod->kallsyms)->symtab = (void *)symsec->sh_addr; > + rcu_dereference(mod->kallsyms)->num_symtab = symsec->sh_size / sizeof(Elf_Sym); > /* Make sure we get permanent strtab: don't use info->strtab. */ > - rcu_dereference_sched(mod->kallsyms)->strtab = > + rcu_dereference(mod->kallsyms)->strtab = > (void *)info->sechdrs[info->index.str].sh_addr; > - rcu_dereference_sched(mod->kallsyms)->typetab = mod->init_layout.base + info->init_typeoffs; > + rcu_dereference(mod->kallsyms)->typetab = mod->init_layout.base + info->init_typeoffs; > > /* > * Now populate the cut down core kallsyms for after init > @@ -190,22 +190,22 @@ void add_kallsyms(struct module *mod, const struct load_info *info) > mod->core_kallsyms.symtab = dst = mod->data_layout.base + info->symoffs; > mod->core_kallsyms.strtab = s = mod->data_layout.base + info->stroffs; > mod->core_kallsyms.typetab = mod->data_layout.base + info->core_typeoffs; > - src = rcu_dereference_sched(mod->kallsyms)->symtab; > - for (ndst = i = 0; i < rcu_dereference_sched(mod->kallsyms)->num_symtab; i++) { > - rcu_dereference_sched(mod->kallsyms)->typetab[i] = elf_type(src + i, info); > + src = rcu_dereference(mod->kallsyms)->symtab; > + for (ndst = i = 0; i < rcu_dereference(mod->kallsyms)->num_symtab; i++) { > + rcu_dereference(mod->kallsyms)->typetab[i] = elf_type(src + i, info); > if (i == 0 || is_livepatch_module(mod) || > is_core_symbol(src + i, info->sechdrs, info->hdr->e_shnum, > info->index.pcpu)) { > mod->core_kallsyms.typetab[ndst] = > - rcu_dereference_sched(mod->kallsyms)->typetab[i]; > + rcu_dereference(mod->kallsyms)->typetab[i]; > dst[ndst] = src[i]; > dst[ndst++].st_name = s - mod->core_kallsyms.strtab; > - s += strscpy(s, > - &rcu_dereference_sched(mod->kallsyms)->strtab[src[i].st_name], > + s += strlcpy(s, > + &rcu_dereference(mod->kallsyms)->strtab[src[i].st_name], > KSYM_NAME_LEN) + 1; > } > } > - preempt_enable(); > + rcu_read_unlock(); > mod->core_kallsyms.num_symtab = ndst; > } > > -- > 2.34.3 >
On Wed 2022-07-06 10:58 -0700, Luis Chamberlain wrote: > Hey Aaron, thanks again! Hi Luis, No problem :) > On Mon, Jul 04, 2022 at 05:17:53PM +0100, Aaron Tomlin wrote: > > To disable preemption in the context of add_kallsyms() is incorrect. > > Why, what broke? Did this used to work? Was the commit in question a > regression then? Clarifying all this will help a lot. Sorry for the confusion! If I understand correctly, nothing broke intrinsically. Rather with commit 08126db5ff73 ("module: kallsyms: Fix suspicious rcu usage") under PREEMPT_RT=y, by disabling preemption, I introduced an unbounded latency since the loop is not fixed which is generally frowned upon. So, I would say this was a regression since earlier preemption was not disabled and we would dereference RCU-protected pointers explicitly i.e. without using the more appropriate rcu_dereference() family of primitives. That being said, these pointers cannot change in this context as explained previously. Would the above be suitable - just to confirm before I send another iteration? Kind regards,
On Thu, Jul 07, 2022 at 05:57:50PM +0100, Aaron Tomlin wrote: > On Wed 2022-07-06 10:58 -0700, Luis Chamberlain wrote: > > Hey Aaron, thanks again! > > Hi Luis, > > No problem :) > > > On Mon, Jul 04, 2022 at 05:17:53PM +0100, Aaron Tomlin wrote: > > > To disable preemption in the context of add_kallsyms() is incorrect. > > > > Why, what broke? Did this used to work? Was the commit in question a > > regression then? Clarifying all this will help a lot. > > Sorry for the confusion! If I understand correctly, nothing broke > intrinsically. > > Rather with commit 08126db5ff73 ("module: kallsyms: Fix suspicious rcu > usage") under PREEMPT_RT=y, by disabling preemption, I introduced an > unbounded latency since the loop is not fixed which is generally frowned > upon. This is incredibly important information which should be added to the commit log, specialy as PREEMPT_RT=y becomes a first class citizen. > So, I would say this was a regression since earlier preemption was > not disabled and we would dereference RCU-protected pointers explicitly > i.e. without using the more appropriate rcu_dereference() family > of primitives. That being said, these pointers cannot change in this > context as explained previously. > > Would the above be suitable - just to confirm before I send another > iteration? Yes, I would send this to Linus for the rc series. Please adjust the commit log with all this information. BTW I think there is just one more fix pending from you right? Luis
On Thu 2022-07-07 10:20 -0700, Luis Chamberlain wrote: > This is incredibly important information which should be added to the > commit log, specialy as PREEMPT_RT=y becomes a first class citizen. Understood. > > > So, I would say this was a regression since earlier preemption was > > not disabled and we would dereference RCU-protected pointers explicitly > > i.e. without using the more appropriate rcu_dereference() family > > of primitives. That being said, these pointers cannot change in this > > context as explained previously. > > > > Would the above be suitable - just to confirm before I send another > > iteration? > > Yes, I would send this to Linus for the rc series. Please adjust the > commit log with all this information. Will do. > BTW I think there is just one more fix pending from you right? Yes - I will send/or prepare it as a partial revert: 's/strscpy/strlcpy/' with a brief explanation. Kind regards,
On Thu, Jul 07, 2022 at 07:56:19PM +0100, Aaron Tomlin wrote: > On Thu 2022-07-07 10:20 -0700, Luis Chamberlain wrote: > > This is incredibly important information which should be added to the > > commit log, specialy as PREEMPT_RT=y becomes a first class citizen. > > Understood. > > > > > > So, I would say this was a regression since earlier preemption was > > > not disabled and we would dereference RCU-protected pointers explicitly > > > i.e. without using the more appropriate rcu_dereference() family > > > of primitives. That being said, these pointers cannot change in this > > > context as explained previously. > > > > > > Would the above be suitable - just to confirm before I send another > > > iteration? > > > > Yes, I would send this to Linus for the rc series. Please adjust the > > commit log with all this information. > > Will do. > > > BTW I think there is just one more fix pending from you right? > > Yes - I will send/or prepare it as a partial revert: > 's/strscpy/strlcpy/' with a brief explanation. If that is just the kallsyms fix Adrian Hunter already sent a fix: https://lkml.kernel.org/r/20220701094403.3044-1-adrian.hunter@intel.com Luis
diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c index 3e11523bc6f6..0b6fd82d5898 100644 --- a/kernel/module/kallsyms.c +++ b/kernel/module/kallsyms.c @@ -174,14 +174,14 @@ void add_kallsyms(struct module *mod, const struct load_info *info) mod->kallsyms = (void __rcu *)mod->init_layout.base + info->mod_kallsyms_init_off; - preempt_disable(); + rcu_read_lock(); /* The following is safe since this pointer cannot change */ - rcu_dereference_sched(mod->kallsyms)->symtab = (void *)symsec->sh_addr; - rcu_dereference_sched(mod->kallsyms)->num_symtab = symsec->sh_size / sizeof(Elf_Sym); + rcu_dereference(mod->kallsyms)->symtab = (void *)symsec->sh_addr; + rcu_dereference(mod->kallsyms)->num_symtab = symsec->sh_size / sizeof(Elf_Sym); /* Make sure we get permanent strtab: don't use info->strtab. */ - rcu_dereference_sched(mod->kallsyms)->strtab = + rcu_dereference(mod->kallsyms)->strtab = (void *)info->sechdrs[info->index.str].sh_addr; - rcu_dereference_sched(mod->kallsyms)->typetab = mod->init_layout.base + info->init_typeoffs; + rcu_dereference(mod->kallsyms)->typetab = mod->init_layout.base + info->init_typeoffs; /* * Now populate the cut down core kallsyms for after init @@ -190,22 +190,22 @@ void add_kallsyms(struct module *mod, const struct load_info *info) mod->core_kallsyms.symtab = dst = mod->data_layout.base + info->symoffs; mod->core_kallsyms.strtab = s = mod->data_layout.base + info->stroffs; mod->core_kallsyms.typetab = mod->data_layout.base + info->core_typeoffs; - src = rcu_dereference_sched(mod->kallsyms)->symtab; - for (ndst = i = 0; i < rcu_dereference_sched(mod->kallsyms)->num_symtab; i++) { - rcu_dereference_sched(mod->kallsyms)->typetab[i] = elf_type(src + i, info); + src = rcu_dereference(mod->kallsyms)->symtab; + for (ndst = i = 0; i < rcu_dereference(mod->kallsyms)->num_symtab; i++) { + rcu_dereference(mod->kallsyms)->typetab[i] = elf_type(src + i, info); if (i == 0 || is_livepatch_module(mod) || is_core_symbol(src + i, info->sechdrs, info->hdr->e_shnum, info->index.pcpu)) { mod->core_kallsyms.typetab[ndst] = - rcu_dereference_sched(mod->kallsyms)->typetab[i]; + rcu_dereference(mod->kallsyms)->typetab[i]; dst[ndst] = src[i]; dst[ndst++].st_name = s - mod->core_kallsyms.strtab; - s += strscpy(s, - &rcu_dereference_sched(mod->kallsyms)->strtab[src[i].st_name], + s += strlcpy(s, + &rcu_dereference(mod->kallsyms)->strtab[src[i].st_name], KSYM_NAME_LEN) + 1; } } - preempt_enable(); + rcu_read_unlock(); mod->core_kallsyms.num_symtab = ndst; }
To disable preemption in the context of add_kallsyms() is incorrect. Before kallsyms-specific data is prepared/or set-up, we ensure that the unformed module is known to be unique i.e. does not already exist (see load_module()). Therefore, we can fix this by using the common RCU primitive as this section of code can be safely preempted. Reported-by: Steven Rostedt <rostedt@goodmis.org> Fixes: 08126db5ff73 ("module: kallsyms: Fix suspicious rcu usage") Signed-off-by: Aaron Tomlin <atomlin@redhat.com> --- kernel/module/kallsyms.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-)