Message ID | 20250121095739.986006-7-rppt@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86/module: rework ROX cache to avoid writable copy | expand |
On 1/21/25 10:57, Mike Rapoport wrote: > In order to use execmem's API for temporal remapping of the memory > allocated from ROX cache as writable, there is a need to distinguish > between the state when the module is being formed and the state when it is > deconstructed and freed so that when module_memory_free() is called from > error paths during module loading it could restore ROX mappings. > > Replace open coded checks for MODULE_STATE_UNFORMED with a helper > function module_is_formed() and add a new MODULE_STATE_GONE that will be > set when the module is deconstructed and freed. I don't fully follow why this case requires a new module state. My understanding it that the function load_module() has the necessary context that after calling layout_and_allocate(), the updated ROX mappings need to be restored. I would then expect the function to be appropriately able to unwind this operation in case of an error. It could be done by having a helper that walks the mappings and calls execmem_restore_rox(), or if you want to keep it in module_memory_free() as done in the patch #7 then a flag could be passed down to module_deallocate() -> free_mod_mem() -> module_memory_free()? It is at least good that MODULE_STATE_GONE is only set in free_module() past the sysfs teardown, so it never shows in /sys/module/<mod>/initstate. Otherwise, this would require teaching kmod about this state as well.
On Thu, Jan 23, 2025 at 03:16:28PM +0100, Petr Pavlu wrote: > On 1/21/25 10:57, Mike Rapoport wrote: > > In order to use execmem's API for temporal remapping of the memory > > allocated from ROX cache as writable, there is a need to distinguish > > between the state when the module is being formed and the state when it is > > deconstructed and freed so that when module_memory_free() is called from > > error paths during module loading it could restore ROX mappings. > > > > Replace open coded checks for MODULE_STATE_UNFORMED with a helper > > function module_is_formed() and add a new MODULE_STATE_GONE that will be > > set when the module is deconstructed and freed. > > I don't fully follow why this case requires a new module state. My > understanding it that the function load_module() has the necessary > context that after calling layout_and_allocate(), the updated ROX > mappings need to be restored. I would then expect the function to be > appropriately able to unwind this operation in case of an error. It > could be done by having a helper that walks the mappings and calls > execmem_restore_rox(), or if you want to keep it in module_memory_free() > as done in the patch #7 then a flag could be passed down to > module_deallocate() -> free_mod_mem() -> module_memory_free()? Initially I wanted to track ROX <-> RW transitions in struct module_memory so that module_memory_free() could do the right thing depending on memory state. But that meant either ugly games with const'ness in strict_rwx.c, an additional helper or a new global module state. The latter seemed the most elegant to me. If a new global module state is really that intrusive, I can drop it in favor a helper that will be called from error handling paths. E.g. something like the patch below (on top of this series and with this patch reverted) diff --git a/kernel/module/main.c b/kernel/module/main.c index 7164cd353a78..4a02503836d7 100644 --- a/kernel/module/main.c +++ b/kernel/module/main.c @@ -1268,13 +1268,20 @@ static int module_memory_alloc(struct module *mod, enum mod_mem_type type) return 0; } +static void module_memory_restore_rox(struct module *mod) +{ + for_class_mod_mem_type(type, text) { + struct module_memory *mem = &mod->mem[type]; + + if (mem->is_rox) + execmem_restore_rox(mem->base, mem->size); + } +} + static void module_memory_free(struct module *mod, enum mod_mem_type type) { struct module_memory *mem = &mod->mem[type]; - if (mod->state == MODULE_STATE_UNFORMED && mem->is_rox) - execmem_restore_rox(mem->base, mem->size); - execmem_free(mem->base); } @@ -2617,6 +2624,7 @@ static int move_module(struct module *mod, struct load_info *info) return 0; out_err: + module_memory_restore_rox(mod); for (t--; t >= 0; t--) module_memory_free(mod, t); if (codetag_section_found) @@ -3372,6 +3380,7 @@ static int load_module(struct load_info *info, const char __user *uargs, mod->mem[type].size); } + module_memory_restore_rox(mod); module_deallocate(mod, info); free_copy: /* > It is at least good that MODULE_STATE_GONE is only set in free_module() > past the sysfs teardown, so it never shows in > /sys/module/<mod>/initstate. Otherwise, this would require teaching kmod > about this state as well. > > -- > Thanks, > Petr
On 1/24/25 12:06, Mike Rapoport wrote: > On Thu, Jan 23, 2025 at 03:16:28PM +0100, Petr Pavlu wrote: >> On 1/21/25 10:57, Mike Rapoport wrote: >>> In order to use execmem's API for temporal remapping of the memory >>> allocated from ROX cache as writable, there is a need to distinguish >>> between the state when the module is being formed and the state when it is >>> deconstructed and freed so that when module_memory_free() is called from >>> error paths during module loading it could restore ROX mappings. >>> >>> Replace open coded checks for MODULE_STATE_UNFORMED with a helper >>> function module_is_formed() and add a new MODULE_STATE_GONE that will be >>> set when the module is deconstructed and freed. >> >> I don't fully follow why this case requires a new module state. My >> understanding it that the function load_module() has the necessary >> context that after calling layout_and_allocate(), the updated ROX >> mappings need to be restored. I would then expect the function to be >> appropriately able to unwind this operation in case of an error. It >> could be done by having a helper that walks the mappings and calls >> execmem_restore_rox(), or if you want to keep it in module_memory_free() >> as done in the patch #7 then a flag could be passed down to >> module_deallocate() -> free_mod_mem() -> module_memory_free()? > > Initially I wanted to track ROX <-> RW transitions in struct module_memory > so that module_memory_free() could do the right thing depending on memory > state. But that meant either ugly games with const'ness in strict_rwx.c, > an additional helper or a new global module state. The latter seemed the > most elegant to me. > If a new global module state is really that intrusive, I can drop it in > favor a helper that will be called from error handling paths. E.g. > something like the patch below (on top of this series and with this patch > reverted) > > diff --git a/kernel/module/main.c b/kernel/module/main.c > index 7164cd353a78..4a02503836d7 100644 > --- a/kernel/module/main.c > +++ b/kernel/module/main.c > @@ -1268,13 +1268,20 @@ static int module_memory_alloc(struct module *mod, enum mod_mem_type type) > return 0; > } > > +static void module_memory_restore_rox(struct module *mod) > +{ > + for_class_mod_mem_type(type, text) { > + struct module_memory *mem = &mod->mem[type]; > + > + if (mem->is_rox) > + execmem_restore_rox(mem->base, mem->size); > + } > +} > + > static void module_memory_free(struct module *mod, enum mod_mem_type type) > { > struct module_memory *mem = &mod->mem[type]; > > - if (mod->state == MODULE_STATE_UNFORMED && mem->is_rox) > - execmem_restore_rox(mem->base, mem->size); > - > execmem_free(mem->base); > } > > @@ -2617,6 +2624,7 @@ static int move_module(struct module *mod, struct load_info *info) > > return 0; > out_err: > + module_memory_restore_rox(mod); > for (t--; t >= 0; t--) > module_memory_free(mod, t); > if (codetag_section_found) > @@ -3372,6 +3380,7 @@ static int load_module(struct load_info *info, const char __user *uargs, > mod->mem[type].size); > } > > + module_memory_restore_rox(mod); > module_deallocate(mod, info); > free_copy: > /* > This looks better to me. My view is that the module_state tracks major stages of a module during its lifecycle. It provides information to the module loader itself, other subsystems that need to closely interact with modules, and to the userspace via the initstate sysfs attribute. Adding a new state means potentially more complexity for all these parts. In this case, the state was needed because of a logic that is local only to the module loader, or even just to the function load_module(). I think it is better to avoid adding a new state only for that.
On Fri 2025-01-24 13:59:55, Petr Pavlu wrote: > On 1/24/25 12:06, Mike Rapoport wrote: > > On Thu, Jan 23, 2025 at 03:16:28PM +0100, Petr Pavlu wrote: > >> On 1/21/25 10:57, Mike Rapoport wrote: > >>> In order to use execmem's API for temporal remapping of the memory > >>> allocated from ROX cache as writable, there is a need to distinguish > >>> between the state when the module is being formed and the state when it is > >>> deconstructed and freed so that when module_memory_free() is called from > >>> error paths during module loading it could restore ROX mappings. > >>> > >>> Replace open coded checks for MODULE_STATE_UNFORMED with a helper > >>> function module_is_formed() and add a new MODULE_STATE_GONE that will be > >>> set when the module is deconstructed and freed. > >> > >> I don't fully follow why this case requires a new module state. My > >> understanding it that the function load_module() has the necessary > >> context that after calling layout_and_allocate(), the updated ROX > >> mappings need to be restored. I would then expect the function to be > >> appropriately able to unwind this operation in case of an error. It > >> could be done by having a helper that walks the mappings and calls > >> execmem_restore_rox(), or if you want to keep it in module_memory_free() > >> as done in the patch #7 then a flag could be passed down to > >> module_deallocate() -> free_mod_mem() -> module_memory_free()? > > > > Initially I wanted to track ROX <-> RW transitions in struct module_memory > > so that module_memory_free() could do the right thing depending on memory > > state. But that meant either ugly games with const'ness in strict_rwx.c, > > an additional helper or a new global module state. The latter seemed the > > most elegant to me. > > If a new global module state is really that intrusive, I can drop it in > > favor a helper that will be called from error handling paths. E.g. > > something like the patch below (on top of this series and with this patch > > reverted) > > > > diff --git a/kernel/module/main.c b/kernel/module/main.c > > index 7164cd353a78..4a02503836d7 100644 > > --- a/kernel/module/main.c > > +++ b/kernel/module/main.c > > @@ -1268,13 +1268,20 @@ static int module_memory_alloc(struct module *mod, enum mod_mem_type type) > > return 0; > > } > > > > +static void module_memory_restore_rox(struct module *mod) > > +{ > > + for_class_mod_mem_type(type, text) { > > + struct module_memory *mem = &mod->mem[type]; > > + > > + if (mem->is_rox) > > + execmem_restore_rox(mem->base, mem->size); > > + } > > +} > > + > > static void module_memory_free(struct module *mod, enum mod_mem_type type) > > { > > struct module_memory *mem = &mod->mem[type]; > > > > - if (mod->state == MODULE_STATE_UNFORMED && mem->is_rox) > > - execmem_restore_rox(mem->base, mem->size); > > - > > execmem_free(mem->base); > > } > > > > @@ -2617,6 +2624,7 @@ static int move_module(struct module *mod, struct load_info *info) > > > > return 0; > > out_err: > > + module_memory_restore_rox(mod); > > for (t--; t >= 0; t--) > > module_memory_free(mod, t); > > if (codetag_section_found) > > @@ -3372,6 +3380,7 @@ static int load_module(struct load_info *info, const char __user *uargs, > > mod->mem[type].size); > > } > > > > + module_memory_restore_rox(mod); > > module_deallocate(mod, info); > > free_copy: > > /* > > > > This looks better to me. > > My view is that the module_state tracks major stages of a module during > its lifecycle. It provides information to the module loader itself, > other subsystems that need to closely interact with modules, and to the > userspace via the initstate sysfs attribute. > > Adding a new state means potentially more complexity for all these > parts. In this case, the state was needed because of a logic that is > local only to the module loader, or even just to the function > load_module(). I think it is better to avoid adding a new state only for > that. I fully agree here. The added complexity is already visible in the original patch. It updates about 15 locations where mod->state is checked. Every location should be reviewed whether the change is correct. The changes are spread in various subsystems, like kallsyms, kdb, tracepoint, livepatch. Many people need to understand the meaning of the new state and decide if the change is OK. So, it affects many people and touches 15 locations where things my go wrong. The alternative solution, proposed above, looks much easier to me. Best Regards, Petr
diff --git a/include/linux/module.h b/include/linux/module.h index b3a643435357..624a5317d5a5 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -320,6 +320,7 @@ enum module_state { MODULE_STATE_COMING, /* Full formed, running module_init. */ MODULE_STATE_GOING, /* Going away. */ MODULE_STATE_UNFORMED, /* Still setting it up. */ + MODULE_STATE_GONE, /* Deconstructing and freeing. */ }; struct mod_tree_node { @@ -620,6 +621,11 @@ static inline bool module_is_coming(struct module *mod) return mod->state == MODULE_STATE_COMING; } +static inline bool module_is_formed(struct module *mod) +{ + return mod->state < MODULE_STATE_UNFORMED; +} + struct module *__module_text_address(unsigned long addr); struct module *__module_address(unsigned long addr); bool is_module_address(unsigned long addr); diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c index bf65e0c3c86f..daf9a9b3740f 100644 --- a/kernel/module/kallsyms.c +++ b/kernel/module/kallsyms.c @@ -361,7 +361,7 @@ int lookup_module_symbol_name(unsigned long addr, char *symname) preempt_disable(); list_for_each_entry_rcu(mod, &modules, list) { - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) continue; if (within_module(addr, mod)) { const char *sym; @@ -389,7 +389,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type, list_for_each_entry_rcu(mod, &modules, list) { struct mod_kallsyms *kallsyms; - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) continue; kallsyms = rcu_dereference_sched(mod->kallsyms); if (symnum < kallsyms->num_symtab) { @@ -441,7 +441,7 @@ static unsigned long __module_kallsyms_lookup_name(const char *name) list_for_each_entry_rcu(mod, &modules, list) { unsigned long ret; - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) continue; ret = __find_kallsyms_symbol_value(mod, name); if (ret) @@ -484,7 +484,7 @@ int module_kallsyms_on_each_symbol(const char *modname, list_for_each_entry(mod, &modules, list) { struct mod_kallsyms *kallsyms; - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) continue; if (modname && strcmp(modname, mod->name)) diff --git a/kernel/module/kdb.c b/kernel/module/kdb.c index 995c32d3698f..14f14700ffc2 100644 --- a/kernel/module/kdb.c +++ b/kernel/module/kdb.c @@ -23,7 +23,7 @@ int kdb_lsmod(int argc, const char **argv) kdb_printf("Module Size modstruct Used by\n"); list_for_each_entry(mod, &modules, list) { - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) continue; kdb_printf("%-20s%8u", mod->name, mod->mem[MOD_TEXT].size); diff --git a/kernel/module/main.c b/kernel/module/main.c index 5399c182b3cb..ad8ef20c120f 100644 --- a/kernel/module/main.c +++ b/kernel/module/main.c @@ -153,7 +153,7 @@ EXPORT_SYMBOL(unregister_module_notifier); */ static inline int strong_try_module_get(struct module *mod) { - BUG_ON(mod && mod->state == MODULE_STATE_UNFORMED); + BUG_ON(mod && !module_is_formed(mod)); if (mod && mod->state == MODULE_STATE_COMING) return -EBUSY; if (try_module_get(mod)) @@ -361,7 +361,7 @@ bool find_symbol(struct find_symbol_arg *fsa) GPL_ONLY }, }; - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) continue; for (i = 0; i < ARRAY_SIZE(arr); i++) @@ -386,7 +386,7 @@ struct module *find_module_all(const char *name, size_t len, list_for_each_entry_rcu(mod, &modules, list, lockdep_is_held(&module_mutex)) { - if (!even_unformed && mod->state == MODULE_STATE_UNFORMED) + if (!even_unformed && !module_is_formed(mod)) continue; if (strlen(mod->name) == len && !memcmp(mod->name, name, len)) return mod; @@ -457,7 +457,7 @@ bool __is_module_percpu_address(unsigned long addr, unsigned long *can_addr) preempt_disable(); list_for_each_entry_rcu(mod, &modules, list) { - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) continue; if (!mod->percpu_size) continue; @@ -1326,7 +1326,7 @@ static void free_module(struct module *mod) * that noone uses it while it's being deconstructed. */ mutex_lock(&module_mutex); - mod->state = MODULE_STATE_UNFORMED; + mod->state = MODULE_STATE_GONE; mutex_unlock(&module_mutex); /* Arch-specific cleanup. */ @@ -3048,8 +3048,7 @@ static int module_patient_check_exists(const char *name, if (old == NULL) return 0; - if (old->state == MODULE_STATE_COMING || - old->state == MODULE_STATE_UNFORMED) { + if (old->state == MODULE_STATE_COMING || !module_is_formed(old)) { /* Wait in case it fails to load. */ mutex_unlock(&module_mutex); err = wait_event_interruptible(module_wq, @@ -3608,7 +3607,7 @@ char *module_flags(struct module *mod, char *buf, bool show_state) { int bx = 0; - BUG_ON(mod->state == MODULE_STATE_UNFORMED); + BUG_ON(!module_is_formed(mod)); if (!mod->taints && !show_state) goto out; if (mod->taints || @@ -3702,7 +3701,7 @@ struct module *__module_address(unsigned long addr) mod = mod_find(addr, &mod_tree); if (mod) { BUG_ON(!within_module(addr, mod)); - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) mod = NULL; } return mod; @@ -3756,7 +3755,7 @@ void print_modules(void) /* Most callers should already have preempt disabled, but make sure */ preempt_disable(); list_for_each_entry_rcu(mod, &modules, list) { - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) continue; pr_cont(" %s%s", mod->name, module_flags(mod, buf, true)); } diff --git a/kernel/module/procfs.c b/kernel/module/procfs.c index 0a4841e88adb..2c617e6f8bc0 100644 --- a/kernel/module/procfs.c +++ b/kernel/module/procfs.c @@ -79,7 +79,7 @@ static int m_show(struct seq_file *m, void *p) unsigned int size; /* We always ignore unformed modules. */ - if (mod->state == MODULE_STATE_UNFORMED) + if (!module_is_formed(mod)) return 0; size = module_total_size(mod); diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 1848ce7e2976..e94247afb2c6 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -668,6 +668,8 @@ static int tracepoint_module_notify(struct notifier_block *self, break; case MODULE_STATE_UNFORMED: break; + case MODULE_STATE_GONE: + break; } return notifier_from_errno(ret); } diff --git a/lib/kunit/test.c b/lib/kunit/test.c index 089c832e3cdb..54eaed92a2d3 100644 --- a/lib/kunit/test.c +++ b/lib/kunit/test.c @@ -836,6 +836,8 @@ static int kunit_module_notify(struct notifier_block *nb, unsigned long val, break; case MODULE_STATE_UNFORMED: break; + case MODULE_STATE_GONE: + break; } return 0; diff --git a/samples/livepatch/livepatch-callbacks-demo.c b/samples/livepatch/livepatch-callbacks-demo.c index 11c3f4357812..324bddaef9a6 100644 --- a/samples/livepatch/livepatch-callbacks-demo.c +++ b/samples/livepatch/livepatch-callbacks-demo.c @@ -93,6 +93,7 @@ static const char *const module_state[] = { [MODULE_STATE_COMING] = "[MODULE_STATE_COMING] Full formed, running module_init", [MODULE_STATE_GOING] = "[MODULE_STATE_GOING] Going away", [MODULE_STATE_UNFORMED] = "[MODULE_STATE_UNFORMED] Still setting it up", + [MODULE_STATE_GONE] = "[MODULE_STATE_GONE] Deconstructing and freeing", }; static void callback_info(const char *callback, struct klp_object *obj) diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_callbacks_demo.c b/tools/testing/selftests/livepatch/test_modules/test_klp_callbacks_demo.c index 3fd8fe1cd1cc..8435e3254f85 100644 --- a/tools/testing/selftests/livepatch/test_modules/test_klp_callbacks_demo.c +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_callbacks_demo.c @@ -16,6 +16,7 @@ static const char *const module_state[] = { [MODULE_STATE_COMING] = "[MODULE_STATE_COMING] Full formed, running module_init", [MODULE_STATE_GOING] = "[MODULE_STATE_GOING] Going away", [MODULE_STATE_UNFORMED] = "[MODULE_STATE_UNFORMED] Still setting it up", + [MODULE_STATE_GONE] = "[MODULE_STATE_GONE] Deconstructing and freeing", }; static void callback_info(const char *callback, struct klp_object *obj) diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_callbacks_demo2.c b/tools/testing/selftests/livepatch/test_modules/test_klp_callbacks_demo2.c index 5417573e80af..78c1fff5d977 100644 --- a/tools/testing/selftests/livepatch/test_modules/test_klp_callbacks_demo2.c +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_callbacks_demo2.c @@ -16,6 +16,7 @@ static const char *const module_state[] = { [MODULE_STATE_COMING] = "[MODULE_STATE_COMING] Full formed, running module_init", [MODULE_STATE_GOING] = "[MODULE_STATE_GOING] Going away", [MODULE_STATE_UNFORMED] = "[MODULE_STATE_UNFORMED] Still setting it up", + [MODULE_STATE_GONE] = "[MODULE_STATE_GONE] Deconstructing and freeing", }; static void callback_info(const char *callback, struct klp_object *obj) diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_state.c b/tools/testing/selftests/livepatch/test_modules/test_klp_state.c index 57a4253acb01..bdebf1d24c98 100644 --- a/tools/testing/selftests/livepatch/test_modules/test_klp_state.c +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_state.c @@ -18,6 +18,7 @@ static const char *const module_state[] = { [MODULE_STATE_COMING] = "[MODULE_STATE_COMING] Full formed, running module_init", [MODULE_STATE_GOING] = "[MODULE_STATE_GOING] Going away", [MODULE_STATE_UNFORMED] = "[MODULE_STATE_UNFORMED] Still setting it up", + [MODULE_STATE_GONE] = "[MODULE_STATE_GONE] Deconstructing and freeing", }; static void callback_info(const char *callback, struct klp_object *obj) diff --git a/tools/testing/selftests/livepatch/test_modules/test_klp_state2.c b/tools/testing/selftests/livepatch/test_modules/test_klp_state2.c index c978ea4d5e67..1a55f84a8eb3 100644 --- a/tools/testing/selftests/livepatch/test_modules/test_klp_state2.c +++ b/tools/testing/selftests/livepatch/test_modules/test_klp_state2.c @@ -18,6 +18,7 @@ static const char *const module_state[] = { [MODULE_STATE_COMING] = "[MODULE_STATE_COMING] Full formed, running module_init", [MODULE_STATE_GOING] = "[MODULE_STATE_GOING] Going away", [MODULE_STATE_UNFORMED] = "[MODULE_STATE_UNFORMED] Still setting it up", + [MODULE_STATE_GONE] = "[MODULE_STATE_GONE] Deconstructing and freeing", }; static void callback_info(const char *callback, struct klp_object *obj)