Message ID | 20200724050553.1724168-6-jarkko.sakkinen@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | arch/x86: kprobes: Remove MODULES dependency |
* Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> wrote:

> Use text_alloc() and text_free() instead of module_alloc() and
> module_memfree() when an arch provides them.
>
> Cc: linux-mm@kvack.org
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> ---
>  kernel/kprobes.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 4e46d96d4e16..611fcda9f6bf 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -40,6 +40,7 @@
>  #include <asm/cacheflush.h>
>  #include <asm/errno.h>
>  #include <linux/uaccess.h>
> +#include <linux/vmalloc.h>
>
>  #define KPROBE_HASH_BITS 6
>  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> @@ -111,12 +112,20 @@ enum kprobe_slot_state {
>
>  void __weak *alloc_insn_page(void)
>  {
> +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> +	return text_alloc(PAGE_SIZE);
> +#else
>  	return module_alloc(PAGE_SIZE);
> +#endif
>  }
>
>  void __weak free_insn_page(void *page)
>  {
> +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> +	text_free(page);
> +#else
>  	module_memfree(page);
> +#endif
>  }

I've read the observations in the other threads, but this #ifdef
jungle is silly, it's a de-facto open coded text_alloc() with a
module_alloc() fallback...

Thanks,

	Ingo

On Fri, Jul 24, 2020 at 08:05:52AM +0300, Jarkko Sakkinen wrote:
> Use text_alloc() and text_free() instead of module_alloc() and
> module_memfree() when an arch provides them.
>
> Cc: linux-mm@kvack.org
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> ---
>  kernel/kprobes.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 4e46d96d4e16..611fcda9f6bf 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -40,6 +40,7 @@
>  #include <asm/cacheflush.h>
>  #include <asm/errno.h>
>  #include <linux/uaccess.h>
> +#include <linux/vmalloc.h>
>
>  #define KPROBE_HASH_BITS 6
>  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> @@ -111,12 +112,20 @@ enum kprobe_slot_state {
>
>  void __weak *alloc_insn_page(void)
>  {
> +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> +	return text_alloc(PAGE_SIZE);
> +#else
>  	return module_alloc(PAGE_SIZE);
> +#endif
>  }
>
>  void __weak free_insn_page(void *page)
>  {
> +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> +	text_free(page);
> +#else
>  	module_memfree(page);
> +#endif
>  }

Both alloc_insn_page() and free_insn_page() are __weak and can be simply
overridden in arch/x86 code.

>
>  struct kprobe_insn_cache kprobe_insn_slots = {
> --
> 2.25.1
>

On Fri, 24 Jul 2020 at 12:27, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> wrote:
>
> > Use text_alloc() and text_free() instead of module_alloc() and
> > module_memfree() when an arch provides them.
> >
> > Cc: linux-mm@kvack.org
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > ---
> >  kernel/kprobes.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 4e46d96d4e16..611fcda9f6bf 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -40,6 +40,7 @@
> >  #include <asm/cacheflush.h>
> >  #include <asm/errno.h>
> >  #include <linux/uaccess.h>
> > +#include <linux/vmalloc.h>
> >
> >  #define KPROBE_HASH_BITS 6
> >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> >
> >  void __weak *alloc_insn_page(void)
> >  {
> > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > +	return text_alloc(PAGE_SIZE);
> > +#else
> >  	return module_alloc(PAGE_SIZE);
> > +#endif
> >  }
> >
> >  void __weak free_insn_page(void *page)
> >  {
> > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > +	text_free(page);
> > +#else
> >  	module_memfree(page);
> > +#endif
> >  }
>
> I've read the observations in the other threads, but this #ifdef
> jungle is silly, it's a de-facto open coded text_alloc() with a
> module_alloc() fallback...
>

Also, as I attempted to explain before, there is no reason to allocate
kasan shadow for any of these use cases, so cloning module_alloc() to
implement text_alloc() is not the correct approach even on x86.

I suppose module_alloc() could be reimplemented in terms of
text_alloc() in this case, but simply relabelling it like this seems
inappropriate on all architectures.

On Fri, 24 Jul 2020 13:27:48 +0300
Mike Rapoport <rppt@kernel.org> wrote:

> On Fri, Jul 24, 2020 at 08:05:52AM +0300, Jarkko Sakkinen wrote:
> > Use text_alloc() and text_free() instead of module_alloc() and
> > module_memfree() when an arch provides them.
> >
> > Cc: linux-mm@kvack.org
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > ---
> >  kernel/kprobes.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 4e46d96d4e16..611fcda9f6bf 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -40,6 +40,7 @@
> >  #include <asm/cacheflush.h>
> >  #include <asm/errno.h>
> >  #include <linux/uaccess.h>
> > +#include <linux/vmalloc.h>
> >
> >  #define KPROBE_HASH_BITS 6
> >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> >
> >  void __weak *alloc_insn_page(void)
> >  {
> > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > +	return text_alloc(PAGE_SIZE);
> > +#else
> >  	return module_alloc(PAGE_SIZE);
> > +#endif
> >  }
> >
> >  void __weak free_insn_page(void *page)
> >  {
> > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > +	text_free(page);
> > +#else
> >  	module_memfree(page);
> > +#endif
> >  }
>
> Both alloc_insn_page() and free_insn_page() are __weak and can be simply
> overridden in arch/x86 code.

No, we can't use module_alloc/memfree() without CONFIG_MODULES, so
we cannot escape from these #ifdefs. (and I think this is not so bad.)

Thank you,

On Fri, Jul 24, 2020 at 01:27:48PM +0300, Mike Rapoport wrote:
> On Fri, Jul 24, 2020 at 08:05:52AM +0300, Jarkko Sakkinen wrote:
> > Use text_alloc() and text_free() instead of module_alloc() and
> > module_memfree() when an arch provides them.
> >
> > Cc: linux-mm@kvack.org
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > ---
> >  kernel/kprobes.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 4e46d96d4e16..611fcda9f6bf 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -40,6 +40,7 @@
> >  #include <asm/cacheflush.h>
> >  #include <asm/errno.h>
> >  #include <linux/uaccess.h>
> > +#include <linux/vmalloc.h>
> >
> >  #define KPROBE_HASH_BITS 6
> >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> >
> >  void __weak *alloc_insn_page(void)
> >  {
> > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > +	return text_alloc(PAGE_SIZE);
> > +#else
> >  	return module_alloc(PAGE_SIZE);
> > +#endif
> >  }
> >
> >  void __weak free_insn_page(void *page)
> >  {
> > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > +	text_free(page);
> > +#else
> >  	module_memfree(page);
> > +#endif
> >  }
>
> Both alloc_insn_page() and free_insn_page() are __weak and can be simply
> overridden in arch/x86 code.

That is correct, but the override is obviously happening at linking
time. module_alloc() and module_memfree() will still leave a compile
time dependency on CONFIG_MODULES.

/Jarkko

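To make the link-time versus build-time point above concrete, here is a minimal userspace sketch, not kernel code. It assumes GCC or Clang for `__attribute__((weak))`. `HAVE_MODULES`, `module_alloc_stub()` and `PAGE_SIZE_STUB` are hypothetical stand-ins for CONFIG_MODULES, module_alloc() and PAGE_SIZE, and malloc() stands in for a real executable-memory allocator.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE_STUB 4096	/* hypothetical stand-in for PAGE_SIZE */

/*
 * Hypothetical stand-in for CONFIG_MODULES. When it is not defined,
 * module_alloc_stub() does not exist at all, mirroring a kernel built
 * without module support.
 */
#ifdef HAVE_MODULES
static void *module_alloc_stub(size_t size)
{
	return malloc(size);
}
#endif

/*
 * Weak default. A strong definition elsewhere replaces it at link time,
 * but this body is still built into the object file, so everything it
 * references has to exist. That is why the module call is guarded.
 */
__attribute__((weak)) void *alloc_insn_page(void)
{
#ifdef HAVE_MODULES
	return module_alloc_stub(PAGE_SIZE_STUB);
#else
	return malloc(PAGE_SIZE_STUB);
#endif
}
```

Dropping the guard and building without `HAVE_MODULES` breaks the build even if another object provides a strong alloc_insn_page(), which is the dependency described in the reply above.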
On Fri, Jul 24, 2020 at 11:27:46AM +0200, Ingo Molnar wrote:
>
> * Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> wrote:
>
> > Use text_alloc() and text_free() instead of module_alloc() and
> > module_memfree() when an arch provides them.
> >
> > Cc: linux-mm@kvack.org
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > ---
> >  kernel/kprobes.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 4e46d96d4e16..611fcda9f6bf 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -40,6 +40,7 @@
> >  #include <asm/cacheflush.h>
> >  #include <asm/errno.h>
> >  #include <linux/uaccess.h>
> > +#include <linux/vmalloc.h>
> >
> >  #define KPROBE_HASH_BITS 6
> >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> >
> >  void __weak *alloc_insn_page(void)
> >  {
> > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > +	return text_alloc(PAGE_SIZE);
> > +#else
> >  	return module_alloc(PAGE_SIZE);
> > +#endif
> >  }
> >
> >  void __weak free_insn_page(void *page)
> >  {
> > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > +	text_free(page);
> > +#else
> >  	module_memfree(page);
> > +#endif
> >  }
>
> I've read the observations in the other threads, but this #ifdef
> jungle is silly, it's a de-facto open coded text_alloc() with a
> module_alloc() fallback...

In the previous version I had:

https://lore.kernel.org/lkml/20200717030422.679972-4-jarkko.sakkinen@linux.intel.com/

and I had just calls to text_alloc() and text_free() in corresponding
snippet to the above.

I got this feedback from Mike:

https://lore.kernel.org/lkml/20200718162359.GA2919062@kernel.org/

I'm still not sure that I fully understand this feedback as I don't see
any inherent and obvious difference to the v4. In that version fallbacks
are to module_alloc() and module_memfree() and text_alloc() and
text_free() can be overridden by arch.

> Thanks,
>
> 	Ingo

/Jarkko

On Fri, Jul 24, 2020 at 03:16:08PM +0300, Ard Biesheuvel wrote:
> On Fri, 24 Jul 2020 at 12:27, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> wrote:
> >
> > > Use text_alloc() and text_free() instead of module_alloc() and
> > > module_memfree() when an arch provides them.
> > >
> > > Cc: linux-mm@kvack.org
> > > Cc: Andi Kleen <ak@linux.intel.com>
> > > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > ---
> > >  kernel/kprobes.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > >
> > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > > index 4e46d96d4e16..611fcda9f6bf 100644
> > > --- a/kernel/kprobes.c
> > > +++ b/kernel/kprobes.c
> > > @@ -40,6 +40,7 @@
> > >  #include <asm/cacheflush.h>
> > >  #include <asm/errno.h>
> > >  #include <linux/uaccess.h>
> > > +#include <linux/vmalloc.h>
> > >
> > >  #define KPROBE_HASH_BITS 6
> > >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> > >
> > >  void __weak *alloc_insn_page(void)
> > >  {
> > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > +	return text_alloc(PAGE_SIZE);
> > > +#else
> > >  	return module_alloc(PAGE_SIZE);
> > > +#endif
> > >  }
> > >
> > >  void __weak free_insn_page(void *page)
> > >  {
> > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > +	text_free(page);
> > > +#else
> > >  	module_memfree(page);
> > > +#endif
> > >  }
> >
> > I've read the observations in the other threads, but this #ifdef
> > jungle is silly, it's a de-facto open coded text_alloc() with a
> > module_alloc() fallback...
> >
>
> Also, as I attempted to explain before, there is no reason to allocate
> kasan shadow for any of these use cases, so cloning module_alloc() to
> implement text_alloc() is not the correct approach even on x86.
>
> I suppose module_alloc() could be reimplemented in terms of
> text_alloc() in this case, but simply relabelling it like this seems
> inappropriate on all architectures.

I agree with this. Even if there was a chance to do a merge of some
kind, it should probably happen over time and accept some redundancy
first.

/Jarkko

On Sat, Jul 25, 2020 at 06:16:48AM +0300, Jarkko Sakkinen wrote:
> On Fri, Jul 24, 2020 at 11:27:46AM +0200, Ingo Molnar wrote:
> >
> > * Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> wrote:
> >
> > > Use text_alloc() and text_free() instead of module_alloc() and
> > > module_memfree() when an arch provides them.
> > >
> > > Cc: linux-mm@kvack.org
> > > Cc: Andi Kleen <ak@linux.intel.com>
> > > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > ---
> > >  kernel/kprobes.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > >
> > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > > index 4e46d96d4e16..611fcda9f6bf 100644
> > > --- a/kernel/kprobes.c
> > > +++ b/kernel/kprobes.c
> > > @@ -40,6 +40,7 @@
> > >  #include <asm/cacheflush.h>
> > >  #include <asm/errno.h>
> > >  #include <linux/uaccess.h>
> > > +#include <linux/vmalloc.h>
> > >
> > >  #define KPROBE_HASH_BITS 6
> > >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> > >
> > >  void __weak *alloc_insn_page(void)
> > >  {
> > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > +	return text_alloc(PAGE_SIZE);
> > > +#else
> > >  	return module_alloc(PAGE_SIZE);
> > > +#endif
> > >  }
> > >
> > >  void __weak free_insn_page(void *page)
> > >  {
> > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > +	text_free(page);
> > > +#else
> > >  	module_memfree(page);
> > > +#endif
> > >  }
> >
> > I've read the observations in the other threads, but this #ifdef
> > jungle is silly, it's a de-facto open coded text_alloc() with a
> > module_alloc() fallback...
>
> In the previous version I had:
>
> https://lore.kernel.org/lkml/20200717030422.679972-4-jarkko.sakkinen@linux.intel.com/
>
> and I had just calls to text_alloc() and text_free() in corresponding
> snippet to the above.
>
> I got this feedback from Mike:
>
> https://lore.kernel.org/lkml/20200718162359.GA2919062@kernel.org/
>
> I'm still not sure that I fully understand this feedback as I don't see
> any inherent and obvious difference to the v4. In that version fallbacks
> are to module_alloc() and module_memfree() and text_alloc() and
> text_free() can be overridden by arch.

Let me try to elaborate.

There are several subsystems that need to allocate memory for executable
text. As it happens, they use module_alloc() with some abilities for
architectures to override this behaviour.

For many architectures, it would be enough to rename module_alloc() to
text_alloc(), make it built-in and this way allow removing dependency on
MODULES.

Yet, some architectures have different restrictions for code allocation
for different subsystems so it would make sense to have more than one
variant of text_alloc() and a single config option ARCH_HAS_TEXT_ALLOC
won't be sufficient.

I liked Mark's suggestion to have text_alloc_<something>() and proposed
a way to introduce text_alloc_kprobes() along with
HAVE_KPROBES_TEXT_ALLOC to enable arch overrides of this function.

The major difference between your v4 and my suggestion is that I'm not
trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to
MODULES but rather to use a per-subsystem config option, e.g.
HAVE_KPROBES_TEXT_ALLOC.

Another thing, which might be worth doing regardless of the outcome of
this discussion is to rename alloc_insn_page() to text_alloc_kprobes()
because the former is way too generic and does not emphasize that the
instruction page is actually used by kprobes only.

> /Jarkko
>

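The per-subsystem option described above could take roughly the following shape. This is a userspace sketch under stated assumptions, not the proposed kernel code: only text_alloc_kprobes() and HAVE_KPROBES_TEXT_ALLOC are named in the thread, while text_free_kprobes(), text_alloc_generic(), text_free_generic() and `PAGE_SIZE_STUB` are hypothetical names introduced for illustration, and malloc()/free() stand in for a real text allocator.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE_STUB 4096	/* hypothetical stand-in for PAGE_SIZE */

/* Hypothetical generic fallback that any subsystem could share. */
static void *text_alloc_generic(size_t size)
{
	return malloc(size);
}

static void text_free_generic(void *p)
{
	free(p);
}

#ifdef HAVE_KPROBES_TEXT_ALLOC
/* The architecture supplies kprobes-specific versions elsewhere. */
void *text_alloc_kprobes(size_t size);
void text_free_kprobes(void *p);
#else
/* No arch override selected: kprobes falls back to the generic pair. */
static void *text_alloc_kprobes(size_t size)
{
	return text_alloc_generic(size);
}

static void text_free_kprobes(void *p)
{
	text_free_generic(p);
}
#endif

/* The kprobes core calls one name and carries no #ifdef of its own. */
void *alloc_insn_page(void)
{
	return text_alloc_kprobes(PAGE_SIZE_STUB);
}

void free_insn_page(void *page)
{
	text_free_kprobes(page);
}
```

The conditional then lives in one place per subsystem, so an architecture with different placement rules for kprobes and for other users of executable memory can override each independently.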
On Sun, 26 Jul 2020 at 11:14, Mike Rapoport <rppt@kernel.org> wrote:
>
> On Sat, Jul 25, 2020 at 06:16:48AM +0300, Jarkko Sakkinen wrote:
> > On Fri, Jul 24, 2020 at 11:27:46AM +0200, Ingo Molnar wrote:
> > >
> > > * Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> wrote:
> > >
> > > > Use text_alloc() and text_free() instead of module_alloc() and
> > > > module_memfree() when an arch provides them.
> > > >
> > > > Cc: linux-mm@kvack.org
> > > > Cc: Andi Kleen <ak@linux.intel.com>
> > > > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > > ---
> > > >  kernel/kprobes.c | 9 +++++++++
> > > >  1 file changed, 9 insertions(+)
> > > >
> > > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > > > index 4e46d96d4e16..611fcda9f6bf 100644
> > > > --- a/kernel/kprobes.c
> > > > +++ b/kernel/kprobes.c
> > > > @@ -40,6 +40,7 @@
> > > >  #include <asm/cacheflush.h>
> > > >  #include <asm/errno.h>
> > > >  #include <linux/uaccess.h>
> > > > +#include <linux/vmalloc.h>
> > > >
> > > >  #define KPROBE_HASH_BITS 6
> > > >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > > > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> > > >
> > > >  void __weak *alloc_insn_page(void)
> > > >  {
> > > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > > +	return text_alloc(PAGE_SIZE);
> > > > +#else
> > > >  	return module_alloc(PAGE_SIZE);
> > > > +#endif
> > > >  }
> > > >
> > > >  void __weak free_insn_page(void *page)
> > > >  {
> > > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > > +	text_free(page);
> > > > +#else
> > > >  	module_memfree(page);
> > > > +#endif
> > > >  }
> > >
> > > I've read the observations in the other threads, but this #ifdef
> > > jungle is silly, it's a de-facto open coded text_alloc() with a
> > > module_alloc() fallback...
> >
> > In the previous version I had:
> >
> > https://lore.kernel.org/lkml/20200717030422.679972-4-jarkko.sakkinen@linux.intel.com/
> >
> > and I had just calls to text_alloc() and text_free() in corresponding
> > snippet to the above.
> >
> > I got this feedback from Mike:
> >
> > https://lore.kernel.org/lkml/20200718162359.GA2919062@kernel.org/
> >
> > I'm still not sure that I fully understand this feedback as I don't see
> > any inherent and obvious difference to the v4. In that version fallbacks
> > are to module_alloc() and module_memfree() and text_alloc() and
> > text_free() can be overridden by arch.
>
> Let me try to elaborate.
>
> There are several subsystems that need to allocate memory for executable
> text. As it happens, they use module_alloc() with some abilities for
> architectures to override this behaviour.
>
> For many architectures, it would be enough to rename module_alloc() to
> text_alloc(), make it built-in and this way allow removing dependency on
> MODULES.
>
> Yet, some architectures have different restrictions for code allocation
> for different subsystems so it would make sense to have more than one
> variant of text_alloc() and a single config option ARCH_HAS_TEXT_ALLOC
> won't be sufficient.
>
> I liked Mark's suggestion to have text_alloc_<something>() and proposed
> a way to introduce text_alloc_kprobes() along with
> HAVE_KPROBES_TEXT_ALLOC to enable arch overrides of this function.
>
> The major difference between your v4 and my suggestion is that I'm not
> trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to
> MODULES but rather to use a per-subsystem config option, e.g.
> HAVE_KPROBES_TEXT_ALLOC.
>
> Another thing, which might be worth doing regardless of the outcome of
> this discussion is to rename alloc_insn_page() to text_alloc_kprobes()
> because the former is way too generic and does not emphasize that the
> instruction page is actually used by kprobes only.
>

Masami or Peter should correct me if I am wrong, but it seems to me
that the way kprobes uses these pages does not require them to be in
relative branching range of the core kernel on any architecture, given
that they are populated with individual instruction opcodes that are
executed in single step mode, and relative branches are emulated (when
needed)

So for kprobes in particular, we should be able to come up with a
generic sequence that does not involve module_alloc(), and therefore
removes the kprobes dependency on module support entirely (with the
exception of power which maps the vmalloc space nx when module support
is disabled). Renaming alloc_insn_page() to something more descriptive
makes sense imo, but is a separate issue.

On Sun, 26 Jul 2020 19:06:20 +0300
Ard Biesheuvel <ardb@kernel.org> wrote:

> On Sun, 26 Jul 2020 at 11:14, Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Sat, Jul 25, 2020 at 06:16:48AM +0300, Jarkko Sakkinen wrote:
> > > On Fri, Jul 24, 2020 at 11:27:46AM +0200, Ingo Molnar wrote:
> > > >
> > > > * Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> wrote:
> > > >
> > > > > Use text_alloc() and text_free() instead of module_alloc() and
> > > > > module_memfree() when an arch provides them.
> > > > >
> > > > > Cc: linux-mm@kvack.org
> > > > > Cc: Andi Kleen <ak@linux.intel.com>
> > > > > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > > > ---
> > > > >  kernel/kprobes.c | 9 +++++++++
> > > > >  1 file changed, 9 insertions(+)
> > > > >
> > > > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > > > > index 4e46d96d4e16..611fcda9f6bf 100644
> > > > > --- a/kernel/kprobes.c
> > > > > +++ b/kernel/kprobes.c
> > > > > @@ -40,6 +40,7 @@
> > > > >  #include <asm/cacheflush.h>
> > > > >  #include <asm/errno.h>
> > > > >  #include <linux/uaccess.h>
> > > > > +#include <linux/vmalloc.h>
> > > > >
> > > > >  #define KPROBE_HASH_BITS 6
> > > > >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > > > > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> > > > >
> > > > >  void __weak *alloc_insn_page(void)
> > > > >  {
> > > > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > > > +	return text_alloc(PAGE_SIZE);
> > > > > +#else
> > > > >  	return module_alloc(PAGE_SIZE);
> > > > > +#endif
> > > > >  }
> > > > >
> > > > >  void __weak free_insn_page(void *page)
> > > > >  {
> > > > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > > > +	text_free(page);
> > > > > +#else
> > > > >  	module_memfree(page);
> > > > > +#endif
> > > > >  }
> > > >
> > > > I've read the observations in the other threads, but this #ifdef
> > > > jungle is silly, it's a de-facto open coded text_alloc() with a
> > > > module_alloc() fallback...
> > >
> > > In the previous version I had:
> > >
> > > https://lore.kernel.org/lkml/20200717030422.679972-4-jarkko.sakkinen@linux.intel.com/
> > >
> > > and I had just calls to text_alloc() and text_free() in corresponding
> > > snippet to the above.
> > >
> > > I got this feedback from Mike:
> > >
> > > https://lore.kernel.org/lkml/20200718162359.GA2919062@kernel.org/
> > >
> > > I'm still not sure that I fully understand this feedback as I don't see
> > > any inherent and obvious difference to the v4. In that version fallbacks
> > > are to module_alloc() and module_memfree() and text_alloc() and
> > > text_free() can be overridden by arch.
> >
> > Let me try to elaborate.
> >
> > There are several subsystems that need to allocate memory for executable
> > text. As it happens, they use module_alloc() with some abilities for
> > architectures to override this behaviour.
> >
> > For many architectures, it would be enough to rename module_alloc() to
> > text_alloc(), make it built-in and this way allow removing dependency on
> > MODULES.
> >
> > Yet, some architectures have different restrictions for code allocation
> > for different subsystems so it would make sense to have more than one
> > variant of text_alloc() and a single config option ARCH_HAS_TEXT_ALLOC
> > won't be sufficient.
> >
> > I liked Mark's suggestion to have text_alloc_<something>() and proposed
> > a way to introduce text_alloc_kprobes() along with
> > HAVE_KPROBES_TEXT_ALLOC to enable arch overrides of this function.
> >
> > The major difference between your v4 and my suggestion is that I'm not
> > trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to
> > MODULES but rather to use a per-subsystem config option, e.g.
> > HAVE_KPROBES_TEXT_ALLOC.
> >
> > Another thing, which might be worth doing regardless of the outcome of
> > this discussion is to rename alloc_insn_page() to text_alloc_kprobes()
> > because the former is way too generic and does not emphasize that the
> > instruction page is actually used by kprobes only.

The name of the insn_pages came from the struct kprobe_insn_page, so
if there is a text_alloc_kprobes(), I'm OK to rename it. (anyway, that
is an allocation operator, we don't call it directly.)

> Masami or Peter should correct me if I am wrong, but it seems to me
> that the way kprobes uses these pages does not require them to be in
> relative branching range of the core kernel on any architecture, given
> that they are populated with individual instruction opcodes that are
> executed in single step mode, and relative branches are emulated (when
> needed)

Actually, x86 and arm have the "relative branching range" requirements
for the jump optimized kprobes. For the other architectures, I think
we don't need it. Only an executable text buffer is needed.

Thank you,

> So for kprobes in particular, we should be able to come up with a
> generic sequence that does not involve module_alloc(), and therefore
> removes the kprobes dependency on module support entirely (with the
> exception of power which maps the vmalloc space nx when module support
> is disabled). Renaming alloc_insn_page() to something more descriptive
> makes sense imo, but is a separate issue.

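As a worked illustration of the "relative branching range" requirement mentioned above for jump-optimized kprobes, the sketch below checks whether a 5-byte x86 `jmp rel32` placed at one address can reach another. It is a standalone example, not code from the kernel: x86_jmp32_reachable() is a hypothetical helper and `JMP32_INSN_SIZE` is defined locally for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode byte 0xE9 followed by a 4-byte signed displacement. */
#define JMP32_INSN_SIZE 5

/*
 * Return true if a near jump located at 'from' can reach 'to'.
 * The displacement is measured from the end of the jump instruction
 * and must fit in a signed 32-bit value, roughly +/- 2 GiB.
 */
static bool x86_jmp32_reachable(uint64_t from, uint64_t to)
{
	/* Unsigned subtraction wraps, then reinterpret as signed distance. */
	int64_t disp = (int64_t)(to - (from + JMP32_INSN_SIZE));

	return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

A page allocated anywhere in a large vmalloc range can fail this check relative to the probed instruction, which is one reason an architecture using such jumps may want its own allocation window.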
On Tue, 28 Jul 2020 at 11:17, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Sun, 26 Jul 2020 19:06:20 +0300
> Ard Biesheuvel <ardb@kernel.org> wrote:
>
> > On Sun, 26 Jul 2020 at 11:14, Mike Rapoport <rppt@kernel.org> wrote:
> > >
> > > On Sat, Jul 25, 2020 at 06:16:48AM +0300, Jarkko Sakkinen wrote:
> > > > On Fri, Jul 24, 2020 at 11:27:46AM +0200, Ingo Molnar wrote:
> > > > >
> > > > > * Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> wrote:
> > > > >
> > > > > > Use text_alloc() and text_free() instead of module_alloc() and
> > > > > > module_memfree() when an arch provides them.
> > > > > >
> > > > > > Cc: linux-mm@kvack.org
> > > > > > Cc: Andi Kleen <ak@linux.intel.com>
> > > > > > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > > > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > > > > > ---
> > > > > >  kernel/kprobes.c | 9 +++++++++
> > > > > >  1 file changed, 9 insertions(+)
> > > > > >
> > > > > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > > > > > index 4e46d96d4e16..611fcda9f6bf 100644
> > > > > > --- a/kernel/kprobes.c
> > > > > > +++ b/kernel/kprobes.c
> > > > > > @@ -40,6 +40,7 @@
> > > > > >  #include <asm/cacheflush.h>
> > > > > >  #include <asm/errno.h>
> > > > > >  #include <linux/uaccess.h>
> > > > > > +#include <linux/vmalloc.h>
> > > > > >
> > > > > >  #define KPROBE_HASH_BITS 6
> > > > > >  #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
> > > > > > @@ -111,12 +112,20 @@ enum kprobe_slot_state {
> > > > > >
> > > > > >  void __weak *alloc_insn_page(void)
> > > > > >  {
> > > > > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > > > > +	return text_alloc(PAGE_SIZE);
> > > > > > +#else
> > > > > >  	return module_alloc(PAGE_SIZE);
> > > > > > +#endif
> > > > > >  }
> > > > > >
> > > > > >  void __weak free_insn_page(void *page)
> > > > > >  {
> > > > > > +#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
> > > > > > +	text_free(page);
> > > > > > +#else
> > > > > >  	module_memfree(page);
> > > > > > +#endif
> > > > > >  }
> > > > >
> > > > > I've read the observations in the other threads, but this #ifdef
> > > > > jungle is silly, it's a de-facto open coded text_alloc() with a
> > > > > module_alloc() fallback...
> > > >
> > > > In the previous version I had:
> > > >
> > > > https://lore.kernel.org/lkml/20200717030422.679972-4-jarkko.sakkinen@linux.intel.com/
> > > >
> > > > and I had just calls to text_alloc() and text_free() in corresponding
> > > > snippet to the above.
> > > >
> > > > I got this feedback from Mike:
> > > >
> > > > https://lore.kernel.org/lkml/20200718162359.GA2919062@kernel.org/
> > > >
> > > > I'm still not sure that I fully understand this feedback as I don't see
> > > > any inherent and obvious difference to the v4. In that version fallbacks
> > > > are to module_alloc() and module_memfree() and text_alloc() and
> > > > text_free() can be overridden by arch.
> > >
> > > Let me try to elaborate.
> > >
> > > There are several subsystems that need to allocate memory for executable
> > > text. As it happens, they use module_alloc() with some abilities for
> > > architectures to override this behaviour.
> > >
> > > For many architectures, it would be enough to rename module_alloc() to
> > > text_alloc(), make it built-in and this way allow removing dependency on
> > > MODULES.
> > >
> > > Yet, some architectures have different restrictions for code allocation
> > > for different subsystems so it would make sense to have more than one
> > > variant of text_alloc() and a single config option ARCH_HAS_TEXT_ALLOC
> > > won't be sufficient.
> > >
> > > I liked Mark's suggestion to have text_alloc_<something>() and proposed
> > > a way to introduce text_alloc_kprobes() along with
> > > HAVE_KPROBES_TEXT_ALLOC to enable arch overrides of this function.
> > >
> > > The major difference between your v4 and my suggestion is that I'm not
> > > trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to
> > > MODULES but rather to use a per-subsystem config option, e.g.
> > > HAVE_KPROBES_TEXT_ALLOC.
> > >
> > > Another thing, which might be worth doing regardless of the outcome of
> > > this discussion is to rename alloc_insn_page() to text_alloc_kprobes()
> > > because the former is way too generic and does not emphasize that the
> > > instruction page is actually used by kprobes only.
>
> The name of the insn_pages came from the struct kprobe_insn_page, so
> if there is a text_alloc_kprobes(), I'm OK to rename it. (anyway, that
> is an allocation operator, we don't call it directly.)
>
> > Masami or Peter should correct me if I am wrong, but it seems to me
> > that the way kprobes uses these pages does not require them to be in
> > relative branching range of the core kernel on any architecture, given
> > that they are populated with individual instruction opcodes that are
> > executed in single step mode, and relative branches are emulated (when
> > needed)
>
> Actually, x86 and arm have the "relative branching range" requirements
> for the jump optimized kprobes. For the other architectures, I think
> we don't need it. Only an executable text buffer is needed.
>

Thanks for the explanation. Today, arm64 uses the definition below.

void *alloc_insn_page(void)
{
	return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
			GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS,
			NUMA_NO_NODE, __builtin_return_address(0));
}

Do you think we could use that as the generic implementation if we use
MODULES_START/_END as the allocation window?

On Tue, 28 Jul 2020 13:56:43 +0300
Ard Biesheuvel <ardb@kernel.org> wrote:

> On Tue, 28 Jul 2020 at 11:17, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > Masami or Peter should correct me if I am wrong, but it seems to me
> > > that the way kprobes uses these pages does not require them to be in
> > > relative branching range of the core kernel on any architecture, given
> > > that they are populated with individual instruction opcodes that are
> > > executed in single step mode, and relative branches are emulated (when
> > > needed)
> >
> > Actually, x86 and arm have the "relative branching range" requirements
> > for the jump optimized kprobes. For the other architectures, I think
> > we don't need it. Only an executable text buffer is needed.
> >
>
> Thanks for the explanation. Today, arm64 uses the definition below.
>
> void *alloc_insn_page(void)
> {
> 	return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
> 			GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS,
> 			NUMA_NO_NODE, __builtin_return_address(0));
> }
>
> Do you think we could use that as the generic implementation if we use
> MODULES_START/_END as the allocation window?

Yes, but for the generic implementation, we don't need to consider the
relative branching range since we can override it for x86 and arm.
(and that will be almost the same as the module_alloc() default code)

BTW, is the PAGE_KERNEL_ROX flag available generically?

Thank you,

On Tue, 28 Jul 2020 at 16:35, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > On Tue, 28 Jul 2020 13:56:43 +0300 > Ard Biesheuvel <ardb@kernel.org> wrote: > > > On Tue, 28 Jul 2020 at 11:17, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > Masami or Peter should correct me if I am wrong, but it seems to me > > > > that the way kprobes uses these pages does not require them to be in > > > > relative branching range of the core kernel on any architecture, given > > > > that they are populated with individual instruction opcodes that are > > > > executed in single step mode, and relative branches are emulated (when > > > > needed) > > > > > > Actually, x86 and arm has the "relative branching range" requirements > > > for the jump optimized kprobes. For the other architectures, I think > > > we don't need it. Only executable text buffer is needed. > > > > > > > Thanks for the explanation. Today, arm64 uses the definition below. > > > > void *alloc_insn_page(void) > > { > > return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END, > > GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS, > > NUMA_NO_NODE, __builtin_return_address(0)); > > } > > > > Do you think we could use that as the generic implementation if we use > > MODULES_START/_END as the allocation window? > > Yes, but for the generic implementation, we don't need to consider the > relative branching range since we can override it for x86 and arm. > (and that will be almost same as module_alloc() default code) Indeed. So having kprobes specific macros that default to VMALLOC_START/END but can be overridden would be sufficient. > BTW, is PAGE_KERNEL_ROX flag available generically? > Turns out that it is not :-( > Thank you, > -- > Masami Hiramatsu <mhiramat@kernel.org>
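A minimal user-space sketch of the "kprobes specific macros that default to VMALLOC_START/END but can be overridden" idea from the reply above. This is not kernel code: KPROBES_TEXT_START and KPROBES_TEXT_END are hypothetical names, and the DEMO_* addresses are illustrative placeholders rather than any architecture's real layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative placeholders standing in for VMALLOC_START/VMALLOC_END. */
#define DEMO_VMALLOC_START 0x100000000ULL
#define DEMO_VMALLOC_END   0x200000000ULL /* exclusive upper bound here */

/*
 * Hypothetical kprobes-specific window. An arch that needs its insn pages
 * close to kernel text would define both macros before this point; every
 * other arch falls back to the whole (placeholder) vmalloc range.
 */
#ifndef KPROBES_TEXT_START
#define KPROBES_TEXT_START DEMO_VMALLOC_START
#define KPROBES_TEXT_END   DEMO_VMALLOC_END
#endif

/* True if [addr, addr + size) lies entirely inside the kprobes window. */
static bool kprobes_window_contains(uint64_t addr, uint64_t size)
{
	if (addr < KPROBES_TEXT_START || addr >= KPROBES_TEXT_END)
		return false;
	return size <= KPROBES_TEXT_END - addr;
}
```

With this shape only the architectures that care about placement carry an override; the generic allocation code stays free of #ifdefs on individual architectures.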
On Tue, 28 Jul 2020 20:51:08 +0300 Ard Biesheuvel <ardb@kernel.org> wrote: > On Tue, 28 Jul 2020 at 16:35, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > On Tue, 28 Jul 2020 13:56:43 +0300 > > Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > On Tue, 28 Jul 2020 at 11:17, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > Masami or Peter should correct me if I am wrong, but it seems to me > > > > > that the way kprobes uses these pages does not require them to be in > > > > > relative branching range of the core kernel on any architecture, given > > > > > that they are populated with individual instruction opcodes that are > > > > > executed in single step mode, and relative branches are emulated (when > > > > > needed) > > > > > > > > Actually, x86 and arm has the "relative branching range" requirements > > > > for the jump optimized kprobes. For the other architectures, I think > > > > we don't need it. Only executable text buffer is needed. > > > > > > > > > > Thanks for the explanation. Today, arm64 uses the definition below. > > > > > > void *alloc_insn_page(void) > > > { > > > return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END, > > > GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS, > > > NUMA_NO_NODE, __builtin_return_address(0)); > > > } > > > > > > Do you think we could use that as the generic implementation if we use > > > MODULES_START/_END as the allocation window? > > > > Yes, but for the generic implementation, we don't need to consider the > > relative branching range since we can override it for x86 and arm. > > (and that will be almost same as module_alloc() default code) > > Indeed. So having kprobes specific macros that default to > VMALLOC_START/END but can be overridden would be sufficient. > > > BTW, is PAGE_KERNEL_ROX flag available generically? > > > > Turns out that it is not :-( Hmm, in that case, we need to use PAGE_KERNEL_EXEC. In the result, may it be similar to this? 
:)

void * __weak module_alloc(unsigned long size)
{
	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
			NUMA_NO_NODE, __builtin_return_address(0));
}

The major difference between module_alloc() and kprobes' alloc_insn_page()
is that alloc_insn_page() makes the page ROX after allocating it, and does
so *ONLY* on x86 and arm64.

$ git grep -w alloc_insn_page -- arch
arch/arm64/kernel/probes/kprobes.c:void *alloc_insn_page(void)
arch/x86/kernel/kprobes/core.c:void *alloc_insn_page(void)

However, since module_alloc() has its own arch-dependent implementation on
most major architectures, if we implement an independent text_alloc_kprobe(),
we need to make dead copies of module_alloc() for each architecture.

$ git grep 'module_alloc(unsigned' arch/
arch/arm/kernel/module.c:void *module_alloc(unsigned long size)
arch/arm64/kernel/module.c:void *module_alloc(unsigned long size)
arch/mips/kernel/module.c:void *module_alloc(unsigned long size)
arch/nds32/kernel/module.c:void *module_alloc(unsigned long size)
arch/nios2/kernel/module.c:void *module_alloc(unsigned long size)
arch/parisc/kernel/module.c:void *module_alloc(unsigned long size)
arch/riscv/kernel/module.c:void *module_alloc(unsigned long size)
arch/s390/kernel/module.c:void *module_alloc(unsigned long size)
arch/sparc/kernel/module.c:void *module_alloc(unsigned long size)
arch/unicore32/kernel/module.c:void *module_alloc(unsigned long size)
arch/x86/kernel/module.c:void *module_alloc(unsigned long size)

It seems that some constraints on module_alloc() exist for the
architectures above.

Anyway, the requirements for kprobes' text_alloc() are:
- It must be executable for the arch which uses a single-step out-of-line.
  (and need to be registered to KASAN?)
- It must be ROX if implemented (currently only for x86 and arm64)
- It must be in the range of relative branching only for x86 and arm.

Thank you,
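To make the "range of relative branching" requirement concrete, here is a small user-space sketch for the x86 case only. It assumes the 5-byte near JMP encoding (opcode 0xE9 plus a signed 32-bit displacement measured from the end of the instruction); the names DEMO_JMP32_SIZE and rel32_reachable() are made up for illustration and are not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 near JMP rel32: 1 opcode byte (0xE9) + 4 displacement bytes. */
#define DEMO_JMP32_SIZE 5

/*
 * Can a JMP rel32 placed at insn_addr reach target? The displacement is
 * relative to the address of the next instruction and must fit in a
 * signed 32-bit value, which gives roughly +/-2 GiB of reach.
 */
static bool rel32_reachable(uint64_t insn_addr, uint64_t target)
{
	/* Unsigned subtraction wraps; the cast recovers the signed distance. */
	int64_t disp = (int64_t)(target - (insn_addr + DEMO_JMP32_SIZE));

	return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

A jump-optimized probe needs this check to pass between the probed instruction and its out-of-line buffer, which is why an allocation anywhere in a large vmalloc range may not be usable on such an architecture.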
On Wed, 29 Jul 2020 at 04:51, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > On Tue, 28 Jul 2020 20:51:08 +0300 > Ard Biesheuvel <ardb@kernel.org> wrote: > > > On Tue, 28 Jul 2020 at 16:35, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > On Tue, 28 Jul 2020 13:56:43 +0300 > > > Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > On Tue, 28 Jul 2020 at 11:17, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > Masami or Peter should correct me if I am wrong, but it seems to me > > > > > > that the way kprobes uses these pages does not require them to be in > > > > > > relative branching range of the core kernel on any architecture, given > > > > > > that they are populated with individual instruction opcodes that are > > > > > > executed in single step mode, and relative branches are emulated (when > > > > > > needed) > > > > > > > > > > Actually, x86 and arm has the "relative branching range" requirements > > > > > for the jump optimized kprobes. For the other architectures, I think > > > > > we don't need it. Only executable text buffer is needed. > > > > > > > > > > > > > Thanks for the explanation. Today, arm64 uses the definition below. > > > > > > > > void *alloc_insn_page(void) > > > > { > > > > return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END, > > > > GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS, > > > > NUMA_NO_NODE, __builtin_return_address(0)); > > > > } > > > > > > > > Do you think we could use that as the generic implementation if we use > > > > MODULES_START/_END as the allocation window? > > > > > > Yes, but for the generic implementation, we don't need to consider the > > > relative branching range since we can override it for x86 and arm. > > > (and that will be almost same as module_alloc() default code) > > > > Indeed. So having kprobes specific macros that default to > > VMALLOC_START/END but can be overridden would be sufficient. > > > > > BTW, is PAGE_KERNEL_ROX flag available generically? 
> > > > > > > Turns out that it is not :-( > > Hmm, in that case, we need to use PAGE_KERNEL_EXEC. > > In the result, may it be similar to this? :) > > void * __weak module_alloc(unsigned long size) > { > return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END, > GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS, > NUMA_NO_NODE, __builtin_return_address(0)); > } > > The major difference between module_alloc() and kprobe's alloc_page_insn() > is the alloc_page_insn() makes the page ROX after allocating the pages *ONLY* > on x86 and arm64. > Right. > $ git grep -w alloc_insn_page -- arch > arch/arm64/kernel/probes/kprobes.c:void *alloc_insn_page(void) > arch/x86/kernel/kprobes/core.c:void *alloc_insn_page(void) > > However since the module_alloc() owns its arch-dependent implementations > most of major architectures, if we implement independent text_alloc_kprobe(), > we need to make deadcopies of module_alloc() for each architecture. > No, that is what we are trying to avoid. > $ git grep 'module_alloc(unsigned' arch/ > arch/arm/kernel/module.c:void *module_alloc(unsigned long size) > arch/arm64/kernel/module.c:void *module_alloc(unsigned long size) > arch/mips/kernel/module.c:void *module_alloc(unsigned long size) > arch/nds32/kernel/module.c:void *module_alloc(unsigned long size) > arch/nios2/kernel/module.c:void *module_alloc(unsigned long size) > arch/parisc/kernel/module.c:void *module_alloc(unsigned long size) > arch/riscv/kernel/module.c:void *module_alloc(unsigned long size) > arch/s390/kernel/module.c:void *module_alloc(unsigned long size) > arch/sparc/kernel/module.c:void *module_alloc(unsigned long size) > arch/unicore32/kernel/module.c:void *module_alloc(unsigned long size) > arch/x86/kernel/module.c:void *module_alloc(unsigned long size) > > It seems that some constrains for module_alloc() exists for above > architectures. 
>
> Anyway, for kprobe's text_alloc() requirements are
> - It must be executable for the arch which uses a single-step out-of-line.
>   (and need to be registered to KASAN?)

No, kasan shadow is not needed here.

> - It must be ROX if implemented (currently only for x86 and arm64)

x86 does not actually define the macro, but the result is the same.

> - It must be in the range of relative branching only for x86 and arm.
>

So in summary, the generic module_alloc() above can be reused for
kprobes on all arches except x86 and arm64, right? Then we can remove
the call to it, and drop the modules dependency.
On Wed, 29 Jul 2020 09:13:21 +0300 Ard Biesheuvel <ardb@kernel.org> wrote: > On Wed, 29 Jul 2020 at 04:51, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > On Tue, 28 Jul 2020 20:51:08 +0300 > > Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > On Tue, 28 Jul 2020 at 16:35, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > > > On Tue, 28 Jul 2020 13:56:43 +0300 > > > > Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > On Tue, 28 Jul 2020 at 11:17, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > > Masami or Peter should correct me if I am wrong, but it seems to me > > > > > > > that the way kprobes uses these pages does not require them to be in > > > > > > > relative branching range of the core kernel on any architecture, given > > > > > > > that they are populated with individual instruction opcodes that are > > > > > > > executed in single step mode, and relative branches are emulated (when > > > > > > > needed) > > > > > > > > > > > > Actually, x86 and arm has the "relative branching range" requirements > > > > > > for the jump optimized kprobes. For the other architectures, I think > > > > > > we don't need it. Only executable text buffer is needed. > > > > > > > > > > > > > > > > Thanks for the explanation. Today, arm64 uses the definition below. > > > > > > > > > > void *alloc_insn_page(void) > > > > > { > > > > > return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END, > > > > > GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS, > > > > > NUMA_NO_NODE, __builtin_return_address(0)); > > > > > } > > > > > > > > > > Do you think we could use that as the generic implementation if we use > > > > > MODULES_START/_END as the allocation window? > > > > > > > > Yes, but for the generic implementation, we don't need to consider the > > > > relative branching range since we can override it for x86 and arm. > > > > (and that will be almost same as module_alloc() default code) > > > > > > Indeed. 
So having kprobes specific macros that default to > > > VMALLOC_START/END but can be overridden would be sufficient. > > > > > > > BTW, is PAGE_KERNEL_ROX flag available generically? > > > > > > > > > > Turns out that it is not :-( > > > > Hmm, in that case, we need to use PAGE_KERNEL_EXEC. > > > > In the result, may it be similar to this? :) > > > > void * __weak module_alloc(unsigned long size) > > { > > return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END, > > GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS, > > NUMA_NO_NODE, __builtin_return_address(0)); > > } > > > > The major difference between module_alloc() and kprobe's alloc_page_insn() > > is the alloc_page_insn() makes the page ROX after allocating the pages *ONLY* > > on x86 and arm64. > > > > Right. > > > $ git grep -w alloc_insn_page -- arch > > arch/arm64/kernel/probes/kprobes.c:void *alloc_insn_page(void) > > arch/x86/kernel/kprobes/core.c:void *alloc_insn_page(void) > > > > However since the module_alloc() owns its arch-dependent implementations > > most of major architectures, if we implement independent text_alloc_kprobe(), > > we need to make deadcopies of module_alloc() for each architecture. > > > > No, that is what we are trying to avoid. 
> > > $ git grep 'module_alloc(unsigned' arch/ > > arch/arm/kernel/module.c:void *module_alloc(unsigned long size) > > arch/arm64/kernel/module.c:void *module_alloc(unsigned long size) > > arch/mips/kernel/module.c:void *module_alloc(unsigned long size) > > arch/nds32/kernel/module.c:void *module_alloc(unsigned long size) > > arch/nios2/kernel/module.c:void *module_alloc(unsigned long size) > > arch/parisc/kernel/module.c:void *module_alloc(unsigned long size) > > arch/riscv/kernel/module.c:void *module_alloc(unsigned long size) > > arch/s390/kernel/module.c:void *module_alloc(unsigned long size) > > arch/sparc/kernel/module.c:void *module_alloc(unsigned long size) > > arch/unicore32/kernel/module.c:void *module_alloc(unsigned long size) > > arch/x86/kernel/module.c:void *module_alloc(unsigned long size) > > > > It seems that some constrains for module_alloc() exists for above > > architectures. > > > > Anyway, for kprobe's text_alloc() requirements are > > - It must be executable for the arch which uses a single-step out-of-line. > > (and need to be registered to KASAN?) > > No, kasan shadow is not needed here. OK, I just saw the KASAN shadow was generated (kasan_module_alloc was called) in module_alloc() on x86. So current x86 alloc_insn_page() has it. > > - It must be ROX if implemented (currently only for x86 and arm64) > > x86 does not actually define thr macro, but the result is the same. Yes, alloc_insn_page() for x86 is adding the RO and X flag on the page. > > - It must be in the range of relative branching only for x86 and arm. > > > > So in summary, the generic module_alloc() above can be reused for > kprobes on all arches except x86 and arm64, right? Then we can remove > the call to it, and drop the modules dependency. Yes, that's correct. If there is a generic module_alloc(), kprobes is happy to use it :) Thank you,
On Sun, Jul 26, 2020 at 11:14:08AM +0300, Mike Rapoport wrote: > On Sat, Jul 25, 2020 at 06:16:48AM +0300, Jarkko Sakkinen wrote: > > > I've read the observations in the other threads, but this #ifdef > > > jungle is silly, it's a de-facto open coded text_alloc() with a > > > module_alloc() fallback... > > > > In the previous version I had: > > > > https://lore.kernel.org/lkml/20200717030422.679972-4-jarkko.sakkinen@linux.intel.com/ > > > > and I had just calls to text_alloc() and text_free() in corresponding > > snippet to the above. > > > > I got this feedback from Mike: > > > > https://lore.kernel.org/lkml/20200718162359.GA2919062@kernel.org/ > > > > I'm not still sure that I fully understand this feedback as I don't see > > any inherent and obvious difference to the v4. In that version fallbacks > > are to module_alloc() and module_memfree() and text_alloc() and > > text_memfree() can be overridden by arch. > > Let me try to elaborate. > > There are several subsystems that need to allocate memory for executable > text. As it happens, they use module_alloc() with some abilities for > architectures to override this behaviour. > > For many architectures, it would be enough to rename modules_alloc() to > text_alloc(), make it built-in and this way allow removing dependency on > MODULES. > > Yet, some architectures have different restrictions for code allocation > for different subsystems so it would make sense to have more than one > variant of text_alloc() and a single config option ARCH_HAS_TEXT_ALLOC > won't be sufficient. > > I liked Mark's suggestion to have text_alloc_<something>() and proposed > a way to introduce text_alloc_kprobes() along with > HAVE_KPROBES_TEXT_ALLOC to enable arch overrides of this function. > > The major difference between your v4 and my suggestion is that I'm not > trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to > MODULES but rather to use per subsystem config option, e.g. > HAVE_KPROBES_TEXT_ALLOC. 
>
> Another thing, which might be worth doing regardless of the outcome of
> this discussion is to rename alloc_insn_pages() to text_alloc_kprobes()
> because the former is way too generic and does not emphasize that the
> instruction page is actually used by kprobes only.

What if we just did this in kernel/kprobes.c:

#ifndef CONFIG_ARCH_HAS_TEXT_ALLOC
void __weak *alloc_insn_page(void)
{
	return module_alloc(PAGE_SIZE);
}

void __weak free_insn_page(void *page)
{
	module_memfree(page);
}
#endif

In Kconfig (as in v5):

config KPROBES
	bool "Kprobes"
	depends on MODULES || ARCH_HAS_TEXT_ALLOC

I checked architectures that override alloc_insn_page(). With the
exception of x86, they do not call module_alloc().

If no rename was done, then with this approach a more consistent
config flag name would be CONFIG_ARCH_HAS_ALLOC_INSN_PAGE.

I'd call the function simply kprobes_alloc_page(). Then the
config flag would become CONFIG_HAS_KPROBES_ALLOC_PAGE.

> --
> Sincerely yours,
> Mike.

Thanks for the feedback!

/Jarkko
On Tue, Aug 18, 2020 at 08:30:29AM +0300, Jarkko Sakkinen wrote:
> On Sun, Jul 26, 2020 at 11:14:08AM +0300, Mike Rapoport wrote:
> > >
> > > I'm not still sure that I fully understand this feedback as I don't see
> > > any inherent and obvious difference to the v4. In that version fallbacks
> > > are to module_alloc() and module_memfree() and text_alloc() and
> > > text_memfree() can be overridden by arch.
> >
> > The major difference between your v4 and my suggestion is that I'm not
> > trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to
> > MODULES but rather to use per subsystem config option, e.g.
> > HAVE_KPROBES_TEXT_ALLOC.
> >
> > Another thing, which might be worth doing regardless of the outcome of
> > this discussion is to rename alloc_insn_pages() to text_alloc_kprobes()
> > because the former is way too generic and does not emphasize that the
> > instruction page is actually used by kprobes only.
>
> What if we in kernel/kprobes.c just:
>
> #ifndef CONFIG_ARCH_HAS_TEXT_ALLOC

I don't think that CONFIG_ARCH_HAS_TEXT_ALLOC will work for all
architectures.

If an architecture has different restrictions for allocation of text
for, say, modules, kprobes, bpf, it won't be able to use a single
ARCH_HAS_TEXT_ALLOC. Which means that this architecture is stuck with
dependency of kprobes on MODULES until the next rework.

> void __weak *alloc_insn_page(void)
> {
> 	return module_alloc(PAGE_SIZE);
> }
>
> void __weak free_insn_page(void *page)
> {
> 	module_memfree(page);
> }
> #endif
>
> In Kconfig (as in v5):
>
> config KPROBES
> 	bool "Kprobes"
> 	depends on MODULES || ARCH_HAS_TEXT_ALLOC
>
> I checked architectures that override alloc_insn_page(). With the
> exception of x86, they do not call module_alloc().
>
> If no rename was done, then with this approach a more consistent.
> config flag name would be CONFIG_ARCH_HAS_ALLOC_INSN_PAGE.
>
> I'd call the function just as kprobes_alloc_page(). Then the
> config flag would become CONFIG_HAS_KPROBES_ALLOC_PAGE.
>
> > --
> > Sincerely yours,
> > Mike.
>
> Thanks for the feedback!
>
> /Jarkko
On Tue, Aug 18, 2020 at 02:51:41PM +0300, Mike Rapoport wrote:
> On Tue, Aug 18, 2020 at 08:30:29AM +0300, Jarkko Sakkinen wrote:
> > On Sun, Jul 26, 2020 at 11:14:08AM +0300, Mike Rapoport wrote:
> > > >
> > > > I'm not still sure that I fully understand this feedback as I don't see
> > > > any inherent and obvious difference to the v4. In that version fallbacks
> > > > are to module_alloc() and module_memfree() and text_alloc() and
> > > > text_memfree() can be overridden by arch.
> > >
> > > The major difference between your v4 and my suggestion is that I'm not
> > > trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to
> > > MODULES but rather to use per subsystem config option, e.g.
> > > HAVE_KPROBES_TEXT_ALLOC.
> > >
> > > Another thing, which might be worth doing regardless of the outcome of
> > > this discussion is to rename alloc_insn_pages() to text_alloc_kprobes()
> > > because the former is way too generic and does not emphasize that the
> > > instruction page is actually used by kprobes only.
> >
> > What if we in kernel/kprobes.c just:
> >
> > #ifndef CONFIG_ARCH_HAS_TEXT_ALLOC
>
> I don't think that CONFIG_ARCH_HAS_TEXT_ALLOC will work for all
> architectures.
>
> If an architecture has different restrictions for allocation of text
> for, say, modules, kprobes, bfp, it won't be able to use a single
> ARCH_HAS_TEXT_ALLOC. Which means that this architecture is stuck with
> dependency of kprobes on MODULES until the next rework.

I was thinking of naming it CONFIG_HAS_KPROBES_ALLOC_PAGE, along the
lines described below, so it is merely a glitch in my example.

>
> > void __weak *alloc_insn_page(void)
> > {
> > 	return module_alloc(PAGE_SIZE);
> > }
> >
> > void __weak free_insn_page(void *page)
> > {
> > 	module_memfree(page);
> > }
> > #endif
> >
> > In Kconfig (as in v5):
> >
> > config KPROBES
> > 	bool "Kprobes"
> > 	depends on MODULES || ARCH_HAS_TEXT_ALLOC
> >
> > I checked architectures that override alloc_insn_page(). With the
> > exception of x86, they do not call module_alloc().
> >
> > If no rename was done, then with this approach a more consistent.
> > config flag name would be CONFIG_ARCH_HAS_ALLOC_INSN_PAGE.
> >
> > I'd call the function just as kprobes_alloc_page(). Then the
> > config flag would become CONFIG_HAS_KPROBES_ALLOC_PAGE.
> >
> > > --
> > > Sincerely yours,
> > > Mike.
> >
> > Thanks for the feedback!
> >
> > /Jarkko
>
> --
> Sincerely yours,
> Mike.

BR,
/Jarkko
On Tue, Aug 18, 2020 at 07:30:33PM +0300, Jarkko Sakkinen wrote: > On Tue, Aug 18, 2020 at 02:51:41PM +0300, Mike Rapoport wrote: > > On Tue, Aug 18, 2020 at 08:30:29AM +0300, Jarkko Sakkinen wrote: > > > On Sun, Jul 26, 2020 at 11:14:08AM +0300, Mike Rapoport wrote: > > > > > > > > > > I'm not still sure that I fully understand this feedback as I don't see > > > > > any inherent and obvious difference to the v4. In that version fallbacks > > > > > are to module_alloc() and module_memfree() and text_alloc() and > > > > > text_memfree() can be overridden by arch. > > > > > > > > The major difference between your v4 and my suggestion is that I'm not > > > > trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to > > > > MODULES but rather to use per subsystem config option, e.g. > > > > HAVE_KPROBES_TEXT_ALLOC. > > > > > > > > Another thing, which might be worth doing regardless of the outcome of > > > > this discussion is to rename alloc_insn_pages() to text_alloc_kprobes() > > > > because the former is way too generic and does not emphasize that the > > > > instruction page is actually used by kprobes only. > > > > > > What if we in kernel/kprobes.c just: > > > > > > #ifndef CONFIG_ARCH_HAS_TEXT_ALLOC > > > > I don't think that CONFIG_ARCH_HAS_TEXT_ALLOC will work for all > > architectures. > > > > If an architecture has different restrictions for allocation of text > > for, say, modules, kprobes, bfp, it won't be able to use a single > > ARCH_HAS_TEXT_ALLOC. Which means that this architecture is stuck with > > dependency of kprobes on MODULES until the next rework. > > I was thinking to name it as CONFIG_HAS_KPROBES_ALLOC_PAGE, alogn the > lines described below, so it is merely a glitch in my example. IMHO, it would be better to emphasize that this is text page. I liked Mark's idea [1] to have text_alloc_<subsys>() naming for the allocation functions. The Kconfig options then would become HAVE_TEXT_ALLOC_<SUBSYS>. 
[1] https://lore.kernel.org/linux-riscv/20200714133314.GA67386@C02TD0UTHF1T.local/ > > > void __weak *alloc_insn_page(void) > > > { > > > return module_alloc(PAGE_SIZE); > > > } > > > > > > void __weak free_insn_page(void *page) > > > { > > > module_memfree(page); > > > } > > > #endif > > > > > > In Kconfig (as in v5): > > > > > > config KPROBES > > > bool "Kprobes" > > > depends on MODULES || ARCH_HAS_TEXT_ALLOC > > > > > > I checked architectures that override alloc_insn_page(). With the > > > exception of x86, they do not call module_alloc(). > > > > > > If no rename was done, then with this approach a more consistent. > > > config flag name would be CONFIG_ARCH_HAS_ALLOC_INSN_PAGE. > > > > > > I'd call the function just as kprobes_alloc_page(). Then the > > > config flag would become CONFIG_HAS_KPROBES_ALLOC_PAGE. > > > > > > > -- > > > > Sincerely yours, > > > > Mike. > > > > > > Thanks for the feedback! > > > > > > /Jarkko > > > > > -- > > Sincerely yours, > > Mike. > > BR, > /Jarkko
On Wed, Aug 19, 2020 at 09:47:18AM +0300, Mike Rapoport wrote: > On Tue, Aug 18, 2020 at 07:30:33PM +0300, Jarkko Sakkinen wrote: > > On Tue, Aug 18, 2020 at 02:51:41PM +0300, Mike Rapoport wrote: > > > On Tue, Aug 18, 2020 at 08:30:29AM +0300, Jarkko Sakkinen wrote: > > > > On Sun, Jul 26, 2020 at 11:14:08AM +0300, Mike Rapoport wrote: > > > > > > > > > > > > I'm not still sure that I fully understand this feedback as I don't see > > > > > > any inherent and obvious difference to the v4. In that version fallbacks > > > > > > are to module_alloc() and module_memfree() and text_alloc() and > > > > > > text_memfree() can be overridden by arch. > > > > > > > > > > The major difference between your v4 and my suggestion is that I'm not > > > > > trying to impose a single ARCH_HAS_TEXT_ALLOC as an alternative to > > > > > MODULES but rather to use per subsystem config option, e.g. > > > > > HAVE_KPROBES_TEXT_ALLOC. > > > > > > > > > > Another thing, which might be worth doing regardless of the outcome of > > > > > this discussion is to rename alloc_insn_pages() to text_alloc_kprobes() > > > > > because the former is way too generic and does not emphasize that the > > > > > instruction page is actually used by kprobes only. > > > > > > > > What if we in kernel/kprobes.c just: > > > > > > > > #ifndef CONFIG_ARCH_HAS_TEXT_ALLOC > > > > > > I don't think that CONFIG_ARCH_HAS_TEXT_ALLOC will work for all > > > architectures. > > > > > > If an architecture has different restrictions for allocation of text > > > for, say, modules, kprobes, bfp, it won't be able to use a single > > > ARCH_HAS_TEXT_ALLOC. Which means that this architecture is stuck with > > > dependency of kprobes on MODULES until the next rework. > > > > I was thinking to name it as CONFIG_HAS_KPROBES_ALLOC_PAGE, alogn the > > lines described below, so it is merely a glitch in my example. > > IMHO, it would be better to emphasize that this is text page. 
I liked > Mark's idea [1] to have text_alloc_<subsys>() naming for the allocation > functions. The Kconfig options then would become > HAVE_TEXT_ALLOC_<SUBSYS>. > > [1] https://lore.kernel.org/linux-riscv/20200714133314.GA67386@C02TD0UTHF1T.local/ I think I got the point now, the point being in future other subsystems could use the same naming convention for analogous stuff? I'll follow this convention. Thank you for the patience with this! /Jarkko
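The text_alloc_<subsys>() / HAVE_TEXT_ALLOC_<SUBSYS> convention agreed on above could look roughly like the following user-space sketch. It is only an illustration under stated assumptions: text_free_kprobes() is a hypothetical counterpart name, the demo_* helpers stand in for module_alloc()/module_memfree() using plain heap memory (which is not executable), and DEMO_PAGE_SIZE replaces PAGE_SIZE.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE 4096

/* Stand-ins for module_alloc()/module_memfree(); heap memory only. */
static void *demo_module_alloc(size_t size) { return malloc(size); }
static void demo_module_memfree(void *p) { free(p); }

#ifdef HAVE_TEXT_ALLOC_KPROBES
/* An arch selecting the (hypothetical) option supplies these itself. */
void *text_alloc_kprobes(size_t size);
void text_free_kprobes(void *p);
#else
/* Generic fallback: reuse the module allocator stand-in. */
static inline void *text_alloc_kprobes(size_t size)
{
	return demo_module_alloc(size);
}

static inline void text_free_kprobes(void *p)
{
	demo_module_memfree(p);
}
#endif

/* The kprobes core then needs no #ifdef of its own. */
static void *alloc_insn_page(void)
{
	return text_alloc_kprobes(DEMO_PAGE_SIZE);
}

static void free_insn_page(void *page)
{
	text_free_kprobes(page);
}
```

Keeping the conditional in one place, next to the declaration, is what lets the callers stay unconditional, and a second subsystem could add its own text_alloc_<subsys>() pair with different placement rules without touching this one.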
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4e46d96d4e16..611fcda9f6bf 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -40,6 +40,7 @@
 #include <asm/cacheflush.h>
 #include <asm/errno.h>
 #include <linux/uaccess.h>
+#include <linux/vmalloc.h>
 
 #define KPROBE_HASH_BITS 6
 #define KPROBE_TABLE_SIZE (1 << KPROBE_HASH_BITS)
@@ -111,12 +112,20 @@ enum kprobe_slot_state {
 
 void __weak *alloc_insn_page(void)
 {
+#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
+	return text_alloc(PAGE_SIZE);
+#else
 	return module_alloc(PAGE_SIZE);
+#endif
 }
 
 void __weak free_insn_page(void *page)
 {
+#ifdef CONFIG_ARCH_HAS_TEXT_ALLOC
+	text_free(page);
+#else
 	module_memfree(page);
+#endif
 }
 
 struct kprobe_insn_cache kprobe_insn_slots = {
Use text_alloc() and text_free() instead of module_alloc() and
module_memfree() when an arch provides them.

Cc: linux-mm@kvack.org
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
---
 kernel/kprobes.c | 9 +++++++++
 1 file changed, 9 insertions(+)