Message ID | 20220907023327.85630-1-liaochang1@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | riscv/kprobe: Optimize the performance of patching instruction slot |
On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> Since no race condition occurs on any instruction slot, it is safe to
> patch an instruction slot without stopping the machine.

hmm, IMHO there's a race when arming a kprobe under SMP, so stopping the
machine is necessary here. Maybe I misunderstand something.

>
> Signed-off-by: Liao Chang <liaochang1@huawei.com>
> ---
>  arch/riscv/kernel/probes/kprobes.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> index e6e950b7cf32..eff7d7fab535 100644
> --- a/arch/riscv/kernel/probes/kprobes.c
> +++ b/arch/riscv/kernel/probes/kprobes.c
> @@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>  {
>  	unsigned long offset = GET_INSN_LENGTH(p->opcode);
> +	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
> +	kprobe_opcode_t slot[MAX_INSN_SIZE];
>
>  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
>
> -	patch_text(p->ainsn.api.insn, p->opcode);
> -	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> -		   __BUG_INSN_32);
> +	memcpy(slot, &p->opcode, offset);
> +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
> +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
>  }
>
>  static void __kprobes arch_prepare_simulate(struct kprobe *p)
> --
> 2.17.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

On Thu, 8 Sep 2022 01:21:27 +0800
Jisheng Zhang <jszhang@kernel.org> wrote:

> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> > Since no race condition occurs on any instruction slot, it is safe to
> > patch an instruction slot without stopping the machine.
>
> hmm, IMHO there's a race when arming a kprobe under SMP, so stopping the
> machine is necessary here. Maybe I misunderstand something.

Yeah, self-modifying code usually needs to stop the other CPUs at some
known points so that they do not execute the instruction that is being
modified. Even if one chip ensures that, is it safe for other
implementations? (Does the RISC-V specification guarantee this behavior?)

Thank you,

Thanks for the comment.

On 2022/9/8 1:21, Jisheng Zhang wrote:
> On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
>> Since no race condition occurs on any instruction slot, it is safe to
>> patch an instruction slot without stopping the machine.
>
> hmm, IMHO there's a race when arming a kprobe under SMP, so stopping the
> machine is necessary here. Maybe I misunderstand something.
>

It is indeed necessary to stop the machine when arming a kprobe under SMP,
but I don't think it is necessary when preparing the instruction slot, for
two reasons:

1. The instruction slot is dynamically allocated data.
2. The kernel does not execute the instruction slot until the original
   instruction has been replaced by a breakpoint.
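
The two reasons above amount to a prepare-then-publish ordering: the slot is private until the probe is armed. The following is a minimal user-space sketch of that ordering, using C11 atomics as a stand-in for the kernel's patching and synchronization. It is an analogy only: `struct probe_model`, `prepare_slot`, `arm_probe`, and `hit_probe` are hypothetical names, not kernel APIs, and the model ignores instruction-cache maintenance, which real code patching also has to handle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SLOT_BYTES 8 /* room for one 4-byte insn plus a 4-byte breakpoint */

/* Hypothetical user-space stand-in for a kprobe and its out-of-line slot. */
struct probe_model {
	uint8_t slot[SLOT_BYTES]; /* private until 'armed' is set */
	size_t slot_len;
	atomic_bool armed;        /* models "breakpoint installed at the probed address" */
};

/* Step 1: fill the slot. Nothing can reach it yet, so plain stores suffice. */
static void prepare_slot(struct probe_model *p, const void *bytes, size_t len)
{
	assert(len <= SLOT_BYTES);
	memcpy(p->slot, bytes, len);
	p->slot_len = len;
}

/* Step 2: publish. The release store orders the slot contents before the flag. */
static void arm_probe(struct probe_model *p)
{
	atomic_store_explicit(&p->armed, true, memory_order_release);
}

/* A "CPU" reaching the probe point: it reads the slot only once it sees 'armed'. */
static bool hit_probe(struct probe_model *p, uint8_t *out, size_t *len)
{
	if (!atomic_load_explicit(&p->armed, memory_order_acquire))
		return false; /* not armed: the slot is never consulted */
	memcpy(out, p->slot, p->slot_len);
	*len = p->slot_len;
	return true;
}
```

In this model only `arm_probe` needs a synchronizing operation; `prepare_slot` can use ordinary stores because no reader can observe the slot before the flag is set.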

On Thu, 8 Sep 2022 09:43:45 +0800
"liaochang (A)" <liaochang1@huawei.com> wrote:

> Thanks for the comment.
>
> On 2022/9/8 1:21, Jisheng Zhang wrote:
> > On Wed, Sep 07, 2022 at 10:33:27AM +0800, Liao Chang wrote:
> >> Since no race condition occurs on any instruction slot, it is safe to
> >> patch an instruction slot without stopping the machine.
> >
> > hmm, IMHO there's a race when arming a kprobe under SMP, so stopping the
> > machine is necessary here. Maybe I misunderstand something.
> >
>
> It is indeed necessary to stop the machine when arming a kprobe under SMP,
> but I don't think it is necessary when preparing the instruction slot, for
> two reasons:
>
> 1. The instruction slot is dynamically allocated data.
> 2. The kernel does not execute the instruction slot until the original
>    instruction has been replaced by a breakpoint.

Ah, this is for the ss (single-step out-of-line) slot. So until the
kprobe is enabled, this should not be used from other cores.
OK, then it should be safe.

> >> +	memcpy(slot, &p->opcode, offset);
> >> +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
> >> +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);

BTW, didn't you have a macro for the size of __BUG_INSN_32?

Thank you,

On 2022/9/8 20:49, Masami Hiramatsu (Google) wrote:
> On Thu, 8 Sep 2022 09:43:45 +0800
> "liaochang (A)" <liaochang1@huawei.com> wrote:
>
>> It is indeed necessary to stop the machine when arming a kprobe under SMP,
>> but I don't think it is necessary when preparing the instruction slot, for
>> two reasons:
>>
>> 1. The instruction slot is dynamically allocated data.
>> 2. The kernel does not execute the instruction slot until the original
>>    instruction has been replaced by a breakpoint.
>
> Ah, this is for the ss (single-step out-of-line) slot. So until the
> kprobe is enabled, this should not be used from other cores.
> OK, then it should be safe.

Exactly, Masami. I also found that this optimization could be applied to
some other architectures, such as arm64 and csky. Do you think it is a
good time to do them all?

Thanks.

>>>> +	memcpy(slot, &p->opcode, offset);
>>>> +	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
>>>> +	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
>
> BTW, didn't you have a macro for the size of __BUG_INSN_32?
>
> Thank you,

I think you mean GET_INSN_LENGTH. I will use it to calculate the size of
__BUG_INSN_32 in v2 instead of the magic number '4'.

Thanks.

On Fri, 9 Sep 2022 09:55:08 +0800
"liaochang (A)" <liaochang1@huawei.com> wrote:

> On 2022/9/8 20:49, Masami Hiramatsu (Google) wrote:
> > Ah, this is for the ss (single-step out-of-line) slot. So until the
> > kprobe is enabled, this should not be used from other cores.
> > OK, then it should be safe.
>
> Exactly, Masami. I also found that this optimization could be applied to
> some other architectures, such as arm64 and csky. Do you think it is a
> good time to do them all?

Yes, we should reduce the stop_machine() usage. Thanks for pointing it out!

> > BTW, didn't you have a macro for the size of __BUG_INSN_32?
> >
> > Thank you,
>
> I think you mean GET_INSN_LENGTH. I will use it to calculate the size of
> __BUG_INSN_32 in v2 instead of the magic number '4'.

Yeah, that's better.

Thank you!
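
The sizing discussed above rests on the RISC-V encoding rule that an instruction whose two lowest bits are both set is 32 bits long, while any other pattern in those bits marks a 16-bit compressed instruction. Below is a small user-space sketch of deriving the breakpoint size instead of hard-coding 4. `insn_length` and `slot_size` are hypothetical helpers modeled on what GET_INSN_LENGTH is understood to do, and the constants are the standard `ebreak` and `c.ebreak` encodings; none of this is copied from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

#define INSN_LENGTH_MASK 0x3u
#define INSN_LENGTH_32   0x3u
#define EBREAK_32        0x00100073u /* ebreak */
#define C_EBREAK_16      0x9002u     /* c.ebreak (compressed) */
#define MAX_SLOT_SIZE    8u          /* 4-byte insn + 4-byte breakpoint */

/* Hypothetical stand-in for GET_INSN_LENGTH: 4 for a 32-bit insn, else 2. */
static inline unsigned int insn_length(uint32_t insn)
{
	return ((insn & INSN_LENGTH_MASK) == INSN_LENGTH_32) ? 4u : 2u;
}

/* Bytes needed for "copied instruction + 32-bit breakpoint", no magic number. */
static inline unsigned int slot_size(uint32_t probed_insn)
{
	unsigned int size = insn_length(probed_insn) + insn_length(EBREAK_32);

	assert(size <= MAX_SLOT_SIZE);
	return size;
}
```

With this shape, switching the terminating breakpoint to the compressed encoding would only require changing the constant passed to `insn_length`.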

diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index e6e950b7cf32..eff7d7fab535 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -24,12 +24,14 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
 static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
 {
 	unsigned long offset = GET_INSN_LENGTH(p->opcode);
+	const kprobe_opcode_t brk_insn = __BUG_INSN_32;
+	kprobe_opcode_t slot[MAX_INSN_SIZE];

 	p->ainsn.api.restore = (unsigned long)p->addr + offset;

-	patch_text(p->ainsn.api.insn, p->opcode);
-	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
-		   __BUG_INSN_32);
+	memcpy(slot, &p->opcode, offset);
+	memcpy((void *)((unsigned long)slot + offset), &brk_insn, 4);
+	patch_text_nosync(p->ainsn.api.insn, slot, offset + 4);
 }

 static void __kprobes arch_prepare_simulate(struct kprobe *p)

Since no race condition occurs on any instruction slot, it is safe to
patch an instruction slot without stopping the machine.

Signed-off-by: Liao Chang <liaochang1@huawei.com>
---
 arch/riscv/kernel/probes/kprobes.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
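
The patch replaces two separate patching calls with one write of a locally assembled buffer. Here is a user-space sketch of that assembly step, under stated assumptions: `build_ss_slot` and `put_le` are hypothetical helpers, the buffer is a plain byte array rather than a kernel instruction slot, and bytes are stored explicitly in little-endian order so the sketch does not depend on the host's endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EBREAK_32     0x00100073u /* 32-bit breakpoint encoding */
#define MAX_SLOT_SIZE 8           /* 4-byte insn + 4-byte breakpoint */

/* 4 for a 32-bit instruction (low bits 0b11), 2 for a compressed one. */
static unsigned int insn_length(uint32_t insn)
{
	return ((insn & 0x3u) == 0x3u) ? 4u : 2u;
}

/* Store the low 'len' bytes of 'value' in little-endian order. */
static void put_le(uint8_t *dst, uint32_t value, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = (uint8_t)(value >> (8 * i));
}

/*
 * Assemble "copied instruction + breakpoint" in a local buffer so the
 * result can be written to the real slot in one call.
 * Returns the number of valid bytes in 'slot'.
 */
static size_t build_ss_slot(uint8_t slot[MAX_SLOT_SIZE], uint32_t opcode)
{
	size_t offset = insn_length(opcode);
	size_t brk_len = insn_length(EBREAK_32);

	assert(offset + brk_len <= MAX_SLOT_SIZE);
	put_le(slot, opcode, offset);
	put_le(slot + offset, EBREAK_32, brk_len);
	return offset + brk_len;
}
```

The returned length corresponds to the `offset + 4` argument in the patch; a caller would hand the buffer and that length to a single write routine.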