[V2] arch: patch_text: Fixup last cpu should be master

Message ID 20220313012221.1755483-1-guoren@kernel.org (mailing list archive)
State New, archived
Series [V2] arch: patch_text: Fixup last cpu should be master

Commit Message

Guo Ren March 13, 2022, 1:22 a.m. UTC
From: Guo Ren <guoren@linux.alibaba.com>

These patch_text implementations use the stop_machine_cpuslocked
infrastructure with an atomic cpu_count. The original idea: while the
master CPU patches the text, the other CPUs should wait for it. But the
current implementation uses the first CPU as the master, which cannot
guarantee that the remaining CPUs are waiting. This patch makes the
last CPU the master to close that window.
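
For context, a minimal sketch of the pattern (patch_args and do_patch
are placeholder names, not kernel symbols; the per-arch callbacks in
the diff below all follow the same shape):

static int patch_text_cb(void *data)
{
	struct patch_args *args = data;	/* placeholder type */

	/* The last CPU to arrive becomes the master; every other CPU
	 * is already spinning in the else branch below. */
	if (atomic_inc_return(&args->cpu_count) == num_online_cpus()) {
		do_patch(args);			/* placeholder patch step */
		atomic_inc(&args->cpu_count);	/* release the waiters */
	} else {
		/* Wait until the master has finished patching. */
		while (atomic_read(&args->cpu_count) <= num_online_cpus())
			cpu_relax();
	}
	return 0;
}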

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: Arnd Bergmann <arnd@arndb.de>

---
Changes in V2:
 - Fixup: the last CPU check should use num_online_cpus() (per Max Filippov)
 - Fixup typos found by Max Filippov
---
 arch/arm64/kernel/patching.c      | 4 ++--
 arch/csky/kernel/probes/kprobes.c | 2 +-
 arch/riscv/kernel/patch.c         | 2 +-
 arch/xtensa/kernel/jump_label.c   | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

Comments

Catalin Marinas March 16, 2022, 2:38 p.m. UTC | #1
On Sun, Mar 13, 2022 at 09:22:21AM +0800, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> These patch_text implementations use the stop_machine_cpuslocked
> infrastructure with an atomic cpu_count. The original idea: while the
> master CPU patches the text, the other CPUs should wait for it.

I couldn't find the original intent in the commit logs (at least not in
the arm64 logs). Maybe the intention was for the CPUs to wait for the
text patching to complete rather than for the master CPU to wait for
the others to enter the cpu_relax() loop before patching.

I think your patch makes sense anyway: the master CPU would wait for
all the others to enter the cpu_relax() loop before patching and then
release them with another increment. You probably wouldn't see any
issues in practice unless you insert probes in the multi_cpu_stop()
function (or we could mark this function as __kprobes and get rid of
the extra loops entirely).
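
For reference, the caller side looks roughly like this (a sketch
modeled on the RISC-V patch_text(); error handling elided):

int patch_text(void *addr, u32 insn)
{
	struct patch_insn patch = {
		.addr = addr,
		.insn = insn,
		.cpu_count = ATOMIC_INIT(0),
	};

	/* Runs patch_text_cb() on every online CPU at once; the
	 * counter decides which CPU actually does the patching. */
	return stop_machine_cpuslocked(patch_text_cb, &patch,
				       cpu_online_mask);
}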

> --- a/arch/arm64/kernel/patching.c
> +++ b/arch/arm64/kernel/patching.c
> @@ -117,8 +117,8 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg)
>  	int i, ret = 0;
>  	struct aarch64_insn_patch *pp = arg;
>  
> -	/* The first CPU becomes master */
> -	if (atomic_inc_return(&pp->cpu_count) == 1) {
> +	/* The last CPU becomes master */
> +	if (atomic_inc_return(&pp->cpu_count) == num_online_cpus()) {
>  		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
>  			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
>  							     pp->new_insns[i]);

For arm64:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Guo Ren March 17, 2022, 4:03 p.m. UTC | #2
On Wed, Mar 16, 2022 at 10:38 PM Catalin Marinas
<catalin.marinas@arm.com> wrote:
>
> On Sun, Mar 13, 2022 at 09:22:21AM +0800, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > These patch_text implementations use the stop_machine_cpuslocked
> > infrastructure with an atomic cpu_count. The original idea: while the
> > master CPU patches the text, the other CPUs should wait for it.
>
> I couldn't find the original intent in the commit logs (at least not in
> the arm64 logs). Maybe the intention was for the CPUs to wait for the
> text patching to complete rather than for the master CPU to wait for
> the others to enter the cpu_relax() loop before patching.
>
> I think your patch makes sense anyway: the master CPU would wait for
> all the others to enter the cpu_relax() loop before patching and then
> release them with another increment. You probably wouldn't see any
> issues in practice unless you insert probes in the multi_cpu_stop()
> function (or we could mark this function as __kprobes and get rid of
> the extra loops entirely).
That could depend on the micro-architecture; triggering other harts'
IPIs is not guaranteed by the hardware.

>
> > --- a/arch/arm64/kernel/patching.c
> > +++ b/arch/arm64/kernel/patching.c
> > @@ -117,8 +117,8 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg)
> >       int i, ret = 0;
> >       struct aarch64_insn_patch *pp = arg;
> >
> > -     /* The first CPU becomes master */
> > -     if (atomic_inc_return(&pp->cpu_count) == 1) {
> > +     /* The last CPU becomes master */
> > +     if (atomic_inc_return(&pp->cpu_count) == num_online_cpus()) {
> >               for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
> >                       ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
> >                                                            pp->new_insns[i]);
>
> For arm64:
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Thx

>
> --
> Catalin
Palmer Dabbelt March 20, 2022, 6:05 p.m. UTC | #3
On Sat, 12 Mar 2022 17:22:21 PST (-0800), guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> These patch_text implementations use the stop_machine_cpuslocked
> infrastructure with an atomic cpu_count. The original idea: while the
> master CPU patches the text, the other CPUs should wait for it. But the
> current implementation uses the first CPU as the master, which cannot
> guarantee that the remaining CPUs are waiting. This patch makes the
> last CPU the master to close that window.
>
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Chris Zankel <chris@zankel.net>
> Cc: Arnd Bergmann <arnd@arndb.de>
>
> ---
> Changes in V2:
>  - Fixup: the last CPU check should use num_online_cpus() (per Max Filippov)
>  - Fixup typos found by Max Filippov
> ---
>  arch/arm64/kernel/patching.c      | 4 ++--
>  arch/csky/kernel/probes/kprobes.c | 2 +-
>  arch/riscv/kernel/patch.c         | 2 +-
>  arch/xtensa/kernel/jump_label.c   | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
> index 771f543464e0..33e0fabc0b79 100644
> --- a/arch/arm64/kernel/patching.c
> +++ b/arch/arm64/kernel/patching.c
> @@ -117,8 +117,8 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg)
>  	int i, ret = 0;
>  	struct aarch64_insn_patch *pp = arg;
>
> -	/* The first CPU becomes master */
> -	if (atomic_inc_return(&pp->cpu_count) == 1) {
> +	/* The last CPU becomes master */
> +	if (atomic_inc_return(&pp->cpu_count) == num_online_cpus()) {
>  		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
>  			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
>  							     pp->new_insns[i]);
> diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
> index 42920f25e73c..34ba684d5962 100644
> --- a/arch/csky/kernel/probes/kprobes.c
> +++ b/arch/csky/kernel/probes/kprobes.c
> @@ -30,7 +30,7 @@ static int __kprobes patch_text_cb(void *priv)
>  	struct csky_insn_patch *param = priv;
>  	unsigned int addr = (unsigned int)param->addr;
>
> -	if (atomic_inc_return(&param->cpu_count) == 1) {
> +	if (atomic_inc_return(&param->cpu_count) == num_online_cpus()) {
>  		*(u16 *) addr = cpu_to_le16(param->opcode);
>  		dcache_wb_range(addr, addr + 2);
>  		atomic_inc(&param->cpu_count);
> diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> index 0b552873a577..765004b60513 100644
> --- a/arch/riscv/kernel/patch.c
> +++ b/arch/riscv/kernel/patch.c
> @@ -104,7 +104,7 @@ static int patch_text_cb(void *data)
>  	struct patch_insn *patch = data;
>  	int ret = 0;
>
> -	if (atomic_inc_return(&patch->cpu_count) == 1) {
> +	if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
>  		ret =
>  		    patch_text_nosync(patch->addr, &patch->insn,
>  					    GET_INSN_LENGTH(patch->insn));

Acked-by: Palmer Dabbelt <palmer@rivosinc.com> # RISC-V

It's better if these sorts of things get split up: there's really no 
dependency between these diffs and having them together just makes for a 
merge/test headache for everyone.

I'm OK taking this through the RISC-V tree if other folks ack it, but 
for now I'm going to assume it's going to go in via somewhere else.

> diff --git a/arch/xtensa/kernel/jump_label.c b/arch/xtensa/kernel/jump_label.c
> index 61cf6497a646..b67efcd7e32c 100644
> --- a/arch/xtensa/kernel/jump_label.c
> +++ b/arch/xtensa/kernel/jump_label.c
> @@ -40,7 +40,7 @@ static int patch_text_stop_machine(void *data)
>  {
>  	struct patch *patch = data;
>
> -	if (atomic_inc_return(&patch->cpu_count) == 1) {
> +	if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
>  		local_patch_text(patch->addr, patch->data, patch->sz);
>  		atomic_inc(&patch->cpu_count);
>  	} else {
Guo Ren March 20, 2022, 11:49 p.m. UTC | #4
On Mon, Mar 21, 2022 at 2:05 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Sat, 12 Mar 2022 17:22:21 PST (-0800), guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > These patch_text implementations use the stop_machine_cpuslocked
> > infrastructure with an atomic cpu_count. The original idea: while the
> > master CPU patches the text, the other CPUs should wait for it. But the
> > current implementation uses the first CPU as the master, which cannot
> > guarantee that the remaining CPUs are waiting. This patch makes the
> > last CPU the master to close that window.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Palmer Dabbelt <palmer@dabbelt.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Chris Zankel <chris@zankel.net>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> >
> > ---
> > Changes in V2:
> >  - Fixup: the last CPU check should use num_online_cpus() (per Max Filippov)
> >  - Fixup typos found by Max Filippov
> > ---
> >  arch/arm64/kernel/patching.c      | 4 ++--
> >  arch/csky/kernel/probes/kprobes.c | 2 +-
> >  arch/riscv/kernel/patch.c         | 2 +-
> >  arch/xtensa/kernel/jump_label.c   | 2 +-
> >  4 files changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
> > index 771f543464e0..33e0fabc0b79 100644
> > --- a/arch/arm64/kernel/patching.c
> > +++ b/arch/arm64/kernel/patching.c
> > @@ -117,8 +117,8 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg)
> >       int i, ret = 0;
> >       struct aarch64_insn_patch *pp = arg;
> >
> > -     /* The first CPU becomes master */
> > -     if (atomic_inc_return(&pp->cpu_count) == 1) {
> > +     /* The last CPU becomes master */
> > +     if (atomic_inc_return(&pp->cpu_count) == num_online_cpus()) {
> >               for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
> >                       ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
> >                                                            pp->new_insns[i]);
> > diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
> > index 42920f25e73c..34ba684d5962 100644
> > --- a/arch/csky/kernel/probes/kprobes.c
> > +++ b/arch/csky/kernel/probes/kprobes.c
> > @@ -30,7 +30,7 @@ static int __kprobes patch_text_cb(void *priv)
> >       struct csky_insn_patch *param = priv;
> >       unsigned int addr = (unsigned int)param->addr;
> >
> > -     if (atomic_inc_return(&param->cpu_count) == 1) {
> > +     if (atomic_inc_return(&param->cpu_count) == num_online_cpus()) {
> >               *(u16 *) addr = cpu_to_le16(param->opcode);
> >               dcache_wb_range(addr, addr + 2);
> >               atomic_inc(&param->cpu_count);
> > diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> > index 0b552873a577..765004b60513 100644
> > --- a/arch/riscv/kernel/patch.c
> > +++ b/arch/riscv/kernel/patch.c
> > @@ -104,7 +104,7 @@ static int patch_text_cb(void *data)
> >       struct patch_insn *patch = data;
> >       int ret = 0;
> >
> > -     if (atomic_inc_return(&patch->cpu_count) == 1) {
> > +     if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
> >               ret =
> >                   patch_text_nosync(patch->addr, &patch->insn,
> >                                           GET_INSN_LENGTH(patch->insn));
>
> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> # RISC-V
>
> It's better if these sorts of things get split up: there's really no
> dependency between these diffs and having them together just makes for a
> merge/test headache for everyone.
I'm okay with splitting the patches. @Arnd Bergmann, what's your opinion?
We've got agreement from all the arch maintainers (arm64, csky, riscv, xtensa).

>
> I'm OK taking this through the RISC-V tree if other folks ack it, but
> for now I'm going to assume it's going to go in via somewhere else.
Arnd has given some comments on an unnecessary #error, so maybe I need
to update V2.

>
> > diff --git a/arch/xtensa/kernel/jump_label.c b/arch/xtensa/kernel/jump_label.c
> > index 61cf6497a646..b67efcd7e32c 100644
> > --- a/arch/xtensa/kernel/jump_label.c
> > +++ b/arch/xtensa/kernel/jump_label.c
> > @@ -40,7 +40,7 @@ static int patch_text_stop_machine(void *data)
> >  {
> >       struct patch *patch = data;
> >
> > -     if (atomic_inc_return(&patch->cpu_count) == 1) {
> > +     if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
> >               local_patch_text(patch->addr, patch->data, patch->sz);
> >               atomic_inc(&patch->cpu_count);
> >       } else {
Masami Hiramatsu (Google) March 23, 2022, 1:16 a.m. UTC | #5
On Sun, 13 Mar 2022 09:22:21 +0800
guoren@kernel.org wrote:

> From: Guo Ren <guoren@linux.alibaba.com>
> 
> These patch_text implementations use the stop_machine_cpuslocked
> infrastructure with an atomic cpu_count. The original idea: while the
> master CPU patches the text, the other CPUs should wait for it. But the
> current implementation uses the first CPU as the master, which cannot
> guarantee that the remaining CPUs are waiting. This patch makes the
> last CPU the master to close that window.
> 

This looks good to me. And maybe we'd better backport this to the
stable branches, since it can cause self-modification at the wrong
timing.
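
Concretely, the window this closes (an illustrative interleaving, not
a captured trace):

/* With "== 1" (first CPU is the master):
 *
 *   CPU0: atomic_inc_return() == 1 -> starts patching immediately
 *   CPU1: still on its way into the stop_machine callback, possibly
 *         executing code (e.g. a probed multi_cpu_stop()) that CPU0
 *         is rewriting
 *
 * With "== num_online_cpus()" (last CPU is the master), CPU0 spins in
 * cpu_relax() until CPU1 arrives, so no CPU executes the text while
 * it is being patched. */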

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you,

> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Chris Zankel <chris@zankel.net>
> Cc: Arnd Bergmann <arnd@arndb.de>
> 
> ---
> Changes in V2:
>  - Fixup: the last CPU check should use num_online_cpus() (per Max Filippov)
>  - Fixup typos found by Max Filippov
> ---
>  arch/arm64/kernel/patching.c      | 4 ++--
>  arch/csky/kernel/probes/kprobes.c | 2 +-
>  arch/riscv/kernel/patch.c         | 2 +-
>  arch/xtensa/kernel/jump_label.c   | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
> index 771f543464e0..33e0fabc0b79 100644
> --- a/arch/arm64/kernel/patching.c
> +++ b/arch/arm64/kernel/patching.c
> @@ -117,8 +117,8 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg)
>  	int i, ret = 0;
>  	struct aarch64_insn_patch *pp = arg;
>  
> -	/* The first CPU becomes master */
> -	if (atomic_inc_return(&pp->cpu_count) == 1) {
> +	/* The last CPU becomes master */
> +	if (atomic_inc_return(&pp->cpu_count) == num_online_cpus()) {
>  		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
>  			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
>  							     pp->new_insns[i]);
> diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
> index 42920f25e73c..34ba684d5962 100644
> --- a/arch/csky/kernel/probes/kprobes.c
> +++ b/arch/csky/kernel/probes/kprobes.c
> @@ -30,7 +30,7 @@ static int __kprobes patch_text_cb(void *priv)
>  	struct csky_insn_patch *param = priv;
>  	unsigned int addr = (unsigned int)param->addr;
>  
> -	if (atomic_inc_return(&param->cpu_count) == 1) {
> +	if (atomic_inc_return(&param->cpu_count) == num_online_cpus()) {
>  		*(u16 *) addr = cpu_to_le16(param->opcode);
>  		dcache_wb_range(addr, addr + 2);
>  		atomic_inc(&param->cpu_count);
> diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> index 0b552873a577..765004b60513 100644
> --- a/arch/riscv/kernel/patch.c
> +++ b/arch/riscv/kernel/patch.c
> @@ -104,7 +104,7 @@ static int patch_text_cb(void *data)
>  	struct patch_insn *patch = data;
>  	int ret = 0;
>  
> -	if (atomic_inc_return(&patch->cpu_count) == 1) {
> +	if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
>  		ret =
>  		    patch_text_nosync(patch->addr, &patch->insn,
>  					    GET_INSN_LENGTH(patch->insn));
> diff --git a/arch/xtensa/kernel/jump_label.c b/arch/xtensa/kernel/jump_label.c
> index 61cf6497a646..b67efcd7e32c 100644
> --- a/arch/xtensa/kernel/jump_label.c
> +++ b/arch/xtensa/kernel/jump_label.c
> @@ -40,7 +40,7 @@ static int patch_text_stop_machine(void *data)
>  {
>  	struct patch *patch = data;
>  
> -	if (atomic_inc_return(&patch->cpu_count) == 1) {
> +	if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
>  		local_patch_text(patch->addr, patch->data, patch->sz);
>  		atomic_inc(&patch->cpu_count);
>  	} else {
> -- 
> 2.25.1
>

Patch

diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
index 771f543464e0..33e0fabc0b79 100644
--- a/arch/arm64/kernel/patching.c
+++ b/arch/arm64/kernel/patching.c
@@ -117,8 +117,8 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg)
 	int i, ret = 0;
 	struct aarch64_insn_patch *pp = arg;
 
-	/* The first CPU becomes master */
-	if (atomic_inc_return(&pp->cpu_count) == 1) {
+	/* The last CPU becomes master */
+	if (atomic_inc_return(&pp->cpu_count) == num_online_cpus()) {
 		for (i = 0; ret == 0 && i < pp->insn_cnt; i++)
 			ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i],
 							     pp->new_insns[i]);
diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
index 42920f25e73c..34ba684d5962 100644
--- a/arch/csky/kernel/probes/kprobes.c
+++ b/arch/csky/kernel/probes/kprobes.c
@@ -30,7 +30,7 @@ static int __kprobes patch_text_cb(void *priv)
 	struct csky_insn_patch *param = priv;
 	unsigned int addr = (unsigned int)param->addr;
 
-	if (atomic_inc_return(&param->cpu_count) == 1) {
+	if (atomic_inc_return(&param->cpu_count) == num_online_cpus()) {
 		*(u16 *) addr = cpu_to_le16(param->opcode);
 		dcache_wb_range(addr, addr + 2);
 		atomic_inc(&param->cpu_count);
diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
index 0b552873a577..765004b60513 100644
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -104,7 +104,7 @@ static int patch_text_cb(void *data)
 	struct patch_insn *patch = data;
 	int ret = 0;
 
-	if (atomic_inc_return(&patch->cpu_count) == 1) {
+	if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
 		ret =
 		    patch_text_nosync(patch->addr, &patch->insn,
 					    GET_INSN_LENGTH(patch->insn));
diff --git a/arch/xtensa/kernel/jump_label.c b/arch/xtensa/kernel/jump_label.c
index 61cf6497a646..b67efcd7e32c 100644
--- a/arch/xtensa/kernel/jump_label.c
+++ b/arch/xtensa/kernel/jump_label.c
@@ -40,7 +40,7 @@ static int patch_text_stop_machine(void *data)
 {
 	struct patch *patch = data;
 
-	if (atomic_inc_return(&patch->cpu_count) == 1) {
+	if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) {
 		local_patch_text(patch->addr, patch->data, patch->sz);
 		atomic_inc(&patch->cpu_count);
 	} else {