Message ID | 1582319350-23515-1-git-send-email-cai@lca.pw (mailing list archive)
---|---
State | Changes Requested, archived
Series | [-next] power/qos: fix a data race in pm_qos_*_value
On Fri, Feb 21, 2020 at 10:09 PM Qian Cai <cai@lca.pw> wrote:
>
> cpu_latency_constraints.target_value could be accessed concurrently as
> noticed by KCSAN,

Yes, they could, pretty much by design.

So why *exactly* is this a problem?

> LTP: starting ppoll01
> BUG: KCSAN: data-race in cpu_latency_qos_limit / pm_qos_update_target

It may be a bug under certain conditions, but you don't mention what
conditions they are. Reporting it as a general bug is not accurate at
the very least.

> write to 0xffffffff99081470 of 4 bytes by task 27532 on cpu 2:
>  pm_qos_update_target+0xa4/0x370
>  pm_qos_set_value at kernel/power/qos.c:78
>  cpu_latency_qos_apply+0x3b/0x50
>  cpu_latency_qos_remove_request+0xea/0x270
>  cpu_latency_qos_release+0x4b/0x70
>  __fput+0x187/0x3d0
>  ____fput+0x1e/0x30
>  task_work_run+0xbf/0x130
>  do_exit+0xa78/0xfd0
>  do_group_exit+0x8b/0x180
>  __x64_sys_exit_group+0x2e/0x30
>  do_syscall_64+0x91/0xb05
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> read to 0xffffffff99081470 of 4 bytes by task 0 on cpu 41:
>  cpu_latency_qos_limit+0x1f/0x30
>  pm_qos_read_value at kernel/power/qos.c:55
>  cpuidle_governor_latency_req+0x4f/0x80
>  cpuidle_governor_latency_req at drivers/cpuidle/governor.c:114
>  menu_select+0x6b/0xc29
>  cpuidle_select+0x50/0x70
>  do_idle+0x214/0x280
>  cpu_startup_entry+0x1d/0x1f
>  start_secondary+0x1b2/0x230
>  secondary_startup_64+0xb6/0xc0
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 41 PID: 0 Comm: swapper/41 Tainted: G L 5.6.0-rc2-next-20200221+ #7
> Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019

> The read is outside pm_qos_lock critical section which results in a data
> race.

This is purely theoretical AFAICS and so it should be presented this
way.

Also the call traces above don't add much value to the changelog, so
maybe try to explain what the problem is in English.

> Fix it by adding a pair of READ|WRITE_ONCE().
>
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
>  kernel/power/qos.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/power/qos.c b/kernel/power/qos.c
> index 32927682bcc4..db0bed2cae26 100644
> --- a/kernel/power/qos.c
> +++ b/kernel/power/qos.c
> @@ -52,7 +52,7 @@
>   */
>  s32 pm_qos_read_value(struct pm_qos_constraints *c)
>  {
> -	return c->target_value;
> +	return READ_ONCE(c->target_value);
>  }
>
>  static int pm_qos_get_value(struct pm_qos_constraints *c)
> @@ -75,7 +75,7 @@ static int pm_qos_get_value(struct pm_qos_constraints *c)
>
>  static void pm_qos_set_value(struct pm_qos_constraints *c, s32 value)
>  {
> -	c->target_value = value;
> +	WRITE_ONCE(c->target_value, value);
>  }
>
>  /**
> --
> 1.8.3.1
>
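As background for the exchange below: READ_ONCE()/WRITE_ONCE() essentially
turn a plain access into a volatile one, which forbids the compiler from
tearing, fusing, or re-issuing it. A simplified sketch of the idea (the
_SKETCH names are hypothetical; the real kernel definitions in
include/linux/compiler.h also dispatch on access size):

    #define READ_ONCE_SKETCH(x) \
            (*(const volatile typeof(x) *)&(x))

    #define WRITE_ONCE_SKETCH(x, val) \
            do { *(volatile typeof(x) *)&(x) = (val); } while (0)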
> On Feb 23, 2020, at 7:12 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> It may be a bug under certain conditions, but you don't mention what
> conditions they are. Reporting it as a general bug is not accurate at
> the very least.

Could we rule out load tearing, store tearing and reload of global_req
in cpuidle_governor_latency_req() for all compilers and architectures,
any of which could introduce logic bugs?

    int global_req = cpu_latency_qos_limit();

    if (device_req > global_req)
            device_req = global_req;

Under register pressure, the compiler might get rid of the tmp
variable, i.e.,

    if (device_req > cpu_latency_qos_limit())
            /* race with the writer */
            device_req = cpu_latency_qos_limit();
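The reload hazard Qian describes, as a minimal standalone sketch
(hypothetical names; target_value stands in for c->target_value and is
assumed to be written concurrently):

    /* Shared with a concurrent writer; deliberately not annotated. */
    static int target_value;

    static int governor_latency_req_sketch(int device_req)
    {
            /*
             * Plain load: the compiler may discard global_req under
             * register pressure and re-read target_value at each use,
             * turning one load into two.
             */
            int global_req = target_value;

            if (device_req > global_req)
                    device_req = global_req;

            return device_req;
    }

If the two loads straddle a concurrent write, the comparison may use the
old limit while the assignment picks up the new one, returning a value
that is neither min(device_req, old) nor min(device_req, new).
READ_ONCE(target_value) pins the read to exactly one access and rules
this out.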
On Mon, Feb 24, 2020 at 2:01 AM Qian Cai <cai@lca.pw> wrote:
>
> > On Feb 23, 2020, at 7:12 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > It may be a bug under certain conditions, but you don't mention what
> > conditions they are. Reporting it as a general bug is not accurate at
> > the very least.
>
> Could we rule out load tearing, store tearing and reload of global_req
> in cpuidle_governor_latency_req() for all compilers and architectures,
> any of which could introduce logic bugs?
>
>     int global_req = cpu_latency_qos_limit();
>
>     if (device_req > global_req)
>             device_req = global_req;
>
> Under register pressure, the compiler might get rid of the tmp
> variable, i.e.,
>
>     if (device_req > cpu_latency_qos_limit())
>             /* race with the writer */
>             device_req = cpu_latency_qos_limit();

Yes, there is a race here with or without the WRITE_ONCE()/READ_ONCE()
annotations (note that these annotations don't prevent CPUs from
reordering things, so device_req may be set before global_req
regardless).

However, worst-case it may cause an old value to be used and that can
happen anyway if the entire cpuidle_governor_latency_req() runs
between the curr_value update and pm_qos_set_value() in
pm_qos_update_target(), for example.

IOW, there is no guarantee that the new value will be used immediately
after updating a QoS request anyway.

I agree with adding the annotations (I was considering posting a patch
doing that myself), but just as a matter of making the intention
clear.
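A sketch of the window Rafael refers to (paraphrasing
pm_qos_update_target() in kernel/power/qos.c; locking and plist
manipulation elided):

    /*
     * Writer: pm_qos_update_target()    Reader: cpuidle governor
     * -------------------------------   -------------------------------
     * plist_del()/plist_add() under
     * pm_qos_lock
     * curr_value = pm_qos_get_value(c);
     *                                   cpu_latency_qos_limit()
     *                                     -> pm_qos_read_value(c)
     *                                     -> returns the previous
     *                                        target_value
     * pm_qos_set_value(c, curr_value);
     *
     * The reader observes a stale but previously valid constraint.  The
     * READ|WRITE_ONCE() annotations do not (and need not) close this
     * window; they only guarantee that each individual access is whole.
     */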
On Mon, 2020-02-24 at 10:54 +0100, Rafael J. Wysocki wrote:
> On Mon, Feb 24, 2020 at 2:01 AM Qian Cai <cai@lca.pw> wrote:
> >
> > > On Feb 23, 2020, at 7:12 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > It may be a bug under certain conditions, but you don't mention what
> > > conditions they are. Reporting it as a general bug is not accurate at
> > > the very least.
> >
> > Could we rule out load tearing, store tearing and reload of global_req
> > in cpuidle_governor_latency_req() for all compilers and architectures,
> > any of which could introduce logic bugs?
> >
> >     int global_req = cpu_latency_qos_limit();
> >
> >     if (device_req > global_req)
> >             device_req = global_req;
> >
> > Under register pressure, the compiler might get rid of the tmp
> > variable, i.e.,
> >
> >     if (device_req > cpu_latency_qos_limit())
> >             /* race with the writer */
> >             device_req = cpu_latency_qos_limit();
>
> Yes, there is a race here with or without the WRITE_ONCE()/READ_ONCE()
> annotations (note that these annotations don't prevent CPUs from
> reordering things, so device_req may be set before global_req
> regardless).
>
> However, worst-case it may cause an old value to be used and that can
> happen anyway if the entire cpuidle_governor_latency_req() runs
> between the curr_value update and pm_qos_set_value() in
> pm_qos_update_target(), for example.
>
> IOW, there is no guarantee that the new value will be used immediately
> after updating a QoS request anyway.
>
> I agree with adding the annotations (I was considering posting a patch
> doing that myself), but just as a matter of making the intention
> clear.

OK, how about this updated text?

[PATCH -next] power/qos: annotate a data race in pm_qos_*_value

cpu_latency_constraints.target_value could be accessed concurrently via,

    cpu_latency_qos_apply
      pm_qos_update_target
        pm_qos_set_value

    cpuidle_governor_latency_req
      cpu_latency_qos_limit
        pm_qos_read_value

The read is outside the pm_qos_lock critical section, which results in
a data race. However, the worst case is that an old value will be used,
and that can happen anyway, so annotate this data race using a pair of
READ|WRITE_ONCE().
On Mon, Feb 24, 2020 at 8:02 PM Qian Cai <cai@lca.pw> wrote:
>
> On Mon, 2020-02-24 at 10:54 +0100, Rafael J. Wysocki wrote:
> > On Mon, Feb 24, 2020 at 2:01 AM Qian Cai <cai@lca.pw> wrote:
> > >
> > > > On Feb 23, 2020, at 7:12 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > >
> > > > It may be a bug under certain conditions, but you don't mention what
> > > > conditions they are. Reporting it as a general bug is not accurate at
> > > > the very least.
> > >
> > > Could we rule out load tearing, store tearing and reload of global_req
> > > in cpuidle_governor_latency_req() for all compilers and architectures,
> > > any of which could introduce logic bugs?
> > >
> > >     int global_req = cpu_latency_qos_limit();
> > >
> > >     if (device_req > global_req)
> > >             device_req = global_req;
> > >
> > > Under register pressure, the compiler might get rid of the tmp
> > > variable, i.e.,
> > >
> > >     if (device_req > cpu_latency_qos_limit())
> > >             /* race with the writer */
> > >             device_req = cpu_latency_qos_limit();
> >
> > Yes, there is a race here with or without the WRITE_ONCE()/READ_ONCE()
> > annotations (note that these annotations don't prevent CPUs from
> > reordering things, so device_req may be set before global_req
> > regardless).
> >
> > However, worst-case it may cause an old value to be used and that can
> > happen anyway if the entire cpuidle_governor_latency_req() runs
> > between the curr_value update and pm_qos_set_value() in
> > pm_qos_update_target(), for example.
> >
> > IOW, there is no guarantee that the new value will be used immediately
> > after updating a QoS request anyway.
> >
> > I agree with adding the annotations (I was considering posting a patch
> > doing that myself), but just as a matter of making the intention
> > clear.
>
> OK, how about this updated text?
>
> [PATCH -next] power/qos: annotate a data race in pm_qos_*_value
>
> cpu_latency_constraints.target_value could be accessed concurrently via,
>
>     cpu_latency_qos_apply
>       pm_qos_update_target
>         pm_qos_set_value
>
>     cpuidle_governor_latency_req
>       cpu_latency_qos_limit
>         pm_qos_read_value
>
> The read is outside the pm_qos_lock critical section, which results in
> a data race. However, the worst case is that an old value will be used,
> and that can happen anyway, so annotate this data race using a pair of
> READ|WRITE_ONCE().

I would rather say something like this:

The target_value field in struct pm_qos_constraints is used for
lockless access to the effective constraint value of a given QoS list,
so the readers of it cannot expect it to always reflect the most
recent effective constraint value. However, they can and do expect it
to be equal to a valid effective constraint value computed at a
certain time in the past (even though it may not be the most recent
one), so add READ|WRITE_ONCE() annotations around the target_value
accesses to prevent the compiler from possibly causing that
expectation to be unmet by generating code in an exceptionally
convoluted way.
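An illustration of the "exceptionally convoluted" codegen the proposed
wording guards against: nothing in the C standard stops a plain 32-bit
store from being emitted in pieces (hypothetical codegen; real compilers
rarely do this, but it is legal for plain accesses):

    /*
     * The plain store
     *
     *         c->target_value = value;
     *
     * may legally be emitted as two 16-bit halves:
     *
     *         *(u16 *)&c->target_value       = value;        /* low half  */
     *         *((u16 *)&c->target_value + 1) = value >> 16;  /* high half */
     *
     * A lockless reader running between the two halves sees a mix of the
     * old and new constraint: a value that was never a valid effective
     * constraint at any point in time, violating the expectation stated
     * above.  WRITE_ONCE() on the store (and READ_ONCE() on the reader
     * side) forbids such a split.
     */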
> On Feb 25, 2020, at 7:10 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> The target_value field in struct pm_qos_constraints is used for
> lockless access to the effective constraint value of a given QoS list,
> so the readers of it cannot expect it to always reflect the most
> recent effective constraint value. However, they can and do expect it
> to be equal to a valid effective constraint value computed at a
> certain time in the past (even though it may not be the most recent
> one), so add READ|WRITE_ONCE() annotations around the target_value
> accesses to prevent the compiler from possibly causing that
> expectation to be unmet by generating code in an exceptionally
> convoluted way.

Perfect. I'll send a v2 for that unless you would like to squash it in.
diff --git a/kernel/power/qos.c b/kernel/power/qos.c
index 32927682bcc4..db0bed2cae26 100644
--- a/kernel/power/qos.c
+++ b/kernel/power/qos.c
@@ -52,7 +52,7 @@
  */
 s32 pm_qos_read_value(struct pm_qos_constraints *c)
 {
-	return c->target_value;
+	return READ_ONCE(c->target_value);
 }
 
 static int pm_qos_get_value(struct pm_qos_constraints *c)
@@ -75,7 +75,7 @@ static int pm_qos_get_value(struct pm_qos_constraints *c)
 
 static void pm_qos_set_value(struct pm_qos_constraints *c, s32 value)
 {
-	c->target_value = value;
+	WRITE_ONCE(c->target_value, value);
 }
 
 /**
cpu_latency_constraints.target_value could be accessed concurrently as
noticed by KCSAN,

 LTP: starting ppoll01
 BUG: KCSAN: data-race in cpu_latency_qos_limit / pm_qos_update_target

 write to 0xffffffff99081470 of 4 bytes by task 27532 on cpu 2:
  pm_qos_update_target+0xa4/0x370
  pm_qos_set_value at kernel/power/qos.c:78
  cpu_latency_qos_apply+0x3b/0x50
  cpu_latency_qos_remove_request+0xea/0x270
  cpu_latency_qos_release+0x4b/0x70
  __fput+0x187/0x3d0
  ____fput+0x1e/0x30
  task_work_run+0xbf/0x130
  do_exit+0xa78/0xfd0
  do_group_exit+0x8b/0x180
  __x64_sys_exit_group+0x2e/0x30
  do_syscall_64+0x91/0xb05
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

 read to 0xffffffff99081470 of 4 bytes by task 0 on cpu 41:
  cpu_latency_qos_limit+0x1f/0x30
  pm_qos_read_value at kernel/power/qos.c:55
  cpuidle_governor_latency_req+0x4f/0x80
  cpuidle_governor_latency_req at drivers/cpuidle/governor.c:114
  menu_select+0x6b/0xc29
  cpuidle_select+0x50/0x70
  do_idle+0x214/0x280
  cpu_startup_entry+0x1d/0x1f
  start_secondary+0x1b2/0x230
  secondary_startup_64+0xb6/0xc0

 Reported by Kernel Concurrency Sanitizer on:
 CPU: 41 PID: 0 Comm: swapper/41 Tainted: G L 5.6.0-rc2-next-20200221+ #7
 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019

The read is outside pm_qos_lock critical section which results in a data
race.

Fix it by adding a pair of READ|WRITE_ONCE().

Signed-off-by: Qian Cai <cai@lca.pw>
---
 kernel/power/qos.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)