diff mbox series

KVM: kvm_vcpu_kick: Do not read potentially-stale vcpu->cpu

Message ID 20210630031037.584190-1-venkateshs@chromium.org (mailing list archive)
State New, archived
Headers show
Series KVM: kvm_vcpu_kick: Do not read potentially-stale vcpu->cpu | expand

Commit Message

Venkatesh Srinivas June 30, 2021, 3:10 a.m. UTC
vcpu->cpu contains the last cpu a vcpu run/is running on;
kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI
to the last CPU; vcpu->cpu is updated on sched_in when a vcpu
moves to a new CPU, so it possible for the vcpu->cpu field to
be stale.

kvm_vcpu_kick also read vcpu->cpu early with a plain read,
while vcpu->cpu could be concurrently written. This caused
a data race, found by kcsan:

write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0:
 kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018
 kvm_sched_in virt/kvm/kvm_main.c:4864

vs
 kvm_vcpu_kick virt/kvm/kvm_main.c:2909
 kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287
 async_pf_execute virt/kvm/async_pf.c:79
 ...

Use a READ_ONCE to atomically read vcpu->cpu and avoid the
data race.

Found by kcsan & syzbot.

Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>
---
 virt/kvm/kvm_main.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

David Matlack June 30, 2021, 4:24 p.m. UTC | #1
On Tue, Jun 29, 2021 at 08:10:37PM -0700, Venkatesh Srinivas wrote:
> vcpu->cpu contains the last cpu a vcpu run/is running on;
> kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI
> to the last CPU; vcpu->cpu is updated on sched_in when a vcpu
> moves to a new CPU, so it possible for the vcpu->cpu field to
> be stale.
> 
> kvm_vcpu_kick also read vcpu->cpu early with a plain read,
> while vcpu->cpu could be concurrently written. This caused
> a data race, found by kcsan:
> 
> write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0:
>  kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018
>  kvm_sched_in virt/kvm/kvm_main.c:4864
> 
> vs
>  kvm_vcpu_kick virt/kvm/kvm_main.c:2909
>  kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287
>  async_pf_execute virt/kvm/async_pf.c:79
>  ...
> 
> Use a READ_ONCE to atomically read vcpu->cpu and avoid the
> data race.
> 
> Found by kcsan & syzbot.
> 
> Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  virt/kvm/kvm_main.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 46fb042837d2..0525f42afb7d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3058,16 +3058,18 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
>   */
>  void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
>  {
> -	int me;
> -	int cpu = vcpu->cpu;
> +	int me, cpu;
>  
>  	if (kvm_vcpu_wake_up(vcpu))
>  		return;
>  
>  	me = get_cpu();
> -	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> -		if (kvm_arch_vcpu_should_kick(vcpu))
> +	if (kvm_arch_vcpu_should_kick(vcpu)) {
> +		cpu = READ_ONCE(vcpu->cpu);
> +		WARN_ON_ONCE(cpu == me);

nit: A comment here may be useful to explain the interaction with
kvm_arch_vcpu_should_kick(). Took me a minute to understand why you
added the warning.

> +		if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
>  			smp_send_reschedule(cpu);
> +	}
>  	put_cpu();
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
> -- 
> 2.30.2
>
Sean Christopherson July 9, 2021, 5:47 p.m. UTC | #2
On Wed, Jun 30, 2021, David Matlack wrote:
> On Tue, Jun 29, 2021 at 08:10:37PM -0700, Venkatesh Srinivas wrote:
> > vcpu->cpu contains the last cpu a vcpu run/is running on;
> > kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI
> > to the last CPU; vcpu->cpu is updated on sched_in when a vcpu
> > moves to a new CPU, so it possible for the vcpu->cpu field to
> > be stale.

This fails to document the actual bug being fixed, and why the fix is correct.
The fact that vcpu->cpu may be stale is itself not a bug, e.g. even with this
patch, the IPI can be sent to the "wrong", i.e. smp_send_reschedule() can still
consume a stale vcpu->cpu.

The bug that's being fixed is that grabbing the potentially-stale vcpu->cpu
_before_ kvm_arch_vcpu_should_kick() can cause KVM to send an IPI to the wrong
CPU _and_ let the vCPU run longer than intended.  The fix is correct because KVM
doesn't truly care about sending an IPI to the correct pCPU, it only cares about
about kicking the pCPU out of the guest.  If the vCPU has exited and been loaded
on a different pCPU after kvm_arch_vcpu_should_kick(), then its mission has been
accomplished and the IPI (to the wrong pCPU) is truly spurious.

> > kvm_vcpu_kick also read vcpu->cpu early with a plain read,
> > while vcpu->cpu could be concurrently written. This caused
> > a data race, found by kcsan:
> > 
> > write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0:
> >  kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018
> >  kvm_sched_in virt/kvm/kvm_main.c:4864
> > 
> > vs
> >  kvm_vcpu_kick virt/kvm/kvm_main.c:2909
> >  kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287
> >  async_pf_execute virt/kvm/async_pf.c:79
> >  ...
> > 
> > Use a READ_ONCE to atomically read vcpu->cpu and avoid the
> > data race.
> > 
> > Found by kcsan & syzbot.
> > 
> > Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>

Suggested-by: Sean Christopherson <seanjc@google.com>

> Reviewed-by: David Matlack <dmatlack@google.com>
> 
> > ---
> >  virt/kvm/kvm_main.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 46fb042837d2..0525f42afb7d 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -3058,16 +3058,18 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
> >   */
> >  void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
> >  {
> > -	int me;
> > -	int cpu = vcpu->cpu;
> > +	int me, cpu;
> >  
> >  	if (kvm_vcpu_wake_up(vcpu))
> >  		return;
> >  
> >  	me = get_cpu();
> > -	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> > -		if (kvm_arch_vcpu_should_kick(vcpu))
> > +	if (kvm_arch_vcpu_should_kick(vcpu)) {
> > +		cpu = READ_ONCE(vcpu->cpu);
> > +		WARN_ON_ONCE(cpu == me);
> 
> nit: A comment here may be useful to explain the interaction with
> kvm_arch_vcpu_should_kick(). Took me a minute to understand why you
> added the warning.

Agreed.

> > +		if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> >  			smp_send_reschedule(cpu);
> > +	}
> >  	put_cpu();
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
> > -- 
> > 2.30.2
> >
Venkatesh Srinivas July 13, 2021, 1:45 a.m. UTC | #3
On Fri, Jul 9, 2021 at 10:47 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Jun 30, 2021, David Matlack wrote:
> > On Tue, Jun 29, 2021 at 08:10:37PM -0700, Venkatesh Srinivas wrote:
> > > vcpu->cpu contains the last cpu a vcpu run/is running on;
> > > kvm_vcpu_kick is used to 'kick' a guest vcpu by sending an IPI
> > > to the last CPU; vcpu->cpu is updated on sched_in when a vcpu
> > > moves to a new CPU, so it possible for the vcpu->cpu field to
> > > be stale.
>
> This fails to document the actual bug being fixed, and why the fix is correct.
> The fact that vcpu->cpu may be stale is itself not a bug, e.g. even with this
> patch, the IPI can be sent to the "wrong", i.e. smp_send_reschedule() can still
> consume a stale vcpu->cpu.
>
> The bug that's being fixed is that grabbing the potentially-stale vcpu->cpu
> _before_ kvm_arch_vcpu_should_kick() can cause KVM to send an IPI to the wrong
> CPU _and_ let the vCPU run longer than intended.  The fix is correct because KVM
> doesn't truly care about sending an IPI to the correct pCPU, it only cares about
> about kicking the pCPU out of the guest.  If the vCPU has exited and been loaded
> on a different pCPU after kvm_arch_vcpu_should_kick(), then its mission has been
> accomplished and the IPI (to the wrong pCPU) is truly spurious.

Will improve the comment for V2.

> > > kvm_vcpu_kick also read vcpu->cpu early with a plain read,
> > > while vcpu->cpu could be concurrently written. This caused
> > > a data race, found by kcsan:
> > >
> > > write to 0xffff8880274e8460 of 4 bytes by task 30718 on cpu 0:
> > >  kvm_arch_vcpu_load arch/x86/kvm/x86.c:4018
> > >  kvm_sched_in virt/kvm/kvm_main.c:4864
> > >
> > > vs
> > >  kvm_vcpu_kick virt/kvm/kvm_main.c:2909
> > >  kvm_arch_async_page_present_queued arch/x86/kvm/x86.c:11287
> > >  async_pf_execute virt/kvm/async_pf.c:79
> > >  ...
> > >
> > > Use a READ_ONCE to atomically read vcpu->cpu and avoid the
> > > data race.
> > >
> > > Found by kcsan & syzbot.
> > >
> > > Signed-off-by: Venkatesh Srinivas <venkateshs@chromium.org>
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
>
> > Reviewed-by: David Matlack <dmatlack@google.com>
> >
> > > ---
> > >  virt/kvm/kvm_main.c | 10 ++++++----
> > >  1 file changed, 6 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > index 46fb042837d2..0525f42afb7d 100644
> > > --- a/virt/kvm/kvm_main.c
> > > +++ b/virt/kvm/kvm_main.c
> > > @@ -3058,16 +3058,18 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
> > >   */
> > >  void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
> > >  {
> > > -   int me;
> > > -   int cpu = vcpu->cpu;
> > > +   int me, cpu;
> > >
> > >     if (kvm_vcpu_wake_up(vcpu))
> > >             return;
> > >
> > >     me = get_cpu();
> > > -   if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> > > -           if (kvm_arch_vcpu_should_kick(vcpu))
> > > +   if (kvm_arch_vcpu_should_kick(vcpu)) {
> > > +           cpu = READ_ONCE(vcpu->cpu);
> > > +           WARN_ON_ONCE(cpu == me);
> >
> > nit: A comment here may be useful to explain the interaction with
> > kvm_arch_vcpu_should_kick(). Took me a minute to understand why you
> > added the warning.
>
> Agreed.

Will do for V2

> > > +           if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> > >                     smp_send_reschedule(cpu);
> > > +   }
> > >     put_cpu();
> > >  }
> > >  EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
> > > --
> > > 2.30.2
> > >
diff mbox series

Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 46fb042837d2..0525f42afb7d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3058,16 +3058,18 @@  EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
  */
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 {
-	int me;
-	int cpu = vcpu->cpu;
+	int me, cpu;
 
 	if (kvm_vcpu_wake_up(vcpu))
 		return;
 
 	me = get_cpu();
-	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
-		if (kvm_arch_vcpu_should_kick(vcpu))
+	if (kvm_arch_vcpu_should_kick(vcpu)) {
+		cpu = READ_ONCE(vcpu->cpu);
+		WARN_ON_ONCE(cpu == me);
+		if ((unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
 			smp_send_reschedule(cpu);
+	}
 	put_cpu();
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_kick);