[v3] KVM: VMX: fix lockdep warning on posted intr wakeup

Split a single per_cpu lock on wakeup_list into a sched_in lock and
a sched_out lock to break the possible circular locking dependency
reported by lockdep.

The lockdep complains about "possible circular locking dependency
detected".

Chain exists of:
   &p->pi_lock --> &rq->__lock --> &per_cpu(wakeup_vcpus_on_cpu_lock, cpu)

  Possible unsafe locking scenario:

        CPU0                CPU1
        ----                ----
   lock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
                            lock(&rq->__lock);
                            lock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
   lock(&p->pi_lock);

  *** DEADLOCK ***

path irq,
        sysvec_kvm_posted_intr_wakeup_ipi() --> pi_wakeup_handler()
        --> kvm_vcpu_wake_up() --> try_to_wake_up(),
        the lock order is
        &per_cpu(wakeup_vcpus_on_cpu_lock, cpu) --> &p->pi_lock.

path sched_out,
        vcpu_block() --> schedule() --> kvm_sched_out() --> vmx_vcpu_put()
        --> vmx_vcpu_pi_put() --> pi_enable_wakeup_handler(),
        the lock order is
        &rq->__lock --> &per_cpu(wakeup_vcpus_on_cpu_lock, cpu).

However, it is found out that path irq and sched_out are not racing
because: path irq is in interrupt context, path sched_out is in interrupt
disabled context, at the same pcpu as path irq.

Consider path sched_out is the very path that tells lockdep the lock
ordering: &rq->__lock --> &per_cpu(wakeup_vcpus_on_cpu_lock, cpu),
it's desired for path irq not to hold the same per cpu lock as path
sched_out.

So, in the patch, a single wakeup_list lock is divided into a sched_in lock
and a sched_out lock.
- "path sched_out": add vcpu on pcpu (irq disabled)
              It takes sched_out lock.

- "path irq": read vcpu list on pcpu (irq context, running on the same pcpu
              as "path sched_out")
              It only takes sched_in lock.

- "path sched_in": delete vcpu on previous pCPU.
                  (irq disabled, running on the same or different pCPU
                  as "path irq")
                  It takes sched_in and sched_out lock as it can race
                  with the other two paths.

The lock ordering after this patch are:
- &p->pi_lock --> &rq->__lock -->
  &per_cpu(wakeup_vcpus_on_cpu_lock_out, cpu)
- &per_cpu(wakeup_vcpus_on_cpu_lock_in, cpu) -->
  &per_cpu(wakeup_vcpus_on_cpu_lock_out, cpu)
- &per_cpu(wakeup_vcpus_on_cpu_lock_in, cpu) --> &p->pi_lock

Currently, &rq->__lock is not held in "path sched_in".
However, if in future "path sched_in" takes &p->pi_lock or &rq->__lock,
lockdep is able to detect and warn in that case.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
[sean: path sched_out and path irq does not race, path sched_in does not
take &rq->__lock]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
v3:
removed unnecessary blank line changes and delete a improper
description.

v2:
switch from rcu to two raw_spin_locks. as rcu may not let the irq
handler see list removal timely. (sean)
https://lore.kernel.org/all/20230313094753.8345-1-yan.y.zhao@intel.com/

v1:
https://lore.kernel.org/all/20230310155955.29652-1-yan.y.zhao@intel.com/
---
 arch/x86/kvm/vmx/posted_intr.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

base-commit: 89400df96a7570b651404bbc3b7afe627c52a192

Message ID	20230313111022.13793-1-yan.y.zhao@intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@vger.kernel.org> X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FA51C61DA4 for <kvm@archiver.kernel.org>; Mon, 13 Mar 2023 11:35:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230179AbjCMLe5 (ORCPT <rfc822;kvm@archiver.kernel.org>); Mon, 13 Mar 2023 07:34:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37072 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230452AbjCMLeq (ORCPT <rfc822;kvm@vger.kernel.org>); Mon, 13 Mar 2023 07:34:46 -0400 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F1092AD00; Mon, 13 Mar 2023 04:34:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678707259; x=1710243259; h=from:to:cc:subject:date:message-id; bh=tMqMRDn9qbbSMIPbslCNua9xCzVDM6s8e/f/VlLQpK0=; b=BIUN2DIR4nwESbFsOJXxeBVa/Kyv1VD74yp7KnMw0JHuNcU4588BedxV 42tFsTp/75odx0+SZ0kZsjJ094cE4218vGL9UsR4WOkqlKcczW16tesOu KgjnmIUSP9lzlVkVjIYT3Vt49WlWKuNl/es9bxifRcA0lRMQLG9eu31ue fax2c9GtssNhQP3K4RquNn4R1umG5FG+UmkAUKJEuXUEJVcqx+7C7lFuU nbkSlZqgQORYxUIcU16p8FMfwL2nBEfTYpoqoGyHr6XL0KgUj4QYT0f7F pAmtjGqjI5j9NU8FdZdEPEqsSxyVBcTxSGKQ4USG+6lweqsiw0nogpP+V g==; X-IronPort-AV: E=McAfee;i="6500,9779,10647"; a="334599332" X-IronPort-AV: E=Sophos;i="5.98,256,1673942400"; d="scan'208";a="334599332" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2023 04:34:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10647"; a="852735232" X-IronPort-AV: E=Sophos;i="5.98,256,1673942400"; d="scan'208";a="852735232" Received: from yzhao56-desk.sh.intel.com ([10.239.159.62]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2023 04:34:09 -0700 From: Yan Zhao <yan.y.zhao@intel.com> To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: seanjc@google.com, pbonzini@redhat.com, Yan Zhao <yan.y.zhao@intel.com> Subject: [PATCH v3] KVM: VMX: fix lockdep warning on posted intr wakeup Date: Mon, 13 Mar 2023 19:10:22 +0800 Message-Id: <20230313111022.13793-1-yan.y.zhao@intel.com> X-Mailer: git-send-email 2.17.1 Precedence: bulk List-ID: <kvm.vger.kernel.org> X-Mailing-List: kvm@vger.kernel.org
Series	[v3] KVM: VMX: fix lockdep warning on posted intr wakeup \| expand [v3] KVM: VMX: fix lockdep warning on posted intr wakeup

[v3] KVM: VMX: fix lockdep warning on posted intr wakeup

Commit Message

Comments

Patch