From patchwork Sun Nov 12 04:16:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453247 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 38E3B1C3A for ; Sun, 12 Nov 2023 04:12:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="cxzp4Fw8" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90A8730D1; Sat, 11 Nov 2023 20:12:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762328; x=1731298328; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xLzahHQ/E9A0FWloIOuFYyseO5OqbdkcivVWmoaEYAQ=; b=cxzp4Fw83WOv2Y9+gcofcyH27FBYprYbLz4ST95C3J83d1IMqNjBg5K3 hka9af1fhgvJffgJdoMk+iiOITliXZhj8fNdrGeHa8UkeKh+rubGHa/nW jSfwFd7oIhik+eERgM6m7SPVzPapWXfjt67VTvVa21d7PChp29CluK6QU KlkVHm9SFp+4FOjPBPP95gZ+iV0kKAh6JyS3OUZLRflePNyB8oP4T7JD1 DcI73F4u1bWROWRkmKiCulYIvTBy7IF2c3JwFcW9PO3mF73bCn/TMgTIx D8CnhFPf9hTkEBprkpLjB9y08KlXMIKgwtdCyG2mS0wgqSzhYgNmIorsG A==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533846" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533846" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:07 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936736" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936736" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:07 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 01/13] x86: Move posted interrupt descriptor out of vmx code Date: Sat, 11 Nov 2023 20:16:31 -0800 Message-Id: <20231112041643.2868316-2-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To prepare native usage of posted interrupt, move PID declaration out of VMX code such that they can be shared. Signed-off-by: Jacob Pan --- arch/x86/include/asm/posted_intr.h | 97 ++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/posted_intr.h | 93 +--------------------------- arch/x86/kvm/vmx/vmx.c | 1 + arch/x86/kvm/vmx/vmx.h | 2 +- 4 files changed, 100 insertions(+), 93 deletions(-) create mode 100644 arch/x86/include/asm/posted_intr.h diff --git a/arch/x86/include/asm/posted_intr.h b/arch/x86/include/asm/posted_intr.h new file mode 100644 index 000000000000..9f2fa38fa57b --- /dev/null +++ b/arch/x86/include/asm/posted_intr.h @@ -0,0 +1,97 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _X86_POSTED_INTR_H +#define _X86_POSTED_INTR_H + +#define POSTED_INTR_ON 0 +#define POSTED_INTR_SN 1 + +#define PID_TABLE_ENTRY_VALID 1 + +/* Posted-Interrupt Descriptor */ +struct pi_desc { + u32 pir[8]; /* Posted interrupt requested */ + union { + struct { + /* bit 256 - Outstanding Notification */ + u16 on : 1, + /* bit 257 - Suppress Notification */ + sn : 1, + /* bit 271:258 - Reserved */ + rsvd_1 : 14; + /* bit 279:272 - Notification Vector */ + u8 nv; + /* bit 287:280 - Reserved */ + u8 rsvd_2; + /* bit 319:288 - Notification Destination */ + u32 ndst; + }; + u64 control; + }; + u32 rsvd[6]; +} __aligned(64); + +static inline bool pi_test_and_set_on(struct pi_desc *pi_desc) +{ + return test_and_set_bit(POSTED_INTR_ON, + (unsigned long *)&pi_desc->control); +} + +static inline bool pi_test_and_clear_on(struct pi_desc *pi_desc) +{ + return test_and_clear_bit(POSTED_INTR_ON, + (unsigned long *)&pi_desc->control); +} + +static inline bool pi_test_and_clear_sn(struct pi_desc *pi_desc) +{ + return test_and_clear_bit(POSTED_INTR_SN, + (unsigned long *)&pi_desc->control); +} + +static inline bool pi_test_and_set_pir(int vector, struct pi_desc *pi_desc) +{ + return test_and_set_bit(vector, (unsigned long *)pi_desc->pir); +} + +static inline bool pi_is_pir_empty(struct pi_desc *pi_desc) +{ + return bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS); +} + +static inline void pi_set_sn(struct pi_desc *pi_desc) +{ + set_bit(POSTED_INTR_SN, + (unsigned long *)&pi_desc->control); +} + +static inline void pi_set_on(struct pi_desc *pi_desc) +{ + set_bit(POSTED_INTR_ON, + (unsigned long *)&pi_desc->control); +} + +static inline void pi_clear_on(struct pi_desc *pi_desc) +{ + clear_bit(POSTED_INTR_ON, + (unsigned long *)&pi_desc->control); +} + +static inline void pi_clear_sn(struct pi_desc *pi_desc) +{ + clear_bit(POSTED_INTR_SN, + (unsigned long *)&pi_desc->control); +} + +static inline bool pi_test_on(struct pi_desc *pi_desc) +{ + return test_bit(POSTED_INTR_ON, + (unsigned long *)&pi_desc->control); +} + +static inline bool pi_test_sn(struct pi_desc *pi_desc) +{ + return test_bit(POSTED_INTR_SN, + (unsigned long *)&pi_desc->control); +} + +#endif /* _X86_POSTED_INTR_H */ diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h index 26992076552e..6b2a0226257e 100644 --- a/arch/x86/kvm/vmx/posted_intr.h +++ b/arch/x86/kvm/vmx/posted_intr.h @@ -1,98 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ #ifndef __KVM_X86_VMX_POSTED_INTR_H #define __KVM_X86_VMX_POSTED_INTR_H - -#define POSTED_INTR_ON 0 -#define POSTED_INTR_SN 1 - -#define PID_TABLE_ENTRY_VALID 1 - -/* Posted-Interrupt Descriptor */ -struct pi_desc { - u32 pir[8]; /* Posted interrupt requested */ - union { - struct { - /* bit 256 - Outstanding Notification */ - u16 on : 1, - /* bit 257 - Suppress Notification */ - sn : 1, - /* bit 271:258 - Reserved */ - rsvd_1 : 14; - /* bit 279:272 - Notification Vector */ - u8 nv; - /* bit 287:280 - Reserved */ - u8 rsvd_2; - /* bit 319:288 - Notification Destination */ - u32 ndst; - }; - u64 control; - }; - u32 rsvd[6]; -} __aligned(64); - -static inline bool pi_test_and_set_on(struct pi_desc *pi_desc) -{ - return test_and_set_bit(POSTED_INTR_ON, - (unsigned long *)&pi_desc->control); -} - -static inline bool pi_test_and_clear_on(struct pi_desc *pi_desc) -{ - return test_and_clear_bit(POSTED_INTR_ON, - (unsigned long *)&pi_desc->control); -} - -static inline bool pi_test_and_clear_sn(struct pi_desc *pi_desc) -{ - return test_and_clear_bit(POSTED_INTR_SN, - (unsigned long *)&pi_desc->control); -} - -static inline bool pi_test_and_set_pir(int vector, struct pi_desc *pi_desc) -{ - return test_and_set_bit(vector, (unsigned long *)pi_desc->pir); -} - -static inline bool pi_is_pir_empty(struct pi_desc *pi_desc) -{ - return bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS); -} - -static inline void pi_set_sn(struct pi_desc *pi_desc) -{ - set_bit(POSTED_INTR_SN, - (unsigned long *)&pi_desc->control); -} - -static inline void pi_set_on(struct pi_desc *pi_desc) -{ - set_bit(POSTED_INTR_ON, - (unsigned long *)&pi_desc->control); -} - -static inline void pi_clear_on(struct pi_desc *pi_desc) -{ - clear_bit(POSTED_INTR_ON, - (unsigned long *)&pi_desc->control); -} - -static inline void pi_clear_sn(struct pi_desc *pi_desc) -{ - clear_bit(POSTED_INTR_SN, - (unsigned long *)&pi_desc->control); -} - -static inline bool pi_test_on(struct pi_desc *pi_desc) -{ - return test_bit(POSTED_INTR_ON, - (unsigned long *)&pi_desc->control); -} - -static inline bool pi_test_sn(struct pi_desc *pi_desc) -{ - return test_bit(POSTED_INTR_SN, - (unsigned long *)&pi_desc->control); -} +#include void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu); void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 72e3943f3693..d54fa0e06c70 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -66,6 +66,7 @@ #include "vmx.h" #include "x86.h" #include "smm.h" +#include "posted_intr.h" MODULE_AUTHOR("Qumranet"); MODULE_LICENSE("GPL"); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index c2130d2c8e24..817b76794ee1 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -7,10 +7,10 @@ #include #include #include +#include #include "capabilities.h" #include "../kvm_cache_regs.h" -#include "posted_intr.h" #include "vmcs.h" #include "vmx_ops.h" #include "../cpuid.h" From patchwork Sun Nov 12 04:16:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453244 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3AF0B1C3E for ; Sun, 12 Nov 2023 04:12:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="nvaugRDI" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0384130D5; Sat, 11 Nov 2023 20:12:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762329; x=1731298329; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tKxaF+OKLwmoAMwaAxN7sL0lSOz2Mx8YYsRJLKhsXTQ=; b=nvaugRDIYksMdxA2eLgYkX2XNCqrWeg5zSV5+wv4fWN1s98wN899mftp EQN+Pj3YoLGtvplPY22ZJ4UbnWWdltEXo73tet2FD3qkiyymHj2h2P43O Ph1wjf5x8U7vZ+GaG4jTgBUTDN6c7GwKUtpjJAHF1c1CXlfc1OcouxnCu y4JQ5OkGQW02+dvMKm/LG0lj7woMQG+YnUk94JcrjC8CZ+fZYc9Oc3xdW 80ITCXlOTBg3gRR4P1T4aGn05/GnNnea/LJobdXimASuHXY4vL5qZ7aXm cCTkSaN1UnyjjvmZLb47mbl9vdMWxOP03PogUCCmmEaVNM2dGGjquj4tm A==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533857" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533857" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:07 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936740" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936740" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:07 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 02/13] x86: Add a Kconfig option for posted MSI Date: Sat, 11 Nov 2023 20:16:32 -0800 Message-Id: <20231112041643.2868316-3-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This option will be used to support delivering MSIs as posted interrupts. Interrupt remapping is required. Signed-off-by: Jacob Pan --- arch/x86/Kconfig | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 66bfabae8814..f16882ddb390 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -463,6 +463,16 @@ config X86_X2APIC If you don't know what to do here, say N. +config X86_POSTED_MSI + bool "Enable MSI and MSI-x delivery by posted interrupts" + depends on X86_X2APIC && X86_64 && IRQ_REMAP + help + This enables MSIs that are under IRQ remapping to be delivered as posted + interrupts to the host kernel. IRQ throughput can potentially be improved + by coalescing CPU notifications during high frequency IRQ bursts. + + If you don't know what to do here, say N. + config X86_MPPARSE bool "Enable MPS table" if ACPI default y From patchwork Sun Nov 12 04:16:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453245 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC8611FB7 for ; Sun, 12 Nov 2023 04:12:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="h/PjgH6q" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C4F8130D6; Sat, 11 Nov 2023 20:12:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762329; x=1731298329; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ouYom2tBlHd3aDBaZZVG3gi5SN++gaATZnTLzCcbbws=; b=h/PjgH6q+G04R+PtwuR5ggWplCwXG3RlwKuciSl6RWbDb4LZje6Y7l3y RtwQt1iTOeY1TEwefpV/YYRkn9gMM9Xn/oFBPEImBI4D2VCilar+CFQ7p JfpkgrJBTsxn/58A16OO103R6MizrpQli7+9mCiW9pyPIbrjyui6GHLyj pPlCFJnlYWz0umip4qvyUGUmOJL1Neujd+haYG4UMM5Z/o0OfsFjobCgN HN3a0lrwCEAVwKD4F3hUKga/4LD01w2jiL2ULfC0rZzveftRifxg6c0Mf BiyRr7k4lhAtbe8MO3xvqOM6a3n/mXPVAp3tXNLVXiENQSXn2yAHbcQ3e w==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533871" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533871" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:08 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936743" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936743" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:07 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 03/13] x86: Reserved a per CPU IDT vector for posted MSIs Date: Sat, 11 Nov 2023 20:16:33 -0800 Message-Id: <20231112041643.2868316-4-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Under posted MSIs, all device MSIs are multiplexed into a single CPU notification vector. MSI handlers will be de-multiplexed at run-time by system software without IDT delivery. This vector has a priority class below the rest of the system vectors. Potentially, external vector number space for MSIs can be expanded to the entire 0-256 range. Signed-off-by: Jacob Pan --- arch/x86/include/asm/irq_vectors.h | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h index 3a19904c2db6..077ca38f5a91 100644 --- a/arch/x86/include/asm/irq_vectors.h +++ b/arch/x86/include/asm/irq_vectors.h @@ -99,9 +99,22 @@ #define LOCAL_TIMER_VECTOR 0xec +/* + * Posted interrupt notification vector for all device MSIs delivered to + * the host kernel. + * + * Choose lower priority class bit [7:4] than other system vectors such + * that it can be preempted by the system interrupts. + * + * It is also higher than all external vectors but it should not matter + * in that external vectors for posted MSIs are in a different number space. + */ +#define POSTED_MSI_NOTIFICATION_VECTOR 0xdf #define NR_VECTORS 256 -#ifdef CONFIG_X86_LOCAL_APIC +#ifdef X86_POSTED_MSI +#define FIRST_SYSTEM_VECTOR POSTED_MSI_NOTIFICATION_VECTOR +#elif defined(CONFIG_X86_LOCAL_APIC) #define FIRST_SYSTEM_VECTOR LOCAL_TIMER_VECTOR #else #define FIRST_SYSTEM_VECTOR NR_VECTORS From patchwork Sun Nov 12 04:16:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453249 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2D76323B7 for ; Sun, 12 Nov 2023 04:12:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="YCQghsjy" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CF31F30E6; Sat, 11 Nov 2023 20:12:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762329; x=1731298329; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DCQy9ziTOxUJmllg1ZKA+yDNfB6XKqM3GiK9nVcKCAU=; b=YCQghsjykYPIODV/cTAWzVgUuAeAK8PDeSLxVJshMN5xAqY1ATLNgBCd luG9Xl3TJ2tjOZPzcPU/rpHasoRoeIiVxWoL8zEQJ0guBBnj2UbKU4Vwt 83WORiv0QzLXv6LXPDrnX/bFBjRjlJjNI5XY3b+caDHQvAP1KmlaMPyGc KCpZfyxiPaAb5XA326wg4XmR8Xt42Udl1f7jHZHZyfsqK4AcVBBGuAAhs 9ME1w6nVMSLdhz6ndICtAbTYoW3HLYjeUDF8PDOXEpisMYuNT5bkrfetS ykd62ZjqlYqrvCSv1bDF6441cD914ppwJ+heh5bEgyQfo8Yx2jRvtbcZf Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533880" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533880" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:08 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936747" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936747" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:08 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 04/13] iommu/vt-d: Add helper and flag to check/disable posted MSI Date: Sat, 11 Nov 2023 20:16:34 -0800 Message-Id: <20231112041643.2868316-5-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Allow command line opt-out posted MSI under CONFIG_X86_POSTED_MSI=y. And add a helper function for testing if posted MSI is supported on the CPU side. Signed-off-by: Jacob Pan --- arch/x86/include/asm/irq_remapping.h | 11 +++++++++++ drivers/iommu/irq_remapping.c | 17 +++++++++++++++++ 2 files changed, 28 insertions(+) diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h index 7a2ed154a5e1..706f58900962 100644 --- a/arch/x86/include/asm/irq_remapping.h +++ b/arch/x86/include/asm/irq_remapping.h @@ -50,6 +50,17 @@ static inline struct irq_domain *arch_get_ir_parent_domain(void) return x86_vector_domain; } +#ifdef CONFIG_X86_POSTED_MSI +extern unsigned int posted_msi_off; + +static inline bool posted_msi_supported(void) +{ + return !posted_msi_off && irq_remapping_cap(IRQ_POSTING_CAP); +} +#else +static inline bool posted_msi_supported(void) { return false; }; +#endif + #else /* CONFIG_IRQ_REMAP */ static inline bool irq_remapping_cap(enum irq_remap_cap cap) { return 0; } diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c index 83314b9d8f38..00de6963bb07 100644 --- a/drivers/iommu/irq_remapping.c +++ b/drivers/iommu/irq_remapping.c @@ -24,6 +24,23 @@ int no_x2apic_optout; int disable_irq_post = 0; +#ifdef CONFIG_X86_POSTED_MSI + +unsigned int posted_msi_off; + +static int __init cmdl_posted_msi_off(char *str) +{ + int value = 0; + + get_option(&str, &value); + posted_msi_off = value; + + return 1; +} + +__setup("posted_msi_off=", cmdl_posted_msi_off); +#endif + static int disable_irq_remap; static struct irq_remap_ops *remap_ops; From patchwork Sun Nov 12 04:16:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453252 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4842E2572 for ; Sun, 12 Nov 2023 04:12:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="KJilG8cj" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 899FB30F7; Sat, 11 Nov 2023 20:12:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762330; x=1731298330; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XzHdLhX/vY4asS0xx/Bsw4Ay4NQKud3SHGFE5ORwta8=; b=KJilG8cj1ruuny36putjNWwuDeQLtiBTpPFn57V/proMK6KTHj/pyKOt sE5wRL83A5TIqmsOjvF4gxg3Iezw9KZfz1R+OCVGdC9CECIlEzxE/uFgE bdKU4TAgZAkthfQIP6kf3ktepxjSouD32Hq60L+iwP3/zW/3SUvxs8ay5 fyrSfdUirWPBxBTrGGXoC5f0CA2C3s+ww/O+7Kw2aoxK4K6Mzj7h9AOBK mmTtUZs2ViLpbM6OIA+05DX6cSYPAYBuJmJkNqEJB6MZFS9vrwF0D6cga ZQPWuCuMBpM/hmSZezz5x6MCiaw/b3gPyyzARqLdP/OOKHUH9lw+2HisW g==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533891" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533891" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:08 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936751" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936751" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:08 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 05/13] x86/irq: Set up per host CPU posted interrupt descriptors Date: Sat, 11 Nov 2023 20:16:35 -0800 Message-Id: <20231112041643.2868316-6-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Thomas Gleixner To support posted MSIs, create a posted interrupt descriptor (PID) for each host CPU. Later on, when setting up IRQ CPU affinity, IOMMU's interrupt remapping table entry (IRTE) will point to the physical address of the matching CPU's PID. Each PID is initialized with the owner CPU's physical APICID as the destination. Signed-off-by: Thomas Gleixner Signed-off-by: Jacob Pan --- arch/x86/include/asm/hardirq.h | 3 +++ arch/x86/include/asm/posted_intr.h | 7 +++++++ arch/x86/kernel/cpu/common.c | 3 +++ arch/x86/kernel/irq.c | 13 +++++++++++++ 4 files changed, 26 insertions(+) diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h index 66837b8c67f1..72c6a084dba3 100644 --- a/arch/x86/include/asm/hardirq.h +++ b/arch/x86/include/asm/hardirq.h @@ -48,6 +48,9 @@ typedef struct { DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); +#ifdef CONFIG_X86_POSTED_MSI +DECLARE_PER_CPU_ALIGNED(struct pi_desc, posted_interrupt_desc); +#endif #define __ARCH_IRQ_STAT #define inc_irq_stat(member) this_cpu_inc(irq_stat.member) diff --git a/arch/x86/include/asm/posted_intr.h b/arch/x86/include/asm/posted_intr.h index 9f2fa38fa57b..2cd9ac1af835 100644 --- a/arch/x86/include/asm/posted_intr.h +++ b/arch/x86/include/asm/posted_intr.h @@ -94,4 +94,11 @@ static inline bool pi_test_sn(struct pi_desc *pi_desc) (unsigned long *)&pi_desc->control); } +#ifdef CONFIG_X86_POSTED_MSI +extern void intel_posted_msi_init(void); + +#else +static inline void intel_posted_msi_init(void) {}; + +#endif /* X86_POSTED_MSI */ #endif /* _X86_POSTED_INTR_H */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 4e5ffc8b0e46..08b2d1560f8b 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -65,6 +65,7 @@ #include #include #include +#include #include "cpu.h" @@ -2266,6 +2267,8 @@ void cpu_init(void) barrier(); x2apic_setup(); + + intel_posted_msi_init(); } mmgrab(&init_mm); diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 11761c124545..fd4d664d81bb 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -22,6 +22,8 @@ #include #include #include +#include +#include #define CREATE_TRACE_POINTS #include @@ -334,6 +336,17 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm_posted_intr_nested_ipi) } #endif +#ifdef CONFIG_X86_POSTED_MSI + +/* Posted Interrupt Descriptors for coalesced MSIs to be posted */ +DEFINE_PER_CPU_ALIGNED(struct pi_desc, posted_interrupt_desc); + +void intel_posted_msi_init(void) +{ + this_cpu_write(posted_interrupt_desc.nv, POSTED_MSI_NOTIFICATION_VECTOR); + this_cpu_write(posted_interrupt_desc.ndst, this_cpu_read(x86_cpu_to_apicid)); +} +#endif /* X86_POSTED_MSI */ #ifdef CONFIG_HOTPLUG_CPU /* A cpu has been removed from cpu_online_mask. Reset irq affinities. */ From patchwork Sun Nov 12 04:16:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453248 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C4CF61FCE for ; Sun, 12 Nov 2023 04:12:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="FgMV0xSW" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0482830EB; Sat, 11 Nov 2023 20:12:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762330; x=1731298330; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KgNwZKlEknUh48ho6jIqFjsi+lR3uCezgczS6gTj3Dk=; b=FgMV0xSWXOakTmNh32PJv5YQtta8fgkAGPAvci5uzJH1fRuR2OhxVzij rrfDI6sjDjDe2o4lynHhD70o8B1C7ZGPEiOazFTSRihh5vBH9dZbCps6y htg2rjH7sl/PZ6+hargAlZzwFNw8V7jax1QNMlrlKatiWp0PfweWNmZvX rLRFDOSOw6+lzb9p7+XPbpOdt4jFyjs7Qb2M3gEQL5U2EwquIRiK2zSSr T3a5ZQnR6vuMk9v4vJq6fUnIx+x8mPz5oPFrLZwXoaVhzeop6z4Tc7m1n oyYI+syGHpeltHbW8GZZ1df17ZVz5UN5Qb9j1CNLj4FOaw/HdxgEkMv1K Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533894" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533894" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:08 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936754" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936754" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:08 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 06/13] x86/irq: Unionize PID.PIR for 64bit access w/o casting Date: Sat, 11 Nov 2023 20:16:36 -0800 Message-Id: <20231112041643.2868316-7-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Make PIR field into u64 such that atomic xchg64 can be used without ugly casting. Suggested-by: Thomas Gleixner Signed-off-by: Jacob Pan --- arch/x86/include/asm/posted_intr.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/posted_intr.h b/arch/x86/include/asm/posted_intr.h index 2cd9ac1af835..3af00f5395e4 100644 --- a/arch/x86/include/asm/posted_intr.h +++ b/arch/x86/include/asm/posted_intr.h @@ -9,7 +9,10 @@ /* Posted-Interrupt Descriptor */ struct pi_desc { - u32 pir[8]; /* Posted interrupt requested */ + union { + u32 pir[8]; /* Posted interrupt requested */ + u64 pir_l[4]; + }; union { struct { /* bit 256 - Outstanding Notification */ From patchwork Sun Nov 12 04:16:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453254 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E6D694410 for ; Sun, 12 Nov 2023 04:12:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ZVNq8+xw" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B336735BB; Sat, 11 Nov 2023 20:12:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762331; x=1731298331; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=nMrBMNqKNmE7TkVKdlJ7grKzLWcWZpw3D0RMZ2QEFQQ=; b=ZVNq8+xwI2juJuGpWyX6Ug5Dpfutn1YjEZ/AUVY6x7et5ETWyPp0B06b 5bMVi6t7neCTUQnEDDkJeZ7eOMnsN81qIn+OqiI511TtXOtAkH3BQ+bnw 1EAd3zdhoNOddE+E5PvhmiBzF8qmmNJxkYCZoan9gtTm6llvc/qmrPHxV eFQlsFiRfCXhbl3ibem/HuVg5/11x50ExoXUkveOylLWeK2IhJ+6gwRmy 1Y1Z4gZaddUmr8q6/pMT6Z+BKEaSVNbytKf1EEh+mu/Fr73Li7dmEnC+R 2+r1fabIp7j891TDtSPJmDIcc0x7e6gFTLsiANEtYl6rtFDM2TBw7UARK w==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533909" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533909" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936757" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936757" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:08 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 07/13] x86/irq: Add helpers for checking Intel PID Date: Sat, 11 Nov 2023 20:16:37 -0800 Message-Id: <20231112041643.2868316-8-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Intel posted interrupt descriptor (PID) stores pending interrupts in its posted interrupt requests (PIR) bitmap. Add helper functions to check individual vector status and the entire bitmap. They are used for interrupt migration and runtime demultiplexing posted MSI vectors. Signed-off-by: Jacob Pan --- arch/x86/include/asm/posted_intr.h | 31 ++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/arch/x86/include/asm/posted_intr.h b/arch/x86/include/asm/posted_intr.h index 3af00f5395e4..12a4fa3ff60e 100644 --- a/arch/x86/include/asm/posted_intr.h +++ b/arch/x86/include/asm/posted_intr.h @@ -98,9 +98,40 @@ static inline bool pi_test_sn(struct pi_desc *pi_desc) } #ifdef CONFIG_X86_POSTED_MSI +/* + * Not all external vectors are subject to interrupt remapping, e.g. IOMMU's + * own interrupts. Here we do not distinguish them since those vector bits in + * PIR will always be zero. + */ +static inline bool is_pi_pending_this_cpu(unsigned int vector) +{ + struct pi_desc *pid; + + if (WARN_ON(vector > NR_VECTORS || vector < FIRST_EXTERNAL_VECTOR)) + return false; + + pid = this_cpu_ptr(&posted_interrupt_desc); + + return (pid->pir[vector >> 5] & (1 << (vector % 32))); +} + +static inline bool is_pir_pending(struct pi_desc *pid) +{ + int i; + + for (i = 0; i < 4; i++) { + if (pid->pir_l[i]) + return true; + } + + return false; +} + extern void intel_posted_msi_init(void); #else +static inline bool is_pi_pending_this_cpu(unsigned int vector) {return false; } + static inline void intel_posted_msi_init(void) {}; #endif /* X86_POSTED_MSI */ From patchwork Sun Nov 12 04:16:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453250 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5565A2581 for ; Sun, 12 Nov 2023 04:12:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Up+2QgpB" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 01AF43253; Sat, 11 Nov 2023 20:12:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762331; x=1731298331; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=v4JwSbwScASMaLOkydBAHwmmqkSbLjlt1xmF7LwItuo=; b=Up+2QgpBEV0ySWhwR8cJzWzuKk5x5MVNfl/HfWPTWKWZlsSvBMqj/ZmQ Sh094lmy6+rzxXvSHo/NvZyoxzNyMtWoqcM/L58oPkidv9hOJi+LJL2pH EzHf1qMinKfr1FH2Dknb9TR2OmYtDtCnzhjrRJ9isNnQjplzvw+2ZPLmU uZHrnqQ/88040NVyL+nA6d0DmRCFgzaQrWQORTc2wTGLDqBG7OZzfZLmn /ZFXPnSZfFHxCr5WnrFPYpVQiTeoPdydQO9bk83syn9ZvNqwIh0CeTdDP AKTeW57lHf1o0eXY4FTsjF8MYhkzw/XTInMugymYd7V6LLPk3QkpVxyAF g==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533917" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533917" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936760" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936760" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:08 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 08/13] x86/irq: Factor out calling ISR from common_interrupt Date: Sat, 11 Nov 2023 20:16:38 -0800 Message-Id: <20231112041643.2868316-9-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Prepare for calling external IRQ handlers directly from the posted MSI demultiplexing loop. Extract the common code with common interrupt to avoid code duplication. Signed-off-by: Jacob Pan --- arch/x86/kernel/irq.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index fd4d664d81bb..0bffe8152385 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -242,18 +242,10 @@ static __always_inline void handle_irq(struct irq_desc *desc, __handle_irq(desc, regs); } -/* - * common_interrupt() handles all normal device IRQ's (the special SMP - * cross-CPU interrupts have their own entry points). - */ -DEFINE_IDTENTRY_IRQ(common_interrupt) +static __always_inline void call_irq_handler(int vector, struct pt_regs *regs) { - struct pt_regs *old_regs = set_irq_regs(regs); struct irq_desc *desc; - /* entry code tells RCU that we're not quiescent. Check it. */ - RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU"); - desc = __this_cpu_read(vector_irq[vector]); if (likely(!IS_ERR_OR_NULL(desc))) { handle_irq(desc, regs); @@ -268,7 +260,20 @@ DEFINE_IDTENTRY_IRQ(common_interrupt) __this_cpu_write(vector_irq[vector], VECTOR_UNUSED); } } +} + +/* + * common_interrupt() handles all normal device IRQ's (the special SMP + * cross-CPU interrupts have their own entry points). + */ +DEFINE_IDTENTRY_IRQ(common_interrupt) +{ + struct pt_regs *old_regs = set_irq_regs(regs); + + /* entry code tells RCU that we're not quiescent. Check it. */ + RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU"); + call_irq_handler(vector, regs); set_irq_regs(old_regs); } From patchwork Sun Nov 12 04:16:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453256 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE48C3C3B for ; Sun, 12 Nov 2023 04:12:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="az44HjVb" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E4B530FA; Sat, 11 Nov 2023 20:12:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762330; x=1731298330; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZRkqYyrDq8K80kW7DHz3TYCS29krPawHw58dDr57YPY=; b=az44HjVbm2FA1FjMmNiKwdKW8mnfR7FrADuETyHzf1YD1RZyLlSONGqe 1mX/LLEdt5BoQ1Ilt7ULrDHD1lSevBgS3/xwAR9Ox0TYyBVvJCbTo6Rgs MA6sVi5Pu8cIH0F1jWN+XaJ+82l158Kpl/EvGA11MJI5Qkujj27xzrboz VE9XcFVCKU0tJ1cbtlU4uaMNyc1830m8UU+qHT2ET+JOcVQ+PLMm6Psm5 U9Z3PpjlU6/+yF/rqBbkjanCS/YenuPhB6QL0lf84pOs3Swoarja29zOg Sw42DhXI5VtBkxQOwiA6Q8/kDShOo5/h5/XfKWm/bX9YtHLZ74MrLLzUz A==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533930" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533930" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936764" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936764" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:09 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 09/13] x86/irq: Install posted MSI notification handler Date: Sat, 11 Nov 2023 20:16:39 -0800 Message-Id: <20231112041643.2868316-10-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 All MSI vectors are multiplexed into a single notification vector when posted MSI is enabled. It is the responsibility of the notification vector handler to demultiplex MSI vectors. In this handler, for each pending bit, MSI vector handlers are dispatched without IDT delivery. For example, the interrupt flow will change as follows: (3 MSIs of different vectors arrive in a a high frequency burst) BEFORE: interrupt(MSI) irq_enter() handler() /* EOI */ irq_exit() process_softirq() interrupt(MSI) irq_enter() handler() /* EOI */ irq_exit() process_softirq() interrupt(MSI) irq_enter() handler() /* EOI */ irq_exit() process_softirq() AFTER: interrupt /* Posted MSI notification vector */ irq_enter() atomic_xchg(PIR) handler() handler() handler() pi_clear_on() apic_eoi() irq_exit() process_softirq() Except for the leading MSI, CPU notifications are skipped/coalesced. For MSIs arrive at a low frequency, the demultiplexing loop does not wait for more interrupts to coalesce. Therefore, there's no additional latency other than the processing time. Signed-off-by: Jacob Pan --- arch/x86/include/asm/hardirq.h | 3 ++ arch/x86/include/asm/idtentry.h | 3 ++ arch/x86/kernel/idt.c | 3 ++ arch/x86/kernel/irq.c | 91 +++++++++++++++++++++++++++++++++ 4 files changed, 100 insertions(+) diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h index 72c6a084dba3..6c8daa7518eb 100644 --- a/arch/x86/include/asm/hardirq.h +++ b/arch/x86/include/asm/hardirq.h @@ -44,6 +44,9 @@ typedef struct { unsigned int irq_hv_reenlightenment_count; unsigned int hyperv_stimer0_count; #endif +#ifdef CONFIG_X86_POSTED_MSI + unsigned int posted_msi_notification_count; +#endif } ____cacheline_aligned irq_cpustat_t; DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h index 05fd175cec7d..f756e761e7c0 100644 --- a/arch/x86/include/asm/idtentry.h +++ b/arch/x86/include/asm/idtentry.h @@ -644,6 +644,9 @@ DECLARE_IDTENTRY_SYSVEC(ERROR_APIC_VECTOR, sysvec_error_interrupt); DECLARE_IDTENTRY_SYSVEC(SPURIOUS_APIC_VECTOR, sysvec_spurious_apic_interrupt); DECLARE_IDTENTRY_SYSVEC(LOCAL_TIMER_VECTOR, sysvec_apic_timer_interrupt); DECLARE_IDTENTRY_SYSVEC(X86_PLATFORM_IPI_VECTOR, sysvec_x86_platform_ipi); +# ifdef CONFIG_X86_POSTED_MSI +DECLARE_IDTENTRY_SYSVEC(POSTED_MSI_NOTIFICATION_VECTOR, sysvec_posted_msi_notification); +# endif #endif #ifdef CONFIG_SMP diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index b786d48f5a0f..d5840d777469 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -159,6 +159,9 @@ static const __initconst struct idt_data apic_idts[] = { # endif INTG(SPURIOUS_APIC_VECTOR, asm_sysvec_spurious_apic_interrupt), INTG(ERROR_APIC_VECTOR, asm_sysvec_error_interrupt), +# ifdef CONFIG_X86_POSTED_MSI + INTG(POSTED_MSI_NOTIFICATION_VECTOR, asm_sysvec_posted_msi_notification), +# endif #endif }; diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 0bffe8152385..786c2c8330f4 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -183,6 +183,13 @@ int arch_show_interrupts(struct seq_file *p, int prec) seq_printf(p, "%10u ", irq_stats(j)->kvm_posted_intr_wakeup_ipis); seq_puts(p, " Posted-interrupt wakeup event\n"); +#endif +#ifdef CONFIG_X86_POSTED_MSI + seq_printf(p, "%*s: ", prec, "PMN"); + for_each_online_cpu(j) + seq_printf(p, "%10u ", + irq_stats(j)->posted_msi_notification_count); + seq_puts(p, " Posted MSI notification event\n"); #endif return 0; } @@ -351,6 +358,90 @@ void intel_posted_msi_init(void) this_cpu_write(posted_interrupt_desc.nv, POSTED_MSI_NOTIFICATION_VECTOR); this_cpu_write(posted_interrupt_desc.ndst, this_cpu_read(x86_cpu_to_apicid)); } + +static __always_inline inline void handle_pending_pir(struct pi_desc *pid, struct pt_regs *regs) +{ + int i, vec = FIRST_EXTERNAL_VECTOR; + u64 pir_copy[4]; + + /* + * Make a copy of PIR which contains IRQ pending bits for vectors, + * then invoke IRQ handlers for each pending vector. + * If any new interrupts were posted while we are processing, will + * do again before allowing new notifications. The idea is to + * minimize the number of the expensive notifications if IRQs come + * in a high frequency burst. + */ + for (i = 0; i < 4; i++) + pir_copy[i] = raw_atomic64_xchg((atomic64_t *)&pid->pir_l[i], 0); + + /* + * Ideally, we should start from the high order bits set in the PIR + * since each bit represents a vector. Higher order bit position means + * the vector has higher priority. But external vectors are allocated + * based on availability not priority. + * + * EOI is included in the IRQ handlers call to apic_ack_irq, which + * allows higher priority system interrupt to get in between. + */ + for_each_set_bit_from(vec, (unsigned long *)&pir_copy[0], 256) + call_irq_handler(vec, regs); + +} + +/* + * Performance data shows that 3 is good enough to harvest 90+% of the benefit + * on high IRQ rate workload. + * Alternatively, could make this tunable, use 3 as default. + */ +#define MAX_POSTED_MSI_COALESCING_LOOP 3 + +/* + * For MSIs that are delivered as posted interrupts, the CPU notifications + * can be coalesced if the MSIs arrive in high frequency bursts. + */ +DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification) +{ + struct pt_regs *old_regs = set_irq_regs(regs); + struct pi_desc *pid; + int i = 0; + + pid = this_cpu_ptr(&posted_interrupt_desc); + + inc_irq_stat(posted_msi_notification_count); + irq_enter(); + + while (i++ < MAX_POSTED_MSI_COALESCING_LOOP) { + handle_pending_pir(pid, regs); + + /* + * If there are new interrupts posted in PIR, do again. If + * nothing pending, no need to wait for more interrupts. + */ + if (is_pir_pending(pid)) + continue; + else + break; + } + + /* + * Clear outstanding notification bit to allow new IRQ notifications, + * do this last to maximize the window of interrupt coalescing. + */ + pi_clear_on(pid); + + /* + * There could be a race of PI notification and the clearing of ON bit, + * process PIR bits one last time such that handling the new interrupts + * are not delayed until the next IRQ. + */ + if (unlikely(is_pir_pending(pid))) + handle_pending_pir(pid, regs); + + apic_eoi(); + irq_exit(); + set_irq_regs(old_regs); +} #endif /* X86_POSTED_MSI */ #ifdef CONFIG_HOTPLUG_CPU From patchwork Sun Nov 12 04:16:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453253 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DD788440E for ; Sun, 12 Nov 2023 04:12:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="eng2LJO8" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 086FA3840; Sat, 11 Nov 2023 20:12:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762332; x=1731298332; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bUqv9KmUqsXG7vFMJHl27FbTCB3T6tJUXKy8ifc6Wm4=; b=eng2LJO8TCP7SkPzLh5v6ZEFHOlGOEoj89K6p9YRensaWSbAzvQoo+aF +HrjA777KtW9PvfctnMYwOw3ahY1E9MoqUn82yx5pFeaXDeW2OVxZUhYy 5bthu/qMlnA0Ae9tkwu9E8IaTeJCRlDvJ5urn5HHLDiiIKOF2PpFbedTD pVxteSnj10ytFz3i+5gYEEolU6Wzk8IP/whsT0ZmUiGL++/B/4mPo05f0 XXodrPtReCOZLAQAs8FAAOn3HrKPDYq0ql8wCrcZ3q4hTiM0qVR4pRBL8 xuoMPZJUPjhua4XaiiXdwNwIQucnw5E3+xyF+MRmxyx7ktuE8GvPZTvvu w==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533937" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533937" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936767" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936767" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:09 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 10/13] x86/irq: Handle potential lost IRQ during migration and CPU offline Date: Sat, 11 Nov 2023 20:16:40 -0800 Message-Id: <20231112041643.2868316-11-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Though IRTE modification for IRQ affinity change is a atomic operation, it does not guarantee the timing of IRQ posting at PID. considered the following scenario: Device system agent iommu memory CPU/LAPIC 1 FEEX_XXXX 2 Interrupt request 3 Fetch IRTE -> 4 ->Atomic Swap PID.PIR(vec) Push to Global Observable(GO) 5 if (ON*) i done;* else 6 send a notification -> * ON: outstanding notification, 1 will suppress new notifications If IRQ affinity change happens between 3 and 5 in IOMMU, old CPU's PIR could have pending bit set for the vector being moved. We must check PID.PIR to prevent the lost of interrupts. Suggested-by: Thomas Gleixner Signed-off-by: Jacob Pan --- arch/x86/kernel/apic/vector.c | 8 +++++++- arch/x86/kernel/irq.c | 20 +++++++++++++++++--- 2 files changed, 24 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 319448d87b99..14fc33cfdb37 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include @@ -978,9 +979,14 @@ static void __vector_cleanup(struct vector_cleanup *cl, bool check_irr) * Do not check IRR when called from lapic_offline(), because * fixup_irqs() was just called to scan IRR for set bits and * forward them to new destination CPUs via IPIs. + * + * If the vector to be cleaned is delivered as posted intr, + * it is possible that the interrupt has been posted but + * not made to the IRR due to coalesced notifications. + * Therefore, check PIR to see if the interrupt was posted. */ irr = check_irr ? apic_read(APIC_IRR + (vector / 32 * 0x10)) : 0; - if (irr & (1U << (vector % 32))) { + if (irr & (1U << (vector % 32)) || is_pi_pending_this_cpu(vector)) { pr_warn_once("Moved interrupt pending in old target APIC %u\n", apicd->irq); rearm = true; continue; diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 786c2c8330f4..7732cb9bbf0c 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -444,11 +444,26 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification) } #endif /* X86_POSTED_MSI */ +/* + * Check if a given vector is pending in APIC IRR or PIR if posted interrupt + * is enabled for coalesced interrupt delivery (CID). + */ +static inline bool is_vector_pending(unsigned int vector) +{ + unsigned int irr; + + irr = apic_read(APIC_IRR + (vector / 32 * 0x10)); + if (irr & (1 << (vector % 32))) + return true; + + return is_pi_pending_this_cpu(vector); +} + #ifdef CONFIG_HOTPLUG_CPU /* A cpu has been removed from cpu_online_mask. Reset irq affinities. */ void fixup_irqs(void) { - unsigned int irr, vector; + unsigned int vector; struct irq_desc *desc; struct irq_data *data; struct irq_chip *chip; @@ -475,8 +490,7 @@ void fixup_irqs(void) if (IS_ERR_OR_NULL(__this_cpu_read(vector_irq[vector]))) continue; - irr = apic_read(APIC_IRR + (vector / 32 * 0x10)); - if (irr & (1 << (vector % 32))) { + if (is_vector_pending(vector)) { desc = __this_cpu_read(vector_irq[vector]); raw_spin_lock(&desc->lock); From patchwork Sun Nov 12 04:16:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453251 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A805A3C2C for ; Sun, 12 Nov 2023 04:12:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="UW7JkIS8" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5389B3850; Sat, 11 Nov 2023 20:12:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762332; x=1731298332; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BmEOk8548DeJZDcMSeMiFjR9mep2QMzw5DJH2sJXQNs=; b=UW7JkIS8jDXbq010IehICO2UZvcVq9JwKhc/Mk2sxHJz3fZWPx+t9MWb V52R7LUwz4s3nc7EOta7pYpxbLZYwIXZS8Tedm1oqEaDK/xoWnosPys/l 6NElsnxO7xUBWjzkd7LK4UpStKP3fwYlTmsDXDIKPto+h0VmKp1dwv77e kK8I8BejHlbCkL+bFNNPf6P8vhHdQ2X8bPgxRVF7Ea6OtAh5e3f+4m31F ilJ/mE4tJq2xeu4iqA+cvG7ygoaavVZVlsOOL75iFsP3bys3nXbeR8u46 3rP0gdmRtr+/83Yd8XRs7ampMLUpbl0VUKrxxetHtVqGdeCkiAnRVph1e w==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533951" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533951" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936770" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936770" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:09 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 11/13] iommu/vt-d: Add an irq_chip for posted MSIs Date: Sat, 11 Nov 2023 20:16:41 -0800 Message-Id: <20231112041643.2868316-12-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 With posted MSIs, end of interrupt is handled by the notification handler. Each MSI handler does not go through local APIC IRR, ISR processing. There's no need to do apic_eoi() in those handlers. Add a new acpi_ack_irq_no_eoi() for the posted MSI IR chip. At runtime the call trace looks like: __sysvec_posted_msi_notification() { irq_chip_ack_parent() { apic_ack_irq_no_eoi(); } handle_irq_event() { handle_irq_event_percpu() { driver_handler() } } IO-APIC IR is excluded the from posted MSI, we need to make sure it still performs EOI. Signed-off-by: Jacob Pan --- arch/x86/include/asm/apic.h | 1 + arch/x86/kernel/apic/io_apic.c | 2 +- arch/x86/kernel/apic/vector.c | 5 ++++ drivers/iommu/intel/irq_remapping.c | 38 ++++++++++++++++++++++++++++- 4 files changed, 44 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 5af4ec1a0f71..a88015d5638b 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -485,6 +485,7 @@ static inline void apic_setup_apic_calls(void) { } #endif /* CONFIG_X86_LOCAL_APIC */ extern void apic_ack_irq(struct irq_data *data); +extern void apic_ack_irq_no_eoi(struct irq_data *data); static inline bool lapic_vector_set_in_irr(unsigned int vector) { diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index 00da6cf6b07d..ca398ee9075b 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -1993,7 +1993,7 @@ static struct irq_chip ioapic_ir_chip __read_mostly = { .irq_startup = startup_ioapic_irq, .irq_mask = mask_ioapic_irq, .irq_unmask = unmask_ioapic_irq, - .irq_ack = irq_chip_ack_parent, + .irq_ack = apic_ack_irq, .irq_eoi = ioapic_ir_ack_level, .irq_set_affinity = ioapic_set_affinity, .irq_retrigger = irq_chip_retrigger_hierarchy, diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 14fc33cfdb37..01223ac4f57a 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -911,6 +911,11 @@ void apic_ack_irq(struct irq_data *irqd) apic_eoi(); } +void apic_ack_irq_no_eoi(struct irq_data *irqd) +{ + irq_move_irq(irqd); +} + void apic_ack_edge(struct irq_data *irqd) { irq_complete_move(irqd_cfg(irqd)); diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c index 29b9e55dcf26..f2870d3c8313 100644 --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1233,6 +1233,42 @@ static struct irq_chip intel_ir_chip = { .irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity, }; +/* + * With posted MSIs, all vectors are multiplexed into a single notification + * vector. Devices MSIs are then dispatched in a demux loop where + * EOIs can be coalesced as well. + * + * IR chip "INTEL-IR-POST" does not do EOI on ACK instead letting posted + * interrupt notification handler to perform EOI. + * + * For the example below, 3 MSIs are coalesced in one CPU notification. Only + * one apic_eoi() is needed. + * + * __sysvec_posted_msi_notification() { + * irq_enter() + * handle_edge_irq() + * irq_chip_ack_parent() + * apic_ack_irq_no_eoi() + * handle_irq() + * handle_edge_irq() + * irq_chip_ack_parent() + * apic_ack_irq_no_eoi() + * handle_irq() + * handle_edge_irq() + * irq_chip_ack_parent() + * apic_ack_irq_no_eoi() + * handle_irq() + * apic_eoi() + * irq_exit() + */ +static struct irq_chip intel_ir_chip_post_msi = { + .name = "INTEL-IR-POST", + .irq_ack = apic_ack_irq_no_eoi, + .irq_set_affinity = intel_ir_set_affinity, + .irq_compose_msi_msg = intel_ir_compose_msi_msg, + .irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity, +}; + static void fill_msi_msg(struct msi_msg *msg, u32 index, u32 subhandle) { memset(msg, 0, sizeof(*msg)); @@ -1361,7 +1397,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain, irq_data->hwirq = (index << 16) + i; irq_data->chip_data = ird; - irq_data->chip = &intel_ir_chip; + irq_data->chip = posted_msi_supported() ? &intel_ir_chip_post_msi : &intel_ir_chip; intel_irq_remapping_prepare_irte(ird, irq_cfg, info, index, i); irq_set_status_flags(virq + i, IRQ_MOVE_PCNTXT); } From patchwork Sun Nov 12 04:16:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453255 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 66C3D46B2 for ; Sun, 12 Nov 2023 04:12:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="gOnPHrSP" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B351530D5; Sat, 11 Nov 2023 20:12:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762332; x=1731298332; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xFZk1jO0NsxwqOQHg0a3driwffs3nDYuRHJpiimktAA=; b=gOnPHrSPA0GRoHHYvgsWYBpVIn1KSrHL7uQHOEglojLEBN9bAq3py+wz zIJQ2t6p9Uj1/1XSjwcaCoke0JyRbQDFl70FY/075uRbF+9JIPZ4oSG7I WPq5OwK9Qs+eaRpO6KbDTgjwu0B9i4/DTr2JHEJ+IcLaoqJImUYH6G541 1WiGcwLVPAHyg1gAuwFBFeD+ljyAMwVRijXRNUNYDMHjXenurqCDojoa8 9RCmEdzfXL2q0bo8Nb/Gy5kzx6pikr2pfmvtcEAqYJhuTmTJe2QipU9GV fSa1qA/NlcdjSJ1t9oUSaCMn0znaxi+jHQqhfoadnQnBaqpORYDp2UH0o g==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533961" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533961" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936773" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936773" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:09 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 12/13] iommu/vt-d: Add a helper to retrieve PID address Date: Sat, 11 Nov 2023 20:16:42 -0800 Message-Id: <20231112041643.2868316-13-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Thomas Gleixner When programming IRTE for posted mode, we need to retrieve the physical address of the posted interrupt descriptor (PID) that belongs to it's target CPU. This per CPU PID has already been set up during cpu_init(). Signed-off-by: Thomas Gleixner Signed-off-by: Jacob Pan --- drivers/iommu/intel/irq_remapping.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c index f2870d3c8313..971e6c37002f 100644 --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1125,6 +1125,15 @@ struct irq_remap_ops intel_irq_remap_ops = { .reenable = reenable_irq_remapping, .enable_faulting = enable_drhd_fault_handling, }; +#ifdef CONFIG_X86_POSTED_MSI + +static u64 get_pi_desc_addr(struct irq_data *irqd) +{ + int cpu = cpumask_first(irq_data_get_effective_affinity_mask(irqd)); + + return __pa(per_cpu_ptr(&posted_interrupt_desc, cpu)); +} +#endif static void intel_ir_reconfigure_irte(struct irq_data *irqd, bool force) { From patchwork Sun Nov 12 04:16:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Pan X-Patchwork-Id: 13453257 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A72BB5386 for ; Sun, 12 Nov 2023 04:12:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="bnbxXBWw" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D0F930C2; Sat, 11 Nov 2023 20:12:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1699762333; x=1731298333; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=T0ai5rnWeiXxamAohbdsJLJK4HQfw4C6pjUvQ0qQo0M=; b=bnbxXBWw6rVmDlnod9bO8+vGfiRs12EfeBvpU1pJtLku1b1+35YDYswS DM17jK6CyMz1AAYlF67ce127p/nOVt4HV5uQKZ8SxuBXkaODhtkqjHidY OH58RctALytMceKV3xPPPtZPVwiWjiieW3UhMhaok9Mj4FvhhEKXELL8p NLcArukXn7KBUPgXSvku2JLkybV73lRpAMizWuCWwhThTZXjdB7RrZtZv w3Y2rxLHLBYYD5BZU8K8+dgh7NQt4dy9fRPs8BEwd1b1D5Bw2Hv4My9Sp grsFXxu+UCmFy1PMaYPiyTCcDIRC3rvH1PDk2EtETrtbtegUM5ksC2Pb+ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="476533975" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="476533975" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Nov 2023 20:12:10 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10891"; a="713936776" X-IronPort-AV: E=Sophos;i="6.03,296,1694761200"; d="scan'208";a="713936776" Received: from srinivas-otcpl-7600.jf.intel.com (HELO jacob-builder.jf.intel.com) ([10.54.39.116]) by orsmga003.jf.intel.com with ESMTP; 11 Nov 2023 20:12:10 -0800 From: Jacob Pan To: LKML , X86 Kernel , iommu@lists.linux.dev, Thomas Gleixner , "Lu Baolu" , kvm@vger.kernel.org, Dave Hansen , Joerg Roedel , "H. Peter Anvin" , "Borislav Petkov" , "Ingo Molnar" Cc: Raj Ashok , "Tian, Kevin" , maz@kernel.org, peterz@infradead.org, seanjc@google.com, "Robin Murphy" , Jacob Pan Subject: [PATCH RFC 13/13] iommu/vt-d: Enable posted mode for device MSIs Date: Sat, 11 Nov 2023 20:16:43 -0800 Message-Id: <20231112041643.2868316-14-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> References: <20231112041643.2868316-1-jacob.jun.pan@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 With posted MSI feature is enabled on the CPU side, iommu IRTEs for device MSIs can be allocated, activated, and programed in posted mode. This means that IRTEs are linked with their respective PIDs of the target CPU. Excluding the following: - legacy devices IOAPIC, HPET (may be needed for booting, not a source of high MSIs) - VT-d's own IRQs (not remappable). Signed-off-by: Jacob Pan --- arch/x86/include/asm/posted_intr.h | 1 + drivers/iommu/intel/irq_remapping.c | 55 ++++++++++++++++++++++++++--- 2 files changed, 52 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/posted_intr.h b/arch/x86/include/asm/posted_intr.h index 12a4fa3ff60e..c6d245f53225 100644 --- a/arch/x86/include/asm/posted_intr.h +++ b/arch/x86/include/asm/posted_intr.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _X86_POSTED_INTR_H #define _X86_POSTED_INTR_H +#include #define POSTED_INTR_ON 0 #define POSTED_INTR_SN 1 diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c index 971e6c37002f..1b88846d5338 100644 --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -19,6 +19,7 @@ #include #include #include +#include #include "iommu.h" #include "../irq_remapping.h" @@ -49,6 +50,7 @@ struct irq_2_iommu { u16 sub_handle; u8 irte_mask; enum irq_mode mode; + bool posted_msi; }; struct intel_ir_data { @@ -1118,6 +1120,14 @@ static void prepare_irte(struct irte *irte, int vector, unsigned int dest) irte->redir_hint = 1; } +static void prepare_irte_posted(struct irte *irte) +{ + memset(irte, 0, sizeof(*irte)); + + irte->present = 1; + irte->p_pst = 1; +} + struct irq_remap_ops intel_irq_remap_ops = { .prepare = intel_prepare_irq_remapping, .enable = intel_enable_irq_remapping, @@ -1125,6 +1135,7 @@ struct irq_remap_ops intel_irq_remap_ops = { .reenable = reenable_irq_remapping, .enable_faulting = enable_drhd_fault_handling, }; + #ifdef CONFIG_X86_POSTED_MSI static u64 get_pi_desc_addr(struct irq_data *irqd) @@ -1133,6 +1144,29 @@ static u64 get_pi_desc_addr(struct irq_data *irqd) return __pa(per_cpu_ptr(&posted_interrupt_desc, cpu)); } + +static void intel_ir_reconfigure_irte_posted(struct irq_data *irqd) +{ + struct intel_ir_data *ir_data = irqd->chip_data; + struct irte *irte = &ir_data->irte_entry; + struct irte irte_pi; + u64 pid_addr; + + pid_addr = get_pi_desc_addr(irqd); + + memset(&irte_pi, 0, sizeof(irte_pi)); + + /* The shared IRTE already be set up as posted during alloc_irte */ + dmar_copy_shared_irte(&irte_pi, irte); + + irte_pi.pda_l = (pid_addr >> (32 - PDA_LOW_BIT)) & ~(-1UL << PDA_LOW_BIT); + irte_pi.pda_h = (pid_addr >> 32) & ~(-1UL << PDA_HIGH_BIT); + + modify_irte(&ir_data->irq_2_iommu, &irte_pi); +} + +#else +static inline void intel_ir_reconfigure_irte_posted(struct irq_data *irqd) {} #endif static void intel_ir_reconfigure_irte(struct irq_data *irqd, bool force) @@ -1148,8 +1182,9 @@ static void intel_ir_reconfigure_irte(struct irq_data *irqd, bool force) irte->vector = cfg->vector; irte->dest_id = IRTE_DEST(cfg->dest_apicid); - /* Update the hardware only if the interrupt is in remapped mode. */ - if (force || ir_data->irq_2_iommu.mode == IRQ_REMAPPING) + if (ir_data->irq_2_iommu.posted_msi) + intel_ir_reconfigure_irte_posted(irqd); + else if (force || ir_data->irq_2_iommu.mode == IRQ_REMAPPING) modify_irte(&ir_data->irq_2_iommu, irte); } @@ -1203,7 +1238,7 @@ static int intel_ir_set_vcpu_affinity(struct irq_data *data, void *info) struct intel_ir_data *ir_data = data->chip_data; struct vcpu_data *vcpu_pi_info = info; - /* stop posting interrupts, back to remapping mode */ + /* stop posting interrupts, back to the default mode */ if (!vcpu_pi_info) { modify_irte(&ir_data->irq_2_iommu, &ir_data->irte_entry); } else { @@ -1300,10 +1335,14 @@ static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data, { struct irte *irte = &data->irte_entry; - prepare_irte(irte, irq_cfg->vector, irq_cfg->dest_apicid); + if (data->irq_2_iommu.mode == IRQ_POSTING) + prepare_irte_posted(irte); + else + prepare_irte(irte, irq_cfg->vector, irq_cfg->dest_apicid); switch (info->type) { case X86_IRQ_ALLOC_TYPE_IOAPIC: + prepare_irte(irte, irq_cfg->vector, irq_cfg->dest_apicid); /* Set source-id of interrupt request */ set_ioapic_sid(irte, info->devid); apic_printk(APIC_VERBOSE, KERN_DEBUG "IOAPIC[%d]: Set IRTE entry (P:%d FPD:%d Dst_Mode:%d Redir_hint:%d Trig_Mode:%d Dlvry_Mode:%X Avail:%X Vector:%02X Dest:%08X SID:%04X SQ:%X SVT:%X)\n", @@ -1315,10 +1354,18 @@ static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data, sub_handle = info->ioapic.pin; break; case X86_IRQ_ALLOC_TYPE_HPET: + prepare_irte(irte, irq_cfg->vector, irq_cfg->dest_apicid); set_hpet_sid(irte, info->devid); break; case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: + if (posted_msi_supported()) { + prepare_irte_posted(irte); + data->irq_2_iommu.posted_msi = 1; + } else { + prepare_irte(irte, irq_cfg->vector, irq_cfg->dest_apicid); + } + set_msi_sid(irte, pci_real_dma_dev(msi_desc_to_pci_dev(info->desc))); break;