From patchwork Mon Oct 5 15:28:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Woodhouse X-Patchwork-Id: 11816825 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 667A2139A for ; Mon, 5 Oct 2020 15:28:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 14004207BC for ; Mon, 5 Oct 2020 15:28:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="azF5mWxl" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727413AbgJEP2f (ORCPT ); Mon, 5 Oct 2020 11:28:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727393AbgJEP2b (ORCPT ); Mon, 5 Oct 2020 11:28:31 -0400 Received: from merlin.infradead.org (merlin.infradead.org [IPv6:2001:8b0:10b:1231::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F416DC0613CE; Mon, 5 Oct 2020 08:28:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=Mime-Version:Content-Type:Date:Cc:To: From:Subject:Message-ID:Sender:Reply-To:Content-Transfer-Encoding:Content-ID: Content-Description:In-Reply-To:References; bh=dFkgso4DV7mAFdVY77wdMaL2nqc+vFfAseQkT62QKIY=; b=azF5mWxlJrzpErgwCEWLNfLYXr YkAk6vHonHHNCW+5GFVc8Y0AvfGHMGT0Q1PdHV/OoIXDFZyBJ3Y7avI8aVx6GMWhPMj8WQnM+8pGk 55Q7BvnBYB+Z8gVvSQPaellG773V/aUWS4djpbdkwXzelgB1spHBfOdwA4/uVfzOOJ0Px8uglKekR bi32CnSwQJZaiyf88KslB3ZzTpPgVHeKxrRseZQkw87DPPsdV+05CT1x2oVcR6g4YQzmWoM3xnFQx s3C4vCGbMCXHwBvYJyG1+LFLH4bSeFPOZUp6H8QCZm29gljX9jQdCLMqJc+Bpls63Rz+pXo5OAhRh OGrWredQ==; Received: from 54-240-197-232.amazon.com ([54.240.197.232] helo=u3832b3a9db3152.ant.amazon.com) by merlin.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1kPSPa-0004Fx-7o; Mon, 05 Oct 2020 15:28:26 +0000 Message-ID: <77e64f977f559412f62b467fd062d051ea288f14.camel@infradead.org> Subject: [PATCH 0/13] Fix per-domain IRQ affinity, allow >255 CPUs on x86 without IRQ remapping From: David Woodhouse To: x86 Cc: iommu , kvm , linux-hyperv@vger.kernel.org, Paolo Bonzini Date: Mon, 05 Oct 2020 16:28:24 +0100 X-Mailer: Evolution 3.28.5-0ubuntu0.18.04.2 Mime-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by merlin.infradead.org. See http://www.infradead.org/rpr.html Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Linux currently refuses to use >255 CPUs on x86 unless it has interrupt remapping. This is a bit gratuitous because it could use those extra CPUs just fine; it just can't target external interrupts at them. The only problem is that our generic IRQ domain code cann't cope with the concept of domains which can only target a subset of CPUs. The hyperv-iommu IRQ remapping driver works around this — not by actually doing any remapping, but just returning -EINVAL if the affinity is ever set to an unreachable CPU. This almost works, but ends up being a bit late because irq_set_affinity_locked() doesn't call into the irqchip driver immediately; the error only happens later. This patch series implements a per-domain "maximum affinity" set and uses it for the non-remapped IOAPIC and MSI domains on x86. As well as allowing more CPUs to be used without interrupt remapping, this also fixes the case where some IOAPICs or PCI devices aren't actually in scope of any active IOMMU and are operating without remapping. While we're at it, recognise that the 8-bit limit is a bit gratuitous and a hypervisor could offer at least 15 bits of APIC ID in the IOAPIC RTE and MSI address bits 11-5 without even needing to use remapping. David Woodhouse (13): x86/apic: Use x2apic in guest kernels even with unusable CPUs. x86/msi: Only use high bits of MSI address for DMAR unit x86/ioapic: Handle Extended Destination ID field in RTE x86/apic: Support 15 bits of APIC ID in IOAPIC/MSI where available genirq: Prepare for default affinity to be passed to __irq_alloc_descs() genirq: Add default_affinity argument to __irq_alloc_descs() irqdomain: Add max_affinity argument to irq_domain_alloc_descs() genirq: Add irq_domain_set_affinity() x86/irq: Add x86_non_ir_cpumask x86/irq: Limit IOAPIC and MSI domains' affinity without IR x86/smp: Allow more than 255 CPUs even without interrupt remapping iommu/irq_remapping: Kill most of hyperv-iommu.c now it's redundant x86/kvm: Add KVM_FEATURE_MSI_EXT_DEST_ID Documentation/virt/kvm/cpuid.rst | 4 + arch/x86/include/asm/apic.h | 1 + arch/x86/include/asm/io_apic.h | 3 +- arch/x86/include/asm/mpspec.h | 2 + arch/x86/include/asm/x86_init.h | 2 + arch/x86/include/uapi/asm/kvm_para.h | 1 + arch/x86/kernel/apic/apic.c | 41 +++++++++- arch/x86/kernel/apic/io_apic.c | 23 ++++-- arch/x86/kernel/apic/msi.c | 44 +++++++++-- arch/x86/kernel/apic/x2apic_phys.c | 9 +++ arch/x86/kernel/kvm.c | 6 ++ arch/x86/kernel/x86_init.c | 1 + drivers/iommu/hyperv-iommu.c | 149 +---------------------------------- include/linux/interrupt.h | 2 + include/linux/irq.h | 10 ++- include/linux/irqdomain.h | 7 +- kernel/irq/devres.c | 8 +- kernel/irq/ipi.c | 2 +- kernel/irq/irqdesc.c | 29 ++++--- kernel/irq/irqdomain.c | 69 ++++++++++++++-- kernel/irq/manage.c | 19 ++++- 21 files changed, 240 insertions(+), 192 deletions(-)