From patchwork Mon Nov 23 13:28:00 2020
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11925199
Subject: [PATCH v3 1/5] evtchn: drop acquiring of per-channel lock from send_guest_{global,vcpu}_virq()
From: Jan Beulich
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall, Wei Liu, Stefano Stabellini
Date: Mon, 23 Nov 2020 14:28:00 +0100
In-Reply-To: <9d7a052a-6222-80ff-cbf1-612d4ca50c2a@suse.com>

The per-vCPU virq_lock, which is being held anyway, together with there
not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
is zero, provides sufficient guarantees. Undo the lock addition done for
XSA-343 (commit e045199c7c9c "evtchn: address races with evtchn_reset()").
Update the description next to struct evtchn_port_ops accordingly.

Signed-off-by: Jan Beulich
---
v3: Re-base.
v2: New.

--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -809,7 +809,6 @@ void send_guest_vcpu_virq(struct vcpu *v
     unsigned long flags;
     int port;
     struct domain *d;
-    struct evtchn *chn;
 
     ASSERT(!virq_is_global(virq));
 
@@ -820,12 +819,7 @@ void send_guest_vcpu_virq(struct vcpu *v
         goto out;
 
     d = v->domain;
-    chn = evtchn_from_port(d, port);
-    if ( evtchn_read_trylock(chn) )
-    {
-        evtchn_port_set_pending(d, v->vcpu_id, chn);
-        evtchn_read_unlock(chn);
-    }
+    evtchn_port_set_pending(d, v->vcpu_id, evtchn_from_port(d, port));
 
  out:
     spin_unlock_irqrestore(&v->virq_lock, flags);
@@ -854,11 +848,7 @@ void send_guest_global_virq(struct domai
         goto out;
 
     chn = evtchn_from_port(d, port);
-    if ( evtchn_read_trylock(chn) )
-    {
-        evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
-        evtchn_read_unlock(chn);
-    }
+    evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
 
  out:
     spin_unlock_irqrestore(&v->virq_lock, flags);
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -192,9 +192,16 @@ int evtchn_reset(struct domain *d, bool
  * Low-level event channel port ops.
  *
  * All hooks have to be called with a lock held which prevents the channel
- * from changing state. This may be the domain event lock, the per-channel
- * lock, or in the case of sending interdomain events also the other side's
- * per-channel lock. Exceptions apply in certain cases for the PV shim.
+ * from changing state. This may be
+ * - the domain event lock,
+ * - the per-channel lock,
+ * - in the case of sending interdomain events the other side's per-channel
+ *   lock,
+ * - in the case of sending non-global vIRQ-s the per-vCPU virq_lock (in
+ *   combination with the ordering enforced through how the vCPU's
+ *   virq_to_evtchn[] gets updated),
+ * - in the case of sending global vIRQ-s vCPU 0's virq_lock.
+ * Exceptions apply in certain cases for the PV shim.
  */
 struct evtchn_port_ops {
     void (*init)(struct domain *d, struct evtchn *evtchn);
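[Editor's note: the synchronization argument above can be hard to see from
the diff alone. Below is a minimal user-space sketch of the pattern, an
illustrative analogue rather than Xen code: the sender only touches the
channel while holding virq_lock and only when the published port is
non-zero; the closer unpublishes the port and then drains the lock before
tearing anything down. All names here are made up for the example.]

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_mutex_t virq_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic int virq_port;               /* 0 == "no channel bound" */

/* Sender: mirrors send_guest_vcpu_virq(). */
static void send_virq(void)
{
    pthread_mutex_lock(&virq_lock);
    int port = atomic_load(&virq_port);
    if ( port != 0 )
        /* The channel behind "port" cannot go away while virq_lock is held. */
        printf("set pending on port %d\n", port);
    pthread_mutex_unlock(&virq_lock);
}

/* Closer: mirrors the ECS_VIRQ case of evtchn_close(). */
static void close_virq(void)
{
    atomic_store(&virq_port, 0);            /* unpublish first ...          */
    pthread_mutex_lock(&virq_lock);         /* ... then drain any sender,   */
    pthread_mutex_unlock(&virq_lock);       /* like spin_barrier() does     */
    /* No sender can still be using the channel; safe to free it now. */
}

int main(void)
{
    send_virq();    /* would deliver */
    close_virq();
    send_virq();    /* sees port == 0, does nothing */
    return 0;
}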
From patchwork Mon Nov 23 13:28:21 2020
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11925201
Subject: [PATCH v3 2/5] evtchn: avoid access tearing for ->virq_to_evtchn[] accesses
From: Jan Beulich
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall, Wei Liu, Stefano Stabellini
Date: Mon, 23 Nov 2020 14:28:21 +0100
In-Reply-To: <9d7a052a-6222-80ff-cbf1-612d4ca50c2a@suse.com>

Use {read,write}_atomic() to rule out any eventualities, in particular
since these accesses don't all happen under one consistent lock.

Requested-by: Julien Grall
Signed-off-by: Jan Beulich
Reviewed-by: Julien Grall
---
v3: New.

--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -446,7 +446,7 @@ int evtchn_bind_virq(evtchn_bind_virq_t
 
     spin_lock(&d->event_lock);
 
-    if ( v->virq_to_evtchn[virq] != 0 )
+    if ( read_atomic(&v->virq_to_evtchn[virq]) )
         ERROR_EXIT(-EEXIST);
 
     if ( port != 0 )
@@ -474,7 +474,8 @@ int evtchn_bind_virq(evtchn_bind_virq_t
 
     evtchn_write_unlock(chn);
 
-    v->virq_to_evtchn[virq] = bind->port = port;
+    bind->port = port;
+    write_atomic(&v->virq_to_evtchn[virq], port);
 
  out:
     spin_unlock(&d->event_lock);
@@ -660,9 +661,9 @@ int evtchn_close(struct domain *d1, int
         case ECS_VIRQ:
             for_each_vcpu ( d1, v )
             {
-                if ( v->virq_to_evtchn[chn1->u.virq] != port1 )
+                if ( read_atomic(&v->virq_to_evtchn[chn1->u.virq]) != port1 )
                     continue;
-                v->virq_to_evtchn[chn1->u.virq] = 0;
+                write_atomic(&v->virq_to_evtchn[chn1->u.virq], 0);
                 spin_barrier(&v->virq_lock);
             }
             break;
@@ -801,7 +802,7 @@ bool evtchn_virq_enabled(const struct vc
     if ( virq_is_global(virq) && v->vcpu_id )
         v = domain_vcpu(v->domain, 0);
 
-    return v->virq_to_evtchn[virq];
+    return read_atomic(&v->virq_to_evtchn[virq]);
 }
 
 void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq)
@@ -814,7 +815,7 @@ void send_guest_vcpu_virq(struct vcpu *v
 
     spin_lock_irqsave(&v->virq_lock, flags);
 
-    port = v->virq_to_evtchn[virq];
+    port = read_atomic(&v->virq_to_evtchn[virq]);
     if ( unlikely(port == 0) )
         goto out;
 
@@ -843,7 +844,7 @@ void send_guest_global_virq(struct domai
 
     spin_lock_irqsave(&v->virq_lock, flags);
 
-    port = v->virq_to_evtchn[virq];
+    port = read_atomic(&v->virq_to_evtchn[virq]);
    if ( unlikely(port == 0) )
        goto out;
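[Editor's note: for readers unfamiliar with the helpers, a rough
user-space model follows. This is an assumption for illustration only;
Xen's real read_atomic()/write_atomic() are per-architecture
implementations. The point is a single full-width volatile access which
the compiler may neither tear into smaller accesses nor re-issue.]

#include <stdint.h>

/* Rough model of the helpers: one volatile access of the full width. */
#define read_atomic(p)     (*(const volatile __typeof__(*(p)) *)(p))
#define write_atomic(p, v) (*(volatile __typeof__(*(p)) *)(p) = (v))

static uint32_t virq_to_evtchn_entry;   /* illustrative stand-in */

uint32_t sender_reads_port(void)
{
    /* With a plain read the compiler could e.g. reload the value twice,
     * letting it change between a zero check and its later use. */
    return read_atomic(&virq_to_evtchn_entry);
}

void binder_publishes_port(uint32_t port)
{
    /* Likewise, the store happens in one piece. */
    write_atomic(&virq_to_evtchn_entry, port);
}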
From patchwork Mon Nov 23 13:28:42 2020
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11925203

Subject: [PATCH v3 3/5] evtchn: convert vIRQ lock to an r/w one
From: Jan Beulich
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall, Wei Liu, Stefano Stabellini
Date: Mon, 23 Nov 2020 14:28:42 +0100
In-Reply-To: <9d7a052a-6222-80ff-cbf1-612d4ca50c2a@suse.com>

There's no need to serialize all sending of vIRQ-s; all that's needed is
serialization against the closing of the respective event channels (so
far achieved by means of a barrier). To facilitate the conversion,
switch to an ordinary write-locked region in evtchn_close().

Signed-off-by: Jan Beulich
Reviewed-by: Julien Grall
---
v3: Re-base over added new earlier patch.
v2: Don't introduce/use rw_barrier() here. Add comment to
    evtchn_bind_virq(). Re-base.

--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -160,7 +160,7 @@ struct vcpu *vcpu_create(struct domain *
     v->vcpu_id = vcpu_id;
     v->dirty_cpu = VCPU_CPU_CLEAN;
 
-    spin_lock_init(&v->virq_lock);
+    rwlock_init(&v->virq_lock);
 
     tasklet_init(&v->continue_hypercall_tasklet, NULL, NULL);
 
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -475,6 +475,13 @@ int evtchn_bind_virq(evtchn_bind_virq_t
     evtchn_write_unlock(chn);
 
     bind->port = port;
+    /*
+     * If anything, the update of virq_to_evtchn[] would need guarding by
+     * virq_lock, but since this is the last action here, there's no strict
+     * need to acquire the lock. Hence holding event_lock isn't helpful
+     * anymore at this point, but utilize that its unlocking acts as the
+     * otherwise necessary smp_wmb() here.
+     */
     write_atomic(&v->virq_to_evtchn[virq], port);
 
  out:
@@ -661,10 +668,12 @@ int evtchn_close(struct domain *d1, int
         case ECS_VIRQ:
             for_each_vcpu ( d1, v )
             {
-                if ( read_atomic(&v->virq_to_evtchn[chn1->u.virq]) != port1 )
-                    continue;
-                write_atomic(&v->virq_to_evtchn[chn1->u.virq], 0);
-                spin_barrier(&v->virq_lock);
+                unsigned long flags;
+
+                write_lock_irqsave(&v->virq_lock, flags);
+                if ( read_atomic(&v->virq_to_evtchn[chn1->u.virq]) == port1 )
+                    write_atomic(&v->virq_to_evtchn[chn1->u.virq], 0);
+                write_unlock_irqrestore(&v->virq_lock, flags);
             }
             break;
@@ -813,7 +822,7 @@ void send_guest_vcpu_virq(struct vcpu *v
 
     ASSERT(!virq_is_global(virq));
 
-    spin_lock_irqsave(&v->virq_lock, flags);
+    read_lock_irqsave(&v->virq_lock, flags);
 
     port = read_atomic(&v->virq_to_evtchn[virq]);
     if ( unlikely(port == 0) )
@@ -823,7 +832,7 @@ void send_guest_vcpu_virq(struct vcpu *v
         evtchn_port_set_pending(d, v->vcpu_id, evtchn_from_port(d, port));
 
  out:
-    spin_unlock_irqrestore(&v->virq_lock, flags);
+    read_unlock_irqrestore(&v->virq_lock, flags);
 }
 
 void send_guest_global_virq(struct domain *d, uint32_t virq)
@@ -842,7 +851,7 @@ void send_guest_global_virq(struct domai
     if ( unlikely(v == NULL) )
         return;
 
-    spin_lock_irqsave(&v->virq_lock, flags);
+    read_lock_irqsave(&v->virq_lock, flags);
 
     port = read_atomic(&v->virq_to_evtchn[virq]);
     if ( unlikely(port == 0) )
@@ -852,7 +861,7 @@ void send_guest_global_virq(struct domai
         evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
 
  out:
-    spin_unlock_irqrestore(&v->virq_lock, flags);
+    read_unlock_irqrestore(&v->virq_lock, flags);
 }
 
 void send_guest_pirq(struct domain *d, const struct pirq *pirq)
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -238,7 +238,7 @@ struct vcpu
 
     /* IRQ-safe virq_lock protects against delivering VIRQ to stale evtchn. */
     evtchn_port_t    virq_to_evtchn[NR_VIRQS];
-    spinlock_t       virq_lock;
+    rwlock_t         virq_lock;
 
     /* Tasklet for continue_hypercall_on_cpu(). */
     struct tasklet   continue_hypercall_tasklet;
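[Editor's note: the effect of the conversion, in a hedged user-space
analogue with illustrative names: multiple vCPUs can now deliver vIRQ-s
concurrently on the read side, while closing still excludes all senders
by taking the write side, replacing the earlier spin_barrier().]

#include <pthread.h>
#include <stdatomic.h>

static pthread_rwlock_t virq_lock = PTHREAD_RWLOCK_INITIALIZER;
static _Atomic int virq_port;

/* Many senders may hold the read side at once, cf. read_lock_irqsave()
 * in send_guest_{vcpu,global}_virq(). */
void send_virq(void)
{
    pthread_rwlock_rdlock(&virq_lock);
    int port = atomic_load(&virq_port);
    if ( port != 0 )
        ;   /* evtchn_port_set_pending(...) would go here */
    pthread_rwlock_unlock(&virq_lock);
}

/* The closer takes the write side: it both drains in-flight senders and
 * blocks new ones while invalidating the binding, cf. the ECS_VIRQ case
 * of evtchn_close() above. */
void close_virq(int port1)
{
    pthread_rwlock_wrlock(&virq_lock);
    if ( atomic_load(&virq_port) == port1 )
        atomic_store(&virq_port, 0);
    pthread_rwlock_unlock(&virq_lock);
}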
From patchwork Mon Nov 23 13:29:07 2020
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11925205
Subject: [PATCH v3 4/5] evtchn: convert domain event lock to an r/w one
From: Jan Beulich
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall, Wei Liu, Stefano Stabellini
Date: Mon, 23 Nov 2020 14:29:07 +0100
In-Reply-To: <9d7a052a-6222-80ff-cbf1-612d4ca50c2a@suse.com>

Especially for the use in evtchn_move_pirqs() (called when moving a vCPU
across pCPU-s) and for the uses in the EOI handling of the PCI
pass-through code, serializing perhaps an entire domain isn't helpful
when no state changes that isn't e.g. further protected by the
per-channel lock. Unfortunately this implies dropping lock profiling for
this lock, until r/w locks may get enabled for such functionality.

While ->notify_vcpu_id is now meant to be consistently updated with the
per-channel lock held, an extension applies to ECS_PIRQ: the field is
also guaranteed not to change with the per-domain event lock held for
writing. Therefore the link_pirq_port() call from evtchn_bind_pirq()
could in principle be moved out of the per-channel locked regions, but
this further code churn didn't seem worth it.

Signed-off-by: Jan Beulich
---
v3: Re-base.
v2: Consistently lock for writing in evtchn_reset(). Fix error path in
    pci_clean_dpci_irqs(). Lock for writing in pt_irq_time_out(),
    hvm_dirq_assist(), hvm_dpci_eoi(), and hvm_dpci_isairq_eoi(). Move
    rw_barrier() introduction here. Re-base over changes earlier in the
    series.
---
RFC: Wouldn't flask_get_peer_sid() better use the per-channel lock?
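[Editor's note: besides the mechanical spin_lock() -> read_lock() /
write_lock() replacements, the diff below has to introduce rw_barrier()
as the r/w counterpart of spin_barrier() for evtchn_destroy(). A toy
user-space model of its semantics follows; the names here are assumed
for illustration, the real implementation is the _rw_barrier() hunk in
xen/common/rwlock.c below.]

#include <sched.h>
#include <stdatomic.h>

/* Toy model: a zero count means "held by nobody in either mode", akin to
 * the cnts field of Xen's queue rwlock. */
static _Atomic unsigned int cnts;

/* Wait until the lock has, at least momentarily, no holder at all,
 * without acquiring it. This is sufficient in evtchn_destroy() because
 * d->is_dying is already set, so no new holder doing allocations can
 * sneak back in afterwards. */
static void rw_barrier_model(void)
{
    atomic_thread_fence(memory_order_seq_cst);   /* smp_mb()           */
    while ( atomic_load(&cnts) != 0 )            /* _rw_is_locked()    */
        sched_yield();                           /* arch_lock_relax()  */
    atomic_thread_fence(memory_order_seq_cst);   /* smp_mb()           */
}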
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -917,7 +917,7 @@ int arch_domain_soft_reset(struct domain
     if ( !is_hvm_domain(d) )
         return -EINVAL;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     for ( i = 0; i < d->nr_pirqs ; i++ )
     {
         if ( domain_pirq_to_emuirq(d, i) != IRQ_UNBOUND )
@@ -927,7 +927,7 @@ int arch_domain_soft_reset(struct domain
             break;
         }
     }
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     if ( ret )
         return ret;
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -528,9 +528,9 @@ void hvm_migrate_pirqs(struct vcpu *v)
     if ( !is_iommu_enabled(d) || !hvm_domain_irq(d)->dpci )
        return;
 
-    spin_lock(&d->event_lock);
+    read_lock(&d->event_lock);
     pt_pirq_iterate(d, migrate_pirq, v);
-    spin_unlock(&d->event_lock);
+    read_unlock(&d->event_lock);
 }
 
 static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -404,9 +404,9 @@ int hvm_inject_msi(struct domain *d, uin
         {
             int rc;
 
-            spin_lock(&d->event_lock);
+            write_lock(&d->event_lock);
             rc = map_domain_emuirq_pirq(d, pirq, IRQ_MSI_EMU);
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
             if ( rc )
                 return rc;
             info = pirq_info(d, pirq);
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -203,9 +203,9 @@ static int vioapic_hwdom_map_gsi(unsigne
     {
         gprintk(XENLOG_WARNING, "vioapic: error binding GSI %u: %d\n",
                 gsi, ret);
-        spin_lock(&currd->event_lock);
+        write_lock(&currd->event_lock);
         unmap_domain_pirq(currd, pirq);
-        spin_unlock(&currd->event_lock);
+        write_unlock(&currd->event_lock);
     }
     pcidevs_unlock();
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -465,7 +465,7 @@ int msixtbl_pt_register(struct domain *d
     int r = -EINVAL;
 
     ASSERT(pcidevs_locked());
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     if ( !msixtbl_initialised(d) )
         return -ENODEV;
@@ -535,7 +535,7 @@ void msixtbl_pt_unregister(struct domain
     struct msixtbl_entry *entry;
 
     ASSERT(pcidevs_locked());
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     if ( !msixtbl_initialised(d) )
         return;
@@ -589,13 +589,13 @@ void msixtbl_pt_cleanup(struct domain *d
     if ( !msixtbl_initialised(d) )
         return;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     list_for_each_entry_safe( entry, temp,
                               &d->arch.hvm.msixtbl_list, list )
         del_msixtbl_entry(entry);
 
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 }
 
 void msix_write_completion(struct vcpu *v)
@@ -719,9 +719,9 @@ int vpci_msi_arch_update(struct vpci_msi
                          msi->arch.pirq, msi->mask);
     if ( rc )
     {
-        spin_lock(&pdev->domain->event_lock);
+        write_lock(&pdev->domain->event_lock);
         unmap_domain_pirq(pdev->domain, msi->arch.pirq);
-        spin_unlock(&pdev->domain->event_lock);
+        write_unlock(&pdev->domain->event_lock);
         pcidevs_unlock();
         msi->arch.pirq = INVALID_PIRQ;
         return rc;
@@ -760,9 +760,9 @@ static int vpci_msi_enable(const struct
         rc = vpci_msi_update(pdev, data, address, vectors, pirq, mask);
         if ( rc )
         {
-            spin_lock(&pdev->domain->event_lock);
+            write_lock(&pdev->domain->event_lock);
             unmap_domain_pirq(pdev->domain, pirq);
-            spin_unlock(&pdev->domain->event_lock);
+            write_unlock(&pdev->domain->event_lock);
             pcidevs_unlock();
             return rc;
         }
@@ -807,9 +807,9 @@ static void vpci_msi_disable(const struc
         ASSERT(!rc);
     }
 
-    spin_lock(&pdev->domain->event_lock);
+    write_lock(&pdev->domain->event_lock);
     unmap_domain_pirq(pdev->domain, pirq);
-    spin_unlock(&pdev->domain->event_lock);
+    write_unlock(&pdev->domain->event_lock);
     pcidevs_unlock();
 }
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -2413,10 +2413,10 @@ int ioapic_guest_write(unsigned long phy
         }
         if ( pirq >= 0 )
         {
-            spin_lock(&hardware_domain->event_lock);
+            write_lock(&hardware_domain->event_lock);
             ret = map_domain_pirq(hardware_domain, pirq, irq,
                                   MAP_PIRQ_TYPE_GSI, NULL);
-            spin_unlock(&hardware_domain->event_lock);
+            write_unlock(&hardware_domain->event_lock);
             if ( ret < 0 )
                 return ret;
         }
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1536,7 +1536,7 @@ int pirq_guest_bind(struct vcpu *v, stru
     irq_guest_action_t *action, *newaction = NULL;
     int rc = 0;
 
-    WARN_ON(!spin_is_locked(&v->domain->event_lock));
+    WARN_ON(!rw_is_write_locked(&v->domain->event_lock));
     BUG_ON(!local_irq_is_enabled());
 
 retry:
@@ -1756,7 +1756,7 @@ void pirq_guest_unbind(struct domain *d,
     struct irq_desc *desc;
     int irq = 0;
 
-    WARN_ON(!spin_is_locked(&d->event_lock));
+    WARN_ON(!rw_is_write_locked(&d->event_lock));
     BUG_ON(!local_irq_is_enabled());
 
     desc = pirq_spin_lock_irq_desc(pirq, NULL);
@@ -1793,7 +1793,7 @@ static bool pirq_guest_force_unbind(stru
     unsigned int i;
     bool bound = false;
 
-    WARN_ON(!spin_is_locked(&d->event_lock));
+    WARN_ON(!rw_is_write_locked(&d->event_lock));
     BUG_ON(!local_irq_is_enabled());
 
     desc = pirq_spin_lock_irq_desc(pirq, NULL);
@@ -2037,7 +2037,7 @@ int get_free_pirq(struct domain *d, int
 {
     int i;
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     if ( type == MAP_PIRQ_TYPE_GSI )
     {
@@ -2062,7 +2062,7 @@ int get_free_pirqs(struct domain *d, uns
 {
     unsigned int i, found = 0;
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     for ( i = d->nr_pirqs - 1; i >= nr_irqs_gsi; --i )
         if ( is_free_pirq(d, pirq_info(d, i)) )
@@ -2090,7 +2090,7 @@ int map_domain_pirq(
     DECLARE_BITMAP(prepared, MAX_MSI_IRQS) = {};
     DECLARE_BITMAP(granted, MAX_MSI_IRQS) = {};
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     if ( !irq_access_permitted(current->domain, irq))
         return -EPERM;
@@ -2309,7 +2309,7 @@ int unmap_domain_pirq(struct domain *d,
         return -EINVAL;
 
     ASSERT(pcidevs_locked());
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     info = pirq_info(d, pirq);
     if ( !info || (irq = info->arch.irq) <= 0 )
@@ -2436,13 +2436,13 @@ void free_domain_pirqs(struct domain *d)
     int i;
 
     pcidevs_lock();
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     for ( i = 0; i < d->nr_pirqs; i++ )
         if ( domain_pirq_to_irq(d, i) > 0 )
             unmap_domain_pirq(d, i);
 
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
     pcidevs_unlock();
 }
 
@@ -2683,7 +2683,7 @@ int map_domain_emuirq_pirq(struct domain
     int old_emuirq = IRQ_UNBOUND, old_pirq = IRQ_UNBOUND;
     struct pirq *info;
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     if ( !is_hvm_domain(d) )
         return -EINVAL;
@@ -2749,7 +2749,7 @@ int unmap_domain_pirq_emuirq(struct doma
     if ( (pirq < 0) || (pirq >= d->nr_pirqs) )
         return -EINVAL;
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     emuirq = domain_pirq_to_emuirq(d, pirq);
     if ( emuirq == IRQ_UNBOUND )
@@ -2797,7 +2797,7 @@ static int allocate_pirq(struct domain *
 {
     int current_pirq;
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
     current_pirq = domain_irq_to_pirq(d, irq);
     if ( pirq < 0 )
     {
@@ -2869,7 +2869,7 @@ int allocate_and_map_gsi_pirq(struct dom
     }
 
     /* Verify or get pirq. */
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     pirq = allocate_pirq(d, index, *pirq_p, irq, MAP_PIRQ_TYPE_GSI, NULL);
     if ( pirq < 0 )
     {
@@ -2882,7 +2882,7 @@ int allocate_and_map_gsi_pirq(struct dom
     *pirq_p = pirq;
 
  done:
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return ret;
 }
@@ -2923,7 +2923,7 @@ int allocate_and_map_msi_pirq(struct dom
     pcidevs_lock();
     /* Verify or get pirq. */
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     pirq = allocate_pirq(d, index, *pirq_p, irq, type, &msi->entry_nr);
     if ( pirq < 0 )
     {
@@ -2936,7 +2936,7 @@ int allocate_and_map_msi_pirq(struct dom
     *pirq_p = pirq;
 
  done:
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
     pcidevs_unlock();
     if ( ret )
     {
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -34,7 +34,7 @@ static int physdev_hvm_map_pirq(
 
     ASSERT(!is_hardware_domain(d));
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     switch ( type )
     {
     case MAP_PIRQ_TYPE_GSI: {
@@ -84,7 +84,7 @@ static int physdev_hvm_map_pirq(
         break;
     }
 
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return ret;
 }
@@ -154,18 +154,18 @@ int physdev_unmap_pirq(domid_t domid, in
 
     if ( is_hvm_domain(d) && has_pirq(d) )
     {
-        spin_lock(&d->event_lock);
+        write_lock(&d->event_lock);
         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
             ret = unmap_domain_pirq_emuirq(d, pirq);
-        spin_unlock(&d->event_lock);
+        write_unlock(&d->event_lock);
         if ( domid == DOMID_SELF || ret )
             goto free_domain;
     }
 
     pcidevs_lock();
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     ret = unmap_domain_pirq(d, pirq);
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
     pcidevs_unlock();
 
 free_domain:
@@ -192,10 +192,10 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
         ret = -EINVAL;
         if ( eoi.irq >= currd->nr_pirqs )
             break;
-        spin_lock(&currd->event_lock);
+        read_lock(&currd->event_lock);
         pirq = pirq_info(currd, eoi.irq);
         if ( !pirq ) {
-            spin_unlock(&currd->event_lock);
+            read_unlock(&currd->event_lock);
             break;
         }
         if ( currd->arch.auto_unmask )
@@ -214,7 +214,7 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
                  && hvm_irq->gsi_assert_count[gsi] )
                 send_guest_pirq(currd, pirq);
         }
-        spin_unlock(&currd->event_lock);
+        read_unlock(&currd->event_lock);
         ret = 0;
         break;
     }
@@ -626,7 +626,7 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
         if ( copy_from_guest(&out, arg, 1) != 0 )
             break;
 
-        spin_lock(&currd->event_lock);
+        write_lock(&currd->event_lock);
 
         ret = get_free_pirq(currd, out.type);
         if ( ret >= 0 )
@@ -639,7 +639,7 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_H
                 ret = -ENOMEM;
         }
 
-        spin_unlock(&currd->event_lock);
+        write_unlock(&currd->event_lock);
 
         if ( ret >= 0 )
         {
--- a/xen/arch/x86/pv/shim.c
+++ b/xen/arch/x86/pv/shim.c
@@ -448,7 +448,7 @@ static long pv_shim_event_channel_op(int
         if ( rc )                                                           \
             break;                                                          \
                                                                             \
-        spin_lock(&d->event_lock);                                          \
+        write_lock(&d->event_lock);                                         \
         rc = evtchn_allocate_port(d, op.port_field);                        \
         if ( rc )                                                           \
         {                                                                   \
@@ -457,7 +457,7 @@ static long pv_shim_event_channel_op(int
         }                                                                   \
         else                                                                \
             evtchn_reserve(d, op.port_field);                               \
-        spin_unlock(&d->event_lock);                                        \
+        write_unlock(&d->event_lock);                                       \
                                                                             \
        if ( !rc && __copy_to_guest(arg, &op, 1) )                          \
            rc = -EFAULT;                                                   \
@@ -585,11 +585,11 @@ static long pv_shim_event_channel_op(int
         if ( rc )
             break;
 
-        spin_lock(&d->event_lock);
+        write_lock(&d->event_lock);
         rc = evtchn_allocate_port(d, ipi.port);
         if ( rc )
         {
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
 
             close.port = ipi.port;
             BUG_ON(xen_hypercall_event_channel_op(EVTCHNOP_close, &close));
@@ -598,7 +598,7 @@ static long pv_shim_event_channel_op(int
 
         evtchn_assign_vcpu(d, ipi.port, ipi.vcpu);
         evtchn_reserve(d, ipi.port);
-        spin_unlock(&d->event_lock);
+        write_unlock(&d->event_lock);
 
         if ( __copy_to_guest(arg, &ipi, 1) )
             rc = -EFAULT;
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -294,7 +294,7 @@ static long evtchn_alloc_unbound(evtchn_
     if ( d == NULL )
         return -ESRCH;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     if ( (port = get_free_port(d)) < 0 )
         ERROR_EXIT_DOM(port, d);
@@ -317,7 +317,7 @@ static long evtchn_alloc_unbound(evtchn_
 
  out:
     check_free_port(d, port);
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
     rcu_unlock_domain(d);
 
     return rc;
@@ -363,14 +363,14 @@ static long evtchn_bind_interdomain(evtc
     /* Avoid deadlock by first acquiring lock of domain with smaller id. */
     if ( ld < rd )
     {
-        spin_lock(&ld->event_lock);
-        spin_lock(&rd->event_lock);
+        write_lock(&ld->event_lock);
+        read_lock(&rd->event_lock);
     }
     else
     {
         if ( ld != rd )
-            spin_lock(&rd->event_lock);
-        spin_lock(&ld->event_lock);
+            read_lock(&rd->event_lock);
+        write_lock(&ld->event_lock);
     }
 
     if ( (lport = get_free_port(ld)) < 0 )
@@ -411,9 +411,9 @@ static long evtchn_bind_interdomain(evtc
 
  out:
     check_free_port(ld, lport);
-    spin_unlock(&ld->event_lock);
+    write_unlock(&ld->event_lock);
     if ( ld != rd )
-        spin_unlock(&rd->event_lock);
+        read_unlock(&rd->event_lock);
 
     rcu_unlock_domain(rd);
 
@@ -444,7 +444,7 @@ int evtchn_bind_virq(evtchn_bind_virq_t
     if ( (v = domain_vcpu(d, vcpu)) == NULL )
         return -ENOENT;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     if ( read_atomic(&v->virq_to_evtchn[virq]) )
         ERROR_EXIT(-EEXIST);
@@ -485,7 +485,7 @@ int evtchn_bind_virq(evtchn_bind_virq_t
     write_atomic(&v->virq_to_evtchn[virq], port);
 
  out:
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return rc;
 }
@@ -501,7 +501,7 @@ static long evtchn_bind_ipi(evtchn_bind_
     if ( domain_vcpu(d, vcpu) == NULL )
         return -ENOENT;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     if ( (port = get_free_port(d)) < 0 )
         ERROR_EXIT(port);
@@ -519,7 +519,7 @@ static long evtchn_bind_ipi(evtchn_bind_
     bind->port = port;
 
  out:
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return rc;
 }
@@ -565,7 +565,7 @@ static long evtchn_bind_pirq(evtchn_bind
     if ( !is_hvm_domain(d) && !pirq_access_permitted(d, pirq) )
         return -EPERM;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     if ( pirq_to_evtchn(d, pirq) != 0 )
         ERROR_EXIT(-EEXIST);
@@ -605,7 +605,7 @@ static long evtchn_bind_pirq(evtchn_bind
 
  out:
     check_free_port(d, port);
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return rc;
 }
@@ -620,7 +620,7 @@ int evtchn_close(struct domain *d1, int
     long           rc = 0;
 
  again:
-    spin_lock(&d1->event_lock);
+    write_lock(&d1->event_lock);
 
     if ( !port_is_valid(d1, port1) )
     {
@@ -690,13 +690,11 @@ int evtchn_close(struct domain *d1, int
                 BUG();
 
             if ( d1 < d2 )
-            {
-                spin_lock(&d2->event_lock);
-            }
+                read_lock(&d2->event_lock);
             else if ( d1 != d2 )
             {
-                spin_unlock(&d1->event_lock);
-                spin_lock(&d2->event_lock);
+                write_unlock(&d1->event_lock);
+                read_lock(&d2->event_lock);
                 goto again;
             }
         }
@@ -743,11 +741,11 @@ int evtchn_close(struct domain *d1, int
     if ( d2 != NULL )
     {
         if ( d1 != d2 )
-            spin_unlock(&d2->event_lock);
+            read_unlock(&d2->event_lock);
         put_domain(d2);
     }
 
-    spin_unlock(&d1->event_lock);
+    write_unlock(&d1->event_lock);
 
     return rc;
 }
@@ -963,7 +961,7 @@ int evtchn_status(evtchn_status_t *statu
     if ( d == NULL )
         return -ESRCH;
 
-    spin_lock(&d->event_lock);
+    read_lock(&d->event_lock);
 
     if ( !port_is_valid(d, port) )
     {
@@ -1016,7 +1014,7 @@ int evtchn_status(evtchn_status_t *statu
     status->vcpu = chn->notify_vcpu_id;
 
  out:
-    spin_unlock(&d->event_lock);
+    read_unlock(&d->event_lock);
     rcu_unlock_domain(d);
 
     return rc;
@@ -1034,7 +1032,7 @@ long evtchn_bind_vcpu(unsigned int port,
     if ( (v = domain_vcpu(d, vcpu_id)) == NULL )
         return -ENOENT;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     if ( !port_is_valid(d, port) )
     {
@@ -1078,7 +1076,7 @@ long evtchn_bind_vcpu(unsigned int port,
     }
 
  out:
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return rc;
 }
@@ -1124,7 +1122,7 @@ int evtchn_reset(struct domain *d, bool
     if ( d != current->domain && !d->controller_pause_count )
         return -EINVAL;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     /*
      * If we are resuming, then start where we stopped. Otherwise, check
@@ -1135,7 +1133,7 @@ int evtchn_reset(struct domain *d, bool
     if ( i > d->next_evtchn )
         d->next_evtchn = i;
 
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     if ( !i )
         return -EBUSY;
@@ -1147,14 +1145,14 @@ int evtchn_reset(struct domain *d, bool
         /* NB: Choice of frequency is arbitrary. */
         if ( !(i & 0x3f) && hypercall_preempt_check() )
         {
-            spin_lock(&d->event_lock);
+            write_lock(&d->event_lock);
             d->next_evtchn = i;
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
             return -ERESTART;
         }
     }
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     d->next_evtchn = 0;
 
@@ -1167,7 +1165,7 @@ int evtchn_reset(struct domain *d, bool
         evtchn_2l_init(d);
     }
 
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return rc;
 }
@@ -1357,7 +1355,7 @@ int alloc_unbound_xen_event_channel(
     struct evtchn *chn;
     int port, rc;
 
-    spin_lock(&ld->event_lock);
+    write_lock(&ld->event_lock);
 
     port = rc = get_free_port(ld);
     if ( rc < 0 )
@@ -1385,7 +1383,7 @@ int alloc_unbound_xen_event_channel(
 
  out:
     check_free_port(ld, port);
-    spin_unlock(&ld->event_lock);
+    write_unlock(&ld->event_lock);
 
     return rc < 0 ? rc : port;
 }
@@ -1473,7 +1471,8 @@ int evtchn_init(struct domain *d, unsign
         return -ENOMEM;
     d->valid_evtchns = EVTCHNS_PER_BUCKET;
 
-    spin_lock_init_prof(d, event_lock);
+    rwlock_init(&d->event_lock);
+
     if ( get_free_port(d) != 0 )
     {
         free_evtchn_bucket(d, d->evtchn);
@@ -1500,7 +1499,7 @@ int evtchn_destroy(struct domain *d)
 
     /* After this barrier no new event-channel allocations can occur. */
     BUG_ON(!d->is_dying);
-    spin_barrier(&d->event_lock);
+    rw_barrier(&d->event_lock);
 
     /* Close all existing event channels. */
     for ( i = d->valid_evtchns; --i; )
@@ -1558,13 +1557,13 @@ void evtchn_move_pirqs(struct vcpu *v)
     unsigned int port;
     struct evtchn *chn;
 
-    spin_lock(&d->event_lock);
+    read_lock(&d->event_lock);
     for ( port = v->pirq_evtchn_head; port; port = chn->u.pirq.next_port )
     {
         chn = evtchn_from_port(d, port);
         pirq_set_affinity(d, chn->u.pirq.irq, mask);
     }
-    spin_unlock(&d->event_lock);
+    read_unlock(&d->event_lock);
 }
 
 
@@ -1577,7 +1576,7 @@ static void domain_dump_evtchn_info(stru
            "Polling vCPUs: {%*pbl}\n"
            "    port [p/m/s]\n", d->domain_id, d->max_vcpus, d->poll_mask);
 
-    spin_lock(&d->event_lock);
+    read_lock(&d->event_lock);
 
     for ( port = 1; port_is_valid(d, port); ++port )
     {
@@ -1624,7 +1623,7 @@ static void domain_dump_evtchn_info(stru
         }
     }
 
-    spin_unlock(&d->event_lock);
+    read_unlock(&d->event_lock);
 }
 
 static void dump_evtchn_info(unsigned char key)
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -561,7 +561,7 @@ int evtchn_fifo_init_control(struct evtc
     if ( offset & (8 - 1) )
         return -EINVAL;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     /*
      * If this is the first control block, setup an empty event array
@@ -593,13 +593,13 @@ int evtchn_fifo_init_control(struct evtc
     else
         rc = map_control_block(v, gfn, offset);
 
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return rc;
 
  error:
     evtchn_fifo_destroy(d);
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
     return rc;
 }
@@ -652,9 +652,9 @@ int evtchn_fifo_expand_array(const struc
     if ( !d->evtchn_fifo )
         return -EOPNOTSUPP;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     rc = add_page_to_event_array(d, expand_array->array_gfn);
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return rc;
 }
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -105,7 +105,7 @@ static void pt_pirq_softirq_reset(struct
 {
     struct domain *d = pirq_dpci->dom;
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_write_locked(&d->event_lock));
 
     switch ( cmpxchg(&pirq_dpci->state, 1 << STATE_SCHED, 0) )
     {
@@ -162,7 +162,7 @@ static void pt_irq_time_out(void *data)
     const struct hvm_irq_dpci *dpci;
     const struct dev_intx_gsi_link *digl;
 
-    spin_lock(&irq_map->dom->event_lock);
+    write_lock(&irq_map->dom->event_lock);
 
     if ( irq_map->flags & HVM_IRQ_DPCI_IDENTITY_GSI )
     {
@@ -177,7 +177,7 @@ static void pt_irq_time_out(void *data)
         hvm_gsi_deassert(irq_map->dom, dpci_pirq(irq_map)->pirq);
         irq_map->flags |= HVM_IRQ_DPCI_EOI_LATCH;
         pt_irq_guest_eoi(irq_map->dom, irq_map, NULL);
-        spin_unlock(&irq_map->dom->event_lock);
+        write_unlock(&irq_map->dom->event_lock);
         return;
     }
@@ -185,7 +185,7 @@ static void pt_irq_time_out(void *data)
     if ( unlikely(!dpci) )
     {
         ASSERT_UNREACHABLE();
-        spin_unlock(&irq_map->dom->event_lock);
+        write_unlock(&irq_map->dom->event_lock);
         return;
     }
     list_for_each_entry ( digl, &irq_map->digl_list, list )
@@ -204,7 +204,7 @@ static void pt_irq_time_out(void *data)
 
     pt_pirq_iterate(irq_map->dom, pt_irq_guest_eoi, NULL);
 
-    spin_unlock(&irq_map->dom->event_lock);
+    write_unlock(&irq_map->dom->event_lock);
 }
 
 struct hvm_irq_dpci *domain_get_irq_dpci(const struct domain *d)
@@ -288,7 +288,7 @@ int pt_irq_create_bind(
         return -EINVAL;
 
  restart:
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     hvm_irq_dpci = domain_get_irq_dpci(d);
     if ( !hvm_irq_dpci && !is_hardware_domain(d) )
@@ -304,7 +304,7 @@ int pt_irq_create_bind(
         hvm_irq_dpci = xzalloc(struct hvm_irq_dpci);
         if ( hvm_irq_dpci == NULL )
         {
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
             return -ENOMEM;
         }
         for ( i = 0; i < NR_HVM_DOMU_IRQS; i++ )
@@ -316,7 +316,7 @@ int pt_irq_create_bind(
     info = pirq_get_info(d, pirq);
     if ( !info )
     {
-        spin_unlock(&d->event_lock);
+        write_unlock(&d->event_lock);
         return -ENOMEM;
     }
     pirq_dpci = pirq_dpci(info);
@@ -331,7 +331,7 @@ int pt_irq_create_bind(
      */
     if ( pt_pirq_softirq_active(pirq_dpci) )
    {
-        spin_unlock(&d->event_lock);
+        write_unlock(&d->event_lock);
         cpu_relax();
         goto restart;
     }
@@ -389,7 +389,7 @@ int pt_irq_create_bind(
                 pirq_dpci->dom = NULL;
                 pirq_dpci->flags = 0;
                 pirq_cleanup_check(info, d);
-                spin_unlock(&d->event_lock);
+                write_unlock(&d->event_lock);
                 return rc;
             }
         }
@@ -399,7 +399,7 @@ int pt_irq_create_bind(
 
             if ( (pirq_dpci->flags & mask) != mask )
             {
-                spin_unlock(&d->event_lock);
+                write_unlock(&d->event_lock);
                 return -EBUSY;
             }
@@ -423,7 +423,7 @@ int pt_irq_create_bind(
         dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
         pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
-        spin_unlock(&d->event_lock);
+        write_unlock(&d->event_lock);
 
         pirq_dpci->gmsi.posted = false;
         vcpu = (dest_vcpu_id >= 0) ? d->vcpu[dest_vcpu_id] : NULL;
@@ -483,7 +483,7 @@ int pt_irq_create_bind(
 
         if ( !digl || !girq )
         {
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
             xfree(girq);
             xfree(digl);
             return -ENOMEM;
         }
@@ -510,7 +510,7 @@ int pt_irq_create_bind(
         if ( pt_irq_bind->irq_type != PT_IRQ_TYPE_PCI ||
              pirq >= hvm_domain_irq(d)->nr_gsis )
         {
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
 
             return -EINVAL;
         }
@@ -546,7 +546,7 @@ int pt_irq_create_bind(
 
             if ( mask < 0 || trigger_mode < 0 )
             {
-                spin_unlock(&d->event_lock);
+                write_unlock(&d->event_lock);
 
                 ASSERT_UNREACHABLE();
                 return -EINVAL;
             }
@@ -594,14 +594,14 @@ int pt_irq_create_bind(
             }
             pirq_dpci->flags = 0;
             pirq_cleanup_check(info, d);
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
             xfree(girq);
             xfree(digl);
             return rc;
         }
     }
 
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     if ( iommu_verbose )
     {
@@ -619,7 +619,7 @@ int pt_irq_create_bind(
     }
 
     default:
-        spin_unlock(&d->event_lock);
+        write_unlock(&d->event_lock);
         return -EOPNOTSUPP;
     }
@@ -672,13 +672,13 @@ int pt_irq_destroy_bind(
         return -EOPNOTSUPP;
     }
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     hvm_irq_dpci = domain_get_irq_dpci(d);
 
     if ( !hvm_irq_dpci && !is_hardware_domain(d) )
     {
-        spin_unlock(&d->event_lock);
+        write_unlock(&d->event_lock);
         return -EINVAL;
     }
@@ -711,7 +711,7 @@ int pt_irq_destroy_bind(
 
         if ( girq )
         {
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
             return -EINVAL;
         }
@@ -755,7 +755,7 @@ int pt_irq_destroy_bind(
         pirq_cleanup_check(pirq, d);
     }
 
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     if ( what && iommu_verbose )
     {
@@ -799,7 +799,7 @@ int pt_pirq_iterate(struct domain *d,
     unsigned int pirq = 0, n, i;
     struct pirq *pirqs[8];
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(rw_is_locked(&d->event_lock));
 
     do {
         n = radix_tree_gang_lookup(&d->pirq_tree, (void **)pirqs, pirq,
@@ -880,9 +880,9 @@ void hvm_dpci_msi_eoi(struct domain *d,
          (!hvm_domain_irq(d)->dpci && !is_hardware_domain(d)) )
        return;
 
-    spin_lock(&d->event_lock);
+    read_lock(&d->event_lock);
     pt_pirq_iterate(d, _hvm_dpci_msi_eoi, (void *)(long)vector);
-    spin_unlock(&d->event_lock);
+    read_unlock(&d->event_lock);
 }
 
 static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
@@ -893,7 +893,7 @@ static void hvm_dirq_assist(struct domai
         return;
     }
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     if ( test_and_clear_bool(pirq_dpci->masked) )
     {
         struct pirq *pirq = dpci_pirq(pirq_dpci);
@@ -947,7 +947,7 @@ static void hvm_dirq_assist(struct domai
     }
 
  out:
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 }
 
 static void hvm_pirq_eoi(struct pirq *pirq,
@@ -1012,7 +1012,7 @@ void hvm_dpci_eoi(struct domain *d, unsi
 
     if ( is_hardware_domain(d) )
     {
-        spin_lock(&d->event_lock);
+        write_lock(&d->event_lock);
         hvm_gsi_eoi(d, guest_gsi, ent);
         goto unlock;
     }
@@ -1023,7 +1023,7 @@ void hvm_dpci_eoi(struct domain *d, unsi
         return;
     }
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     hvm_irq_dpci = domain_get_irq_dpci(d);
 
     if ( !hvm_irq_dpci )
@@ -1033,7 +1033,7 @@ void hvm_dpci_eoi(struct domain *d, unsi
         __hvm_dpci_eoi(d, girq, ent);
 
 unlock:
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 }
 
 /*
--- a/xen/common/rwlock.c
+++ b/xen/common/rwlock.c
@@ -102,6 +102,14 @@ void queue_write_lock_slowpath(rwlock_t
     spin_unlock(&lock->lock);
 }
 
+void _rw_barrier(rwlock_t *lock)
+{
+    check_barrier(&lock->lock.debug);
+    smp_mb();
+    while ( _rw_is_locked(lock) )
+        arch_lock_relax();
+    smp_mb();
+}
 
 static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers);
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -65,7 +65,7 @@ void check_lock(union lock_debug *debug,
     }
 }
 
-static void check_barrier(union lock_debug *debug)
+void check_barrier(union lock_debug *debug)
 {
     if ( unlikely(atomic_read(&spin_debug) <= 0) )
         return;
@@ -108,7 +108,6 @@ void spin_debug_disable(void)
 
 #else /* CONFIG_DEBUG_LOCKS */
 
-#define check_barrier(l) ((void)0)
 #define got_lock(l) ((void)0)
 #define rel_lock(l) ((void)0)
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -878,7 +878,7 @@ static int pci_clean_dpci_irqs(struct do
     if ( !is_hvm_domain(d) )
         return 0;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
     hvm_irq_dpci = domain_get_irq_dpci(d);
     if ( hvm_irq_dpci != NULL )
     {
@@ -896,14 +896,14 @@ static int pci_clean_dpci_irqs(struct do
             ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
         if ( ret )
         {
-            spin_unlock(&d->event_lock);
+            write_unlock(&d->event_lock);
             return ret;
         }
 
         hvm_domain_irq(d)->dpci = NULL;
         free_hvm_irq_dpci(hvm_irq_dpci);
     }
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 
     return 0;
 }
--- a/xen/drivers/passthrough/vtd/x86/hvm.c
+++ b/xen/drivers/passthrough/vtd/x86/hvm.c
@@ -54,7 +54,7 @@ void hvm_dpci_isairq_eoi(struct domain *
     if ( !is_iommu_enabled(d) )
         return;
 
-    spin_lock(&d->event_lock);
+    write_lock(&d->event_lock);
 
     dpci = domain_get_irq_dpci(d);
 
@@ -63,5 +63,5 @@ void hvm_dpci_isairq_eoi(struct domain *
         /* Multiple mirq may be mapped to one isa irq */
         pt_pirq_iterate(d, _hvm_dpci_isairq_eoi, (void *)(long)isairq);
     }
-    spin_unlock(&d->event_lock);
+    write_unlock(&d->event_lock);
 }
--- a/xen/include/xen/rwlock.h
+++ b/xen/include/xen/rwlock.h
@@ -247,6 +247,8 @@ static inline int _rw_is_write_locked(rw
     return (atomic_read(&lock->cnts) & _QW_WMASK) == _QW_LOCKED;
 }
 
+void _rw_barrier(rwlock_t *lock);
+
 #define read_lock(l)                  _read_lock(l)
 #define read_lock_irq(l)              _read_lock_irq(l)
 #define read_lock_irqsave(l, f)                                 \
@@ -276,6 +278,7 @@ static inline int _rw_is_write_locked(rw
 
 #define rw_is_locked(l)               _rw_is_locked(l)
 #define rw_is_write_locked(l)         _rw_is_write_locked(l)
+#define rw_barrier(l)                 _rw_barrier(l)
 
 typedef struct percpu_rwlock percpu_rwlock_t;
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -376,7 +376,7 @@ struct domain
     unsigned int     xen_evtchns;
     /* Port to resume from in evtchn_reset(), when in a continuation. */
     unsigned int     next_evtchn;
-    spinlock_t       event_lock;
+    rwlock_t         event_lock;
     const struct evtchn_port_ops *evtchn_port_ops;
     struct evtchn_fifo_domain *evtchn_fifo;
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -22,12 +22,14 @@ union lock_debug { };
 #define _LOCK_DEBUG { LOCK_DEBUG_INITVAL }
 void check_lock(union lock_debug *debug, bool try);
+void check_barrier(union lock_debug *debug);
 void spin_debug_enable(void);
 void spin_debug_disable(void);
 #else
 union lock_debug { };
 #define _LOCK_DEBUG { }
 #define check_lock(l, t) ((void)0)
+#define check_barrier(l) ((void)0)
 #define spin_debug_enable() ((void)0)
 #define spin_debug_disable() ((void)0)
 #endif
--- a/xen/xsm/flask/flask_op.c
+++ b/xen/xsm/flask/flask_op.c
@@ -555,7 +555,7 @@ static int flask_get_peer_sid(struct xen
     struct evtchn *chn;
     struct domain_security_struct *dsec;
 
-    spin_lock(&d->event_lock);
+    read_lock(&d->event_lock);
 
     if ( !port_is_valid(d, arg->evtchn) )
         goto out;
@@ -573,7 +573,7 @@ static int flask_get_peer_sid(struct xen
     rv = 0;
 
  out:
-    spin_unlock(&d->event_lock);
+    read_unlock(&d->event_lock);
     return rv;
 }

From patchwork Mon Nov 23 13:30:25 2020
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11925207
Subject: [PATCH v3 5/5] evtchn: don't call Xen consumer callback with per-channel lock held
From: Jan Beulich
To: xen-devel@lists.xenproject.org
Cc: Andrew Cooper, George Dunlap, Ian Jackson, Julien Grall, Wei Liu, Stefano Stabellini, Tamas K Lengyel, Petre Ovidiu PIRCALABU, Alexandru Isaila
Date: Mon, 23 Nov 2020 14:30:25 +0100
In-Reply-To: <9d7a052a-6222-80ff-cbf1-612d4ca50c2a@suse.com>

While there don't appear to be any problems with this right now, the lock
order implications from holding the lock can be very difficult to follow
(and may be easy to violate unknowingly). The present callbacks don't
(and no such callback should) have any need for the lock to be held.

However, vm_event_disable() frees the structures used by the respective
callbacks and isn't otherwise synchronized with invocations of these
callbacks, so maintain a count of in-progress calls, which evtchn_close()
waits to drop to zero before freeing the port (and dropping the lock).

Signed-off-by: Jan Beulich
Reviewed-by: Alexandru Isaila
---
Should we make this accounting optional, to be requested through a new
parameter to alloc_unbound_xen_event_channel(), or derived from other
than the default callback being requested?
---
v3: Drain callbacks before proceeding with closing. Re-base.
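[Editor's note: a condensed user-space sketch of the accounting the diff
below introduces, with illustrative names and error handling omitted.
The sender publishes "I'm inside the callback" before dropping the lock;
the closer, itself holding the lock so no new sender can enter, waits
for that count to drain, so the callback's data can't be freed mid-call.]

#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static pthread_mutex_t chn_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic int active_calls;

static void notification_fn(void) { /* consumer callback body */ }

/* Sender side, cf. evtchn_send(): don't hold the lock across the call. */
void send_to_xen_consumer(void)
{
    pthread_mutex_lock(&chn_lock);
    atomic_fetch_add(&active_calls, 1);   /* visible before lock release */
    pthread_mutex_unlock(&chn_lock);
    notification_fn();
    atomic_fetch_sub(&active_calls, 1);
}

/* Closer side, cf. evtchn_close(): with the lock held, no new sender can
 * enter, so waiting for zero drains all in-flight callback invocations. */
void close_channel(void)
{
    pthread_mutex_lock(&chn_lock);
    while ( atomic_load(&active_calls) )
        sched_yield();                    /* cpu_relax() in Xen */
    /* evtchn_free(...) would be safe here */
    pthread_mutex_unlock(&chn_lock);
}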
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -397,6 +397,7 @@ static long evtchn_bind_interdomain(evtc
 
     rchn->u.interdomain.remote_dom  = ld;
     rchn->u.interdomain.remote_port = lport;
+    atomic_set(&rchn->u.interdomain.active_calls, 0);
     rchn->state                     = ECS_INTERDOMAIN;
 
     /*
@@ -720,6 +721,10 @@ int evtchn_close(struct domain *d1, int
 
             double_evtchn_lock(chn1, chn2);
 
+            if ( consumer_is_xen(chn1) )
+                while ( atomic_read(&chn1->u.interdomain.active_calls) )
+                    cpu_relax();
+
             evtchn_free(d1, chn1);
 
             chn2->state = ECS_UNBOUND;
@@ -781,9 +786,15 @@ int evtchn_send(struct domain *ld, unsig
         rport = lchn->u.interdomain.remote_port;
         rchn  = evtchn_from_port(rd, rport);
         if ( consumer_is_xen(rchn) )
+        {
+            /* Don't keep holding the lock for the call below. */
+            atomic_inc(&rchn->u.interdomain.active_calls);
+            evtchn_read_unlock(lchn);
             xen_notification_fn(rchn)(rd->vcpu[rchn->notify_vcpu_id], rport);
-        else
-            evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
+            atomic_dec(&rchn->u.interdomain.active_calls);
+            return 0;
+        }
+        evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
         break;
     case ECS_IPI:
         evtchn_port_set_pending(ld, lchn->notify_vcpu_id, lchn);
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -104,6 +104,7 @@ struct evtchn
         } unbound;     /* state == ECS_UNBOUND */
         struct {
             evtchn_port_t  remote_port;
+            atomic_t       active_calls;
             struct domain *remote_dom;
         } interdomain; /* state == ECS_INTERDOMAIN */
         struct {