From patchwork Mon Sep 18 06:25:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388831 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4999CD13D1 for ; Mon, 18 Sep 2023 06:28:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239868AbjIRG2X (ORCPT ); Mon, 18 Sep 2023 02:28:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239854AbjIRG17 (ORCPT ); Mon, 18 Sep 2023 02:27:59 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DCF7A8F; Sun, 17 Sep 2023 23:27:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018474; x=1726554474; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+N6SUzAHNndnwKrBhBL4M+fSkr/yc6MgkvTjYgc1qew=; b=blcPUIxmmea+olr9vB6MbC4u2sIU0xWtNl9zIia1e4he5Pprpk9Bq9n0 Xi5cf3AsoBrBwxad3paZAVLeaKfRdSIwAag076IO2MZwIfcvPHN/1uzp+ mzpve6hu1q9PkYp/74e7pVta3DkGgM55cEWEyHK9/hB73ddHx87E3OgdY 9XCsnZKTOEitotUdi6Ciyj0pN9rw+BARruOiP3m1D8IoIM4Owb3rwTKhp fchbuamlXtAfz2zoeNU+50UP8Wk9nXT1ghKivagaq2uc70/S2weLdyFvt jT0WWgL6k4MFJqEjDqRzWpj1quTEPRsL4VR1rxyx4r2BjBQtoaZCLf44r Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488469" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488469" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:27:54 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893364" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893364" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:27:49 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 01/13] ice: Fix missing legacy 32byte RXDID in the supported bitmap Date: Mon, 18 Sep 2023 06:25:34 +0000 Message-Id: <20230918062546.40419-2-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Xu Ting 32byte legacy descriptor format is preassigned. Commit e753df8fbca5 ("ice: Add support Flex RXD") created a supported RXDIDs bitmap according to DDP package. But it missed the legacy 32byte RXDID since it is not listed in the package. Mark 32byte legacy descriptor format as supported in the supported RXDIDs flags. 
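As a rough illustration of how the reported bitmap is meant to be consumed (not part of this patch; the helper name below is hypothetical), each bit position in supported_rxdids corresponds to one RXDID value, so after this fix the legacy 32-byte descriptor tests as supported even though the DDP package does not list it, while the legacy 16-byte format remains unsupported:

/* Hypothetical helper, for illustration only: bit N of the bitmap
 * reported for VIRTCHNL_OP_GET_SUPPORTED_RXDIDS corresponds to RXDID
 * value N.
 */
static bool ice_rxdid_is_supported(u64 supported_rxdids, u32 rxdid)
{
	if (rxdid >= ICE_FLEX_DESC_RXDID_MAX_NUM)
		return false;

	/* With this fix, BIT(ICE_RXDID_LEGACY_1) is always set even though
	 * the legacy 32-byte format is not described by the DDP package;
	 * the legacy 16-byte format stays unsupported.
	 */
	return (supported_rxdids & BIT_ULL(rxdid)) != 0;
}
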
Signed-off-by: Xu Ting Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- drivers/net/ethernet/intel/ice/ice_virtchnl.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c index cad237dd8894..3bf95d3c50d3 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c @@ -2657,10 +2657,13 @@ static int ice_vc_query_rxdid(struct ice_vf *vf) /* Read flexiflag registers to determine whether the * corresponding RXDID is configured and supported or not. - * Since Legacy 16byte descriptor format is not supported, - * start from Legacy 32byte descriptor. + * But the legacy 32byte RXDID is not listed in DDP package, + * add it in the bitmap manually and skip check for it in the loop. + * Legacy 16byte descriptor is not supported. */ - for (i = ICE_RXDID_LEGACY_1; i < ICE_FLEX_DESC_RXDID_MAX_NUM; i++) { + rxdid->supported_rxdids |= BIT(ICE_RXDID_LEGACY_1); + + for (i = ICE_RXDID_FLEX_NIC; i < ICE_FLEX_DESC_RXDID_MAX_NUM; i++) { regval = rd32(hw, GLFLXP_RXDID_FLAGS(i, 0)); if ((regval >> GLFLXP_RXDID_FLAGS_FLEXIFLAG_4N_S) & GLFLXP_RXDID_FLAGS_FLEXIFLAG_4N_M) From patchwork Mon Sep 18 06:25:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388832 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E40FFCD13D3 for ; Mon, 18 Sep 2023 06:28:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239899AbjIRG20 (ORCPT ); Mon, 18 Sep 2023 02:28:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37766 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239862AbjIRG2G (ORCPT ); Mon, 18 Sep 2023 02:28:06 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D6D7710F; Sun, 17 Sep 2023 23:28:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018480; x=1726554480; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4bV4dc9Lxk0qRc1xvTaSV5zDFH1c10RazegGUfXW6Cg=; b=edG2NwRyO+6n/IKJ6BFAqHufv3qG8v8JAbo8NpvY4Gr70TJDuw3BXeDI 4JvhL/09WEvnQrXDit2O08u1kzkI0Tnp+KdARks0IHcXXKvqd2iGhSGBf YOtQIs7Dx1XrRivV6c9RtxhmfZF3pTDRfW8//6hTP5Bq6EcdRBde8NAnD RRAFJCgBVFd0fjfo5HetS2/Au5cDjS9Ecy84lEXh6ypKPhv9y1Wv8/VEU LbmtYoFoojDkJuPwFS10xSU4AGvhOjIrDScTkP+82KeB9IfLRYijZjbl1 S2VpqAMnqDPpajj1Yp7hHxuYNZBSO1rAQCSJZ9HqeZCgE1HNr6zP5z/wL Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488485" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488485" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893392" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893392" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:27:54 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, 
kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 02/13] ice: Add function to get RX queue context Date: Mon, 18 Sep 2023 06:25:35 +0000 Message-Id: <20230918062546.40419-3-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Export RX queue context get function which is consumed by linux live migration driver to save and load device state. Signed-off-by: Yahui Cao Signed-off-by: Lingyu Liu --- drivers/net/ethernet/intel/ice/ice_common.c | 268 ++++++++++++++++++++ drivers/net/ethernet/intel/ice/ice_common.h | 5 + 2 files changed, 273 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 6f12ea050d35..5892d5a22323 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -1217,6 +1217,34 @@ ice_copy_rxq_ctx_to_hw(struct ice_hw *hw, u8 *ice_rxq_ctx, u32 rxq_index) return 0; } +/** + * ice_copy_rxq_ctx_from_hw - Copy rxq context register from HW + * @hw: pointer to the hardware structure + * @ice_rxq_ctx: pointer to the rxq context + * @rxq_index: the index of the Rx queue + * + * Copy rxq context from HW register space to dense structure + */ +static int +ice_copy_rxq_ctx_from_hw(struct ice_hw *hw, u8 *ice_rxq_ctx, u32 rxq_index) +{ + u8 i; + + if (!ice_rxq_ctx || rxq_index > QRX_CTRL_MAX_INDEX) + return -EINVAL; + + /* Copy each dword separately from HW */ + for (i = 0; i < ICE_RXQ_CTX_SIZE_DWORDS; i++) { + u32 *ctx = (u32 *)(ice_rxq_ctx + (i * sizeof(u32))); + + *ctx = rd32(hw, QRX_CONTEXT(i, rxq_index)); + + ice_debug(hw, ICE_DBG_QCTX, "qrxdata[%d]: %08X\n", i, *ctx); + } + + return 0; +} + /* LAN Rx Queue Context */ static const struct ice_ctx_ele ice_rlan_ctx_info[] = { /* Field Width LSB */ @@ -1268,6 +1296,32 @@ ice_write_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx, return ice_copy_rxq_ctx_to_hw(hw, ctx_buf, rxq_index); } +/** + * ice_read_rxq_ctx - Read rxq context from HW + * @hw: pointer to the hardware structure + * @rlan_ctx: pointer to the rxq context + * @rxq_index: the index of the Rx queue + * + * Read rxq context from HW register space and then converts it from dense + * structure to sparse + */ +int +ice_read_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx, + u32 rxq_index) +{ + u8 ctx_buf[ICE_RXQ_CTX_SZ] = { 0 }; + int status; + + if (!rlan_ctx) + return -EINVAL; + + status = ice_copy_rxq_ctx_from_hw(hw, ctx_buf, rxq_index); + if (status) + return status; + + return ice_get_ctx(ctx_buf, (u8 *)rlan_ctx, ice_rlan_ctx_info); +} + /* LAN Tx Queue Context */ const struct ice_ctx_ele ice_tlan_ctx_info[] = { /* Field Width LSB */ @@ -4443,6 +4497,220 @@ ice_set_ctx(struct ice_hw *hw, u8 *src_ctx, u8 *dest_ctx, return 0; } +/** + * ice_read_byte - read context byte into struct + * @src_ctx: the context structure to read from + * @dest_ctx: the context to be written to + * @ce_info: a description of the struct to be filled + */ +static void +ice_read_byte(u8 *src_ctx, u8 *dest_ctx, const struct ice_ctx_ele *ce_info) +{ + u8 dest_byte, mask; + 
u8 *src, *target; + u16 shift_width; + + /* prepare the bits and mask */ + shift_width = ce_info->lsb % 8; + mask = (u8)(BIT(ce_info->width) - 1); + + /* shift to correct alignment */ + mask <<= shift_width; + + /* get the current bits from the src bit string */ + src = src_ctx + (ce_info->lsb / 8); + + memcpy(&dest_byte, src, sizeof(dest_byte)); + + dest_byte &= mask; + + dest_byte >>= shift_width; + + /* get the address from the struct field */ + target = dest_ctx + ce_info->offset; + + /* put it back in the struct */ + memcpy(target, &dest_byte, sizeof(dest_byte)); +} + +/** + * ice_read_word - read context word into struct + * @src_ctx: the context structure to read from + * @dest_ctx: the context to be written to + * @ce_info: a description of the struct to be filled + */ +static void +ice_read_word(u8 *src_ctx, u8 *dest_ctx, const struct ice_ctx_ele *ce_info) +{ + u16 dest_word, mask; + u8 *src, *target; + __le16 src_word; + u16 shift_width; + + /* prepare the bits and mask */ + shift_width = ce_info->lsb % 8; + mask = BIT(ce_info->width) - 1; + + /* shift to correct alignment */ + mask <<= shift_width; + + /* get the current bits from the src bit string */ + src = src_ctx + (ce_info->lsb / 8); + + memcpy(&src_word, src, sizeof(src_word)); + + /* the data in the memory is stored as little endian so mask it + * correctly + */ + src_word &= cpu_to_le16(mask); + + /* get the data back into host order before shifting */ + dest_word = le16_to_cpu(src_word); + + dest_word >>= shift_width; + + /* get the address from the struct field */ + target = dest_ctx + ce_info->offset; + + /* put it back in the struct */ + memcpy(target, &dest_word, sizeof(dest_word)); +} + +/** + * ice_read_dword - read context dword into struct + * @src_ctx: the context structure to read from + * @dest_ctx: the context to be written to + * @ce_info: a description of the struct to be filled + */ +static void +ice_read_dword(u8 *src_ctx, u8 *dest_ctx, const struct ice_ctx_ele *ce_info) +{ + u32 dest_dword, mask; + __le32 src_dword; + u8 *src, *target; + u16 shift_width; + + /* prepare the bits and mask */ + shift_width = ce_info->lsb % 8; + + /* if the field width is exactly 32 on an x86 machine, then the shift + * operation will not work because the SHL instructions count is masked + * to 5 bits so the shift will do nothing + */ + if (ce_info->width < 32) + mask = BIT(ce_info->width) - 1; + else + mask = (u32)~0; + + /* shift to correct alignment */ + mask <<= shift_width; + + /* get the current bits from the src bit string */ + src = src_ctx + (ce_info->lsb / 8); + + memcpy(&src_dword, src, sizeof(src_dword)); + + /* the data in the memory is stored as little endian so mask it + * correctly + */ + src_dword &= cpu_to_le32(mask); + + /* get the data back into host order before shifting */ + dest_dword = le32_to_cpu(src_dword); + + dest_dword >>= shift_width; + + /* get the address from the struct field */ + target = dest_ctx + ce_info->offset; + + /* put it back in the struct */ + memcpy(target, &dest_dword, sizeof(dest_dword)); +} + +/** + * ice_read_qword - read context qword into struct + * @src_ctx: the context structure to read from + * @dest_ctx: the context to be written to + * @ce_info: a description of the struct to be filled + */ +static void +ice_read_qword(u8 *src_ctx, u8 *dest_ctx, const struct ice_ctx_ele *ce_info) +{ + u64 dest_qword, mask; + __le64 src_qword; + u8 *src, *target; + u16 shift_width; + + /* prepare the bits and mask */ + shift_width = ce_info->lsb % 8; + + /* if the field width is 
exactly 64 on an x86 machine, then the shift + * operation will not work because the SHL instructions count is masked + * to 6 bits so the shift will do nothing + */ + if (ce_info->width < 64) + mask = BIT_ULL(ce_info->width) - 1; + else + mask = (u64)~0; + + /* shift to correct alignment */ + mask <<= shift_width; + + /* get the current bits from the src bit string */ + src = src_ctx + (ce_info->lsb / 8); + + memcpy(&src_qword, src, sizeof(src_qword)); + + /* the data in the memory is stored as little endian so mask it + * correctly + */ + src_qword &= cpu_to_le64(mask); + + /* get the data back into host order before shifting */ + dest_qword = le64_to_cpu(src_qword); + + dest_qword >>= shift_width; + + /* get the address from the struct field */ + target = dest_ctx + ce_info->offset; + + /* put it back in the struct */ + memcpy(target, &dest_qword, sizeof(dest_qword)); +} + +/** + * ice_get_ctx - extract context bits from a packed structure + * @src_ctx: pointer to a generic packed context structure + * @dest_ctx: pointer to a generic non-packed context structure + * @ce_info: a description of the structure to be read from + */ +int +ice_get_ctx(u8 *src_ctx, u8 *dest_ctx, const struct ice_ctx_ele *ce_info) +{ + int i; + + for (i = 0; ce_info[i].width; i++) { + switch (ce_info[i].size_of) { + case 1: + ice_read_byte(src_ctx, dest_ctx, &ce_info[i]); + break; + case 2: + ice_read_word(src_ctx, dest_ctx, &ce_info[i]); + break; + case 4: + ice_read_dword(src_ctx, dest_ctx, &ce_info[i]); + break; + case 8: + ice_read_qword(src_ctx, dest_ctx, &ce_info[i]); + break; + default: + return -EINVAL; + } + } + + return 0; +} + /** * ice_get_lan_q_ctx - get the LAN queue context for the given VSI and TC * @hw: pointer to the HW struct diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index 71381d9835a1..657767c50be6 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++ b/drivers/net/ethernet/intel/ice/ice_common.h @@ -55,6 +55,9 @@ void ice_set_safe_mode_caps(struct ice_hw *hw); int ice_write_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx, u32 rxq_index); +int +ice_read_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx, + u32 rxq_index); int ice_aq_get_rss_lut(struct ice_hw *hw, struct ice_aq_get_set_rss_lut_params *get_params); @@ -74,6 +77,8 @@ extern const struct ice_ctx_ele ice_tlan_ctx_info[]; int ice_set_ctx(struct ice_hw *hw, u8 *src_ctx, u8 *dest_ctx, const struct ice_ctx_ele *ce_info); +int +ice_get_ctx(u8 *src_ctx, u8 *dest_ctx, const struct ice_ctx_ele *ce_info); extern struct mutex ice_global_cfg_lock_sw; From patchwork Mon Sep 18 06:25:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388833 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0573FCD13D2 for ; Mon, 18 Sep 2023 06:28:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239922AbjIRG23 (ORCPT ); Mon, 18 Sep 2023 02:28:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37794 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239871AbjIRG2L (ORCPT ); Mon, 18 Sep 2023 02:28:11 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 189A5E6; Sun, 17 Sep 2023 23:28:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018486; x=1726554486; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=umOFRQhC/QKZXyjspALq4K5Uu2DMOzTHqghxUg2G6jM=; b=R+KU8oPzIKAiB/mwBOk6fI/1PlMFLxUpEA3MA1+BnH7JR84btgWRwUt8 QrV/oFPjD6BQt549FXbGyP0KR2NntkE/5v4ZioYWqCep6cZxwlFH94hkQ j+9oz26JASyVyq91ymXk4/ZNLTMIm9LbqL/tnvl2eH6esheiLtOmXXNN+ F6+UEkj/LGFwWwnVE6u+TfdcRoWpGvY99du6ww0HWRmuvneDkmbtuYqSM KxtAy7EwxQe8AgwMAetKvauVpN2cdzJWnn7ntTcnD4n+hI5O5FUODy2UP WcOFZgXA4GWWYOnuGaclk9gtlna4DUE+oH6vhh6WN4+Roep5TQ9JAV/Yt w==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488503" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488503" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893419" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893419" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:00 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 03/13] ice: Add function to get and set TX queue context Date: Mon, 18 Sep 2023 06:25:36 +0000 Message-Id: <20230918062546.40419-4-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Export TX queue context get and set function which is consumed by linux live migration driver to save and load device state. TX queue context contains static fields which does not change during TX traffic and dynamic fields which may change during TX traffic. Signed-off-by: Yahui Cao --- drivers/net/ethernet/intel/ice/ice_common.c | 216 +++++++++++++++++- drivers/net/ethernet/intel/ice/ice_common.h | 6 + .../net/ethernet/intel/ice/ice_hw_autogen.h | 15 ++ .../net/ethernet/intel/ice/ice_lan_tx_rx.h | 3 + 4 files changed, 239 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 5892d5a22323..63ccd631a5d5 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -1322,7 +1322,10 @@ ice_read_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx, return ice_get_ctx(ctx_buf, (u8 *)rlan_ctx, ice_rlan_ctx_info); } -/* LAN Tx Queue Context */ +/* LAN Tx Queue Context used for set Tx config by ice_aqc_opc_add_txqs, + * Bit[0-175] is valid + */ + const struct ice_ctx_ele ice_tlan_ctx_info[] = { /* Field Width LSB */ ICE_CTX_STORE(ice_tlan_ctx, base, 57, 0), @@ -1356,6 +1359,217 @@ const struct ice_ctx_ele ice_tlan_ctx_info[] = { { 0 } }; +/* LAN Tx Queue Context used for get Tx config from QTXCOMM_CNTX data, + * Bit[0-292] is valid, including internal queue state. 
Since the internal
+ * queue state is a dynamic field, its value will be cleared once the queue
+ * is disabled
+ */
+static const struct ice_ctx_ele ice_tlan_ctx_data_info[] = {
+					    /* Field		Width	LSB */
+	ICE_CTX_STORE(ice_tlan_ctx, base,			57,	0),
+	ICE_CTX_STORE(ice_tlan_ctx, port_num,			3,	57),
+	ICE_CTX_STORE(ice_tlan_ctx, cgd_num,			5,	60),
+	ICE_CTX_STORE(ice_tlan_ctx, pf_num,			3,	65),
+	ICE_CTX_STORE(ice_tlan_ctx, vmvf_num,			10,	68),
+	ICE_CTX_STORE(ice_tlan_ctx, vmvf_type,			2,	78),
+	ICE_CTX_STORE(ice_tlan_ctx, src_vsi,			10,	80),
+	ICE_CTX_STORE(ice_tlan_ctx, tsyn_ena,			1,	90),
+	ICE_CTX_STORE(ice_tlan_ctx, internal_usage_flag,	1,	91),
+	ICE_CTX_STORE(ice_tlan_ctx, alt_vlan,			1,	92),
+	ICE_CTX_STORE(ice_tlan_ctx, cpuid,			8,	93),
+	ICE_CTX_STORE(ice_tlan_ctx, wb_mode,			1,	101),
+	ICE_CTX_STORE(ice_tlan_ctx, tphrd_desc,			1,	102),
+	ICE_CTX_STORE(ice_tlan_ctx, tphrd,			1,	103),
+	ICE_CTX_STORE(ice_tlan_ctx, tphwr_desc,			1,	104),
+	ICE_CTX_STORE(ice_tlan_ctx, cmpq_id,			9,	105),
+	ICE_CTX_STORE(ice_tlan_ctx, qnum_in_func,		14,	114),
+	ICE_CTX_STORE(ice_tlan_ctx, itr_notification_mode,	1,	128),
+	ICE_CTX_STORE(ice_tlan_ctx, adjust_prof_id,		6,	129),
+	ICE_CTX_STORE(ice_tlan_ctx, qlen,			13,	135),
+	ICE_CTX_STORE(ice_tlan_ctx, quanta_prof_idx,		4,	148),
+	ICE_CTX_STORE(ice_tlan_ctx, tso_ena,			1,	152),
+	ICE_CTX_STORE(ice_tlan_ctx, tso_qnum,			11,	153),
+	ICE_CTX_STORE(ice_tlan_ctx, legacy_int,			1,	164),
+	ICE_CTX_STORE(ice_tlan_ctx, drop_ena,			1,	165),
+	ICE_CTX_STORE(ice_tlan_ctx, cache_prof_idx,		2,	166),
+	ICE_CTX_STORE(ice_tlan_ctx, pkt_shaper_prof_idx,	3,	168),
+	ICE_CTX_STORE(ice_tlan_ctx, tail,			13,	184),
+	{ 0 }
+};
+
+/**
+ * ice_copy_txq_ctx_from_hw - Copy txq context register from HW
+ * @hw: pointer to the hardware structure
+ * @ice_txq_ctx: pointer to the txq context
+ *
+ * Copy txq context from HW register space to dense structure
+ */
+static int
+ice_copy_txq_ctx_from_hw(struct ice_hw *hw, u8 *ice_txq_ctx)
+{
+	u8 i;
+
+	if (!ice_txq_ctx)
+		return -EINVAL;
+
+	/* Copy each dword separately from HW */
+	for (i = 0; i < ICE_TXQ_CTX_SIZE_DWORDS; i++) {
+		u32 *ctx = (u32 *)(ice_txq_ctx + (i * sizeof(u32)));
+
+		*ctx = rd32(hw, GLCOMM_QTX_CNTX_DATA(i));
+
+		ice_debug(hw, ICE_DBG_QCTX, "qtxdata[%d]: %08X\n", i, *ctx);
+	}
+
+	return 0;
+}
+
+/**
+ * ice_copy_txq_ctx_to_hw - Copy txq context register into HW
+ * @hw: pointer to the hardware structure
+ * @ice_txq_ctx: pointer to the txq context
+ *
+ * Copy txq context from dense structure to HW register space
+ */
+static int
+ice_copy_txq_ctx_to_hw(struct ice_hw *hw, u8 *ice_txq_ctx)
+{
+	u8 i;
+
+	if (!ice_txq_ctx)
+		return -EINVAL;
+
+	/* Copy each dword separately to HW */
+	for (i = 0; i < ICE_TXQ_CTX_SIZE_DWORDS; i++) {
+		u32 *ctx = (u32 *)(ice_txq_ctx + (i * sizeof(u32)));
+
+		wr32(hw, GLCOMM_QTX_CNTX_DATA(i), *ctx);
+
+		ice_debug(hw, ICE_DBG_QCTX, "qtxdata[%d]: %08X\n", i, *ctx);
+	}
+
+	return 0;
+}
+
+/* Configuration access to the Tx ring context (from the PF) is done via an
+ * indirect interface, the GLCOMM_QTX_CNTX_CTL/DATA registers. However, these
+ * registers are shared by all the PFs on a single PCI card, so multiple PFs
+ * may access these registers simultaneously, causing access conflicts. Then
+ * card-level locking would be required to protect these registers from
+ * concurrent access by PF devices within the same card. However, no such
+ * card-level locking is supported. So introduce a coarse-grained
+ * global lock which is shared by all the PF driver instances.
+ *
+ * The overall flow is to acquire the lock, read/write the TXQ context through
+ * the GLCOMM_QTX_CNTX_CTL/DATA indirect interface and release the lock once
+ * the access is completed. In this way, only one PF at a time can safely
+ * access the TXQ context.
+ */
+static DEFINE_MUTEX(ice_global_txq_ctx_lock);
+
+/**
+ * ice_read_txq_ctx - Read txq context from HW
+ * @hw: pointer to the hardware structure
+ * @tlan_ctx: pointer to the txq context
+ * @txq_index: the index of the Tx queue
+ *
+ * Read txq context from HW register space and then convert it from dense
+ * structure to sparse
+ */
+int
+ice_read_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx *tlan_ctx,
+		 u32 txq_index)
+{
+	u8 ctx_buf[ICE_TXQ_CTX_SZ] = { 0 };
+	int status;
+	u32 txq_base;
+	u32 cmd, reg;
+
+	if (!tlan_ctx)
+		return -EINVAL;
+
+	if (txq_index > QTX_COMM_HEAD_MAX_INDEX)
+		return -EINVAL;
+
+	/* Get TXQ base within card space */
+	txq_base = rd32(hw, PFLAN_TX_QALLOC(hw->pf_id));
+	txq_base = (txq_base & PFLAN_TX_QALLOC_FIRSTQ_M) >>
+		   PFLAN_TX_QALLOC_FIRSTQ_S;
+
+	cmd = (GLCOMM_QTX_CNTX_CTL_CMD_READ
+	       << GLCOMM_QTX_CNTX_CTL_CMD_S) & GLCOMM_QTX_CNTX_CTL_CMD_M;
+	reg = cmd | GLCOMM_QTX_CNTX_CTL_CMD_EXEC_M |
+	      (((txq_base + txq_index) << GLCOMM_QTX_CNTX_CTL_QUEUE_ID_S) &
+	       GLCOMM_QTX_CNTX_CTL_QUEUE_ID_M);
+
+	mutex_lock(&ice_global_txq_ctx_lock);
+
+	wr32(hw, GLCOMM_QTX_CNTX_CTL, reg);
+	ice_flush(hw);
+
+	status = ice_copy_txq_ctx_from_hw(hw, ctx_buf);
+	if (status) {
+		mutex_unlock(&ice_global_txq_ctx_lock);
+		return status;
+	}
+
+	mutex_unlock(&ice_global_txq_ctx_lock);
+
+	return ice_get_ctx(ctx_buf, (u8 *)tlan_ctx, ice_tlan_ctx_data_info);
+}
+
+/**
+ * ice_write_txq_ctx - Write txq context to HW
+ * @hw: pointer to the hardware structure
+ * @tlan_ctx: pointer to the txq context
+ * @txq_index: the index of the Tx queue
+ *
+ * Convert txq context from sparse to dense structure and then write
+ * it to HW register space
+ */
+int
+ice_write_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx *tlan_ctx,
+		  u32 txq_index)
+{
+	u8 ctx_buf[ICE_TXQ_CTX_SZ] = { 0 };
+	int status;
+	u32 txq_base;
+	u32 cmd, reg;
+
+	if (!tlan_ctx)
+		return -EINVAL;
+
+	if (txq_index > QTX_COMM_HEAD_MAX_INDEX)
+		return -EINVAL;
+
+	ice_set_ctx(hw, (u8 *)tlan_ctx, ctx_buf, ice_tlan_ctx_info);
+
+	/* Get TXQ base within card space */
+	txq_base = rd32(hw, PFLAN_TX_QALLOC(hw->pf_id));
+	txq_base = (txq_base & PFLAN_TX_QALLOC_FIRSTQ_M) >>
+		   PFLAN_TX_QALLOC_FIRSTQ_S;
+
+	cmd = (GLCOMM_QTX_CNTX_CTL_CMD_WRITE_NO_DYN
+	       << GLCOMM_QTX_CNTX_CTL_CMD_S) & GLCOMM_QTX_CNTX_CTL_CMD_M;
+	reg = cmd | GLCOMM_QTX_CNTX_CTL_CMD_EXEC_M |
+	      (((txq_base + txq_index) << GLCOMM_QTX_CNTX_CTL_QUEUE_ID_S) &
+	       GLCOMM_QTX_CNTX_CTL_QUEUE_ID_M);
+
+	mutex_lock(&ice_global_txq_ctx_lock);
+
+	status = ice_copy_txq_ctx_to_hw(hw, ctx_buf);
+	if (status) {
+		mutex_unlock(&ice_global_txq_ctx_lock);
+		return status;
+	}
+
+	wr32(hw, GLCOMM_QTX_CNTX_CTL, reg);
+	ice_flush(hw);
+
+	mutex_unlock(&ice_global_txq_ctx_lock);
+
+	return 0;
+}
 /* Sideband Queue command wrappers */
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h
index 657767c50be6..e7274721a268 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.h
+++ b/drivers/net/ethernet/intel/ice/ice_common.h
@@ -58,6 +58,12 @@ ice_write_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx,
 		  u32 rxq_index);
 int
 ice_read_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx,
 		 u32 rxq_index);
+int
+ice_read_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx *tlan_ctx,
+		 u32 txq_index);
+int
+ice_write_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx *tlan_ctx, + u32 txq_index); int ice_aq_get_rss_lut(struct ice_hw *hw, struct ice_aq_get_set_rss_lut_params *get_params); diff --git a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h index 6756f3d51d14..67d8332d92f6 100644 --- a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h +++ b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h @@ -8,6 +8,7 @@ #define QTX_COMM_DBELL(_DBQM) (0x002C0000 + ((_DBQM) * 4)) #define QTX_COMM_HEAD(_DBQM) (0x000E0000 + ((_DBQM) * 4)) +#define QTX_COMM_HEAD_MAX_INDEX 16383 #define QTX_COMM_HEAD_HEAD_S 0 #define QTX_COMM_HEAD_HEAD_M ICE_M(0x1FFF, 0) #define PF_FW_ARQBAH 0x00080180 @@ -258,6 +259,9 @@ #define VPINT_ALLOC_PCI_VALID_M BIT(31) #define VPINT_MBX_CTL(_VSI) (0x0016A000 + ((_VSI) * 4)) #define VPINT_MBX_CTL_CAUSE_ENA_M BIT(30) +#define PFLAN_TX_QALLOC(_PF) (0x001D2580 + ((_PF) * 4)) +#define PFLAN_TX_QALLOC_FIRSTQ_S 0 +#define PFLAN_TX_QALLOC_FIRSTQ_M ICE_M(0x3FFF, 0) #define GLLAN_RCTL_0 0x002941F8 #define QRX_CONTEXT(_i, _QRX) (0x00280000 + ((_i) * 8192 + (_QRX) * 4)) #define QRX_CTRL(_QRX) (0x00120000 + ((_QRX) * 4)) @@ -352,6 +356,17 @@ #define GLNVM_ULD_POR_DONE_1_M BIT(8) #define GLNVM_ULD_PCIER_DONE_2_M BIT(9) #define GLNVM_ULD_PE_DONE_M BIT(10) +#define GLCOMM_QTX_CNTX_CTL 0x002D2DC8 /* Reset Source: CORER */ +#define GLCOMM_QTX_CNTX_CTL_QUEUE_ID_S 0 +#define GLCOMM_QTX_CNTX_CTL_QUEUE_ID_M ICE_M(0x3FFF, 0) +#define GLCOMM_QTX_CNTX_CTL_CMD_S 16 +#define GLCOMM_QTX_CNTX_CTL_CMD_M ICE_M(0x7, 16) +#define GLCOMM_QTX_CNTX_CTL_CMD_READ 0 +#define GLCOMM_QTX_CNTX_CTL_CMD_WRITE 1 +#define GLCOMM_QTX_CNTX_CTL_CMD_RESET 3 +#define GLCOMM_QTX_CNTX_CTL_CMD_WRITE_NO_DYN 4 +#define GLCOMM_QTX_CNTX_CTL_CMD_EXEC_M BIT(19) +#define GLCOMM_QTX_CNTX_DATA(_i) (0x002D2D40 + ((_i) * 4)) /* _i=0...9 */ #define GLPCI_CNF2 0x000BE004 #define GLPCI_CNF2_CACHELINE_SIZE_M BIT(1) #define PF_FUNC_RID 0x0009E880 diff --git a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h index 89f986a75cc8..79e07c863ae0 100644 --- a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h +++ b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h @@ -431,6 +431,8 @@ enum ice_rx_flex_desc_status_error_1_bits { #define ICE_RXQ_CTX_SIZE_DWORDS 8 #define ICE_RXQ_CTX_SZ (ICE_RXQ_CTX_SIZE_DWORDS * sizeof(u32)) +#define ICE_TXQ_CTX_SIZE_DWORDS 10 +#define ICE_TXQ_CTX_SZ (ICE_TXQ_CTX_SIZE_DWORDS * sizeof(u32)) #define ICE_TX_CMPLTNQ_CTX_SIZE_DWORDS 22 #define ICE_TX_DRBELL_Q_CTX_SIZE_DWORDS 5 #define GLTCLAN_CQ_CNTX(i, CQ) (GLTCLAN_CQ_CNTX0(CQ) + ((i) * 0x0800)) @@ -649,6 +651,7 @@ struct ice_tlan_ctx { u8 cache_prof_idx; u8 pkt_shaper_prof_idx; u8 int_q_state; /* width not needed - internal - DO NOT WRITE!!! 
*/ + u16 tail; }; /* The ice_ptype_lkup table is used to convert from the 10-bit ptype in the From patchwork Mon Sep 18 06:25:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388846 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CADE0C46CA1 for ; Mon, 18 Sep 2023 06:29:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239832AbjIRG2u (ORCPT ); Mon, 18 Sep 2023 02:28:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35350 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239835AbjIRG2T (ORCPT ); Mon, 18 Sep 2023 02:28:19 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6BA9710F; Sun, 17 Sep 2023 23:28:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018492; x=1726554492; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=PvhEam7jMyPniZYHXz11nuOCnPlK2kB8jEpEfoNSjzU=; b=VQQvya8cwWfuaUxQBDuruVeH4qy7QYhJ/YcbDIS8VvPRaUtKh+Y3nxQ5 cyIgQm5mvNIn9zxnniU7sJdFdnL3TWM2FYV0XI1yrMW5ebmD17qRFdQX+ i6TwU1Hd3l9GQeJ6EEKJEq7t91d7txKk4smfQTF0iDN5jQFOMHqgTQePg s6O4XgqMKRueNQk1DFe0CWzx2Crit0jgaYGvf+6+rkZUvAnBdAcmO6pE7 XUk/4MwjzQNnW2m1S/+7Tm6s88reI8ZS0kWZ6nS2p8QpN3KK02r35BtVZ gZxL/hFPSs0Pa58liITzGXc3r3kxBc6F4s4izu8Qm82Ops5WRQbXU2EAe Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488524" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488524" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893443" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893443" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:05 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 04/13] ice: Introduce VF state ICE_VF_STATE_REPLAYING_VC for migration Date: Mon, 18 Sep 2023 06:25:37 +0000 Message-Id: <20230918062546.40419-5-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu During migration device resume stage, part of device state is loaded by replaying logged virtual channel message. By default, once virtual channel message is processed successfully, PF will send message to VF. In addition, PF will notify VF about link state while handling virtual channel message GET_VF_RESOURCE and ENABLE_QUEUES. 
And VF driver will print link state change info once receiving notification from PF. However, device resume stage does not need PF to send messages to VF for the above cases. So stop PF from sending messages to VF while VF is in replay state. Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- drivers/net/ethernet/intel/ice/ice_vf_lib.h | 1 + drivers/net/ethernet/intel/ice/ice_virtchnl.c | 185 ++++++++++-------- drivers/net/ethernet/intel/ice/ice_virtchnl.h | 8 +- .../ethernet/intel/ice/ice_virtchnl_fdir.c | 28 +-- 4 files changed, 124 insertions(+), 98 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h index 31a082e8a827..ff1438373f69 100644 --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h @@ -37,6 +37,7 @@ enum ice_vf_states { ICE_VF_STATE_DIS, ICE_VF_STATE_MC_PROMISC, ICE_VF_STATE_UC_PROMISC, + ICE_VF_STATE_REPLAYING_VC, ICE_VF_STATES_NBITS }; diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c index 3bf95d3c50d3..6be796ed70a8 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c @@ -233,6 +233,9 @@ void ice_vc_notify_vf_link_state(struct ice_vf *vf) struct virtchnl_pf_event pfe = { 0 }; struct ice_hw *hw = &vf->pf->hw; + if (test_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states)) + return; + pfe.event = VIRTCHNL_EVENT_LINK_CHANGE; pfe.severity = PF_EVENT_SEVERITY_INFO; @@ -281,19 +284,10 @@ void ice_vc_notify_reset(struct ice_pf *pf) (u8 *)&pfe, sizeof(struct virtchnl_pf_event)); } -/** - * ice_vc_send_msg_to_vf - Send message to VF - * @vf: pointer to the VF info - * @v_opcode: virtual channel opcode - * @v_retval: virtual channel return value - * @msg: pointer to the msg buffer - * @msglen: msg length - * - * send msg to VF - */ -int -ice_vc_send_msg_to_vf(struct ice_vf *vf, u32 v_opcode, - enum virtchnl_status_code v_retval, u8 *msg, u16 msglen) +static int +ice_vc_send_response_to_vf(struct ice_vf *vf, u32 v_opcode, + enum virtchnl_status_code v_retval, + u8 *msg, u16 msglen) { struct device *dev; struct ice_pf *pf; @@ -314,6 +308,39 @@ ice_vc_send_msg_to_vf(struct ice_vf *vf, u32 v_opcode, return 0; } +/** + * ice_vc_respond_to_vf - Respond to VF + * @vf: pointer to the VF info + * @v_opcode: virtual channel opcode + * @v_retval: virtual channel return value + * @msg: pointer to the msg buffer + * @msglen: msg length + * + * Respond to VF. If it is replaying, return directly. + * + * Return 0 for success, negative for error. + */ +int +ice_vc_respond_to_vf(struct ice_vf *vf, u32 v_opcode, + enum virtchnl_status_code v_retval, u8 *msg, u16 msglen) +{ + struct device *dev; + struct ice_pf *pf = vf->pf; + + dev = ice_pf_to_dev(pf); + + if (test_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states)) { + if (v_retval == VIRTCHNL_STATUS_SUCCESS) + return 0; + + dev_dbg(dev, "Unable to replay virt channel command, VF ID %d, virtchnl status code %d. 
op code %d, len %d.\n", + vf->vf_id, v_retval, v_opcode, msglen); + return -EIO; + } + + return ice_vc_send_response_to_vf(vf, v_opcode, v_retval, msg, msglen); +} + /** * ice_vc_get_ver_msg * @vf: pointer to the VF info @@ -332,9 +359,9 @@ static int ice_vc_get_ver_msg(struct ice_vf *vf, u8 *msg) if (VF_IS_V10(&vf->vf_ver)) info.minor = VIRTCHNL_VERSION_MINOR_NO_VF_CAPS; - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_VERSION, - VIRTCHNL_STATUS_SUCCESS, (u8 *)&info, - sizeof(struct virtchnl_version_info)); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_VERSION, + VIRTCHNL_STATUS_SUCCESS, (u8 *)&info, + sizeof(struct virtchnl_version_info)); } /** @@ -522,8 +549,8 @@ static int ice_vc_get_vf_res_msg(struct ice_vf *vf, u8 *msg) err: /* send the response back to the VF */ - ret = ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_GET_VF_RESOURCES, v_ret, - (u8 *)vfres, len); + ret = ice_vc_respond_to_vf(vf, VIRTCHNL_OP_GET_VF_RESOURCES, v_ret, + (u8 *)vfres, len); kfree(vfres); return ret; @@ -892,7 +919,7 @@ static int ice_vc_handle_rss_cfg(struct ice_vf *vf, u8 *msg, bool add) } error_param: - return ice_vc_send_msg_to_vf(vf, v_opcode, v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, v_opcode, v_ret, NULL, 0); } /** @@ -938,8 +965,8 @@ static int ice_vc_config_rss_key(struct ice_vf *vf, u8 *msg) if (ice_set_rss_key(vsi, vrk->key)) v_ret = VIRTCHNL_STATUS_ERR_ADMIN_QUEUE_ERROR; error_param: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_RSS_KEY, v_ret, - NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_CONFIG_RSS_KEY, v_ret, + NULL, 0); } /** @@ -984,7 +1011,7 @@ static int ice_vc_config_rss_lut(struct ice_vf *vf, u8 *msg) if (ice_set_rss_lut(vsi, vrl->lut, ICE_LUT_VSI_SIZE)) v_ret = VIRTCHNL_STATUS_ERR_ADMIN_QUEUE_ERROR; error_param: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_RSS_LUT, v_ret, + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_CONFIG_RSS_LUT, v_ret, NULL, 0); } @@ -1124,8 +1151,8 @@ static int ice_vc_cfg_promiscuous_mode_msg(struct ice_vf *vf, u8 *msg) } error_param: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_PROMISCUOUS_MODE, - v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_CONFIG_PROMISCUOUS_MODE, + v_ret, NULL, 0); } /** @@ -1165,8 +1192,8 @@ static int ice_vc_get_stats_msg(struct ice_vf *vf, u8 *msg) error_param: /* send the response to the VF */ - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_GET_STATS, v_ret, - (u8 *)&stats, sizeof(stats)); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_GET_STATS, v_ret, + (u8 *)&stats, sizeof(stats)); } /** @@ -1315,8 +1342,8 @@ static int ice_vc_ena_qs_msg(struct ice_vf *vf, u8 *msg) error_param: /* send the response to the VF */ - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ENABLE_QUEUES, v_ret, - NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ENABLE_QUEUES, v_ret, + NULL, 0); } /** @@ -1455,8 +1482,8 @@ static int ice_vc_dis_qs_msg(struct ice_vf *vf, u8 *msg) error_param: /* send the response to the VF */ - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_DISABLE_QUEUES, v_ret, - NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_DISABLE_QUEUES, v_ret, + NULL, 0); } /** @@ -1586,8 +1613,8 @@ static int ice_vc_cfg_irq_map_msg(struct ice_vf *vf, u8 *msg) error_param: /* send the response to the VF */ - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_IRQ_MAP, v_ret, - NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_CONFIG_IRQ_MAP, v_ret, + NULL, 0); } /** @@ -1730,8 +1757,8 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg) } /* send the response to the VF */ - return 
ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES, - VIRTCHNL_STATUS_SUCCESS, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES, + VIRTCHNL_STATUS_SUCCESS, NULL, 0); error_param: /* disable whatever we can */ for (; i >= 0; i--) { @@ -1746,8 +1773,8 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg) ice_lag_move_new_vf_nodes(vf); /* send the response to the VF */ - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES, - VIRTCHNL_STATUS_ERR_PARAM, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES, + VIRTCHNL_STATUS_ERR_PARAM, NULL, 0); } /** @@ -2049,7 +2076,7 @@ ice_vc_handle_mac_addr_msg(struct ice_vf *vf, u8 *msg, bool set) handle_mac_exit: /* send the response to the VF */ - return ice_vc_send_msg_to_vf(vf, vc_op, v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, vc_op, v_ret, NULL, 0); } /** @@ -2132,8 +2159,8 @@ static int ice_vc_request_qs_msg(struct ice_vf *vf, u8 *msg) error_param: /* send the response to the VF */ - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_REQUEST_QUEUES, - v_ret, (u8 *)vfres, sizeof(*vfres)); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_REQUEST_QUEUES, + v_ret, (u8 *)vfres, sizeof(*vfres)); } /** @@ -2398,11 +2425,11 @@ static int ice_vc_process_vlan_msg(struct ice_vf *vf, u8 *msg, bool add_v) error_param: /* send the response to the VF */ if (add_v) - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ADD_VLAN, v_ret, - NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ADD_VLAN, v_ret, + NULL, 0); else - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_DEL_VLAN, v_ret, - NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_DEL_VLAN, v_ret, + NULL, 0); } /** @@ -2477,8 +2504,8 @@ static int ice_vc_ena_vlan_stripping(struct ice_vf *vf) vf->vlan_strip_ena |= ICE_INNER_VLAN_STRIP_ENA; error_param: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ENABLE_VLAN_STRIPPING, - v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ENABLE_VLAN_STRIPPING, + v_ret, NULL, 0); } /** @@ -2514,8 +2541,8 @@ static int ice_vc_dis_vlan_stripping(struct ice_vf *vf) vf->vlan_strip_ena &= ~ICE_INNER_VLAN_STRIP_ENA; error_param: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_DISABLE_VLAN_STRIPPING, - v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_DISABLE_VLAN_STRIPPING, + v_ret, NULL, 0); } /** @@ -2550,8 +2577,8 @@ static int ice_vc_get_rss_hena(struct ice_vf *vf) vrh->hena = ICE_DEFAULT_RSS_HENA; err: /* send the response back to the VF */ - ret = ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_GET_RSS_HENA_CAPS, v_ret, - (u8 *)vrh, len); + ret = ice_vc_respond_to_vf(vf, VIRTCHNL_OP_GET_RSS_HENA_CAPS, v_ret, + (u8 *)vrh, len); kfree(vrh); return ret; } @@ -2616,8 +2643,8 @@ static int ice_vc_set_rss_hena(struct ice_vf *vf, u8 *msg) /* send the response to the VF */ err: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_SET_RSS_HENA, v_ret, - NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_SET_RSS_HENA, v_ret, + NULL, 0); } /** @@ -2673,8 +2700,8 @@ static int ice_vc_query_rxdid(struct ice_vf *vf) pf->supported_rxdids = rxdid->supported_rxdids; err: - ret = ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_GET_SUPPORTED_RXDIDS, - v_ret, (u8 *)rxdid, len); + ret = ice_vc_respond_to_vf(vf, VIRTCHNL_OP_GET_SUPPORTED_RXDIDS, + v_ret, (u8 *)rxdid, len); kfree(rxdid); return ret; } @@ -2910,8 +2937,8 @@ static int ice_vc_get_offload_vlan_v2_caps(struct ice_vf *vf) memcpy(&vf->vlan_v2_caps, caps, sizeof(*caps)); out: - err = ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS, - v_ret, (u8 *)caps, 
len); + err = ice_vc_respond_to_vf(vf, VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS, + v_ret, (u8 *)caps, len); kfree(caps); return err; } @@ -3152,8 +3179,7 @@ static int ice_vc_remove_vlan_v2_msg(struct ice_vf *vf, u8 *msg) v_ret = VIRTCHNL_STATUS_ERR_PARAM; out: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_DEL_VLAN_V2, v_ret, NULL, - 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_DEL_VLAN_V2, v_ret, NULL, 0); } /** @@ -3294,8 +3320,7 @@ static int ice_vc_add_vlan_v2_msg(struct ice_vf *vf, u8 *msg) v_ret = VIRTCHNL_STATUS_ERR_PARAM; out: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ADD_VLAN_V2, v_ret, NULL, - 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ADD_VLAN_V2, v_ret, NULL, 0); } /** @@ -3526,8 +3551,8 @@ static int ice_vc_ena_vlan_stripping_v2_msg(struct ice_vf *vf, u8 *msg) vf->vlan_strip_ena |= ICE_INNER_VLAN_STRIP_ENA; out: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ENABLE_VLAN_STRIPPING_V2, - v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ENABLE_VLAN_STRIPPING_V2, + v_ret, NULL, 0); } /** @@ -3601,8 +3626,8 @@ static int ice_vc_dis_vlan_stripping_v2_msg(struct ice_vf *vf, u8 *msg) vf->vlan_strip_ena &= ~ICE_INNER_VLAN_STRIP_ENA; out: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_DISABLE_VLAN_STRIPPING_V2, - v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_DISABLE_VLAN_STRIPPING_V2, + v_ret, NULL, 0); } /** @@ -3660,8 +3685,8 @@ static int ice_vc_ena_vlan_insertion_v2_msg(struct ice_vf *vf, u8 *msg) } out: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ENABLE_VLAN_INSERTION_V2, - v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ENABLE_VLAN_INSERTION_V2, + v_ret, NULL, 0); } /** @@ -3715,8 +3740,8 @@ static int ice_vc_dis_vlan_insertion_v2_msg(struct ice_vf *vf, u8 *msg) } out: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_DISABLE_VLAN_INSERTION_V2, - v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_DISABLE_VLAN_INSERTION_V2, + v_ret, NULL, 0); } static const struct ice_virtchnl_ops ice_virtchnl_dflt_ops = { @@ -3813,8 +3838,8 @@ static int ice_vc_repr_add_mac(struct ice_vf *vf, u8 *msg) } handle_mac_exit: - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ADD_ETH_ADDR, - v_ret, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ADD_ETH_ADDR, + v_ret, NULL, 0); } /** @@ -3833,8 +3858,8 @@ ice_vc_repr_del_mac(struct ice_vf __always_unused *vf, u8 __always_unused *msg) ice_update_legacy_cached_mac(vf, &al->list[0]); - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_DEL_ETH_ADDR, - VIRTCHNL_STATUS_SUCCESS, NULL, 0); + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_DEL_ETH_ADDR, + VIRTCHNL_STATUS_SUCCESS, NULL, 0); } static int @@ -3843,8 +3868,8 @@ ice_vc_repr_cfg_promiscuous_mode(struct ice_vf *vf, u8 __always_unused *msg) dev_dbg(ice_pf_to_dev(vf->pf), "Can't config promiscuous mode in switchdev mode for VF %d\n", vf->vf_id); - return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_PROMISCUOUS_MODE, - VIRTCHNL_STATUS_ERR_NOT_SUPPORTED, + return ice_vc_respond_to_vf(vf, VIRTCHNL_OP_CONFIG_PROMISCUOUS_MODE, + VIRTCHNL_STATUS_ERR_NOT_SUPPORTED, NULL, 0); } @@ -3987,16 +4012,16 @@ void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, error_handler: if (err) { - ice_vc_send_msg_to_vf(vf, v_opcode, VIRTCHNL_STATUS_ERR_PARAM, - NULL, 0); + ice_vc_respond_to_vf(vf, v_opcode, VIRTCHNL_STATUS_ERR_PARAM, + NULL, 0); dev_err(dev, "Invalid message from VF %d, opcode %d, len %d, error %d\n", vf_id, v_opcode, msglen, err); goto finish; } if (!ice_vc_is_opcode_allowed(vf, v_opcode)) { - ice_vc_send_msg_to_vf(vf, 
v_opcode, - VIRTCHNL_STATUS_ERR_NOT_SUPPORTED, NULL, + ice_vc_respond_to_vf(vf, v_opcode, + VIRTCHNL_STATUS_ERR_NOT_SUPPORTED, NULL, 0); goto finish; } @@ -4107,9 +4132,9 @@ void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, default: dev_err(dev, "Unsupported opcode %d from VF %d\n", v_opcode, vf_id); - err = ice_vc_send_msg_to_vf(vf, v_opcode, - VIRTCHNL_STATUS_ERR_NOT_SUPPORTED, - NULL, 0); + err = ice_vc_respond_to_vf(vf, v_opcode, + VIRTCHNL_STATUS_ERR_NOT_SUPPORTED, + NULL, 0); break; } if (err) { diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.h b/drivers/net/ethernet/intel/ice/ice_virtchnl.h index cd747718de73..a2b6094e2f2f 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.h +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.h @@ -60,8 +60,8 @@ void ice_vc_notify_vf_link_state(struct ice_vf *vf); void ice_vc_notify_link_state(struct ice_pf *pf); void ice_vc_notify_reset(struct ice_pf *pf); int -ice_vc_send_msg_to_vf(struct ice_vf *vf, u32 v_opcode, - enum virtchnl_status_code v_retval, u8 *msg, u16 msglen); +ice_vc_respond_to_vf(struct ice_vf *vf, u32 v_opcode, + enum virtchnl_status_code v_retval, u8 *msg, u16 msglen); bool ice_vc_isvalid_vsi_id(struct ice_vf *vf, u16 vsi_id); void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, struct ice_mbx_data *mbxdata); @@ -73,8 +73,8 @@ static inline void ice_vc_notify_link_state(struct ice_pf *pf) { } static inline void ice_vc_notify_reset(struct ice_pf *pf) { } static inline int -ice_vc_send_msg_to_vf(struct ice_vf *vf, u32 v_opcode, - enum virtchnl_status_code v_retval, u8 *msg, u16 msglen) +ice_vc_respond_to_vf(struct ice_vf *vf, u32 v_opcode, + enum virtchnl_status_code v_retval, u8 *msg, u16 msglen) { return -EOPNOTSUPP; } diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c index daa6a1e894cf..bf6c24901cb0 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c @@ -1571,8 +1571,8 @@ ice_vc_add_fdir_fltr_post(struct ice_vf *vf, struct ice_vf_fdir_ctx *ctx, resp->flow_id = conf->flow_id; vf->fdir.fdir_fltr_cnt[conf->input.flow_type][is_tun]++; - ret = ice_vc_send_msg_to_vf(vf, ctx->v_opcode, v_ret, - (u8 *)resp, len); + ret = ice_vc_respond_to_vf(vf, ctx->v_opcode, v_ret, + (u8 *)resp, len); kfree(resp); dev_dbg(dev, "VF %d: flow_id:0x%X, FDIR %s success!\n", @@ -1587,8 +1587,8 @@ ice_vc_add_fdir_fltr_post(struct ice_vf *vf, struct ice_vf_fdir_ctx *ctx, ice_vc_fdir_remove_entry(vf, conf, conf->flow_id); devm_kfree(dev, conf); - ret = ice_vc_send_msg_to_vf(vf, ctx->v_opcode, v_ret, - (u8 *)resp, len); + ret = ice_vc_respond_to_vf(vf, ctx->v_opcode, v_ret, + (u8 *)resp, len); kfree(resp); return ret; } @@ -1635,8 +1635,8 @@ ice_vc_del_fdir_fltr_post(struct ice_vf *vf, struct ice_vf_fdir_ctx *ctx, ice_vc_fdir_remove_entry(vf, conf, conf->flow_id); vf->fdir.fdir_fltr_cnt[conf->input.flow_type][is_tun]--; - ret = ice_vc_send_msg_to_vf(vf, ctx->v_opcode, v_ret, - (u8 *)resp, len); + ret = ice_vc_respond_to_vf(vf, ctx->v_opcode, v_ret, + (u8 *)resp, len); kfree(resp); dev_dbg(dev, "VF %d: flow_id:0x%X, FDIR %s success!\n", @@ -1652,8 +1652,8 @@ ice_vc_del_fdir_fltr_post(struct ice_vf *vf, struct ice_vf_fdir_ctx *ctx, if (success) devm_kfree(dev, conf); - ret = ice_vc_send_msg_to_vf(vf, ctx->v_opcode, v_ret, - (u8 *)resp, len); + ret = ice_vc_respond_to_vf(vf, ctx->v_opcode, v_ret, + (u8 *)resp, len); kfree(resp); return ret; } @@ -1850,8 +1850,8 @@ 
int ice_vc_add_fdir_fltr(struct ice_vf *vf, u8 *msg) v_ret = VIRTCHNL_STATUS_SUCCESS; stat->status = VIRTCHNL_FDIR_SUCCESS; devm_kfree(dev, conf); - ret = ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ADD_FDIR_FILTER, - v_ret, (u8 *)stat, len); + ret = ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ADD_FDIR_FILTER, + v_ret, (u8 *)stat, len); goto exit; } @@ -1909,8 +1909,8 @@ int ice_vc_add_fdir_fltr(struct ice_vf *vf, u8 *msg) err_free_conf: devm_kfree(dev, conf); err_exit: - ret = ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_ADD_FDIR_FILTER, v_ret, - (u8 *)stat, len); + ret = ice_vc_respond_to_vf(vf, VIRTCHNL_OP_ADD_FDIR_FILTER, v_ret, + (u8 *)stat, len); kfree(stat); return ret; } @@ -1993,8 +1993,8 @@ int ice_vc_del_fdir_fltr(struct ice_vf *vf, u8 *msg) err_del_tmr: ice_vc_fdir_clear_irq_ctx(vf); err_exit: - ret = ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_DEL_FDIR_FILTER, v_ret, - (u8 *)stat, len); + ret = ice_vc_respond_to_vf(vf, VIRTCHNL_OP_DEL_FDIR_FILTER, v_ret, + (u8 *)stat, len); kfree(stat); return ret; } From patchwork Mon Sep 18 06:25:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388845 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F6B7CD13D1 for ; Mon, 18 Sep 2023 06:29:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239884AbjIRG2z (ORCPT ); Mon, 18 Sep 2023 02:28:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36098 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239882AbjIRG2Z (ORCPT ); Mon, 18 Sep 2023 02:28:25 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6A19E121; Sun, 17 Sep 2023 23:28:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018496; x=1726554496; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1PbplPb+kY7NA5Ye6l58CM/AmLov4qOiDxAqUc+MDUA=; b=P0mBf4ka5I3GBKpyrDEnlSYaW3g4ZX92lE/0N+poOI/d4jB08syLCOd4 mlXNBuC/LYMj9MSf0lyow9QT0rqHHIAzWciw8Oea+ARYO8QLPYxAMIvjD F6K5x55hd1m1+2KAiRy3xNmKLdv5Ap1d3zrHCWyqnklqvRAhP7KFyBgUv NYYuF33bV4Ff5II/k3VAWmfFhoj+u6eeVL08K61km1sgaNV34F4YDdbzt HQXEXL7ErcKyPMtNAWbmvkorurxbjYqWA2MScbwXeQ35fpvHmmbQch/K3 jfYWk6tB5GQh3pQ5YETWcKvqTQ0YH21lPaOsSfRLbHLeFLeP+2iVzuRe2 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488540" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488540" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893473" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893473" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:11 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, 
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 05/13] ice: Add fundamental migration init and exit function Date: Mon, 18 Sep 2023 06:25:38 +0000 Message-Id: <20230918062546.40419-6-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu Add basic entry point for live migration functionality initialization, uninitialization and add helper function for vfio driver to reach pf driver data. Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- drivers/net/ethernet/intel/ice/Makefile | 3 +- drivers/net/ethernet/intel/ice/ice.h | 3 + drivers/net/ethernet/intel/ice/ice_main.c | 15 ++++ .../net/ethernet/intel/ice/ice_migration.c | 83 +++++++++++++++++++ .../intel/ice/ice_migration_private.h | 21 +++++ drivers/net/ethernet/intel/ice/ice_vf_lib.c | 4 + drivers/net/ethernet/intel/ice/ice_vf_lib.h | 1 + include/linux/net/intel/ice_migration.h | 25 ++++++ 8 files changed, 154 insertions(+), 1 deletion(-) create mode 100644 drivers/net/ethernet/intel/ice/ice_migration.c create mode 100644 drivers/net/ethernet/intel/ice/ice_migration_private.h create mode 100644 include/linux/net/intel/ice_migration.h diff --git a/drivers/net/ethernet/intel/ice/Makefile b/drivers/net/ethernet/intel/ice/Makefile index 18985da8ec49..e3a7af06235e 100644 --- a/drivers/net/ethernet/intel/ice/Makefile +++ b/drivers/net/ethernet/intel/ice/Makefile @@ -50,4 +50,5 @@ ice-$(CONFIG_DCB) += ice_dcb.o ice_dcb_nl.o ice_dcb_lib.o ice-$(CONFIG_RFS_ACCEL) += ice_arfs.o ice-$(CONFIG_XDP_SOCKETS) += ice_xsk.o ice-$(CONFIG_ICE_SWITCHDEV) += ice_eswitch.o ice_eswitch_br.o -ice-$(CONFIG_GNSS) += ice_gnss.o \ No newline at end of file +ice-$(CONFIG_GNSS) += ice_gnss.o +ice-$(CONFIG_ICE_VFIO_PCI) += ice_migration.o diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index 051007ccab43..837a89d3541c 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -55,6 +55,7 @@ #include #include #include +#include #include "ice_devids.h" #include "ice_type.h" #include "ice_txrx.h" @@ -76,6 +77,7 @@ #include "ice_vsi_vlan_ops.h" #include "ice_gnss.h" #include "ice_irq.h" +#include "ice_migration_private.h" #define ICE_BAR0 0 #define ICE_REQ_DESC_MULTIPLE 32 @@ -962,6 +964,7 @@ int ice_stop(struct net_device *netdev); void ice_service_task_schedule(struct ice_pf *pf); int ice_load(struct ice_pf *pf); void ice_unload(struct ice_pf *pf); +struct ice_pf *ice_get_pf_from_vf_pdev(struct pci_dev *pdev); /** * ice_set_rdma_cap - enable RDMA support diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index a5997008bb98..b2031ee7acf8 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -9327,3 +9327,18 @@ static const struct net_device_ops ice_netdev_ops = { .ndo_xdp_xmit = ice_xdp_xmit, .ndo_xsk_wakeup = ice_xsk_wakeup, }; + +/** + * ice_get_pf_from_vf_pdev - Get PF structure from PCI device + * @pdev: pointer to PCI device + * + * Return pointer to ice PF structure, NULL for failure + */ +struct ice_pf *ice_get_pf_from_vf_pdev(struct pci_dev *pdev) +{ + struct ice_pf *pf; + + pf = pci_iov_get_pf_drvdata(pdev, &ice_driver); + + return !IS_ERR(pf) ? 
pf : NULL; +} diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c new file mode 100644 index 000000000000..bd2248765750 --- /dev/null +++ b/drivers/net/ethernet/intel/ice/ice_migration.c @@ -0,0 +1,83 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2018-2023 Intel Corporation */ + +#include "ice.h" + +/** + * ice_migration_get_pf - Get ice PF structure pointer by pdev + * @pdev: pointer to ice vfio pci VF pdev structure + * + * Return nonzero for success, NULL for failure. + */ +struct ice_pf *ice_migration_get_pf(struct pci_dev *pdev) +{ + return ice_get_pf_from_vf_pdev(pdev); +} +EXPORT_SYMBOL(ice_migration_get_pf); + +/** + * ice_migration_init_vf - init ice VF device state data + * @vf: pointer to VF + */ +void ice_migration_init_vf(struct ice_vf *vf) +{ + vf->migration_enabled = true; +} + +/** + * ice_migration_uninit_vf - uninit VF device state data + * @vf: pointer to VF + */ +void ice_migration_uninit_vf(struct ice_vf *vf) +{ + if (!vf->migration_enabled) + return; + + vf->migration_enabled = false; +} + +/** + * ice_migration_init_dev - init ice migration device + * @pf: pointer to PF of migration device + * @vf_id: VF index of migration device + * + * Return 0 for success, negative for failure + */ +int ice_migration_init_dev(struct ice_pf *pf, int vf_id) +{ + struct device *dev = ice_pf_to_dev(pf); + struct ice_vf *vf; + + vf = ice_get_vf_by_id(pf, vf_id); + if (!vf) { + dev_err(dev, "Unable to locate VF from VF ID%d\n", vf_id); + return -EINVAL; + } + + ice_migration_init_vf(vf); + + ice_put_vf(vf); + return 0; +} +EXPORT_SYMBOL(ice_migration_init_dev); + +/** + * ice_migration_uninit_dev - uninit ice migration device + * @pf: pointer to PF of migration device + * @vf_id: VF index of migration device + */ +void ice_migration_uninit_dev(struct ice_pf *pf, int vf_id) +{ + struct device *dev = ice_pf_to_dev(pf); + struct ice_vf *vf; + + vf = ice_get_vf_by_id(pf, vf_id); + if (!vf) { + dev_err(dev, "Unable to locate VF from VF ID%d\n", vf_id); + return; + } + + ice_migration_uninit_vf(vf); + ice_put_vf(vf); +} +EXPORT_SYMBOL(ice_migration_uninit_dev); diff --git a/drivers/net/ethernet/intel/ice/ice_migration_private.h b/drivers/net/ethernet/intel/ice/ice_migration_private.h new file mode 100644 index 000000000000..2cc2f515fc5e --- /dev/null +++ b/drivers/net/ethernet/intel/ice/ice_migration_private.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (C) 2018-2023 Intel Corporation */ + +#ifndef _ICE_MIGRATION_PRIVATE_H_ +#define _ICE_MIGRATION_PRIVATE_H_ + +/* This header file is for exposing functions in ice_migration.c to + * files which will be compiled in ice.ko. + * Functions which may be used by other files which will be compiled + * in ice-vfio-pic.ko should be exposed as part of ice_migration.h. 
+ */ + +#if IS_ENABLED(CONFIG_ICE_VFIO_PCI) +void ice_migration_init_vf(struct ice_vf *vf); +void ice_migration_uninit_vf(struct ice_vf *vf); +#else +static inline void ice_migration_init_vf(struct ice_vf *vf) { } +static inline void ice_migration_uninit_vf(struct ice_vf *vf) { } +#endif /* CONFIG_ICE_VFIO_PCI */ + +#endif /* _ICE_MIGRATION_PRIVATE_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c index 24e4f4d897b6..53d0f37fb65c 100644 --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c @@ -241,6 +241,10 @@ static void ice_vf_pre_vsi_rebuild(struct ice_vf *vf) if (vf->vf_ops->irq_close) vf->vf_ops->irq_close(vf); + if (vf->migration_enabled) { + ice_migration_uninit_vf(vf); + ice_migration_init_vf(vf); + } ice_vf_clear_counters(vf); vf->vf_ops->clear_reset_trigger(vf); } diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h index ff1438373f69..351568d786a2 100644 --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h @@ -137,6 +137,7 @@ struct ice_vf { /* devlink port data */ struct devlink_port devlink_port; + bool migration_enabled; }; /* Flags for controlling behavior of ice_reset_vf */ diff --git a/include/linux/net/intel/ice_migration.h b/include/linux/net/intel/ice_migration.h new file mode 100644 index 000000000000..d7228de7b02d --- /dev/null +++ b/include/linux/net/intel/ice_migration.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (C) 2018-2023 Intel Corporation */ + +#ifndef _ICE_MIGRATION_H_ +#define _ICE_MIGRATION_H_ + +#if IS_ENABLED(CONFIG_ICE_VFIO_PCI) + +struct ice_pf; + +struct ice_pf *ice_migration_get_pf(struct pci_dev *pdev); +int ice_migration_init_dev(struct ice_pf *pf, int vf_id); +void ice_migration_uninit_dev(struct ice_pf *pf, int vf_id); + +#else +static inline struct ice_pf *ice_migration_get_pf(struct pci_dev *pdev) +{ + return NULL; +} + +static inline int ice_migration_init_dev(struct ice_pf *pf, int vf_id) { } +static inline void ice_migration_uninit_dev(struct ice_pf *pf, int vf_id) { } +#endif /* CONFIG_ICE_VFIO_PCI */ + +#endif /* _ICE_MIGRATION_H_ */ From patchwork Mon Sep 18 06:25:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 496CFCD13D8 for ; Mon, 18 Sep 2023 06:29:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239921AbjIRG3C (ORCPT ); Mon, 18 Sep 2023 02:29:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36242 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239932AbjIRG2b (ORCPT ); Mon, 18 Sep 2023 02:28:31 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C661121; Sun, 17 Sep 2023 23:28:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018502; x=1726554502; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gHkw4vGUJtBw/7w1jzwy2EEnGCfkD2i3C5CgZi2TvcI=; 
b=C/bx5V4UzqtiNXYqxiLrBuFckqQklpezeaiKCrD9vEtyY1WXrM1A20jM JxihWROgc9SVRDzS5Na4oGLBXT6hFlCNpi2dRc5ysUHSZAOZZVR6cJkWl mVWZkFdq+biWoBPcGy3tKQc21h5kM4f6qMVgg3CdXZloV7r/Sp1TCm7Fk MfEp/RK9TG4XDxT2ZGFUvtDhB55XPcnsNhyJ8tnnkDQh+IDycChHwynY4 F9rnMDnLYS0BB4eBqOD1DEKjhyRER/0lDGPcK7ZPESkv8qMtr6knfLslV iVX+qysXeTIPKXSGwxWwcEaQ0XTwnBCszpP340yKmzTsLpRChrGMUb1/e A==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488565" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488565" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893505" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893505" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:16 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 06/13] ice: Log virtual channel messages in PF Date: Mon, 18 Sep 2023 06:25:39 +0000 Message-Id: <20230918062546.40419-7-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu Save the virtual channel messages sent by VF on the source side during runtime. The logged virtchnl messages will be transferred and loaded into the device on the destination side during the device resume stage. For the feature which can not be migrated yet, it must be disabled or blocked to prevent from being abused by VF. Otherwise, it may introduce functional and security issue. Mask unsupported VF capability flags in the VF-PF negotiaion stage. 
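As a rough user-space sketch of the log-and-replay idea (illustrative only: the struct and helper names below are invented and do not match the kernel code; the flat buffer is terminated by a sentinel opcode much like the real format ends with VIRTCHNL_OP_UNKNOWN):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define OP_END 0	/* sentinel, plays the role of VIRTCHNL_OP_UNKNOWN */

struct msg_slot {
	uint32_t opcode;
	uint16_t len;
	uint8_t data[];
};

/* Append one message to a flat buffer; returns bytes consumed. */
static size_t log_msg(uint8_t *buf, uint32_t opcode,
		      const void *msg, uint16_t len)
{
	struct msg_slot *slot = (struct msg_slot *)buf;

	slot->opcode = opcode;
	slot->len = len;
	memcpy(slot->data, msg, len);
	return sizeof(*slot) + len;
}

/* Walk the buffer and "replay" every logged message until the sentinel. */
static void replay(const uint8_t *buf)
{
	const struct msg_slot *slot = (const struct msg_slot *)buf;

	while (slot->opcode != OP_END) {
		printf("replaying opcode %u, len %u\n", slot->opcode, slot->len);
		slot = (const struct msg_slot *)
			((const uint8_t *)slot + sizeof(*slot) + slot->len);
	}
}

int main(void)
{
	uint8_t buf[256] = { 0 };
	uint8_t cfg[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	size_t off = 0;

	off += log_msg(buf + off, 6, cfg, sizeof(cfg));	/* e.g. CONFIG_VSI_QUEUES */
	off += log_msg(buf + off, 8, cfg, 4);		/* e.g. ENABLE_QUEUES */
	((struct msg_slot *)(buf + off))->opcode = OP_END;

	replay(buf);
	return 0;
}

The ordering property matters in the same way for the patch: messages are replayed in the order they were logged, and a message whose handling fails is unlogged again so the saved stream never contains rejected commands.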
Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- .../net/ethernet/intel/ice/ice_migration.c | 167 ++++++++++++++++++ .../intel/ice/ice_migration_private.h | 12 ++ drivers/net/ethernet/intel/ice/ice_vf_lib.h | 5 + drivers/net/ethernet/intel/ice/ice_virtchnl.c | 29 +++ 4 files changed, 213 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c index bd2248765750..88ec0653a1ce 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration.c +++ b/drivers/net/ethernet/intel/ice/ice_migration.c @@ -3,6 +3,17 @@ #include "ice.h" +struct ice_migration_virtchnl_msg_slot { + u32 opcode; + u16 msg_len; + char msg_buffer[]; +}; + +struct ice_migration_virtchnl_msg_listnode { + struct list_head node; + struct ice_migration_virtchnl_msg_slot msg_slot; +}; + /** * ice_migration_get_pf - Get ice PF structure pointer by pdev * @pdev: pointer to ice vfio pci VF pdev structure @@ -22,6 +33,9 @@ EXPORT_SYMBOL(ice_migration_get_pf); void ice_migration_init_vf(struct ice_vf *vf) { vf->migration_enabled = true; + INIT_LIST_HEAD(&vf->virtchnl_msg_list); + vf->virtchnl_msg_num = 0; + vf->virtchnl_msg_size = 0; } /** @@ -30,10 +44,24 @@ void ice_migration_init_vf(struct ice_vf *vf) */ void ice_migration_uninit_vf(struct ice_vf *vf) { + struct ice_migration_virtchnl_msg_listnode *msg_listnode; + struct ice_migration_virtchnl_msg_listnode *dtmp; + if (!vf->migration_enabled) return; vf->migration_enabled = false; + + if (list_empty(&vf->virtchnl_msg_list)) + return; + list_for_each_entry_safe(msg_listnode, dtmp, + &vf->virtchnl_msg_list, + node) { + list_del(&msg_listnode->node); + kfree(msg_listnode); + } + vf->virtchnl_msg_num = 0; + vf->virtchnl_msg_size = 0; } /** @@ -81,3 +109,142 @@ void ice_migration_uninit_dev(struct ice_pf *pf, int vf_id) ice_put_vf(vf); } EXPORT_SYMBOL(ice_migration_uninit_dev); + +/** + * ice_migration_is_loggable_msg - is this message loggable or not + * @v_opcode: virtchnl message operation code + * + * Return 1 for true, return 0 for false + */ +static inline int ice_migration_is_loggable_msg(u32 v_opcode) +{ + switch (v_opcode) { + case VIRTCHNL_OP_VERSION: + case VIRTCHNL_OP_GET_VF_RESOURCES: + case VIRTCHNL_OP_CONFIG_VSI_QUEUES: + case VIRTCHNL_OP_CONFIG_IRQ_MAP: + case VIRTCHNL_OP_ADD_ETH_ADDR: + case VIRTCHNL_OP_DEL_ETH_ADDR: + case VIRTCHNL_OP_CONFIG_PROMISCUOUS_MODE: + case VIRTCHNL_OP_ENABLE_QUEUES: + case VIRTCHNL_OP_DISABLE_QUEUES: + case VIRTCHNL_OP_ADD_VLAN: + case VIRTCHNL_OP_DEL_VLAN: + case VIRTCHNL_OP_ENABLE_VLAN_STRIPPING: + case VIRTCHNL_OP_DISABLE_VLAN_STRIPPING: + case VIRTCHNL_OP_CONFIG_RSS_KEY: + case VIRTCHNL_OP_CONFIG_RSS_LUT: + case VIRTCHNL_OP_GET_SUPPORTED_RXDIDS: + return 1; + default: + return 0; + } +} + +/** + * ice_migration_log_vf_msg - Log request message from VF + * @vf: pointer to the VF structure + * @event: pointer to the AQ event + * + * Log VF message for later restore during live migration + * + * Return 0 for success, negative for error + */ +int ice_migration_log_vf_msg(struct ice_vf *vf, + struct ice_rq_event_info *event) +{ + struct ice_migration_virtchnl_msg_listnode *msg_listnode; + u32 v_opcode = le32_to_cpu(event->desc.cookie_high); + struct device *dev = ice_pf_to_dev(vf->pf); + u16 msglen = event->msg_len; + u8 *msg = event->msg_buf; + + if (!ice_migration_is_loggable_msg(v_opcode)) + return 0; + + if (vf->virtchnl_msg_num >= VIRTCHNL_MSG_MAX) { + dev_warn(dev, "VF %d has maximum number virtual channel commands\n", + vf->vf_id); + return -ENOMEM; + } + + msg_listnode 
= (struct ice_migration_virtchnl_msg_listnode *) + kzalloc(struct_size(msg_listnode, + msg_slot.msg_buffer, + msglen), + GFP_KERNEL); + if (!msg_listnode) { + dev_err(dev, "VF %d failed to allocate memory for msg listnode\n", + vf->vf_id); + return -ENOMEM; + } + dev_dbg(dev, "VF %d save virtual channel command, op code: %d, len: %d\n", + vf->vf_id, v_opcode, msglen); + msg_listnode->msg_slot.opcode = v_opcode; + msg_listnode->msg_slot.msg_len = msglen; + memcpy(msg_listnode->msg_slot.msg_buffer, msg, msglen); + list_add_tail(&msg_listnode->node, &vf->virtchnl_msg_list); + vf->virtchnl_msg_num++; + vf->virtchnl_msg_size += struct_size(&msg_listnode->msg_slot, + msg_buffer, + msglen); + return 0; +} + +/** + * ice_migration_unlog_vf_msg - revert logged message + * @vf: pointer to the VF structure + * @v_opcode: virtchnl message operation code + * + * Remove the virtual channel message logged by ice_migration_log_vf_msg() + * before. + */ +void ice_migration_unlog_vf_msg(struct ice_vf *vf, u32 v_opcode) +{ + struct ice_migration_virtchnl_msg_listnode *msg_listnode; + + if (!ice_migration_is_loggable_msg(v_opcode)) + return; + + if (WARN_ON_ONCE(list_empty(&vf->virtchnl_msg_list))) + return; + + msg_listnode = list_last_entry(&vf->virtchnl_msg_list, + struct ice_migration_virtchnl_msg_listnode, + node); + if (WARN_ON_ONCE(msg_listnode->msg_slot.opcode != v_opcode)) + return; + + list_del(&msg_listnode->node); + kfree(msg_listnode); + vf->virtchnl_msg_num--; + vf->virtchnl_msg_size -= struct_size(&msg_listnode->msg_slot, + msg_buffer, + msg_listnode->msg_slot.msg_len); +} + +#define VIRTCHNL_VF_MIGRATION_SUPPORT_FEATURE \ + (VIRTCHNL_VF_OFFLOAD_L2 | \ + VIRTCHNL_VF_OFFLOAD_RSS_PF | \ + VIRTCHNL_VF_OFFLOAD_RSS_AQ | \ + VIRTCHNL_VF_OFFLOAD_RSS_REG | \ + VIRTCHNL_VF_OFFLOAD_RSS_PCTYPE_V2 | \ + VIRTCHNL_VF_OFFLOAD_ENCAP | \ + VIRTCHNL_VF_OFFLOAD_ENCAP_CSUM | \ + VIRTCHNL_VF_OFFLOAD_RX_POLLING | \ + VIRTCHNL_VF_OFFLOAD_WB_ON_ITR | \ + VIRTCHNL_VF_CAP_ADV_LINK_SPEED | \ + VIRTCHNL_VF_OFFLOAD_VLAN | \ + VIRTCHNL_VF_OFFLOAD_RX_FLEX_DESC | \ + VIRTCHNL_VF_OFFLOAD_USO) + +/** + * ice_migration_supported_caps - get migration supported VF capabilities + * + * When migration is activated, some VF capabilities are not supported. + * So unmask those capability flags for VF resources. 
+ */ +u32 ice_migration_supported_caps(void) +{ + return VIRTCHNL_VF_MIGRATION_SUPPORT_FEATURE; +} diff --git a/drivers/net/ethernet/intel/ice/ice_migration_private.h b/drivers/net/ethernet/intel/ice/ice_migration_private.h index 2cc2f515fc5e..678ae361cf0c 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration_private.h +++ b/drivers/net/ethernet/intel/ice/ice_migration_private.h @@ -13,9 +13,21 @@ #if IS_ENABLED(CONFIG_ICE_VFIO_PCI) void ice_migration_init_vf(struct ice_vf *vf); void ice_migration_uninit_vf(struct ice_vf *vf); +int ice_migration_log_vf_msg(struct ice_vf *vf, + struct ice_rq_event_info *event); +void ice_migration_unlog_vf_msg(struct ice_vf *vf, u32 v_opcode); +u32 ice_migration_supported_caps(void); #else static inline void ice_migration_init_vf(struct ice_vf *vf) { } static inline void ice_migration_uninit_vf(struct ice_vf *vf) { } +static inline void +ice_migration_save_vf_msg(struct ice_vf *vf, + struct ice_rq_event_info *event) { } +static inline u32 +ice_migration_supported_caps(void) +{ + return 0xFFFFFFFF; +} #endif /* CONFIG_ICE_VFIO_PCI */ #endif /* _ICE_MIGRATION_PRIVATE_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h index 351568d786a2..011398655739 100644 --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h @@ -77,6 +77,7 @@ struct ice_vfs { unsigned long last_printed_mdd_jiffies; /* MDD message rate limit */ }; +#define VIRTCHNL_MSG_MAX 1000 /* VF information structure */ struct ice_vf { struct hlist_node entry; @@ -138,6 +139,10 @@ struct ice_vf { /* devlink port data */ struct devlink_port devlink_port; bool migration_enabled; + struct list_head virtchnl_msg_list; + u64 virtchnl_msg_num; + u64 virtchnl_msg_size; + u32 virtchnl_retval; }; /* Flags for controlling behavior of ice_reset_vf */ diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c index 6be796ed70a8..b40e91958f0d 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c @@ -338,6 +338,12 @@ ice_vc_respond_to_vf(struct ice_vf *vf, u32 v_opcode, return -EIO; } + /* v_retval will not be returned in this function, store it in the + * per VF field to be used by migration logging logic later. + */ + if (vf->migration_enabled) + vf->virtchnl_retval = v_retval; + return ice_vc_send_response_to_vf(vf, v_opcode, v_retval, msg, msglen); } @@ -470,6 +476,8 @@ static int ice_vc_get_vf_res_msg(struct ice_vf *vf, u8 *msg) VIRTCHNL_VF_OFFLOAD_RSS_REG | VIRTCHNL_VF_OFFLOAD_VLAN; + if (vf->migration_enabled) + vf->driver_caps &= ice_migration_supported_caps(); vfres->vf_cap_flags = VIRTCHNL_VF_OFFLOAD_L2; vsi = ice_get_vf_vsi(vf); if (!vsi) { @@ -4026,6 +4034,15 @@ void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, goto finish; } + if (vf->migration_enabled) { + if (ice_migration_log_vf_msg(vf, event)) { + err = ice_vc_respond_to_vf(vf, v_opcode, + VIRTCHNL_STATUS_ERR_NO_MEMORY, + NULL, 0); + goto finish; + } + } + switch (v_opcode) { case VIRTCHNL_OP_VERSION: err = ops->get_ver_msg(vf, msg); @@ -4145,6 +4162,18 @@ void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, vf_id, v_opcode, err); } + /* All of the loggable virtual channel messages are logged by + * ice_migration_unlog_vf_msg() before they are processed. 
+ * + * Two kinds of error may happen, virtual channel message's result + * is failure after processed by PF or message is not sent to VF + * successfully. If error happened, fallback here by reverting logged + * messages. + */ + if (vf->migration_enabled && + (vf->virtchnl_retval != VIRTCHNL_STATUS_SUCCESS || err)) + ice_migration_unlog_vf_msg(vf, v_opcode); + finish: mutex_unlock(&vf->cfg_lock); ice_put_vf(vf); From patchwork Mon Sep 18 06:25:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388850 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59F2ACD13D3 for ; Mon, 18 Sep 2023 06:29:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239941AbjIRG3E (ORCPT ); Mon, 18 Sep 2023 02:29:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55348 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239957AbjIRG2h (ORCPT ); Mon, 18 Sep 2023 02:28:37 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C821B137; Sun, 17 Sep 2023 23:28:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018506; x=1726554506; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ukpgyWlNedcPkap/JHFovqgSzBYn/7rGzzGPUHICzUM=; b=DH9FR4EmuzXYmP5gpetRAspuzij2YHdDDGxR7LLK2OQJHwvPDZXi0bev vW9/uLZr7xnx+T26lRgqL7diX0fJGXBEkPkwR61hz/bQV/PvSW6JRHy+U 4c3V/Kl3BPqnT4wXK763Bt7LlJZyGIl69VknAXeCbROZqHPcTlmYvrpIV sAL69IRRIDL4ogDZN9qvuGHWf+KiY+bATcPcVyU0lgkOH/B+Fa+X4Oda1 CsLHIYOLMF1KC74Wyznz/rV5Sbux4LXgvzvUsJ4gKioInlx6jIlM2CHp+ RyvvxEzVh0zl8j9hQ3I6qjN8+Sw/CqvE68xYwJ3DNpgwKX1obir6fM4RT Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488585" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488585" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893552" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893552" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:20 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 07/13] ice: Add device state save/restore function for migration Date: Mon, 18 Sep 2023 06:25:40 +0000 Message-Id: <20230918062546.40419-8-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu Add device state save/restore function to adapter vfio 
migration stack when device is in stop-copy/resume stage. Device state saving handler is called by vfio driver in device stop copy stage. It snapshots the device state, translates device state into device specific data and fills the data into migration buffer. Device state restoring handler is called by vfio driver in device resume stage. It gets device specific data from the migration buffer, translates the data into the device state and recover the device with the state. Currently only the virtual channel messages are handled. Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- .../net/ethernet/intel/ice/ice_migration.c | 222 ++++++++++++++++++ drivers/net/ethernet/intel/ice/ice_virtchnl.c | 26 +- drivers/net/ethernet/intel/ice/ice_virtchnl.h | 7 +- include/linux/net/intel/ice_migration.h | 12 + 4 files changed, 258 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c index 88ec0653a1ce..edcd6df332ba 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration.c +++ b/drivers/net/ethernet/intel/ice/ice_migration.c @@ -3,6 +3,9 @@ #include "ice.h" +#define ICE_MIG_DEVSTAT_MAGIC 0xE8000001 +#define ICE_MIG_DEVSTAT_VERSION 0x1 + struct ice_migration_virtchnl_msg_slot { u32 opcode; u16 msg_len; @@ -14,6 +17,17 @@ struct ice_migration_virtchnl_msg_listnode { struct ice_migration_virtchnl_msg_slot msg_slot; }; +struct ice_migration_dev_state { + u32 magic; + u32 version; + u64 total_size; + u32 vf_caps; + u16 num_txq; + u16 num_rxq; + + u8 virtchnl_msgs[]; +} __aligned(8); + /** * ice_migration_get_pf - Get ice PF structure pointer by pdev * @pdev: pointer to ice vfio pci VF pdev structure @@ -248,3 +262,211 @@ u32 ice_migration_supported_caps(void) { return VIRTCHNL_VF_MIGRATION_SUPPORT_FEATURE; } + +/** + * ice_migration_save_devstate - save device state to migration buffer + * @pf: pointer to PF of migration device + * @vf_id: VF index of migration device + * @buf: pointer to VF msg in migration buffer + * @buf_sz: size of migration buffer + * + * Return 0 for success, negative for error + */ +int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_sz) +{ + struct ice_migration_virtchnl_msg_listnode *msg_listnode; + struct ice_migration_virtchnl_msg_slot *dummy_op; + struct ice_migration_dev_state *devstate; + struct device *dev = ice_pf_to_dev(pf); + struct ice_vsi *vsi; + struct ice_vf *vf; + u64 total_sz; + int ret = 0; + + vf = ice_get_vf_by_id(pf, vf_id); + if (!vf) { + dev_err(dev, "Unable to locate VF from VF ID%d\n", vf_id); + return -EINVAL; + } + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + ret = -EINVAL; + goto out_put_vf; + } + + /* Reserve space to store device state */ + total_sz = sizeof(struct ice_migration_dev_state) + + vf->virtchnl_msg_size + sizeof(*dummy_op); + if (total_sz > buf_sz) { + dev_err(dev, "Insufficient buffer to store device state for VF %d\n", + vf->vf_id); + ret = -ENOBUFS; + goto out_put_vf; + } + + devstate = (struct ice_migration_dev_state *)buf; + devstate->magic = ICE_MIG_DEVSTAT_MAGIC; + devstate->version = ICE_MIG_DEVSTAT_VERSION; + devstate->total_size = total_sz; + devstate->vf_caps = ice_migration_supported_caps(); + devstate->num_txq = vsi->num_txq; + devstate->num_rxq = vsi->num_rxq; + buf = devstate->virtchnl_msgs; + + list_for_each_entry(msg_listnode, &vf->virtchnl_msg_list, node) { + struct ice_migration_virtchnl_msg_slot *msg_slot; + u64 slot_size; + + msg_slot = 
&msg_listnode->msg_slot; + slot_size = struct_size(msg_slot, msg_buffer, + msg_slot->msg_len); + dev_dbg(dev, "VF %d copy virtchnl message to migration buffer op: %d, len: %d\n", + vf->vf_id, msg_slot->opcode, msg_slot->msg_len); + memcpy(buf, msg_slot, slot_size); + buf += slot_size; + } + + /* Use op code unknown to mark end of vc messages */ + dummy_op = (struct ice_migration_virtchnl_msg_slot *)buf; + dummy_op->opcode = VIRTCHNL_OP_UNKNOWN; + +out_put_vf: + ice_put_vf(vf); + return ret; +} +EXPORT_SYMBOL(ice_migration_save_devstate); + +/** + * ice_migration_check_match - check if configuration is matched or not + * @pf: pointer to VF + * @buf: pointer to device state buffer + * @buf_sz: size of buffer + * + * Return 0 for success, negative for error + */ +static int ice_migration_check_match(struct ice_vf *vf, const u8 *buf, u64 buf_sz) +{ + u32 supported_caps = ice_migration_supported_caps(); + struct device *dev = ice_pf_to_dev(vf->pf); + struct ice_migration_dev_state *devstate; + struct ice_vsi *vsi; + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + if (sizeof(struct ice_migration_dev_state) > buf_sz) { + dev_err(dev, "VF %d devstate header exceeds buffer size\n", + vf->vf_id); + return -EINVAL; + } + + devstate = (struct ice_migration_dev_state *)buf; + if (devstate->magic != ICE_MIG_DEVSTAT_MAGIC) { + dev_err(dev, "VF %d devstate has invalid magic 0x%x\n", + vf->vf_id, devstate->magic); + return -EINVAL; + } + + if (devstate->version != ICE_MIG_DEVSTAT_VERSION) { + dev_err(dev, "VF %d devstate has invalid version 0x%x\n", + vf->vf_id, devstate->version); + return -EINVAL; + } + + if (devstate->num_txq != vsi->num_txq) { + dev_err(dev, "Failed to match VF %d tx queue number, request %d, support %d\n", + vf->vf_id, devstate->num_txq, vsi->num_txq); + return -EINVAL; + } + + if (devstate->num_rxq != vsi->num_rxq) { + dev_err(dev, "Failed to match VF %d rx queue number, request %d, support %d\n", + vf->vf_id, devstate->num_rxq, vsi->num_rxq); + return -EINVAL; + } + + if ((devstate->vf_caps & supported_caps) != devstate->vf_caps) { + dev_err(dev, "Failed to match VF %d caps, request 0x%x, support 0x%x\n", + vf->vf_id, devstate->vf_caps, supported_caps); + return -EINVAL; + } + + if (devstate->total_size > buf_sz) { + dev_err(dev, "VF %d devstate exceeds buffer size\n", + vf->vf_id); + return -EINVAL; + } + + return 0; +} + +/** + * ice_migration_restore_devstate - restore device state at dst + * @pf: pointer to PF of migration device + * @vf_id: VF index of migration device + * @buf: pointer to device state buf in migration buffer + * @buf_sz: size of migration buffer + * + * This function uses the device state saved in migration buffer + * to restore device state at dst VM + * + * Return 0 for success, negative for error + */ +int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, u64 buf_sz) +{ + struct ice_migration_virtchnl_msg_slot *msg_slot; + struct ice_migration_dev_state *devstate; + struct device *dev = ice_pf_to_dev(pf); + struct ice_vf *vf; + int ret = 0; + + if (!buf) + return -EINVAL; + + vf = ice_get_vf_by_id(pf, vf_id); + if (!vf) { + dev_err(dev, "Unable to locate VF from VF ID%d\n", vf_id); + return -EINVAL; + } + + ret = ice_migration_check_match(vf, buf, buf_sz); + if (ret) + goto out_put_vf; + + devstate = (struct ice_migration_dev_state *)buf; + msg_slot = (struct ice_migration_virtchnl_msg_slot *)devstate->virtchnl_msgs; + set_bit(ICE_VF_STATE_REPLAYING_VC, 
vf->vf_states); + + while (msg_slot->opcode != VIRTCHNL_OP_UNKNOWN) { + struct ice_rq_event_info event; + u64 slot_sz; + + slot_sz = struct_size(msg_slot, msg_buffer, msg_slot->msg_len); + dev_dbg(dev, "VF %d replay virtchnl message op code: %d, msg len: %d\n", + vf->vf_id, msg_slot->opcode, msg_slot->msg_len); + event.desc.cookie_high = msg_slot->opcode; + event.msg_len = msg_slot->msg_len; + event.desc.retval = vf->vf_id; + event.msg_buf = (unsigned char *)msg_slot->msg_buffer; + ret = ice_vc_process_vf_msg(vf->pf, &event, NULL); + if (ret) { + dev_err(dev, "VF %d failed to replay virtchnl message op code: %d\n", + vf->vf_id, msg_slot->opcode); + goto out_clear_replay; + } + event.msg_buf = NULL; + msg_slot = (struct ice_migration_virtchnl_msg_slot *) + ((char *)msg_slot + slot_sz); + } +out_clear_replay: + clear_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states); +out_put_vf: + ice_put_vf(vf); + return ret; +} +EXPORT_SYMBOL(ice_migration_restore_devstate); diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c index b40e91958f0d..e34ea781a81c 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c @@ -3971,11 +3971,24 @@ ice_is_malicious_vf(struct ice_vf *vf, struct ice_mbx_data *mbxdata) * @event: pointer to the AQ event * @mbxdata: information used to detect VF attempting mailbox overflow * - * called from the common asq/arq handler to - * process request from VF + * This function will be called from: + * 1. the common asq/arq handler to process request from VF + * + * The return value is ignored, as the command handler will send the status + * of the request as a response to the VF. This flow sets the mbxdata to + * a non-NULL value and must call ice_is_malicious_vf to determine if this + * VF might be attempting to overflow the PF message queue. + * + * 2. replay virtual channel commamds during live migration + * + * The return value is used to indicate failure to replay vc commands and + * that the migration failed. This flow sets mbxdata to NULL and skips the + * ice_is_malicious_vf checks which are unnecessary during replay. + * + * Return 0 if success, negative for failure. */ -void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, - struct ice_mbx_data *mbxdata) +int ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, + struct ice_mbx_data *mbxdata) { u32 v_opcode = le32_to_cpu(event->desc.cookie_high); s16 vf_id = le16_to_cpu(event->desc.retval); @@ -3992,13 +4005,13 @@ void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, if (!vf) { dev_err(dev, "Unable to locate VF for message from VF ID %d, opcode %d, len %d\n", vf_id, v_opcode, msglen); - return; + return -EINVAL; } mutex_lock(&vf->cfg_lock); /* Check if the VF is trying to overflow the mailbox */ - if (ice_is_malicious_vf(vf, mbxdata)) + if (!test_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states) && ice_is_malicious_vf(vf, mbxdata)) goto finish; /* Check if VF is disabled. 
*/ @@ -4177,4 +4190,5 @@ void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, finish: mutex_unlock(&vf->cfg_lock); ice_put_vf(vf); + return err; } diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.h b/drivers/net/ethernet/intel/ice/ice_virtchnl.h index a2b6094e2f2f..4b151a228c52 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.h +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.h @@ -63,8 +63,8 @@ int ice_vc_respond_to_vf(struct ice_vf *vf, u32 v_opcode, enum virtchnl_status_code v_retval, u8 *msg, u16 msglen); bool ice_vc_isvalid_vsi_id(struct ice_vf *vf, u16 vsi_id); -void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, - struct ice_mbx_data *mbxdata); +int ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, + struct ice_mbx_data *mbxdata); #else /* CONFIG_PCI_IOV */ static inline void ice_virtchnl_set_dflt_ops(struct ice_vf *vf) { } static inline void ice_virtchnl_set_repr_ops(struct ice_vf *vf) { } @@ -84,10 +84,11 @@ static inline bool ice_vc_isvalid_vsi_id(struct ice_vf *vf, u16 vsi_id) return false; } -static inline void +static inline int ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, struct ice_mbx_data *mbxdata) { + return -EOPNOTSUPP; } #endif /* !CONFIG_PCI_IOV */ diff --git a/include/linux/net/intel/ice_migration.h b/include/linux/net/intel/ice_migration.h index d7228de7b02d..57c0e60e21d4 100644 --- a/include/linux/net/intel/ice_migration.h +++ b/include/linux/net/intel/ice_migration.h @@ -11,6 +11,8 @@ struct ice_pf; struct ice_pf *ice_migration_get_pf(struct pci_dev *pdev); int ice_migration_init_dev(struct ice_pf *pf, int vf_id); void ice_migration_uninit_dev(struct ice_pf *pf, int vf_id); +int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_sz); +int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, u64 buf_sz); #else static inline struct ice_pf *ice_migration_get_pf(struct pci_dev *pdev) @@ -20,6 +22,16 @@ static inline struct ice_pf *ice_migration_get_pf(struct pci_dev *pdev) static inline int ice_migration_init_dev(struct ice_pf *pf, int vf_id) { } static inline void ice_migration_uninit_dev(struct ice_pf *pf, int vf_id) { } +static inline int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_sz) +{ + return 0; +} + +static inline int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, + const u8 *buf, u64 buf_sz) +{ + return 0; +} #endif /* CONFIG_ICE_VFIO_PCI */ #endif /* _ICE_MIGRATION_H_ */ From patchwork Mon Sep 18 06:25:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388848 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A073CD13DA for ; Mon, 18 Sep 2023 06:29:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239956AbjIRG3G (ORCPT ); Mon, 18 Sep 2023 02:29:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46606 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239976AbjIRG2n (ORCPT ); Mon, 18 Sep 2023 02:28:43 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7CD1E11A; Sun, 17 Sep 2023 23:28:31 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018511; x=1726554511; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=T4XDQlaeGUnJmw+JumZ6xll5gqQvfIwOAAdZVBL6bhI=; b=CAZ1xCEV3dygdr2FYdzb3ZWXj9tE4EW8bmCEOBU0Lpw0wfwARm+hWlyU qCqEjUCrDQOLOX+5cbOBjrSTV1Sm1+DQM7OATBhPyIxXBIWzZHEDxctXy /ji/0MkDub/I1KaqoGKghwGRQMqppVwGbR+sENddyOnZP+O6nuiUOh37B 2Z8p3V6XcrQnaYQQEkjUhntXHuQcOE8i9mBTF/XlGIK1qKV9phvLzYfrX KJPLxnEZsDY8e5b1Mn7D5ewRhYcj6YOWIkBh4yfsYoLDXWukH6unxO8Ih gWqVbdmLHwqlIwSLfeb/D6VWJ7s9d/b0dO7mlFTbih5QapLwjveukOLLT Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488599" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488599" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893566" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893566" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:26 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 08/13] ice: Fix VSI id in virtual channel message for migration Date: Mon, 18 Sep 2023 06:25:41 +0000 Message-Id: <20230918062546.40419-9-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu VSI id is a resource id for each VF and it is an absolute hardware id per PCI card. It is exposed to VF driver through virtual channel messages at the VF-PF negotiation stage. It is constant during the whole device lifecycle unless driver re-init. Almost all of the virtual channel messages will contain the VSI id. Once PF receives message, it will check if VSI id in the message is equal to the VF's VSI id for security and other reason. If a VM backed by VF VSI A is migrated to a VM backed by VF with VSI B, while in messages replaying stage, all the messages will be rejected by PF due to the invalid VSI id. Even after migration, VM runtime will get failure as well. Fix this gap by modifying the VSI id in the virtual channel message at migration device resuming stage and VM runtime stage. In most cases the VSI id will vary between migration source and destination side. And this is a slow path anyway. 
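A rough user-space sketch of the fixup for the common case (illustrative only: the helper below is invented; it relies on the fact, also used by the patch, that the loggable messages carry the VSI id in the first two bytes of the payload):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite the VSI id at the start of a message payload. At runtime the
 * id is only replaced when it matches the id the VM was given on the
 * source side; during replay every logged message is trusted and is
 * rewritten unconditionally.
 */
static void fix_msg_vsi(uint8_t *msg, uint16_t src_vsi, uint16_t dst_vsi,
			bool replaying)
{
	uint16_t vsi;

	memcpy(&vsi, msg, sizeof(vsi));
	if (vsi == src_vsi || replaying) {
		vsi = dst_vsi;
		memcpy(msg, &vsi, sizeof(vsi));
	}
}

int main(void)
{
	uint8_t msg[4] = { 0 };
	uint16_t src = 10, dst = 42, out;

	memcpy(msg, &src, sizeof(src));		/* message built against VSI 10 */
	fix_msg_vsi(msg, src, dst, false);
	memcpy(&out, msg, sizeof(out));
	printf("VSI id after fixup: %u\n", out);	/* prints 42 */
	return 0;
}

Messages that embed the VSI id deeper in the payload (queue config, IRQ map) need per-opcode handling, which is what the switch statement in the patch provides.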
Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- .../net/ethernet/intel/ice/ice_migration.c | 96 +++++++++++++++++++ .../intel/ice/ice_migration_private.h | 4 + drivers/net/ethernet/intel/ice/ice_vf_lib.h | 1 + drivers/net/ethernet/intel/ice/ice_virtchnl.c | 1 + 4 files changed, 102 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c index edcd6df332ba..99faf9acff13 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration.c +++ b/drivers/net/ethernet/intel/ice/ice_migration.c @@ -25,6 +25,7 @@ struct ice_migration_dev_state { u16 num_txq; u16 num_rxq; + u16 vsi_id; u8 virtchnl_msgs[]; } __aligned(8); @@ -50,6 +51,7 @@ void ice_migration_init_vf(struct ice_vf *vf) INIT_LIST_HEAD(&vf->virtchnl_msg_list); vf->virtchnl_msg_num = 0; vf->virtchnl_msg_size = 0; + vf->vm_vsi_num = vf->lan_vsi_num; } /** @@ -314,6 +316,7 @@ int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_s devstate->num_txq = vsi->num_txq; devstate->num_rxq = vsi->num_rxq; buf = devstate->virtchnl_msgs; + devstate->vsi_id = vf->vm_vsi_num; list_for_each_entry(msg_listnode, &vf->virtchnl_msg_list, node) { struct ice_migration_virtchnl_msg_slot *msg_slot; @@ -439,6 +442,8 @@ int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, goto out_put_vf; devstate = (struct ice_migration_dev_state *)buf; + vf->vm_vsi_num = devstate->vsi_id; + dev_dbg(dev, "VF %d vm vsi num is:%d\n", vf->vf_id, vf->vm_vsi_num); msg_slot = (struct ice_migration_virtchnl_msg_slot *)devstate->virtchnl_msgs; set_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states); @@ -470,3 +475,94 @@ int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, return ret; } EXPORT_SYMBOL(ice_migration_restore_devstate); + +/** + * ice_migration_fix_msg_vsi - change virtual channel msg VSI id + * + * @vf: pointer to the VF structure + * @v_opcode: virtchnl message operation code + * @msg: pointer to the virtual channel message + * + * After migration, the VSI id of virtual channel message is still + * migration src VSI id. Some virtual channel commands will fail + * due to unmatch VSI id. + * Change virtual channel message payload VSI id to real VSI id. + */ +void ice_migration_fix_msg_vsi(struct ice_vf *vf, u32 v_opcode, u8 *msg) +{ + if (!vf->migration_enabled) + return; + + switch (v_opcode) { + case VIRTCHNL_OP_ADD_ETH_ADDR: + case VIRTCHNL_OP_DEL_ETH_ADDR: + case VIRTCHNL_OP_ENABLE_QUEUES: + case VIRTCHNL_OP_DISABLE_QUEUES: + case VIRTCHNL_OP_CONFIG_RSS_KEY: + case VIRTCHNL_OP_CONFIG_RSS_LUT: + case VIRTCHNL_OP_GET_STATS: + case VIRTCHNL_OP_CONFIG_PROMISCUOUS_MODE: + case VIRTCHNL_OP_ADD_FDIR_FILTER: + case VIRTCHNL_OP_DEL_FDIR_FILTER: + case VIRTCHNL_OP_ADD_VLAN: + case VIRTCHNL_OP_DEL_VLAN: { + /* Read the beginning two bytes of message for VSI id */ + u16 *vsi_id = (u16 *)msg; + + /* For VM runtime stage, vsi_id in the virtual channel message + * should be equal to the PF logged vsi_id and vsi_id is + * replaced by VF's VSI id to guarantee that messages are + * processed successfully. If vsi_id is not equal to the PF + * logged vsi_id, then this message must be sent by malicious + * VF and no replacement is needed. Just let virtual channel + * handler to fail this message. + * + * For virtual channel replaying stage, all of the PF logged + * virtual channel messages are trusted and vsi_id is replaced + * anyway to guarantee the messages are processed successfully. 
+ */ + if (*vsi_id == vf->vm_vsi_num || + test_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states)) + *vsi_id = vf->lan_vsi_num; + break; + } + case VIRTCHNL_OP_CONFIG_IRQ_MAP: { + struct virtchnl_irq_map_info *irqmap_info; + u16 num_q_vectors_mapped; + int i; + + irqmap_info = (struct virtchnl_irq_map_info *)msg; + num_q_vectors_mapped = irqmap_info->num_vectors; + for (i = 0; i < num_q_vectors_mapped; i++) { + struct virtchnl_vector_map *map; + + map = &irqmap_info->vecmap[i]; + if (map->vsi_id == vf->vm_vsi_num || + test_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states)) + map->vsi_id = vf->lan_vsi_num; + } + break; + } + case VIRTCHNL_OP_CONFIG_VSI_QUEUES: { + struct virtchnl_vsi_queue_config_info *qci; + + qci = (struct virtchnl_vsi_queue_config_info *)msg; + if (qci->vsi_id == vf->vm_vsi_num || + test_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states)) { + int i; + + qci->vsi_id = vf->lan_vsi_num; + for (i = 0; i < qci->num_queue_pairs; i++) { + struct virtchnl_queue_pair_info *qpi; + + qpi = &qci->qpair[i]; + qpi->txq.vsi_id = vf->lan_vsi_num; + qpi->rxq.vsi_id = vf->lan_vsi_num; + } + } + break; + } + default: + break; + } +} diff --git a/drivers/net/ethernet/intel/ice/ice_migration_private.h b/drivers/net/ethernet/intel/ice/ice_migration_private.h index 678ae361cf0c..af70025f2f36 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration_private.h +++ b/drivers/net/ethernet/intel/ice/ice_migration_private.h @@ -17,6 +17,7 @@ int ice_migration_log_vf_msg(struct ice_vf *vf, struct ice_rq_event_info *event); void ice_migration_unlog_vf_msg(struct ice_vf *vf, u32 v_opcode); u32 ice_migration_supported_caps(void); +void ice_migration_fix_msg_vsi(struct ice_vf *vf, u32 v_opcode, u8 *msg); #else static inline void ice_migration_init_vf(struct ice_vf *vf) { } static inline void ice_migration_uninit_vf(struct ice_vf *vf) { } @@ -28,6 +29,9 @@ ice_migration_supported_caps(void) { return 0xFFFFFFFF; } + +static inline void +ice_migration_fix_msg_vsi(struct ice_vf *vf, u32 v_opcode, u8 *msg) { } #endif /* CONFIG_ICE_VFIO_PCI */ #endif /* _ICE_MIGRATION_PRIVATE_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h index 011398655739..e37c3b0ecc06 100644 --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h @@ -143,6 +143,7 @@ struct ice_vf { u64 virtchnl_msg_num; u64 virtchnl_msg_size; u32 virtchnl_retval; + u16 vm_vsi_num; }; /* Flags for controlling behavior of ice_reset_vf */ diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c index e34ea781a81c..7cedd0542d4b 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c @@ -4048,6 +4048,7 @@ int ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event, } if (vf->migration_enabled) { + ice_migration_fix_msg_vsi(vf, v_opcode, msg); if (ice_migration_log_vf_msg(vf, event)) { err = ice_vc_respond_to_vf(vf, v_opcode, VIRTCHNL_STATUS_ERR_NO_MEMORY, From patchwork Mon Sep 18 06:25:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79F81CD13D9 for ; Mon, 18 Sep 2023 06:29:17 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239963AbjIRG3G (ORCPT ); Mon, 18 Sep 2023 02:29:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46650 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239982AbjIRG2p (ORCPT ); Mon, 18 Sep 2023 02:28:45 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E1201124; Sun, 17 Sep 2023 23:28:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018516; x=1726554516; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=MNy2PcBWRzbTHueI2ijmsawCjISW9Er1KUb7rCXIUh4=; b=nufCblSpV6F9XwAIb96kLwB3p98M09iAE9wNrSB4DRLg/IO6GNJVqANt 97YXN89Ft932N1O+gOakunTB9jL4FpfGOlq/YLOQPyR5pJJnUuyzcN9B/ 8XlnlXfXn3gcVfXJk62iNhqTzjTK8Ertj+F4bRUXVh4ksZV8BYq4mZ3tp B0cNwd/pw8ZV2n145n7CJP7Pd8+x9zgXKWInZ+IxFtdyBs532rpdEg2oY 923fBgXLl+D3gOrPo1BWObrDCzftqe8NSYYgaXpWsn5QcAofoeoDF3POj Gu6pyzO5GYxigCYXMDWD1jNX5InwW66OLe3GHLjuPlMoRT4/ZMA3oleEa A==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488618" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488618" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893586" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893586" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:31 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 09/13] ice: Save and restore RX Queue head Date: Mon, 18 Sep 2023 06:25:42 +0000 Message-Id: <20230918062546.40419-10-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu RX Queue head is a fundamental dma ring context which determines the next RX descriptor to be fetched. However, RX Queue head is not visible to VF while it is only visible in PF. As a result, PF needs to save and restore RX Queue Head explicitly. Since network packets may come in at any time once RX Queue is enabled, RX Queue head needs to be restored before Queue is enabled. RX Queue head restoring handler is implemented by reading and then overwriting queue context with specific HEAD value. 
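A rough user-space sketch of the restore step (illustrative only: the context structure and accessors below are stand-ins for the indirect RX queue context access, not the ice register interface):

#include <stdint.h>
#include <stdio.h>

/* Stand-in for the hardware RX queue context; the real context has
 * many more fields. head is the next descriptor the device fetches. */
struct rxq_ctx {
	uint16_t head;
	uint16_t len;
};

static struct rxq_ctx hw_ctx = { .head = 0, .len = 512 };

static int read_rxq_ctx(struct rxq_ctx *ctx)
{
	*ctx = hw_ctx;		/* models the indirect context read */
	return 0;
}

static int write_rxq_ctx(const struct rxq_ctx *ctx)
{
	hw_ctx = *ctx;		/* models the indirect context write */
	return 0;
}

/* Read-modify-write the queue context with the saved head value;
 * this must complete before the RX queue is enabled. */
static int restore_rx_head(uint16_t saved_head)
{
	struct rxq_ctx ctx;

	if (read_rxq_ctx(&ctx))
		return -1;
	ctx.head = saved_head;
	return write_rxq_ctx(&ctx);
}

int main(void)
{
	restore_rx_head(123);
	printf("rx head restored to %u\n", hw_ctx.head);
	return 0;
}

In the patch the same read-modify-write is performed right after VIRTCHNL_OP_CONFIG_VSI_QUEUES is replayed, so the head is in place before the subsequent queue-enable message.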
Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- .../net/ethernet/intel/ice/ice_migration.c | 125 ++++++++++++++++++ 1 file changed, 125 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c index 99faf9acff13..34cfc58ed525 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration.c +++ b/drivers/net/ethernet/intel/ice/ice_migration.c @@ -2,9 +2,11 @@ /* Copyright (C) 2018-2023 Intel Corporation */ #include "ice.h" +#include "ice_base.h" #define ICE_MIG_DEVSTAT_MAGIC 0xE8000001 #define ICE_MIG_DEVSTAT_VERSION 0x1 +#define ICE_MIG_VF_QRX_TAIL_MAX 256 struct ice_migration_virtchnl_msg_slot { u32 opcode; @@ -26,6 +28,8 @@ struct ice_migration_dev_state { u16 num_rxq; u16 vsi_id; + /* next RX desc index to be processed by the device */ + u16 rx_head[ICE_MIG_VF_QRX_TAIL_MAX]; u8 virtchnl_msgs[]; } __aligned(8); @@ -265,6 +269,54 @@ u32 ice_migration_supported_caps(void) return VIRTCHNL_VF_MIGRATION_SUPPORT_FEATURE; } +/** + * ice_migration_save_rx_head - save rx head into device state buffer + * @vf: pointer to VF structure + * @devstate: pointer to migration buffer + * + * Return 0 for success, negative for error + */ +static int +ice_migration_save_rx_head(struct ice_vf *vf, + struct ice_migration_dev_state *devstate) +{ + struct device *dev = ice_pf_to_dev(vf->pf); + struct ice_vsi *vsi; + int i; + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + ice_for_each_rxq(vsi, i) { + struct ice_rx_ring *rx_ring = vsi->rx_rings[i]; + struct ice_rlan_ctx rlan_ctx = {}; + struct ice_hw *hw = &vf->pf->hw; + u16 rxq_index; + int status; + + if (WARN_ON_ONCE(!rx_ring)) + return -EINVAL; + + devstate->rx_head[i] = 0; + if (!test_bit(i, vf->rxq_ena)) + continue; + + rxq_index = rx_ring->reg_idx; + status = ice_read_rxq_ctx(hw, &rlan_ctx, rxq_index); + if (status) { + dev_err(dev, "Failed to read RXQ[%d] context, err=%d\n", + rx_ring->q_index, status); + return -EIO; + } + devstate->rx_head[i] = rlan_ctx.head; + } + + return 0; +} + /** * ice_migration_save_devstate - save device state to migration buffer * @pf: pointer to PF of migration device @@ -318,6 +370,12 @@ int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_s buf = devstate->virtchnl_msgs; devstate->vsi_id = vf->vm_vsi_num; + ret = ice_migration_save_rx_head(vf, devstate); + if (ret) { + dev_err(dev, "VF %d failed to save rxq head\n", vf->vf_id); + goto out_put_vf; + } + list_for_each_entry(msg_listnode, &vf->virtchnl_msg_list, node) { struct ice_migration_virtchnl_msg_slot *msg_slot; u64 slot_size; @@ -408,6 +466,57 @@ static int ice_migration_check_match(struct ice_vf *vf, const u8 *buf, u64 buf_s return 0; } +/** + * ice_migration_restore_rx_head - restore rx head from device state buffer + * @vf: pointer to VF structure + * @devstate: pointer to migration device state + * + * Return 0 for success, negative for error + */ +static int +ice_migration_restore_rx_head(struct ice_vf *vf, + struct ice_migration_dev_state *devstate) +{ + struct device *dev = ice_pf_to_dev(vf->pf); + struct ice_vsi *vsi; + int i; + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + ice_for_each_rxq(vsi, i) { + struct ice_rx_ring *rx_ring = vsi->rx_rings[i]; + struct ice_rlan_ctx rlan_ctx = {}; + struct ice_hw *hw = &vf->pf->hw; + u16 rxq_index; + int status; + + if (WARN_ON_ONCE(!rx_ring)) + return -EINVAL; + + rxq_index = 
rx_ring->reg_idx; + status = ice_read_rxq_ctx(hw, &rlan_ctx, rxq_index); + if (status) { + dev_err(dev, "Failed to read RXQ[%d] context, err=%d\n", + rx_ring->q_index, status); + return -EIO; + } + + rlan_ctx.head = devstate->rx_head[i]; + status = ice_write_rxq_ctx(hw, &rlan_ctx, rxq_index); + if (status) { + dev_err(dev, "Failed to set LAN RXQ[%d] context, err=%d\n", + rx_ring->q_index, status); + return -EIO; + } + } + + return 0; +} + /** * ice_migration_restore_devstate - restore device state at dst * @pf: pointer to PF of migration device @@ -464,6 +573,22 @@ int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, vf->vf_id, msg_slot->opcode); goto out_clear_replay; } + + /* Once RX Queue is enabled, network traffic may come in at any + * time. As a result, RX Queue head needs to be restored before + * RX Queue is enabled. + * For simplicity and integration, overwrite RX head just after + * RX ring context is configured. + */ + if (msg_slot->opcode == VIRTCHNL_OP_CONFIG_VSI_QUEUES) { + ret = ice_migration_restore_rx_head(vf, devstate); + if (ret) { + dev_err(dev, "VF %d failed to restore rx head\n", + vf->vf_id); + goto out_clear_replay; + } + } + event.msg_buf = NULL; msg_slot = (struct ice_migration_virtchnl_msg_slot *) ((char *)msg_slot + slot_sz); From patchwork Mon Sep 18 06:25:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388849 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8AA57CD13DE for ; Mon, 18 Sep 2023 06:29:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239978AbjIRG3I (ORCPT ); Mon, 18 Sep 2023 02:29:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46682 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239992AbjIRG2t (ORCPT ); Mon, 18 Sep 2023 02:28:49 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39390E6; Sun, 17 Sep 2023 23:28:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018522; x=1726554522; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QAGAvhgofHvjdu05VrvwkkzXHxMwIcITqc7vd0VaIM8=; b=Ur1QV58LM+uVkIW4rVBSoNKfbtfHiR8scFiLnf91IZLR4NDTKmh/OpNW A6m8r6aTxrBXGLzLYyV6zjzSWldrdDW76rO7/Q7wq18O4IMR+oN6GLF2V CVmpN0BAQmJBAPPMSq3AN/0mLMARtWNRLKRSeC3gszJTWEn3TypU2q7uy lQdSieU9sQ7d2cxG2may/6XXdiuVxt/FdESry0aGPrAxQ1vJckxl9JRQ4 +adz73T5L9jcikRZAMUcTlaVUZIbJxIBsyjdryz6dPrIgzw0Ri3SiCGC5 jCx3H/7/lbVZyb6bHcauv9qyklBdvPugKaPZoURISy/tXO33OgKXsVvMd Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488635" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488635" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893613" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893613" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:36 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: 
kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 10/13] ice: Save and restore TX Queue head Date: Mon, 18 Sep 2023 06:25:43 +0000 Message-Id: <20230918062546.40419-11-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu TX Queue head is a fundamental DMA ring context which determines the next TX descriptor to be fetched. However, TX Queue head is not visible to VF while it is only visible in PF. As a result, PF needs to save and restore TX Queue head explicitly. Unfortunately, due to HW limitation, TX Queue head can't be recovered through writing mmio registers. Since sending one packet will make TX head advanced by 1 index, TX Queue head can be advanced by N index through sending N packets. So filling in DMA ring with NOP descriptors and bumping doorbell can be used to change TX Queue head indirectly. And this method has no side effects except changing TX head value. To advance TX Head queue, HW needs to touch memory by DMA. But directly touching VM's memory to advance TX Queue head does not follow vfio migration protocol design, because vIOMMU state is not defined by the protocol. Even this may introduce functional and security issue under hostile guest circumstances. In order not to touch any VF memory or IO page table, TX Queue head restore is using PF managed memory and PF isolation domain. This will also introduce another dependency that while switching TX Queue between PF space and VF space, TX Queue head value is not changed. HW provides an indirect context access so that head value can be kept while switching context. In virtual channel model, VF driver only send TX queue ring base and length info to PF, while rest of the TX queue context are managed by PF. TX queue length must be verified by PF during virtual channel message processing. When PF uses dummy descriptors to advance TX head, it will configure the TX ring base as the new address managed by PF itself. As a result, all of the TX queue context is taken control of by PF and this method won't generate any attacking vulnerability The overall steps for TX head restoring handler are: 1. Backup TX context, switch TX queue context as PF space and PF DMA ring base with interrupt disabled 2. Fill the DMA ring with dummy descriptors and bump doorbell to advance TX head. Once kicking doorbell, HW will issue DMA and send PCI upstream memory transaction tagged by PF BDF. Since ring base is PF's managed DMA buffer, DMA can work successfully and TX Head is advanced as expected. 3. Overwrite TX context by the backup context in step 1. Since TX queue head value is not changed while context switch, TX queue head is successfully restored. Since everything is happening inside PF context, it is transparent to vfio driver and has no effects outside of PF. 
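A rough user-space sketch of the head-advance arithmetic only (illustrative only: it models the modular math, not the dummy descriptor format, the context switch, or the doorbell write):

#include <stdint.h>
#include <stdio.h>

/* Each completed dummy descriptor advances the hardware TX head by one,
 * wrapping at the ring length, so the number of dummy descriptors to
 * post is the modular distance from the current head to the target. */
static uint16_t dummy_descs_needed(uint16_t cur_head, uint16_t target_head,
				   uint16_t ring_len)
{
	return (uint16_t)((target_head + ring_len - cur_head) % ring_len);
}

int main(void)
{
	uint16_t ring_len = 256, head = 0, target = 37;
	uint16_t n = dummy_descs_needed(head, target, ring_len);
	uint16_t i;

	for (i = 0; i < n; i++)
		head = (head + 1) % ring_len;	/* one doorbell-driven step */

	printf("posted %u dummy descriptors, head %u, target %u\n",
	       n, head, target);
	return 0;
}

Because the dummy ring and packet buffer are PF-owned memory, the resulting DMA is tagged with the PF BDF and never touches guest memory, which is the point of the scheme described above.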
Co-developed-by: Yahui Cao Signed-off-by: Yahui Cao Signed-off-by: Lingyu Liu --- .../net/ethernet/intel/ice/ice_migration.c | 277 ++++++++++++++++++ drivers/net/ethernet/intel/ice/ice_virtchnl.c | 17 ++ 2 files changed, 294 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c index 34cfc58ed525..3b6bb6b975f7 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration.c +++ b/drivers/net/ethernet/intel/ice/ice_migration.c @@ -3,10 +3,14 @@ #include "ice.h" #include "ice_base.h" +#include "ice_txrx_lib.h" #define ICE_MIG_DEVSTAT_MAGIC 0xE8000001 #define ICE_MIG_DEVSTAT_VERSION 0x1 #define ICE_MIG_VF_QRX_TAIL_MAX 256 +#define QTX_HEAD_RESTORE_DELAY_MAX 100 +#define QTX_HEAD_RESTORE_DELAY_SLEEP_US_MIN 10 +#define QTX_HEAD_RESTORE_DELAY_SLEEP_US_MAX 10 struct ice_migration_virtchnl_msg_slot { u32 opcode; @@ -30,6 +34,8 @@ struct ice_migration_dev_state { u16 vsi_id; /* next RX desc index to be processed by the device */ u16 rx_head[ICE_MIG_VF_QRX_TAIL_MAX]; + /* next TX desc index to be processed by the device */ + u16 tx_head[ICE_MIG_VF_QRX_TAIL_MAX]; u8 virtchnl_msgs[]; } __aligned(8); @@ -317,6 +323,62 @@ ice_migration_save_rx_head(struct ice_vf *vf, return 0; } +/** + * ice_migration_save_tx_head - save tx head in migration region + * @vf: pointer to VF structure + * @devstate: pointer to migration device state + * + * Return 0 for success, negative for error + */ +static int +ice_migration_save_tx_head(struct ice_vf *vf, + struct ice_migration_dev_state *devstate) +{ + struct ice_vsi *vsi = ice_get_vf_vsi(vf); + struct ice_pf *pf = vf->pf; + struct device *dev; + int i = 0; + + dev = ice_pf_to_dev(pf); + + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + ice_for_each_txq(vsi, i) { + u16 tx_head; + u32 reg; + + devstate->tx_head[i] = 0; + if (!test_bit(i, vf->txq_ena)) + continue; + + reg = rd32(&pf->hw, QTX_COMM_HEAD(vsi->txq_map[i])); + tx_head = (reg & QTX_COMM_HEAD_HEAD_M) + >> QTX_COMM_HEAD_HEAD_S; + + /* 1. If TX head is the QTX_COMM_HEAD_HEAD_M marker, it is still + * the value written by software, no descriptor write-back has + * happened and no packets have been sent since the queue was + * enabled. + * 2. If TX head is ring length minus 1, then it has just wrapped + * around to the start of the ring.
+ */ + if (tx_head == QTX_COMM_HEAD_HEAD_M || + tx_head == (vsi->tx_rings[i]->count - 1)) + tx_head = 0; + else + /* Add compensation since value read from TX Head + * register is always the real TX head minus 1 + */ + tx_head++; + + devstate->tx_head[i] = tx_head; + } + return 0; +} + /** * ice_migration_save_devstate - save device state to migration buffer * @pf: pointer to PF of migration device @@ -376,6 +438,12 @@ int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_s goto out_put_vf; } + ret = ice_migration_save_tx_head(vf, devstate); + if (ret) { + dev_err(dev, "VF %d failed to save txq head\n", vf->vf_id); + goto out_put_vf; + } + list_for_each_entry(msg_listnode, &vf->virtchnl_msg_list, node) { struct ice_migration_virtchnl_msg_slot *msg_slot; u64 slot_size; @@ -517,6 +585,205 @@ ice_migration_restore_rx_head(struct ice_vf *vf, return 0; } +/** + * ice_migration_init_dummy_desc - init dma ring by dummy descriptor + * @ice_tx_desc: tx ring descriptor array + * @len: array length + * @tx_pkt_dma: dummy packet dma address + */ +static inline void +ice_migration_init_dummy_desc(struct ice_tx_desc *tx_desc, + u16 len, + dma_addr_t tx_pkt_dma) +{ + int i; + + /* Init ring with dummy descriptors */ + for (i = 0; i < len; i++) { + u32 td_cmd; + + td_cmd = ICE_TXD_LAST_DESC_CMD | ICE_TX_DESC_CMD_DUMMY; + tx_desc[i].cmd_type_offset_bsz = + ice_build_ctob(td_cmd, 0, SZ_256, 0); + tx_desc[i].buf_addr = cpu_to_le64(tx_pkt_dma); + } +} + +/** + * ice_migration_inject_dummy_desc - inject dummy descriptors + * @vf: pointer to VF structure + * @tx_ring: tx ring instance + * @head: tx head to be restored + * @tx_desc_dma:tx descriptor ring base dma address + * + * For each TX queue, restore the TX head by following below steps: + * 1. Backup TX context, switch TX queue context as PF space and PF + * DMA ring base with interrupt disabled + * 2. Fill the DMA ring with dummy descriptors and bump doorbell to + * advance TX head. Once kicking doorbell, HW will issue DMA and + * send PCI upstream memory transaction tagged by PF BDF. Since + * ring base is PF's managed DMA buffer, DMA can work successfully + * and TX Head is advanced as expected. + * 3. Overwrite TX context by the backup context in step 1. Since TX + * queue head value is not changed while context switch, TX queue + * head is successfully restored. + * + * Return 0 for success, negative for error. 
+ */ +static int +ice_migration_inject_dummy_desc(struct ice_vf *vf, struct ice_tx_ring *tx_ring, + u16 head, dma_addr_t tx_desc_dma) +{ + struct ice_tlan_ctx tlan_ctx, tlan_ctx_orig; + struct device *dev = ice_pf_to_dev(vf->pf); + struct ice_hw *hw = &vf->pf->hw; + u32 reg_dynctl_orig; + u32 reg_tqctl_orig; + u32 tx_head; + int status; + int i; + + /* 1.1 Backup TX Queue context */ + status = ice_read_txq_ctx(hw, &tlan_ctx, tx_ring->reg_idx); + if (status) { + dev_err(dev, "Failed to read TXQ[%d] context, err=%d\n", + tx_ring->q_index, status); + return -EIO; + } + memcpy(&tlan_ctx_orig, &tlan_ctx, sizeof(tlan_ctx)); + reg_tqctl_orig = rd32(hw, QINT_TQCTL(tx_ring->reg_idx)); + if (tx_ring->q_vector) + reg_dynctl_orig = rd32(hw, GLINT_DYN_CTL(tx_ring->q_vector->reg_idx)); + + /* 1.2 switch TX queue context as PF space and PF DMA ring base */ + tlan_ctx.vmvf_type = ICE_TLAN_CTX_VMVF_TYPE_PF; + tlan_ctx.vmvf_num = 0; + tlan_ctx.base = tx_desc_dma >> ICE_TLAN_CTX_BASE_S; + status = ice_write_txq_ctx(hw, &tlan_ctx, tx_ring->reg_idx); + if (status) { + dev_err(dev, "Failed to write TXQ[%d] context, err=%d\n", + tx_ring->q_index, status); + return -EIO; + } + + /* 1.3 Disable TX queue interrupt */ + wr32(hw, QINT_TQCTL(tx_ring->reg_idx), QINT_TQCTL_ITR_INDX_M); + + /* To disable tx queue interrupt during run time, software should + * write mmio to trigger a MSIX interrupt. + */ + if (tx_ring->q_vector) + wr32(hw, GLINT_DYN_CTL(tx_ring->q_vector->reg_idx), + (ICE_ITR_NONE << GLINT_DYN_CTL_ITR_INDX_S) | + GLINT_DYN_CTL_SWINT_TRIG_M | + GLINT_DYN_CTL_INTENA_M); + + /* Force memory writes to complete before letting h/w know there + * are new descriptors to fetch. + */ + wmb(); + + /* 2.1 Bump doorbell to advance TX Queue head */ + writel(head, tx_ring->tail); + + /* 2.2 Wait until TX Queue head move to expected place */ + tx_head = rd32(hw, QTX_COMM_HEAD(tx_ring->reg_idx)); + tx_head = (tx_head & QTX_COMM_HEAD_HEAD_M) + >> QTX_COMM_HEAD_HEAD_S; + for (i = 0; i < QTX_HEAD_RESTORE_DELAY_MAX && tx_head != (head - 1); i++) { + usleep_range(QTX_HEAD_RESTORE_DELAY_SLEEP_US_MIN, + QTX_HEAD_RESTORE_DELAY_SLEEP_US_MAX); + tx_head = rd32(hw, QTX_COMM_HEAD(tx_ring->reg_idx)); + tx_head = (tx_head & QTX_COMM_HEAD_HEAD_M) + >> QTX_COMM_HEAD_HEAD_S; + } + if (i == QTX_HEAD_RESTORE_DELAY_MAX) { + dev_err(dev, "VF %d txq[%d] head restore timeout\n", + vf->vf_id, tx_ring->q_index); + return -EIO; + } + + /* 3. 
Overwrite TX Queue context with backup context */ + status = ice_write_txq_ctx(hw, &tlan_ctx_orig, tx_ring->reg_idx); + if (status) { + dev_err(dev, "Failed to write TXQ[%d] context, err=%d\n", + tx_ring->q_index, status); + return -EIO; + } + wr32(hw, QINT_TQCTL(tx_ring->reg_idx), reg_tqctl_orig); + if (tx_ring->q_vector) + wr32(hw, GLINT_DYN_CTL(tx_ring->q_vector->reg_idx), reg_dynctl_orig); + + return 0; +} + +/** + * ice_migration_restore_tx_head - restore tx head at dst + * @vf: pointer to VF structure + * @devstate: pointer to migration device state + * + * Return 0 for success, negative for error + */ +static int +ice_migration_restore_tx_head(struct ice_vf *vf, + struct ice_migration_dev_state *devstate) +{ + struct device *dev = ice_pf_to_dev(vf->pf); + u16 max_ring_len = ICE_MAX_NUM_DESC; + dma_addr_t tx_desc_dma, tx_pkt_dma; + struct ice_tx_desc *tx_desc; + struct ice_vsi *vsi; + char *tx_pkt; + int ret = 0; + int i = 0; + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + /* Allocate DMA ring and descriptor by PF */ + tx_desc = dma_alloc_coherent(dev, max_ring_len * sizeof(struct ice_tx_desc), + &tx_desc_dma, GFP_KERNEL | __GFP_ZERO); + tx_pkt = dma_alloc_coherent(dev, SZ_4K, &tx_pkt_dma, GFP_KERNEL | __GFP_ZERO); + if (!tx_desc || !tx_pkt) { + dev_err(dev, "PF failed to allocate memory for VF %d\n", vf->vf_id); + ret = -ENOMEM; + goto err; + } + + ice_for_each_txq(vsi, i) { + struct ice_tx_ring *tx_ring = vsi->tx_rings[i]; + u16 *tx_heads = devstate->tx_head; + + /* 1. Skip if TX Queue is not enabled */ + if (!test_bit(i, vf->txq_ena) || tx_heads[i] == 0) + continue; + + if (tx_heads[i] >= tx_ring->count) { + dev_err(dev, "VF %d: invalid tx head to restore\n", + vf->vf_id); + ret = -EINVAL; + goto err; + } + + /* Dummy descriptors must be re-initialized after use, since + * they may be written back by HW + */ + ice_migration_init_dummy_desc(tx_desc, max_ring_len, tx_pkt_dma); + ret = ice_migration_inject_dummy_desc(vf, tx_ring, tx_heads[i], tx_desc_dma); + if (ret) + goto err; + } + +err: + dma_free_coherent(dev, max_ring_len * sizeof(struct ice_tx_desc), tx_desc, tx_desc_dma); + dma_free_coherent(dev, SZ_4K, tx_pkt, tx_pkt_dma); + + return ret; +} + /** * ice_migration_restore_devstate - restore device state at dst * @pf: pointer to PF of migration device @@ -593,6 +860,16 @@ int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, msg_slot = (struct ice_migration_virtchnl_msg_slot *) ((char *)msg_slot + slot_sz); } + + /* Only do the TX Queue head restore after the rest of the device state + * is loaded successfully. + */ + ret = ice_migration_restore_tx_head(vf, devstate); + if (ret) { + dev_err(dev, "VF %d failed to restore tx head\n", vf->vf_id); + goto out_clear_replay; + } + out_clear_replay: clear_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states); out_put_vf: diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c index 7cedd0542d4b..df00defa550d 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c @@ -1341,6 +1341,23 @@ static int ice_vc_ena_qs_msg(struct ice_vf *vf, u8 *msg) continue; ice_vf_ena_txq_interrupt(vsi, vf_q_id); + + /* TX head register is a shadow copy of on-die TX head which + * maintains the accurate location. And TX head register is + * updated only after a packet is sent.
If nothing is sent + * after the queue is enabled, then the value is the one + * updated last time and is out of date. + * + * QTX_COMM_HEAD.HEAD range values from 0x1fe0 to 0x1fff are + * reserved and will never be used by HW. Manually write a + * reserved value into TX head and use this as a marker for + * the case that no packets have been sent. + * + * This marker is only used in the live migration use case. + */ + if (vf->migration_enabled) + wr32(&vsi->back->hw, QTX_COMM_HEAD(vsi->txq_map[vf_q_id]), + QTX_COMM_HEAD_HEAD_M); set_bit(vf_q_id, vf->txq_ena); } From patchwork Mon Sep 18 06:25:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388852 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5F19CD37B0 for ; Mon, 18 Sep 2023 06:29:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239851AbjIRG3V (ORCPT ); Mon, 18 Sep 2023 02:29:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46702 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239855AbjIRG2w (ORCPT ); Mon, 18 Sep 2023 02:28:52 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 51804E6; Sun, 17 Sep 2023 23:28:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018527; x=1726554527; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DGcxIaiFnxUS4K9bhSUlSz58353lNoPiha4ihUOVwcI=; b=UDadAcsleWFkoSO56i3AXwqkD9sQ31ISAWSFLFx0LqCuwBsdPLDEsEcA lY8lzcMKasUva42g1GfkwnTTvTneyjU+2McgGbwc/NKxb8csal5yZ3Kdq KsAqdPiDC92wjk7HBcSuZJ8GB2cqK6DuJyUZjeaeeObWTdgDylur3siMp 8t1AxoaoA6sZVq+yRe5lIBmuj27zTs+E1bLu1b8fFKS+4TN5zQ94NZgoc INtf40KTDsFsyl7iIU63I480a4QIehhv8iaEjEIDu6XREwyRQTPx+nEE6 waO9TV+Uhfuazilhxpf0IE2HAnrW83WJPvPLpOR2QM9LpVMM/uTd6x/Vj w==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488650" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488650" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:46 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893634" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893634" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:41 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 11/13] ice: Add device suspend function for migration Date: Mon, 18 Sep 2023 06:25:44 +0000 Message-Id: <20230918062546.40419-12-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0
Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu Device suspend handler is called by vfio driver before saving device state. Typical operation includes stopping TX/RX queue. Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- .../net/ethernet/intel/ice/ice_migration.c | 70 +++++++++++++++++++ include/linux/net/intel/ice_migration.h | 5 ++ 2 files changed, 75 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c index 3b6bb6b975f7..7cf3a28a95b0 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration.c +++ b/drivers/net/ethernet/intel/ice/ice_migration.c @@ -2,6 +2,8 @@ /* Copyright (C) 2018-2023 Intel Corporation */ #include "ice.h" +#include "ice_lib.h" +#include "ice_fltr.h" #include "ice_base.h" #include "ice_txrx_lib.h" @@ -275,6 +277,74 @@ u32 ice_migration_supported_caps(void) return VIRTCHNL_VF_MIGRATION_SUPPORT_FEATURE; } +/** + * ice_migration_suspend_dev - suspend device on src + * @pf: pointer to PF of migration device + * @vf_id: VF index of migration device + * + * Return 0 for success, negative for error + */ +int ice_migration_suspend_dev(struct ice_pf *pf, int vf_id) +{ + struct device *dev = ice_pf_to_dev(pf); + struct ice_vsi *vsi; + struct ice_vf *vf; + int ret; + + vf = ice_get_vf_by_id(pf, vf_id); + if (!vf) { + dev_err(dev, "Unable to locate VF from VF ID%d\n", vf_id); + return -EINVAL; + } + + if (!test_bit(ICE_VF_STATE_QS_ENA, vf->vf_states)) { + ice_put_vf(vf); + return 0; + } + + dev = ice_pf_to_dev(pf); + if (vf->virtchnl_msg_num > VIRTCHNL_MSG_MAX) { + dev_err(dev, "SR-IOV live migration disabled on VF %d. Migration buffer exceeded\n", + vf->vf_id); + ret = -EIO; + goto out_put_vf; + } + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + ret = -EINVAL; + goto out_put_vf; + } + + /* Prevent VSI from queuing incoming packets by removing all filters */ + ice_fltr_remove_all(vsi); + + /* MAC based filter rule is disabled at this point. 
Set MAC to zero + * to keep consistency with VF mac address info shown by ip link + */ + eth_zero_addr(vf->hw_lan_addr); + eth_zero_addr(vf->dev_lan_addr); + + ret = ice_vsi_stop_lan_tx_rings(vsi, ICE_NO_RESET, vf->vf_id); + if (ret) { + dev_err(dev, "VF %d failed to stop tx rings\n", vf->vf_id); + ret = -EIO; + goto out_put_vf; + } + ret = ice_vsi_stop_all_rx_rings(vsi); + if (ret) { + dev_err(dev, "VF %d failed to stop rx rings\n", vf->vf_id); + ret = -EIO; + goto out_put_vf; + } + +out_put_vf: + ice_put_vf(vf); + return ret; +} +EXPORT_SYMBOL(ice_migration_suspend_dev); + /** * ice_migration_save_rx_head - save rx head into device state buffer * @vf: pointer to VF structure diff --git a/include/linux/net/intel/ice_migration.h b/include/linux/net/intel/ice_migration.h index 57c0e60e21d4..494a9bd1f121 100644 --- a/include/linux/net/intel/ice_migration.h +++ b/include/linux/net/intel/ice_migration.h @@ -11,6 +11,7 @@ struct ice_pf; struct ice_pf *ice_migration_get_pf(struct pci_dev *pdev); int ice_migration_init_dev(struct ice_pf *pf, int vf_id); void ice_migration_uninit_dev(struct ice_pf *pf, int vf_id); +int ice_migration_suspend_dev(struct ice_pf *pf, int vf_id); int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_sz); int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, u64 buf_sz); @@ -22,6 +23,10 @@ static inline struct ice_pf *ice_migration_get_pf(struct pci_dev *pdev) static inline int ice_migration_init_dev(struct ice_pf *pf, int vf_id) { } static inline void ice_migration_uninit_dev(struct ice_pf *pf, int vf_id) { } +static inline int ice_migration_suspend_dev(struct ice_pf *pf, int vf_id) +{ + return 0; +} static inline int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_sz) { return 0; From patchwork Mon Sep 18 06:25:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5255CD13D1 for ; Mon, 18 Sep 2023 06:29:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239885AbjIRG3Y (ORCPT ); Mon, 18 Sep 2023 02:29:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239901AbjIRG27 (ORCPT ); Mon, 18 Sep 2023 02:28:59 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5885F12C; Sun, 17 Sep 2023 23:28:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018531; x=1726554531; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ld+ZJ5057Q+8KA+jeihl1t+bwCq6BgWp+IhBcyLnv5A=; b=a5+FjbN+2VhbjqNPqwO8pi2lJYJyeTPQVk17rR4rsM2S3ZdHzbNKVlLf AoeG/g/fv65Tvs5Speu5VLxIfd2Gsiay0eAnTZUXgNaXvy/tSgsgghds5 U6s+OReV7IjABCLNkR6vG/ytVMVlMR13ZDSiyVV1j8wVSS7G5/CTclXXD oYH4U6BylQI6jT1M36RnRC+zxBS2F1vch7u/aFGglXxp3DY76yVGYMEjk Pbbzb18l4Ft83FbY+QsCKS05dLl/J0aF5Mb5WCmrcmXCgYaIZdOmOFB1s +gRQTo2TeXU8+WkKHGMoyQWo2ll/SORtG/zH0yzdeSuDdqryMBrM+wf8v w==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488668" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; 
d="scan'208";a="378488668" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="815893642" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893642" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:46 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 12/13] ice: Save and restore mmio registers Date: Mon, 18 Sep 2023 06:25:45 +0000 Message-Id: <20230918062546.40419-13-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In the E800 device model, the VF takes direct control over the context of the AdminQ, irq ctrl, TX tail and RX tail by accessing VF pci mmio. The rest of the state can only be set up by the PF: the VF sends its configuration to the PF through virtual channel messages and the PF programs the remaining state. To migrate the AdminQ/irq context successfully, only the AdminQ/irq registers need to be restored; the rest, such as generic msix, is handled by the migration stack. To migrate the RX dma ring successfully, the RX ring base, length (set up via virtual channel messages) and tail register (set up via VF pci mmio) must be restored before the RX queue is enabled. To migrate the TX dma ring successfully, the TX ring base and length (set up via virtual channel messages) must be restored before the TX queue is enabled, while the TX tail (set up via VF pci mmio) doesn't need to be restored since the TX queue is drained before migration and the TX tail is stateless. For simplicity, just restore all the VF pci mmio before the virtual channel messages are replayed so that all the TX/RX ring context is restored before the queues are enabled. However, there are 2 corner cases which need to be taken care of: - During device suspension, the irq registers may be dirtied when stopping the queues. So save the irq registers into an internal pre-saved area before the queues are stopped and fetch the pre-saved values at the device saving stage. - When the PF processes the virtual channel message VIRTCHNL_OP_CONFIG_VSI_QUEUES, the irq registers may be dirtied. So restore the affected irq registers after the virtual channel messages are replayed.
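The resulting ordering can be sketched as follows (illustrative only; the empty stubs are placeholders standing in for the handlers added by this patch, not their real names):

/* Illustrative-only ordering of the register save/restore in this patch. */
static void save_dirtied_irq_regs(void) { }	/* copy irq regs to a pre-saved area */
static void stop_tx_rx_queues(void) { }		/* may dirty the irq regs */
static void snapshot_mmio_regs(void) { }	/* irq values come from the pre-saved copy */
static void write_mmio_regs(void) { }		/* includes RX tail, done before replay */
static void replay_virtchnl_msgs(void) { }	/* may dirty the irq regs again */
static void rewrite_irq_regs(void) { }		/* restore them once more after replay */

static void suspend_then_save(void)
{
	save_dirtied_irq_regs();
	stop_tx_rx_queues();
	snapshot_mmio_regs();
}

static void restore(void)
{
	write_mmio_regs();
	replay_virtchnl_msgs();
	rewrite_irq_regs();
}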
Signed-off-by: Yahui Cao --- .../net/ethernet/intel/ice/ice_hw_autogen.h | 8 + .../net/ethernet/intel/ice/ice_migration.c | 304 ++++++++++++++++++ .../intel/ice/ice_migration_private.h | 7 + drivers/net/ethernet/intel/ice/ice_vf_lib.h | 2 + 4 files changed, 321 insertions(+) diff --git a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h index 67d8332d92f6..3ce8503c482f 100644 --- a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h +++ b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h @@ -31,8 +31,16 @@ #define PF_FW_ATQLEN_ATQVFE_M BIT(28) #define PF_FW_ATQLEN_ATQOVFL_M BIT(29) #define PF_FW_ATQLEN_ATQCRIT_M BIT(30) +#define VF_MBX_ARQBAH(_VF) (0x0022B800 + ((_VF) * 4)) +#define VF_MBX_ARQBAL(_VF) (0x0022B400 + ((_VF) * 4)) +#define VF_MBX_ARQH(_VF) (0x0022C000 + ((_VF) * 4)) #define VF_MBX_ARQLEN(_VF) (0x0022BC00 + ((_VF) * 4)) +#define VF_MBX_ARQT(_VF) (0x0022C400 + ((_VF) * 4)) +#define VF_MBX_ATQBAH(_VF) (0x0022A400 + ((_VF) * 4)) +#define VF_MBX_ATQBAL(_VF) (0x0022A000 + ((_VF) * 4)) +#define VF_MBX_ATQH(_VF) (0x0022AC00 + ((_VF) * 4)) #define VF_MBX_ATQLEN(_VF) (0x0022A800 + ((_VF) * 4)) +#define VF_MBX_ATQT(_VF) (0x0022B000 + ((_VF) * 4)) #define PF_FW_ATQLEN_ATQENABLE_M BIT(31) #define PF_FW_ATQT 0x00080400 #define PF_MBX_ARQBAH 0x0022E400 diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c index 7cf3a28a95b0..9f8e88108932 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration.c +++ b/drivers/net/ethernet/intel/ice/ice_migration.c @@ -25,6 +25,27 @@ struct ice_migration_virtchnl_msg_listnode { struct ice_migration_virtchnl_msg_slot msg_slot; }; +struct ice_migration_mmio_regs { + /* VF Interrupts */ + u32 int_dyn_ctl[ICE_MIG_VF_MSIX_MAX]; + u32 int_intr[ICE_MIG_VF_ITR_NUM][ICE_MIG_VF_MSIX_MAX]; + + /* VF Control Queues */ + u32 asq_bal; + u32 asq_bah; + u32 asq_len; + u32 asq_head; + u32 asq_tail; + u32 arq_bal; + u32 arq_bah; + u32 arq_len; + u32 arq_head; + u32 arq_tail; + + /* VF LAN RX */ + u32 rx_tail[ICE_MIG_VF_QRX_TAIL_MAX]; +}; + struct ice_migration_dev_state { u32 magic; u32 version; @@ -33,6 +54,7 @@ struct ice_migration_dev_state { u16 num_txq; u16 num_rxq; + struct ice_migration_mmio_regs regs; u16 vsi_id; /* next RX desc index to be processed by the device */ u16 rx_head[ICE_MIG_VF_QRX_TAIL_MAX]; @@ -277,6 +299,57 @@ u32 ice_migration_supported_caps(void) return VIRTCHNL_VF_MIGRATION_SUPPORT_FEATURE; } +/** + * ice_migration_save_dirtied_regs - save registers which may be dirtied + * @vf: pointer to VF structure + * @devstate: pointer to migration device state + * + * Return 0 for success, negative for error + */ +static int ice_migration_save_dirtied_regs(struct ice_vf *vf) +{ + struct ice_migration_dirtied_regs *dirtied_regs = &vf->dirtied_regs; + struct device *dev = ice_pf_to_dev(vf->pf); + struct ice_hw *hw = &vf->pf->hw; + struct ice_vsi *vsi; + int itr, v_id; + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + if (WARN_ON_ONCE(vsi->num_q_vectors + ICE_NONQ_VECS_VF > ICE_MIG_VF_MSIX_MAX)) + return -EINVAL; + + /* Save Mailbox Q vectors */ + dirtied_regs->int_dyn_ctl[0] = + rd32(hw, GLINT_DYN_CTL(vf->first_vector_idx)); + for (itr = 0; itr < ICE_MIG_VF_ITR_NUM; itr++) + dirtied_regs->int_intr[itr][0] = + rd32(hw, GLINT_ITR(itr, vf->first_vector_idx)); + + /* Save Data Q vectors */ + for (v_id = 0; v_id < vsi->num_q_vectors; v_id++) { + int irq = v_id + ICE_NONQ_VECS_VF; + struct ice_q_vector 
*q_vector; + + q_vector = vsi->q_vectors[v_id]; + if (!q_vector) { + dev_err(dev, "VF %d invalid q vectors\n", vf->vf_id); + return -EINVAL; + } + dirtied_regs->int_dyn_ctl[irq] = + rd32(hw, GLINT_DYN_CTL(q_vector->reg_idx)); + for (itr = 0; itr < ICE_MIG_VF_ITR_NUM; itr++) + dirtied_regs->int_intr[itr][irq] = + rd32(hw, GLINT_ITR(itr, q_vector->reg_idx)); + } + + return 0; +} + /** * ice_migration_suspend_dev - suspend device on src * @pf: pointer to PF of migration device @@ -326,6 +399,15 @@ int ice_migration_suspend_dev(struct ice_pf *pf, int vf_id) eth_zero_addr(vf->hw_lan_addr); eth_zero_addr(vf->dev_lan_addr); + /* Irq register may be dirtied when stopping queue. So save irq + * register into pre-saved area before queue is stopped. + */ + ret = ice_migration_save_dirtied_regs(vf); + if (ret) { + dev_err(dev, "VF %d failed to save dirtied register copy\n", + vf->vf_id); + goto out_put_vf; + } ret = ice_vsi_stop_lan_tx_rings(vsi, ICE_NO_RESET, vf->vf_id); if (ret) { dev_err(dev, "VF %d failed to stop tx rings\n", vf->vf_id); @@ -449,6 +531,83 @@ ice_migration_save_tx_head(struct ice_vf *vf, return 0; } +/** + * ice_migration_save_regs - save mmio registers in migration region + * @vf: pointer to VF structure + * @devstate: pointer to migration device state + * + * Return 0 for success, negative for error + */ +static int +ice_migration_save_regs(struct ice_vf *vf, + struct ice_migration_dev_state *devstate) +{ + struct ice_migration_dirtied_regs *dirtied_regs = &vf->dirtied_regs; + struct ice_migration_mmio_regs *regs = &devstate->regs; + struct device *dev = ice_pf_to_dev(vf->pf); + struct ice_hw *hw = &vf->pf->hw; + struct ice_vsi *vsi; + int i, itr, v_id; + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + if (WARN_ON_ONCE(vsi->num_q_vectors + ICE_NONQ_VECS_VF > ICE_MIG_VF_MSIX_MAX)) + return -EINVAL; + + /* For irq registers which may be dirtied when virtual channel message + * VIRTCHNL_OP_CONFIG_VSI_QUEUES is processed, load values from + * pre-saved area. 
+ */ + + /* Save Mailbox Q vectors */ + regs->int_dyn_ctl[0] = dirtied_regs->int_dyn_ctl[0]; + for (itr = 0; itr < ICE_MIG_VF_ITR_NUM; itr++) + regs->int_intr[itr][0] = dirtied_regs->int_intr[itr][0]; + + /* Save Data Q vectors */ + for (v_id = 0; v_id < vsi->num_q_vectors; v_id++) { + int irq = v_id + ICE_NONQ_VECS_VF; + struct ice_q_vector *q_vector; + + q_vector = vsi->q_vectors[v_id]; + if (!q_vector) { + dev_err(dev, "VF %d invalid q vectors\n", vf->vf_id); + return -EINVAL; + } + regs->int_dyn_ctl[irq] = dirtied_regs->int_dyn_ctl[irq]; + for (itr = 0; itr < ICE_MIG_VF_ITR_NUM; itr++) + regs->int_intr[itr][irq] = + dirtied_regs->int_intr[itr][irq]; + } + + regs->asq_bal = rd32(hw, VF_MBX_ATQBAL(vf->vf_id)); + regs->asq_bah = rd32(hw, VF_MBX_ATQBAH(vf->vf_id)); + regs->asq_len = rd32(hw, VF_MBX_ATQLEN(vf->vf_id)); + regs->asq_head = rd32(hw, VF_MBX_ATQH(vf->vf_id)); + regs->asq_tail = rd32(hw, VF_MBX_ATQT(vf->vf_id)); + regs->arq_bal = rd32(hw, VF_MBX_ARQBAL(vf->vf_id)); + regs->arq_bah = rd32(hw, VF_MBX_ARQBAH(vf->vf_id)); + regs->arq_len = rd32(hw, VF_MBX_ARQLEN(vf->vf_id)); + regs->arq_head = rd32(hw, VF_MBX_ARQH(vf->vf_id)); + regs->arq_tail = rd32(hw, VF_MBX_ARQT(vf->vf_id)); + + ice_for_each_rxq(vsi, i) { + struct ice_rx_ring *rx_ring = vsi->rx_rings[i]; + + regs->rx_tail[i] = 0; + if (!test_bit(i, vf->rxq_ena)) + continue; + + regs->rx_tail[i] = rd32(hw, QRX_TAIL(rx_ring->reg_idx)); + } + + return 0; +} + /** * ice_migration_save_devstate - save device state to migration buffer * @pf: pointer to PF of migration device @@ -502,6 +661,12 @@ int ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_s buf = devstate->virtchnl_msgs; devstate->vsi_id = vf->vm_vsi_num; + ret = ice_migration_save_regs(vf, devstate); + if (ret) { + dev_err(dev, "VF %d failed to save mmio registers\n", vf->vf_id); + goto out_put_vf; + } + ret = ice_migration_save_rx_head(vf, devstate); if (ret) { dev_err(dev, "VF %d failed to save rxq head\n", vf->vf_id); @@ -854,6 +1019,122 @@ ice_migration_restore_tx_head(struct ice_vf *vf, return ret; } +/** + * ice_migration_restore_regs - restore mmio registers from device state buffer + * @vf: pointer to VF structure + * @devstate: pointer to migration device state + * + * Return 0 for success, negative for error + */ +static int +ice_migration_restore_regs(struct ice_vf *vf, + struct ice_migration_dev_state *devstate) +{ + struct ice_migration_mmio_regs *regs = &devstate->regs; + struct device *dev = ice_pf_to_dev(vf->pf); + struct ice_hw *hw = &vf->pf->hw; + struct ice_vsi *vsi; + int i, itr, v_id; + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + if (WARN_ON_ONCE(vsi->num_q_vectors + ICE_NONQ_VECS_VF > ICE_MIG_VF_MSIX_MAX)) + return -EINVAL; + + /* Restore Mailbox Q vectors */ + wr32(hw, GLINT_DYN_CTL(vf->first_vector_idx), regs->int_dyn_ctl[0]); + for (itr = 0; itr < ICE_MIG_VF_ITR_NUM; itr++) + wr32(hw, GLINT_ITR(itr, vf->first_vector_idx), regs->int_intr[itr][0]); + + /* Restore Data Q vectors */ + for (v_id = 0; v_id < vsi->num_q_vectors; v_id++) { + int irq = v_id + ICE_NONQ_VECS_VF; + struct ice_q_vector *q_vector; + + q_vector = vsi->q_vectors[v_id]; + if (!q_vector) { + dev_err(dev, "VF %d invalid q vectors\n", vf->vf_id); + return -EINVAL; + } + wr32(hw, GLINT_DYN_CTL(q_vector->reg_idx), + regs->int_dyn_ctl[irq]); + for (itr = 0; itr < ICE_MIG_VF_ITR_NUM; itr++) + wr32(hw, GLINT_ITR(itr, q_vector->reg_idx), + regs->int_intr[itr][irq]); + } + + wr32(hw, 
VF_MBX_ATQBAL(vf->vf_id), regs->asq_bal); + wr32(hw, VF_MBX_ATQBAH(vf->vf_id), regs->asq_bah); + wr32(hw, VF_MBX_ATQLEN(vf->vf_id), regs->asq_len); + wr32(hw, VF_MBX_ATQH(vf->vf_id), regs->asq_head); + /* Since the mailbox control tx queue tail is bumped by the VF driver + * to notify HW to send packets, VF_MBX_ATQT does not need to be + * restored here. + */ + wr32(hw, VF_MBX_ARQBAL(vf->vf_id), regs->arq_bal); + wr32(hw, VF_MBX_ARQBAH(vf->vf_id), regs->arq_bah); + wr32(hw, VF_MBX_ARQLEN(vf->vf_id), regs->arq_len); + wr32(hw, VF_MBX_ARQH(vf->vf_id), regs->arq_head); + wr32(hw, VF_MBX_ARQT(vf->vf_id), regs->arq_tail); + + ice_for_each_rxq(vsi, i) { + struct ice_rx_ring *rx_ring = vsi->rx_rings[i]; + + wr32(hw, QRX_TAIL(rx_ring->reg_idx), regs->rx_tail[i]); + } + + return 0; +} + +/** + * ice_migration_restore_dirtied_regs - restore registers which may be dirtied + * @vf: pointer to VF structure + * @devstate: pointer to migration device state + * + * Return 0 for success, negative for error + */ +static int +ice_migration_restore_dirtied_regs(struct ice_vf *vf, + struct ice_migration_dev_state *devstate) +{ + struct ice_migration_mmio_regs *regs = &devstate->regs; + struct device *dev = ice_pf_to_dev(vf->pf); + struct ice_hw *hw = &vf->pf->hw; + struct ice_vsi *vsi; + int itr, v_id; + + vsi = ice_get_vf_vsi(vf); + if (!vsi) { + dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id); + return -EINVAL; + } + + if (WARN_ON_ONCE(vsi->num_q_vectors + ICE_NONQ_VECS_VF > ICE_MIG_VF_MSIX_MAX)) + return -EINVAL; + + /* Restore Data Q vectors */ + for (v_id = 0; v_id < vsi->num_q_vectors; v_id++) { + int irq = v_id + ICE_NONQ_VECS_VF; + struct ice_q_vector *q_vector; + + q_vector = vsi->q_vectors[v_id]; + if (!q_vector) { + dev_err(dev, "VF %d invalid q vectors\n", vf->vf_id); + return -EINVAL; + } + wr32(hw, GLINT_DYN_CTL(q_vector->reg_idx), + regs->int_dyn_ctl[irq]); + for (itr = 0; itr < ICE_MIG_VF_ITR_NUM; itr++) + wr32(hw, GLINT_ITR(itr, q_vector->reg_idx), + regs->int_intr[itr][irq]); + } + + return 0; +} + /** * ice_migration_restore_devstate - restore device state at dst * @pf: pointer to PF of migration device @@ -890,6 +1171,18 @@ int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, devstate = (struct ice_migration_dev_state *)buf; vf->vm_vsi_num = devstate->vsi_id; dev_dbg(dev, "VF %d vm vsi num is:%d\n", vf->vf_id, vf->vm_vsi_num); + + /* RX tail register must be restored before queue is enabled. For + * simplicity, just restore all the mmio before virtual channel messages + * are replayed. + */ + ret = ice_migration_restore_regs(vf, devstate); + if (ret) { + dev_err(dev, "VF %d failed to restore mmio registers\n", + vf->vf_id); + goto out_put_vf; + } + msg_slot = (struct ice_migration_virtchnl_msg_slot *)devstate->virtchnl_msgs; set_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states); @@ -940,6 +1233,17 @@ int ice_migration_restore_devstate(struct ice_pf *pf, int vf_id, const u8 *buf, goto out_clear_replay; } + /* When PF processes virtual channel VIRTCHNL_OP_CONFIG_VSI_QUEUES, irq + * register may be dirtied. So restore the affected irq register again + * after virtual channel messages are replayed.
+ */ + ret = ice_migration_restore_dirtied_regs(vf, devstate); + if (ret) { + dev_err(dev, "VF %d failed to restore dirtied registers\n", + vf->vf_id); + goto out_clear_replay; + } + out_clear_replay: clear_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states); out_put_vf: diff --git a/drivers/net/ethernet/intel/ice/ice_migration_private.h b/drivers/net/ethernet/intel/ice/ice_migration_private.h index af70025f2f36..c5bbe35a0d1f 100644 --- a/drivers/net/ethernet/intel/ice/ice_migration_private.h +++ b/drivers/net/ethernet/intel/ice/ice_migration_private.h @@ -10,6 +10,13 @@ * in ice-vfio-pic.ko should be exposed as part of ice_migration.h. */ +#define ICE_MIG_VF_MSIX_MAX 65 +#define ICE_MIG_VF_ITR_NUM 4 +struct ice_migration_dirtied_regs { + u32 int_dyn_ctl[ICE_MIG_VF_MSIX_MAX]; + u32 int_intr[ICE_MIG_VF_ITR_NUM][ICE_MIG_VF_MSIX_MAX]; +}; + #if IS_ENABLED(CONFIG_ICE_VFIO_PCI) void ice_migration_init_vf(struct ice_vf *vf); void ice_migration_uninit_vf(struct ice_vf *vf); diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h index e37c3b0ecc06..f5cc3844fbbd 100644 --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h @@ -14,6 +14,7 @@ #include "ice_type.h" #include "ice_virtchnl_fdir.h" #include "ice_vsi_vlan_ops.h" +#include "ice_migration_private.h" #define ICE_MAX_SRIOV_VFS 256 @@ -144,6 +145,7 @@ struct ice_vf { u64 virtchnl_msg_size; u32 virtchnl_retval; u16 vm_vsi_num; + struct ice_migration_dirtied_regs dirtied_regs; }; /* Flags for controlling behavior of ice_reset_vf */ From patchwork Mon Sep 18 06:25:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Cao, Yahui" X-Patchwork-Id: 13388854 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F3D75CD13D2 for ; Mon, 18 Sep 2023 06:29:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239893AbjIRG30 (ORCPT ); Mon, 18 Sep 2023 02:29:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52840 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239994AbjIRG3L (ORCPT ); Mon, 18 Sep 2023 02:29:11 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 274C1196; Sun, 17 Sep 2023 23:28:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695018537; x=1726554537; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BYqtARKbdwN0UZ9MRl8Yh4ddeA5cCMaKnoseZlIqpBs=; b=EhzBBh77jTEAjC1EPptAs4yPwJwEk0J3YxfqdGZ5yqaO9RGo372KcEgY MWFYJ3Vh+pWBvKnJ+CaF+rgRepc008jkyU3h3Aj8ZA5q51FNYszlUbP0m pYjUD1uRcivXXdbRHSwgWnaNUhG5aalO4/r4FRhjXpSws9CDEBZ44oqvI ZCse55XE9zh4v561nJ0IPYl/OaKOuw2J/KJ6MrDLUj/+grp+8N6Q+fMLG VvLr9wmyBcna4CALXfV0Z0VDQTYLprd3y1OMlpOuboFHGiuiKMFoFo2Ep FKeqZUwAmjBlynrvG5W3+kC9IHJySqhGgCSgEvLMc74eanO0SaaezFT0N A==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="378488682" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="378488682" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Sep 2023 23:28:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: 
E=McAfee;i="6600,9927,10836"; a="815893685" X-IronPort-AV: E=Sophos;i="6.02,155,1688454000"; d="scan'208";a="815893685" Received: from dpdk-yahui-icx1.sh.intel.com ([10.67.111.186]) by fmsmga004.fm.intel.com with ESMTP; 17 Sep 2023 23:28:51 -0700 From: Yahui Cao To: intel-wired-lan@lists.osuosl.org Cc: kvm@vger.kernel.org, netdev@vger.kernel.org, lingyu.liu@intel.com, kevin.tian@intel.com, madhu.chittim@intel.com, sridhar.samudrala@intel.com, alex.williamson@redhat.com, jgg@nvidia.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, brett.creeley@amd.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Subject: [PATCH iwl-next v3 13/13] vfio/ice: Implement vfio_pci driver for E800 devices Date: Mon, 18 Sep 2023 06:25:46 +0000 Message-Id: <20230918062546.40419-14-yahui.cao@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230918062546.40419-1-yahui.cao@intel.com> References: <20230918062546.40419-1-yahui.cao@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lingyu Liu Add a vendor-specific vfio_pci driver for E800 devices. It uses vfio_pci_core to register to the VFIO subsystem and then implements the E800 specific logic to support VF live migration. It implements the device state transition flow for live migration. Signed-off-by: Lingyu Liu Signed-off-by: Yahui Cao --- MAINTAINERS | 7 + drivers/vfio/pci/Kconfig | 2 + drivers/vfio/pci/Makefile | 2 + drivers/vfio/pci/ice/Kconfig | 10 + drivers/vfio/pci/ice/Makefile | 4 + drivers/vfio/pci/ice/ice_vfio_pci.c | 707 ++++++++++++++++++++++++++++ 6 files changed, 732 insertions(+) create mode 100644 drivers/vfio/pci/ice/Kconfig create mode 100644 drivers/vfio/pci/ice/Makefile create mode 100644 drivers/vfio/pci/ice/ice_vfio_pci.c diff --git a/MAINTAINERS b/MAINTAINERS index 389fe9e38884..09ea8454219a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -22608,6 +22608,13 @@ L: kvm@vger.kernel.org S: Maintained F: drivers/vfio/pci/mlx5/ +VFIO ICE PCI DRIVER +M: Yahui Cao +M: Lingyu Liu +L: kvm@vger.kernel.org +S: Maintained +F: drivers/vfio/pci/ice/ + VFIO PCI DEVICE SPECIFIC DRIVERS R: Jason Gunthorpe R: Yishai Hadas diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig index 8125e5f37832..6618208947af 100644 --- a/drivers/vfio/pci/Kconfig +++ b/drivers/vfio/pci/Kconfig @@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig" source "drivers/vfio/pci/pds/Kconfig" +source "drivers/vfio/pci/ice/Kconfig" + endmenu diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile index 45167be462d8..fc1df82df3ac 100644 --- a/drivers/vfio/pci/Makefile +++ b/drivers/vfio/pci/Makefile @@ -13,3 +13,5 @@ obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5/ obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/ obj-$(CONFIG_PDS_VFIO_PCI) += pds/ + +obj-$(CONFIG_ICE_VFIO_PCI) += ice/ diff --git a/drivers/vfio/pci/ice/Kconfig b/drivers/vfio/pci/ice/Kconfig new file mode 100644 index 000000000000..4c6f348d3062 --- /dev/null +++ b/drivers/vfio/pci/ice/Kconfig @@ -0,0 +1,10 @@ +# SPDX-License-Identifier: GPL-2.0-only +config ICE_VFIO_PCI + tristate "VFIO support for Intel(R) Ethernet Connection E800 Series" + depends on ICE + depends on VFIO_PCI_CORE + help + This provides migration support for Intel(R) Ethernet connection E800 + series devices using the VFIO framework. + + If you don't know what to do here, say N. 
diff --git a/drivers/vfio/pci/ice/Makefile b/drivers/vfio/pci/ice/Makefile new file mode 100644 index 000000000000..259d4ab89105 --- /dev/null +++ b/drivers/vfio/pci/ice/Makefile @@ -0,0 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0-only +obj-$(CONFIG_ICE_VFIO_PCI) += ice-vfio-pci.o +ice-vfio-pci-y := ice_vfio_pci.o + diff --git a/drivers/vfio/pci/ice/ice_vfio_pci.c b/drivers/vfio/pci/ice/ice_vfio_pci.c new file mode 100644 index 000000000000..60a0582d7932 --- /dev/null +++ b/drivers/vfio/pci/ice/ice_vfio_pci.c @@ -0,0 +1,707 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2018-2023 Intel Corporation */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define DRIVER_DESC "ICE VFIO PCI - User Level meta-driver for Intel E800 device family" + +struct ice_vfio_pci_migration_file { + struct file *filp; + struct mutex lock; /* protect migration file access */ + bool disabled; + + u8 mig_data[SZ_128K]; + size_t total_length; +}; + +struct ice_vfio_pci_core_device { + struct vfio_pci_core_device core_device; + u8 deferred_reset:1; + struct mutex state_mutex; /* protect migration state */ + enum vfio_device_mig_state mig_state; + /* protect the reset_done flow */ + spinlock_t reset_lock; + struct ice_vfio_pci_migration_file *resuming_migf; + struct ice_vfio_pci_migration_file *saving_migf; + struct vfio_device_migration_info mig_info; + u8 *mig_data; + u8 __iomem *io_base; + struct ice_pf *pf; + int vf_id; +}; + +/** + * ice_vfio_pci_load_state - VFIO device state reloading + * @ice_vdev: pointer to ice vfio pci core device structure + * + * Load device state and restore it. This function is called when the VFIO uAPI + * consumer wants to load the device state info from the VFIO migration region + * and restore it into the device. This function should make sure all the device + * state info is loaded and restored successfully, so the return value must be + * checked. + * + * Return 0 for success, negative value for failure. + */ +static int __must_check +ice_vfio_pci_load_state(struct ice_vfio_pci_core_device *ice_vdev) +{ + struct ice_vfio_pci_migration_file *migf = ice_vdev->resuming_migf; + + return ice_migration_restore_devstate(ice_vdev->pf, + ice_vdev->vf_id, + migf->mig_data, + SZ_128K); +} + +/** + * ice_vfio_pci_save_state - VFIO device state saving + * @ice_vdev: pointer to ice vfio pci core device structure + * @migf: pointer to migration file + * + * Snapshot the device state and save it. This function is called when the + * VFIO uAPI consumer wants to snapshot the current device state and save + * it into the VFIO migration region. This function should make sure all + * of the device state info is collected and saved successfully, so the + * return value must be checked. + * + * Return 0 for success, negative value for failure.
+ */ +static int __must_check +ice_vfio_pci_save_state(struct ice_vfio_pci_core_device *ice_vdev, + struct ice_vfio_pci_migration_file *migf) +{ + migf->total_length = SZ_128K; + + return ice_migration_save_devstate(ice_vdev->pf, + ice_vdev->vf_id, + migf->mig_data, + SZ_128K); +} + +/** + * ice_vfio_migration_init - Initialization for live migration function + * @ice_vdev: pointer to ice vfio pci core device structure + * + * Returns 0 on success, negative value on error + */ +static int ice_vfio_migration_init(struct ice_vfio_pci_core_device *ice_vdev) +{ + struct pci_dev *pdev = ice_vdev->core_device.pdev; + int ret = 0; + + ice_vdev->pf = ice_migration_get_pf(pdev); + if (!ice_vdev->pf) + return -EFAULT; + + ice_vdev->vf_id = pci_iov_vf_id(pdev); + if (ice_vdev->vf_id < 0) + return -EINVAL; + + ret = ice_migration_init_dev(ice_vdev->pf, ice_vdev->vf_id); + if (ret) + return ret; + + ice_vdev->io_base = (u8 __iomem *)pci_iomap(pdev, 0, 0); + if (!ice_vdev->io_base) + return -EFAULT; + + return 0; +} + +/** + * ice_vfio_migration_uninit - Cleanup for live migration function + * @ice_vdev: pointer to ice vfio pci core device structure + */ +static void ice_vfio_migration_uninit(struct ice_vfio_pci_core_device *ice_vdev) +{ + pci_iounmap(ice_vdev->core_device.pdev, ice_vdev->io_base); + ice_migration_uninit_dev(ice_vdev->pf, ice_vdev->vf_id); +} + +/** + * ice_vfio_pci_disable_fd - Close migration file + * @migf: pointer to ice vfio pci migration file + */ +static void ice_vfio_pci_disable_fd(struct ice_vfio_pci_migration_file *migf) +{ + mutex_lock(&migf->lock); + migf->disabled = true; + migf->total_length = 0; + migf->filp->f_pos = 0; + mutex_unlock(&migf->lock); +} + +/** + * ice_vfio_pci_disable_fds - Close migration files of ice vfio pci device + * @ice_vdev: pointer to ice vfio pci core device structure + */ +static void ice_vfio_pci_disable_fds(struct ice_vfio_pci_core_device *ice_vdev) +{ + if (ice_vdev->resuming_migf) { + ice_vfio_pci_disable_fd(ice_vdev->resuming_migf); + fput(ice_vdev->resuming_migf->filp); + ice_vdev->resuming_migf = NULL; + } + if (ice_vdev->saving_migf) { + ice_vfio_pci_disable_fd(ice_vdev->saving_migf); + fput(ice_vdev->saving_migf->filp); + ice_vdev->saving_migf = NULL; + } +} + +/* + * This function is called in all state_mutex unlock cases to + * handle a 'deferred_reset' if exists. + * @ice_vdev: pointer to ice vfio pci core device structure + */ +static void +ice_vfio_pci_state_mutex_unlock(struct ice_vfio_pci_core_device *ice_vdev) +{ +again: + spin_lock(&ice_vdev->reset_lock); + if (ice_vdev->deferred_reset) { + ice_vdev->deferred_reset = false; + spin_unlock(&ice_vdev->reset_lock); + ice_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING; + ice_vfio_pci_disable_fds(ice_vdev); + goto again; + } + mutex_unlock(&ice_vdev->state_mutex); + spin_unlock(&ice_vdev->reset_lock); +} + +static void ice_vfio_pci_reset_done(struct pci_dev *pdev) +{ + struct ice_vfio_pci_core_device *ice_vdev = + (struct ice_vfio_pci_core_device *)dev_get_drvdata(&pdev->dev); + + /* + * As the higher VFIO layers are holding locks across reset and using + * those same locks with the mm_lock we need to prevent ABBA deadlock + * with the state_mutex and mm_lock. + * In case the state_mutex was taken already we defer the cleanup work + * to the unlock flow of the other running context. 
+ */ + spin_lock(&ice_vdev->reset_lock); + ice_vdev->deferred_reset = true; + if (!mutex_trylock(&ice_vdev->state_mutex)) { + spin_unlock(&ice_vdev->reset_lock); + return; + } + spin_unlock(&ice_vdev->reset_lock); + ice_vfio_pci_state_mutex_unlock(ice_vdev); +} + +/** + * ice_vfio_pci_open_device - Called when a vfio device is probed by VFIO UAPI + * @core_vdev: the vfio device to open + * + * Initialization of the vfio device + * + * Returns 0 on success, negative value on error + */ +static int ice_vfio_pci_open_device(struct vfio_device *core_vdev) +{ + struct ice_vfio_pci_core_device *ice_vdev = container_of(core_vdev, + struct ice_vfio_pci_core_device, core_device.vdev); + struct vfio_pci_core_device *vdev = &ice_vdev->core_device; + int ret; + + ret = vfio_pci_core_enable(vdev); + if (ret) + return ret; + + ret = ice_vfio_migration_init(ice_vdev); + if (ret) { + vfio_pci_core_disable(vdev); + return ret; + } + ice_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING; + vfio_pci_core_finish_enable(vdev); + + return 0; +} + +/** + * ice_vfio_pci_close_device - Called when a vfio device fd is closed + * @core_vdev: the vfio device to close + */ +static void ice_vfio_pci_close_device(struct vfio_device *core_vdev) +{ + struct ice_vfio_pci_core_device *ice_vdev = container_of(core_vdev, + struct ice_vfio_pci_core_device, core_device.vdev); + + ice_vfio_pci_disable_fds(ice_vdev); + vfio_pci_core_close_device(core_vdev); + ice_vfio_migration_uninit(ice_vdev); +} + +/** + * ice_vfio_pci_release_file - release ice vfio pci migration file + * @inode: pointer to inode + * @filp: pointer to the file to release + * + * Return 0 for success, negative for error + */ +static int ice_vfio_pci_release_file(struct inode *inode, struct file *filp) +{ + struct ice_vfio_pci_migration_file *migf = filp->private_data; + + ice_vfio_pci_disable_fd(migf); + mutex_destroy(&migf->lock); + kfree(migf); + return 0; +} + +/** + * ice_vfio_pci_save_read - save migration file data to user space + * @filp: pointer to migration file + * @buf: pointer to user space buffer + * @len: data length to be saved + * @pos: should be 0 + * + * Return len of saved data, negative for error + */ +static ssize_t ice_vfio_pci_save_read(struct file *filp, char __user *buf, + size_t len, loff_t *pos) +{ + struct ice_vfio_pci_migration_file *migf = filp->private_data; + loff_t *off = &filp->f_pos; + ssize_t done = 0; + int ret; + + if (pos) + return -ESPIPE; + + mutex_lock(&migf->lock); + if (*off > migf->total_length) { + done = -EINVAL; + goto out_unlock; + } + + if (migf->disabled) { + done = -ENODEV; + goto out_unlock; + } + + len = min_t(size_t, migf->total_length - *off, len); + if (len) { + ret = copy_to_user(buf, migf->mig_data + *off, len); + if (ret) { + done = -EFAULT; + goto out_unlock; + } + *off += len; + done = len; + } +out_unlock: + mutex_unlock(&migf->lock); + return done; +} + +static const struct file_operations ice_vfio_pci_save_fops = { + .owner = THIS_MODULE, + .read = ice_vfio_pci_save_read, + .release = ice_vfio_pci_release_file, + .llseek = no_llseek, +}; + +/** + * ice_vfio_pci_stop_copy - create migration file and save migration state to it + * @ice_vdev: pointer to ice vfio pci core device structure + * + * Return migration file handler + */ +static struct ice_vfio_pci_migration_file * +ice_vfio_pci_stop_copy(struct ice_vfio_pci_core_device *ice_vdev) +{ + struct ice_vfio_pci_migration_file *migf; + int ret; + + migf = kzalloc(sizeof(*migf), GFP_KERNEL); + if (!migf) + return ERR_PTR(-ENOMEM); + + migf->filp = 
anon_inode_getfile("ice_vfio_pci_mig", + &ice_vfio_pci_save_fops, migf, + O_RDONLY); + if (IS_ERR(migf->filp)) { + int err = PTR_ERR(migf->filp); + + kfree(migf); + return ERR_PTR(err); + } + + stream_open(migf->filp->f_inode, migf->filp); + mutex_init(&migf->lock); + + ret = ice_vfio_pci_save_state(ice_vdev, migf); + if (ret) { + fput(migf->filp); + kfree(migf); + return ERR_PTR(ret); + } + + return migf; +} + +/** + * ice_vfio_pci_resume_write- copy migration file data from user space + * @filp: pointer to migration file + * @buf: pointer to user space buffer + * @len: data length to be copied + * @pos: should be 0 + * + * Return len of saved data, negative for error + */ +static ssize_t +ice_vfio_pci_resume_write(struct file *filp, const char __user *buf, + size_t len, loff_t *pos) +{ + struct ice_vfio_pci_migration_file *migf = filp->private_data; + loff_t *off = &filp->f_pos; + loff_t requested_length; + ssize_t done = 0; + int ret; + + if (pos) + return -ESPIPE; + + if (*off < 0 || + check_add_overflow((loff_t)len, *off, &requested_length)) + return -EINVAL; + + if (requested_length > sizeof(migf->mig_data)) + return -ENOMEM; + + mutex_lock(&migf->lock); + if (migf->disabled) { + done = -ENODEV; + goto out_unlock; + } + + ret = copy_from_user(migf->mig_data + *off, buf, len); + if (ret) { + done = -EFAULT; + goto out_unlock; + } + *off += len; + done = len; + migf->total_length += len; +out_unlock: + mutex_unlock(&migf->lock); + return done; +} + +static const struct file_operations ice_vfio_pci_resume_fops = { + .owner = THIS_MODULE, + .write = ice_vfio_pci_resume_write, + .release = ice_vfio_pci_release_file, + .llseek = no_llseek, +}; + +/** + * ice_vfio_pci_resume - create resuming migration file + * @ice_vdev: pointer to ice vfio pci core device structure + * + * Return migration file handler, negative value for failure + */ +static struct ice_vfio_pci_migration_file * +ice_vfio_pci_resume(struct ice_vfio_pci_core_device *ice_vdev) +{ + struct ice_vfio_pci_migration_file *migf; + + migf = kzalloc(sizeof(*migf), GFP_KERNEL); + if (!migf) + return ERR_PTR(-ENOMEM); + + migf->filp = anon_inode_getfile("ice_vfio_pci_mig", + &ice_vfio_pci_resume_fops, migf, + O_WRONLY); + if (IS_ERR(migf->filp)) { + int err = PTR_ERR(migf->filp); + + kfree(migf); + return ERR_PTR(err); + } + + stream_open(migf->filp->f_inode, migf->filp); + mutex_init(&migf->lock); + return migf; +} + +/** + * ice_vfio_pci_step_device_state_locked - process device state change + * @ice_vdev: pointer to ice vfio pci core device structure + * @new: new device state + * @final: final device state + * + * Return migration file handler or NULL for success, negative value for failure + */ +static struct file * +ice_vfio_pci_step_device_state_locked(struct ice_vfio_pci_core_device *ice_vdev, + u32 new, u32 final) +{ + u32 cur = ice_vdev->mig_state; + int ret; + + if (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P) { + ice_migration_suspend_dev(ice_vdev->pf, ice_vdev->vf_id); + return NULL; + } + + if (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_STOP) + return NULL; + + if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_STOP_COPY) { + struct ice_vfio_pci_migration_file *migf; + + migf = ice_vfio_pci_stop_copy(ice_vdev); + if (IS_ERR(migf)) + return ERR_CAST(migf); + get_file(migf->filp); + ice_vdev->saving_migf = migf; + return migf->filp; + } + + if (cur == VFIO_DEVICE_STATE_STOP_COPY && new == VFIO_DEVICE_STATE_STOP) { + ice_vfio_pci_disable_fds(ice_vdev); + return 
NULL; + } + + if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RESUMING) { + struct ice_vfio_pci_migration_file *migf; + + migf = ice_vfio_pci_resume(ice_vdev); + if (IS_ERR(migf)) + return ERR_CAST(migf); + get_file(migf->filp); + ice_vdev->resuming_migf = migf; + return migf->filp; + } + + if (cur == VFIO_DEVICE_STATE_RESUMING && new == VFIO_DEVICE_STATE_STOP) + return NULL; + + if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING_P2P) { + ret = ice_vfio_pci_load_state(ice_vdev); + if (ret) + return ERR_PTR(ret); + ice_vfio_pci_disable_fds(ice_vdev); + return NULL; + } + + if (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_RUNNING) + return NULL; + + /* + * vfio_mig_get_next_state() does not use arcs other than the above + */ + WARN_ON(true); + return ERR_PTR(-EINVAL); +} + +/** + * ice_vfio_pci_set_device_state - Config device state + * @vdev: pointer to vfio pci device + * @new_state: device state + * + * Return 0 for success, negative value for failure. + */ +static struct file * +ice_vfio_pci_set_device_state(struct vfio_device *vdev, + enum vfio_device_mig_state new_state) +{ + struct ice_vfio_pci_core_device *ice_vdev = container_of(vdev, + struct ice_vfio_pci_core_device, + core_device.vdev); + enum vfio_device_mig_state next_state; + struct file *res = NULL; + int ret; + + mutex_lock(&ice_vdev->state_mutex); + while (new_state != ice_vdev->mig_state) { + ret = vfio_mig_get_next_state(vdev, ice_vdev->mig_state, + new_state, &next_state); + if (ret) { + res = ERR_PTR(ret); + break; + } + res = ice_vfio_pci_step_device_state_locked(ice_vdev, next_state, + new_state); + if (IS_ERR(res)) + break; + ice_vdev->mig_state = next_state; + if (WARN_ON(res && new_state != ice_vdev->mig_state)) { + fput(res); + res = ERR_PTR(-EINVAL); + break; + } + } + ice_vfio_pci_state_mutex_unlock(ice_vdev); + return res; +} + +/** + * ice_vfio_pci_get_device_state - get device state + * @vdev: pointer to vfio pci device + * @curr_state: device state + * + * Return 0 for success + */ +static int ice_vfio_pci_get_device_state(struct vfio_device *vdev, + enum vfio_device_mig_state *curr_state) +{ + struct ice_vfio_pci_core_device *ice_vdev = + container_of(vdev, struct ice_vfio_pci_core_device, core_device.vdev); + + mutex_lock(&ice_vdev->state_mutex); + *curr_state = ice_vdev->mig_state; + ice_vfio_pci_state_mutex_unlock(ice_vdev); + return 0; +} + +/** + * ice_vfio_pci_get_data_size - get migration data size + * @vdev: pointer to vfio pci device + * @stop_copy_length: migration data size + * + * Return 0 for success + */ +static int +ice_vfio_pci_get_data_size(struct vfio_device *vdev, + unsigned long *stop_copy_length) +{ + *stop_copy_length = SZ_128K; + return 0; +} + +static const struct vfio_migration_ops ice_vfio_pci_migrn_state_ops = { + .migration_set_state = ice_vfio_pci_set_device_state, + .migration_get_state = ice_vfio_pci_get_device_state, + .migration_get_data_size = ice_vfio_pci_get_data_size, +}; + +/** + * ice_vfio_pci_core_init_dev - initialize vfio device + * @core_vdev: pointer to vfio device + * + * Return 0 for success + */ +static int ice_vfio_pci_core_init_dev(struct vfio_device *core_vdev) +{ + struct ice_vfio_pci_core_device *ice_vdev = container_of(core_vdev, + struct ice_vfio_pci_core_device, core_device.vdev); + + mutex_init(&ice_vdev->state_mutex); + spin_lock_init(&ice_vdev->reset_lock); + + core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P; + core_vdev->mig_ops = &ice_vfio_pci_migrn_state_ops; + + 
return vfio_pci_core_init_dev(core_vdev); +} + +static const struct vfio_device_ops ice_vfio_pci_ops = { + .name = "ice-vfio-pci", + .init = ice_vfio_pci_core_init_dev, + .release = vfio_pci_core_release_dev, + .open_device = ice_vfio_pci_open_device, + .close_device = ice_vfio_pci_close_device, + .device_feature = vfio_pci_core_ioctl_feature, + .read = vfio_pci_core_read, + .write = vfio_pci_core_write, + .ioctl = vfio_pci_core_ioctl, + .mmap = vfio_pci_core_mmap, + .request = vfio_pci_core_request, + .match = vfio_pci_core_match, + .bind_iommufd = vfio_iommufd_physical_bind, + .unbind_iommufd = vfio_iommufd_physical_unbind, + .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, +}; + +/** + * ice_vfio_pci_probe - Device initialization routine + * @pdev: PCI device information struct + * @id: entry in ice_vfio_pci_table + * + * Returns 0 on success, negative on failure + */ +static int +ice_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct ice_vfio_pci_core_device *ice_vdev; + int ret; + + ice_vdev = vfio_alloc_device(ice_vfio_pci_core_device, core_device.vdev, + &pdev->dev, &ice_vfio_pci_ops); + if (!ice_vdev) + return -ENOMEM; + + dev_set_drvdata(&pdev->dev, &ice_vdev->core_device); + + ret = vfio_pci_core_register_device(&ice_vdev->core_device); + if (ret) + goto out_free; + + return 0; + +out_free: + vfio_put_device(&ice_vdev->core_device.vdev); + return ret; +} + +/** + * ice_vfio_pci_remove - Device removal routine + * @pdev: PCI device information struct + */ +static void ice_vfio_pci_remove(struct pci_dev *pdev) +{ + struct ice_vfio_pci_core_device *ice_vdev = + (struct ice_vfio_pci_core_device *)dev_get_drvdata(&pdev->dev); + + vfio_pci_core_unregister_device(&ice_vdev->core_device); + vfio_put_device(&ice_vdev->core_device.vdev); +} + +/* ice_pci_tbl - PCI Device ID Table + * + * Wildcard entries (PCI_ANY_ID) should come last + * Last entry must be all 0s + * + * { Vendor ID, Device ID, SubVendor ID, SubDevice ID, + * Class, Class Mask, private data (not used) } + */ +static const struct pci_device_id ice_vfio_pci_table[] = { + { PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_INTEL, 0x1889) }, + {} +}; +MODULE_DEVICE_TABLE(pci, ice_vfio_pci_table); + +static const struct pci_error_handlers ice_vfio_pci_core_err_handlers = { + .reset_done = ice_vfio_pci_reset_done, + .error_detected = vfio_pci_core_aer_err_detected, +}; + +static struct pci_driver ice_vfio_pci_driver = { + .name = "ice-vfio-pci", + .id_table = ice_vfio_pci_table, + .probe = ice_vfio_pci_probe, + .remove = ice_vfio_pci_remove, + .err_handler = &ice_vfio_pci_core_err_handlers, + .driver_managed_dma = true, +}; + +module_pci_driver(ice_vfio_pci_driver); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Intel Corporation, "); +MODULE_DESCRIPTION(DRIVER_DESC);
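Usage note for the series as a whole (not part of any patch): the driver above exposes the standard VFIO v2 migration uAPI, so a VMM drives it with the usual VFIO_DEVICE_FEATURE ioctl from <linux/vfio.h>. The sketch below is a minimal userspace illustration; the helper name, the way device_fd is obtained and the handling of the returned data_fd are assumptions, not part of this series.

#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Ask an ice-vfio-pci device to move to a new migration state. For states
 * that carry data (STOP_COPY, RESUMING) the kernel returns a file
 * descriptor in data_fd for reading or writing the migration stream.
 */
static int set_mig_state(int device_fd, enum vfio_device_mig_state state,
			 int *data_fd)
{
	size_t sz = sizeof(struct vfio_device_feature) +
		    sizeof(struct vfio_device_feature_mig_state);
	struct vfio_device_feature *feat = calloc(1, sz);
	struct vfio_device_feature_mig_state *mig;
	int ret;

	if (!feat)
		return -1;
	mig = (struct vfio_device_feature_mig_state *)feat->data;
	feat->argsz = sz;
	feat->flags = VFIO_DEVICE_FEATURE_SET |
		      VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE;
	mig->device_state = state;
	ret = ioctl(device_fd, VFIO_DEVICE_FEATURE, feat);
	if (!ret && data_fd)
		*data_fd = mig->data_fd;
	free(feat);
	return ret;
}

A save would step RUNNING -> RUNNING_P2P -> STOP -> STOP_COPY and read up to the 128K this driver reports from the returned data_fd; a resume steps STOP -> RESUMING, writes the data, then returns through STOP and RUNNING_P2P to RUNNING, matching the arcs handled in ice_vfio_pci_step_device_state_locked().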