diff mbox series

[iwl-next,v4,09/12] ice: Save and load TX Queue head

Message ID 20231121025111.257597-10-yahui.cao@intel.com (mailing list archive)
State New, archived
Headers show
Series Add E800 live migration driver | expand

Commit Message

Cao, Yahui Nov. 21, 2023, 2:51 a.m. UTC
From: Lingyu Liu <lingyu.liu@intel.com>

TX Queue head is a fundamental piece of the DMA ring context which
determines the next TX descriptor to be fetched. However, the TX Queue
head is not visible to the VF; it is only visible to the PF. As a result,
the PF needs to save and load the TX Queue head explicitly.

Unfortunately, due to a HW limitation, the TX Queue head can't be restored
by writing MMIO registers.

Since sending one packet advances the TX head by one index, the TX Queue
head can be advanced by N indexes by sending N packets. Filling the DMA
ring with NOP descriptors and bumping the doorbell can therefore be used
to change the TX Queue head indirectly, and this method has no side
effects other than changing the TX head value.
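
As an illustration only, the trick boils down to the sketch below. It uses
the descriptor helpers and flags that appear later in this patch
(ice_build_ctob(), ICE_TX_DESC_CMD_DUMMY); the wrapping function name is
made up for the example and is not part of the patch:

/* Illustration only: advance a queue's head by 'n' by posting 'n' dummy
 * (NOP) descriptors that all point at one scratch buffer, then bumping
 * the doorbell. Context save/restore, interrupt masking and the
 * completion poll are left out here.
 */
static void advance_head_with_dummy_desc(struct ice_tx_desc *tx_desc, u16 n,
                                         dma_addr_t scratch_dma,
                                         void __iomem *tail)
{
        u32 td_cmd = ICE_TXD_LAST_DESC_CMD | ICE_TX_DESC_CMD_DUMMY;
        u16 i;

        for (i = 0; i < n; i++) {
                tx_desc[i].cmd_type_offset_bsz =
                                ice_build_ctob(td_cmd, 0, SZ_256, 0);
                tx_desc[i].buf_addr = cpu_to_le64(scratch_dma);
        }

        /* Descriptors must be visible in memory before the doorbell write */
        wmb();

        /* HW fetches descriptors 0..n-1, leaving the queue head at 'n' */
        writel(n, tail);
}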

To advance the TX Queue head, HW needs to touch memory via DMA. However,
directly touching the VM's memory to advance the TX Queue head does not
follow the vfio migration protocol design, because vIOMMU state is not
defined by the protocol. It may also introduce functional and security
issues with a hostile guest.

In order not to touch any VF memory or IO page table, TX Queue head
loading uses PF-managed memory and the PF isolation domain. This also
introduces another dependency: the TX Queue head value must not change
while the TX Queue is switched between PF space and VF space. HW provides
indirect context access so that the head value is preserved across the
context switch.

In the virtual channel model, the VF driver only sends the TX queue ring
base and length to the PF, while the rest of the TX queue context is
managed by the PF. The TX queue length must be verified by the PF during
virtual channel message processing. When the PF uses dummy descriptors to
advance the TX head, it configures the TX ring base to a new address
managed by the PF itself. As a result, the whole TX queue context is under
PF control and this method does not open an attack vector.
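
For illustration, the only VF-controlled inputs end up gated by a check of
roughly this shape (tx_head_is_loadable() is a hypothetical helper for this
sketch; the comparison itself mirrors the one added in
ice_migration_load_tx_head() below):

/* Illustration only: the head value replayed at load time must fall
 * inside a ring whose length the PF itself validated and programmed;
 * the ring base used for the dummy descriptors is PF-owned as well.
 */
static bool tx_head_is_loadable(u16 saved_head, u16 ring_count)
{
        /* a saved head of 0 means "no packets sent", nothing to replay */
        return saved_head != 0 && saved_head < ring_count;
}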

The overall steps of the TX head loading handler are:
1. Back up the TX queue context, then switch the TX queue context to PF
   space and a PF DMA ring base, with the interrupt disabled.
2. Fill the DMA ring with dummy descriptors and bump the doorbell to
   advance the TX head. Once the doorbell is kicked, HW issues DMA and
   sends PCI upstream memory transactions tagged with the PF BDF. Since
   the ring base is a PF-managed DMA buffer, the DMA succeeds and the
   TX head advances as expected.
3. Overwrite the TX context with the context backed up in step 1. Since
   the TX queue head value is not changed by the context switch, the TX
   queue head is successfully loaded.

Since everything happens inside the PF context, this is transparent to the
vfio driver and has no effect outside the PF.
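
Condensed into code, the load sequence looks roughly like the sketch below.
It reuses the helpers this patch relies on (ice_read_txq_ctx(),
ice_write_txq_ctx()); the function itself is illustrative, and error
unwinding, interrupt masking and the head-completion poll are omitted. See
ice_migration_inject_dummy_desc() in the patch body for the full version:

static int restore_txq_head_sketch(struct ice_hw *hw,
                                   struct ice_tx_ring *tx_ring,
                                   u16 head, dma_addr_t pf_ring_dma)
{
        struct ice_tlan_ctx saved, tmp;
        int err;

        /* 1. Back up the VF-owned TX queue context ... */
        err = ice_read_txq_ctx(hw, &saved, tx_ring->reg_idx);
        if (err)
                return err;

        /* ... and switch the queue to PF space with a PF-owned ring base */
        tmp = saved;
        tmp.vmvf_type = ICE_TLAN_CTX_VMVF_TYPE_PF;
        tmp.vmvf_num = 0;
        tmp.base = pf_ring_dma >> ICE_TLAN_CTX_BASE_S;
        err = ice_write_txq_ctx(hw, &tmp, tx_ring->reg_idx);
        if (err)
                return err;

        /* 2. Kick the doorbell so HW fetches 'head' dummy descriptors;
         *    the DMA is tagged with the PF BDF and hits the PF buffer.
         */
        writel(head, tx_ring->tail);

        /* 3. Restore the original VF context; the head value survives the
         *    indirect context write, so the queue keeps the wanted head
         *    with its original VF ring base back in place.
         */
        return ice_write_txq_ctx(hw, &saved, tx_ring->reg_idx);
}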

Co-developed-by: Yahui Cao <yahui.cao@intel.com>
Signed-off-by: Yahui Cao <yahui.cao@intel.com>
Signed-off-by: Lingyu Liu <lingyu.liu@intel.com>
---
 .../net/ethernet/intel/ice/ice_migration.c    | 306 ++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_virtchnl.c |  18 ++
 2 files changed, 324 insertions(+)

Comments

Tian, Kevin Dec. 7, 2023, 8:22 a.m. UTC | #1
> From: Cao, Yahui <yahui.cao@intel.com>
> Sent: Tuesday, November 21, 2023 10:51 AM
> 
> To advance the TX Queue head, HW needs to touch memory via DMA. However,
> directly touching the VM's memory to advance the TX Queue head does not
> follow the vfio migration protocol design, because vIOMMU state is not
> defined by the protocol. It may also introduce functional and security
> issues with a hostile guest.

this limitation is not restricted to vIOMMU. Even when it's absent
there is still no guarantee that the GPA address space has been
re-attached to this device.

> 
> In order not to touch any VF memory or IO page table, TX Queue head
> loading uses PF-managed memory and the PF isolation domain. This also

PF doesn't manage memory. It's probably clearer to say that TX queue
is temporarily moved to PF when the head is being restored.

> introduces another dependency: the TX Queue head value must not change
> while the TX Queue is switched between PF space and VF space. HW provides
> indirect context access so that the head value is preserved across the
> context switch.
> 
> In the virtual channel model, the VF driver only sends the TX queue ring
> base and length to the PF, while the rest of the TX queue context is
> managed by the PF. The TX queue length must be verified by the PF during
> virtual channel message processing. When the PF uses dummy descriptors to
> advance the TX head, it configures the TX ring base to a new address
> managed by the PF itself. As a result, the whole TX queue context is under
> PF control and this method does not open an attack vector.

So basically the key points are:

1) TX queue head cannot be directly updated via VF mmio interface;
2) Using dummy descriptors to update TX queue head is possible but it
    must be done in PF's context;
3) FW provides a way to keep TX queue head intact when moving
    the TX queue ownership between VF and PF;
4) the TX queue context affected by the ownership change is largely
    initialized by the PF driver already, except ring base/size coming from
    virtual channel messages. This implies that a malicious guest VF driver
    cannot attack this small window even though the tx head restore is done
    after all the VF state is restored;
5) and a missing point is that the temporary owner change doesn't
    expose the TX queue to the software stack on top of the PF driver
    otherwise that would be a severe issue.

> +static int
> +ice_migration_save_tx_head(struct ice_vf *vf,
> +			   struct ice_migration_dev_state *devstate)
> +{
> +	struct ice_vsi *vsi = ice_get_vf_vsi(vf);
> +	struct ice_pf *pf = vf->pf;
> +	struct device *dev;
> +	int i = 0;
> +
> +	dev = ice_pf_to_dev(pf);
> +
> +	if (!vsi) {
> +		dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id);
> +		return -EINVAL;
> +	}
> +
> +	ice_for_each_txq(vsi, i) {
> +		u16 tx_head;
> +		u32 reg;
> +
> +		devstate->tx_head[i] = 0;
> +		if (!test_bit(i, vf->txq_ena))
> +			continue;
> +
> +		reg = rd32(&pf->hw, QTX_COMM_HEAD(vsi->txq_map[i]));
> +		tx_head = (reg & QTX_COMM_HEAD_HEAD_M)
> +					>> QTX_COMM_HEAD_HEAD_S;
> +
> +		/* 1. If TX head is QTX_COMM_HEAD_HEAD_M marker, which means
> +		 *    it is the value written by software and there are no
> +		 *    descriptors write back happened, then there are no
> +		 *    packets sent since queue enabled.

It's unclear why it's not zero when no packet is sent.

> +static int
> +ice_migration_inject_dummy_desc(struct ice_vf *vf, struct ice_tx_ring *tx_ring,
> +				u16 head, dma_addr_t tx_desc_dma)

based on intention this reads clearer to be:

	ice_migration_restore_tx_head()


> +
> +	/* 1.3 Disable TX queue interrupt */
> +	wr32(hw, QINT_TQCTL(tx_ring->reg_idx), QINT_TQCTL_ITR_INDX_M);
> +
> +	/* To disable tx queue interrupt during run time, software should
> +	 * write mmio to trigger a MSIX interrupt.
> +	 */
> +	if (tx_ring->q_vector)
> +		wr32(hw, GLINT_DYN_CTL(tx_ring->q_vector->reg_idx),
> +		     (ICE_ITR_NONE << GLINT_DYN_CTL_ITR_INDX_S) |
> +		     GLINT_DYN_CTL_SWINT_TRIG_M |
> +		     GLINT_DYN_CTL_INTENA_M);

this needs more explanation as it's not intuitive to disable interrupt by
triggering another interrupt.

> +
> +	ice_for_each_txq(vsi, i) {
> +		struct ice_tx_ring *tx_ring = vsi->tx_rings[i];
> +		u16 *tx_heads = devstate->tx_head;
> +
> +		/* 1. Skip if TX Queue is not enabled */
> +		if (!test_bit(i, vf->txq_ena) || tx_heads[i] == 0)
> +			continue;
> +
> +		if (tx_heads[i] >= tx_ring->count) {
> +			dev_err(dev, "VF %d: invalid tx ring length to load\n",
> +				vf->vf_id);
> +			ret = -EINVAL;
> +			goto err;
> +		}
> +
> +		/* Dummy descriptors must be re-initialized after use, since
> +		 * it may be written back by HW
> +		 */
> +		ice_migration_init_dummy_desc(tx_desc, ring_len, tx_pkt_dma);
> +		ret = ice_migration_inject_dummy_desc(vf, tx_ring, tx_heads[i],
> +						      tx_desc_dma);
> +		if (ret)
> +			goto err;
> +	}
> +
> +err:
> +	dma_free_coherent(dev, ring_len * sizeof(struct ice_tx_desc),
> +			  tx_desc, tx_desc_dma);
> +	dma_free_coherent(dev, SZ_4K, tx_pkt, tx_pkt_dma);
> +
> +	return ret;

there is no err unwinding for the tx ring context itself.

> +
> +	/* Only load the TX Queue head after rest of device state is loaded
> +	 * successfully.
> +	 */

"otherwise it might be changed by virtual channel messages e.g. reset"

> @@ -1351,6 +1351,24 @@ static int ice_vc_ena_qs_msg(struct ice_vf *vf, u8 *msg)
>  			continue;
> 
>  		ice_vf_ena_txq_interrupt(vsi, vf_q_id);
> +
> +		/* TX head register is a shadow copy of on-die TX head which
> +		 * maintains the accurate location. And TX head register is
> +		 * updated only after a packet is sent. If nothing is sent
> +		 * after the queue is enabled, then the value is the one
> +		 * updated last time and out-of-date.

when is "last time"? Is it even not updated upon reset?

or does it talk about a disable-enable sequence in which the real TX head
is left with a stale value from last enable?

> +		 *
> +		 * QTX_COMM_HEAD.HEAD values ranging from 0x1fe0 to 0x1fff are
> +		 * reserved and will never be used by HW. Manually write a
> +		 * reserved value into TX head and use this as a marker for
> +		 * the case that no packets have been sent.

why using a reserved value instead of setting it to 0?

> +		 *
> +		 * This marker is only used in live migration use case.
> +		 */
> +		if (vf->migration_enabled)
> +			wr32(&vsi->back->hw,
> +			     QTX_COMM_HEAD(vsi->txq_map[vf_q_id]),
> +			     QTX_COMM_HEAD_HEAD_M);
Jason Gunthorpe Dec. 7, 2023, 2:48 p.m. UTC | #2
On Thu, Dec 07, 2023 at 08:22:53AM +0000, Tian, Kevin wrote:
> > In the virtual channel model, the VF driver only sends the TX queue ring
> > base and length to the PF, while the rest of the TX queue context is
> > managed by the PF. The TX queue length must be verified by the PF during
> > virtual channel message processing. When the PF uses dummy descriptors to
> > advance the TX head, it configures the TX ring base to a new address
> > managed by the PF itself. As a result, the whole TX queue context is under
> > PF control and this method does not open an attack vector.
> 
> So basically the key points are:
> 
> 1) TX queue head cannot be directly updated via VF mmio interface;
> 2) Using dummy descriptors to update TX queue head is possible but it
>     must be done in PF's context;
> 3) FW provides a way to keep TX queue head intact when moving
>     the TX queue ownership between VF and PF;
> 4) the TX queue context affected by the ownership change is largely
>     initialized by the PF driver already, except ring base/size coming from
>     virtual channel messages. This implies that a malicious guest VF driver
>     cannot attack this small window even though the tx head restore is done
>     after all the VF state is restored;
> 5) and a missing point is that the temporary owner change doesn't
>     expose the TX queue to the software stack on top of the PF driver
>     otherwise that would be a severe issue.

This matches my impression of these patches. It is convoluted but the
explanation sounds fine, and if Intel has done an internal security
review then I have no issue.

Jason
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_migration.c b/drivers/net/ethernet/intel/ice/ice_migration.c
index 473be6a83cf3..082ae2b79f60 100644
--- a/drivers/net/ethernet/intel/ice/ice_migration.c
+++ b/drivers/net/ethernet/intel/ice/ice_migration.c
@@ -3,10 +3,14 @@ 
 
 #include "ice.h"
 #include "ice_base.h"
+#include "ice_txrx_lib.h"
 
 #define ICE_MIG_DEVSTAT_MAGIC			0xE8000001
 #define ICE_MIG_DEVSTAT_VERSION			0x1
 #define ICE_MIG_VF_QRX_TAIL_MAX			256
+#define QTX_HEAD_RESTORE_DELAY_MAX		100
+#define QTX_HEAD_RESTORE_DELAY_SLEEP_US_MIN	10
+#define QTX_HEAD_RESTORE_DELAY_SLEEP_US_MAX	10
 
 struct ice_migration_virtchnl_msg_slot {
 	u32 opcode;
@@ -30,6 +34,8 @@  struct ice_migration_dev_state {
 	u16 vsi_id;
 	/* next RX desc index to be processed by the device */
 	u16 rx_head[ICE_MIG_VF_QRX_TAIL_MAX];
+	/* next TX desc index to be processed by the device */
+	u16 tx_head[ICE_MIG_VF_QRX_TAIL_MAX];
 	u8 virtchnl_msgs[];
 } __aligned(8);
 
@@ -316,6 +322,62 @@  ice_migration_save_rx_head(struct ice_vf *vf,
 	return 0;
 }
 
+/**
+ * ice_migration_save_tx_head - save tx head in migration region
+ * @vf: pointer to VF structure
+ * @devstate: pointer to migration device state
+ *
+ * Return 0 for success, negative for error
+ */
+static int
+ice_migration_save_tx_head(struct ice_vf *vf,
+			   struct ice_migration_dev_state *devstate)
+{
+	struct ice_vsi *vsi = ice_get_vf_vsi(vf);
+	struct ice_pf *pf = vf->pf;
+	struct device *dev;
+	int i = 0;
+
+	dev = ice_pf_to_dev(pf);
+
+	if (!vsi) {
+		dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id);
+		return -EINVAL;
+	}
+
+	ice_for_each_txq(vsi, i) {
+		u16 tx_head;
+		u32 reg;
+
+		devstate->tx_head[i] = 0;
+		if (!test_bit(i, vf->txq_ena))
+			continue;
+
+		reg = rd32(&pf->hw, QTX_COMM_HEAD(vsi->txq_map[i]));
+		tx_head = (reg & QTX_COMM_HEAD_HEAD_M)
+					>> QTX_COMM_HEAD_HEAD_S;
+
+		/* 1. If TX head is QTX_COMM_HEAD_HEAD_M marker, which means
+		 *    it is the value written by software and there are no
+		 *    descriptors write back happened, then there are no
+		 *    packets sent since queue enabled.
+		 * 2. If TX head is ring length minus 1, then it just returns
+		 *    to the start of the ring.
+		 */
+		if (tx_head == QTX_COMM_HEAD_HEAD_M ||
+		    tx_head == (vsi->tx_rings[i]->count - 1))
+			tx_head = 0;
+		else
+			/* Add compensation since value read from TX Head
+			 * register is always the real TX head minus 1
+			 */
+			tx_head++;
+
+		devstate->tx_head[i] = tx_head;
+	}
+	return 0;
+}
+
 /**
  * ice_migration_save_devstate - save device state to migration buffer
  * @pf: pointer to PF of migration device
@@ -376,6 +438,12 @@  ice_migration_save_devstate(struct ice_pf *pf, int vf_id, u8 *buf, u64 buf_sz)
 		goto out_put_vf;
 	}
 
+	ret = ice_migration_save_tx_head(vf, devstate);
+	if (ret) {
+		dev_err(dev, "VF %d failed to save txq head\n", vf->vf_id);
+		goto out_put_vf;
+	}
+
 	list_for_each_entry(msg_listnode, &vf->virtchnl_msg_list, node) {
 		struct ice_migration_virtchnl_msg_slot *msg_slot;
 		u64 slot_size;
@@ -518,6 +586,234 @@  ice_migration_load_rx_head(struct ice_vf *vf,
 	return 0;
 }
 
+/**
+ * ice_migration_init_dummy_desc - init dma ring by dummy descriptor
+ * @tx_desc: tx ring descriptor array
+ * @len: array length
+ * @tx_pkt_dma: dummy packet dma address
+ */
+static inline void
+ice_migration_init_dummy_desc(struct ice_tx_desc *tx_desc,
+			      u16 len,
+			      dma_addr_t tx_pkt_dma)
+{
+	int i;
+
+	/* Init ring with dummy descriptors */
+	for (i = 0; i < len; i++) {
+		u32 td_cmd;
+
+		td_cmd = ICE_TXD_LAST_DESC_CMD | ICE_TX_DESC_CMD_DUMMY;
+		tx_desc[i].cmd_type_offset_bsz =
+				ice_build_ctob(td_cmd, 0, SZ_256, 0);
+		tx_desc[i].buf_addr = cpu_to_le64(tx_pkt_dma);
+	}
+}
+
+/**
+ * ice_migration_wait_for_tx_completion - wait for TX transmission completion
+ * @hw: pointer to the device HW structure
+ * @tx_ring: tx ring instance
+ * @head: expected tx head position when transmission completion
+ *
+ * Return 0 for success, negative for error.
+ */
+static int
+ice_migration_wait_for_tx_completion(struct ice_hw *hw,
+				     struct ice_tx_ring *tx_ring, u16 head)
+{
+	u32 tx_head;
+	int i;
+
+	tx_head = rd32(hw, QTX_COMM_HEAD(tx_ring->reg_idx));
+	tx_head = (tx_head & QTX_COMM_HEAD_HEAD_M)
+		   >> QTX_COMM_HEAD_HEAD_S;
+
+	for (i = 0; i < QTX_HEAD_RESTORE_DELAY_MAX && tx_head != (head - 1);
+				i++) {
+		usleep_range(QTX_HEAD_RESTORE_DELAY_SLEEP_US_MIN,
+			     QTX_HEAD_RESTORE_DELAY_SLEEP_US_MAX);
+
+		tx_head = rd32(hw, QTX_COMM_HEAD(tx_ring->reg_idx));
+		tx_head = (tx_head & QTX_COMM_HEAD_HEAD_M)
+			   >> QTX_COMM_HEAD_HEAD_S;
+	}
+
+	if (i == QTX_HEAD_RESTORE_DELAY_MAX)
+		return -EBUSY;
+
+	return 0;
+}
+
+/**
+ * ice_migration_inject_dummy_desc - inject dummy descriptors
+ * @vf: pointer to VF structure
+ * @tx_ring: tx ring instance
+ * @head: tx head to be loaded
+ * @tx_desc_dma:tx descriptor ring base dma address
+ *
+ * For each TX queue, load the TX head by following below steps:
+ * 1. Backup TX context, switch TX queue context as PF space and PF
+ *    DMA ring base with interrupt disabled
+ * 2. Fill the DMA ring with dummy descriptors and bump doorbell to
+ *    advance TX head. Once kicking doorbell, HW will issue DMA and
+ *    send PCI upstream memory transaction tagged by PF BDF. Since
+ *    ring base is PF's managed DMA buffer, DMA can work successfully
+ *    and TX Head is advanced as expected.
+ * 3. Overwrite TX context by the backup context in step 1. Since TX
+ *    queue head value is not changed while context switch, TX queue
+ *    head is successfully loaded.
+ *
+ * Return 0 for success, negative for error.
+ */
+static int
+ice_migration_inject_dummy_desc(struct ice_vf *vf, struct ice_tx_ring *tx_ring,
+				u16 head, dma_addr_t tx_desc_dma)
+{
+	struct ice_tlan_ctx tlan_ctx, tlan_ctx_orig;
+	struct device *dev = ice_pf_to_dev(vf->pf);
+	struct ice_hw *hw = &vf->pf->hw;
+	u32 dynctl;
+	u32 tqctl;
+	int status;
+	int ret;
+
+	/* 1.1 Backup TX Queue context */
+	status = ice_read_txq_ctx(hw, &tlan_ctx, tx_ring->reg_idx);
+	if (status) {
+		dev_err(dev, "Failed to read TXQ[%d] context, err=%d\n",
+			tx_ring->q_index, status);
+		return -EIO;
+	}
+	memcpy(&tlan_ctx_orig, &tlan_ctx, sizeof(tlan_ctx));
+	tqctl = rd32(hw, QINT_TQCTL(tx_ring->reg_idx));
+	if (tx_ring->q_vector)
+		dynctl = rd32(hw, GLINT_DYN_CTL(tx_ring->q_vector->reg_idx));
+
+	/* 1.2 switch TX queue context as PF space and PF DMA ring base */
+	tlan_ctx.vmvf_type = ICE_TLAN_CTX_VMVF_TYPE_PF;
+	tlan_ctx.vmvf_num = 0;
+	tlan_ctx.base = tx_desc_dma >> ICE_TLAN_CTX_BASE_S;
+	status = ice_write_txq_ctx(hw, &tlan_ctx, tx_ring->reg_idx);
+	if (status) {
+		dev_err(dev, "Failed to write TXQ[%d] context, err=%d\n",
+			tx_ring->q_index, status);
+		return -EIO;
+	}
+
+	/* 1.3 Disable TX queue interrupt */
+	wr32(hw, QINT_TQCTL(tx_ring->reg_idx), QINT_TQCTL_ITR_INDX_M);
+
+	/* To disable tx queue interrupt during run time, software should
+	 * write mmio to trigger a MSIX interrupt.
+	 */
+	if (tx_ring->q_vector)
+		wr32(hw, GLINT_DYN_CTL(tx_ring->q_vector->reg_idx),
+		     (ICE_ITR_NONE << GLINT_DYN_CTL_ITR_INDX_S) |
+		     GLINT_DYN_CTL_SWINT_TRIG_M |
+		     GLINT_DYN_CTL_INTENA_M);
+
+	/* Force memory writes to complete before letting h/w know there
+	 * are new descriptors to fetch.
+	 */
+	wmb();
+
+	/* 2.1 Bump doorbell to advance TX Queue head */
+	writel(head, tx_ring->tail);
+
+	/* 2.2 Wait until TX Queue head move to expected place */
+	ret = ice_migration_wait_for_tx_completion(hw, tx_ring, head);
+	if (ret) {
+		dev_err(dev, "VF %d txq[%d] head loading timeout\n",
+			vf->vf_id, tx_ring->q_index);
+		return ret;
+	}
+
+	/* 3. Overwrite TX Queue context with backup context */
+	status = ice_write_txq_ctx(hw, &tlan_ctx_orig, tx_ring->reg_idx);
+	if (status) {
+		dev_err(dev, "Failed to write TXQ[%d] context, err=%d\n",
+			tx_ring->q_index, status);
+		return -EIO;
+	}
+	wr32(hw, QINT_TQCTL(tx_ring->reg_idx), tqctl);
+	if (tx_ring->q_vector)
+		wr32(hw, GLINT_DYN_CTL(tx_ring->q_vector->reg_idx), dynctl);
+
+	return 0;
+}
+
+/**
+ * ice_migration_load_tx_head - load tx head
+ * @vf: pointer to VF structure
+ * @devstate: pointer to migration device state
+ *
+ * Return 0 for success, negative for error
+ */
+static int
+ice_migration_load_tx_head(struct ice_vf *vf,
+			   struct ice_migration_dev_state *devstate)
+{
+	struct device *dev = ice_pf_to_dev(vf->pf);
+	u16 ring_len = ICE_MAX_NUM_DESC;
+	dma_addr_t tx_desc_dma, tx_pkt_dma;
+	struct ice_tx_desc *tx_desc;
+	struct ice_vsi *vsi;
+	char *tx_pkt;
+	int ret = 0;
+	int i = 0;
+
+	vsi = ice_get_vf_vsi(vf);
+	if (!vsi) {
+		dev_err(dev, "VF %d VSI is NULL\n", vf->vf_id);
+		return -EINVAL;
+	}
+
+	/* Allocate DMA ring and descriptor by PF */
+	tx_desc = dma_alloc_coherent(dev, ring_len * sizeof(struct ice_tx_desc),
+				     &tx_desc_dma, GFP_KERNEL | __GFP_ZERO);
+	tx_pkt = dma_alloc_coherent(dev, SZ_4K, &tx_pkt_dma,
+				    GFP_KERNEL | __GFP_ZERO);
+	if (!tx_desc || !tx_pkt) {
+		dev_err(dev, "PF failed to allocate memory for VF %d\n",
+			vf->vf_id);
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	ice_for_each_txq(vsi, i) {
+		struct ice_tx_ring *tx_ring = vsi->tx_rings[i];
+		u16 *tx_heads = devstate->tx_head;
+
+		/* 1. Skip if TX Queue is not enabled */
+		if (!test_bit(i, vf->txq_ena) || tx_heads[i] == 0)
+			continue;
+
+		if (tx_heads[i] >= tx_ring->count) {
+			dev_err(dev, "VF %d: invalid tx ring length to load\n",
+				vf->vf_id);
+			ret = -EINVAL;
+			goto err;
+		}
+
+		/* Dummy descriptors must be re-initialized after use, since
+		 * it may be written back by HW
+		 */
+		ice_migration_init_dummy_desc(tx_desc, ring_len, tx_pkt_dma);
+		ret = ice_migration_inject_dummy_desc(vf, tx_ring, tx_heads[i],
+						      tx_desc_dma);
+		if (ret)
+			goto err;
+	}
+
+err:
+	dma_free_coherent(dev, ring_len * sizeof(struct ice_tx_desc),
+			  tx_desc, tx_desc_dma);
+	dma_free_coherent(dev, SZ_4K, tx_pkt, tx_pkt_dma);
+
+	return ret;
+}
+
 /**
  * ice_migration_load_devstate - load device state at destination
  * @pf: pointer to PF of migration device
@@ -596,6 +892,16 @@  int ice_migration_load_devstate(struct ice_pf *pf, int vf_id,
 		msg_slot = (struct ice_migration_virtchnl_msg_slot *)
 					((char *)msg_slot + slot_sz);
 	}
+
+	/* Only load the TX Queue head after rest of device state is loaded
+	 * successfully.
+	 */
+	ret = ice_migration_load_tx_head(vf, devstate);
+	if (ret) {
+		dev_err(dev, "VF %d failed to load tx head\n", vf->vf_id);
+		goto out_clear_replay;
+	}
+
 out_clear_replay:
 	clear_bit(ICE_VF_STATE_REPLAYING_VC, vf->vf_states);
 out_put_vf:
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
index 8dbe558790af..e588712f585e 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
@@ -1351,6 +1351,24 @@  static int ice_vc_ena_qs_msg(struct ice_vf *vf, u8 *msg)
 			continue;
 
 		ice_vf_ena_txq_interrupt(vsi, vf_q_id);
+
+		/* TX head register is a shadow copy of on-die TX head which
+		 * maintains the accurate location. And TX head register is
+		 * updated only after a packet is sent. If nothing is sent
+		 * after the queue is enabled, then the value is the one
+		 * updated last time and out-of-date.
+		 *
+		 * QTX_COMM_HEAD.HEAD values ranging from 0x1fe0 to 0x1fff are
+		 * reserved and will never be used by HW. Manually write a
+		 * reserved value into TX head and use this as a marker for
+		 * the case that no packets have been sent.
+		 *
+		 * This marker is only used in live migration use case.
+		 */
+		if (vf->migration_enabled)
+			wr32(&vsi->back->hw,
+			     QTX_COMM_HEAD(vsi->txq_map[vf_q_id]),
+			     QTX_COMM_HEAD_HEAD_M);
 		set_bit(vf_q_id, vf->txq_ena);
 	}