From patchwork Sat Sep 12 22:54:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772127 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B639213B1 for ; Sat, 12 Sep 2020 22:55:50 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 65E5A206E6 for ; Sat, 12 Sep 2020 22:55:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="nMl2Jb52" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 65E5A206E6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:44920 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEQv-0005TB-HA for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 18:55:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48772) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPs-0003f5-FH; Sat, 12 Sep 2020 18:54:44 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26879) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPp-0005R7-M4; Sat, 12 Sep 2020 18:54:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951281; x=1631487281; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KgbQNch+4cFx3msApzBbuLL93ECkVe6TluohZCU2jzA=; b=nMl2Jb52XZDhq8pbmMEJscpVsaPYIbGb+PHyeX8KZHwKgorJXlc146BR ZDYecis+T9R8PTzwx9IepIdDYH3QUYlbp1Nb2eKakh+WJ/m9RsYWd56Ob QuPhCpqH96bOTsUVLQlHV67F4A1OO5/Zr43NjeCmCQVH/u+hkHCxMH9eF 5wOj0OJukKKgHjFm26TWyod0BbBAXE4kCg0fyd0Tf0okB6zxrwBCmythO ceqO2zM/UjJKdBjXmmeWVDyIRabO0XGf5Ri5WELlR2lUvnbbQ81dmeUj7 6D4Y9BBJBA3SfhwjEfnwodw1D54JbZrReC90qDZ3iQlDs3AqrdCvuCewL w==; IronPort-SDR: Y2vMpU4fn7UQBsqPLGBFmAFetCJB+aCvMAPbBvbqFYrfooeb3CppSsftmhioQioK5CY1CpbIPJ RupuKG7LDgnAtih21fYX/LriMLdzVv27yopTUOQvli82XpXfpfoTaEMp7sfYNpWysKqyX51cw4 hyExIzY5OzydWTPzpUiafoPkX+yJFaa8MPHQ/x2q2ti5SOGc1g9JsYw+ZGoLTmMayRIh0lGEMI /iJjl8HsfmPVNRpuDnLeiUyK3fVlMJGRDE2zLNGRVHaUsO1IhLNZ8p7NvDDbUvenzCS4kjujQs 65s= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834836" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:39 +0800 IronPort-SDR: 2ivk/8YgkwGUHqKXsC+s3yEsNHbiCBptRQA+wPrnT+aUe6aXNuZcnDIir5hf0gGL8wBfptLueJ lXBuphXxQrqQ== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:40:59 -0700 IronPort-SDR: fXj7i3Dt7Wj4e7Me/2pomfpuZhMpfxXm1jcgCFX+B0HMUGI/4HSKIR70B2kb8Npl7emIVj6+42 QU4xzzydMNeQ== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:38 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 01/15] hw/block/nvme: Define 64 bit cqe.result Date: Sun, 13 Sep 2020 07:54:16 +0900 Message-Id: <20200912225430.1772-2-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Ajay Joshi A new write command, Zone Append, is added as a part of Zoned Namespace Command Set. Upon successful completion of this command, the controller returns the start LBA of the performed write operation in cqe.result field. Therefore, the maximum size of this variable needs to be changed from 32 to 64 bit, consuming the reserved 32 bit field that follows the result in CQE struct. Since the existing commands are expected to return a 32 bit LE value, two separate variables, result32 and result64, are now kept in a union. Signed-off-by: Ajay Joshi Signed-off-by: Dmitry Fomichev Reviewed-by: Klaus Jensen --- block/nvme.c | 2 +- block/trace-events | 2 +- hw/block/nvme.c | 10 +++++----- include/block/nvme.h | 6 ++++-- 4 files changed, 11 insertions(+), 9 deletions(-) diff --git a/block/nvme.c b/block/nvme.c index 05485fdd11..45e1a5dcd1 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -333,7 +333,7 @@ static inline int nvme_translate_error(const NvmeCqe *c) { uint16_t status = (le16_to_cpu(c->status) >> 1) & 0xFF; if (status) { - trace_nvme_error(le32_to_cpu(c->result), + trace_nvme_error(le64_to_cpu(c->result64), le16_to_cpu(c->sq_head), le16_to_cpu(c->sq_id), le16_to_cpu(c->cid), diff --git a/block/trace-events b/block/trace-events index e1c79a910d..55c54a18c3 100644 --- a/block/trace-events +++ b/block/trace-events @@ -139,7 +139,7 @@ qed_aio_write_main(void *s, void *acb, int ret, uint64_t offset, size_t len) "s # nvme.c nvme_kick(void *s, int queue) "s %p queue %d" nvme_dma_flush_queue_wait(void *s) "s %p" -nvme_error(int cmd_specific, int sq_head, int sqid, int cid, int status) "cmd_specific %d sq_head %d sqid %d cid %d status 0x%x" +nvme_error(uint64_t cmd_specific, int sq_head, int sqid, int cid, int status) "cmd_specific %ld sq_head %d sqid %d cid %d status 0x%x" nvme_process_completion(void *s, int index, int inflight) "s %p queue %d inflight %d" nvme_process_completion_queue_plugged(void *s, int index) "s %p queue %d" nvme_complete_command(void *s, int index, int cid) "s %p queue %d cid %d" diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 63078f6009..3a90d80694 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -524,7 +524,7 @@ static void nvme_process_aers(void *opaque) req = n->aer_reqs[n->outstanding_aers]; - result = (NvmeAerResult *) &req->cqe.result; + result = (NvmeAerResult *) &req->cqe.result32; result->event_type = event->result.event_type; result->event_info = event->result.event_info; result->log_page = event->result.log_page; @@ -1247,7 +1247,7 @@ static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req) { uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff; - req->cqe.result = 1; + req->cqe.result32 = 1; if (nvme_check_sqid(n, sqid)) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1425,7 +1425,7 @@ defaults: } out: - req->cqe.result = cpu_to_le32(result); + req->cqe.result32 = cpu_to_le32(result); return NVME_SUCCESS; } @@ -1534,8 +1534,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) ((dw11 >> 16) & 0xFFFF) + 1, n->params.max_ioqpairs, n->params.max_ioqpairs); - req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) | - ((n->params.max_ioqpairs - 1) << 16)); + req->cqe.result32 = cpu_to_le32((n->params.max_ioqpairs - 1) | + ((n->params.max_ioqpairs - 1) << 16)); break; case NVME_ASYNCHRONOUS_EVENT_CONF: n->features.async_config = dw11; diff --git a/include/block/nvme.h b/include/block/nvme.h index 65e68a82c8..ac0ccfcb26 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -617,8 +617,10 @@ typedef struct QEMU_PACKED NvmeAerResult { } NvmeAerResult; typedef struct QEMU_PACKED NvmeCqe { - uint32_t result; - uint32_t rsvd; + union { + uint64_t result64; + uint32_t result32; + }; uint16_t sq_head; uint16_t sq_id; uint16_t cid; From patchwork Sat Sep 12 22:54:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772151 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CF250112E for ; Sat, 12 Sep 2020 23:01:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 938B620770 for ; Sat, 12 Sep 2020 23:01:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="UbcYHeG9" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 938B620770 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:45422 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEWY-0000YY-N3 for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:01:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48788) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPt-0003hS-QN; Sat, 12 Sep 2020 18:54:45 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26882) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPr-0005Sh-Me; Sat, 12 Sep 2020 18:54:45 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951283; x=1631487283; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=U0YF82MA56tPKneosTj7Mr55BcKBrlkeEjxyv7jPa74=; b=UbcYHeG9GIopPeB48IYyin4Dg4U9qSuQ9byWzUD2oHKzJanhaJAGs216 cwhxsabQV/1OOXEyuAXyFqoWFQqDiWv0XiDFPUYJfXMBxRzarJtinHss9 7yagTPm1txt3DgEbq5SohJSAGsD5JGXDDoF0/JlAPde4IwrDny/nsQiX6 83OInJaqVhyflw6//tDck/I5+xpinCEu/Cef+jZ2nL/I/dW74XaUd3LWP nrV7cIjuNj6DB+ijRPtJK5UduM6HfunWZEr8PD99rPZMpo+E3H/CT2Iw5 ySNCz+UZf5ThpPWF85wxeAzSPNOJdOYASQxgJdl4gA+bXti8+JH8TW/RI A==; IronPort-SDR: UlB+LkEosqHyiJSn1YL/y4eZIQWY+5aChmmpIhQeQnLiJshPXlYuAESOR1wq+ROXw7Q9uWmJAV X0VyXWSbYE/ItgFeAdIECSSmWoBGwIi0H3DYrBYRgBA5ky/xwfYnPigsdNvlZI3fN87kv2Y0C+ xX0pp+cu5dcq0CfLcRD50pyMqmEGaasY5k165yEyk/B9MspbB1IKnlE30RM2oMubIJ5SDAkdHd SJnuly5V1LLjbwD59bxA/RZKaBFMIaXsN060FodvJPlTxHKYylmBKFkA+7ScAAc2PFDH8GCRfL ZSw= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834838" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:41 +0800 IronPort-SDR: BRkhCc4uhSC4lRjQ/sp4U1LUdBeF25O8AhKi21TanVq2DSnUubR8sXnrzUA3iGtVlJdWuX/m5u SnXSbc8k5bwg== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:02 -0700 IronPort-SDR: sBG+Yiupy9I5Z3doTF7P2gdeusPQe+JYMFaJGXO0n9dVWnNSiIeBCZz9Wzt9W15JJ5HOAJMoWk gJvMipM81yjA== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:40 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 02/15] hw/block/nvme: Report actual LBA data shift in LBAF Date: Sun, 13 Sep 2020 07:54:17 +0900 Message-Id: <20200912225430.1772-3-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Calculate the data shift value to report based on the set value of logical_block_size device property. In the process, use a local variable to calculate the LBA format index instead of the hardcoded value 0. This makes the code more readable and it will make it easier to add support for multiple LBA formats in the future. Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 4 +++- hw/block/nvme.h | 11 +++++++++++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 3a90d80694..1cfc136042 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -2203,6 +2203,7 @@ static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) { int64_t bs_size; NvmeIdNs *id_ns = &ns->id_ns; + int lba_index; bs_size = blk_getlength(n->conf.blk); if (bs_size < 0) { @@ -2212,7 +2213,8 @@ static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) n->ns_size = bs_size; - id_ns->lbaf[0].ds = BDRV_SECTOR_BITS; + lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas); + id_ns->lbaf[lba_index].ds = nvme_ilog2(n->conf.logical_block_size); id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(n, ns)); /* no thin provisioning */ diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 52ba794f2e..190c974b6c 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -137,4 +137,15 @@ static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns) return n->ns_size >> nvme_ns_lbads(ns); } +static inline int nvme_ilog2(uint64_t i) +{ + int log = -1; + + while (i) { + i >>= 1; + log++; + } + return log; +} + #endif /* HW_NVME_H */ From patchwork Sat Sep 12 22:54:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772147 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 77B3C112E for ; Sat, 12 Sep 2020 23:01:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3A3D420770 for ; Sat, 12 Sep 2020 23:01:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="Nno9RYfQ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3A3D420770 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:42562 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEW1-0007pa-1b for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:01:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48808) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPv-0003m0-MV; Sat, 12 Sep 2020 18:54:47 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26879) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPt-0005R7-F1; Sat, 12 Sep 2020 18:54:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951285; x=1631487285; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QENL7yTeijRb1dwVCGSga8++FHk9Er5Wz/qHMOOR3pg=; b=Nno9RYfQKVaLEIEzxiszI7moqgm+qsOfh3sk3Y98uMX+UMLiGQkRkDOB ElnJlLOcsxmRYoP1MkY1IuSfRwoxa59+SuHcg/QkikiT6gvON1mPYoMiB 1nk2ZpiBpHhNrRMcrEbYjYcfFMIlFtPYNtBOBm5mBSxBPtpVm/X5NbdhV U54TmJc+9KfzEfmdBgWmRsG9XvY+NuPSZYOmJ8EtdzCxVsruaDnHAX6rA iQaBWnf15uu7PpozlIBf7vN5dG8mqUWvp3L1pxJ15M9iLUxOvTgXIBO0+ 0ju3yZ5Wa50b2n9gXAetKABnNyxb/uGujmy070UTru0f6E6HXqYbMwjQ+ A==; IronPort-SDR: 9Wp5Yvfz0Re2TXarxbU1nBtueEzbs9TF6dhq8+gh+8Jq04D2NUSyHKbJg4r5ySjl/riqk/76dL AD16XBaW0vfLoEBP5Pa6OrZYW0oQngAjYLp6uM+vXXykdBPuUnok8f20PWZiveJSiaA/xNGY9U D1ggeH77UO/an+PQRLlIMoa5qr0ra47+4BK2oV3ylvCs47SCVEGqy5g2RcWITLlUf46iCjfMna 16ya+0EEss2W8Rcuf7ZWAKt1cJCcTs5+0emAF0uT+JhyzfHRGr8pkRlCVAO8NRkvCUAsksXhPK JK8= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834840" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:43 +0800 IronPort-SDR: tHk6Fh2jSyvPVfCNqpVmQsNlMijMN2RFJ/TrjVUZBMGIiJTT8w6+eGJmyhwkDU10UFsYDj2NNx w50qgOLktGIQ== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:04 -0700 IronPort-SDR: 5+4XAPddIz1dDpQbSWkYtUAr1ormwPWaCcekZEWqI5v5tEiVwm7hnvFvAt6eAlPd89fYz0nuig mETFywf+TcYw== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:43 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 03/15] hw/block/nvme: Add Commands Supported and Effects log Date: Sun, 13 Sep 2020 07:54:18 +0900 Message-Id: <20200912225430.1772-4-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" This log page becomes necessary to implement to allow checking for Zone Append command support in Zoned Namespace Command Set. This commit adds the code to report this log page for NVM Command Set only. The parts that are specific to zoned operation will be added later in the series. Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 44 ++++++++++++++++++++++++++++++++++++++++++- hw/block/trace-events | 2 ++ include/block/nvme.h | 19 +++++++++++++++++++ 3 files changed, 64 insertions(+), 1 deletion(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 1cfc136042..39c2d5b0b4 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -957,6 +957,46 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len, DMA_DIRECTION_FROM_DEVICE, req); } +static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len, + uint64_t off, NvmeRequest *req) +{ + NvmeCmd *cmd = &req->cmd; + uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1); + uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2); + NvmeEffectsLog cmd_eff_log = {}; + uint32_t *iocs = cmd_eff_log.iocs; + uint32_t *acs = cmd_eff_log.acs; + uint32_t trans_len; + + trace_pci_nvme_cmd_supp_and_effects_log_read(); + + if (off >= sizeof(cmd_eff_log)) { + trace_pci_nvme_err_invalid_effects_log_offset(off); + return NVME_INVALID_FIELD | NVME_DNR; + } + + acs[NVME_ADM_CMD_DELETE_SQ] = NVME_CMD_EFFECTS_CSUPP; + acs[NVME_ADM_CMD_CREATE_SQ] = NVME_CMD_EFFECTS_CSUPP; + acs[NVME_ADM_CMD_DELETE_CQ] = NVME_CMD_EFFECTS_CSUPP; + acs[NVME_ADM_CMD_CREATE_CQ] = NVME_CMD_EFFECTS_CSUPP; + acs[NVME_ADM_CMD_IDENTIFY] = NVME_CMD_EFFECTS_CSUPP; + acs[NVME_ADM_CMD_SET_FEATURES] = NVME_CMD_EFFECTS_CSUPP; + acs[NVME_ADM_CMD_GET_FEATURES] = NVME_CMD_EFFECTS_CSUPP; + acs[NVME_ADM_CMD_GET_LOG_PAGE] = NVME_CMD_EFFECTS_CSUPP; + acs[NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFFECTS_CSUPP; + + iocs[NVME_CMD_FLUSH] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFFECTS_CSUPP | + NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_WRITE] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_READ] = NVME_CMD_EFFECTS_CSUPP; + + trans_len = MIN(sizeof(cmd_eff_log) - off, buf_len); + + return nvme_dma_prp(n, ((uint8_t *)&cmd_eff_log) + off, trans_len, + prp1, prp2, DMA_DIRECTION_FROM_DEVICE, req); +} + static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) { NvmeCmd *cmd = &req->cmd; @@ -1000,6 +1040,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) return nvme_smart_info(n, rae, len, off, req); case NVME_LOG_FW_SLOT_INFO: return nvme_fw_log_info(n, len, off, req); + case NVME_LOG_CMD_EFFECTS: + return nvme_cmd_effects(n, len, off, req); default: trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid); return NVME_INVALID_FIELD | NVME_DNR; @@ -2350,7 +2392,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) id->acl = 3; id->aerl = n->params.aerl; id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO; - id->lpa = NVME_LPA_EXTENDED; + id->lpa = NVME_LPA_CSE | NVME_LPA_EXTENDED; /* recommended default value (~70 C) */ id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING); diff --git a/hw/block/trace-events b/hw/block/trace-events index 72cf2d15cb..79c9da652d 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -83,6 +83,7 @@ pci_nvme_mmio_start_success(void) "setting controller enable bit succeeded" pci_nvme_mmio_stopped(void) "cleared controller enable bit" pci_nvme_mmio_shutdown_set(void) "shutdown bit set" pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared" +pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects log read" # nvme traces for error conditions pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" @@ -95,6 +96,7 @@ pci_nvme_err_invalid_ns(uint32_t ns, uint32_t limit) "invalid namespace %u not w pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8"" pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8"" pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64"" +pci_nvme_err_invalid_effects_log_offset(uint64_t ofs) "commands supported and effects log offset must be 0, got %"PRIu64"" pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16"" pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16"" pci_nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16"" diff --git a/include/block/nvme.h b/include/block/nvme.h index ac0ccfcb26..62136a906f 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -736,10 +736,27 @@ enum NvmeSmartWarn { NVME_SMART_FAILED_VOLATILE_MEDIA = 1 << 4, }; +typedef struct NvmeEffectsLog { + uint32_t acs[256]; + uint32_t iocs[256]; + uint8_t resv[2048]; +} NvmeEffectsLog; + +enum { + NVME_CMD_EFFECTS_CSUPP = 1 << 0, + NVME_CMD_EFFECTS_LBCC = 1 << 1, + NVME_CMD_EFFECTS_NCC = 1 << 2, + NVME_CMD_EFFECTS_NIC = 1 << 3, + NVME_CMD_EFFECTS_CCC = 1 << 4, + NVME_CMD_EFFECTS_CSE_MASK = 3 << 16, + NVME_CMD_EFFECTS_UUID_SEL = 1 << 19, +}; + enum NvmeLogIdentifier { NVME_LOG_ERROR_INFO = 0x01, NVME_LOG_SMART_INFO = 0x02, NVME_LOG_FW_SLOT_INFO = 0x03, + NVME_LOG_CMD_EFFECTS = 0x05, }; typedef struct QEMU_PACKED NvmePSD { @@ -851,6 +868,7 @@ enum NvmeIdCtrlFrmw { }; enum NvmeIdCtrlLpa { + NVME_LPA_CSE = 1 << 1, NVME_LPA_EXTENDED = 1 << 2, }; @@ -1050,6 +1068,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeErrorLog) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); + QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); From patchwork Sat Sep 12 22:54:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772153 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 32FE0746 for ; Sat, 12 Sep 2020 23:02:28 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EB3A620770 for ; Sat, 12 Sep 2020 23:02:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="ES+19dhF" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EB3A620770 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:49738 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEXL-0002ID-1G for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:02:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48832) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPy-0003s0-Cs; Sat, 12 Sep 2020 18:54:50 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26887) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPw-0005VV-6Z; Sat, 12 Sep 2020 18:54:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951287; x=1631487287; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2oLHRnibop+I5Gl3pRqjcltaMzR53KAvQqxeRCbBeqo=; b=ES+19dhF7/a9PgEs43Ii+jzxnqfw9tAPgQ2nr8s7QxK8xQMC2mbj2QQw lquLf8txHSeJe6p8QLjFZuVLk7l93thY9S7Pll9m8k049yLsO/XHUrckH WVZOCsC8hcICQiGtFy3tGbUkAX+DXhpwsKR8Ccs6rqpd9RuLM+VL5KIPv ObsOYeg+bMxhgF3Oz4uAnNa4m/iPARm3mOIhAI7I3VpQcLq6Yt4Fic/am fYlbj+j8NxX7SqXkj+L3lWz2B0HfJnHTDQLADijNwjduIuY+y7922USKV 2jJo8MbV0avuyjyxPd/216787M4HpIqUjsB93IvIyvu8rjtFJgZ3B4/V0 A==; IronPort-SDR: zJdO1Mo+ZGIGWJPWXmseGhqZR1U2nr1NbwL6Os08Jk1MDu/sLxUs2nHQP4F/wn4XFGs7AGr7qY 73Ma9pU2BbNTF20bakFWRAR5q85opVtcPE3gsfaWMovXxTHbax0vlsrJOzZA17PdlF8d744IxI DeSe9aZSXKoX0gvj7tBVjjvNELEvoaAzG213En29vefN6v78/BVPGQ7pOgdN3PTteBbWBFdQuI c/ozlH3v/r+C19lQS6apzdcirsdrfXQaJdXVEgol37LGmDXIb8AgJdVHthVRe3+InNo5AoUKqu Cxo= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834842" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:46 +0800 IronPort-SDR: NAUnWnTLGfKqo0YfD6XCjRnzbQCgeHUyPYm013NYs21cgyq+czMXaxHhWwtBCPs4b6c3brZBW+ GTIe9sYSlX2Q== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:06 -0700 IronPort-SDR: cmxewNlzxaX4DPxP+PoeP1wsePXfx3mQszgi+0Ty75jN8Uqf7p+6YMTPabzTopzoVwZE6X1cBS MNir2JsgFQjw== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:45 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 04/15] hw/block/nvme: Introduce the Namespace Types definitions Date: Sun, 13 Sep 2020 07:54:19 +0900 Message-Id: <20200912225430.1772-5-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Niklas Cassel Define the structures and constants required to implement Namespace Types support. Signed-off-by: Niklas Cassel Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 2 +- hw/block/nvme.h | 3 ++ include/block/nvme.h | 74 +++++++++++++++++++++++++++++++++++--------- 3 files changed, 64 insertions(+), 15 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 39c2d5b0b4..4bd88f4046 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1259,7 +1259,7 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) * here. */ ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID; - ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN; + ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID; stl_be_p(&ns_descrs->uuid.v, nsid); return nvme_dma_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2, diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 190c974b6c..252e2d5921 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -64,6 +64,9 @@ typedef struct NvmeCQueue { typedef struct NvmeNamespace { NvmeIdNs id_ns; + uint32_t nsid; + uint8_t csi; + QemuUUID uuid; } NvmeNamespace; static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns) diff --git a/include/block/nvme.h b/include/block/nvme.h index 62136a906f..f2cff5aa6b 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -51,6 +51,11 @@ enum NvmeCapMask { CAP_PMR_MASK = 0x1, }; +enum NvmeCapCssBits { + CAP_CSS_NVM = 0x01, + CAP_CSS_CSI_SUPP = 0x40, +}; + #define NVME_CAP_MQES(cap) (((cap) >> CAP_MQES_SHIFT) & CAP_MQES_MASK) #define NVME_CAP_CQR(cap) (((cap) >> CAP_CQR_SHIFT) & CAP_CQR_MASK) #define NVME_CAP_AMS(cap) (((cap) >> CAP_AMS_SHIFT) & CAP_AMS_MASK) @@ -102,6 +107,12 @@ enum NvmeCcMask { CC_IOCQES_MASK = 0xf, }; +enum NvmeCcCss { + CSS_NVM_ONLY = 0, + CSS_CSI = 6, + CSS_ADMIN_ONLY = 7, +}; + #define NVME_CC_EN(cc) ((cc >> CC_EN_SHIFT) & CC_EN_MASK) #define NVME_CC_CSS(cc) ((cc >> CC_CSS_SHIFT) & CC_CSS_MASK) #define NVME_CC_MPS(cc) ((cc >> CC_MPS_SHIFT) & CC_MPS_MASK) @@ -110,6 +121,21 @@ enum NvmeCcMask { #define NVME_CC_IOSQES(cc) ((cc >> CC_IOSQES_SHIFT) & CC_IOSQES_MASK) #define NVME_CC_IOCQES(cc) ((cc >> CC_IOCQES_SHIFT) & CC_IOCQES_MASK) +#define NVME_SET_CC_EN(cc, val) \ + (cc |= (uint32_t)((val) & CC_EN_MASK) << CC_EN_SHIFT) +#define NVME_SET_CC_CSS(cc, val) \ + (cc |= (uint32_t)((val) & CC_CSS_MASK) << CC_CSS_SHIFT) +#define NVME_SET_CC_MPS(cc, val) \ + (cc |= (uint32_t)((val) & CC_MPS_MASK) << CC_MPS_SHIFT) +#define NVME_SET_CC_AMS(cc, val) \ + (cc |= (uint32_t)((val) & CC_AMS_MASK) << CC_AMS_SHIFT) +#define NVME_SET_CC_SHN(cc, val) \ + (cc |= (uint32_t)((val) & CC_SHN_MASK) << CC_SHN_SHIFT) +#define NVME_SET_CC_IOSQES(cc, val) \ + (cc |= (uint32_t)((val) & CC_IOSQES_MASK) << CC_IOSQES_SHIFT) +#define NVME_SET_CC_IOCQES(cc, val) \ + (cc |= (uint32_t)((val) & CC_IOCQES_MASK) << CC_IOCQES_SHIFT) + enum NvmeCstsShift { CSTS_RDY_SHIFT = 0, CSTS_CFS_SHIFT = 1, @@ -524,8 +550,13 @@ typedef struct QEMU_PACKED NvmeIdentify { uint64_t rsvd2[2]; uint64_t prp1; uint64_t prp2; - uint32_t cns; - uint32_t rsvd11[5]; + uint8_t cns; + uint8_t rsvd10; + uint16_t ctrlid; + uint16_t nvmsetid; + uint8_t rsvd11; + uint8_t csi; + uint32_t rsvd12[4]; } NvmeIdentify; typedef struct QEMU_PACKED NvmeRwCmd { @@ -647,6 +678,7 @@ enum NvmeStatusCodes { NVME_MD_SGL_LEN_INVALID = 0x0010, NVME_SGL_DESCR_TYPE_INVALID = 0x0011, NVME_INVALID_USE_OF_CMB = 0x0012, + NVME_CMD_SET_CMB_REJECTED = 0x002b, NVME_LBA_RANGE = 0x0080, NVME_CAP_EXCEEDED = 0x0081, NVME_NS_NOT_READY = 0x0082, @@ -773,11 +805,15 @@ typedef struct QEMU_PACKED NvmePSD { #define NVME_IDENTIFY_DATA_SIZE 4096 -enum { - NVME_ID_CNS_NS = 0x0, - NVME_ID_CNS_CTRL = 0x1, - NVME_ID_CNS_NS_ACTIVE_LIST = 0x2, - NVME_ID_CNS_NS_DESCR_LIST = 0x3, +enum NvmeIdCns { + NVME_ID_CNS_NS = 0x00, + NVME_ID_CNS_CTRL = 0x01, + NVME_ID_CNS_NS_ACTIVE_LIST = 0x02, + NVME_ID_CNS_NS_DESCR_LIST = 0x03, + NVME_ID_CNS_CS_NS = 0x05, + NVME_ID_CNS_CS_CTRL = 0x06, + NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07, + NVME_ID_CNS_IO_COMMAND_SET = 0x1c, }; typedef struct QEMU_PACKED NvmeIdCtrl { @@ -924,6 +960,7 @@ enum NvmeFeatureIds { NVME_WRITE_ATOMICITY = 0xa, NVME_ASYNCHRONOUS_EVENT_CONF = 0xb, NVME_TIMESTAMP = 0xe, + NVME_COMMAND_SET_PROFILE = 0x19, NVME_SOFTWARE_PROGRESS_MARKER = 0x80, NVME_FID_MAX = 0x100, }; @@ -1008,18 +1045,26 @@ typedef struct QEMU_PACKED NvmeIdNsDescr { uint8_t rsvd2[2]; } NvmeIdNsDescr; -enum { - NVME_NIDT_EUI64_LEN = 8, - NVME_NIDT_NGUID_LEN = 16, - NVME_NIDT_UUID_LEN = 16, +enum NvmeNsIdentifierLength { + NVME_NIDL_EUI64 = 8, + NVME_NIDL_NGUID = 16, + NVME_NIDL_UUID = 16, + NVME_NIDL_CSI = 1, }; enum NvmeNsIdentifierType { - NVME_NIDT_EUI64 = 0x1, - NVME_NIDT_NGUID = 0x2, - NVME_NIDT_UUID = 0x3, + NVME_NIDT_EUI64 = 0x01, + NVME_NIDT_NGUID = 0x02, + NVME_NIDT_UUID = 0x03, + NVME_NIDT_CSI = 0x04, }; +enum NvmeCsi { + NVME_CSI_NVM = 0x00, +}; + +#define NVME_SET_CSI(vec, csi) (vec |= (uint8_t)(1 << (csi))) + /*Deallocate Logical Block Features*/ #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat) ((dlfeat) & 0x10) #define NVME_ID_NS_DLFEAT_WRITE_ZEROES(dlfeat) ((dlfeat) & 0x08) @@ -1070,6 +1115,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); From patchwork Sat Sep 12 22:54:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772157 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BD5B0746 for ; Sat, 12 Sep 2020 23:02:51 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6B01A20770 for ; Sat, 12 Sep 2020 23:02:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="Rm8kh8e0" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6B01A20770 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:51810 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEXi-00036i-8Y for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:02:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48858) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPz-0003vq-Ti; Sat, 12 Sep 2020 18:54:51 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26879) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEPx-0005R7-Pe; Sat, 12 Sep 2020 18:54:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951289; x=1631487289; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zxe81MuX4urUutIHEtJFE6J8nCgNhHUrmEj1y7p0mBw=; b=Rm8kh8e0OLV0mWx6lzhlB7NDHgq1SEdT1DONXzqXJcRaPD1pFkUAR24F r9DvIcNvyq3P3K0l5VYwItjHRs3lGlmVp06th18zJQNus4Dx3++egU5Vy E7jan7HhCByU3UV4biDV9UFMTcyi+bBOew1EjeyFZddUhKhBsSYUsZ4ie lE5kqkvRG2qgZcF8OeSX954eTtkaP+gDUlwGRwBm9ukJGIQ4I+sBAU1lo 7Ip6cCiEbaRMkx8kFS80NHRm8Qc4ecgWShrpysZIRBBepENHyic+6LFvt X0mzfy9DAL8ZP35GJYw5f3NCQ6X/61BizIA/wDCgtxw/HaRfrW77cWTCd g==; IronPort-SDR: Omeotocw6yOjQheq3wNJnm8eW+PBrBrdDW57eJqQzuOxlULmrofN2yoGo9j7LjjXxcBQedqlZG ixdUPtavgxvWuKbmzL2tnSgdxSh/uH04/tE7Q43YMsWi+C9I/sdQWj3VWbRbjQB9VW4gpc0tci QENTm1A54pCUjFRFjbBF4dfhposku7pfUsm8CN4uomadCEV7QHsdIm0Cjnog+7bM8ImH0Ol6B+ 3DiTzlizQ8L4iAuou7gIMK2fsyjXBaQ4q2o9EsthLUC8yw3Vyxhbi5dZXVjnvcBSGWC3KqW1d3 FrY= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834844" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:48 +0800 IronPort-SDR: fvEiqaCauq2J8sp7Bt4pqv7GzrJB5dWnH4hiaX4m+wKKYFd58d4wL/RaxGc/2QfgVRarEoB5l2 hfCvWJO+/Hig== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:08 -0700 IronPort-SDR: SV0QDCDHVHLv04t4jcrRsErooydDBp2z/OQC5iGkgo7QaM0u8iYckrFJqXHv8IsqcNA8ImNWQZ dQLtCtwgTfAw== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:47 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 05/15] hw/block/nvme: Define trace events related to NS Types Date: Sun, 13 Sep 2020 07:54:20 +0900 Message-Id: <20200912225430.1772-6-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" A few trace events are defined that are relevant to implementing Namespace Types (NVMe TP 4056). Signed-off-by: Dmitry Fomichev Reviewed-by: Klaus Jensen --- hw/block/trace-events | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/hw/block/trace-events b/hw/block/trace-events index 79c9da652d..2414dcbc79 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -46,8 +46,12 @@ pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16"" pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16"" pci_nvme_identify_ctrl(void) "identify controller" +pci_nvme_identify_ctrl_csi(uint8_t csi) "identify controller, csi=0x%"PRIx8"" pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32"" +pci_nvme_identify_ns_csi(uint32_t ns, uint8_t csi) "nsid=%"PRIu32", csi=0x%"PRIx8"" pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32"" +pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=%"PRIu16", csi=0x%"PRIx8"" +pci_nvme_identify_cmd_set(void) "identify i/o command set" pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32"" pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64"" pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32"" @@ -84,6 +88,8 @@ pci_nvme_mmio_stopped(void) "cleared controller enable bit" pci_nvme_mmio_shutdown_set(void) "shutdown bit set" pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared" pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects log read" +pci_nvme_css_nvm_cset_selected_by_host(uint32_t cc) "NVM command set selected by host, bar.cc=0x%"PRIx32"" +pci_nvme_css_all_csets_sel_by_host(uint32_t cc) "all supported command sets selected by host, bar.cc=0x%"PRIx32"" # nvme traces for error conditions pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" @@ -97,6 +103,9 @@ pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8"" pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8"" pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64"" pci_nvme_err_invalid_effects_log_offset(uint64_t ofs) "commands supported and effects log offset must be 0, got %"PRIu64"" +pci_nvme_err_change_css_when_enabled(void) "changing CC.CSS while controller is enabled" +pci_nvme_err_only_nvm_cmd_set_avail(void) "setting 110b CC.CSS, but only NVM command set is enabled" +pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32"" pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16"" pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16"" pci_nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16"" @@ -152,6 +161,7 @@ pci_nvme_ub_db_wr_invalid_cq(uint32_t qid) "completion queue doorbell write for pci_nvme_ub_db_wr_invalid_cqhead(uint32_t qid, uint16_t new_head) "completion queue doorbell write value beyond queue size, cqid=%"PRIu32", new_head=%"PRIu16", ignoring" pci_nvme_ub_db_wr_invalid_sq(uint32_t qid) "submission queue doorbell write for nonexistent queue, sqid=%"PRIu32", ignoring" pci_nvme_ub_db_wr_invalid_sqtail(uint32_t qid, uint16_t new_tail) "submission queue doorbell write value beyond queue size, sqid=%"PRIu32", new_head=%"PRIu16", ignoring" +pci_nvme_ub_unknown_css_value(void) "unknown value in cc.css field" # xen-block.c xen_block_realize(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u" From patchwork Sat Sep 12 22:54:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772165 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0B050139F for ; Sat, 12 Sep 2020 23:05:35 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B10BC20723 for ; Sat, 12 Sep 2020 23:05:34 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="mnM8DzU+" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B10BC20723 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:35622 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEaL-00082w-RH for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:05:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48890) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ2-00040N-NZ; Sat, 12 Sep 2020 18:54:55 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26887) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ0-0005VV-55; Sat, 12 Sep 2020 18:54:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951291; x=1631487291; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oGpiw/XjGVJV5+zoC8o+kdLyx5GH5YcYXRJCfsY8ndM=; b=mnM8DzU+qERUKSIYZT7Lmbun62IYGWXRfivNng5dajIvv5WHAjb9XflR 9xORzKxRyxQUrS8xRdg74Ne/lldB5f4nO3nWNQGkx+Aqc/UfCqP7MiAOi GI1dCipdgk+ydmd81CJSaG5dhcz9AKWbQ9IMomquoqu8MwT+sYSjumUXp jIcj+8Gz7CULJGkWj65by3eP7ihzwhyWSDaxL0iYJ/1m2vxlC2pBy/rnJ t0uyvzOEaPPPkmTmSHJV2eLT3mnFQSH2l94sV/gHsYcXNBKwOeIa/kxYH +U5VcVPwKbxccXPlOxWyUuCRAPop8BAPyB6kZmZwnsFjoHm2PvjUjiN4w g==; IronPort-SDR: 7Mti2WJg02uLdU4FEix96nVLBGljsMlKT+cidRdQD5+z3Gf/8Jh32imKrNZ5LzV62ZytaMAhQt KGMHVDfquNABechQj3VK3w8YU1E/oj1NSQIcHfvN1pxrGUHZ4o88vzEZMjHtHHhgXhEm7BYnGi 3ljn7S56EebAjEZV5aW37CIBhF4mS1fVzhqkwSDP0TLO0DBp5Fov13HYcs7NrGuVZGRuWBU9lC YOQnv5NpAvtE6ZWHHDlbv+a2GrAkGOlOWHjPbOtq7FCLEyb1PUJWyMAKZ+M8W4T+O1jkyQwyfJ m6Q= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834849" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:50 +0800 IronPort-SDR: sO06Nav3ZesCMJZk/WP+JXbuDJ1IQSrppUDDjaQI78vr2sjs2naG/IMVgI+q6ZMg6uh6a3KaiH SZl90C2Tj2kw== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:10 -0700 IronPort-SDR: aRt39J88Yx4SXv1CdsX83GUM/DrlBP7NyDyoLzjNmnyyNSGnUt3StE+JQQBqhd3mzxKwBNKPnH pr9Jmcwn+/lg== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:49 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 06/15] hw/block/nvme: Add support for Namespace Types Date: Sun, 13 Sep 2020 07:54:21 +0900 Message-Id: <20200912225430.1772-7-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Niklas Cassel Namespace Types introduce a new command set, "I/O Command Sets", that allows the host to retrieve the command sets associated with a namespace. Introduce support for the command set and enable detection for the NVM Command Set. The new workflows for identify commands rely heavily on zero-filled identify structs. E.g., certain CNS commands are defined to return a zero-filled identify struct when an inactive namespace NSID is supplied. Add a helper function in order to avoid code duplication when reporting zero-filled identify structures. Signed-off-by: Niklas Cassel Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 208 +++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 188 insertions(+), 20 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 4bd88f4046..004f1c9578 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1153,6 +1153,19 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req) return NVME_SUCCESS; } +static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, uint64_t prp1, + uint64_t prp2, NvmeRequest *req) +{ + uint16_t status; + + status = nvme_map_prp(n, prp1, prp2, NVME_IDENTIFY_DATA_SIZE, req); + if (status) { + return status; + } + nvme_fill_data(&req->qsg, &req->iov, 0, 0); + return NVME_SUCCESS; +} + static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; @@ -1165,6 +1178,21 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) prp2, DMA_DIRECTION_FROM_DEVICE, req); } +static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + uint64_t prp1 = le64_to_cpu(c->prp1); + uint64_t prp2 = le64_to_cpu(c->prp2); + + trace_pci_nvme_identify_ctrl_csi(c->csi); + + if (c->csi == NVME_CSI_NVM) { + return nvme_rpt_empty_id_struct(n, prp1, prp2, req); + } + + return NVME_INVALID_FIELD | NVME_DNR; +} + static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) { NvmeNamespace *ns; @@ -1181,11 +1209,37 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) } ns = &n->namespaces[nsid - 1]; + assert(nsid == ns->nsid); return nvme_dma_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns), prp1, prp2, DMA_DIRECTION_FROM_DEVICE, req); } +static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + NvmeNamespace *ns; + uint32_t nsid = le32_to_cpu(c->nsid); + uint64_t prp1 = le64_to_cpu(c->prp1); + uint64_t prp2 = le64_to_cpu(c->prp2); + + trace_pci_nvme_identify_ns_csi(nsid, c->csi); + + if (unlikely(nsid == 0 || nsid > n->num_namespaces)) { + trace_pci_nvme_err_invalid_ns(nsid, n->num_namespaces); + return NVME_INVALID_NSID | NVME_DNR; + } + + ns = &n->namespaces[nsid - 1]; + assert(nsid == ns->nsid); + + if (c->csi == NVME_CSI_NVM) { + return nvme_rpt_empty_id_struct(n, prp1, prp2, req); + } + + return NVME_INVALID_FIELD | NVME_DNR; +} + static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; @@ -1225,23 +1279,51 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) return ret; } +static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + static const int data_len = NVME_IDENTIFY_DATA_SIZE; + uint32_t min_nsid = le32_to_cpu(c->nsid); + uint64_t prp1 = le64_to_cpu(c->prp1); + uint64_t prp2 = le64_to_cpu(c->prp2); + uint32_t *list; + uint16_t ret; + int i, j = 0; + + trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi); + + if (c->csi != NVME_CSI_NVM) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + list = g_malloc0(data_len); + for (i = 0; i < n->num_namespaces; i++) { + if (i < min_nsid) { + continue; + } + list[j++] = cpu_to_le32(i + 1); + if (j == data_len / sizeof(uint32_t)) { + break; + } + } + ret = nvme_dma_prp(n, (uint8_t *)list, data_len, prp1, prp2, + DMA_DIRECTION_FROM_DEVICE, req); + g_free(list); + return ret; +} + static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + NvmeNamespace *ns; uint32_t nsid = le32_to_cpu(c->nsid); uint64_t prp1 = le64_to_cpu(c->prp1); uint64_t prp2 = le64_to_cpu(c->prp2); - - uint8_t list[NVME_IDENTIFY_DATA_SIZE]; - - struct data { - struct { - NvmeIdNsDescr hdr; - uint8_t v[16]; - } uuid; - }; - - struct data *ns_descrs = (struct data *)list; + void *buf_ptr; + NvmeIdNsDescr *desc; + static const int data_len = NVME_IDENTIFY_DATA_SIZE; + uint8_t *buf; + uint16_t status; trace_pci_nvme_identify_ns_descr_list(nsid); @@ -1250,7 +1332,11 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_NSID | NVME_DNR; } - memset(list, 0x0, sizeof(list)); + ns = &n->namespaces[nsid - 1]; + assert(nsid == ns->nsid); + + buf = g_malloc0(data_len); + buf_ptr = buf; /* * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data @@ -1258,12 +1344,44 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) * Namespace Identification Descriptor. Add a very basic Namespace UUID * here. */ - ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID; - ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID; - stl_be_p(&ns_descrs->uuid.v, nsid); + desc = buf_ptr; + desc->nidt = NVME_NIDT_UUID; + desc->nidl = NVME_NIDL_UUID; + buf_ptr += sizeof(*desc); + memcpy(buf_ptr, ns->uuid.data, NVME_NIDL_UUID); + buf_ptr += NVME_NIDL_UUID; - return nvme_dma_prp(n, list, NVME_IDENTIFY_DATA_SIZE, prp1, prp2, - DMA_DIRECTION_FROM_DEVICE, req); + desc = buf_ptr; + desc->nidt = NVME_NIDT_CSI; + desc->nidl = NVME_NIDL_CSI; + buf_ptr += sizeof(*desc); + *(uint8_t *)buf_ptr = NVME_CSI_NVM; + + status = nvme_dma_prp(n, buf, data_len, prp1, prp2, + DMA_DIRECTION_FROM_DEVICE, req); + g_free(buf); + return status; +} + +static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + uint64_t prp1 = le64_to_cpu(c->prp1); + uint64_t prp2 = le64_to_cpu(c->prp2); + static const int data_len = NVME_IDENTIFY_DATA_SIZE; + uint32_t *list; + uint8_t *ptr; + uint16_t status; + + trace_pci_nvme_identify_cmd_set(); + + list = g_malloc0(data_len); + ptr = (uint8_t *)list; + NVME_SET_CSI(*ptr, NVME_CSI_NVM); + status = nvme_dma_prp(n, (uint8_t *)list, data_len, prp1, prp2, + DMA_DIRECTION_FROM_DEVICE, req); + g_free(list); + return status; } static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) @@ -1273,12 +1391,20 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) switch (le32_to_cpu(c->cns)) { case NVME_ID_CNS_NS: return nvme_identify_ns(n, req); + case NVME_ID_CNS_CS_NS: + return nvme_identify_ns_csi(n, req); case NVME_ID_CNS_CTRL: return nvme_identify_ctrl(n, req); + case NVME_ID_CNS_CS_CTRL: + return nvme_identify_ctrl_csi(n, req); case NVME_ID_CNS_NS_ACTIVE_LIST: return nvme_identify_nslist(n, req); + case NVME_ID_CNS_CS_NS_ACTIVE_LIST: + return nvme_identify_nslist_csi(n, req); case NVME_ID_CNS_NS_DESCR_LIST: return nvme_identify_ns_descr_list(n, req); + case NVME_ID_CNS_IO_COMMAND_SET: + return nvme_identify_cmd_set(n, req); default: trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns)); return NVME_INVALID_FIELD | NVME_DNR; @@ -1460,6 +1586,9 @@ defaults: result |= NVME_INTVC_NOCOALESCING; } + break; + case NVME_COMMAND_SET_PROFILE: + result = 0; break; default: result = nvme_feature_default[fid]; @@ -1584,6 +1713,12 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) break; case NVME_TIMESTAMP: return nvme_set_feature_timestamp(n, req); + case NVME_COMMAND_SET_PROFILE: + if (dw11 & 0x1ff) { + trace_pci_nvme_err_invalid_iocsci(dw11 & 0x1ff); + return NVME_CMD_SET_CMB_REJECTED | NVME_DNR; + } + break; default: return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR; } @@ -1845,6 +1980,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data, break; case 0x14: /* CC */ trace_pci_nvme_mmio_cfg(data & 0xffffffff); + + if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) { + if (NVME_CC_EN(n->bar.cc)) { + NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled, + "changing selected command set when enabled"); + } else { + switch (NVME_CC_CSS(data)) { + case CSS_NVM_ONLY: + trace_pci_nvme_css_nvm_cset_selected_by_host(data & + 0xffffffff); + break; + case CSS_CSI: + NVME_SET_CC_CSS(n->bar.cc, CSS_CSI); + trace_pci_nvme_css_all_csets_sel_by_host(data & 0xffffffff); + break; + case CSS_ADMIN_ONLY: + break; + default: + NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value, + "unknown value in CC.CSS field"); + } + } + } + /* Windows first sends data, then sends enable bit */ if (!NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc) && !NVME_CC_SHN(data) && !NVME_CC_SHN(n->bar.cc)) @@ -2255,6 +2414,8 @@ static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) n->ns_size = bs_size; + ns->csi = NVME_CSI_NVM; + qemu_uuid_generate(&ns->uuid); /* TODO make UUIDs persistent */ lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas); id_ns->lbaf[lba_index].ds = nvme_ilog2(n->conf.logical_block_size); id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(n, ns)); @@ -2419,7 +2580,11 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) NVME_CAP_SET_MQES(n->bar.cap, 0x7ff); NVME_CAP_SET_CQR(n->bar.cap, 1); NVME_CAP_SET_TO(n->bar.cap, 0xf); - NVME_CAP_SET_CSS(n->bar.cap, 1); + /* + * The device now always supports NS Types, but all commands + * that support CSI field will only handle NVM Command Set. + */ + NVME_CAP_SET_CSS(n->bar.cap, (CAP_CSS_NVM | CAP_CSS_CSI_SUPP)); NVME_CAP_SET_MPSMAX(n->bar.cap, 4); n->bar.vs = NVME_SPEC_VER; @@ -2429,6 +2594,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) static void nvme_realize(PCIDevice *pci_dev, Error **errp) { NvmeCtrl *n = NVME(pci_dev); + NvmeNamespace *ns; Error *local_err = NULL; int i; @@ -2454,8 +2620,10 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) nvme_init_ctrl(n, pci_dev); - for (i = 0; i < n->num_namespaces; i++) { - nvme_init_namespace(n, &n->namespaces[i], &local_err); + ns = n->namespaces; + for (i = 0; i < n->num_namespaces; i++, ns++) { + ns->nsid = i + 1; + nvme_init_namespace(n, ns, &local_err); if (local_err) { error_propagate(errp, local_err); return; From patchwork Sat Sep 12 22:54:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772133 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 58F0114F6 for ; Sat, 12 Sep 2020 22:57:07 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 13D4320731 for ; Sat, 12 Sep 2020 22:57:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="k7orfZzw" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 13D4320731 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:52146 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHESA-0008TU-5B for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 18:57:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48916) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ5-00043m-AK; Sat, 12 Sep 2020 18:54:57 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26879) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ2-0005R7-Lw; Sat, 12 Sep 2020 18:54:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951294; x=1631487294; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=JSDNzq2YRZFWgQQ5laOdIp5GgR3BkqG8oSkdubzhgO8=; b=k7orfZzwdxVmDo2P241OplY7+E8YpDTqnGfTEHp1ygWzisVYMQT1da79 rqMi+l0NuwCX6ZDHGaHfbR4TfPRBOWUR11DuxSGRvJCZ0nCxrCwRA7YRv WLSHLF8BlWZxewJmnicoDlDTut67/KyTZkUpY2VYW2xLSa1bHBnKzpZcv 34f2t0tZPgqm3choaWu88JS3dFps2iJ4y0naocVq31thrQbJTzoreor8L a3jsPOAa/uJxA9ZG0v7e+Z120Cj1VAQJ7jGzZj7PTZtZWaaLy0cBjt14o Ky2BVyCXIEcG2MXSZs3uEhxicnySOYCmewLVfeLAN4PX8oFE5fb+bLGVJ w==; IronPort-SDR: XVzY9q64VtWmkb7aY0/IQFTembI9SBFExHg0DdW/s0KW/WPpipnvH1g6DOYqWogLauHhy9Xv1X 5ixedDMOizWglftvbAuDwKy9vpchiuezE++sFVFu/c/ArNn37ZeIr7rmtrIRD59HUAloC3crWL wqvZoLx7XE8h1ehZT4rClprNX1P+fGbS9bcGqI5FI8dXDpRl5GVPoJqlTvWxwy53DKf0eTuMgL z+Aie1UHqCNFcHjxkSBbbYoEnVqEXPMz45gNZkwbU/L8HlqMhHK7B5u69IHEGB6R59I6tVsAft LiU= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834852" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:52 +0800 IronPort-SDR: WmDt/lo5i97H9Ju0jc5sqGeGzzLAamFMHRObdK56/M49zhMqvbYMsC3hLKwxKyfvcpWdNQv5HI +bRHSsMpToGQ== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:13 -0700 IronPort-SDR: 1d7E9fjzi06GMMTpiOhiQduQYRQmMi29ITMUOBEZeXp0MQxLW7i9Gti2HUY48IsEGoK1WYlHp4 fqpxtzQUG1IQ== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:51 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 07/15] hw/block/nvme: Add support for active/inactive namespaces Date: Sun, 13 Sep 2020 07:54:22 +0900 Message-Id: <20200912225430.1772-8-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Niklas Cassel In NVMe, a namespace is active if it exists and is attached to the controller. CAP.CSS (together with the I/O Command Set data structure) defines what command sets are supported by the controller. CC.CSS (together with Set Profile) can be set to enable a subset of the available command sets. The namespaces belonging to a disabled command set will not be able to attach to the controller, and will thus be inactive. E.g., if the user sets CC.CSS to Admin Only, NVM namespaces should be marked as inactive. The identify namespace, the identify namespace CSI specific, and the namespace list commands have two different versions, one that only shows active namespaces, and the other version that shows existing namespaces, regardless of whether the namespace is attached or not. Add an attached member to struct NvmeNamespace, and implement the missing CNS commands. The added functionality will also simplify the implementation of namespace management in the future, since namespace management can also attach and detach namespaces. Signed-off-by: Niklas Cassel Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 54 ++++++++++++++++++++++++++++++++++++-------- hw/block/nvme.h | 1 + include/block/nvme.h | 20 +++++++++------- 3 files changed, 57 insertions(+), 18 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 004f1c9578..6dd6bf9183 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1193,7 +1193,8 @@ static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } -static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req, + bool only_active) { NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; @@ -1211,11 +1212,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) ns = &n->namespaces[nsid - 1]; assert(nsid == ns->nsid); + if (only_active && !ns->attached) { + return nvme_rpt_empty_id_struct(n, prp1, prp2, req); + } + return nvme_dma_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns), prp1, prp2, DMA_DIRECTION_FROM_DEVICE, req); } -static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req, + bool only_active) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; NvmeNamespace *ns; @@ -1233,6 +1239,10 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req) ns = &n->namespaces[nsid - 1]; assert(nsid == ns->nsid); + if (only_active && !ns->attached) { + return nvme_rpt_empty_id_struct(n, prp1, prp2, req); + } + if (c->csi == NVME_CSI_NVM) { return nvme_rpt_empty_id_struct(n, prp1, prp2, req); } @@ -1240,7 +1250,8 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } -static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req, + bool only_active) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; static const int data_len = NVME_IDENTIFY_DATA_SIZE; @@ -1265,7 +1276,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) list = g_malloc0(data_len); for (i = 0; i < n->num_namespaces; i++) { - if (i < min_nsid) { + if (i < min_nsid || (only_active && !n->namespaces[i].attached)) { continue; } list[j++] = cpu_to_le32(i + 1); @@ -1279,7 +1290,8 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) return ret; } -static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req, + bool only_active) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; static const int data_len = NVME_IDENTIFY_DATA_SIZE; @@ -1298,7 +1310,8 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req) list = g_malloc0(data_len); for (i = 0; i < n->num_namespaces; i++) { - if (i < min_nsid) { + if (i < min_nsid || c->csi != n->namespaces[i].csi || + (only_active && !n->namespaces[i].attached)) { continue; } list[j++] = cpu_to_le32(i + 1); @@ -1390,17 +1403,25 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) switch (le32_to_cpu(c->cns)) { case NVME_ID_CNS_NS: - return nvme_identify_ns(n, req); + return nvme_identify_ns(n, req, true); case NVME_ID_CNS_CS_NS: - return nvme_identify_ns_csi(n, req); + return nvme_identify_ns_csi(n, req, true); + case NVME_ID_CNS_NS_PRESENT: + return nvme_identify_ns(n, req, false); + case NVME_ID_CNS_CS_NS_PRESENT: + return nvme_identify_ns_csi(n, req, false); case NVME_ID_CNS_CTRL: return nvme_identify_ctrl(n, req); case NVME_ID_CNS_CS_CTRL: return nvme_identify_ctrl_csi(n, req); case NVME_ID_CNS_NS_ACTIVE_LIST: - return nvme_identify_nslist(n, req); + return nvme_identify_nslist(n, req, true); case NVME_ID_CNS_CS_NS_ACTIVE_LIST: - return nvme_identify_nslist_csi(n, req); + return nvme_identify_nslist_csi(n, req, true); + case NVME_ID_CNS_NS_PRESENT_LIST: + return nvme_identify_nslist(n, req, false); + case NVME_ID_CNS_CS_NS_PRESENT_LIST: + return nvme_identify_nslist_csi(n, req, false); case NVME_ID_CNS_NS_DESCR_LIST: return nvme_identify_ns_descr_list(n, req); case NVME_ID_CNS_IO_COMMAND_SET: @@ -1842,6 +1863,7 @@ static int nvme_start_ctrl(NvmeCtrl *n) { uint32_t page_bits = NVME_CC_MPS(n->bar.cc) + 12; uint32_t page_size = 1 << page_bits; + int i; if (unlikely(n->cq[0])) { trace_pci_nvme_err_startfail_cq(); @@ -1928,6 +1950,18 @@ static int nvme_start_ctrl(NvmeCtrl *n) nvme_init_sq(&n->admin_sq, n, n->bar.asq, 0, 0, NVME_AQA_ASQS(n->bar.aqa) + 1); + for (i = 0; i < n->num_namespaces; i++) { + n->namespaces[i].attached = false; + switch (n->namespaces[i].csi) { + case NVME_CSI_NVM: + if (NVME_CC_CSS(n->bar.cc) == CSS_NVM_ONLY || + NVME_CC_CSS(n->bar.cc) == CSS_CSI) { + n->namespaces[i].attached = true; + } + break; + } + } + nvme_set_timestamp(n, 0ULL); QTAILQ_INIT(&n->aer_queue); diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 252e2d5921..dec337bbf9 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -66,6 +66,7 @@ typedef struct NvmeNamespace { NvmeIdNs id_ns; uint32_t nsid; uint8_t csi; + bool attached; QemuUUID uuid; } NvmeNamespace; diff --git a/include/block/nvme.h b/include/block/nvme.h index f2cff5aa6b..53b0463a2a 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -806,14 +806,18 @@ typedef struct QEMU_PACKED NvmePSD { #define NVME_IDENTIFY_DATA_SIZE 4096 enum NvmeIdCns { - NVME_ID_CNS_NS = 0x00, - NVME_ID_CNS_CTRL = 0x01, - NVME_ID_CNS_NS_ACTIVE_LIST = 0x02, - NVME_ID_CNS_NS_DESCR_LIST = 0x03, - NVME_ID_CNS_CS_NS = 0x05, - NVME_ID_CNS_CS_CTRL = 0x06, - NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07, - NVME_ID_CNS_IO_COMMAND_SET = 0x1c, + NVME_ID_CNS_NS = 0x00, + NVME_ID_CNS_CTRL = 0x01, + NVME_ID_CNS_NS_ACTIVE_LIST = 0x02, + NVME_ID_CNS_NS_DESCR_LIST = 0x03, + NVME_ID_CNS_CS_NS = 0x05, + NVME_ID_CNS_CS_CTRL = 0x06, + NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07, + NVME_ID_CNS_NS_PRESENT_LIST = 0x10, + NVME_ID_CNS_NS_PRESENT = 0x11, + NVME_ID_CNS_CS_NS_PRESENT_LIST = 0x1a, + NVME_ID_CNS_CS_NS_PRESENT = 0x1b, + NVME_ID_CNS_IO_COMMAND_SET = 0x1c, }; typedef struct QEMU_PACKED NvmeIdCtrl { From patchwork Sat Sep 12 22:54:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772163 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 391B8746 for ; Sat, 12 Sep 2020 23:04:12 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E239320770 for ; Sat, 12 Sep 2020 23:04:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="lzxI3hl1" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E239320770 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:58586 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEZ1-0005nm-36 for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:04:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48928) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ7-00045V-IN; Sat, 12 Sep 2020 18:54:59 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26887) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ4-0005VV-Gk; Sat, 12 Sep 2020 18:54:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951296; x=1631487296; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=V5o4os7MaJTL09OMj8ZKEJmEmF2PXw36IXz8bHtDChw=; b=lzxI3hl1YZRQ2rGBL0X4UmudbbyhcQpACjecPq9HYuQlPsbYQkRyX5ju q3E7h8T+OAaxWHCsdMlrn8RPbFBKR8Zyp3lCemwPzEZwQj7PQLltFG1kO NKvguKIJ3i4YVtlCmSsiCFWjux9AcWywX7yX2lIUZsIjrtwJJTv5UelxD Dsa3iPuc/0VnqRhjaSgWqfrxe4QjlhF0tHCcl0wDr/x8E9TjFjJacBknY vklCPyt7IRVXE2Et7vahMLwB7h9PiRX5OuKDj9eB91QA3R66mjhNdbkhy pgQVyIq/ZIk2Op0bq/WBv3YWGzcHIjovAEzBV58YDTUKiOMhzYcLecJMT g==; IronPort-SDR: JSvucFGxyHyPkiuWDwaAGePZEOz+Q4Lsoc19AiLiWZ00LRwbbD81JLrM91NTxo4N3MiARFPd4b BsNGBiucjJT3pq2DtRcL3mv17N9EfLHxL6NcMru+hj08qI/PUJClumjGd5ySlfCj7LLcTAoxTY zATCWI0rTsW8HHizuOt5FGn4QuPPBUnL26nu6QbEpO1R3oMARLmTC4RQrOfn8xGnXfEB8QJk2w iaG2IL1qCTNAFzunNm+I+FxfHtuODNZDRR5R5GcVuBc4N7Z/2OkMEwnHOyZfyl1aPcGnhcLlXj DVM= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834855" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:54 +0800 IronPort-SDR: HgkaPYD/NOJb+61NQtb8uCFTAb7WQPN6veLUKM6obsKZR6PRfUwyf7nzVEBaViRiSbHpbjVKqR LGOd7zCNBeOQ== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:15 -0700 IronPort-SDR: 0wq0rOdVyZSYsNivPGnhDWPPIYKKM5HAGOfYwJu5AYtSjdcm29QmRbOrmZ57TQhc7lE57Zd+Ga rzkshB6ZDNaw== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:53 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 08/15] hw/block/nvme: Make Zoned NS Command Set definitions Date: Sun, 13 Sep 2020 07:54:23 +0900 Message-Id: <20200912225430.1772-9-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Define values and structures that are needed to support Zoned Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator. All new protocol definitions are located in include/block/nvme.h and everything added that is specific to this implementation is kept in hw/block/nvme.h. In order to improve scalability, all open, closed and full zones are organized in separate linked lists. Consequently, almost all zone operations don't require scanning of the entire zone array (which potentially can be quite large) - it is only necessary to enumerate one or more zone lists. Zone lists are designed to be position-independent as they can be persisted to the backing file as a part of zone metadata. NvmeZoneList struct defined in this patch serves as a head of every zone list. NvmeZone structure encapsulates NvmeZoneDescriptor defined in Zoned Command Set specification and adds a few more fields that are internal to this implementation. Signed-off-by: Niklas Cassel Signed-off-by: Hans Holmberg Signed-off-by: Ajay Joshi Signed-off-by: Matias Bjorling Signed-off-by: Shin'ichiro Kawasaki Signed-off-by: Alexey Bogoslavsky Signed-off-by: Dmitry Fomichev --- hw/block/nvme.h | 124 +++++++++++++++++++++++++++++++++++++++++++ include/block/nvme.h | 107 +++++++++++++++++++++++++++++++++++++ 2 files changed, 231 insertions(+) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index dec337bbf9..9514c58919 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -3,6 +3,9 @@ #include "block/nvme.h" +#define NVME_DEFAULT_ZONE_SIZE 128 /* MiB */ +#define NVME_DEFAULT_MAX_ZA_SIZE 128 /* KiB */ + typedef struct NvmeParams { char *serial; uint32_t num_queues; /* deprecated since 5.1 */ @@ -12,6 +15,13 @@ typedef struct NvmeParams { uint8_t aerl; uint32_t aer_max_queued; uint8_t mdts; + + bool zoned; + bool cross_zone_read; + uint8_t fill_pattern; + uint32_t zasl_kb; + uint64_t zone_size_mb; + uint64_t zone_capacity_mb; } NvmeParams; typedef struct NvmeAsyncEvent { @@ -24,6 +34,7 @@ typedef struct NvmeRequest { struct NvmeNamespace *ns; BlockAIOCB *aiocb; uint16_t status; + int64_t fill_ofs; NvmeCqe cqe; NvmeCmd cmd; BlockAcctCookie acct; @@ -62,12 +73,36 @@ typedef struct NvmeCQueue { QTAILQ_HEAD(, NvmeRequest) req_list; } NvmeCQueue; +typedef struct NvmeZone { + NvmeZoneDescr d; + uint64_t tstamp; + uint32_t next; + uint32_t prev; + uint8_t rsvd80[8]; +} NvmeZone; + +#define NVME_ZONE_LIST_NIL UINT_MAX + +typedef struct NvmeZoneList { + uint32_t head; + uint32_t tail; + uint32_t size; + uint8_t rsvd12[4]; +} NvmeZoneList; + typedef struct NvmeNamespace { NvmeIdNs id_ns; uint32_t nsid; uint8_t csi; bool attached; QemuUUID uuid; + + NvmeIdNsZoned *id_ns_zoned; + NvmeZone *zone_array; + NvmeZoneList *exp_open_zones; + NvmeZoneList *imp_open_zones; + NvmeZoneList *closed_zones; + NvmeZoneList *full_zones; } NvmeNamespace; static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns) @@ -126,6 +161,15 @@ typedef struct NvmeCtrl { QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue; int aer_queued; + int zone_file_fd; + uint32_t num_zones; + uint64_t zone_size; + uint64_t zone_capacity; + uint64_t zone_array_size; + uint32_t zone_size_log2; + uint32_t zasl_bs; + uint8_t zasl; + NvmeNamespace *namespaces; NvmeSQueue **sq; NvmeCQueue **cq; @@ -141,6 +185,86 @@ static inline uint64_t nvme_ns_nlbas(NvmeCtrl *n, NvmeNamespace *ns) return n->ns_size >> nvme_ns_lbads(ns); } +static inline uint8_t nvme_get_zone_state(NvmeZone *zone) +{ + return zone->d.zs >> 4; +} + +static inline void nvme_set_zone_state(NvmeZone *zone, enum NvmeZoneState state) +{ + zone->d.zs = state << 4; +} + +static inline uint64_t nvme_zone_rd_boundary(NvmeCtrl *n, NvmeZone *zone) +{ + return zone->d.zslba + n->zone_size; +} + +static inline uint64_t nvme_zone_wr_boundary(NvmeZone *zone) +{ + return zone->d.zslba + zone->d.zcap; +} + +static inline bool nvme_wp_is_valid(NvmeZone *zone) +{ + uint8_t st = nvme_get_zone_state(zone); + + return st != NVME_ZONE_STATE_FULL && + st != NVME_ZONE_STATE_READ_ONLY && + st != NVME_ZONE_STATE_OFFLINE; +} + +/* + * Initialize a zone list head. + */ +static inline void nvme_init_zone_list(NvmeZoneList *zl) +{ + zl->head = NVME_ZONE_LIST_NIL; + zl->tail = NVME_ZONE_LIST_NIL; + zl->size = 0; +} + +/* + * Initialize the number of entries contained in a zone list. + */ +static inline uint32_t nvme_zone_list_size(NvmeZoneList *zl) +{ + return zl->size; +} + +/* + * Check if the zone is not currently included into any zone list. + */ +static inline bool nvme_zone_not_in_list(NvmeZone *zone) +{ + return (bool)(zone->prev == 0 && zone->next == 0); +} + +/* + * Return the zone at the head of zone list or NULL if the list is empty. + */ +static inline NvmeZone *nvme_peek_zone_head(NvmeNamespace *ns, NvmeZoneList *zl) +{ + if (zl->head == NVME_ZONE_LIST_NIL) { + return NULL; + } + return &ns->zone_array[zl->head]; +} + +/* + * Return the next zone in the list. + */ +static inline NvmeZone *nvme_next_zone_in_list(NvmeNamespace *ns, NvmeZone *z, + NvmeZoneList *zl) +{ + assert(!nvme_zone_not_in_list(z)); + + if (z->next == NVME_ZONE_LIST_NIL) { + return NULL; + } + return &ns->zone_array[z->next]; +} + static inline int nvme_ilog2(uint64_t i) { int log = -1; diff --git a/include/block/nvme.h b/include/block/nvme.h index 53b0463a2a..772659859e 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -488,6 +488,9 @@ enum NvmeIoCommands { NVME_CMD_COMPARE = 0x05, NVME_CMD_WRITE_ZEROES = 0x08, NVME_CMD_DSM = 0x09, + NVME_CMD_ZONE_MGMT_SEND = 0x79, + NVME_CMD_ZONE_MGMT_RECV = 0x7a, + NVME_CMD_ZONE_APND = 0x7d, }; typedef struct QEMU_PACKED NvmeDeleteQ { @@ -679,6 +682,7 @@ enum NvmeStatusCodes { NVME_SGL_DESCR_TYPE_INVALID = 0x0011, NVME_INVALID_USE_OF_CMB = 0x0012, NVME_CMD_SET_CMB_REJECTED = 0x002b, + NVME_INVALID_CMD_SET = 0x002c, NVME_LBA_RANGE = 0x0080, NVME_CAP_EXCEEDED = 0x0081, NVME_NS_NOT_READY = 0x0082, @@ -703,6 +707,14 @@ enum NvmeStatusCodes { NVME_CONFLICTING_ATTRS = 0x0180, NVME_INVALID_PROT_INFO = 0x0181, NVME_WRITE_TO_RO = 0x0182, + NVME_ZONE_BOUNDARY_ERROR = 0x01b8, + NVME_ZONE_FULL = 0x01b9, + NVME_ZONE_READ_ONLY = 0x01ba, + NVME_ZONE_OFFLINE = 0x01bb, + NVME_ZONE_INVALID_WRITE = 0x01bc, + NVME_ZONE_TOO_MANY_ACTIVE = 0x01bd, + NVME_ZONE_TOO_MANY_OPEN = 0x01be, + NVME_ZONE_INVAL_TRANSITION = 0x01bf, NVME_WRITE_FAULT = 0x0280, NVME_UNRECOVERED_READ = 0x0281, NVME_E2E_GUARD_ERROR = 0x0282, @@ -887,6 +899,11 @@ typedef struct QEMU_PACKED NvmeIdCtrl { uint8_t vs[1024]; } NvmeIdCtrl; +typedef struct NvmeIdCtrlZoned { + uint8_t zasl; + uint8_t rsvd1[4095]; +} NvmeIdCtrlZoned; + enum NvmeIdCtrlOacs { NVME_OACS_SECURITY = 1 << 0, NVME_OACS_FORMAT = 1 << 1, @@ -1011,6 +1028,12 @@ typedef struct QEMU_PACKED NvmeLBAF { uint8_t rp; } NvmeLBAF; +typedef struct QEMU_PACKED NvmeLBAFE { + uint64_t zsze; + uint8_t zdes; + uint8_t rsvd9[7]; +} NvmeLBAFE; + #define NVME_NSID_BROADCAST 0xffffffff typedef struct QEMU_PACKED NvmeIdNs { @@ -1065,10 +1088,24 @@ enum NvmeNsIdentifierType { enum NvmeCsi { NVME_CSI_NVM = 0x00, + NVME_CSI_ZONED = 0x02, }; #define NVME_SET_CSI(vec, csi) (vec |= (uint8_t)(1 << (csi))) +typedef struct QEMU_PACKED NvmeIdNsZoned { + uint16_t zoc; + uint16_t ozcs; + uint32_t mar; + uint32_t mor; + uint32_t rrl; + uint32_t frl; + uint8_t rsvd20[2796]; + NvmeLBAFE lbafe[16]; + uint8_t rsvd3072[768]; + uint8_t vs[256]; +} NvmeIdNsZoned; + /*Deallocate Logical Block Features*/ #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat) ((dlfeat) & 0x10) #define NVME_ID_NS_DLFEAT_WRITE_ZEROES(dlfeat) ((dlfeat) & 0x08) @@ -1100,6 +1137,71 @@ enum NvmeIdNsDps { DPS_FIRST_EIGHT = 8, }; +enum NvmeZoneAttr { + NVME_ZA_FINISHED_BY_CTLR = 1 << 0, + NVME_ZA_FINISH_RECOMMENDED = 1 << 1, + NVME_ZA_RESET_RECOMMENDED = 1 << 2, + NVME_ZA_ZD_EXT_VALID = 1 << 7, +}; + +typedef struct QEMU_PACKED NvmeZoneReportHeader { + uint64_t nr_zones; + uint8_t rsvd[56]; +} NvmeZoneReportHeader; + +enum NvmeZoneReceiveAction { + NVME_ZONE_REPORT = 0, + NVME_ZONE_REPORT_EXTENDED = 1, +}; + +enum NvmeZoneReportType { + NVME_ZONE_REPORT_ALL = 0, + NVME_ZONE_REPORT_EMPTY = 1, + NVME_ZONE_REPORT_IMPLICITLY_OPEN = 2, + NVME_ZONE_REPORT_EXPLICITLY_OPEN = 3, + NVME_ZONE_REPORT_CLOSED = 4, + NVME_ZONE_REPORT_FULL = 5, + NVME_ZONE_REPORT_READ_ONLY = 6, + NVME_ZONE_REPORT_OFFLINE = 7, +}; + +enum NvmeZoneType { + NVME_ZONE_TYPE_RESERVED = 0x00, + NVME_ZONE_TYPE_SEQ_WRITE = 0x02, +}; + +enum NvmeZoneSendAction { + NVME_ZONE_ACTION_RSD = 0x00, + NVME_ZONE_ACTION_CLOSE = 0x01, + NVME_ZONE_ACTION_FINISH = 0x02, + NVME_ZONE_ACTION_OPEN = 0x03, + NVME_ZONE_ACTION_RESET = 0x04, + NVME_ZONE_ACTION_OFFLINE = 0x05, + NVME_ZONE_ACTION_SET_ZD_EXT = 0x10, +}; + +typedef struct QEMU_PACKED NvmeZoneDescr { + uint8_t zt; + uint8_t zs; + uint8_t za; + uint8_t rsvd3[5]; + uint64_t zcap; + uint64_t zslba; + uint64_t wp; + uint8_t rsvd32[32]; +} NvmeZoneDescr; + +enum NvmeZoneState { + NVME_ZONE_STATE_RESERVED = 0x00, + NVME_ZONE_STATE_EMPTY = 0x01, + NVME_ZONE_STATE_IMPLICITLY_OPEN = 0x02, + NVME_ZONE_STATE_EXPLICITLY_OPEN = 0x03, + NVME_ZONE_STATE_CLOSED = 0x04, + NVME_ZONE_STATE_READ_ONLY = 0x0D, + NVME_ZONE_STATE_FULL = 0x0E, + NVME_ZONE_STATE_OFFLINE = 0x0F, +}; + static inline void _nvme_check_size(void) { QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096); @@ -1119,9 +1221,14 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlZoned) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); + QEMU_BUILD_BUG_ON(sizeof(NvmeLBAF) != 4); + QEMU_BUILD_BUG_ON(sizeof(NvmeLBAFE) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsZoned) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); + QEMU_BUILD_BUG_ON(sizeof(NvmeZoneDescr) != 64); } #endif From patchwork Sat Sep 12 22:54:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772171 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 18CC8618 for ; Sat, 12 Sep 2020 23:06:49 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D368A207BB for ; Sat, 12 Sep 2020 23:06:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="gAIlhhWy" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D368A207BB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:40208 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEbY-0001T2-1I for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:06:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48948) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ9-0004Br-VE; Sat, 12 Sep 2020 18:55:01 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26879) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ7-0005R7-Kg; Sat, 12 Sep 2020 18:55:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951299; x=1631487299; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jsS0Qmbf7IOzITobP9sBEEdGOMLsAU/G1rgEijwm2vw=; b=gAIlhhWy7TnyHdaVKR4wFR2if9Jk7gX/EYAp53UwsvHdz/LpgBEcl7JW dRChmwfXqlJ3Uk87v0MgkGpDEVeTda/Oe7mq+VtXP1xKmXdYZuDgRTQj/ dDaGGpT3wuEWgQCcC/bnM33UuwXbfoNukd7sw/wOWI8N6BG100FMmPbvK SXfwUA+6WWaVA98x//uZ7N54jijuiwClsUiPQ6uipU+E9fRZS7IQlI3zN iV3QAlCaXMXgMwNXQV2Kp61FtM8Y5LWnGxHwOHP/H5SBhQRv7cMcui0a4 Rg4usQ6sTLyMpDmghhX7jr1ymVSbtZOrqxAQCGWtP+ThX42hxgt8Fmf4a w==; IronPort-SDR: IuHM5UHoqjqTd+jhOGeCmP/I7L40GH/ynMh8ei2qecelvRNQLtKzHItfKZ8AgVHGL0qgWOqzvf Qox2k2Y/HQCQIclWirF4D8x7F1d0WVNMRI9ZFZ1RWFA8KnocmNhUqA9RQA6wjTAd7QfYy1LjEA I3IqC5/snylOnmZ5+5Y0HmPOKPIJWvDgO/HqFNkwpdpdgzTYXa6UkNdOZZbs/e+h4g77h5vU+d XEsMRZ1ukCHdcKNO7LYRN4UyFyJskKauzgteBlcAl+qvPh5fufVQZEx26FBGk9k9cJSsCXUy9W ce4= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834858" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:57 +0800 IronPort-SDR: EInzEy0tBCqY40D08VjXEzW00613NsiGMwbNlpXmIv+LvlylCORIpR+znp7jldBN10yzRhOqIJ OyGoa6LDMffA== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:17 -0700 IronPort-SDR: thFYVlB09+AIltEMkH57JeauiB/xC0vWL9ZCOyfavcflxv7WIE6EndKywRLgiNEMB+s0P82/D0 ViHR/wAauM/g== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:56 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 09/15] hw/block/nvme: Define Zoned NS Command Set trace events Date: Sun, 13 Sep 2020 07:54:24 +0900 Message-Id: <20200912225430.1772-10-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" The Zoned Namespace Command Set / Namespace Types implementation that is being introduced in this series adds a good number of trace events. Combine all tracepoint definitions into a separate patch to make reviewing more convenient. Signed-off-by: Dmitry Fomichev --- hw/block/trace-events | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/hw/block/trace-events b/hw/block/trace-events index 2414dcbc79..53c7a2fd1f 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -90,6 +90,17 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared" pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects log read" pci_nvme_css_nvm_cset_selected_by_host(uint32_t cc) "NVM command set selected by host, bar.cc=0x%"PRIx32"" pci_nvme_css_all_csets_sel_by_host(uint32_t cc) "all supported command sets selected by host, bar.cc=0x%"PRIx32"" +pci_nvme_open_zone(uint64_t slba, uint32_t zone_idx, int all) "open zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_close_zone(uint64_t slba, uint32_t zone_idx, int all) "close zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_finish_zone(uint64_t slba, uint32_t zone_idx, int all) "finish zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_reset_zone(uint64_t slba, uint32_t zone_idx, int all) "reset zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_offline_zone(uint64_t slba, uint32_t zone_idx, int all) "offline zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32"" +pci_nvme_set_descriptor_extension(uint64_t slba, uint32_t zone_idx) "set zone descriptor extension, slba=%"PRIu64", idx=%"PRIu32"" +pci_nvme_zd_extension_set(uint32_t zone_idx) "set descriptor extension for zone_idx=%"PRIu32"" +pci_nvme_power_on_close(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Closed state" +pci_nvme_power_on_reset(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Empty state" +pci_nvme_power_on_full(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Full state" +pci_nvme_mapped_zone_file(char *zfile_name, int ret) "mapped zone file %s, error %d" # nvme traces for error conditions pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" @@ -102,9 +113,23 @@ pci_nvme_err_invalid_ns(uint32_t ns, uint32_t limit) "invalid namespace %u not w pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8"" pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8"" pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64"" +pci_nvme_err_unaligned_zone_cmd(uint8_t action, uint64_t slba, uint64_t zslba) "unaligned zone op 0x%"PRIx32", got slba=%"PRIu64", zslba=%"PRIu64"" +pci_nvme_err_invalid_zone_state_transition(uint8_t state, uint8_t action, uint64_t slba, uint8_t attrs) "0x%"PRIx32"->0x%"PRIx32", slba=%"PRIu64", attrs=0x%"PRIx32"" +pci_nvme_err_write_not_at_wp(uint64_t slba, uint64_t zone, uint64_t wp) "writing at slba=%"PRIu64", zone=%"PRIu64", but wp=%"PRIu64"" +pci_nvme_err_append_not_at_start(uint64_t slba, uint64_t zone) "appending at slba=%"PRIu64", but zone=%"PRIu64"" +pci_nvme_err_zone_write_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx16"" +pci_nvme_err_zone_read_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx16"" +pci_nvme_err_append_too_large(uint64_t slba, uint32_t nlb, uint8_t zasl) "slba=%"PRIu64", nlb=%"PRIu32", zasl=%"PRIu8"" +pci_nvme_err_insuff_active_res(uint32_t max_active) "max_active=%"PRIu32" zone limit exceeded" +pci_nvme_err_insuff_open_res(uint32_t max_open) "max_open=%"PRIu32" zone limit exceeded" +pci_nvme_err_zone_file_invalid(int error) "validation error=%"PRIi32"" +pci_nvme_err_zd_extension_map_error(uint32_t zone_idx) "can't map descriptor extension for zone_idx=%"PRIu32"" +pci_nvme_err_invalid_changed_zone_list_offset(uint64_t ofs) "changed zone list log offset must be 0, got %"PRIu64"" +pci_nvme_err_invalid_changed_zone_list_len(uint32_t len) "changed zone list log size is 4096, got %"PRIu32"" pci_nvme_err_invalid_effects_log_offset(uint64_t ofs) "commands supported and effects log offset must be 0, got %"PRIu64"" pci_nvme_err_change_css_when_enabled(void) "changing CC.CSS while controller is enabled" pci_nvme_err_only_nvm_cmd_set_avail(void) "setting 110b CC.CSS, but only NVM command set is enabled" +pci_nvme_err_only_zoned_cmd_set_avail(void) "setting 001b CC.CSS, but only ZONED+NVM command set is enabled" pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32"" pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16"" pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16"" @@ -138,6 +163,7 @@ pci_nvme_err_startfail_sqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_ pci_nvme_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the admin submission queue size is zero" pci_nvme_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero" pci_nvme_err_startfail(void) "setting controller enable bit failed" +pci_nvme_err_invalid_mgmt_action(int action) "action=0x%"PRIx8"" # Traces for undefined behavior pci_nvme_ub_mmiowr_misaligned32(uint64_t offset) "MMIO write not 32-bit aligned, offset=0x%"PRIx64"" From patchwork Sat Sep 12 22:54:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772173 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 88BC7618 for ; Sat, 12 Sep 2020 23:07:51 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 177D7207BB for ; Sat, 12 Sep 2020 23:07:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="qmcx7ERH" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 177D7207BB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:43518 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEcX-0002ol-NS for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:07:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48976) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQD-0004L3-VC; Sat, 12 Sep 2020 18:55:05 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26887) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQ9-0005VV-8v; Sat, 12 Sep 2020 18:55:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951300; x=1631487300; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4+WcEkUIXCgDfTwU+0b0SXgqGxa31c+hexjxRrHZucc=; b=qmcx7ERHwQVr6vRo+MLZhErUgwNGwAYOS0XwbBcmQ7Q+Lr9+1EOZQchL 9p32wrMx0atXF51O5Qaqp3JHc5mWxw7L3hxKKjjPYCQ6NAdghY4ljL0vk P1EAzLPG9F5CIUhxeAzATdBmSeVAlBCKaRvEBFA87V+LSggzSsuBOvX5v FakkGjPqaCTp5KUKWY70ldIpTmPqPWJDkb52sR3TJJlmyNOACY0HjrfDA beoUa1yyjVelvNDSGqGYCIU4zoSjpwWAjktXnhxoxMhli5ju7lQGnFdbw dWUKzYFKI3pOC7AXu0O0H7ZWbOMp4XYqjQ9Nd62S05x+KiT1er8sId+qJ g==; IronPort-SDR: mMdGgR7653HLmJDJRzm1zw2nUiqKFK0Yyn+IZDKvP+8XI78xaCgOUcWXPkU6fD7C02IQul8jQT QiFwwVeDBuUC0Q5G0AXZMz279ipgKNzFafm7bARBj7tt/Cm6JuQGKT1aRi3v4XnDQbghEY7iy0 0CPJRj96SzPtf/KkyL7fTw819+vT7dBIyxmNvCZYcrUN1KxH2x9GkTYDtCw6nh/71+SrbRG/h4 0AMTASXR02UK8zJFmTv3HELGZgAW2qL3nNT/5D7FutmWfs7+jeUGzM/HpWpXk9GBp06KfLvCFW l64= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834860" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:54:59 +0800 IronPort-SDR: lOfvLxvNDt5U0wC+Biu2uct2w/3s/2tUseS6+6NIfn3OaKZCm3H88zEoNBMHA+KlaaQV9JHJJt sW3cAu7BVB6w== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:19 -0700 IronPort-SDR: pbBA/0axaS5aDW7hSCos2dRTAKuSocIR6taN/UQLMIWd8wEeDXyXbqP6kQbmKe/WksN3lch9MK usxIN6lAYyUw== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:54:58 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 10/15] hw/block/nvme: Support Zoned Namespace Command Set Date: Sun, 13 Sep 2020 07:54:25 +0900 Message-Id: <20200912225430.1772-11-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" The emulation code has been changed to advertise NVM Command Set when "zoned" device property is not set (default) and Zoned Namespace Command Set otherwise. Handlers for three new NVMe commands introduced in Zoned Namespace Command Set specification are added, namely for Zone Management Receive, Zone Management Send and Zone Append. Device initialization code has been extended to create a proper configuration for zoned operation using device properties. Read/Write command handler is modified to only allow writes at the write pointer if the namespace is zoned. For Zone Append command, writes implicitly happen at the write pointer and the starting write pointer value is returned as the result of the command. Write Zeroes handler is modified to add zoned checks that are identical to those done as a part of Write flow. The code to support for Zone Descriptor Extensions is not included in this commit and ZDES 0 is always reported. A later commit in this series will add ZDE support. This commit doesn't yet include checks for active and open zone limits. It is assumed that there are no limits on either active or open zones. Signed-off-by: Niklas Cassel Signed-off-by: Hans Holmberg Signed-off-by: Ajay Joshi Signed-off-by: Chaitanya Kulkarni Signed-off-by: Matias Bjorling Signed-off-by: Aravind Ramesh Signed-off-by: Shin'ichiro Kawasaki Signed-off-by: Adam Manzanares Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 995 ++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 966 insertions(+), 29 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 6dd6bf9183..1b0e06002c 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -53,6 +53,7 @@ #include "qemu/osdep.h" #include "qemu/units.h" #include "qemu/error-report.h" +#include "crypto/random.h" #include "hw/block/block.h" #include "hw/pci/msix.h" #include "hw/pci/pci.h" @@ -125,6 +126,98 @@ static uint16_t nvme_sqid(NvmeRequest *req) return le16_to_cpu(req->sq->sqid); } +/* + * Add a zone to the tail of a zone list. + */ +static void nvme_add_zone_tail(NvmeCtrl *n, NvmeNamespace *ns, NvmeZoneList *zl, + NvmeZone *zone) +{ + uint32_t idx = (uint32_t)(zone - ns->zone_array); + + assert(nvme_zone_not_in_list(zone)); + + if (!zl->size) { + zl->head = zl->tail = idx; + zone->next = zone->prev = NVME_ZONE_LIST_NIL; + } else { + ns->zone_array[zl->tail].next = idx; + zone->prev = zl->tail; + zone->next = NVME_ZONE_LIST_NIL; + zl->tail = idx; + } + zl->size++; +} + +/* + * Remove a zone from a zone list. The zone must be linked in the list. + */ +static void nvme_remove_zone(NvmeCtrl *n, NvmeNamespace *ns, NvmeZoneList *zl, + NvmeZone *zone) +{ + uint32_t idx = (uint32_t)(zone - ns->zone_array); + + assert(!nvme_zone_not_in_list(zone)); + + --zl->size; + if (zl->size == 0) { + zl->head = NVME_ZONE_LIST_NIL; + zl->tail = NVME_ZONE_LIST_NIL; + } else if (idx == zl->head) { + zl->head = zone->next; + ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL; + } else if (idx == zl->tail) { + zl->tail = zone->prev; + ns->zone_array[zl->tail].next = NVME_ZONE_LIST_NIL; + } else { + ns->zone_array[zone->next].prev = zone->prev; + ns->zone_array[zone->prev].next = zone->next; + } + + zone->prev = zone->next = 0; +} + +static void nvme_assign_zone_state(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint8_t state) +{ + if (!nvme_zone_not_in_list(zone)) { + switch (nvme_get_zone_state(zone)) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + nvme_remove_zone(n, ns, ns->exp_open_zones, zone); + break; + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_remove_zone(n, ns, ns->imp_open_zones, zone); + break; + case NVME_ZONE_STATE_CLOSED: + nvme_remove_zone(n, ns, ns->closed_zones, zone); + break; + case NVME_ZONE_STATE_FULL: + nvme_remove_zone(n, ns, ns->full_zones, zone); + } + } + + nvme_set_zone_state(zone, state); + + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + nvme_add_zone_tail(n, ns, ns->exp_open_zones, zone); + break; + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_add_zone_tail(n, ns, ns->imp_open_zones, zone); + break; + case NVME_ZONE_STATE_CLOSED: + nvme_add_zone_tail(n, ns, ns->closed_zones, zone); + break; + case NVME_ZONE_STATE_FULL: + nvme_add_zone_tail(n, ns, ns->full_zones, zone); + break; + default: + zone->d.za = 0; + /* fall through */ + case NVME_ZONE_STATE_READ_ONLY: + zone->tstamp = 0; + } +} + static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr) { hwaddr low = n->ctrl_mem.addr; @@ -483,6 +576,33 @@ static void nvme_post_cqes(void *opaque) } } +static void nvme_fill_data(QEMUSGList *qsg, QEMUIOVector *iov, + uint64_t offset, uint8_t pattern) +{ + ScatterGatherEntry *entry; + uint32_t len, ent_len; + + if (qsg->nsg > 0) { + entry = qsg->sg; + for (len = qsg->size; len > 0; len -= ent_len) { + ent_len = MIN(len, entry->len); + if (offset > ent_len) { + offset -= ent_len; + } else if (offset != 0) { + dma_memory_set(qsg->as, entry->base + offset, + pattern, ent_len - offset); + offset = 0; + } else { + dma_memory_set(qsg->as, entry->base, pattern, ent_len); + } + entry++; + } + } else if (iov->iov) { + qemu_iovec_memset(iov, offset, pattern, + iov_size(iov->iov, iov->niov) - offset); + } +} + static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req) { assert(cq->cqid == req->sq->cqid); @@ -595,6 +715,138 @@ static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns, return NVME_SUCCESS; } +static uint16_t nvme_check_zone_write(NvmeZone *zone, uint64_t slba, + uint32_t nlb) +{ + uint16_t status; + + if (unlikely((slba + nlb) > nvme_zone_wr_boundary(zone))) { + return NVME_ZONE_BOUNDARY_ERROR; + } + + switch (nvme_get_zone_state(zone)) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + status = NVME_SUCCESS; + break; + case NVME_ZONE_STATE_FULL: + status = NVME_ZONE_FULL; + break; + case NVME_ZONE_STATE_OFFLINE: + status = NVME_ZONE_OFFLINE; + break; + case NVME_ZONE_STATE_READ_ONLY: + status = NVME_ZONE_READ_ONLY; + break; + default: + assert(false); + } + return status; +} + +static uint16_t nvme_check_zone_read(NvmeCtrl *n, NvmeZone *zone, uint64_t slba, + uint32_t nlb, bool zone_x_ok) +{ + uint64_t lba = slba, count; + uint16_t status; + uint8_t zs; + + do { + if (!zone_x_ok && (lba + nlb > nvme_zone_rd_boundary(n, zone))) { + return NVME_ZONE_BOUNDARY_ERROR | NVME_DNR; + } + + zs = nvme_get_zone_state(zone); + switch (zs) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_FULL: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_READ_ONLY: + status = NVME_SUCCESS; + break; + case NVME_ZONE_STATE_OFFLINE: + status = NVME_ZONE_OFFLINE | NVME_DNR; + break; + default: + assert(false); + } + if (status != NVME_SUCCESS) { + break; + } + + if (lba + nlb > nvme_zone_rd_boundary(n, zone)) { + count = nvme_zone_rd_boundary(n, zone) - lba; + } else { + count = nlb; + } + + lba += count; + nlb -= count; + zone++; + } while (nlb); + + return status; +} + +static inline uint32_t nvme_zone_idx(NvmeCtrl *n, uint64_t slba) +{ + return n->zone_size_log2 > 0 ? slba >> n->zone_size_log2 : + slba / n->zone_size; +} + +static void nvme_finalize_zone_write(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; + NvmeNamespace *ns; + NvmeZone *zone; + uint64_t slba; + uint32_t nlb, zone_idx; + uint8_t zs; + + if (rw->opcode != NVME_CMD_WRITE && + rw->opcode != NVME_CMD_ZONE_APND && + rw->opcode != NVME_CMD_WRITE_ZEROES) { + return; + } + + slba = le64_to_cpu(rw->slba); + nlb = le16_to_cpu(rw->nlb) + 1; + zone_idx = nvme_zone_idx(n, slba); + assert(zone_idx < n->num_zones); + ns = req->ns; + zone = &ns->zone_array[zone_idx]; + + zone->d.wp += nlb; + + zs = nvme_get_zone_state(zone); + if (zone->d.wp == nvme_zone_wr_boundary(zone)) { + switch (zs) { + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_EMPTY: + break; + default: + assert(false); + } + nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_FULL); + } else { + switch (zs) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_CLOSED: + nvme_assign_zone_state(n, ns, zone, + NVME_ZONE_STATE_IMPLICITLY_OPEN); + } + } + + req->cqe.result64 = zone->d.wp; + return; +} + static void nvme_rw_cb(void *opaque, int ret) { NvmeRequest *req = opaque; @@ -605,6 +857,13 @@ static void nvme_rw_cb(void *opaque, int ret) trace_pci_nvme_rw_cb(nvme_cid(req)); if (!ret) { + if (n->params.zoned) { + if (req->fill_ofs >= 0) { + nvme_fill_data(&req->qsg, &req->iov, req->fill_ofs, + n->params.fill_pattern); + } + nvme_finalize_zone_write(n, req); + } block_acct_done(blk_get_stats(n->conf.blk), &req->acct); req->status = NVME_SUCCESS; } else { @@ -628,12 +887,14 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; NvmeNamespace *ns = req->ns; + NvmeZone *zone = NULL; const uint8_t lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas); const uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds; uint64_t slba = le64_to_cpu(rw->slba); uint32_t nlb = le16_to_cpu(rw->nlb) + 1; uint64_t offset = slba << data_shift; uint32_t count = nlb << data_shift; + uint32_t zone_idx; uint16_t status; trace_pci_nvme_write_zeroes(nvme_cid(req), slba, nlb); @@ -644,25 +905,47 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req) return status; } + if (n->params.zoned) { + zone_idx = nvme_zone_idx(n, slba); + assert(zone_idx < n->num_zones); + zone = &ns->zone_array[zone_idx]; + + status = nvme_check_zone_write(zone, slba, nlb); + if (status != NVME_SUCCESS) { + trace_pci_nvme_err_zone_write_not_ok(slba, nlb, status); + return status | NVME_DNR; + } + + assert(nvme_wp_is_valid(zone)); + if (unlikely(slba != zone->d.wp)) { + trace_pci_nvme_err_write_not_at_wp(slba, zone->d.zslba, + zone->d.wp); + return NVME_ZONE_INVALID_WRITE | NVME_DNR; + } + } + block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0, BLOCK_ACCT_WRITE); req->aiocb = blk_aio_pwrite_zeroes(n->conf.blk, offset, count, BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req); + return NVME_NO_COMPLETE; } -static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req, bool append) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; NvmeNamespace *ns = req->ns; + NvmeZone *zone = NULL; uint32_t nlb = le32_to_cpu(rw->nlb) + 1; uint64_t slba = le64_to_cpu(rw->slba); uint8_t lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas); uint8_t data_shift = ns->id_ns.lbaf[lba_index].ds; uint64_t data_size = (uint64_t)nlb << data_shift; - uint64_t data_offset = slba << data_shift; - int is_write = rw->opcode == NVME_CMD_WRITE ? 1 : 0; + uint64_t data_offset; + uint32_t zone_idx = 0; + bool is_write = rw->opcode == NVME_CMD_WRITE || append; enum BlockAcctType acct = is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ; uint16_t status; @@ -682,11 +965,77 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req) return status; } + if (n->params.zoned) { + zone_idx = nvme_zone_idx(n, slba); + assert(zone_idx < n->num_zones); + zone = &ns->zone_array[zone_idx]; + + if (is_write) { + status = nvme_check_zone_write(zone, slba, nlb); + if (status != NVME_SUCCESS) { + trace_pci_nvme_err_zone_write_not_ok(slba, nlb, status); + return status | NVME_DNR; + } + + assert(nvme_wp_is_valid(zone)); + if (append) { + if (unlikely(slba != zone->d.zslba)) { + trace_pci_nvme_err_append_not_at_start(slba, zone->d.zslba); + return NVME_ZONE_INVALID_WRITE | NVME_DNR; + } + if (data_size > (n->page_size << n->zasl)) { + trace_pci_nvme_err_append_too_large(slba, nlb, n->zasl); + return NVME_INVALID_FIELD | NVME_DNR; + } + slba = zone->d.wp; + } else if (unlikely(slba != zone->d.wp)) { + trace_pci_nvme_err_write_not_at_wp(slba, zone->d.zslba, + zone->d.wp); + return NVME_ZONE_INVALID_WRITE | NVME_DNR; + } + req->fill_ofs = -1LL; + } else { + status = nvme_check_zone_read(n, zone, slba, nlb, + n->params.cross_zone_read); + if (status != NVME_SUCCESS) { + trace_pci_nvme_err_zone_read_not_ok(slba, nlb, status); + return status | NVME_DNR; + } + + if (slba + nlb > zone->d.wp) { + /* + * All or some data is read above the WP. Need to + * fill out the buffer area that has no backing data + * with a predefined data pattern (zeros by default) + */ + if (slba >= zone->d.wp) { + req->fill_ofs = 0; + } else { + req->fill_ofs = ((zone->d.wp - slba) << data_shift); + } + } else { + req->fill_ofs = -1LL; + } + } + } else if (append) { + trace_pci_nvme_err_invalid_opc(rw->opcode); + return NVME_INVALID_OPCODE | NVME_DNR; + } + if (nvme_map_dptr(n, data_size, req)) { block_acct_invalid(blk_get_stats(n->conf.blk), acct); return NVME_INVALID_FIELD | NVME_DNR; } + if (unlikely(n->params.zoned && req->fill_ofs == 0)) { + /* No backend I/O necessary, only need to fill the buffer */ + nvme_fill_data(&req->qsg, &req->iov, 0, n->params.fill_pattern); + req->status = NVME_SUCCESS; + return NVME_SUCCESS; + } + + data_offset = slba << data_shift; + if (req->qsg.nsg > 0) { block_acct_start(blk_get_stats(n->conf.blk), &req->acct, req->qsg.size, acct); @@ -708,6 +1057,380 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req) return NVME_NO_COMPLETE; } +static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeCtrl *n, NvmeNamespace *ns, + NvmeCmd *c, uint64_t *slba, + uint32_t *zone_idx) +{ + uint32_t dw10 = le32_to_cpu(c->cdw10); + uint32_t dw11 = le32_to_cpu(c->cdw11); + + if (!n->params.zoned) { + trace_pci_nvme_err_invalid_opc(c->opcode); + return NVME_INVALID_OPCODE | NVME_DNR; + } + + *slba = ((uint64_t)dw11) << 32 | dw10; + if (unlikely(*slba >= ns->id_ns.nsze)) { + trace_pci_nvme_err_invalid_lba_range(*slba, 0, ns->id_ns.nsze); + *slba = 0; + return NVME_LBA_RANGE | NVME_DNR; + } + + *zone_idx = nvme_zone_idx(n, *slba); + assert(*zone_idx < n->num_zones); + + return NVME_SUCCESS; +} + +static uint16_t nvme_open_zone(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN); + /* fall through */ + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_open_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_CLOSED; +} + +static uint16_t nvme_close_zone(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_CLOSED); + /* fall through */ + case NVME_ZONE_STATE_CLOSED: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_close_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_IMPLICITLY_OPEN || + state == NVME_ZONE_STATE_EXPLICITLY_OPEN; +} + +static uint16_t nvme_finish_zone(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_EMPTY: + zone->d.wp = nvme_zone_wr_boundary(zone); + nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_FULL); + /* fall through */ + case NVME_ZONE_STATE_FULL: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_finish_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_IMPLICITLY_OPEN || + state == NVME_ZONE_STATE_EXPLICITLY_OPEN || + state == NVME_ZONE_STATE_CLOSED; +} + +static uint16_t nvme_reset_zone(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_FULL: + zone->d.wp = zone->d.zslba; + nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_EMPTY); + /* fall through */ + case NVME_ZONE_STATE_EMPTY: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_reset_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_IMPLICITLY_OPEN || + state == NVME_ZONE_STATE_EXPLICITLY_OPEN || + state == NVME_ZONE_STATE_CLOSED || + state == NVME_ZONE_STATE_FULL; +} + +static uint16_t nvme_offline_zone(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_READ_ONLY: + nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_OFFLINE); + /* fall through */ + case NVME_ZONE_STATE_OFFLINE: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_offline_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_READ_ONLY; +} + +typedef uint16_t (*op_handler_t)(NvmeCtrl *, NvmeNamespace *, NvmeZone *, + uint8_t); +typedef bool (*need_to_proc_zone_t)(uint8_t); + +static uint16_t name_do_zone_op(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint8_t state, bool all, + op_handler_t op_hndlr, + need_to_proc_zone_t proc_zone) +{ + int i; + uint16_t status = 0; + + if (!all) { + status = op_hndlr(n, ns, zone, state); + } else { + for (i = 0; i < n->num_zones; i++, zone++) { + state = nvme_get_zone_state(zone); + if (proc_zone(state)) { + status = op_hndlr(n, ns, zone, state); + if (status != NVME_SUCCESS) { + break; + } + } + } + } + + return status; +} + +static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeCmd *cmd = (NvmeCmd *)&req->cmd; + NvmeNamespace *ns = req->ns; + uint32_t dw13 = le32_to_cpu(cmd->cdw13); + uint64_t slba = 0; + uint32_t zone_idx = 0; + uint16_t status; + uint8_t action, state; + bool all; + NvmeZone *zone; + + action = dw13 & 0xff; + all = dw13 & 0x100; + + req->status = NVME_SUCCESS; + + if (!all) { + status = nvme_get_mgmt_zone_slba_idx(n, ns, cmd, &slba, &zone_idx); + if (status) { + return status; + } + } + + zone = &ns->zone_array[zone_idx]; + if (slba != zone->d.zslba) { + trace_pci_nvme_err_unaligned_zone_cmd(action, slba, zone->d.zslba); + return NVME_INVALID_FIELD | NVME_DNR; + } + state = nvme_get_zone_state(zone); + + switch (action) { + + case NVME_ZONE_ACTION_OPEN: + trace_pci_nvme_open_zone(slba, zone_idx, all); + status = name_do_zone_op(n, ns, zone, state, all, + nvme_open_zone, nvme_cond_open_all); + break; + + case NVME_ZONE_ACTION_CLOSE: + trace_pci_nvme_close_zone(slba, zone_idx, all); + status = name_do_zone_op(n, ns, zone, state, all, + nvme_close_zone, nvme_cond_close_all); + break; + + case NVME_ZONE_ACTION_FINISH: + trace_pci_nvme_finish_zone(slba, zone_idx, all); + status = name_do_zone_op(n, ns, zone, state, all, + nvme_finish_zone, nvme_cond_finish_all); + break; + + case NVME_ZONE_ACTION_RESET: + trace_pci_nvme_reset_zone(slba, zone_idx, all); + status = name_do_zone_op(n, ns, zone, state, all, + nvme_reset_zone, nvme_cond_reset_all); + break; + + case NVME_ZONE_ACTION_OFFLINE: + trace_pci_nvme_offline_zone(slba, zone_idx, all); + status = name_do_zone_op(n, ns, zone, state, all, + nvme_offline_zone, nvme_cond_offline_all); + break; + + case NVME_ZONE_ACTION_SET_ZD_EXT: + trace_pci_nvme_set_descriptor_extension(slba, zone_idx); + return NVME_INVALID_FIELD | NVME_DNR; + break; + + default: + trace_pci_nvme_err_invalid_mgmt_action(action); + status = NVME_INVALID_FIELD; + } + + if (status == NVME_ZONE_INVAL_TRANSITION) { + trace_pci_nvme_err_invalid_zone_state_transition(state, action, slba, + zone->d.za); + } + if (status) { + status |= NVME_DNR; + } + + return status; +} + +static bool nvme_zone_matches_filter(uint32_t zafs, NvmeZone *zl) +{ + int zs = nvme_get_zone_state(zl); + + switch (zafs) { + case NVME_ZONE_REPORT_ALL: + return true; + case NVME_ZONE_REPORT_EMPTY: + return (zs == NVME_ZONE_STATE_EMPTY); + case NVME_ZONE_REPORT_IMPLICITLY_OPEN: + return (zs == NVME_ZONE_STATE_IMPLICITLY_OPEN); + case NVME_ZONE_REPORT_EXPLICITLY_OPEN: + return (zs == NVME_ZONE_STATE_EXPLICITLY_OPEN); + case NVME_ZONE_REPORT_CLOSED: + return (zs == NVME_ZONE_STATE_CLOSED); + case NVME_ZONE_REPORT_FULL: + return (zs == NVME_ZONE_STATE_FULL); + case NVME_ZONE_REPORT_READ_ONLY: + return (zs == NVME_ZONE_STATE_READ_ONLY); + case NVME_ZONE_REPORT_OFFLINE: + return (zs == NVME_ZONE_STATE_OFFLINE); + default: + return false; + } +} + +static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeCmd *cmd = (NvmeCmd *)&req->cmd; + NvmeNamespace *ns = req->ns; + uint64_t prp1 = le64_to_cpu(cmd->dptr.prp1); + uint64_t prp2 = le64_to_cpu(cmd->dptr.prp2); + /* cdw12 is zero-based number of dwords to return. Convert to bytes */ + uint32_t len = (le32_to_cpu(cmd->cdw12) + 1) << 2; + uint32_t dw13 = le32_to_cpu(cmd->cdw13); + uint32_t zone_idx, zra, zrasf, partial; + uint64_t max_zones, nr_zones = 0; + uint16_t ret; + uint64_t slba; + NvmeZoneDescr *z; + NvmeZone *zs; + NvmeZoneReportHeader *header; + void *buf, *buf_p; + size_t zone_entry_sz; + + req->status = NVME_SUCCESS; + + ret = nvme_get_mgmt_zone_slba_idx(n, ns, cmd, &slba, &zone_idx); + if (ret) { + return ret; + } + + if (len < sizeof(NvmeZoneReportHeader)) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + zra = dw13 & 0xff; + if (!(zra == NVME_ZONE_REPORT || zra == NVME_ZONE_REPORT_EXTENDED)) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + if (zra == NVME_ZONE_REPORT_EXTENDED) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + zrasf = (dw13 >> 8) & 0xff; + if (zrasf > NVME_ZONE_REPORT_OFFLINE) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + partial = (dw13 >> 16) & 0x01; + + zone_entry_sz = sizeof(NvmeZoneDescr); + + max_zones = (len - sizeof(NvmeZoneReportHeader)) / zone_entry_sz; + buf = g_malloc0(len); + + header = (NvmeZoneReportHeader *)buf; + buf_p = buf + sizeof(NvmeZoneReportHeader); + + while (zone_idx < n->num_zones && nr_zones < max_zones) { + zs = &ns->zone_array[zone_idx]; + + if (!nvme_zone_matches_filter(zrasf, zs)) { + zone_idx++; + continue; + } + + z = (NvmeZoneDescr *)buf_p; + buf_p += sizeof(NvmeZoneDescr); + nr_zones++; + + z->zt = zs->d.zt; + z->zs = zs->d.zs; + z->zcap = cpu_to_le64(zs->d.zcap); + z->zslba = cpu_to_le64(zs->d.zslba); + z->za = zs->d.za; + + if (nvme_wp_is_valid(zs)) { + z->wp = cpu_to_le64(zs->d.wp); + } else { + z->wp = cpu_to_le64(~0ULL); + } + + zone_idx++; + } + + if (!partial) { + for (; zone_idx < n->num_zones; zone_idx++) { + zs = &ns->zone_array[zone_idx]; + if (nvme_zone_matches_filter(zrasf, zs)) { + nr_zones++; + } + } + } + header->nr_zones = cpu_to_le64(nr_zones); + + ret = nvme_dma_prp(n, (uint8_t *)buf, len, prp1, prp2, + DMA_DIRECTION_FROM_DEVICE, req); + g_free(buf); + + return ret; +} + static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) { uint32_t nsid = le32_to_cpu(req->cmd.nsid); @@ -726,9 +1449,15 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) return nvme_flush(n, req); case NVME_CMD_WRITE_ZEROES: return nvme_write_zeroes(n, req); + case NVME_CMD_ZONE_APND: + return nvme_rw(n, req, true); case NVME_CMD_WRITE: case NVME_CMD_READ: - return nvme_rw(n, req); + return nvme_rw(n, req, false); + case NVME_CMD_ZONE_MGMT_SEND: + return nvme_zone_mgmt_send(n, req); + case NVME_CMD_ZONE_MGMT_RECV: + return nvme_zone_mgmt_recv(n, req); default: trace_pci_nvme_err_invalid_opc(req->cmd.opcode); return NVME_INVALID_OPCODE | NVME_DNR; @@ -957,7 +1686,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len, DMA_DIRECTION_FROM_DEVICE, req); } -static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len, +static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len, uint64_t off, NvmeRequest *req) { NvmeCmd *cmd = &req->cmd; @@ -985,11 +1714,19 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len, acs[NVME_ADM_CMD_GET_LOG_PAGE] = NVME_CMD_EFFECTS_CSUPP; acs[NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFFECTS_CSUPP; - iocs[NVME_CMD_FLUSH] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; - iocs[NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFFECTS_CSUPP | - NVME_CMD_EFFECTS_LBCC; - iocs[NVME_CMD_WRITE] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; - iocs[NVME_CMD_READ] = NVME_CMD_EFFECTS_CSUPP; + if (NVME_CC_CSS(n->bar.cc) != CSS_ADMIN_ONLY) { + iocs[NVME_CMD_FLUSH] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFFECTS_CSUPP | + NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_WRITE] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_READ] = NVME_CMD_EFFECTS_CSUPP; + } + if (csi == NVME_CSI_ZONED && NVME_CC_CSS(n->bar.cc) == CSS_CSI) { + iocs[NVME_CMD_ZONE_APND] = NVME_CMD_EFFECTS_CSUPP | + NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_ZONE_MGMT_SEND] = NVME_CMD_EFFECTS_CSUPP; + iocs[NVME_CMD_ZONE_MGMT_RECV] = NVME_CMD_EFFECTS_CSUPP; + } trans_len = MIN(sizeof(cmd_eff_log) - off, buf_len); @@ -1008,6 +1745,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) uint8_t lid = dw10 & 0xff; uint8_t lsp = (dw10 >> 8) & 0xf; uint8_t rae = (dw10 >> 15) & 0x1; + uint8_t csi = le32_to_cpu(cmd->cdw14) >> 24; uint32_t numdl, numdu; uint64_t off, lpol, lpou; size_t len; @@ -1041,7 +1779,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) case NVME_LOG_FW_SLOT_INFO: return nvme_fw_log_info(n, len, off, req); case NVME_LOG_CMD_EFFECTS: - return nvme_cmd_effects(n, len, off, req); + return nvme_cmd_effects(n, csi, len, off, req); default: trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid); return NVME_INVALID_FIELD | NVME_DNR; @@ -1166,6 +1904,16 @@ static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, uint64_t prp1, return NVME_SUCCESS; } +static inline bool nvme_csi_has_nvm_support(NvmeNamespace *ns) +{ + switch (ns->csi) { + case NVME_CSI_NVM: + case NVME_CSI_ZONED: + return true; + } + return false; +} + static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; @@ -1181,13 +1929,22 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + NvmeIdCtrlZoned *id; uint64_t prp1 = le64_to_cpu(c->prp1); uint64_t prp2 = le64_to_cpu(c->prp2); + uint16_t ret; trace_pci_nvme_identify_ctrl_csi(c->csi); if (c->csi == NVME_CSI_NVM) { return nvme_rpt_empty_id_struct(n, prp1, prp2, req); + } else if (c->csi == NVME_CSI_ZONED && n->params.zoned) { + id = g_malloc0(sizeof(*id)); + id->zasl = n->zasl; + ret = nvme_dma_prp(n, (uint8_t *)id, sizeof(*id), prp1, prp2, + DMA_DIRECTION_FROM_DEVICE, req); + g_free(id); + return ret; } return NVME_INVALID_FIELD | NVME_DNR; @@ -1216,8 +1973,12 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req, return nvme_rpt_empty_id_struct(n, prp1, prp2, req); } - return nvme_dma_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns), prp1, - prp2, DMA_DIRECTION_FROM_DEVICE, req); + if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) { + return nvme_dma_prp(n, (uint8_t *)&ns->id_ns, sizeof(ns->id_ns), prp1, + prp2, DMA_DIRECTION_FROM_DEVICE, req); + } + + return NVME_INVALID_CMD_SET | NVME_DNR; } static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req, @@ -1243,8 +2004,12 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req, return nvme_rpt_empty_id_struct(n, prp1, prp2, req); } - if (c->csi == NVME_CSI_NVM) { + if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) { return nvme_rpt_empty_id_struct(n, prp1, prp2, req); + } else if (c->csi == NVME_CSI_ZONED && ns->csi == NVME_CSI_ZONED) { + return nvme_dma_prp(n, (uint8_t *)ns->id_ns_zoned, + sizeof(*ns->id_ns_zoned), prp1, prp2, + DMA_DIRECTION_FROM_DEVICE, req); } return NVME_INVALID_FIELD | NVME_DNR; @@ -1304,7 +2069,7 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req, trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi); - if (c->csi != NVME_CSI_NVM) { + if (c->csi != NVME_CSI_NVM && c->csi != NVME_CSI_ZONED) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1368,7 +2133,7 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) desc->nidt = NVME_NIDT_CSI; desc->nidl = NVME_NIDL_CSI; buf_ptr += sizeof(*desc); - *(uint8_t *)buf_ptr = NVME_CSI_NVM; + *(uint8_t *)buf_ptr = ns->csi; status = nvme_dma_prp(n, buf, data_len, prp1, prp2, DMA_DIRECTION_FROM_DEVICE, req); @@ -1391,6 +2156,9 @@ static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req) list = g_malloc0(data_len); ptr = (uint8_t *)list; NVME_SET_CSI(*ptr, NVME_CSI_NVM); + if (n->params.zoned) { + NVME_SET_CSI(*ptr, NVME_CSI_ZONED); + } status = nvme_dma_prp(n, (uint8_t *)list, data_len, prp1, prp2, DMA_DIRECTION_FROM_DEVICE, req); g_free(list); @@ -1959,6 +2727,20 @@ static int nvme_start_ctrl(NvmeCtrl *n) n->namespaces[i].attached = true; } break; + case NVME_CSI_ZONED: + if (NVME_CC_CSS(n->bar.cc) == CSS_CSI) { + n->namespaces[i].attached = true; + } + break; + } + } + + if (n->params.zoned) { + if (!n->zasl_bs) { + assert(n->params.mdts); + n->zasl = n->params.mdts; + } else { + n->zasl = nvme_ilog2(n->zasl_bs / n->page_size); } } @@ -2022,12 +2804,18 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data, } else { switch (NVME_CC_CSS(data)) { case CSS_NVM_ONLY: - trace_pci_nvme_css_nvm_cset_selected_by_host(data & - 0xffffffff); + if (n->params.zoned) { + NVME_GUEST_ERR(pci_nvme_err_only_zoned_cmd_set_avail, + "only NVM+ZONED command set can be selected"); break; + } + trace_pci_nvme_css_nvm_cset_selected_by_host(data & + 0xffffffff); + break; case CSS_CSI: NVME_SET_CC_CSS(n->bar.cc, CSS_CSI); - trace_pci_nvme_css_all_csets_sel_by_host(data & 0xffffffff); + trace_pci_nvme_css_all_csets_sel_by_host(data & + 0xffffffff); break; case CSS_ADMIN_ONLY: break; @@ -2359,6 +3147,126 @@ static const MemoryRegionOps nvme_cmb_ops = { }, }; +static int nvme_init_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, + uint64_t capacity) +{ + NvmeZone *zone; + uint64_t start = 0, zone_size = n->zone_size; + int i; + + ns->zone_array = g_malloc0(n->zone_array_size); + ns->exp_open_zones = g_malloc0(sizeof(NvmeZoneList)); + ns->imp_open_zones = g_malloc0(sizeof(NvmeZoneList)); + ns->closed_zones = g_malloc0(sizeof(NvmeZoneList)); + ns->full_zones = g_malloc0(sizeof(NvmeZoneList)); + zone = ns->zone_array; + + nvme_init_zone_list(ns->exp_open_zones); + nvme_init_zone_list(ns->imp_open_zones); + nvme_init_zone_list(ns->closed_zones); + nvme_init_zone_list(ns->full_zones); + + for (i = 0; i < n->num_zones; i++, zone++) { + if (start + zone_size > capacity) { + zone_size = capacity - start; + } + zone->d.zt = NVME_ZONE_TYPE_SEQ_WRITE; + nvme_set_zone_state(zone, NVME_ZONE_STATE_EMPTY); + zone->d.za = 0; + zone->d.zcap = n->zone_capacity; + zone->d.zslba = start; + zone->d.wp = start; + zone->prev = 0; + zone->next = 0; + start += zone_size; + } + + return 0; +} + +static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) +{ + uint64_t zone_size, zone_cap; + uint32_t nz; + + if (n->params.zone_size_mb) { + zone_size = n->params.zone_size_mb; + } else { + zone_size = NVME_DEFAULT_ZONE_SIZE; + } + if (n->params.zone_capacity_mb) { + zone_cap = n->params.zone_capacity_mb; + } else { + zone_cap = zone_size; + } + n->zone_size = zone_size * MiB / n->conf.logical_block_size; + n->zone_capacity = zone_cap * MiB / n->conf.logical_block_size; + if (n->zone_capacity > n->zone_size) { + error_setg(errp, "zone capacity exceeds zone size"); + return; + } + + nz = DIV_ROUND_UP(n->ns_size / n->conf.logical_block_size, n->zone_size); + n->num_zones = nz; + n->zone_array_size = sizeof(NvmeZone) * nz; + n->zone_size_log2 = is_power_of_2(n->zone_size) ? nvme_ilog2(n->zone_size) : + 0; + + if (!n->params.zasl_kb) { + n->zasl_bs = n->params.mdts ? 0 : NVME_DEFAULT_MAX_ZA_SIZE * KiB; + } else { + n->zasl_bs = n->params.zasl_kb * KiB; + } + + return; +} + +static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, + Error **errp) +{ + int ret; + + ret = nvme_init_zone_meta(n, ns, n->num_zones * n->zone_size); + if (ret) { + error_setg(errp, "could not init zone metadata"); + return -1; + } + + ns->id_ns_zoned = g_malloc0(sizeof(*ns->id_ns_zoned)); + + /* MAR/MOR are zeroes-based, 0xffffffff means no limit */ + ns->id_ns_zoned->mar = 0xffffffff; + ns->id_ns_zoned->mor = 0xffffffff; + ns->id_ns_zoned->zoc = 0; + ns->id_ns_zoned->ozcs = n->params.cross_zone_read ? 0x01 : 0x00; + + ns->id_ns_zoned->lbafe[lba_index].zsze = cpu_to_le64(n->zone_size); + ns->id_ns_zoned->lbafe[lba_index].zdes = 0; + + if (n->params.fill_pattern == 0) { + ns->id_ns.dlfeat = 0x01; + } else if (n->params.fill_pattern == 0xff) { + ns->id_ns.dlfeat = 0x02; + } + + return 0; +} + +static void nvme_zoned_clear(NvmeCtrl *n) +{ + int i; + + for (i = 0; i < n->num_namespaces; i++) { + NvmeNamespace *ns = &n->namespaces[i]; + g_free(ns->id_ns_zoned); + g_free(ns->zone_array); + g_free(ns->exp_open_zones); + g_free(ns->imp_open_zones); + g_free(ns->closed_zones); + g_free(ns->full_zones); + } +} + static void nvme_check_constraints(NvmeCtrl *n, Error **errp) { NvmeParams *params = &n->params; @@ -2427,18 +3335,13 @@ static void nvme_init_state(NvmeCtrl *n) static void nvme_init_blk(NvmeCtrl *n, Error **errp) { + int64_t bs_size; + if (!blkconf_blocksizes(&n->conf, errp)) { return; } blkconf_apply_backend_options(&n->conf, blk_is_read_only(n->conf.blk), false, errp); -} - -static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) -{ - int64_t bs_size; - NvmeIdNs *id_ns = &ns->id_ns; - int lba_index; bs_size = blk_getlength(n->conf.blk); if (bs_size < 0) { @@ -2447,6 +3350,12 @@ static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) } n->ns_size = bs_size; +} + +static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) +{ + NvmeIdNs *id_ns = &ns->id_ns; + int lba_index; ns->csi = NVME_CSI_NVM; qemu_uuid_generate(&ns->uuid); /* TODO make UUIDs persistent */ @@ -2454,8 +3363,18 @@ static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) id_ns->lbaf[lba_index].ds = nvme_ilog2(n->conf.logical_block_size); id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(n, ns)); + if (n->params.zoned) { + ns->csi = NVME_CSI_ZONED; + id_ns->ncap = cpu_to_le64(n->zone_capacity * n->num_zones); + if (nvme_zoned_init_ns(n, ns, lba_index, errp) != 0) { + return; + } + } else { + ns->csi = NVME_CSI_NVM; + id_ns->ncap = id_ns->nsze; + } + /* no thin provisioning */ - id_ns->ncap = id_ns->nsze; id_ns->nuse = id_ns->ncap; } @@ -2615,8 +3534,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) NVME_CAP_SET_CQR(n->bar.cap, 1); NVME_CAP_SET_TO(n->bar.cap, 0xf); /* - * The device now always supports NS Types, but all commands - * that support CSI field will only handle NVM Command Set. + * The device now always supports NS Types, even when "zoned" property + * is set to zero. If this is the case, all commands that support CSI + * field only handle NVM Command Set. */ NVME_CAP_SET_CSS(n->bar.cap, (CAP_CSS_NVM | CAP_CSS_CSI_SUPP)); NVME_CAP_SET_MPSMAX(n->bar.cap, 4); @@ -2652,6 +3572,13 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) return; } + if (n->params.zoned) { + nvme_zoned_init_ctrl(n, &local_err); + if (local_err) { + error_propagate(errp, local_err); + return; + } + } nvme_init_ctrl(n, pci_dev); ns = n->namespaces; @@ -2670,6 +3597,9 @@ static void nvme_exit(PCIDevice *pci_dev) NvmeCtrl *n = NVME(pci_dev); nvme_clear_ctrl(n); + if (n->params.zoned) { + nvme_zoned_clear(n); + } g_free(n->namespaces); g_free(n->cq); g_free(n->sq); @@ -2697,6 +3627,13 @@ static Property nvme_props[] = { DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3), DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64), DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7), + DEFINE_PROP_BOOL("zoned", NvmeCtrl, params.zoned, false), + DEFINE_PROP_UINT64("zone_size", NvmeCtrl, params.zone_size_mb, + NVME_DEFAULT_ZONE_SIZE), + DEFINE_PROP_UINT64("zone_capacity", NvmeCtrl, params.zone_capacity_mb, 0), + DEFINE_PROP_UINT32("zone_append_size_limit", NvmeCtrl, params.zasl_kb, 0), + DEFINE_PROP_BOOL("cross_zone_read", NvmeCtrl, params.cross_zone_read, true), + DEFINE_PROP_UINT8("fill_pattern", NvmeCtrl, params.fill_pattern, 0), DEFINE_PROP_END_OF_LIST(), }; From patchwork Sat Sep 12 22:54:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772139 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 71C7A112E for ; Sat, 12 Sep 2020 22:58:34 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0EDBA20731 for ; Sat, 12 Sep 2020 22:58:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="EqRVtjME" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0EDBA20731 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:59464 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHETZ-000359-4z for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 18:58:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48984) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQE-0004Le-6S; Sat, 12 Sep 2020 18:55:06 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26879) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQB-0005R7-3Y; Sat, 12 Sep 2020 18:55:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951302; x=1631487302; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=d1RT9MutixJLktOervHixvGvgRTaqnkc5lEkoE9fCGs=; b=EqRVtjME6Y16dTKeXdyyzsVPo3BWohon16KwlnVibjd97PJTB2xHH7Jx 1hlqOJGxdDbrOILd2wxsfzBzkCjF5sbwfOdfCM3ObsnmGm8OT5znd0vZq 7os5wvkdW1trb7F/8nzREdDSX6sLENsgOjhagF9HlX2bx00l4RCpSpIpY 1EWLf/wcBdsAHoEYQgaN7flUX9toCSUmQc2LXXTsrWMkojIUVFc33QWSC KbxJ7tc8ZrxfWi4B105O77b+TEoLgx5XSsp68T8NbThIx9BxNDPAgS83U WOxdKXAFyOPaPudm+77LSu+cMtvqIMFk1Ds9ukiqHR6V7LMLrn0c87Zih w==; IronPort-SDR: cL0GJa1fWO+hBCTqxvPLws4EcdqK3bLn5w77x6VMfmSc2FoukEEt5JxdlBjCY8T/eXzv1R+WBw 7g8nKGTk2FSvtkltXw7H4r/2MEJuUy0xL/UHrQNoRKXESrUcD0T8eOn+LscFO046tzajUMGXHN 2iZDFjyoox1foPkuceOCf0zgdhEulx1js2Mv/tpvzp/6Z1Ruq5AC3zSlzz7jnIoc7BbpYbatMA laOCl4dGbIrGeE9N5g4ld1b7eYBS0D121yX76mSMCf43/TzQE50V54aWYqag1j2YKMeH383AGu YTo= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834862" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:55:01 +0800 IronPort-SDR: GC3eRfAOlhZWqTaVBlP/YQ1UI329F+ZrE3h4Ef84u98vPJyN2GYcO9HfgeOt2n3Ih56Sfhljww wgjue71IgrAQ== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:21 -0700 IronPort-SDR: UG4YrvMaHOYwX0lyDXNeoWWkZK2rw7k5XDNNKTFWIORIci74J5qkBVXm5WzWYBdo9GVjRio1Mt t4B4ugKMBzSw== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:55:00 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 11/15] hw/block/nvme: Introduce max active and open zone limits Date: Sun, 13 Sep 2020 07:54:26 +0900 Message-Id: <20200912225430.1772-12-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Added two module properties, "max_active" and "max_open" to control the maximum number of zones that can be active or open. Once these variables are set to non-default values, these limits are checked during I/O and Too Many Active or Too Many Open command status is returned if they are exceeded. Signed-off-by: Hans Holmberg Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 179 +++++++++++++++++++++++++++++++++++++++++++++++- hw/block/nvme.h | 4 ++ 2 files changed, 181 insertions(+), 2 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 1b0e06002c..df536fd736 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -176,6 +176,87 @@ static void nvme_remove_zone(NvmeCtrl *n, NvmeNamespace *ns, NvmeZoneList *zl, zone->prev = zone->next = 0; } +/* + * Take the first zone out from a list, return NULL if the list is empty. + */ +static NvmeZone *nvme_remove_zone_head(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZoneList *zl) +{ + NvmeZone *zone = nvme_peek_zone_head(ns, zl); + + if (zone) { + --zl->size; + if (zl->size == 0) { + zl->head = NVME_ZONE_LIST_NIL; + zl->tail = NVME_ZONE_LIST_NIL; + } else { + zl->head = zone->next; + ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL; + } + zone->prev = zone->next = 0; + } + + return zone; +} + +/* + * Check if we can open a zone without exceeding open/active limits. + * AOR stands for "Active and Open Resources" (see TP 4053 section 2.5). + */ +static int nvme_aor_check(NvmeCtrl *n, NvmeNamespace *ns, + uint32_t act, uint32_t opn) +{ + if (n->params.max_active_zones != 0 && + ns->nr_active_zones + act > n->params.max_active_zones) { + trace_pci_nvme_err_insuff_active_res(n->params.max_active_zones); + return NVME_ZONE_TOO_MANY_ACTIVE | NVME_DNR; + } + if (n->params.max_open_zones != 0 && + ns->nr_open_zones + opn > n->params.max_open_zones) { + trace_pci_nvme_err_insuff_open_res(n->params.max_open_zones); + return NVME_ZONE_TOO_MANY_OPEN | NVME_DNR; + } + + return NVME_SUCCESS; +} + +static inline void nvme_aor_inc_open(NvmeCtrl *n, NvmeNamespace *ns) +{ + assert(ns->nr_open_zones >= 0); + if (n->params.max_open_zones) { + ns->nr_open_zones++; + assert(ns->nr_open_zones <= n->params.max_open_zones); + } +} + +static inline void nvme_aor_dec_open(NvmeCtrl *n, NvmeNamespace *ns) +{ + if (n->params.max_open_zones) { + assert(ns->nr_open_zones > 0); + ns->nr_open_zones--; + } + assert(ns->nr_open_zones >= 0); +} + +static inline void nvme_aor_inc_active(NvmeCtrl *n, NvmeNamespace *ns) +{ + assert(ns->nr_active_zones >= 0); + if (n->params.max_active_zones) { + ns->nr_active_zones++; + assert(ns->nr_active_zones <= n->params.max_active_zones); + } +} + +static inline void nvme_aor_dec_active(NvmeCtrl *n, NvmeNamespace *ns) +{ + if (n->params.max_active_zones) { + assert(ns->nr_active_zones > 0); + ns->nr_active_zones--; + assert(ns->nr_active_zones >= ns->nr_open_zones); + } + assert(ns->nr_active_zones >= 0); +} + static void nvme_assign_zone_state(NvmeCtrl *n, NvmeNamespace *ns, NvmeZone *zone, uint8_t state) { @@ -715,6 +796,24 @@ static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns, return NVME_SUCCESS; } +static void nvme_auto_transition_zone(NvmeCtrl *n, NvmeNamespace *ns, + bool implicit, bool adding_active) +{ + NvmeZone *zone; + + if (implicit && n->params.max_open_zones && + ns->nr_open_zones == n->params.max_open_zones) { + zone = nvme_remove_zone_head(n, ns, ns->imp_open_zones); + if (zone) { + /* + * Automatically close this implicitly open zone. + */ + nvme_aor_dec_open(n, ns); + nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_CLOSED); + } + } +} + static uint16_t nvme_check_zone_write(NvmeZone *zone, uint64_t slba, uint32_t nlb) { @@ -792,6 +891,23 @@ static uint16_t nvme_check_zone_read(NvmeCtrl *n, NvmeZone *zone, uint64_t slba, return status; } +static uint16_t nvme_auto_open_zone(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone) +{ + uint16_t status = NVME_SUCCESS; + uint8_t zs = nvme_get_zone_state(zone); + + if (zs == NVME_ZONE_STATE_EMPTY) { + nvme_auto_transition_zone(n, ns, true, true); + status = nvme_aor_check(n, ns, 1, 1); + } else if (zs == NVME_ZONE_STATE_CLOSED) { + nvme_auto_transition_zone(n, ns, true, false); + status = nvme_aor_check(n, ns, 0, 1); + } + + return status; +} + static inline uint32_t nvme_zone_idx(NvmeCtrl *n, uint64_t slba) { return n->zone_size_log2 > 0 ? slba >> n->zone_size_log2 : @@ -827,7 +943,11 @@ static void nvme_finalize_zone_write(NvmeCtrl *n, NvmeRequest *req) switch (zs) { case NVME_ZONE_STATE_IMPLICITLY_OPEN: case NVME_ZONE_STATE_EXPLICITLY_OPEN: + nvme_aor_dec_open(n, ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(n, ns); + /* fall through */ case NVME_ZONE_STATE_EMPTY: break; default: @@ -837,7 +957,10 @@ static void nvme_finalize_zone_write(NvmeCtrl *n, NvmeRequest *req) } else { switch (zs) { case NVME_ZONE_STATE_EMPTY: + nvme_aor_inc_active(n, ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_inc_open(n, ns); nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN); } @@ -922,6 +1045,11 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req) zone->d.wp); return NVME_ZONE_INVALID_WRITE | NVME_DNR; } + + status = nvme_auto_open_zone(n, ns, zone); + if (status != NVME_SUCCESS) { + return status; + } } block_acct_start(blk_get_stats(n->conf.blk), &req->acct, 0, @@ -993,6 +1121,12 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req, bool append) zone->d.wp); return NVME_ZONE_INVALID_WRITE | NVME_DNR; } + + status = nvme_auto_open_zone(n, ns, zone); + if (status != NVME_SUCCESS) { + return status; + } + req->fill_ofs = -1LL; } else { status = nvme_check_zone_read(n, zone, slba, nlb, @@ -1085,9 +1219,27 @@ static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeCtrl *n, NvmeNamespace *ns, static uint16_t nvme_open_zone(NvmeCtrl *n, NvmeNamespace *ns, NvmeZone *zone, uint8_t state) { + uint16_t status; + switch (state) { case NVME_ZONE_STATE_EMPTY: + nvme_auto_transition_zone(n, ns, false, true); + status = nvme_aor_check(n, ns, 1, 0); + if (status != NVME_SUCCESS) { + return status; + } + nvme_aor_inc_active(n, ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + status = nvme_aor_check(n, ns, 0, 1); + if (status != NVME_SUCCESS) { + if (state == NVME_ZONE_STATE_EMPTY) { + nvme_aor_dec_active(n, ns); + } + return status; + } + nvme_aor_inc_open(n, ns); + /* fall through */ case NVME_ZONE_STATE_IMPLICITLY_OPEN: nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN); /* fall through */ @@ -1109,6 +1261,7 @@ static uint16_t nvme_close_zone(NvmeCtrl *n, NvmeNamespace *ns, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(n, ns); nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_CLOSED); /* fall through */ case NVME_ZONE_STATE_CLOSED: @@ -1130,7 +1283,11 @@ static uint16_t nvme_finish_zone(NvmeCtrl *n, NvmeNamespace *ns, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(n, ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(n, ns); + /* fall through */ case NVME_ZONE_STATE_EMPTY: zone->d.wp = nvme_zone_wr_boundary(zone); nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_FULL); @@ -1155,7 +1312,11 @@ static uint16_t nvme_reset_zone(NvmeCtrl *n, NvmeNamespace *ns, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(n, ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(n, ns); + /* fall through */ case NVME_ZONE_STATE_FULL: zone->d.wp = zone->d.zslba; nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_EMPTY); @@ -3218,6 +3379,18 @@ static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) n->zasl_bs = n->params.zasl_kb * KiB; } + /* Make sure that the values of all Zoned Command Set properties are sane */ + if (n->params.max_open_zones > nz) { + warn_report("max_open_zones value %u exceeds the number of zones %u," + " adjusting", n->params.max_open_zones, nz); + n->params.max_open_zones = nz; + } + if (n->params.max_active_zones > nz) { + warn_report("max_active_zones value %u exceeds the number of zones %u," + " adjusting", n->params.max_active_zones, nz); + n->params.max_active_zones = nz; + } + return; } @@ -3235,8 +3408,8 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, ns->id_ns_zoned = g_malloc0(sizeof(*ns->id_ns_zoned)); /* MAR/MOR are zeroes-based, 0xffffffff means no limit */ - ns->id_ns_zoned->mar = 0xffffffff; - ns->id_ns_zoned->mor = 0xffffffff; + ns->id_ns_zoned->mar = cpu_to_le32(n->params.max_active_zones - 1); + ns->id_ns_zoned->mor = cpu_to_le32(n->params.max_open_zones - 1); ns->id_ns_zoned->zoc = 0; ns->id_ns_zoned->ozcs = n->params.cross_zone_read ? 0x01 : 0x00; @@ -3632,6 +3805,8 @@ static Property nvme_props[] = { NVME_DEFAULT_ZONE_SIZE), DEFINE_PROP_UINT64("zone_capacity", NvmeCtrl, params.zone_capacity_mb, 0), DEFINE_PROP_UINT32("zone_append_size_limit", NvmeCtrl, params.zasl_kb, 0), + DEFINE_PROP_UINT32("max_active", NvmeCtrl, params.max_active_zones, 0), + DEFINE_PROP_UINT32("max_open", NvmeCtrl, params.max_open_zones, 0), DEFINE_PROP_BOOL("cross_zone_read", NvmeCtrl, params.cross_zone_read, true), DEFINE_PROP_UINT8("fill_pattern", NvmeCtrl, params.fill_pattern, 0), DEFINE_PROP_END_OF_LIST(), diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 9514c58919..4a3b23ed72 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -22,6 +22,8 @@ typedef struct NvmeParams { uint32_t zasl_kb; uint64_t zone_size_mb; uint64_t zone_capacity_mb; + uint32_t max_active_zones; + uint32_t max_open_zones; } NvmeParams; typedef struct NvmeAsyncEvent { @@ -103,6 +105,8 @@ typedef struct NvmeNamespace { NvmeZoneList *imp_open_zones; NvmeZoneList *closed_zones; NvmeZoneList *full_zones; + int32_t nr_open_zones; + int32_t nr_active_zones; } NvmeNamespace; static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns) From patchwork Sat Sep 12 22:54:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772167 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7D3EE746 for ; Sat, 12 Sep 2020 23:05:36 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4354F20723 for ; Sat, 12 Sep 2020 23:05:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="k2Kby+g3" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4354F20723 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:35746 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEaN-00085y-EJ for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:05:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49054) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQb-0004rs-8W; Sat, 12 Sep 2020 18:55:29 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26909) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQY-0005ef-NG; Sat, 12 Sep 2020 18:55:28 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951326; x=1631487326; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xcY2j/8W2gFLX21Q3zxBsUT9jmjD/iDcFP+Spk0iA6U=; b=k2Kby+g3p25qN+8fR6ZN38TxPZfwZdnRWpE0SZ2ZzkK0XqZhY18csGmP RLNb2xszWRXYmCAT8YTlz4eLGamvwp4Zabpawr9Nd2STS7SqmjsSb9y8n YYvNnnF91+vNKyDHLSuC55sIsRdIlEOCSaDuogYBy4JqprzdzyNWcJH1z VgLUVUkA2WOXIG8UW2ul8Ot1AEGNZzcx8oF9okvOqgVTa6UzH5LHQbH6b +ytwUZyxBUHdQ7Rm7lsi5VlzXflEo0giLBa2+19GV3jc1ljDTZh/kMiSX 6SnqMGqwmlUQkhu+fNHJwIEctF2FduVsbC12fMLn03v6LbLkhgdfuFVGv w==; IronPort-SDR: S+2X7/PwQ0cBaPxH5QCcuUGzZFaWkp8s+bmsXo9RYdYdVjJj6zT6wrp1kKCqpN6F1RinkrLIC6 8YYQ2Xi8zkSj7l5XHB7ylmzMMtJbqWwIcZKbD82t7rh4aJn+ZvOIi6CKksKIbjxTHF6dzqXEdf //FkssDd1dKzTiVuncmJdrTFhKABSPEYmaQNgBePEHhgn+XtnCYDmNa7h9F+o/KOwlKgG35ocw aoehlfpJa7GFZ7f0wkdZusiPjBO/KR9b1S2WSCNf9l6Owncd4mv6sw5nenxnjBQrKj+05d3VCz Ocg= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834864" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:55:03 +0800 IronPort-SDR: 2Ds0IL+ga3rOwXOVXU5pzJFrJcZIQ7Yb5qk2QRQc/NKiKkf8s0AIkg/YMSqpGF7GRQUTEq2d/f owcfpXrMXhGw== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:24 -0700 IronPort-SDR: 5gx5KLrlDqFUQjFktGc/5whg4cYQzSkRHsvQ9pk4WH2SG0PlI+t3oGT1x8rMVuc9pVRWXCMDw1 sMXkUO0y+9Sw== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:55:02 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 12/15] hw/block/nvme: Support Zone Descriptor Extensions Date: Sun, 13 Sep 2020 07:54:27 +0900 Message-Id: <20200912225430.1772-13-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Zone Descriptor Extension is a label that can be assigned to a zone. It can be set to an Empty zone and it stays assigned until the zone is reset. This commit adds a new optional module property, "zone_descr_ext_size". Its value must be a multiple of 64 bytes. If this value is non-zero, it becomes possible to assign extensions of that size to any Empty zones. The default value for this property is 0, therefore setting extensions is disabled by default. Signed-off-by: Hans Holmberg Signed-off-by: Dmitry Fomichev Reviewed-by: Klaus Jensen --- hw/block/nvme.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++--- hw/block/nvme.h | 8 ++++++ 2 files changed, 77 insertions(+), 4 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index df536fd736..ec7fade674 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1355,6 +1355,26 @@ static bool nvme_cond_offline_all(uint8_t state) return state == NVME_ZONE_STATE_READ_ONLY; } +static uint16_t nvme_set_zd_ext(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, uint8_t state) +{ + uint16_t status; + + if (state == NVME_ZONE_STATE_EMPTY) { + nvme_auto_transition_zone(n, ns, false, true); + status = nvme_aor_check(n, ns, 1, 0); + if (status != NVME_SUCCESS) { + return status; + } + nvme_aor_inc_active(n, ns); + zone->d.za |= NVME_ZA_ZD_EXT_VALID; + nvme_assign_zone_state(n, ns, zone, NVME_ZONE_STATE_CLOSED); + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + typedef uint16_t (*op_handler_t)(NvmeCtrl *, NvmeNamespace *, NvmeZone *, uint8_t); typedef bool (*need_to_proc_zone_t)(uint8_t); @@ -1389,12 +1409,14 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) NvmeCmd *cmd = (NvmeCmd *)&req->cmd; NvmeNamespace *ns = req->ns; uint32_t dw13 = le32_to_cpu(cmd->cdw13); + uint64_t prp1, prp2; uint64_t slba = 0; uint32_t zone_idx = 0; uint16_t status; uint8_t action, state; bool all; NvmeZone *zone; + uint8_t *zd_ext; action = dw13 & 0xff; all = dw13 & 0x100; @@ -1449,7 +1471,24 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) case NVME_ZONE_ACTION_SET_ZD_EXT: trace_pci_nvme_set_descriptor_extension(slba, zone_idx); - return NVME_INVALID_FIELD | NVME_DNR; + if (all || !n->params.zd_extension_size) { + return NVME_INVALID_FIELD | NVME_DNR; + } + zd_ext = nvme_get_zd_extension(n, ns, zone_idx); + prp1 = le64_to_cpu(cmd->dptr.prp1); + prp2 = le64_to_cpu(cmd->dptr.prp2); + status = nvme_dma_prp(n, zd_ext, n->params.zd_extension_size, + prp1, prp2, DMA_DIRECTION_TO_DEVICE, req); + if (status) { + trace_pci_nvme_err_zd_extension_map_error(zone_idx); + return status; + } + + status = nvme_set_zd_ext(n, ns, zone, state); + if (status == NVME_SUCCESS) { + trace_pci_nvme_zd_extension_set(zone_idx); + return status; + } break; default: @@ -1529,7 +1568,7 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } - if (zra == NVME_ZONE_REPORT_EXTENDED) { + if (zra == NVME_ZONE_REPORT_EXTENDED && !n->params.zd_extension_size) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1541,6 +1580,9 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) partial = (dw13 >> 16) & 0x01; zone_entry_sz = sizeof(NvmeZoneDescr); + if (zra == NVME_ZONE_REPORT_EXTENDED) { + zone_entry_sz += n->params.zd_extension_size; + } max_zones = (len - sizeof(NvmeZoneReportHeader)) / zone_entry_sz; buf = g_malloc0(len); @@ -1572,6 +1614,14 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) z->wp = cpu_to_le64(~0ULL); } + if (zra == NVME_ZONE_REPORT_EXTENDED) { + if (zs->d.za & NVME_ZA_ZD_EXT_VALID) { + memcpy(buf_p, nvme_get_zd_extension(n, ns, zone_idx), + n->params.zd_extension_size); + } + buf_p += n->params.zd_extension_size; + } + zone_idx++; } @@ -2686,7 +2736,6 @@ static uint16_t nvme_aer(NvmeCtrl *n, NvmeRequest *req) n->aer_reqs[n->outstanding_aers] = req; n->outstanding_aers++; - if (!QTAILQ_EMPTY(&n->aer_queue)) { nvme_process_aers(n); } @@ -3320,6 +3369,7 @@ static int nvme_init_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, ns->imp_open_zones = g_malloc0(sizeof(NvmeZoneList)); ns->closed_zones = g_malloc0(sizeof(NvmeZoneList)); ns->full_zones = g_malloc0(sizeof(NvmeZoneList)); + ns->zd_extensions = g_malloc0(n->params.zd_extension_size * n->num_zones); zone = ns->zone_array; nvme_init_zone_list(ns->exp_open_zones); @@ -3390,6 +3440,17 @@ static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) " adjusting", n->params.max_active_zones, nz); n->params.max_active_zones = nz; } + if (n->params.zd_extension_size) { + if (n->params.zd_extension_size & 0x3f) { + error_setg(errp, + "zone descriptor extension size must be a multiple of 64B"); + return; + } + if ((n->params.zd_extension_size >> 6) > 0xff) { + error_setg(errp, "zone descriptor extension size is too large"); + return; + } + } return; } @@ -3414,7 +3475,8 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, ns->id_ns_zoned->ozcs = n->params.cross_zone_read ? 0x01 : 0x00; ns->id_ns_zoned->lbafe[lba_index].zsze = cpu_to_le64(n->zone_size); - ns->id_ns_zoned->lbafe[lba_index].zdes = 0; + ns->id_ns_zoned->lbafe[lba_index].zdes = + n->params.zd_extension_size >> 6; /* Units of 64B */ if (n->params.fill_pattern == 0) { ns->id_ns.dlfeat = 0x01; @@ -3437,6 +3499,7 @@ static void nvme_zoned_clear(NvmeCtrl *n) g_free(ns->imp_open_zones); g_free(ns->closed_zones); g_free(ns->full_zones); + g_free(ns->zd_extensions); } } @@ -3805,6 +3868,8 @@ static Property nvme_props[] = { NVME_DEFAULT_ZONE_SIZE), DEFINE_PROP_UINT64("zone_capacity", NvmeCtrl, params.zone_capacity_mb, 0), DEFINE_PROP_UINT32("zone_append_size_limit", NvmeCtrl, params.zasl_kb, 0), + DEFINE_PROP_UINT32("zone_descr_ext_size", NvmeCtrl, + params.zd_extension_size, 0), DEFINE_PROP_UINT32("max_active", NvmeCtrl, params.max_active_zones, 0), DEFINE_PROP_UINT32("max_open", NvmeCtrl, params.max_open_zones, 0), DEFINE_PROP_BOOL("cross_zone_read", NvmeCtrl, params.cross_zone_read, true), diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 4a3b23ed72..e53388ba66 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -24,6 +24,7 @@ typedef struct NvmeParams { uint64_t zone_capacity_mb; uint32_t max_active_zones; uint32_t max_open_zones; + uint32_t zd_extension_size; } NvmeParams; typedef struct NvmeAsyncEvent { @@ -105,6 +106,7 @@ typedef struct NvmeNamespace { NvmeZoneList *imp_open_zones; NvmeZoneList *closed_zones; NvmeZoneList *full_zones; + uint8_t *zd_extensions; int32_t nr_open_zones; int32_t nr_active_zones; } NvmeNamespace; @@ -218,6 +220,12 @@ static inline bool nvme_wp_is_valid(NvmeZone *zone) st != NVME_ZONE_STATE_OFFLINE; } +static inline uint8_t *nvme_get_zd_extension(NvmeCtrl *n, NvmeNamespace *ns, + uint32_t zone_idx) +{ + return &ns->zd_extensions[zone_idx * n->params.zd_extension_size]; +} + /* * Initialize a zone list head. */ From patchwork Sat Sep 12 22:54:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772159 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1067B746 for ; Sat, 12 Sep 2020 23:03:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B918A20770 for ; Sat, 12 Sep 2020 23:03:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="nE/jjvSS" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B918A20770 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:57024 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEYb-0005Bd-U3 for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:03:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49078) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQc-0004vi-Op; Sat, 12 Sep 2020 18:55:30 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26912) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQa-0005f4-2F; Sat, 12 Sep 2020 18:55:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951327; x=1631487327; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Fe7FRRGcF+PM1kLw/MifPE71wdqaWcMDdZBdLsZIvO8=; b=nE/jjvSS4FOY4Yei+0n63+qpqRoqWDnkeFnmTHMfDd277LlIrlB8z6MI 9gO/0vM1kY76TJXDyfKOVedfY+vHxzDpS6jRtoPyYhJ+x2aTxV6MppdyM 0LNAQyGz7sKbj6rePBrJSn2anBJPgTQSx5XNIc3tVYBMAEac05BKrgw2O cuUaBur3rrc82AB3fMYOKtv/rV9EKdp6RFAEPGSu+7qtMjcRZNmV0QnvW DTAZB1/NiTHFbfjk8Vv/F34dhlbzpffZmv/5L2NLctGbAQCVTPWcCAWjI ZXxpJK9ZZy8cU9MhtrkVqjlAPxoFxtL4pY/YFSpUP+ZdGMAcxmBglKfNG Q==; IronPort-SDR: rTXXkeh80pabVRHjnD1UrdfaQ91w7SnpsTRgpNjX/bySrXbkYCZBFP+Gxuzd58pt/indNTRVLd dwIdx6KKl/w4k0k/VC3Fe+iqO0XFPENqXse9S020ErHEanU+7xQNAOd0ijKFPpL1DDjzgBjByj zF45CYBQ9Kfg6Khg4tWoBh8TT3YyLdzuxKo76o18vB6t1s/mRAfPK6NqQQ1J3JNYQXm2VPc/gy H0JnLY+fLqxc4iIrVTxuPxM56yg/0YnyyNgb2Pn2YtUH0F9q0KQ3G2fQfIdmo4zusvMW5pHpP+ X8I= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834867" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:55:05 +0800 IronPort-SDR: APIhFRTGdaEbsiiJhWEmzkDR4z57dsCVFg8Am78Po5FvO3lc0/VQooUwWjgl8/3+0ky/asNGwV n/gOfP6S/OTg== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:26 -0700 IronPort-SDR: Wur8w/eGWIuuW6RHucMXkh8Fyue5XoqJvOmWCWoyKK4XGOHG7nmHVqsTfdkVs16Lmx5Wdxvgr0 dM34yty1Pj8Q== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:55:05 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 13/15] hw/block/nvme: Add injection of Offline/Read-Only zones Date: Sun, 13 Sep 2020 07:54:28 +0900 Message-Id: <20200912225430.1772-14-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" ZNS specification defines two zone conditions for the zones that no longer can function properly, possibly because of flash wear or other internal fault. It is useful to be able to "inject" a small number of such zones for testing purposes. This commit defines two optional device properties, "offline_zones" and "rdonly_zones". Users can assign non-zero values to these variables to specify the number of zones to be initialized as Offline or Read-Only. The actual number of injected zones may be smaller than the requested amount - Read-Only and Offline counts are expected to be much smaller than the total number of zones on a drive. Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ hw/block/nvme.h | 2 ++ 2 files changed, 48 insertions(+) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index ec7fade674..f0a03bea75 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -3361,8 +3361,11 @@ static int nvme_init_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, uint64_t capacity) { NvmeZone *zone; + Error *err; uint64_t start = 0, zone_size = n->zone_size; + uint32_t rnd; int i; + uint16_t zs; ns->zone_array = g_malloc0(n->zone_array_size); ns->exp_open_zones = g_malloc0(sizeof(NvmeZoneList)); @@ -3392,6 +3395,37 @@ static int nvme_init_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, start += zone_size; } + /* If required, make some zones Offline or Read Only */ + + for (i = 0; i < n->params.nr_offline_zones; i++) { + do { + qcrypto_random_bytes(&rnd, sizeof(rnd), &err); + rnd %= n->num_zones; + } while (rnd < n->params.max_open_zones); + zone = &ns->zone_array[rnd]; + zs = nvme_get_zone_state(zone); + if (zs != NVME_ZONE_STATE_OFFLINE) { + nvme_set_zone_state(zone, NVME_ZONE_STATE_OFFLINE); + } else { + i--; + } + } + + for (i = 0; i < n->params.nr_rdonly_zones; i++) { + do { + qcrypto_random_bytes(&rnd, sizeof(rnd), &err); + rnd %= n->num_zones; + } while (rnd < n->params.max_open_zones); + zone = &ns->zone_array[rnd]; + zs = nvme_get_zone_state(zone); + if (zs != NVME_ZONE_STATE_OFFLINE && + zs != NVME_ZONE_STATE_READ_ONLY) { + nvme_set_zone_state(zone, NVME_ZONE_STATE_READ_ONLY); + } else { + i--; + } + } + return 0; } @@ -3440,6 +3474,16 @@ static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) " adjusting", n->params.max_active_zones, nz); n->params.max_active_zones = nz; } + if (n->params.max_open_zones < nz) { + if (n->params.nr_offline_zones > nz - n->params.max_open_zones) { + n->params.nr_offline_zones = nz - n->params.max_open_zones; + } + if (n->params.nr_rdonly_zones > + nz - n->params.max_open_zones - n->params.nr_offline_zones) { + n->params.nr_rdonly_zones = + nz - n->params.max_open_zones - n->params.nr_offline_zones; + } + } if (n->params.zd_extension_size) { if (n->params.zd_extension_size & 0x3f) { error_setg(errp, @@ -3872,6 +3916,8 @@ static Property nvme_props[] = { params.zd_extension_size, 0), DEFINE_PROP_UINT32("max_active", NvmeCtrl, params.max_active_zones, 0), DEFINE_PROP_UINT32("max_open", NvmeCtrl, params.max_open_zones, 0), + DEFINE_PROP_UINT32("offline_zones", NvmeCtrl, params.nr_offline_zones, 0), + DEFINE_PROP_UINT32("rdonly_zones", NvmeCtrl, params.nr_rdonly_zones, 0), DEFINE_PROP_BOOL("cross_zone_read", NvmeCtrl, params.cross_zone_read, true), DEFINE_PROP_UINT8("fill_pattern", NvmeCtrl, params.fill_pattern, 0), DEFINE_PROP_END_OF_LIST(), diff --git a/hw/block/nvme.h b/hw/block/nvme.h index e53388ba66..9a5f4787b7 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -25,6 +25,8 @@ typedef struct NvmeParams { uint32_t max_active_zones; uint32_t max_open_zones; uint32_t zd_extension_size; + uint32_t nr_offline_zones; + uint32_t nr_rdonly_zones; } NvmeParams; typedef struct NvmeAsyncEvent { From patchwork Sat Sep 12 22:54:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772169 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1CF17159A for ; Sat, 12 Sep 2020 23:06:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B170C207BB for ; Sat, 12 Sep 2020 23:06:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="g2bX6m0m" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B170C207BB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:40074 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEbV-0001Pk-RQ for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:06:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49094) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQh-00055R-DG; Sat, 12 Sep 2020 18:55:36 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26918) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQc-0005fL-7q; Sat, 12 Sep 2020 18:55:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951329; x=1631487329; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Bt1/nOYmE1By8PZyHGdb3fn9UZsQz8p9fYPzJryi6/8=; b=g2bX6m0mDbKY0P/eHV2PdgGS9rnfioNay+frU+d5DhGaPE7tbcHlnq7w PX6Jz5axttLNX7cfybFWl73+2bJLzr/fyJ3qsz1LKoP5SBWeX/LYw1uvV 9WMS6RWCiyz1Cau+knzdl0c9Hmt6f+gX9vWlAQUN+Kw5tPFd3h703Earz 321yV8Oud42UY+zZ04tlJlfj3vQlF7vF2mbL2l5+D7mxxzzDq/n1mnkjj RTF8CEzrpmYBLlODM4LhpPe0b3dQJ5ARNv1G1FLFYm+AvxzLO0KaSq33M 4h1//IisUxj73YrkKH3dhz+aIaqbbJ1RVDTmPjMjY5bJreQPIkdlGf2gQ A==; IronPort-SDR: 6a9vp7/BI+9KcClnbJJxRqP5/FElE4+wP9bgobu6tsz03Vy4TBaPRRjkit1Y0N6lfRBEHblHyb Kt/fBA1h7Cj1ihFk12rA8quB+4mPKXpMhQbppyX9jbF+txyDY053s6F8lSmk+RMDYxwWi4CUjL 8U/EfXkS9yAOTGyE/EXpAycuG8d+q0HNtpg3KgSgj8nxeHtOONDhKTLzqakuEETnFaukSjTUIK uccxuNDxAyu6NjNAw4LTBfABitYnlg/yxJGDUbr86+aDKCSxIklOFTRI2p9q+PuLNL+iAB/j3e T60= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834872" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:55:08 +0800 IronPort-SDR: xJUnKJTD1m3TY5prX22UmwUU1DEoz6Ej1o9rR52BFLRUM7bd7Xi/C0K8Ie7PMPF3USX0e8Ijqw Tsj3UDyi39qQ== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:28 -0700 IronPort-SDR: svveHy04ktn82T4Fyu7nMTHtLosHDKpw1gEJelE306g/O7Uj0le1VUxtG8TeXnsk+xZ1vdfnUP ns+tyXwm4EGQ== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:55:07 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 14/15] hw/block/nvme: Use zone metadata file for persistence Date: Sun, 13 Sep 2020 07:54:29 +0900 Message-Id: <20200912225430.1772-15-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" A ZNS drive that is emulated by this module is currently initialized with all zones Empty upon startup. However, actual ZNS SSDs save the state and condition of all zones in their internal NVRAM in the event of power loss. When such a drive is powered up again, it closes or finishes all zones that were open at the moment of shutdown. Besides that, the write pointer position as well as the state and condition of all zones is preserved across power-downs. This commit adds the capability to have a persistent zone metadata to the device. The new optional module property, "zone_file", is introduced. If added to the command line, this property specifies the name of the file that stores the zone metadata. If "zone_file" is omitted, the device will be initialized with all zones empty, the same as before. If zone metadata is configured to be persistent, then zone descriptor extensions also persist across controller shutdowns. Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 370 +++++++++++++++++++++++++++++++++++++++++++++--- hw/block/nvme.h | 37 +++++ 2 files changed, 386 insertions(+), 21 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index f0a03bea75..3e8e6e1472 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -111,6 +111,8 @@ static const uint32_t nvme_feature_cap[NVME_FID_MAX] = { }; static void nvme_process_sq(void *opaque); +static void nvme_sync_zone_file(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, int len); static uint16_t nvme_cid(NvmeRequest *req) { @@ -146,6 +148,7 @@ static void nvme_add_zone_tail(NvmeCtrl *n, NvmeNamespace *ns, NvmeZoneList *zl, zl->tail = idx; } zl->size++; + nvme_set_zone_meta_dirty(n, ns, true); } /* @@ -162,12 +165,15 @@ static void nvme_remove_zone(NvmeCtrl *n, NvmeNamespace *ns, NvmeZoneList *zl, if (zl->size == 0) { zl->head = NVME_ZONE_LIST_NIL; zl->tail = NVME_ZONE_LIST_NIL; + nvme_set_zone_meta_dirty(n, ns, true); } else if (idx == zl->head) { zl->head = zone->next; ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL; + nvme_set_zone_meta_dirty(n, ns, true); } else if (idx == zl->tail) { zl->tail = zone->prev; ns->zone_array[zl->tail].next = NVME_ZONE_LIST_NIL; + nvme_set_zone_meta_dirty(n, ns, true); } else { ns->zone_array[zone->next].prev = zone->prev; ns->zone_array[zone->prev].next = zone->next; @@ -194,6 +200,7 @@ static NvmeZone *nvme_remove_zone_head(NvmeCtrl *n, NvmeNamespace *ns, ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL; } zone->prev = zone->next = 0; + nvme_set_zone_meta_dirty(n, ns, true); } return zone; @@ -297,6 +304,7 @@ static void nvme_assign_zone_state(NvmeCtrl *n, NvmeNamespace *ns, case NVME_ZONE_STATE_READ_ONLY: zone->tstamp = 0; } + nvme_sync_zone_file(n, ns, zone, sizeof(NvmeZone)); } static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr) @@ -3357,9 +3365,114 @@ static const MemoryRegionOps nvme_cmb_ops = { }, }; -static int nvme_init_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, +static int nvme_validate_zone_file(NvmeCtrl *n, NvmeNamespace *ns, + uint64_t capacity) +{ + NvmeZoneMeta *meta = ns->zone_meta; + NvmeZone *zone = ns->zone_array; + uint64_t start = 0, zone_size = n->zone_size; + int i, n_imp_open = 0, n_exp_open = 0, n_closed = 0, n_full = 0; + + if (meta->magic != NVME_ZONE_META_MAGIC) { + return 1; + } + if (meta->version != NVME_ZONE_META_VER) { + return 2; + } + if (meta->zone_size != zone_size) { + return 3; + } + if (meta->zone_capacity != n->zone_capacity) { + return 4; + } + if (meta->nr_offline_zones != n->params.nr_offline_zones) { + return 5; + } + if (meta->nr_rdonly_zones != n->params.nr_rdonly_zones) { + return 6; + } + if (meta->lba_size != n->conf.logical_block_size) { + return 7; + } + if (meta->zd_extension_size != n->params.zd_extension_size) { + return 8; + } + + for (i = 0; i < n->num_zones; i++, zone++) { + if (start + zone_size > capacity) { + zone_size = capacity - start; + } + if (zone->d.zt != NVME_ZONE_TYPE_SEQ_WRITE) { + return 9; + } + if (zone->d.zcap != n->zone_capacity) { + return 10; + } + if (zone->d.zslba != start) { + return 11; + } + switch (nvme_get_zone_state(zone)) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_OFFLINE: + case NVME_ZONE_STATE_READ_ONLY: + if (zone->d.wp != start) { + return 12; + } + break; + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + if (zone->d.wp < start || + zone->d.wp >= zone->d.zslba + zone->d.zcap) { + return 13; + } + n_imp_open++; + break; + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + if (zone->d.wp < start || + zone->d.wp >= zone->d.zslba + zone->d.zcap) { + return 13; + } + n_exp_open++; + break; + case NVME_ZONE_STATE_CLOSED: + if (zone->d.wp < start || + zone->d.wp >= zone->d.zslba + zone->d.zcap) { + return 13; + } + n_closed++; + break; + case NVME_ZONE_STATE_FULL: + if (zone->d.wp != zone->d.zslba + zone->d.zcap) { + return 14; + } + n_full++; + break; + default: + return 15; + } + + start += zone_size; + } + + if (n_imp_open != nvme_zone_list_size(ns->exp_open_zones)) { + return 16; + } + if (n_exp_open != nvme_zone_list_size(ns->imp_open_zones)) { + return 17; + } + if (n_closed != nvme_zone_list_size(ns->closed_zones)) { + return 18; + } + if (n_full != nvme_zone_list_size(ns->full_zones)) { + return 19; + } + + return 0; +} + +static int nvme_init_zone_file(NvmeCtrl *n, NvmeNamespace *ns, uint64_t capacity) { + NvmeZoneMeta *meta = ns->zone_meta; NvmeZone *zone; Error *err; uint64_t start = 0, zone_size = n->zone_size; @@ -3367,18 +3480,33 @@ static int nvme_init_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, int i; uint16_t zs; - ns->zone_array = g_malloc0(n->zone_array_size); - ns->exp_open_zones = g_malloc0(sizeof(NvmeZoneList)); - ns->imp_open_zones = g_malloc0(sizeof(NvmeZoneList)); - ns->closed_zones = g_malloc0(sizeof(NvmeZoneList)); - ns->full_zones = g_malloc0(sizeof(NvmeZoneList)); - ns->zd_extensions = g_malloc0(n->params.zd_extension_size * n->num_zones); + if (n->params.zone_file) { + meta->magic = NVME_ZONE_META_MAGIC; + meta->version = NVME_ZONE_META_VER; + meta->zone_size = zone_size; + meta->zone_capacity = n->zone_capacity; + meta->lba_size = n->conf.logical_block_size; + meta->nr_offline_zones = n->params.nr_offline_zones; + meta->nr_rdonly_zones = n->params.nr_rdonly_zones; + meta->zd_extension_size = n->params.zd_extension_size; + } else { + ns->zone_array = g_malloc0(n->zone_array_size); + ns->exp_open_zones = g_malloc0(sizeof(NvmeZoneList)); + ns->imp_open_zones = g_malloc0(sizeof(NvmeZoneList)); + ns->closed_zones = g_malloc0(sizeof(NvmeZoneList)); + ns->full_zones = g_malloc0(sizeof(NvmeZoneList)); + ns->zd_extensions = + g_malloc0(n->params.zd_extension_size * n->num_zones); + } zone = ns->zone_array; nvme_init_zone_list(ns->exp_open_zones); nvme_init_zone_list(ns->imp_open_zones); nvme_init_zone_list(ns->closed_zones); nvme_init_zone_list(ns->full_zones); + if (n->params.zone_file) { + nvme_set_zone_meta_dirty(n, ns, true); + } for (i = 0; i < n->num_zones; i++, zone++) { if (start + zone_size > capacity) { @@ -3429,7 +3557,188 @@ static int nvme_init_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, return 0; } -static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) +static int nvme_open_zone_file(NvmeCtrl *n, bool *init_meta) +{ + struct stat statbuf; + size_t fsize; + int ret; + + ret = stat(n->params.zone_file, &statbuf); + if (ret && errno == ENOENT) { + *init_meta = true; + } else if (!S_ISREG(statbuf.st_mode)) { + fprintf(stderr, "%s is not a regular file\n", strerror(errno)); + return -1; + } + + n->zone_file_fd = open(n->params.zone_file, + O_RDWR | O_LARGEFILE | O_BINARY | O_CREAT, 644); + if (n->zone_file_fd < 0) { + fprintf(stderr, "failed to create zone file %s, err %s\n", + n->params.zone_file, strerror(errno)); + return -1; + } + + fsize = n->meta_size * n->num_namespaces; + + if (stat(n->params.zone_file, &statbuf)) { + fprintf(stderr, "can't stat zone file %s, err %s\n", + n->params.zone_file, strerror(errno)); + return -1; + } + if (statbuf.st_size != fsize) { + ret = ftruncate(n->zone_file_fd, fsize); + if (ret < 0) { + fprintf(stderr, "can't truncate zone file %s, err %s\n", + n->params.zone_file, strerror(errno)); + return -1; + } + *init_meta = true; + } + + return 0; +} + +static int nvme_map_zone_file(NvmeCtrl *n, NvmeNamespace *ns, bool *init_meta) +{ + off_t meta_ofs = n->meta_size * (ns->nsid - 1); + + ns->zone_meta = mmap(0, n->meta_size, PROT_READ | PROT_WRITE, + MAP_SHARED, n->zone_file_fd, meta_ofs); + if (ns->zone_meta == MAP_FAILED) { + fprintf(stderr, "failed to map zone file %s, ofs %lu, err %s\n", + n->params.zone_file, meta_ofs, strerror(errno)); + return -1; + } + + ns->zone_array = (NvmeZone *)(ns->zone_meta + 1); + ns->exp_open_zones = &ns->zone_meta->exp_open_zones; + ns->imp_open_zones = &ns->zone_meta->imp_open_zones; + ns->closed_zones = &ns->zone_meta->closed_zones; + ns->full_zones = &ns->zone_meta->full_zones; + + if (n->params.zd_extension_size) { + ns->zd_extensions = (uint8_t *)(ns->zone_meta + 1); + ns->zd_extensions += n->zone_array_size; + } + + return 0; +} + +static void nvme_sync_zone_file(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, int len) +{ + uintptr_t addr, zd = (uintptr_t)zone; + + addr = zd & qemu_real_host_page_mask; + len += zd - addr; + if (msync((void *)addr, len, MS_ASYNC) < 0) + fprintf(stderr, "msync: failed to sync zone descriptors, file %s\n", + strerror(errno)); + + if (nvme_zone_meta_dirty(n, ns)) { + nvme_set_zone_meta_dirty(n, ns, false); + if (msync(ns->zone_meta, sizeof(NvmeZoneMeta), MS_ASYNC) < 0) + fprintf(stderr, "msync: failed to sync zone meta, file %s\n", + strerror(errno)); + } +} + +/* + * Close or finish all the zones that might be still open after power-down. + */ +static void nvme_prepare_zones(NvmeCtrl *n, NvmeNamespace *ns) +{ + NvmeZone *zone; + uint32_t set_state; + int i; + + assert(!ns->nr_active_zones); + assert(!ns->nr_open_zones); + + zone = ns->zone_array; + for (i = 0; i < n->num_zones; i++, zone++) { + zone->tstamp = 0; + + switch (nvme_get_zone_state(zone)) { + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + break; + case NVME_ZONE_STATE_CLOSED: + nvme_aor_inc_active(n, ns); + /* pass through */ + default: + continue; + } + + if (zone->d.za & NVME_ZA_ZD_EXT_VALID) { + set_state = NVME_ZONE_STATE_CLOSED; + } else if (zone->d.wp == zone->d.zslba) { + set_state = NVME_ZONE_STATE_EMPTY; + } else if (n->params.max_active_zones == 0 || + ns->nr_active_zones < n->params.max_active_zones) { + set_state = NVME_ZONE_STATE_CLOSED; + } else { + set_state = NVME_ZONE_STATE_FULL; + } + + switch (set_state) { + case NVME_ZONE_STATE_CLOSED: + trace_pci_nvme_power_on_close(nvme_get_zone_state(zone), + zone->d.zslba); + nvme_aor_inc_active(n, ns); + nvme_add_zone_tail(n, ns, ns->closed_zones, zone); + break; + case NVME_ZONE_STATE_EMPTY: + trace_pci_nvme_power_on_reset(nvme_get_zone_state(zone), + zone->d.zslba); + break; + case NVME_ZONE_STATE_FULL: + trace_pci_nvme_power_on_full(nvme_get_zone_state(zone), + zone->d.zslba); + zone->d.wp = nvme_zone_wr_boundary(zone); + } + + nvme_set_zone_state(zone, set_state); + } +} + +static int nvme_load_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, + uint64_t capacity, bool init_meta) +{ + int ret = 0; + + if (n->params.zone_file) { + ret = nvme_map_zone_file(n, ns, &init_meta); + trace_pci_nvme_mapped_zone_file(n->params.zone_file, ret); + if (ret < 0) { + return ret; + } + + if (!init_meta) { + ret = nvme_validate_zone_file(n, ns, capacity); + if (ret) { + trace_pci_nvme_err_zone_file_invalid(ret); + init_meta = true; + } + } + } else { + init_meta = true; + } + + if (init_meta) { + ret = nvme_init_zone_file(n, ns, capacity); + } else { + nvme_prepare_zones(n, ns); + } + if (!ret && n->params.zone_file) { + nvme_sync_zone_file(n, ns, ns->zone_array, n->zone_array_size); + } + + return ret; +} + +static void nvme_zoned_init_ctrl(NvmeCtrl *n, bool *init_meta, Error **errp) { uint64_t zone_size, zone_cap; uint32_t nz; @@ -3456,6 +3765,9 @@ static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) n->zone_array_size = sizeof(NvmeZone) * nz; n->zone_size_log2 = is_power_of_2(n->zone_size) ? nvme_ilog2(n->zone_size) : 0; + n->meta_size = sizeof(NvmeZoneMeta) + n->zone_array_size + + nz * n->params.zd_extension_size; + n->meta_size = ROUND_UP(n->meta_size, qemu_real_host_page_size); if (!n->params.zasl_kb) { n->zasl_bs = n->params.mdts ? 0 : NVME_DEFAULT_MAX_ZA_SIZE * KiB; @@ -3496,17 +3808,25 @@ static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) } } + if (n->params.zone_file) { + if (nvme_open_zone_file(n, init_meta) < 0) { + error_setg(errp, "cannot open zone metadata file"); + return; + } + } + return; } static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, - Error **errp) + bool init_meta, Error **errp) { int ret; - ret = nvme_init_zone_meta(n, ns, n->num_zones * n->zone_size); + ret = nvme_load_zone_meta(n, ns, n->num_zones * n->zone_size, + init_meta); if (ret) { - error_setg(errp, "could not init zone metadata"); + error_setg(errp, "could not load/init zone metadata"); return -1; } @@ -3535,15 +3855,20 @@ static void nvme_zoned_clear(NvmeCtrl *n) { int i; + if (n->params.zone_file) { + close(n->zone_file_fd); + } for (i = 0; i < n->num_namespaces; i++) { NvmeNamespace *ns = &n->namespaces[i]; g_free(ns->id_ns_zoned); - g_free(ns->zone_array); - g_free(ns->exp_open_zones); - g_free(ns->imp_open_zones); - g_free(ns->closed_zones); - g_free(ns->full_zones); - g_free(ns->zd_extensions); + if (!n->params.zone_file) { + g_free(ns->zone_array); + g_free(ns->exp_open_zones); + g_free(ns->imp_open_zones); + g_free(ns->closed_zones); + g_free(ns->full_zones); + g_free(ns->zd_extensions); + } } } @@ -3632,7 +3957,8 @@ static void nvme_init_blk(NvmeCtrl *n, Error **errp) n->ns_size = bs_size; } -static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) +static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, bool init_meta, + Error **errp) { NvmeIdNs *id_ns = &ns->id_ns; int lba_index; @@ -3646,7 +3972,7 @@ static void nvme_init_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) if (n->params.zoned) { ns->csi = NVME_CSI_ZONED; id_ns->ncap = cpu_to_le64(n->zone_capacity * n->num_zones); - if (nvme_zoned_init_ns(n, ns, lba_index, errp) != 0) { + if (nvme_zoned_init_ns(n, ns, lba_index, init_meta, errp) != 0) { return; } } else { @@ -3830,6 +4156,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) NvmeCtrl *n = NVME(pci_dev); NvmeNamespace *ns; Error *local_err = NULL; + bool init_meta = false; int i; @@ -3853,7 +4180,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) } if (n->params.zoned) { - nvme_zoned_init_ctrl(n, &local_err); + nvme_zoned_init_ctrl(n, &init_meta, &local_err); if (local_err) { error_propagate(errp, local_err); return; @@ -3864,7 +4191,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) ns = n->namespaces; for (i = 0; i < n->num_namespaces; i++, ns++) { ns->nsid = i + 1; - nvme_init_namespace(n, ns, &local_err); + nvme_init_namespace(n, ns, init_meta, &local_err); if (local_err) { error_propagate(errp, local_err); return; @@ -3912,6 +4239,7 @@ static Property nvme_props[] = { NVME_DEFAULT_ZONE_SIZE), DEFINE_PROP_UINT64("zone_capacity", NvmeCtrl, params.zone_capacity_mb, 0), DEFINE_PROP_UINT32("zone_append_size_limit", NvmeCtrl, params.zasl_kb, 0), + DEFINE_PROP_STRING("zone_file", NvmeCtrl, params.zone_file), DEFINE_PROP_UINT32("zone_descr_ext_size", NvmeCtrl, params.zd_extension_size, 0), DEFINE_PROP_UINT32("max_active", NvmeCtrl, params.max_active_zones, 0), diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 9a5f4787b7..c46e31dcfe 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -18,6 +18,7 @@ typedef struct NvmeParams { bool zoned; bool cross_zone_read; + char *zone_file; uint8_t fill_pattern; uint32_t zasl_kb; uint64_t zone_size_mb; @@ -95,6 +96,27 @@ typedef struct NvmeZoneList { uint8_t rsvd12[4]; } NvmeZoneList; +#define NVME_ZONE_META_MAGIC 0x3aebaa70 +#define NVME_ZONE_META_VER 1 + +typedef struct NvmeZoneMeta { + uint32_t magic; + uint32_t version; + uint64_t zone_size; + uint64_t zone_capacity; + uint32_t nr_offline_zones; + uint32_t nr_rdonly_zones; + uint32_t lba_size; + uint32_t rsvd40; + NvmeZoneList exp_open_zones; + NvmeZoneList imp_open_zones; + NvmeZoneList closed_zones; + NvmeZoneList full_zones; + uint8_t zd_extension_size; + uint8_t dirty; + uint8_t rsvd594[3990]; +} NvmeZoneMeta; + typedef struct NvmeNamespace { NvmeIdNs id_ns; uint32_t nsid; @@ -104,6 +126,7 @@ typedef struct NvmeNamespace { NvmeIdNsZoned *id_ns_zoned; NvmeZone *zone_array; + NvmeZoneMeta *zone_meta; NvmeZoneList *exp_open_zones; NvmeZoneList *imp_open_zones; NvmeZoneList *closed_zones; @@ -171,6 +194,7 @@ typedef struct NvmeCtrl { int zone_file_fd; uint32_t num_zones; + size_t meta_size; uint64_t zone_size; uint64_t zone_capacity; uint64_t zone_array_size; @@ -279,6 +303,19 @@ static inline NvmeZone *nvme_next_zone_in_list(NvmeNamespace *ns, NvmeZone *z, return &ns->zone_array[z->next]; } +static inline bool nvme_zone_meta_dirty(NvmeCtrl *n, NvmeNamespace *ns) +{ + return n->params.zone_file ? ns->zone_meta->dirty : false; +} + +static inline void nvme_set_zone_meta_dirty(NvmeCtrl *n, NvmeNamespace *ns, + bool yesno) +{ + if (n->params.zone_file) { + ns->zone_meta->dirty = yesno; + } +} + static inline int nvme_ilog2(uint64_t i) { int log = -1; From patchwork Sat Sep 12 22:54:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 11772145 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 502A1746 for ; Sat, 12 Sep 2020 23:00:50 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F11832064E for ; Sat, 12 Sep 2020 23:00:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="mnkPSHnV" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F11832064E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:41760 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kHEVk-0007Up-RG for patchwork-qemu-devel@patchwork.kernel.org; Sat, 12 Sep 2020 19:00:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49086) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQe-0004z1-0k; Sat, 12 Sep 2020 18:55:32 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:26909) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kHEQb-0005ef-TG; Sat, 12 Sep 2020 18:55:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1599951329; x=1631487329; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=e++3QhnRNffa0GIl2+VYyMaBj8dBPJeZ5MT33Nj7yxg=; b=mnkPSHnVVW1280DGufkc31jKUqAT+I3M3uR584cb0nfwB+6SOKwpyUZm OdqPd8Nyaw8vACdtpXbXAofRIMPBcdKpfT1YLnXN5sbymbLRHTtZoxaHm WiN11loFnWJHlYxXpOCq5ezLFmv+icb93vevsBBnFxy3IzTyuD0eOYw0j 5FHx40S8uuII3PM6csKGbJDTYhFYd1olxNTAGbS9UQDIfReSpNbTHvf2S z/Wly94yn7jaXEb+Vrb5UWVjP0iiPS8lrLL65/xRLfRpsmdTTHPY7gmUA psV/YxghqgvRr8WCjFWa6EjC4JQnxQhwdRxfI+/H80Vgj7x4iNWsCxLNj g==; IronPort-SDR: cwAhIacimDfCBJH1RMJcpyJxBewmkJeLjC2G94M/NP3+wgXvZzGeTVZ1maw+6G+xwJM3kaxDCi Jg0nPbAVlM2VF37M2HykNNJjptEMqal9Q2Qw+wgKPrwTBfrZOs0CfCow0p9pm53/jz8l09JTbo /TYf5Y6GKG3DHwBxs7AjXMHjEBnaRSuTf4YRJLRHzsU3fcpRGNoZ3U7HG8HkKKm7bibJwF3ZsZ QK5LUzwycXWELsf/qH5aueycOmd/ecKXWH9x2f+HH+wN3aErn55nAl0/w2pjJABP2SklJFw1FI A5E= X-IronPort-AV: E=Sophos;i="5.76,420,1592841600"; d="scan'208";a="256834874" Received: from h199-255-45-15.hgst.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 13 Sep 2020 06:55:10 +0800 IronPort-SDR: sbwVKPkqKFPXLH4jqwF+0+wIj69Gbxle9GFt0H+ErYGjq/ux1WD3PG1wfiOdCkEGPHxZuLnjvf 7X2vQrbRa5zw== Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2020 15:41:30 -0700 IronPort-SDR: 5bPfhaD4gxFT93nNw2ychB66JM48o8fzAT0KxrGRMAcIsRmO6MncNYC2NtiZUPMaHQPYiUjtKw 1RRCddvonNiA== WDCIronportException: Internal Received: from unknown (HELO redsun50.ssa.fujisawa.hgst.com) ([10.149.66.24]) by uls-op-cesaip02.wdc.com with ESMTP; 12 Sep 2020 15:55:09 -0700 From: Dmitry Fomichev To: Keith Busch , Klaus Jensen , Kevin Wolf , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , Maxim Levitsky , Fam Zheng Subject: [PATCH v2 15/15] hw/block/nvme: Document zoned parameters in usage text Date: Sun, 13 Sep 2020 07:54:30 +0900 Message-Id: <20200912225430.1772-16-dmitry.fomichev@wdc.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200912225430.1772-1-dmitry.fomichev@wdc.com> References: <20200912225430.1772-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=517336518=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/12 18:54:38 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Added brief descriptions of the new device properties that are now available to users to configure features of Zoned Namespace Command Set in the emulator. This patch is for documentation only, no functionality change. Signed-off-by: Dmitry Fomichev --- hw/block/nvme.c | 43 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 41 insertions(+), 2 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 3e8e6e1472..9b1d80a204 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -9,7 +9,7 @@ */ /** - * Reference Specs: http://www.nvmexpress.org, 1.2, 1.1, 1.0e + * Reference Specs: http://www.nvmexpress.org, 1.4, 1.3, 1.2, 1.1, 1.0e * * https://nvmexpress.org/developers/nvme-specification/ */ @@ -22,7 +22,7 @@ * [pmrdev=,] \ * max_ioqpairs=, \ * aerl=, aer_max_queued=, \ - * mdts= + * mdts=, zoned= * * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at * offset 0 in BAR2 and supports only WDS, RDS and SQS for now. @@ -48,6 +48,45 @@ * completion when there are no oustanding AERs. When the maximum number of * enqueued events are reached, subsequent events will be dropped. * + * Setting `zoned` to true makes the device to support zoned namespaces. + * In this case, of the following options are available to configure zoned + * operation: + * zone_size= + * + * zone_capacity= + * The value 0 (default) forces zone capacity to be the same as zone + * size. The value of this property may not exceed zone size. + * + * zone_file= + * Zone metadata file, if specified, allows zone information + * to be persistent across shutdowns and restarts. + * + * zone_descr_ext_size= + * This value needs to be specified in 64B units. If it is zero, + * namespace(s) will not support zone descriptor extensions. + * + * max_active= + * + * max_open= + * + * zone_append_size_limit= + * The maximum I/O size that can be supported by Zone Append + * command. Since internally this this value is maintained as + * ZASL = log2( / ), some + * values assigned to this property may be rounded down and + * result in a lower maximum ZA data size being in effect. + * If MDTS property is not assigned, the default value of 128KiB is + * used as ZASL. + * + * offline_zones= + * + * rdonly_zones= + * + * cross_zone_read= + * + * fill_pattern= + * The byte pattern to return for any portions of unwritten data + * during read. */ #include "qemu/osdep.h"