From patchwork Fri Jul 22 08:28:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yuji Ishikawa X-Patchwork-Id: 12926123 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6AB12CCA47F for ; Fri, 22 Jul 2022 08:34:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235012AbiGVIew (ORCPT ); Fri, 22 Jul 2022 04:34:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59700 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235011AbiGVIev (ORCPT ); Fri, 22 Jul 2022 04:34:51 -0400 Received: from mo-csw.securemx.jp (mo-csw1114.securemx.jp [210.130.202.156]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 99DC39EC42; Fri, 22 Jul 2022 01:34:50 -0700 (PDT) Received: by mo-csw.securemx.jp (mx-mo-csw1114) id 26M8YHbO004356; Fri, 22 Jul 2022 17:34:17 +0900 X-Iguazu-Qid: 2wHHCGtN6yCYJPGagK X-Iguazu-QSIG: v=2; s=0; t=1658478857; q=2wHHCGtN6yCYJPGagK; m=nVA3CjmG/8lyNcuw0V5v1MshkB+vWVTzQZb1hzhV+1o= Received: from imx12-a.toshiba.co.jp ([38.106.60.135]) by relay.securemx.jp (mx-mr1112) id 26M8YFCv023985 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=NOT); Fri, 22 Jul 2022 17:34:15 +0900 X-SA-MID: 39258142 From: Yuji Ishikawa To: Rob Herring , Hans Verkuil , Nobuhiro Iwamatsu , Jonathan Corbet , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, yuji2.ishikawa@toshiba.co.jp Subject: [PATCH v2 1/5] dt-bindings: soc: visconti: Add Toshiba Visconti DNN image processing accelerator bindings Date: Fri, 22 Jul 2022 17:28:54 +0900 X-TSB-HOP2: ON Message-Id: <20220722082858.17880-2-yuji2.ishikawa@toshiba.co.jp> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> References: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org This commit adds the Device Tree binding documentation that allows to describe the DNN image processing accelerator found in Toshiba Visconti SoCs. Signed-off-by: Yuji Ishikawa Reviewed-by: Nobuhiro Iwamatsu --- .../soc/visconti/toshiba,visconti-dnn.yaml | 54 +++++++++++++++++++ 1 file changed, 54 insertions(+) create mode 100644 Documentation/devicetree/bindings/soc/visconti/toshiba,visconti-dnn.yaml diff --git a/Documentation/devicetree/bindings/soc/visconti/toshiba,visconti-dnn.yaml b/Documentation/devicetree/bindings/soc/visconti/toshiba,visconti-dnn.yaml new file mode 100644 index 000000000..28576a55e --- /dev/null +++ b/Documentation/devicetree/bindings/soc/visconti/toshiba,visconti-dnn.yaml @@ -0,0 +1,54 @@ +# SPDX-LIcense-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/soc/visconti/toshiba,visconti-dnn.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Toshiba Visconti DNN image processing accelerator + +maintainers: + - Nobuhiro Iwamatsu + +description: | + Toshiba Visconti DNN image processing accelerator. + Visconti5 have up to 2 DNN units. + +properties: + compatible: + items: + - const: toshiba,visconti-dnn + + reg: + maxItems: 1 + + interrupts: + maxItems: 1 + + index: + enum: [0, 1] + +required: + - compatible + - reg + - interrupts + - index + +additionalProperties: false + +examples: + - | + #include + #include + + soc { + #address-cells = <2>; + #size-cells = <2>; + + dnn0: dnn@14280000 { + compatible = "toshiba,visconti-dnn"; + reg = <0 0x14280000 0 0x8000>; + interrupts = ; + index = <0>; + status = "disabled"; + }; + }; From patchwork Fri Jul 22 08:28:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yuji Ishikawa X-Patchwork-Id: 12926124 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0554C43334 for ; Fri, 22 Jul 2022 08:34:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235031AbiGVIe5 (ORCPT ); Fri, 22 Jul 2022 04:34:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59746 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235016AbiGVIex (ORCPT ); Fri, 22 Jul 2022 04:34:53 -0400 Received: from mo-csw.securemx.jp (mo-csw1116.securemx.jp [210.130.202.158]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C301E9EC42; Fri, 22 Jul 2022 01:34:51 -0700 (PDT) Received: by mo-csw.securemx.jp (mx-mo-csw1116) id 26M8YIVg010128; Fri, 22 Jul 2022 17:34:18 +0900 X-Iguazu-Qid: 2wHHacZk91iSaUMTvF X-Iguazu-QSIG: v=2; s=0; t=1658478858; q=2wHHacZk91iSaUMTvF; m=eKV1ntskB+2zMVwi0kIqn8R6s1J56TBjYPkFOJYTyXk= Received: from imx2-a.toshiba.co.jp (imx2-a.toshiba.co.jp [106.186.93.35]) by relay.securemx.jp (mx-mr1113) id 26M8YGEu012869 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=NOT); Fri, 22 Jul 2022 17:34:16 +0900 X-SA-MID: 42481707 From: Yuji Ishikawa To: Rob Herring , Hans Verkuil , Nobuhiro Iwamatsu , Jonathan Corbet , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, yuji2.ishikawa@toshiba.co.jp Subject: [PATCH v2 2/5] soc: visconti: Add Toshiba Visconti image processing accelerator common source Date: Fri, 22 Jul 2022 17:28:55 +0900 X-TSB-HOP2: ON Message-Id: <20220722082858.17880-3-yuji2.ishikawa@toshiba.co.jp> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> References: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org This commit adds common definitions shared among image processing accelerator drivers for Toshiba Visconti SoCs. Signed-off-by: Yuji Ishikawa Reviewed-by: Nobuhiro Iwamatsu --- v1 -> v2: - applied checkpatch.pl --strict --- drivers/soc/Kconfig | 1 + drivers/soc/Makefile | 1 + drivers/soc/visconti/Kconfig | 1 + drivers/soc/visconti/Makefile | 6 +++ drivers/soc/visconti/ipa_common.c | 55 +++++++++++++++++++ drivers/soc/visconti/ipa_common.h | 18 +++++++ drivers/soc/visconti/uapi/ipa.h | 90 +++++++++++++++++++++++++++++++ 7 files changed, 172 insertions(+) create mode 100644 drivers/soc/visconti/Kconfig create mode 100644 drivers/soc/visconti/Makefile create mode 100644 drivers/soc/visconti/ipa_common.c create mode 100644 drivers/soc/visconti/ipa_common.h create mode 100644 drivers/soc/visconti/uapi/ipa.h diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig index e8a30c4c5..c99139aa8 100644 --- a/drivers/soc/Kconfig +++ b/drivers/soc/Kconfig @@ -22,6 +22,7 @@ source "drivers/soc/tegra/Kconfig" source "drivers/soc/ti/Kconfig" source "drivers/soc/ux500/Kconfig" source "drivers/soc/versatile/Kconfig" +source "drivers/soc/visconti/Kconfig" source "drivers/soc/xilinx/Kconfig" endmenu diff --git a/drivers/soc/Makefile b/drivers/soc/Makefile index a05e9fbcd..455b993c2 100644 --- a/drivers/soc/Makefile +++ b/drivers/soc/Makefile @@ -28,4 +28,5 @@ obj-$(CONFIG_ARCH_TEGRA) += tegra/ obj-y += ti/ obj-$(CONFIG_ARCH_U8500) += ux500/ obj-$(CONFIG_PLAT_VERSATILE) += versatile/ +obj-$(CONFIG_ARCH_VISCONTI) += visconti/ obj-y += xilinx/ diff --git a/drivers/soc/visconti/Kconfig b/drivers/soc/visconti/Kconfig new file mode 100644 index 000000000..8b1378917 --- /dev/null +++ b/drivers/soc/visconti/Kconfig @@ -0,0 +1 @@ + diff --git a/drivers/soc/visconti/Makefile b/drivers/soc/visconti/Makefile new file mode 100644 index 000000000..8d710da08 --- /dev/null +++ b/drivers/soc/visconti/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the Visconti specific device drivers. +# + +obj-y += ipa_common.o diff --git a/drivers/soc/visconti/ipa_common.c b/drivers/soc/visconti/ipa_common.c new file mode 100644 index 000000000..6345f33c5 --- /dev/null +++ b/drivers/soc/visconti/ipa_common.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +/* Toshiba Visconti Image Processing Accelerator Support + * + * (C) Copyright 2022 TOSHIBA CORPORATION + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation + */ + +#include "ipa_common.h" + +int ipa_attach_dmabuf(struct device *dev, int fd, struct dma_buf_attachment **a, + struct sg_table **s, dma_addr_t *addr, enum dma_data_direction dma_dir) +{ + struct dma_buf_attachment *attachment; + struct dma_buf *dmabuf; + struct sg_table *sgt; + int ret; + + dmabuf = dma_buf_get(fd); + if (IS_ERR(dmabuf)) { + dev_err(dev, "Invalid dmabuf FD\n"); + return PTR_ERR(dmabuf); + } + attachment = dma_buf_attach(dmabuf, dev); + + if (IS_ERR(attachment)) { + dev_err(dev, "Failed to attach dmabuf\n"); + ret = PTR_ERR(attachment); + goto err_put; + } + sgt = dma_buf_map_attachment(attachment, dma_dir); + if (IS_ERR(sgt)) { + dev_err(dev, "Failed to get dmabufs sg_table\n"); + ret = PTR_ERR(sgt); + goto err_detach; + } + if (sgt->nents != 1) { + dev_err(dev, "Sparse DMA region is unsupported\n"); + ret = -EINVAL; + goto err_unmap; + } + + *addr = sg_dma_address(sgt->sgl); + *a = attachment; + *s = sgt; + + return 0; + +err_unmap: + dma_buf_unmap_attachment(attachment, sgt, dma_dir); +err_detach: + dma_buf_detach(dmabuf, attachment); +err_put: + dma_buf_put(dmabuf); + return ret; +} diff --git a/drivers/soc/visconti/ipa_common.h b/drivers/soc/visconti/ipa_common.h new file mode 100644 index 000000000..786988c52 --- /dev/null +++ b/drivers/soc/visconti/ipa_common.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* Toshiba Visconti Image Processing Accelerator Support + * + * (C) Copyright 2022 TOSHIBA CORPORATION + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation + */ + +#include +#include + +#define COHERENT_ADDRESS_BIT (0x400000000ULL) +#define IPA_POLL_EVENT_NONE (0) +#define IPA_POLL_EVENT_DONE (1) +#define IPA_POLL_EVENT_ERROR (2) +#define IPA_WAKEUP_RETRY_DELAY (300 * 1000) /*usec*/ + +int ipa_attach_dmabuf(struct device *dev, int fd, struct dma_buf_attachment **a, + struct sg_table **s, dma_addr_t *addr, enum dma_data_direction dma_dir); diff --git a/drivers/soc/visconti/uapi/ipa.h b/drivers/soc/visconti/uapi/ipa.h new file mode 100644 index 000000000..b13bcf9c5 --- /dev/null +++ b/drivers/soc/visconti/uapi/ipa.h @@ -0,0 +1,90 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* Toshiba Visconti Image Processing Accelerator Support + * + * (C) Copyright 2022 TOSHIBA CORPORATION + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation + */ + +#ifndef _UAPI_LINUX_IPA_H +#define _UAPI_LINUX_IPA_H + +#include + +/** + * enum drv_ipa_state - state of Visconti IPA driver + * + * @DRV_IPA_STATE_IDLE: driver is idle + * @DRV_IPA_STATE_BUSY: device is busy + */ +enum drv_ipa_state { DRV_IPA_STATE_IDLE = 0, DRV_IPA_STATE_BUSY }; + +/** + * enum drv_ipa_buffer_direction - usage of buffer + * + * @DRV_IPA_DIR_BIDIRECTION: cpu writes/reads data + * @DRV_IPA_DIR_TO_DEVICE: cpu writes data + * @DRV_IPA_DIR_FROM_DEVICE: cpu reads data + * @DRV_IPA_DIR_NONE: cpu not access or non-coherent + */ +enum drv_ipa_buffer_direction { + DRV_IPA_DIR_BIDIRECTION = 0, + DRV_IPA_DIR_TO_DEVICE, + DRV_IPA_DIR_FROM_DEVICE, + DRV_IPA_DIR_NONE +}; + +/** + * struct drv_ipa_buffer_info - buffer information for Visconti IPA drivers + * + * @fd: file descriptor of Ion allocated heap + * @coherent: allocated via coherent bus or not + * @direction: buffer direction (to/from device) + * + * note: When the Ion heap is allocated wit ION_FLAG_CACHED, + * &struct drv_ipa_buffer_info.coherent should be true. + * Else, it should be false. + */ +struct drv_ipa_buffer_info { + __u32 fd; + bool coherent; + enum drv_ipa_buffer_direction direction; +}; + +/** + * struct drv_ipa_addr - address structure for Visconti IPA drivers + * + * @buffer_index: array index of &struct drv_ipa_buffer_info + * @offset: offset from the top of Ion heap + */ +struct drv_ipa_addr { + __u32 buffer_index : 4; + __u32 offset : 28; +}; + +#define IPA_BUFFER_INDEX_MAX (16) +#define IPA_INVALID_ADDR \ + { \ + .buffer_index = 0xfu, .offset = 0xfffffffu \ + } + +static inline struct drv_ipa_addr drv_ipa_invalid_addr(void) +{ + struct drv_ipa_addr invalid_addr = IPA_INVALID_ADDR; + return invalid_addr; +} + +static inline bool drv_ipa_is_invalid_addr(struct drv_ipa_addr addr) +{ + struct drv_ipa_addr invalid_addr = IPA_INVALID_ADDR; + + if (addr.buffer_index == invalid_addr.buffer_index && addr.offset == invalid_addr.offset) { + return true; + } + return false; +} + +#define IOC_IPA_MAGIC 'I' +#define IOC_IPA_START _IO(IOC_IPA_MAGIC, 0) +#define IOC_IPA_GET_STATUS _IO(IOC_IPA_MAGIC, 1) + +#endif /* _UAPI_LINUX_IPA_H */ From patchwork Fri Jul 22 08:28:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yuji Ishikawa X-Patchwork-Id: 12926127 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1AC76C43334 for ; Fri, 22 Jul 2022 08:35:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232987AbiGVIe7 (ORCPT ); Fri, 22 Jul 2022 04:34:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59812 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235022AbiGVIe4 (ORCPT ); Fri, 22 Jul 2022 04:34:56 -0400 Received: from mo-csw.securemx.jp (mo-csw1514.securemx.jp [210.130.202.153]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 159859EC4B; Fri, 22 Jul 2022 01:34:52 -0700 (PDT) Received: by mo-csw.securemx.jp (mx-mo-csw1514) id 26M8YNxK031959; Fri, 22 Jul 2022 17:34:23 +0900 X-Iguazu-Qid: 34ts2UY8vbB4rqSozt X-Iguazu-QSIG: v=2; s=0; t=1658478862; q=34ts2UY8vbB4rqSozt; m=id5QLdCTvLnbRhef7WhRYzMStjL39kbIt3ASO/UDAwA= Received: from imx2-a.toshiba.co.jp (imx2-a.toshiba.co.jp [106.186.93.35]) by relay.securemx.jp (mx-mr1513) id 26M8YL77009461 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=NOT); Fri, 22 Jul 2022 17:34:21 +0900 X-SA-MID: 42481710 From: Yuji Ishikawa To: Rob Herring , Hans Verkuil , Nobuhiro Iwamatsu , Jonathan Corbet , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, yuji2.ishikawa@toshiba.co.jp Subject: [PATCH v2 3/5] soc: visconti: Add Toshiba Visconti DNN image processing accelerator Date: Fri, 22 Jul 2022 17:28:56 +0900 X-TSB-HOP2: ON Message-Id: <20220722082858.17880-4-yuji2.ishikawa@toshiba.co.jp> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> References: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org Add support to DNN image processing accelerator on Toshiba Visconti ARM SoCs. The accelerator is applicable to DNN inference tasks. Signed-off-by: Yuji Ishikawa Reviewed-by: Nobuhiro Iwamatsu --- v1 -> v2: - applied checkpatch.pl --strict - removed unused code --- drivers/soc/visconti/Kconfig | 6 + drivers/soc/visconti/Makefile | 2 + drivers/soc/visconti/dnn/Makefile | 6 + drivers/soc/visconti/dnn/dnn.c | 523 +++++++++++++++++++++++++ drivers/soc/visconti/dnn/hwd_dnn.c | 183 +++++++++ drivers/soc/visconti/dnn/hwd_dnn.h | 68 ++++ drivers/soc/visconti/dnn/hwd_dnn_reg.h | 228 +++++++++++ drivers/soc/visconti/uapi/dnn.h | 77 ++++ 8 files changed, 1093 insertions(+) create mode 100644 drivers/soc/visconti/dnn/Makefile create mode 100644 drivers/soc/visconti/dnn/dnn.c create mode 100644 drivers/soc/visconti/dnn/hwd_dnn.c create mode 100644 drivers/soc/visconti/dnn/hwd_dnn.h create mode 100644 drivers/soc/visconti/dnn/hwd_dnn_reg.h create mode 100644 drivers/soc/visconti/uapi/dnn.h diff --git a/drivers/soc/visconti/Kconfig b/drivers/soc/visconti/Kconfig index 8b1378917..a25287d0c 100644 --- a/drivers/soc/visconti/Kconfig +++ b/drivers/soc/visconti/Kconfig @@ -1 +1,7 @@ +if ARCH_VISCONTI + +config VISCONTI_DNN + bool "Visconti DNN driver" + +endif diff --git a/drivers/soc/visconti/Makefile b/drivers/soc/visconti/Makefile index 8d710da08..b9bd0f7e2 100644 --- a/drivers/soc/visconti/Makefile +++ b/drivers/soc/visconti/Makefile @@ -4,3 +4,5 @@ # obj-y += ipa_common.o + +obj-$(CONFIG_VISCONTI_DNN) += dnn/ diff --git a/drivers/soc/visconti/dnn/Makefile b/drivers/soc/visconti/dnn/Makefile new file mode 100644 index 000000000..52d57b60d --- /dev/null +++ b/drivers/soc/visconti/dnn/Makefile @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the Visconti DNN driver +# + +obj-y += dnn.o hwd_dnn.o diff --git a/drivers/soc/visconti/dnn/dnn.c b/drivers/soc/visconti/dnn/dnn.c new file mode 100644 index 000000000..08dcc4e77 --- /dev/null +++ b/drivers/soc/visconti/dnn/dnn.c @@ -0,0 +1,523 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +/* Toshiba Visconti DNN Accelerator Support + * + * (C) Copyright 2022 TOSHIBA CORPORATION + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../ipa_common.h" +#include "../uapi/dnn.h" +#include "hwd_dnn.h" + +struct dnn_priv { + struct device *dev; + struct miscdevice miscdev; + struct mutex lock; + void __iomem *regs; + int irq; + wait_queue_head_t waitq; + enum drv_ipa_state status; + unsigned int poll_event; + int id; + char name[16]; + struct hwd_dnn_status hwd_status; + + u32 *list_vaddr[DRV_DNN_BASE_ADDR_NUM]; + dma_addr_t list_paddr[DRV_DNN_BASE_ADDR_NUM]; + unsigned int err_flags[HWD_DNN_EER_FLAG_NUM]; + + struct dma_buf_attachment *dba[DRV_DNN_BUFFER_INDEX_MAX]; + struct sg_table *sgt[DRV_DNN_BUFFER_INDEX_MAX]; + enum dma_data_direction dma_dir[DRV_DNN_BUFFER_INDEX_MAX]; + unsigned int dma_count; + + dma_addr_t buffer_iova[DRV_DNN_BUFFER_INDEX_MAX]; + + struct drv_ipa_addr temp_list[HWD_DNN_LIST_NUM_MAX]; +}; + +static uint32_t dnn_ipa_addr_to_iova(struct dnn_priv *priv, struct drv_ipa_addr addr) +{ + u32 iova = 0; + + if (addr.buffer_index < priv->dma_count && + addr.offset < priv->dba[addr.buffer_index]->dmabuf->size) + iova = priv->buffer_iova[addr.buffer_index] + addr.offset; + return iova; +} + +static int dnn_attach_dma_buf(struct dnn_priv *priv, unsigned int buffer_index, + struct drv_ipa_buffer_info buffer_info[DRV_DNN_BUFFER_INDEX_MAX]) +{ + int ret = 0; + dma_addr_t addr; + + if (buffer_index >= DRV_DNN_BUFFER_INDEX_MAX) { + dev_err(priv->dev, "Buffer index invalid: index=%d\n", buffer_index); + return -EINVAL; + } + + switch (buffer_info[buffer_index].direction) { + case DRV_IPA_DIR_NONE: + priv->dma_dir[priv->dma_count] = DMA_NONE; + break; + case DRV_IPA_DIR_TO_DEVICE: + priv->dma_dir[priv->dma_count] = DMA_TO_DEVICE; + break; + case DRV_IPA_DIR_FROM_DEVICE: + priv->dma_dir[priv->dma_count] = DMA_FROM_DEVICE; + break; + case DRV_IPA_DIR_BIDIRECTION: + priv->dma_dir[priv->dma_count] = DMA_BIDIRECTIONAL; + break; + default: + dev_err(priv->dev, "DMA direction invalid: index=%d dir=%d\n", buffer_index, + buffer_info[buffer_index].direction); + return -EINVAL; + } + + ret = ipa_attach_dmabuf(priv->dev, buffer_info[buffer_index].fd, + &priv->dba[priv->dma_count], &priv->sgt[priv->dma_count], &addr, + priv->dma_dir[priv->dma_count]); + if (ret == 0) { + priv->dma_count++; + priv->buffer_iova[buffer_index] = addr; + } + + return ret; +} + +static void dnn_detach_dma_buf(struct dnn_priv *priv) +{ + struct dma_buf *dmabuf; + int i; + + for (i = 0; i < priv->dma_count; i++) { + dmabuf = priv->dba[i]->dmabuf; + dma_buf_unmap_attachment(priv->dba[i], priv->sgt[i], priv->dma_dir[i]); + dma_buf_detach(dmabuf, priv->dba[i]); + dma_buf_put(dmabuf); + } +} + +static irqreturn_t dnn_irq(int irq, void *dev_id) +{ + struct dnn_priv *priv = dev_id; + unsigned int event; + + event = hwd_dnn_irq_handler(priv->id); + + if (event & HWD_DNN_EVENT_EXEC_DONE) { + disable_irq_nosync(priv->irq); + return IRQ_WAKE_THREAD; + } + + return IRQ_NONE; +} + +static irqreturn_t dnn_irq_thread(int irq, void *dev_id) +{ + struct dnn_priv *priv = dev_id; + unsigned long delay = 1; + + mutex_lock(&priv->lock); + dnn_detach_dma_buf(priv); + + hwd_dnn_get_status(priv->id, &priv->hwd_status); + + priv->status = DRV_IPA_STATE_IDLE; + + /* status should be updated before poll_event so that + * when poll() returns, user context must observe state as idle + */ + smp_wmb(); + + if (priv->hwd_status.eer) + priv->poll_event = IPA_POLL_EVENT_ERROR; + else + priv->poll_event = IPA_POLL_EVENT_DONE; + + /* General barrier to avoid re-ordering of priv->poll_event=N and + * waitqueue_active() + */ + smp_mb(); + + /* Threads going to sleep in poll() can miss wakeup, when wakeup is done + * between event check in ipa_poll() and sleeping. Wakeup repeatedly. + */ + while (waitqueue_active(&priv->waitq)) { + wake_up_interruptible(&priv->waitq); + + WARN_ON(delay > IPA_WAKEUP_RETRY_DELAY); + usleep_range(delay, delay + 1); + delay += delay; + } + + mutex_unlock(&priv->lock); + + return IRQ_HANDLED; +} + +static void dnn_start(struct dnn_priv *priv, struct hwd_dnn_descriptor *desc) +{ + hwd_dnn_start(priv->id, desc); +} + +static int dnn_parse_entry_addr(struct dnn_priv *priv, int idx, struct drv_dnn_descriptor *desc, + struct hwd_dnn_descriptor *hwd_desc) +{ + dma_addr_t addr; + struct drv_ipa_addr ipa_addr; + int j, ret = 0; + + switch (desc->base_addr[idx].purpose) { + case DRV_DNN_BASE_ADDR_PURPOSE_OUTPUT: + case DRV_DNN_BASE_ADDR_PURPOSE_INPUT: + if (copy_from_user(priv->temp_list, + (void __user *)desc->base_addr[idx].addr.list_addr, + sizeof(struct drv_ipa_addr) * (desc->list_num + 1))) { + dev_err(priv->dev, "IPA address to iova conversion error: %d\n", __LINE__); + ret = -EFAULT; + break; + } + + hwd_desc->base_addr[idx] = priv->list_paddr[idx]; + for (j = 0; j < desc->list_num + 1; j++) { + ipa_addr = priv->temp_list[j]; + addr = dnn_ipa_addr_to_iova(priv, ipa_addr); + if (addr == 0) { + dev_err(priv->dev, + "@LIST@ IPA address to iova conversion error: %d\n", + __LINE__); + ret = -EINVAL; + break; + } + dev_dbg(priv->dev, "@LIST@ %d: fd=%d %llx %x\n", idx, + desc->buffer_info[ipa_addr.buffer_index].fd, + (u64)hwd_desc->base_addr[idx], (u32)addr); + priv->list_vaddr[idx][j] = addr; + } + break; + case DRV_DNN_BASE_ADDR_PURPOSE_AWB: + ipa_addr = desc->base_addr[idx].addr.ipa_addr; + addr = dnn_ipa_addr_to_iova(priv, ipa_addr); + if (addr == 0) { + dev_err(priv->dev, "@AWB@ IPA address to iova conversion error: %d\n", + __LINE__); + ret = -EINVAL; + break; + } + hwd_desc->base_addr[idx] = addr; + dev_dbg(priv->dev, "@AWB@ %d: fd=%d %llx\n", idx, + desc->buffer_info[ipa_addr.buffer_index].fd, (u64)hwd_desc->base_addr[idx]); + break; + case DRV_DNN_BASE_ADDR_PURPOSE_TEMPORARY: + ipa_addr = desc->base_addr[idx].addr.ipa_addr; + addr = dnn_ipa_addr_to_iova(priv, ipa_addr); + if (addr == 0) { + dev_err(priv->dev, "@TEMP@ IPA address to iova conversion error: %d\n", + __LINE__); + ret = -EINVAL; + break; + } + hwd_desc->base_addr[idx] = addr; + dev_dbg(priv->dev, "@TEMP@ %d: fd=%d %llx\n", idx, + desc->buffer_info[ipa_addr.buffer_index].fd, (u64)hwd_desc->base_addr[idx]); + break; + default: + hwd_desc->base_addr[idx] = 0; + break; + } + return ret; +} + +static int dnn_ioctl_start(struct dnn_priv *priv, unsigned long arg) +{ + struct drv_dnn_descriptor desc; + struct hwd_dnn_descriptor hwd_desc; + int i, ret = 0; + + ret = mutex_lock_interruptible(&priv->lock); + if (ret) + return ret; + + if (priv->status == DRV_IPA_STATE_BUSY) { + dev_dbg(priv->dev, "busy: %d\n", priv->status); + mutex_unlock(&priv->lock); + return -EBUSY; + } + + if (copy_from_user(&desc, (void __user *)arg, sizeof(struct drv_dnn_descriptor))) { + dev_err(priv->dev, "Descriptor memory access error\n"); + ret = -EFAULT; + goto err1; + } + + if (DRV_DNN_BIT_CONFIG_DESC_FINAL != (desc.config_done & DRV_DNN_BIT_CONFIG_DESC_FINAL)) { + dev_err(priv->dev, "Descriptor configuration not complete\n"); + ret = -EINVAL; + goto err1; + } + + priv->dma_count = 0; + + /* setup buffer */ + for (i = 0; i < desc.buffer_info_num; i++) { + ret = dnn_attach_dma_buf(priv, i, desc.buffer_info); + if (ret) { + dev_err(priv->dev, "dma buf attach error: index=%d\n", i); + goto err2; + } + dev_dbg(priv->dev, "@buffer[%d]@: fd=%d %s iova=%llx\n", i, desc.buffer_info[i].fd, + desc.buffer_info[i].coherent ? "coherent" : "non-coherent", + (uint64_t)priv->buffer_iova[i]); + } + + /* get iova address of configuration area */ + hwd_desc.configuration = dnn_ipa_addr_to_iova(priv, desc.configuration); + if (hwd_desc.configuration == 0) { + dev_err(priv->dev, "Invalid IPA Address: configuration %s: %d\n", __func__, + __LINE__); + ret = -EINVAL; + goto err2; + } + + dev_dbg(priv->dev, "@config@: fd=%d %llx\n", + desc.buffer_info[desc.configuration.buffer_index].fd, + (uint64_t)hwd_desc.configuration); + + for (i = 0; i < DRV_DNN_BASE_ADDR_NUM; i++) { + ret = dnn_parse_entry_addr(priv, i, &desc, &hwd_desc); + if (ret) + goto err2; + } + + hwd_desc.configuration_size = desc.configuration_size; + hwd_desc.list_num = desc.list_num; + hwd_desc.eer_flags_addr = priv->err_flags; + hwd_desc.base_addr_flag = desc.base_addr_flag; + hwd_desc.config_done = desc.config_done; + + hwd_desc.configuration += desc.configuration_offset; + + dnn_start(priv, &hwd_desc); + + priv->poll_event = IPA_POLL_EVENT_NONE; + priv->status = DRV_IPA_STATE_BUSY; + enable_irq(priv->irq); + + mutex_unlock(&priv->lock); + + return ret; + +err2: + dnn_detach_dma_buf(priv); +err1: + mutex_unlock(&priv->lock); + return ret; +} + +static int dnn_ioctl_get_status(struct dnn_priv *priv, unsigned long arg) +{ + struct drv_dnn_status status; + struct hwd_dnn_status hwd_status; + int ret = 0, i; + + ret = mutex_lock_interruptible(&priv->lock); + if (ret) + return ret; + + if (priv->status == DRV_IPA_STATE_BUSY) + hwd_dnn_get_status(priv->id, &hwd_status); + else + hwd_status = priv->hwd_status; + + status.state = priv->status; + mutex_unlock(&priv->lock); + + status.eer_cmd = hwd_status.eer_cmd; + status.eer = hwd_status.eer; + for (i = 0; i < HWD_DNN_EER_FLAG_NUM; i++) + status.eer_flags[i] = priv->err_flags[i]; + + if (copy_to_user((void __user *)arg, &status, sizeof(struct drv_dnn_status))) { + dev_err(priv->dev, "status memory access error\n"); + ret = -EFAULT; + } + + return ret; +} + +static long dnn_ioctl(struct file *fp, unsigned int cmd, unsigned long arg) +{ + struct dnn_priv *priv = container_of(fp->private_data, struct dnn_priv, miscdev); + int ret = 0; + + switch (cmd) { + case IOC_IPA_START: + ret = dnn_ioctl_start(priv, arg); + break; + case IOC_IPA_GET_STATUS: + ret = dnn_ioctl_get_status(priv, arg); + break; + default: + ret = -ENOIOCTLCMD; + break; + } + + return ret; +} + +static __poll_t dnn_poll(struct file *fp, poll_table *wait) +{ + struct dnn_priv *priv = container_of(fp->private_data, struct dnn_priv, miscdev); + __poll_t mask = 0; + unsigned int poll_event; + + poll_wait(fp, &priv->waitq, wait); + + /* Barrier to avoid re-ordering of poll_wait() and event load + * Read barrier here and release barrier in poll_wait() together will + * prevent re-ordering + */ + smp_rmb(); + poll_event = priv->poll_event; + if (poll_event != IPA_POLL_EVENT_NONE) { + mask = EPOLLIN | EPOLLRDNORM; + if (poll_event == IPA_POLL_EVENT_ERROR) + mask |= EPOLLERR; + } + return mask; +} + +static const struct file_operations dnn_fops = { + .owner = THIS_MODULE, + .unlocked_ioctl = dnn_ioctl, + .poll = dnn_poll, +}; + +static int dnn_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct dnn_priv *priv; + int i, ret; + + priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + + mutex_init(&priv->lock); + + /* update DMA mask */ + ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(36)); + if (ret) + return ret; + + priv->regs = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(priv->regs)) + return PTR_ERR(priv->regs); + + priv->irq = platform_get_irq(pdev, 0); + if (priv->irq < 0) + return priv->irq; + + ret = devm_request_threaded_irq(dev, priv->irq, dnn_irq, dnn_irq_thread, 0, "dnn", priv); + if (ret) { + dev_err(dev, "irq request failed\n"); + return ret; + } + disable_irq(priv->irq); + + ret = of_property_read_u32(dev->of_node, "index", &priv->id); + if (ret) { + dev_err(dev, "failed to acquire irq resource\n"); + return ret; + } + + /* + * allocate uncached-memory to hold address-list for DNN operation + * to hold ptr to 32bit addr memory block + * for each base address (upto HWD_DNN_BASE_ADDR_NUM) + * for each iteration (up to HWD_DNN_LIST_NUM_MAX) + * uint32_t list_addr [HWD_DNN_BASE_ADDR_NUM] [HWD_DNN_LIST_NUM_MAX]; + */ + priv->list_vaddr[0] = + dma_alloc_wc(dev, sizeof(u32) * HWD_DNN_LIST_NUM_MAX * HWD_DNN_BASE_ADDR_NUM, + &priv->list_paddr[0], GFP_KERNEL); + if (!priv->list_vaddr[0]) { + dev_err(dev, "dma_alloc_wc failed\n"); + return -ENOMEM; + } + for (i = 1; i < HWD_DNN_BASE_ADDR_NUM; i++) { + priv->list_vaddr[i] = priv->list_vaddr[0] + HWD_DNN_LIST_NUM_MAX * i; + priv->list_paddr[i] = priv->list_paddr[0] + sizeof(u32) * HWD_DNN_LIST_NUM_MAX * i; + } + + hwd_dnn_initialize(priv->id, priv->regs); + + snprintf(priv->name, sizeof(priv->name), "dnn%d", priv->id); + priv->miscdev.minor = MISC_DYNAMIC_MINOR; + priv->miscdev.name = priv->name; + priv->miscdev.fops = &dnn_fops; + ret = misc_register(&priv->miscdev); + if (ret) { + dev_err(dev, "misc registration failed\n"); + hwd_dnn_uninitialize(priv->id); + dma_free_wc(dev, sizeof(u32) * HWD_DNN_LIST_NUM_MAX * HWD_DNN_BASE_ADDR_NUM, + priv->list_vaddr[0], priv->list_paddr[0]); + return ret; + } + + priv->dev = dev; + platform_set_drvdata(pdev, priv); + + init_waitqueue_head(&priv->waitq); + + priv->status = DRV_IPA_STATE_IDLE; + return 0; +} + +static int dnn_remove(struct platform_device *pdev) +{ + struct dnn_priv *priv = platform_get_drvdata(pdev); + + misc_deregister(&priv->miscdev); + hwd_dnn_uninitialize(priv->id); + dma_free_wc(&pdev->dev, sizeof(u32) * HWD_DNN_LIST_NUM_MAX * HWD_DNN_BASE_ADDR_NUM, + priv->list_vaddr[0], priv->list_paddr[0]); + return 0; +} + +static const struct of_device_id dnn_of_match[] = { + { + .compatible = "toshiba,visconti-dnn", + }, + {}, +}; +MODULE_DEVICE_TABLE(platform, dnn_of_match); + +static struct platform_driver dnn_driver = { + .probe = dnn_probe, + .remove = dnn_remove, + .driver = { + .name = "visconti_dnn", + .of_match_table = of_match_ptr(dnn_of_match), + }, +}; +module_platform_driver(dnn_driver); + +MODULE_AUTHOR("Yuji Ishikawa "); +MODULE_DESCRIPTION("Toshiba Visconti DNN driver"); +MODULE_LICENSE("Dual BSD/GPL"); diff --git a/drivers/soc/visconti/dnn/hwd_dnn.c b/drivers/soc/visconti/dnn/hwd_dnn.c new file mode 100644 index 000000000..832040329 --- /dev/null +++ b/drivers/soc/visconti/dnn/hwd_dnn.c @@ -0,0 +1,183 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +/* Toshiba Visconti DNN Accelerator Support + * + * (C) Copyright 2022 TOSHIBA CORPORATION + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation + */ + +#include +#include +#include + +#include "hwd_dnn.h" +#include "hwd_dnn_reg.h" + +#define HWD_DNN_INT_END BIT(0) +#define HWD_DNN_INT_EER BIT(24) +#define HWD_DNN_INT_UNUSED (0x003f0100U) +#define HWD_DNN_INT_MASK_CONFIG (HWD_DNN_INT_UNUSED | HWD_DNN_INT_EER) +#define HWD_DNN_INT_CLEAR_ALL (HWD_DNN_INT_END | HWD_DNN_INT_EER) +#define HWD_DNN_EXE_CONFIG_MODE (0x3U) + +#define HWD_DNN_CGATE_ORD_CONFIG BIT(17) + +/** + * struct hwd_dnn_resources - HWD driver internal resource structure + * + * @reg: control register + * @status: driver status + * @eer_flags_addr: Address of execution error flags + */ +struct hwd_dnn_resources { + struct hwd_dnn_reg *reg; + struct hwd_dnn_status status; + u32 *eer_flags_addr; +}; + +/* HWD driver internal resource */ +static struct hwd_dnn_resources hwd_dnn_resources[HWD_DNN_DEVICE_MAX] = {}; + +/** + * hwd_dnn_initialize() - Initialize DNN device + * + * @module_id: @ref hwd_dnn_device "id" of the h/w module + * @vaddr: register base virtual address + * Return: 0 operation completed successfully + */ +void hwd_dnn_initialize(u32 module_id, void *vaddr) +{ + struct hwd_dnn_resources *res = &hwd_dnn_resources[module_id]; + + /* Initialize the device */ + res->reg = (struct hwd_dnn_reg *)vaddr; +} + +/** + * hwd_dnn_uninitialize() - Uninitialize DNN device + * + * @module_id: @ref hwd_dnn_device "id" of the h/w module + */ +void hwd_dnn_uninitialize(u32 module_id) +{ + struct hwd_dnn_resources *res = &hwd_dnn_resources[module_id]; + + res->reg = NULL; +} + +/** + * hwd_dnn_start() - Start DNN device + * + * @module_id: @ref hwd_dnn_device "id" of the h/w module + * @desc: Pointer to descriptor structure + */ +void hwd_dnn_start(u32 module_id, const struct hwd_dnn_descriptor *desc) +{ + struct hwd_dnn_resources *res = &hwd_dnn_resources[module_id]; + struct hwd_dnn_reg *reg; + u32 base_addr_flag; + int i; + + reg = res->reg; + + /* Configure the registers with user provided values */ + writel(desc->configuration & GENMASK(31, 0), ®->cfg_addr); + writel(desc->configuration_size, ®->cfg_size); + writel(desc->list_num, ®->cfg_rpt); + + base_addr_flag = desc->base_addr_flag; + for (i = 0; i < HWD_DNN_BASE_ADDR_NUM; i++) { + if (base_addr_flag & BIT(i)) { + /* 1: address list is specified */ + writel(0U, ®->base_addr[i]); + writel(desc->base_addr[i] & GENMASK(31, 0), ®->base_list[i]); + } else { + /* 0: region address is specified */ + writel(desc->base_addr[i] & GENMASK(31, 0), ®->base_addr[i]); + writel(0U, ®->base_list[i]); + } + } + + writel(base_addr_flag, ®->base_list_cfg); + + /* clear and unmask interrupts */ + writel(HWD_DNN_INT_CLEAR_ALL, ®->intstat); + writel(HWD_DNN_INT_MASK_CONFIG, ®->intmask); + + res->status.eer_cmd = 0; + res->status.eer = 0; + res->eer_flags_addr = desc->eer_flags_addr; + + /*Invalidate load command in Clock Gating Control Register */ + writel(HWD_DNN_CGATE_ORD_CONFIG, ®->cgate_ord); + + /* Sync memory barrier */ + dsb(st); + + /* Start the device */ + writel(HWD_DNN_EXE_CONFIG_MODE, ®->exe); +} + +/** + * hwd_dnn_irq_handler() - HWD DNN interrupt handler + * + * @module_id: id of the h/w module + * Return: HWD_DNN_EVENT_EXEC_DONE Bit field saying Configuration processing is completed + */ +u32 hwd_dnn_irq_handler(u32 module_id) +{ + struct hwd_dnn_resources *res = &hwd_dnn_resources[module_id]; + struct hwd_dnn_reg *reg; + u32 event = 0; + u32 int_stat; + u32 cause; + int i; + + reg = res->reg; + + /* Read the interrupt causes */ + cause = readl(®->mask_intstat); + + /* Read the interrupt status register */ + int_stat = readl(®->intstat); + + /* + * Read the error command. + * Clearing DNN_INTSTAT.EER interrupt cause initializes the error command register. + * Read error command before clearing the interrupt cause. + */ + res->status.eer_cmd = readl(®->int_eer_cmd); + + /* Clear the interrupt causes */ + writel(cause, ®->intstat); + + if (cause & HWD_DNN_INT_END) { + /* Get EER flags, if requestd */ + if (res->eer_flags_addr) + for (i = 0; i < HWD_DNN_EER_FLAG_NUM; i++) + res->eer_flags_addr[i] = readl(®->eer_flag[i]); + + event = HWD_DNN_EVENT_EXEC_DONE; + } + + if (int_stat & HWD_DNN_INT_EER) { + /* Operation error occurred */ + res->status.eer = 1; + } + + return event; +} + +/** + * hwd_dnn_get_status() - HWD DNN Get Status + * + * @module_id: @ref hwd_dnn_device "id" of the h/w module + * @status: Pointer to status structure + */ +void hwd_dnn_get_status(u32 module_id, struct hwd_dnn_status *status) +{ + const struct hwd_dnn_resources *res = &hwd_dnn_resources[module_id]; + + /* Update device status */ + status->eer_cmd = res->status.eer_cmd; + status->eer = res->status.eer; +} diff --git a/drivers/soc/visconti/dnn/hwd_dnn.h b/drivers/soc/visconti/dnn/hwd_dnn.h new file mode 100644 index 000000000..84425990c --- /dev/null +++ b/drivers/soc/visconti/dnn/hwd_dnn.h @@ -0,0 +1,68 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* Toshiba Visconti DNN Accelerator Support + * + * (C) Copyright 2022 TOSHIBA CORPORATION + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation + */ + +#ifndef HWD_DNN_H +#define HWD_DNN_H + +/** + * enum hwd_dnn_device_id - device ids for available DNN Accelerators + */ +enum hwd_dnn_device_id { HWD_DNN_DEVICE_0 = 0, HWD_DNN_DEVICE_1 = 1, HWD_DNN_DEVICE_MAX = 2 }; + +/* DNN events returned by irq handler */ +#define HWD_DNN_EVENT_EXEC_DONE (1U) + +/* number of base address available for DNN */ +#define HWD_DNN_BASE_ADDR_NUM (8U) + +/* number of error flag registers */ +#define HWD_DNN_EER_FLAG_NUM (32U) + +/* DNN hardware can execute 1024 iterations for different dataset at a time */ +#define HWD_DNN_LIST_NUM_MAX (1024U) + +/** + * struct hwd_dnn_status - DNN HWD Driver Status + * + * @eer_cmd: Error Command + * @eer: Execution Error + * @reserved: Padding + */ +struct hwd_dnn_status { + u32 eer_cmd; + u32 eer : 1; + u32 reserved : 31; +}; + +/** + * struct hwd_dnn_descriptor - HWD DNN Descriptor + * + * @configuration: Configuration data + * @configuration_size: Configuration data size + * @list_num: Number of input/output list + * @base_addr: Base Address + * @eer_flags_addr: Address of storing execution error flags + * @base_addr_flag: Bit-fields of Base Address list config. If 1, address list. If 0, fixed address + * @config_done: Flags of called configuration + */ +struct hwd_dnn_descriptor { + u32 configuration; + u32 configuration_size; + u32 list_num; + u32 base_addr[HWD_DNN_BASE_ADDR_NUM]; + u32 *eer_flags_addr; + u32 base_addr_flag; + u16 config_done; +}; + +void hwd_dnn_initialize(u32 module_id, void *vaddr); +void hwd_dnn_uninitialize(u32 module_id); +void hwd_dnn_start(u32 module_id, const struct hwd_dnn_descriptor *desc); +u32 hwd_dnn_irq_handler(u32 module_id); +void hwd_dnn_get_status(u32 module_id, struct hwd_dnn_status *status); + +#endif /* HWD_DNN_H */ diff --git a/drivers/soc/visconti/dnn/hwd_dnn_reg.h b/drivers/soc/visconti/dnn/hwd_dnn_reg.h new file mode 100644 index 000000000..c1271e092 --- /dev/null +++ b/drivers/soc/visconti/dnn/hwd_dnn_reg.h @@ -0,0 +1,228 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* Toshiba Visconti DNN Accelerator Support + * + * (C) Copyright 2022 TOSHIBA CORPORATION + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation + */ + +#ifndef HWD_DNN_REG_H +#define HWD_DNN_REG_H + +/** + * struct hwd_dnn_reg - DNN register address struct + */ +struct hwd_dnn_reg { + u32 version; + u32 exe; + u32 reserved_a_1; + u32 intstat; + u32 intmask; + u32 mask_intstat; + u32 reserved_a_2; + u32 int_eer_cmd; + u32 reserved_b_1[8]; + u32 cfg_cid; + u32 cfg_cnm; + u32 reserved_b_2[10]; + u32 reserved_a_3; + u32 cgate_ord; + u32 reserved_b_3[2]; + u32 eer_flag[32]; + u32 cfg_addr; + u32 cfg_size; + u32 cfg_rpt; + u32 reserved_b_4[1]; + u32 base_list_cfg; + u32 reserved_b_5[11]; + u32 base_addr[8]; + u32 reserved_b_6[24]; + u32 base_list[8]; + u32 reserved_b_7[8]; + u32 cmd_load; + u32 ld_format; + u32 reserved_b_8[14]; + u32 ld_g_size0; + u32 ld_g_size1; + u32 reserved_b_9[2]; + u32 reserved_a_4; + u32 reserved_b_10[1]; + u32 ld_g_base_id; + u32 ld_g_offset; + u32 ld_g_lptch; + u32 ld_g_cptch; + u32 ld_g_vptch; + u32 reserved_b_11[5]; + u32 ld_l_size0; + u32 ld_l_size1; + u32 reserved_b_12[2]; + u32 ld_l_addr; + u32 reserved_b_13[3]; + u32 ld_l_lptch; + u32 ld_l_cptch; + u32 ld_l_vptch; + u32 reserved_b_14[17]; + u32 reserved_a_5; + u32 reserved_b_15[3]; + u32 cmd_store; + u32 st_format; + u32 reserved_b_16[14]; + u32 st_g_size0; + u32 st_g_size1; + u32 reserved_b_17[2]; + u32 reserved_a_6; + u32 reserved_b_18[1]; + u32 st_g_base_id; + u32 st_g_offset; + u32 st_g_lptch; + u32 st_g_cptch; + u32 st_g_vptch; + u32 reserved_b_19[5]; + u32 st_l_size0; + u32 st_l_size1; + u32 reserved_b_20[2]; + u32 st_l_addr; + u32 reserved_b_21[3]; + u32 st_l_lptch; + u32 st_l_cptch; + u32 st_l_vptch; + u32 reserved_b_22[17]; + u32 reserved_a_7; + u32 reserved_b_23[3]; + u32 cmd_copy; + u32 reserved_b_24[15]; + u32 cp_src_size0; + u32 cp_src_size1; + u32 reserved_b_25[2]; + u32 cp_src_addr; + u32 reserved_b_26[3]; + u32 cp_src_lptch; + u32 cp_src_cptch; + u32 reserved_a_8; + u32 reserved_b_27[5]; + u32 cp_dst_size0; + u32 cp_dst_size1; + u32 reserved_b_28[2]; + u32 cp_dst_addr; + u32 reserved_b_29[3]; + u32 cp_dst_lptch; + u32 cp_dst_cptch; + u32 reserved_a_9; + u32 reserved_b_30[17]; + u32 reserved_a_10; + u32 reserved_b_31[3]; + u32 cmd_fill; + u32 reserved_b_32[1]; + u32 fl_cfg; + u32 reserved_b_33[13]; + u32 fl_src_size0; + u32 fl_src_size1; + u32 reserved_b_34[2]; + u32 fl_src_addr; + u32 reserved_b_35[3]; + u32 fl_src_lptch; + u32 fl_src_cptch; + u32 reserved_b_36[6]; + u32 fl_dst_size0; + u32 fl_dst_size1; + u32 reserved_b_37[2]; + u32 fl_dst_addr; + u32 reserved_b_38[3]; + u32 fl_dst_lptch; + u32 fl_dst_cptch; + u32 reserved_b_39[22]; + u32 cmd_sync; + u32 reserved_b_40[1]; + u32 cmd_srst; + u32 reserved_b_41[1]; + u32 cmd_halt; + u32 reserved_b_42[3]; + u32 cmd_finish; + u32 reserved_b_43[119]; + u32 cmd_exe; + u32 exe_mode; + u32 reserved_b_44[14]; + u32 exe_inf_size0; + u32 exe_inf_size1; + u32 reserved_b_45[2]; + u32 exe_inf_addr; + u32 reserved_b_46[3]; + u32 exe_inf_lptch; + u32 exe_inf_cptch; + u32 reserved_b_47[26]; + u32 exe_conv_bias_addr; + u32 reserved_b_48[27]; + u32 exe_conv_wt_size0; + u32 exe_conv_wt_size1; + u32 reserved_b_49[2]; + u32 exe_conv_wt_addr; + u32 reserved_b_50[3]; + u32 exe_conv_wt_lptch; + u32 exe_conv_wt_cptch; + u32 exe_conv_wt_vptch; + u32 reserved_b_51[9]; + u32 exe_conv_acv_addr; + u32 reserved_b_52[3]; + u32 exe_conv_acv_lptch; + u32 exe_conv_acv_cptch; + u32 reserved_b_53[18]; + u32 exe_conv_out; + u32 reserved_b_54[11]; + u32 actv_tbl_range; + u32 reserved_a_11; + u32 actv_tbl_prm; + u32 reserved_b_55[5]; + u32 exe_pool_size; + u32 reserved_b_56[43]; + u32 exe_pool_out; + u32 exe_pool_pe_crop; + u32 reserved_b_57[10]; + u32 exe_norm_g_prm; + u32 reserved_b_58[15]; + u32 exe_sav_max_prm; + u32 reserved_b_59[35]; + u32 exe_norm_out; + u32 reserved_b_60[23]; + u32 exe_outf_addr; + u32 reserved_b_61[3]; + u32 exe_outf_lptch; + u32 exe_outf_cptch; + u32 reserved_b_62[102]; + u32 reserved_a_12; + u32 reserved_b_63[8]; + u32 reserved_a_13; + u32 reserved_a_14; + u32 reserved_a_15; + u32 reserved_b_64[4]; + u32 reserved_a_16; + u32 reserved_a_17; + u32 reserved_b_65[46]; + u32 exe_pool_pa3_0; + u32 exe_pool_pa3_2; + u32 exe_pool_pa3_4; + u32 exe_pool_pa3_6; + u32 exe_pool_pa3_8; + u32 exe_pool_pa3_10; + u32 exe_pool_pa3_12; + u32 exe_pool_pa3_14; + u32 exe_pool_pa3_16; + u32 exe_pool_pa3_18; + u32 exe_pool_pa3_20; + u32 exe_pool_pa3_22; + u32 exe_pool_pa3_24; + u32 reserved_b_66[3]; + u32 exe_pool_pa2_0; + u32 exe_pool_pa2_2; + u32 exe_pool_pa2_4; + u32 exe_pool_pa2_6; + u32 exe_pool_pa2_8; + u32 exe_pool_pa2_10; + u32 exe_pool_pa2_12; + u32 exe_pool_pa2_14; + u32 exe_pool_pa2_16; + u32 exe_pool_pa2_18; + u32 exe_pool_pa2_20; + u32 exe_pool_pa2_22; + u32 exe_pool_pa2_24; +}; + +#endif /* HWD_DNN_REG_H */ diff --git a/drivers/soc/visconti/uapi/dnn.h b/drivers/soc/visconti/uapi/dnn.h new file mode 100644 index 000000000..322c6b13d --- /dev/null +++ b/drivers/soc/visconti/uapi/dnn.h @@ -0,0 +1,77 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* Toshiba Visconti DNN Accelerator Support + * + * (C) Copyright 2022 TOSHIBA CORPORATION + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation + */ + +#ifndef _UAPI_LINUX_DNN_H +#define _UAPI_LINUX_DNN_H + +#include +#include +#include "ipa.h" + +#define DRV_DNN_BIT_CONFIG_DESC_FINAL (0x8000U) +#define DRV_DNN_BUFFER_INDEX_MAX (15) + +#define DRV_DNN_BASE_ADDR_NUM (8U) /* DNN number of base address */ + +#define DRV_DNN_BASE_ADDR_PURPOSE_INPUT (1U) +#define DRV_DNN_BASE_ADDR_PURPOSE_OUTPUT (2U) +#define DRV_DNN_BASE_ADDR_PURPOSE_AWB (3U) +#define DRV_DNN_BASE_ADDR_PURPOSE_TEMPORARY (4U) + +/** + * struct drv_dnn_status - DNN IPA status for IOC_IPA_GET_STATUS + * + * @state: State of driver + * @eer_cmd: Execution error command + * @eer: Execution error + * @reserved: Padding + * @eer_flags: Execution error flags + */ +struct drv_dnn_status { + enum drv_ipa_state state; + __u32 eer_cmd; + __u32 eer : 1; + __u32 reserved : 31; + __u32 eer_flags[32]; +}; + +struct drv_dnn_base_addr { + __u32 purpose; + union { + struct drv_ipa_addr ipa_addr; + uintptr_t list_addr; + } addr; +}; + +/** + * struct drv_dnn_descriptor - DNN IPA Descriptor for IOC_IPA_START + * + * @configuration: Address of configuration data + * @configuration_offset: Configuration offset + * @configuration_size: Configuration data size + * @list_num: Number of input/output list + * @base_addr: Base addresses + * @base_addr_flag: Bit-fields of base_addr list config; + * 0 for fixed address, + * 1 for address list. + * @config_done: Flags of called configuration + * @buffer_info: Table of buffer information + * @buffer_info_num: Number of buffer_info + */ +struct drv_dnn_descriptor { + struct drv_ipa_addr configuration; + __u32 configuration_offset; + __u32 configuration_size; + __u32 list_num; + struct drv_dnn_base_addr base_addr[DRV_DNN_BASE_ADDR_NUM]; + __u32 base_addr_flag; + __u16 config_done; + struct drv_ipa_buffer_info buffer_info[DRV_DNN_BUFFER_INDEX_MAX]; + __s32 buffer_info_num; +}; + +#endif /* _UAPI_LINUX_DNN_H */ From patchwork Fri Jul 22 08:28:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yuji Ishikawa X-Patchwork-Id: 12926125 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75A26CCA47F for ; Fri, 22 Jul 2022 08:34:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234120AbiGVIe6 (ORCPT ); Fri, 22 Jul 2022 04:34:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59748 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235017AbiGVIex (ORCPT ); Fri, 22 Jul 2022 04:34:53 -0400 Received: from mo-csw.securemx.jp (mo-csw1116.securemx.jp [210.130.202.158]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B8219EC55; Fri, 22 Jul 2022 01:34:52 -0700 (PDT) Received: by mo-csw.securemx.jp (mx-mo-csw1116) id 26M8YN5O010501; Fri, 22 Jul 2022 17:34:23 +0900 X-Iguazu-Qid: 2wHHJgaTyWPqPTIE4t X-Iguazu-QSIG: v=2; s=0; t=1658478862; q=2wHHJgaTyWPqPTIE4t; m=28XmJNhu9oMp/+/VlGY2sVi9qyuPrWXzzC1Tz25r85Y= Received: from imx12-a.toshiba.co.jp ([38.106.60.135]) by relay.securemx.jp (mx-mr1113) id 26M8YKba013052 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=NOT); Fri, 22 Jul 2022 17:34:21 +0900 X-SA-MID: 39258159 From: Yuji Ishikawa To: Rob Herring , Hans Verkuil , Nobuhiro Iwamatsu , Jonathan Corbet , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, yuji2.ishikawa@toshiba.co.jp Subject: [PATCH v2 4/5] MAINTAINERS: Add entries for Toshiba Visconti DNN image processing accelerator Date: Fri, 22 Jul 2022 17:28:57 +0900 X-TSB-HOP2: ON Message-Id: <20220722082858.17880-5-yuji2.ishikawa@toshiba.co.jp> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> References: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index dd36acc87..a2e2bd719 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2796,12 +2796,14 @@ F: Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml F: Documentation/devicetree/bindings/gpio/toshiba,gpio-visconti.yaml F: Documentation/devicetree/bindings/pci/toshiba,visconti-pcie.yaml F: Documentation/devicetree/bindings/pinctrl/toshiba,visconti-pinctrl.yaml +F: Documentation/devicetree/bindings/soc/visconti/ F: Documentation/devicetree/bindings/watchdog/toshiba,visconti-wdt.yaml F: arch/arm64/boot/dts/toshiba/ F: drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c F: drivers/gpio/gpio-visconti.c F: drivers/pci/controller/dwc/pcie-visconti.c F: drivers/pinctrl/visconti/ +F: drivers/soc/visconti/ F: drivers/watchdog/visconti_wdt.c N: visconti From patchwork Fri Jul 22 08:28:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yuji Ishikawa X-Patchwork-Id: 12926126 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9B77CCA47F for ; Fri, 22 Jul 2022 08:35:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235049AbiGVIfA (ORCPT ); Fri, 22 Jul 2022 04:35:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235028AbiGVIe4 (ORCPT ); Fri, 22 Jul 2022 04:34:56 -0400 Received: from mo-csw.securemx.jp (mo-csw1116.securemx.jp [210.130.202.158]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 818629EC5A; Fri, 22 Jul 2022 01:34:54 -0700 (PDT) Received: by mo-csw.securemx.jp (mx-mo-csw1116) id 26M8YRxT010864; Fri, 22 Jul 2022 17:34:27 +0900 X-Iguazu-Qid: 2wHHu0tpvrzq8OKSgE X-Iguazu-QSIG: v=2; s=0; t=1658478866; q=2wHHu0tpvrzq8OKSgE; m=PlG4+L+mjN4vIVxdysur12YA8Ssaej13M/64gxBMJKw= Received: from imx12-a.toshiba.co.jp ([38.106.60.135]) by relay.securemx.jp (mx-mr1112) id 26M8YPpC024474 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=NOT); Fri, 22 Jul 2022 17:34:25 +0900 X-SA-MID: 39258161 From: Yuji Ishikawa To: Rob Herring , Hans Verkuil , Nobuhiro Iwamatsu , Jonathan Corbet , Sumit Semwal , =?utf-8?q?Christian_K=C3=B6nig?= Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, yuji2.ishikawa@toshiba.co.jp Subject: [PATCH v2 5/5] Documentation: driver-api: visconti: add a description of DNN driver. Date: Fri, 22 Jul 2022 17:28:58 +0900 X-TSB-HOP2: ON Message-Id: <20220722082858.17880-6-yuji2.ishikawa@toshiba.co.jp> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> References: <20220722082858.17880-1-yuji2.ishikawa@toshiba.co.jp> Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org Signed-off-by: Yuji Ishikawa --- v1 -> v2: - newly added documents --- Documentation/driver-api/visconti/common.rst | 115 ++++++ Documentation/driver-api/visconti/dnn.rst | 394 +++++++++++++++++++ 2 files changed, 509 insertions(+) create mode 100644 Documentation/driver-api/visconti/common.rst create mode 100644 Documentation/driver-api/visconti/dnn.rst diff --git a/Documentation/driver-api/visconti/common.rst b/Documentation/driver-api/visconti/common.rst new file mode 100644 index 000000000..cc2d3f6cd --- /dev/null +++ b/Documentation/driver-api/visconti/common.rst @@ -0,0 +1,115 @@ +========================================== +Common interfaces for Visconti IPA Drivers +========================================== + +file operation +============== + +IPA driver expose thair function to userspace with device files. + +* The device file has specific name, such as /dev/dnn0 +* device file provides some operations + + - open / close + - two ioctls operations to control IPA + - poll to wait for the finish of IPA + +IOCTLs +====== + +All IPA drivers have following two IOCTL requests. +See individual documents of IPA drivers for detail. + +.. c:macro:: IOC_IPA_START + +starts IPA processing. + +.. c:macro:: IOC_IPA_GET_STATUS + +get the status of IPA driver + +IPA Addresses +============= + +IPA drivers assume that the buffers for IPA HW processing are DMA-BUF +which are phisically contiguous memory block allocated with DMA-BUF heaps. +The heap area is defined with a reserved-memory element in device tree description. +See Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt for further information. + +Two structures are used to describe DMA-BUF instances. + +* :c:type:`drv_ipa_buffer_info` +* :c:type:`drv_ipa_addr` + +Note that allocation size of DMA-BUF instances should be power of two. +This is mainly because IOMMUs have very limited number of entries for translation. + + +drv_ipa_buffer_info +------------------- + +Information about the buffer allocated with DMA-BUF heaps. +Applications specify the array of drv_ipa_buffer_info to the driver. + +drv_ipa_addr +------------ + +Application specify the region to be processed with two information; index for the array of drv_ipa_buffer_info and offset. + +.. c:macro:: IPA_INVALID_ADDR + +When a buffer is an optional parameter for an API, application can use :c:macro:`IPA_INVALID_ADDR` instead of NULL. + +For example: + +.. code-block:: c + + struct drv_ipa_addr null_addr = IPA_INVALID_ADDR; + +Code exampe +----------- + +.. code-block:: c + + /* hogehoge */ + /* hogehoge */ + +Structures +========== + +.. kernel-doc:: drivers/soc/visconti/uapi/ipa.h + +Further description of addressing +================================= + +DRAM addressing space +--------------------- + +Visconti5 (TMPV770x) SoCs have 2 DDR controllers. +Each space for a controller is mapped in 36bit address space as follows: + +* DDR0: 0x0_8000_0000 - 0x0_FFFF_FFFF +* DDR1: 0x8_8000_0000 - 0x8_FFFF_FFFF + +IPAs have dedicated IOMMUs (visconti ioatu) which translate 32bit IOVA to 36bit physical address. +Note that address registers of IPA core hardware only accept 32bit values. +When IOMMUs are absent (by removing their description from device tree), DDR1 address must not be used for IPA. + +Coherent access among CPUs / IPAs +--------------------------------- + +Bus master of IPAs has two connections; direct access to DRAM and cache-coherent connection with CPU clusters. +The bit 34 of 64bit address is used to distinguish direct access or coherent acccess. + +* DDR0 (direct): 0x0_8000_0000 - 0x0_FFFF_FFFF +* DDR0 (coherent): 0x4_8000_0000 - 0x4_FFFF_FFFF +* DDR1 (direct): 0x8_8000_0000 - 0x8_FFFF_FFFF +* DDR1 (coherent): 0xC_8000_0000 - 0xC_FFFF_FFFF + +If the whole system should run coherent mode, make following changes in device tree description: + +* set coherent address for all memory and reserved-memory elements +* set dma_coherent property to all IPA elements + +It is strongly recommended not to use both direct and coherent address at the same time. + diff --git a/Documentation/driver-api/visconti/dnn.rst b/Documentation/driver-api/visconti/dnn.rst new file mode 100644 index 000000000..1abdd6c7c --- /dev/null +++ b/Documentation/driver-api/visconti/dnn.rst @@ -0,0 +1,394 @@ +======================= +Visconti DNN IPA driver +======================= + +Hardware overview +================= + +Visconti DNN IPA is a proprietary CNN/DCNN accelerator developed by Toshiba. +This IPA integrates large scaled parallel MAC units and dedicated function units to accelerate +inference operation of well known neural network layers. It also has dedicated DMA unit to make accesses to DRAM. +Visconti5 SoC has 2 instances of DNN accelerator hardware. + +WORDS: + +* IPA: Image Processing Accelerator +* CNN: Convolutional Neural Network +* DCNN: Deep Convolutional Neural Network + +Software architecture +===================== + +The operation of Visconti DNN IPA depends on some software components including this kernel driver. +Some extra components are provided by Toshiba Electronic Device & Storage Company (at Jul 2022). + +* libdrvutil: userspace utility to build a descriptor for a NN inference operation. +* compiler: offline tool to generate configuration binaries from caffe/onnx compatible CNN models. + +The Visconti DNN IPA driver receives a configuration binary and an input vector then configures hardware registers. +Register values suitable for each run is held in a descriptor data structure generated by libdrvutil. + +The userland and the kernel driver use DMA-BUF to allocate/handle (phisically contiguous) memory blocks. + +Public user interface of Visconti DNN IPA driver is exposed as following device files. + +* DNN0: /dev/dnn0 +* DNN1: /dev/dnn1 + +IOCTLs +====== + +IOC_IPA_START +------------- + +.. c:function:: int ioctl( int fd, IOC_IPA_START, drv_dnn_descriptor_t *argp ) + :name: IOC_IPA_START_DNN + +Description +~~~~~~~~~~~ + +This ioctl call starts the DNN processing with specified descriptor. +:c:type:`drv_dnn_descriptor_t` should be set by :ref:`util_dnn` + +poll(2) will return when the DNN processing is completed. +DNN HW does not stop IPA processing by execution error. +DNN driver will report POLLERR with completed event, if the execution error occurs. +Applications can check the error information by calling :c:func:`ioctl(IOC_IPA_GET_STATUS) `. + +Return value +~~~~~~~~~~~~ + +On success, 0 is returned. +On error, -1 is returned and the ``errno`` is set appropriately. + +Errors +~~~~~~ + +* EBUSY: Driver is busy. +* EFAULT: + + - The given descriptor is not accessible from kernel space. + - ``src_list`` is not accessible from kernel space. + - ``dst_list`` is not accessible from kernel space. + +* EINVAL: + + - The given descriptor is not configured by :ref:`drv_DNN_config_descript_finalize`. + - Ion heap fd passed in :ref:`drv_DNN_config_descript_init` (buffer) is invalid. + - ``configuration_binary_addr.offset`` is greater than or equal to size of Ion heap pointed by ``configuration_binary_addr.buffer_index``. + - ``temporary_addr.offset`` is greater than or equal to size of Ion heap pointed by ``temporary_addr.buffer_index``. + - ``src_list[0 ~ n-1]`` is :c:macro:`IPA_INVALID_ADDR`. + - ``src_list[0 ~ n-1].buffer_index`` is equal to or greater than ``buffer_num`` of :ref:`drv_DNN_config_descript_init`. + - ``src_list[0 ~ n-1].offset`` is greater than or equal to size of Ion heap pointed by ``src_list[0 ~ n-1].buffer_index``. + - ``dst_list[0 ~ n-1]`` is :c:macro:`IPA_INVALID_ADDR`. + - ``dst_list[0 ~ n-1].buffer_index`` is equal to or greater than ``buffer_num`` of :ref:`drv_DNN_config_descript_init`. + - ``dst_list[0 ~ n-1].offset`` is greater than or equal to size of Ion heap pointed by ``dst_list[0 ~ n-1].buffer_index``. + +IOC_IPA_GET_STATUS +------------------ + +.. c:function:: int ioctl( int fd, IOC_IPA_GET_STATUS, drv_dnn_status_t *argp ) + :name: IOC_IPA_GET_STATUS_DNN + +Description +~~~~~~~~~~~ + +| This ioctl call asks the DNN driver to get the state of the DNN HW. +| See :c:type:`drv_dnn_status_t` for details. + +Error +~~~~~ + +* EFAULT: The given status structure is not accessible from kernel space. + +.. _util_dnn: + + +Userspace utility functions +=========================== + +.. _drv_DNN_config_descript_init: + +drv_DNN_config_descript_init +---------------------------- + +Synopsis +~~~~~~~~ + +.. code-block:: c + + int32_t drv_DNN_config_descript_init( + drv_dnn_descriptor_t *desc, + struct drv_ipa_buffer_info *buffer, + int32_t buffer_num + ); + +Parameters +~~~~~~~~~~ + +.. flat-table:: + :header-rows: 0 + + * - :c:type:`drv_dnn_descriptor_t` \*desc + - Pointer to the DNN descriptor structure. + * - :c:type:`struct drv_ipa_buffer_info ` \*buffer + - Array of Ion heaps for DNN processing. + * - int32_t buffer_num + - Number of buffer. [1, 15] + +Description +~~~~~~~~~~~ + +Initializes the :c:type:`drv_dnn_descriptor_t` structure with value 0 and sets Array of Ion heaps. + +Return value +~~~~~~~~~~~~ + +.. flat-table:: + :header-rows: 0 + + * - :c:macro:`DRV_OK` + - Operation completes successfully. + * - :c:macro:`DRV_ERROR_PARAMETER` + - * ``desc`` is NULL. + * ``buffer`` is NULL. + * ``buffer_num`` is out of range. + * ``direction`` member in ``buffer`` is invalid. + +.. _drv_DNN_config_exec_configuration: + +drv_DNN_config_exec_configuration +--------------------------------- + +Synopsis +~~~~~~~~ + +.. code-block:: c + + int32_t drv_DNN_config_exec_configuration( + drv_dnn_descriptor_t *desc, + const void *configuration_binary, + struct drv_ipa_addr configuration_binary_addr, + struct drv_ipa_addr *src_list, + struct drv_ipa_addr *dst_list, + int32_t list_num, + struct drv_ipa_addr temporary_addr, + int32_t temporary_size + ); + +Parameters +~~~~~~~~~~ + +.. flat-table:: + :header-rows: 0 + + * - :c:type:`drv_dnn_descriptor_t` \*desc + - Pointer to the DNN descriptor structure. + * - const void \*configuration_binary + - Userland address of the configuration binary. + * - :c:type:`struct drv_ipa_addr ` \*configuration_binary_addr + - IPA address of the configuration binary. Should be aligned to 512 byte. + * - :c:type:`struct drv_ipa_addr ` \*src_list + - Userland address for the list of IPA address to input data. + * - :c:type:`struct drv_ipa_addr ` \*dst_list + - Userland address for the list of IPA address to output data. + * - int32_t list_num + - Number of list. [1, 1024] + * - :c:type:`struct drv_ipa_addr ` \*temporary_addr + - IPA address of the temporary memory. Should be aligned to 4 byte. + * - int32_t temporary_size + - Size of the temporary memory. + +Description +~~~~~~~~~~~ + +Configures :c:type:`drv_dnn_descriptor_t` structure with execute configuration parameters. + +* ``desc`` should be initialized by :ref:`drv_DNN_config_descript_init` before the descriptor configuration. +* If configuration binary provides temporary buffer setting, temporary_addr should not be :c:macro:`IPA_INVALID_ADDR` and should be aligned to 4 byte. +* If configuration binary does not provide temporary buffer setting, applications can specify :c:macro:`IPA_INVALID_ADDR` for temporary_addr. +* Configuration binary for ``configuration_binary`` parameter should be created by DNN toolchain. + + +Return value +~~~~~~~~~~~~ + +.. flat-table:: + :header-rows: 0 + + * - :c:macro:`DRV_OK` + - Operation completes successfully. + * - :c:macro:`DRV_ERROR_PARAMETER` + - * ``desc`` is NULL. + * ``configuration_binary`` is NULL. + * ``configuration_binary_addr`` is :c:macro:`IPA_INVALID_ADDR`. + * ``configuration_binary_addr.buffer_index`` is equal to or greater than ``buffer_num`` of :ref:`drv_DNN_config_descript_init`. + * ``configuration_binary_addr.offset`` is not aligned to 512 byte. + * ``src_list`` is NULL. + * ``dst_list`` is NULL. + * ``list_num`` is out of range. + * ``temporary_addr`` is :c:macro:`IPA_INVALID_ADDR` when configuration binary requires more than 0 byte for temporary memory. + * ``temporary_addr.offset`` is not aligned to 4 byte when it is valid. + * ``temporary_addr.buffer_index`` is equal to or greater than ``buffer_num`` of :ref:`drv_DNN_config_descript_init` when it is valid. + * - :c:macro:`DRV_DNN_ERROR_INVALID_CONFIGURATION_BINARY` + - In Configuration binary, + + * base address mode N is except 0:unused, 1:list, 2:fixed. + * base address purpose N is except 1:input, 2:output, 3:weight/bias, 4:temporary. + * base address mode is not list when purpose is input or output + * base address mode is not fixed when purpose is temporary or weight/bias + * - :c:macro:`DRV_DNN_ERROR_TEMPORARY_MEMORY_SIZE` + - ``temporary_size`` is smaller than the size which is required by configuration binary. + +.. _drv_DNN_config_descript_finalize: + +drv_DNN_config_descript_finalize +-------------------------------- + +Synopsis +~~~~~~~~ + +.. code-block:: c + + int32_t drv_DNN_config_descript_finalize( + drv_dnn_descriptor_t *desc + ); + +Parameters +~~~~~~~~~~ + +.. flat-table:: + :header-rows: 0 + + * - :c:type:`drv_dnn_descriptor_t` \*desc + - | Pointer to the DNN descriptor structure + + +Description +~~~~~~~~~~~ + +Checks if DNN driver descriptor configuration is complete before starting DNN HW. + +``desc`` should be initialized by :ref:`drv_DNN_config_descript_init` before the descriptor configuration. + +Return value +~~~~~~~~~~~~ + +.. flat-table:: + :header-rows: 0 + + * - :c:macro:`DRV_OK` + - Operation completes successfully. + * - :c:macro:`DRV_ERROR_PARAMETER` + - ``desc`` is NULL. + * - :c:macro:`DRV_ERROR_IPA_CONFIG_INCOMPLETE` + - :ref:`drv_DNN_config_exec_configuration` is not called after :ref:`drv_DNN_config_descript_init`. + +Structures +========== + +.. kernel-doc:: drivers/soc/visconti/uapi/dnn.h + +Code example +============ + +``src_list`` and ``dst_list`` parameter for :ref:`drv_DNN_config_exec_configuration` is the array of :c:type:`drv_ipa_addr`. + +.. code-block:: c + + #include + #include + #include + #include + #include + #include + #include + #include + #include + + #include + + #include "drv_ipa.h" + #include "drv_dnn_util.h" + + #define HEAP_ID (2) + #define CONFIG_SIZE + #define INPUT_SIZE + #define OUTPUT_SIZE + #define TEMP_SIZE + #define LIST_NUM (2) + + #define NUM_OF_BUFS (4) + + void dnn_sample(void) + { + int ret; + int32_t drv_ret; + int32_t fd_heap, fd_dnn; + struct pollfd fds[1]; + int32_t timeout; + unsigned char *src, *config; + struct dma_heap_allocation_data alloc_src={0}, alloc_dst={0}, alloc_temp={0}, alloc_config={0}; + drv_dnn_descriptor_t desc; + + /* Open DMA-BUF heaps device for heap allocation */ + fd_heap = open("/dev/dma_heap/linux,cma", O_RDWR); + + /* Open DNN device for processing */ + fd_dnn = open("/dev/dnn0", O_RDWR); + + /* Input buffer */ + alloc_src.len = INPUT_SIZE; /* Size should be power of 2 */ + alloc_src.fd_flags = O_RDWR | O_CLOEXEC; + ret = ioctl(fd_heap, DMA_HEAP_IOCTL_ALLOC, &alloc_src); + + src = mmap(0, alloc_src.len, PROT_READ | PROT_WRITE, MAP_SHARED, alloc_src.fd, 0); + + alloc_config.len = CONFIG_SIZE; /* Size should be power of 2 */ + alloc_config.fd_flags = O_RDWR | O_CLOEXEC; + ret = ioctl(fd_ion, DMA_HEAP_IOCTL_ALLOC, &alloc_config); + + config = mmap(NULL, alloc_config.len, PROT_READ | PROT_WRITE, MAP_SHARED, alloc_config.fd, 0); + + alloc_temp.len = TEMP_SIZE; /* Size should be power of 2 */ + alloc_temp.fd_flags = ION_FLAG_CACHED; + ret = ioctl(fd_ion, DMA_HEAP_IOCTL_ALLOC, &alloc_temp); + + /* Output buffer */ + alloc_dst.len = OUTPUT_SIZE; /* Size should be power of 2 */ + alloc_dst.flags = O_RDWR | O_CLOEXEC; + ret = ioctl(fd_ion, DMA_HEAP_IOCTL_ALLOC, &alloc_dst); + + /* Create buffer info data for descriptor */ + struct drv_ipa_buffer_info bufinfo[NUM_OF_BUFS] + = {{.fd = alloc_config.fd, .direction = DRV_IPA_DIR_TO_DEVICE}, + {.fd = alloc_src.fd, .direction = DRV_IPA_DIR_TO_DEVICE}, + {.fd = alloc_temp.fd, .direction = DRV_IPA_DIR_FROM_DEVICE}, + {.fd = alloc_dst.fd, .direction = DRV_IPA_DIR_FROM_DEVICE}}; + + struct drv_ipa_addr config_addr = {.buffer_index = 0, .offset = 0}; + struct drv_ipa_addr src_list_addr[LIST_NUM] = {{.buffer_index = 1, .offset = 0}, {.buffer_index = 1, .offset = 0x300}}; + struct drv_ipa_addr temp_addr = {.buffer_index = 2, .offset = 0}; + struct drv_ipa_addr dst_list_addr[LIST_NUM] = {{.buffer_index = 3, .offset = 0}, {.buffer_index = 3, .offset = 0x300}}; + + /* Configure descriptor */ + drv_ret = drv_DNN_config_descript_init(&desc, bufinfo, NUM_OF_BUFS); + + drv_ret = drv_DNN_config_exec_configuration(&desc, config, config_addr, src_list_addr, dst_list_addr, LIST_NUM, temp_addr, + TEMP_SIZE); + + drv_ret = drv_DNN_config_descript_finalize(&desc); + + /* Start DNN */ + ret = ioctl(fd_dnn, IOC_IPA_START, &desc); + + /* Wait DNN */ + fds[0].fd = fd_dnn; + fds[0].events = POLLIN; + timeout = 1000; // 1sec + + ret = poll(fds, 1, timeout); + + close(fd_ion); + close(fd_dnn); + }