From patchwork Thu Sep 27 11:27:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Dalessandro X-Patchwork-Id: 10617795 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 967F2180E for ; Thu, 27 Sep 2018 11:27:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7AA622A879 for ; Thu, 27 Sep 2018 11:27:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6EBF12B1BD; Thu, 27 Sep 2018 11:27:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E19822A879 for ; Thu, 27 Sep 2018 11:27:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727149AbeI0Rp1 (ORCPT ); Thu, 27 Sep 2018 13:45:27 -0400 Received: from mga03.intel.com ([134.134.136.65]:59033 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727118AbeI0Rp1 (ORCPT ); Thu, 27 Sep 2018 13:45:27 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 27 Sep 2018 04:27:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,310,1534834800"; d="scan'208";a="73363128" Received: from scymds02.sc.intel.com ([10.82.195.37]) by fmsmga007.fm.intel.com with ESMTP; 27 Sep 2018 04:27:23 -0700 Received: from scvm10.sc.intel.com (scvm10.sc.intel.com [10.82.195.27]) by 
scymds02.sc.intel.com with ESMTP id w8RBRMA1026818; Thu, 27 Sep 2018 04:27:22 -0700 Received: from scvm10.sc.intel.com (localhost [127.0.0.1]) by scvm10.sc.intel.com with ESMTP id w8RBRMJA026655; Thu, 27 Sep 2018 04:27:22 -0700 Subject: [PATCH for-next 1/2] IB/hfi1: Dump pio info for non-user send contexts From: Dennis Dalessandro To: jgg@ziepe.ca, dledford@redhat.com Cc: linux-rdma@vger.kernel.org, Mike Ruhl , Mike Marciniczyn , Kaike Wan Date: Thu, 27 Sep 2018 04:27:22 -0700 Message-ID: <20180927112718.26543.62844.stgit@scvm10.sc.intel.com> In-Reply-To: <20180926175835.2451.14284.stgit@scvm10.sc.intel.com> References: <20180926175835.2451.14284.stgit@scvm10.sc.intel.com> User-Agent: StGit/0.17.1-18-g2e886-dirty MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Kaike Wan This patch dumps the pio info for non-user send contexts to assist debugging in the field. Reviewed-by: Mike Marciniczyn Reviewed-by: Mike Ruhl Signed-off-by: Kaike Wan Signed-off-by: Dennis Dalessandro --- drivers/infiniband/hw/hfi1/chip_registers.h | 4 ++ drivers/infiniband/hw/hfi1/debugfs.c | 49 +++++++++++++++++++++++++++ drivers/infiniband/hw/hfi1/pio.c | 25 ++++++++++++++ drivers/infiniband/hw/hfi1/pio.h | 3 ++ 4 files changed, 81 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/chip_registers.h b/drivers/infiniband/hw/hfi1/chip_registers.h index c6163a3..c0800ea 100644 --- a/drivers/infiniband/hw/hfi1/chip_registers.h +++ b/drivers/infiniband/hw/hfi1/chip_registers.h @@ -935,6 +935,10 @@ #define SEND_CTXT_CREDIT_CTRL_THRESHOLD_MASK 0x7FFull #define SEND_CTXT_CREDIT_CTRL_THRESHOLD_SHIFT 0 #define SEND_CTXT_CREDIT_CTRL_THRESHOLD_SMASK 0x7FFull +#define SEND_CTXT_CREDIT_STATUS (TXE + 0x000000100018) +#define SEND_CTXT_CREDIT_STATUS_CURRENT_FREE_COUNTER_MASK 0x7FFull +#define SEND_CTXT_CREDIT_STATUS_CURRENT_FREE_COUNTER_SHIFT 32 +#define 
SEND_CTXT_CREDIT_STATUS_LAST_RETURNED_COUNTER_SMASK 0x7FFull #define SEND_CTXT_CREDIT_FORCE (TXE + 0x000000100028) #define SEND_CTXT_CREDIT_FORCE_FORCE_RETURN_SMASK 0x1ull #define SEND_CTXT_CREDIT_RETURN_ADDR (TXE + 0x000000100020) diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c index 9f992ae..0a55779 100644 --- a/drivers/infiniband/hw/hfi1/debugfs.c +++ b/drivers/infiniband/hw/hfi1/debugfs.c @@ -407,6 +407,54 @@ static int _rcds_seq_show(struct seq_file *s, void *v) DEBUGFS_SEQ_FILE_OPEN(rcds) DEBUGFS_FILE_OPS(rcds); +static void *_pios_seq_start(struct seq_file *s, loff_t *pos) +{ + struct hfi1_ibdev *ibd; + struct hfi1_devdata *dd; + + ibd = (struct hfi1_ibdev *)s->private; + dd = dd_from_dev(ibd); + if (!dd->send_contexts || *pos >= dd->num_send_contexts) + return NULL; + return pos; +} + +static void *_pios_seq_next(struct seq_file *s, void *v, loff_t *pos) +{ + struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private; + struct hfi1_devdata *dd = dd_from_dev(ibd); + + ++*pos; + if (!dd->send_contexts || *pos >= dd->num_send_contexts) + return NULL; + return pos; +} + +static void _pios_seq_stop(struct seq_file *s, void *v) +{ +} + +static int _pios_seq_show(struct seq_file *s, void *v) +{ + struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private; + struct hfi1_devdata *dd = dd_from_dev(ibd); + struct send_context_info *sci; + loff_t *spos = v; + loff_t i = *spos; + unsigned long flags; + + spin_lock_irqsave(&dd->sc_lock, flags); + sci = &dd->send_contexts[i]; + if (sci && sci->type != SC_USER && sci->allocated && sci->sc) + seqfile_dump_sci(s, i, sci); + spin_unlock_irqrestore(&dd->sc_lock, flags); + return 0; +} + +DEBUGFS_SEQ_FILE_OPS(pios); +DEBUGFS_SEQ_FILE_OPEN(pios) +DEBUGFS_FILE_OPS(pios); + /* read the per-device counters */ static ssize_t dev_counters_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -1143,6 +1191,7 @@ void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd) 
DEBUGFS_SEQ_FILE_CREATE(qp_stats, ibd->hfi1_ibdev_dbg, ibd); DEBUGFS_SEQ_FILE_CREATE(sdes, ibd->hfi1_ibdev_dbg, ibd); DEBUGFS_SEQ_FILE_CREATE(rcds, ibd->hfi1_ibdev_dbg, ibd); + DEBUGFS_SEQ_FILE_CREATE(pios, ibd->hfi1_ibdev_dbg, ibd); DEBUGFS_SEQ_FILE_CREATE(sdma_cpu_list, ibd->hfi1_ibdev_dbg, ibd); /* dev counter files */ for (i = 0; i < ARRAY_SIZE(cntr_ops); i++) diff --git a/drivers/infiniband/hw/hfi1/pio.c b/drivers/infiniband/hw/hfi1/pio.c index 9ab50d2..6d5d0d0 100644 --- a/drivers/infiniband/hw/hfi1/pio.c +++ b/drivers/infiniband/hw/hfi1/pio.c @@ -2137,3 +2137,28 @@ void free_credit_return(struct hfi1_devdata *dd) kfree(dd->cr_base); dd->cr_base = NULL; } + +void seqfile_dump_sci(struct seq_file *s, u32 i, + struct send_context_info *sci) +{ + struct send_context *sc = sci->sc; + u64 reg; + + seq_printf(s, "SCI %u: type %u base %u credits %u\n", + i, sci->type, sci->base, sci->credits); + seq_printf(s, " flags 0x%x sw_inx %u hw_ctxt %u grp %u\n", + sc->flags, sc->sw_index, sc->hw_context, sc->group); + seq_printf(s, " sr_size %u credits %u sr_head %u sr_tail %u\n", + sc->sr_size, sc->credits, sc->sr_head, sc->sr_tail); + seq_printf(s, " fill %lu free %lu fill_wrap %u alloc_free %lu\n", + sc->fill, sc->free, sc->fill_wrap, sc->alloc_free); + seq_printf(s, " credit_intr_count %u credit_ctrl 0x%llx\n", + sc->credit_intr_count, sc->credit_ctrl); + reg = read_kctxt_csr(sc->dd, sc->hw_context, SC(CREDIT_STATUS)); + seq_printf(s, " *hw_free %llu CurrentFree %llu LastReturned %llu\n", + (le64_to_cpu(*sc->hw_free) & CR_COUNTER_SMASK) >> + CR_COUNTER_SHIFT, + (reg >> SC(CREDIT_STATUS_CURRENT_FREE_COUNTER_SHIFT)) & + SC(CREDIT_STATUS_CURRENT_FREE_COUNTER_MASK), + reg & SC(CREDIT_STATUS_LAST_RETURNED_COUNTER_SMASK)); +} diff --git a/drivers/infiniband/hw/hfi1/pio.h b/drivers/infiniband/hw/hfi1/pio.h index aaf372c..bf1afb0 100644 --- a/drivers/infiniband/hw/hfi1/pio.h +++ b/drivers/infiniband/hw/hfi1/pio.h @@ -329,4 +329,7 @@ void seg_pio_copy_start(struct pio_buf *pbuf, 
u64 pbc, void seg_pio_copy_mid(struct pio_buf *pbuf, const void *from, size_t nbytes); void seg_pio_copy_end(struct pio_buf *pbuf); +void seqfile_dump_sci(struct seq_file *s, u32 i, + struct send_context_info *sci); + #endif /* _PIO_H */ From patchwork Wed Sep 26 17:59:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Dalessandro X-Patchwork-Id: 10616487 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F346F14BD for ; Wed, 26 Sep 2018 18:02:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E3F262B662 for ; Wed, 26 Sep 2018 18:02:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E24502B663; Wed, 26 Sep 2018 18:02:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B90522B675 for ; Wed, 26 Sep 2018 18:02:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726032AbeI0AQ6 (ORCPT ); Wed, 26 Sep 2018 20:16:58 -0400 Received: from mga05.intel.com ([192.55.52.43]:36718 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725848AbeI0AQ6 (ORCPT ); Wed, 26 Sep 2018 20:16:58 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 26 Sep 2018 10:59:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: 
E=Sophos;i="5.54,307,1534834800"; d="scan'208";a="265956969" Received: from scymds02.sc.intel.com ([10.82.195.37]) by fmsmga005.fm.intel.com with ESMTP; 26 Sep 2018 10:59:49 -0700 Received: from scvm10.sc.intel.com (scvm10.sc.intel.com [10.82.195.27]) by scymds02.sc.intel.com with ESMTP id w8QHxnOW019242; Wed, 26 Sep 2018 10:59:49 -0700 Received: from scvm10.sc.intel.com (localhost [127.0.0.1]) by scvm10.sc.intel.com with ESMTP id w8QHxnuM006313; Wed, 26 Sep 2018 10:59:49 -0700 Subject: [PATCH for-next 2/2] IB/hfi1: Add diagnostic debugfs interface From: Dennis Dalessandro To: jgg@ziepe.ca, dledford@redhat.com Cc: linux-rdma@vger.kernel.org, Mitko Haralanov , Ira Weiny Date: Wed, 26 Sep 2018 10:59:49 -0700 Message-ID: <20180926175944.2451.67497.stgit@scvm10.sc.intel.com> In-Reply-To: <20180926175835.2451.14284.stgit@scvm10.sc.intel.com> References: <20180926175835.2451.14284.stgit@scvm10.sc.intel.com> User-Agent: StGit/0.17.1-18-g2e886-dirty MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP

From: Mitko Haralanov

With previous versions of the HFI driver, user level tools which required
access to the device's BAR made use of the 'resourceXX' files provided by
the core kernel in sysfs. However, recent distro kernels have started
compiling and distributing their kernels with CONFIG_IO_STRICT_DEVMEM
turned on. With this option, the kernel prohibits access to device BARs
by more than one entity at the same time. This renders some debugging
tools unusable (or, at least, non-useful) since the device cannot be
operational at the same time.

The kernel does provide a workaround for this, namely the "iomem=relaxed"
boot option. However, this is not a viable workaround for some customers.

To solve this issue, a new debugfs interface is being added, which is
designed to handle all tools that need device BAR access.
It can also be expanded to handle other tools, which may require device
memory access.

A debugfs interface has been chosen because:
  * debugfs provides a level of protection as files under it are only
    accessible by the 'root' user,
  * debugfs is meant for exactly such debugging interfaces, and
  * debugfs files allow for a full set of file operation callbacks.
    This, in turn, allows for processing of custom IOCTL commands and
    memory mapping of appropriate device or kernel memory.

Expandability is achieved by allowing user level applications to set the
"service type" they require. The "service type" is a method that allows
the driver to limit access to the device and/or memory according to the
service the user application will be performing. This includes limiting
access to device memory through the read and write system calls and
providing proper mmap support.

Setting of "service type" along with querying the mappable area is done
through a set of newly defined IOCTLs in the RDMA ioctl space and HFI1
command set.

The new diag interface replaces two other debugfs files - "lcb" and
"dc8051_memory".
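
For illustration only, the sequence a user level tool might follow with
this interface can be sketched as below. This is a sketch under stated
assumptions, not part of the patch: the debugfs path is supplied by the
caller, the numeric service type values mirror enum diag_file_owner_type
from debugfs.c (which is not exported through uapi), RDMA_IOCTL_MAGIC is
assumed to be 0x1b as in the upstream uapi headers, and the helper names
diag_open() and diag_read64() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/ioctl.h>
#include <linux/types.h>

/* Assumed to match include/uapi/rdma/rdma_user_ioctl.h with this patch. */
#ifndef RDMA_IOCTL_MAGIC
#define RDMA_IOCTL_MAGIC 0x1b
#endif
#define HFI1_IOCTL_SET_DIAG_TYPE     _IOW(RDMA_IOCTL_MAGIC, 0xEF, __u32)
#define HFI1_IOCTL_GET_DIAG_MAP_SIZE _IOR(RDMA_IOCTL_MAGIC, 0xF0, __u32)

/* Mirrors enum diag_file_owner_type; the values are an assumption. */
enum { DIAG_SERVICE_NONE = 0, DIAG_SERVICE_DIAGS = 1, DIAG_SERVICE_FW = 2 };

/*
 * Hypothetical helper: open the diag file, select a service type and
 * query the mappable size. Returns an open fd, or -1 on failure (errno
 * is left as set by the failing call).
 */
int diag_open(const char *path, uint32_t service, uint32_t *map_size)
{
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return -1;
	if (ioctl(fd, HFI1_IOCTL_SET_DIAG_TYPE, &service) < 0 ||
	    ioctl(fd, HFI1_IOCTL_GET_DIAG_MAP_SIZE, map_size) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: read one 64-bit quantity. The driver rejects
 * offsets, lengths and user buffers that are not 8-byte aligned; a
 * uint64_t object normally satisfies the buffer alignment.
 * Returns 0 on success, -1 on a failed or short read.
 */
int diag_read64(int fd, off_t offset, uint64_t *val)
{
	ssize_t n = pread(fd, val, sizeof(*val), offset);

	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```

A tool would call diag_open(path, DIAG_SERVICE_DIAGS, &size) and then
either read and write 8-byte quantities at aligned offsets or mmap() up
to size bytes of the BAR.
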
Reviewed-by: Dennis Dalessandro Reviewed-by: Ira Weiny Signed-off-by: Mitko Haralanov Signed-off-by: Dennis Dalessandro --- drivers/infiniband/hw/hfi1/debugfs.c | 453 ++++++++++++++++++++++++++-------- include/uapi/rdma/rdma_user_ioctl.h | 4 2 files changed, 347 insertions(+), 110 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c index 0a55779..97a46fa 100644 --- a/drivers/infiniband/hw/hfi1/debugfs.c +++ b/drivers/infiniband/hw/hfi1/debugfs.c @@ -53,6 +53,7 @@ #include #include #include +#include #include "hfi.h" #include "trace.h" @@ -633,114 +634,6 @@ static ssize_t asic_flags_write(struct file *file, const char __user *buf, return ret; } -/* read the dc8051 memory */ -static ssize_t dc8051_memory_read(struct file *file, char __user *buf, - size_t count, loff_t *ppos) -{ - struct hfi1_pportdata *ppd = private2ppd(file); - ssize_t rval; - void *tmp; - loff_t start, end; - - /* the checks below expect the position to be positive */ - if (*ppos < 0) - return -EINVAL; - - tmp = kzalloc(DC8051_DATA_MEM_SIZE, GFP_KERNEL); - if (!tmp) - return -ENOMEM; - - /* - * Fill in the requested portion of the temporary buffer from the - * 8051 memory. The 8051 memory read is done in terms of 8 bytes. - * Adjust start and end to fit. Skip reading anything if out of - * range. 
- */ - start = *ppos & ~0x7; /* round down */ - if (start < DC8051_DATA_MEM_SIZE) { - end = (*ppos + count + 7) & ~0x7; /* round up */ - if (end > DC8051_DATA_MEM_SIZE) - end = DC8051_DATA_MEM_SIZE; - rval = read_8051_data(ppd->dd, start, end - start, - (u64 *)(tmp + start)); - if (rval) - goto done; - } - - rval = simple_read_from_buffer(buf, count, ppos, tmp, - DC8051_DATA_MEM_SIZE); -done: - kfree(tmp); - return rval; -} - -static ssize_t debugfs_lcb_read(struct file *file, char __user *buf, - size_t count, loff_t *ppos) -{ - struct hfi1_pportdata *ppd = private2ppd(file); - struct hfi1_devdata *dd = ppd->dd; - unsigned long total, csr_off; - u64 data; - - if (*ppos < 0) - return -EINVAL; - /* only read 8 byte quantities */ - if ((count % 8) != 0) - return -EINVAL; - /* offset must be 8-byte aligned */ - if ((*ppos % 8) != 0) - return -EINVAL; - /* do nothing if out of range or zero count */ - if (*ppos >= (LCB_END - LCB_START) || !count) - return 0; - /* reduce count if needed */ - if (*ppos + count > LCB_END - LCB_START) - count = (LCB_END - LCB_START) - *ppos; - - csr_off = LCB_START + *ppos; - for (total = 0; total < count; total += 8, csr_off += 8) { - if (read_lcb_csr(dd, csr_off, (u64 *)&data)) - break; /* failed */ - if (put_user(data, (unsigned long __user *)(buf + total))) - break; - } - *ppos += total; - return total; -} - -static ssize_t debugfs_lcb_write(struct file *file, const char __user *buf, - size_t count, loff_t *ppos) -{ - struct hfi1_pportdata *ppd = private2ppd(file); - struct hfi1_devdata *dd = ppd->dd; - unsigned long total, csr_off, data; - - if (*ppos < 0) - return -EINVAL; - /* only write 8 byte quantities */ - if ((count % 8) != 0) - return -EINVAL; - /* offset must be 8-byte aligned */ - if ((*ppos % 8) != 0) - return -EINVAL; - /* do nothing if out of range or zero count */ - if (*ppos >= (LCB_END - LCB_START) || !count) - return 0; - /* reduce count if needed */ - if (*ppos + count > LCB_END - LCB_START) - count = (LCB_END - 
LCB_START) - *ppos; - - csr_off = LCB_START + *ppos; - for (total = 0; total < count; total += 8, csr_off += 8) { - if (get_user(data, (unsigned long __user *)(buf + total))) - break; - if (write_lcb_csr(dd, csr_off, data)) - break; /* failed */ - } - *ppos += total; - return total; -} - /* * read the per-port QSFP data for ppd */ @@ -1120,8 +1013,6 @@ static int qsfp2_debugfs_release(struct inode *in, struct file *fp) DEBUGFS_XOPS("qsfp2", qsfp2_debugfs_read, qsfp2_debugfs_write, qsfp2_debugfs_open, qsfp2_debugfs_release), DEBUGFS_OPS("asic_flags", asic_flags_read, asic_flags_write), - DEBUGFS_OPS("dc8051_memory", dc8051_memory_read, NULL), - DEBUGFS_OPS("lcb", debugfs_lcb_read, debugfs_lcb_write), }; static void *_sdma_cpu_list_seq_start(struct seq_file *s, loff_t *pos) @@ -1157,10 +1048,349 @@ static int _sdma_cpu_list_seq_show(struct seq_file *s, void *v) return 0; } +enum diag_file_owner_type { + SERVICE_TYPE_NONE, + SERVICE_TYPE_DIAGS, + SERVICE_TYPE_FW, +}; + +struct diag_file_priv { + struct hfi1_devdata *dd; + pid_t owner; + enum diag_file_owner_type type; + u64 bar_size; +}; + +static int diag_file_open(struct inode *inode, struct file *file) +{ + struct diag_file_priv *priv; + + priv = kzalloc(sizeof(*priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + priv->owner = current->pid; + priv->dd = inode->i_private; + priv->type = SERVICE_TYPE_NONE; + priv->bar_size = pci_resource_len(priv->dd->pcidev, 0); + file->private_data = priv; + return 0; +} + +/* Must be called with debugfs file reference held. 
*/ +static int diag_setup_base(struct diag_file_priv *priv, loff_t offset, + size_t len, void __iomem **base) +{ + struct hfi1_devdata *dd = priv->dd; + + switch (priv->type) { + case SERVICE_TYPE_DIAGS: + /* must be in range */ + if (offset + len > priv->bar_size + DC8051_DATA_MEM_SIZE) + return -EINVAL; + if (offset < RCV_ARRAY) + *base = dd->kregbase1 + offset; + else if (offset < dd->base2_start) + *base = dd->rcvarray_wc + (offset - RCV_ARRAY); + else if (offset < TXE_PIO_SEND) + *base = dd->kregbase2 + (offset - dd->base2_start); + else if (offset < priv->bar_size) + *base = dd->piobase + (offset - TXE_PIO_SEND); + else if (offset < priv->bar_size + DC8051_DATA_MEM_SIZE) + *base = (void __iomem *)priv->bar_size; + else + return -EINVAL; + break; + case SERVICE_TYPE_FW: + /* + * Clamp the BAR area that is being accessed to only the + * region that is needed. + */ + if (offset >= ASIC_GPIO_OE && + offset + len <= ASIC + ASIC_EEP_DATA + 8) + *base = dd->kregbase1 + offset; + else + return -EINVAL; + break; + default: + return -EINVAL; + } + return 0; +} + +static ssize_t diag_file_read(struct file *file, char __user *buf, size_t len, + loff_t *off) +{ + struct diag_file_priv *priv; + struct hfi1_devdata *dd; + unsigned long total = 0; + void __iomem *base = NULL; + u64 data; + ssize_t ret = 0; + + if (*off < 0) + return -EINVAL; + /* only read 8 byte quantities */ + if ((len % 8) != 0) + return -EINVAL; + /* offset must be 8-byte aligned */ + if ((*off % 8) != 0) + return -EINVAL; + /* destination buffer must be 8-byte aligned */ + if ((unsigned long)buf % 8 != 0) + return -EINVAL; + + ret = debugfs_file_get(file->f_path.dentry); + if (unlikely(ret)) + return ret; + + priv = file->private_data; + ret = diag_setup_base(priv, *off, len, &base); + if (ret) + goto done; + + dd = priv->dd; + /* Special handling of DC8051 memory. 
*/ + if ((unsigned long)base == priv->bar_size) { + loff_t start, end, s_off; + + /* + * Fill in the requested portion of the temporary buffer from + * the 8051 memory. The 8051 memory read is done in terms of + * 8 bytes. Adjust start and end to fit. Skip reading anything + * if out of range. + */ + s_off = *off - priv->bar_size; + start = s_off & ~0x7; /* round down */ + if (start < DC8051_DATA_MEM_SIZE) { + void *tmp; + + tmp = kzalloc(DC8051_DATA_MEM_SIZE, GFP_KERNEL); + if (!tmp) { + ret = -ENOMEM; + goto done; + } + end = (s_off + len + 7) & ~0x7; /* round up */ + if (end > DC8051_DATA_MEM_SIZE) + end = DC8051_DATA_MEM_SIZE; + ret = read_8051_data(dd, start, end - start, + (u64 *)(tmp + start)); + if (ret) { + kfree(tmp); + goto done; + } + ret = simple_read_from_buffer(buf, len, &s_off, tmp, + DC8051_DATA_MEM_SIZE); + *off = s_off + priv->bar_size; + kfree(tmp); + } + goto done; + } + + for (total = 0; total < len; total += 8, *off += 8, base += 8) { + if (is_lcb_offset(*off)) { + if (read_lcb_csr(dd, *off, (u64 *)&data)) + break; + } + /* + * Cannot read ASIC GPIO/QSFP* clear and force CSRs without a + * false parity error. Avoid the whole issue by not reading + * them. These registers are defined as having a read value + * of 0. 
+ */ + else if (*off == ASIC_GPIO_CLEAR || + *off == ASIC_GPIO_FORCE || + *off == ASIC_QSFP1_CLEAR || + *off == ASIC_QSFP1_FORCE || + *off == ASIC_QSFP2_CLEAR || + *off == ASIC_QSFP2_FORCE) { + data = 0; + } else { + data = readq(base); + } + if (put_user(data, (unsigned long __user *)(buf + total))) + break; + } + ret = total; +done: + debugfs_file_put(file->f_path.dentry); + return ret; +} + +static ssize_t diag_file_write(struct file *file, const char __user *buf, + size_t len, loff_t *off) +{ + struct diag_file_priv *priv; + struct hfi1_devdata *dd; + unsigned long total = 0; + void __iomem *base = NULL; + u64 data; + bool in_lcb = false; + ssize_t ret; + + if (*off < 0) + return -EINVAL; + /* only write 8 byte quantities */ + if ((len % 8) != 0) + return -EINVAL; + /* offset must be 8-byte aligned */ + if ((*off % 8) != 0) + return -EINVAL; + /* source buffer must be 8-byte aligned */ + if ((unsigned long)buf % 8 != 0) + return -EINVAL; + + ret = debugfs_file_get(file->f_path.dentry); + if (unlikely(ret)) + return ret; + + priv = file->private_data; + ret = diag_setup_base(priv, *off, len, &base); + if (ret) + goto done; + + /* DC8051 memory cannot be written to. 
*/ + if ((unsigned long)base == priv->bar_size) { + ret = -EINVAL; + goto done; + } + + dd = priv->dd; + for (total = 0; total < len; total += 8, *off += 8, base += 8) { + if (get_user(data, (unsigned long __user *)(buf + total))) + break; + if (is_lcb_offset(*off)) { + if (!in_lcb) { + if (acquire_lcb_access(dd, 1)) + break; + in_lcb = true; + } + } else { + if (in_lcb) { + release_lcb_access(dd, 1); + in_lcb = false; + } + } + writeq(data, base); + } + if (in_lcb) + release_lcb_access(dd, 1); + ret = total; +done: + debugfs_file_put(file->f_path.dentry); + return ret; +} + +static long diag_file_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + struct diag_file_priv *priv; + enum diag_file_owner_type type; + u32 map_size = 0; + int ret; + + ret = debugfs_file_get(file->f_path.dentry); + if (unlikely(ret)) + return ret; + priv = file->private_data; + switch (cmd) { + case HFI1_IOCTL_SET_DIAG_TYPE: + if (get_user(type, (enum diag_file_owner_type __user *)arg)) { + ret = -EFAULT; + goto done; + } + priv->type = type; + break; + case HFI1_IOCTL_GET_DIAG_MAP_SIZE: + if (priv->type == SERVICE_TYPE_NONE) { + ret = -EINVAL; + goto done; + } + switch (priv->type) { + case SERVICE_TYPE_DIAGS: + map_size = priv->bar_size; + default: + break; + } + + if (put_user(map_size, (u32 __user *)arg)) { + ret = -EFAULT; + goto done; + } + } +done: + debugfs_file_put(file->f_path.dentry); + return ret; +} + +static loff_t diag_file_seek(struct file *file, loff_t offset, int whence) +{ + struct diag_file_priv *priv; + loff_t ret = 0; + + ret = debugfs_file_get(file->f_path.dentry); + if (unlikely(ret)) + return ret; + priv = file->private_data; + ret = fixed_size_llseek(file, offset, whence, priv->bar_size); + debugfs_file_put(file->f_path.dentry); + return ret; +} + +static int diag_file_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct diag_file_priv *priv; + struct hfi1_devdata *dd; + u64 addr, offset = vma->vm_pgoff << PAGE_SHIFT; + ssize_t length = 
vma->vm_end - vma->vm_start; + int ret; + + ret = debugfs_file_get(file->f_path.dentry); + if (unlikely(ret)) + return ret; + priv = file->private_data; + dd = priv->dd; + /* + * Currently, we only allow diags service type to mmap memory (the + * device BAR). However, this can be expanded to other service types + * allowing for customization of mappable memory. + */ + if (priv->type != SERVICE_TYPE_DIAGS) { + ret = -EINVAL; + goto done; + } + addr = dd->physaddr + offset; + + ret = io_remap_pfn_range(vma, vma->vm_start, PFN_DOWN(addr), + length, vma->vm_page_prot); +done: + debugfs_file_put(file->f_path.dentry); + return ret; +} + +static int diag_file_release(struct inode *inode, struct file *file) +{ + struct diag_file_priv *priv = file->private_data; + + file->private_data = NULL; + kfree(priv); + return 0; +} + DEBUGFS_SEQ_FILE_OPS(sdma_cpu_list); DEBUGFS_SEQ_FILE_OPEN(sdma_cpu_list) DEBUGFS_FILE_OPS(sdma_cpu_list); +static const struct file_operations diag_file_ops = { + .owner = THIS_MODULE, + .open = diag_file_open, + .read = diag_file_read, + .write = diag_file_write, + .unlocked_ioctl = diag_file_ioctl, + .llseek = diag_file_seek, + .mmap = diag_file_mmap, + .release = diag_file_release +}; + void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd) { char name[sizeof("port0counters") + 1]; @@ -1185,6 +1415,9 @@ void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd) pr_warn("create of %s symlink failed\n", name); return; } + if (!debugfs_create_file_unsafe("diag", 0600, ibd->hfi1_ibdev_dbg, dd, + &diag_file_ops)) + pr_warn("create of diag failed\n"); DEBUGFS_SEQ_FILE_CREATE(opcode_stats, ibd->hfi1_ibdev_dbg, ibd); DEBUGFS_SEQ_FILE_CREATE(tx_opcode_stats, ibd->hfi1_ibdev_dbg, ibd); DEBUGFS_SEQ_FILE_CREATE(ctx_stats, ibd->hfi1_ibdev_dbg, ibd); diff --git a/include/uapi/rdma/rdma_user_ioctl.h b/include/uapi/rdma/rdma_user_ioctl.h index d92d272..b411799 100644 --- a/include/uapi/rdma/rdma_user_ioctl.h +++ b/include/uapi/rdma/rdma_user_ioctl.h @@ -81,5 +81,9 @@ #define 
HFI1_IOCTL_TID_INVAL_READ _IOWR(RDMA_IOCTL_MAGIC, 0xED, struct hfi1_tid_info) /* get the version of the user cdev */ #define HFI1_IOCTL_GET_VERS _IOR(RDMA_IOCTL_MAGIC, 0xEE, int) +/* set diag service type */ +#define HFI1_IOCTL_SET_DIAG_TYPE _IOW(RDMA_IOCTL_MAGIC, 0xEF, __u32) +/* get diag mmap size */ +#define HFI1_IOCTL_GET_DIAG_MAP_SIZE _IOR(RDMA_IOCTL_MAGIC, 0xF0, __u32) #endif /* RDMA_USER_IOCTL_H */
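
As a side note on diag_file_read() in the patch above: the argument checks
and the 8-byte rounding applied to the DC8051 data memory window can be
modelled in plain C as below. This is a standalone sketch, not driver
code. The function names diag_args_ok() and diag_8051_window() are
hypothetical, and mem_size is a parameter because the value of
DC8051_DATA_MEM_SIZE is not shown in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Mirrors the checks at the top of diag_file_read()/diag_file_write():
 * non-negative offset, and length, offset and user buffer address all
 * multiples of 8. Returns 1 if the request would be accepted, else 0.
 */
int diag_args_ok(int64_t off, size_t len, uintptr_t buf_addr)
{
	return off >= 0 && (len % 8) == 0 && (off % 8) == 0 &&
	       (buf_addr % 8) == 0;
}

/*
 * Compute the 8-byte aligned window [*start, *end) that covers a request
 * of len bytes at offset s_off into the 8051 data memory, clamped to
 * mem_size. Returns 0 if the window starts out of range (nothing to
 * read, *end is left untouched), 1 otherwise.
 */
int diag_8051_window(int64_t s_off, size_t len, int64_t mem_size,
		     int64_t *start, int64_t *end)
{
	*start = s_off & ~(int64_t)0x7;			/* round down */
	if (*start >= mem_size)
		return 0;
	*end = (s_off + (int64_t)len + 7) & ~(int64_t)0x7;	/* round up */
	if (*end > mem_size)
		*end = mem_size;
	return 1;
}
```

For example, with a hypothetical mem_size of 4096, a 10-byte request at
offset 5 yields the window [0, 16), and a 100-byte request at offset 4090
is clamped to [4088, 4096).
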