From patchwork Thu Sep 29 23:35:28 2022
X-Patchwork-Submitter: Serge Semin
X-Patchwork-Id: 12994777
From: Serge Semin
To: Michal Simek, Borislav Petkov, Mauro Carvalho Chehab, Tony Luck,
 James Morse, Robert Richter
CC: Serge Semin, Serge Semin, Alexey Malahov, Michail Ivanov,
 Pavel Parkhomenko, Punnaiah Choudary Kalluri, Manish
 Narani, Dinh Nguyen
Subject: [PATCH RESEND v3 16/18] EDAC/synopsys: Add erroneous page-frame/offset reporting
Date: Fri, 30 Sep 2022 02:35:28 +0300
Message-ID: <20220929233530.13016-17-Sergey.Semin@baikalelectronics.ru>
In-Reply-To: <20220929233530.13016-1-Sergey.Semin@baikalelectronics.ru>
References: <20220929233530.13016-1-Sergey.Semin@baikalelectronics.ru>

Since the System/SDRAM address translation interface is now available, we
can use it to determine the system address causing the ECC faults. In
order to do so, first we need to add the backward
SDRAM->HIF->Application->System address translation procedures based on
the DW uMCTL2 DDRC DQ-bus config and the HIF/SDRAM mapping table retrieved
at the device probe stage. Secondly, for the sake of simplification, we
need to convert the snps_ecc_error_info structure to contain a
snps_sdram_addr structure instance, since the erroneous SDRAM address will
now participate in the address translation chain. Finally, we need to
perform the SDRAM->System address translation before passing the latter to
the edac_mc_handle_error() method.

Note the ECC address rank needs to be retrieved now too in order to
determine a correct system address. But the rank won't be passed to the
MCI core for now since the MCI device is registered with a single ranked
layer 0.

Signed-off-by: Serge Semin
---
 drivers/edac/synopsys_edac.c | 142 +++++++++++++++++++++++++++++------
 1 file changed, 118 insertions(+), 24 deletions(-)

diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c
index 26fd797d3f00..946b86149ff8 100644
--- a/drivers/edac/synopsys_edac.c
+++ b/drivers/edac/synopsys_edac.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -344,19 +345,13 @@ struct snps_sdram_addr {
 
 /**
  * struct snps_ecc_error_info - ECC error log information.
- * @row:	Row number.
- * @col:	Column number.
- * @bank:	Bank number.
- * @bankgrp:	Bank group number.
+ * @sdram:	SDRAM address.
  * @bitpos:	Bit position.
  * @data:	Data causing the error.
  * @syndrome:	Erroneous data syndrome.
  */
 struct snps_ecc_error_info {
-	u32 row;
-	u32 col;
-	u32 bank;
-	u32 bankgrp;
+	struct snps_sdram_addr sdram;
 	u32 bitpos;
 	u64 data;
 	u32 syndrome;
@@ -416,6 +411,21 @@ static void snps_map_app_to_hif(struct snps_edac_priv *priv,
 	*hif = app >> priv->info.dq_width;
 }
 
+/**
+ * snps_map_hif_to_app - Map HIF address to Application address.
+ * @priv:	DDR memory controller private instance data.
+ * @hif:	HIF address (source).
+ * @app:	Application address (destination).
+ *
+ * Backward HIF-to-App translation is just the opposite DQ-width-based
+ * shift operation.
+ */
+static void snps_map_hif_to_app(struct snps_edac_priv *priv,
+				u64 hif, u64 *app)
+{
+	*app = hif << priv->info.dq_width;
+}
+
 /**
  * snps_map_hif_to_sdram - Map HIF address to SDRAM address.
  * @priv:	DDR memory controller private instance data.
@@ -467,6 +477,58 @@ static void snps_map_hif_to_sdram(struct snps_edac_priv *priv,
 	}
 }
 
+/**
+ * snps_map_sdram_to_hif - Map SDRAM address to HIF address.
+ * @priv:	DDR memory controller private instance data.
+ * @sdram:	SDRAM address (source).
+ * @hif:	HIF address (destination).
+ *
+ * SDRAM-HIF address mapping is similar to the HIF-SDRAM mapping procedure, but
+ * we'll traverse each SDRAM rank/bank/column/row bit.
+ *
+ * Note the unmapped bits of the SDRAM address components will be just
+ * ignored. So make sure the source address is valid.
+ */
+static void snps_map_sdram_to_hif(struct snps_edac_priv *priv,
+				  struct snps_sdram_addr *sdram, u64 *hif)
+{
+	struct snps_hif_sdram_map *map = &priv->hif_sdram_map;
+	unsigned long addr;
+	int i;
+
+	*hif = 0;
+
+	addr = sdram->row;
+	for_each_set_bit(i, &addr, DDR_MAX_ROW_WIDTH) {
+		if (map->row[i] != DDR_ADDRMAP_UNUSED)
+			*hif |= BIT_ULL(map->row[i]);
+	}
+
+	addr = sdram->col;
+	for_each_set_bit(i, &addr, DDR_MAX_COL_WIDTH) {
+		if (map->col[i] != DDR_ADDRMAP_UNUSED)
+			*hif |= BIT_ULL(map->col[i]);
+	}
+
+	addr = sdram->bank;
+	for_each_set_bit(i, &addr, DDR_MAX_BANK_WIDTH) {
+		if (map->bank[i] != DDR_ADDRMAP_UNUSED)
+			*hif |= BIT_ULL(map->bank[i]);
+	}
+
+	addr = sdram->bankgrp;
+	for_each_set_bit(i, &addr, DDR_MAX_BANKGRP_WIDTH) {
+		if (map->bankgrp[i] != DDR_ADDRMAP_UNUSED)
+			*hif |= BIT_ULL(map->bankgrp[i]);
+	}
+
+	addr = sdram->rank;
+	for_each_set_bit(i, &addr, DDR_MAX_RANK_WIDTH) {
+		if (map->rank[i] != DDR_ADDRMAP_UNUSED)
+			*hif |= BIT_ULL(map->rank[i]);
+	}
+}
+
 /**
  * snps_map_sys_to_sdram - Map System address to SDRAM address.
  * @priv:	DDR memory controller private instance data.
@@ -488,6 +550,27 @@ static void snps_map_sys_to_sdram(struct snps_edac_priv *priv,
 	snps_map_hif_to_sdram(priv, hif, sdram);
 }
 
+/**
+ * snps_map_sdram_to_sys - Map SDRAM address to System address.
+ * @priv:	DDR memory controller private instance data.
+ * @sdram:	SDRAM address (source).
+ * @sys:	System address (destination).
+ *
+ * Perform a full mapping of the SDRAM address (row/column/bank/etc) to
+ * the system address specific to the controller system bus ports.
+ */
+static void snps_map_sdram_to_sys(struct snps_edac_priv *priv,
+				  struct snps_sdram_addr *sdram, dma_addr_t *sys)
+{
+	u64 app, hif;
+
+	snps_map_sdram_to_hif(priv, sdram, &hif);
+
+	snps_map_hif_to_app(priv, hif, &app);
+
+	*sys = app;
+}
+
 /**
  * snps_get_bitpos - Get DQ-bus corrected bit position.
  * @bitnum:	Bit number retrieved from the ECCSTAT.corrected_bit_num field.
@@ -540,12 +623,13 @@ static int snps_get_error_info(struct snps_edac_priv *priv)
 	p->ceinfo.bitpos = snps_get_bitpos(p->ceinfo.bitpos, priv->info.dq_width);
 
 	regval = readl(base + ECC_CEADDR0_OFST);
-	p->ceinfo.row = FIELD_GET(ECC_CEADDR0_ROW_MASK, regval);
+	p->ceinfo.sdram.rank = FIELD_GET(ECC_CEADDR0_RANK_MASK, regval);
+	p->ceinfo.sdram.row = FIELD_GET(ECC_CEADDR0_ROW_MASK, regval);
 
 	regval = readl(base + ECC_CEADDR1_OFST);
-	p->ceinfo.bank = FIELD_GET(ECC_CEADDR1_BANK_MASK, regval);
-	p->ceinfo.bankgrp = FIELD_GET(ECC_CEADDR1_BANKGRP_MASK, regval);
-	p->ceinfo.col = FIELD_GET(ECC_CEADDR1_COL_MASK, regval);
+	p->ceinfo.sdram.bankgrp = FIELD_GET(ECC_CEADDR1_BANKGRP_MASK, regval);
+	p->ceinfo.sdram.bank = FIELD_GET(ECC_CEADDR1_BANK_MASK, regval);
+	p->ceinfo.sdram.col = FIELD_GET(ECC_CEADDR1_COL_MASK, regval);
 
 	p->ceinfo.data = readl(base + ECC_CSYND0_OFST);
 	if (priv->info.dq_width == SNPS_DQ_64)
@@ -558,12 +642,13 @@ static int snps_get_error_info(struct snps_edac_priv *priv)
 		goto out;
 
 	regval = readl(base + ECC_UEADDR0_OFST);
-	p->ueinfo.row = FIELD_GET(ECC_CEADDR0_ROW_MASK, regval);
+	p->ueinfo.sdram.rank = FIELD_GET(ECC_CEADDR0_RANK_MASK, regval);
+	p->ueinfo.sdram.row = FIELD_GET(ECC_CEADDR0_ROW_MASK, regval);
 
 	regval = readl(base + ECC_UEADDR1_OFST);
-	p->ueinfo.bankgrp = FIELD_GET(ECC_CEADDR1_BANKGRP_MASK, regval);
-	p->ueinfo.bank = FIELD_GET(ECC_CEADDR1_BANK_MASK, regval);
-	p->ueinfo.col = FIELD_GET(ECC_CEADDR1_COL_MASK, regval);
+	p->ueinfo.sdram.bankgrp = FIELD_GET(ECC_CEADDR1_BANKGRP_MASK, regval);
+	p->ueinfo.sdram.bank = FIELD_GET(ECC_CEADDR1_BANK_MASK, regval);
+	p->ueinfo.sdram.col = FIELD_GET(ECC_CEADDR1_COL_MASK, regval);
 
 	p->ueinfo.data = readl(base + ECC_UESYND0_OFST);
 	if (priv->info.dq_width == SNPS_DQ_64)
@@ -595,17 +680,22 @@ static void snps_handle_error(struct mem_ctl_info *mci, struct snps_ecc_status *
 {
 	struct snps_edac_priv *priv = mci->pvt_info;
 	struct snps_ecc_error_info *pinf;
+	dma_addr_t sys;
 
 	if (p->ce_cnt) {
 		pinf = &p->ceinfo;
 
 		snprintf(priv->message, SNPS_EDAC_MSG_SIZE,
-			 "Row %d Col %d Bank %d Bank Group %d Bit %d Data 0x%08llx",
-			 pinf->row, pinf->col, pinf->bank, pinf->bankgrp,
+			 "Row %hu Col %hu Bank %hhu Bank Group %hhu Rank %hhu Bit %d Data 0x%08llx",
+			 pinf->sdram.row, pinf->sdram.col, pinf->sdram.bank,
+			 pinf->sdram.bankgrp, pinf->sdram.rank,
 			 pinf->bitpos, pinf->data);
 
-		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
-				     p->ce_cnt, 0, 0, pinf->syndrome, 0, 0, -1,
+		snps_map_sdram_to_sys(priv, &pinf->sdram, &sys);
+
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, p->ce_cnt,
+				     PHYS_PFN(sys), offset_in_page(sys),
+				     pinf->syndrome, 0, 0, -1,
 				     priv->message, "");
 	}
 
@@ -613,12 +703,16 @@ static void snps_handle_error(struct mem_ctl_info *mci, struct snps_ecc_status *
 		pinf = &p->ueinfo;
 
 		snprintf(priv->message, SNPS_EDAC_MSG_SIZE,
-			 "Row %d Col %d Bank %d Bank Group %d Data 0x%08llx",
-			 pinf->row, pinf->col, pinf->bank, pinf->bankgrp,
+			 "Row %hu Col %hu Bank %hhu Bank Group %hhu Rank %hhu Data 0x%08llx",
+			 pinf->sdram.row, pinf->sdram.col, pinf->sdram.bank,
+			 pinf->sdram.bankgrp, pinf->sdram.rank,
 			 pinf->data);
 
-		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
-				     p->ue_cnt, 0, 0, pinf->syndrome, 0, 0, -1,
+		snps_map_sdram_to_sys(priv, &pinf->sdram, &sys);
+
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, p->ue_cnt,
+				     PHYS_PFN(sys), offset_in_page(sys),
+				     pinf->syndrome, 0, 0, -1,
 				     priv->message, "");
 	}
 
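
Illustration only, not part of the patch: the backward SDRAM->HIF->System
chain and the page-frame/offset split can be sketched in plain userspace C
as below. Everything prefixed with ex_/EX_ is hypothetical and stands in
for the driver's structures and macros; the sketch assumes only row and
column components, 4 KiB pages, and that the Application address is used
directly as the System address, as snps_map_sdram_to_sys() does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's definitions (not the real values). */
#define EX_ADDRMAP_UNUSED	0xFF	/* SDRAM bit has no HIF counterpart */
#define EX_ROW_WIDTH		4	/* assumed row width, for brevity */
#define EX_COL_WIDTH		4	/* assumed column width, for brevity */
#define EX_PAGE_SHIFT		12	/* assumed 4 KiB pages */

struct ex_sdram_addr {
	uint16_t row;
	uint16_t col;
};

struct ex_hif_sdram_map {
	uint8_t row[EX_ROW_WIDTH];	/* HIF bit position of each row bit */
	uint8_t col[EX_COL_WIDTH];	/* HIF bit position of each column bit */
};

/* Scatter the set bits of one SDRAM component into their HIF positions. */
static uint64_t ex_scatter(uint32_t val, const uint8_t *map, unsigned int width)
{
	uint64_t hif = 0;
	unsigned int i;

	for (i = 0; i < width; i++) {
		/* Clear bits contribute nothing, unmapped bits are ignored. */
		if (!(val & (1u << i)) || map[i] == EX_ADDRMAP_UNUSED)
			continue;

		assert(map[i] < 64);
		hif |= 1ULL << map[i];
	}

	return hif;
}

/* SDRAM -> HIF -> Application; the result is taken as the System address. */
static uint64_t ex_sdram_to_sys(const struct ex_hif_sdram_map *map,
				const struct ex_sdram_addr *sdram,
				unsigned int dq_shift)
{
	uint64_t hif;

	hif = ex_scatter(sdram->row, map->row, EX_ROW_WIDTH) |
	      ex_scatter(sdram->col, map->col, EX_COL_WIDTH);

	/* Backward HIF-to-App translation: left shift by the DQ-bus width code. */
	return hif << dq_shift;
}

/* Userspace analogue of PHYS_PFN(): page frame number of the address. */
static uint64_t ex_pfn(uint64_t sys)
{
	return sys >> EX_PAGE_SHIFT;
}

/* Userspace analogue of offset_in_page(): byte offset within the page. */
static uint64_t ex_page_offset(uint64_t sys)
{
	return sys & ((1ULL << EX_PAGE_SHIFT) - 1);
}
```

With column bits mapped to HIF bits 0..3, row bits 0..2 mapped to HIF bits
10..12, row bit 3 left unmapped and a shift of 3, the SDRAM address
row=0xB/col=0x5 gives HIF 0xC05 and system address 0x6028, i.e. page frame
6 and in-page offset 0x28. Row bit 3 is silently dropped, which is why the
kernel-doc asks the caller to pass a valid source address.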