Message ID | 20180622194752.11221-7-logang@deltatee.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Herbert Xu |
Headers | show |
Hi Logan, On Fri, Jun 22, 2018 at 4:47 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > Clean up the extra ifdefs which defined the wr_reg64 and rd_reg64 > functions in non-64bit cases in favour of the new common > io-64-nonatomic-lo-hi header. > > To be consistent with CAAM engine HW spec: in case of 64-bit registers, > irrespective of device endianness, the lower address should be read from > / written to first, followed by the upper address. Indeed the I/O > accessors in CAAM driver currently don't follow the spec, however this > is a good opportunity to fix the code. > > Signed-off-by: Logan Gunthorpe <logang@deltatee.com> > Reviewed-by: Horia Geantă <horia.geanta@nxp.com> > Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> > Cc: Dan Douglass <dan.douglass@nxp.com> > Cc: Herbert Xu <herbert@gondor.apana.org.au> > Cc: "David S. Miller" <davem@davemloft.net> This is now in linux-next as commit 46e4bf08f6388 and it breaks booting imx6 (32-bit ARM): [ 1.872473] caam 2100000.caam: Entropy delay = 3200 [ 1.938223] caam 2100000.caam: Instantiated RNG4 SH0 [ 1.998983] caam 2100000.caam: Instantiated RNG4 SH1 [ 2.004019] caam 2100000.caam: device ID = 0x0a16010000000000 (Era 4) [ 2.010484] caam 2100000.caam: job rings = 2, qi = 0 [ 2.027389] mmc1: queuing unknown CIS tuple 0x80 (7 bytes) [ 2.028867] caam algorithms registered in /proc/crypto [ 2.035925] mmc1: queuing unknown CIS tuple 0x80 (4 bytes) [ 2.041187] caam_jr 2101000.jr0: job ring error: irqstate: 00000103 [ 2.049878] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM [ 2.056591] Modules linked in: [ 2.059671] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.18.0-rc3-next-20180703 #484 [ 2.067338] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 2.073892] PC is at caam_jr_interrupt+0x120/0x12c [ 2.078702] LR is at vprintk_emit+0x228/0x43c [ 2.083069] pc : [<c075eb38>] lr : [<c01815ec>] psr: 60000193 [ 2.089344] sp : c1001d80 ip : c1001be0 fp : c1001da4 [ 2.094576] r10: c107ffe7 r9 : ec749e00 r8 : 0000012d [ 2.099810] r7 : c1001de0 r6 : c17ecc10 r5 : ec7a6010 r4 : 00000103 [ 2.106346] r3 : ba36048e r2 : ba36048e r1 : 00000001 r0 : 00000037 [ 2.112884] Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none [ 2.120116] Control: 10c5387d Table: 1000404a DAC: 00000051 [ 2.125873] Process swapper/0 (pid: 0, stack limit = 0x(ptrval)) [ 2.131890] Stack: (0xc1001d80 to 0xc1002000) [ 2.136262] 1d80: ec5b5f00 ec749e64 00000000 c1001de0 0000012d ec749e00 c1001ddc c1001da8 [ 2.144454] 1da0: c0183584 c075ea24 c0176ea4 c0176850 c1001dfc c1008908 ec749e64 ec749e00 [ 2.152644] 1dc0: 00000000 00000001 ec008400 f4000100 c1001e04 c1001de0 c018364c c0183504 [ 2.160834] 1de0: 00000000 ba36048e c0a87070 ec749e00 ec749e64 c1014ee4 c1001e24 c1001e08 [ 2.169025] 1e00: c01836e0 c0183628 ec749e00 ec749e64 c1014ee4 00000000 c1001e44 c1001e28 [ 2.177215] 1e20: c0187460 c01836ac c0f840a8 0000012d c1008a90 00000000 c1001e54 c1001e48 [ 2.185406] 1e40: c018264c c01873b0 c1001e7c c1001e58 c0182cdc c0182630 f400010c 000003eb [ 2.193597] 1e60: 000003ff 00000000 c1001eb8 c10280b0 c1001eb4 c1001e80 c047686c c0182c7c [ 2.201787] 1e80: ffffffff f4001100 00000000 c01091e8 20000013 ffffffff c1001eec 00000000 [ 2.209977] 1ea0: c1000000 c1008908 c1001f14 c1001eb8 c0101a30 c0476814 00000001 00000001 [ 2.218166] 1ec0: 00000000 c100bfc0 c1000000 c100892c 00000001 c100896c 00000000 c0f839b0 [ 2.226356] 1ee0: c1008908 c1001f14 c1001ed8 c1001f08 c0174ddc c01091e8 20000013 ffffffff [ 2.234545] 1f00: 00000051 00000000 c1001f24 c1001f18 c0a86e20 c01091cc c1001f64 c1001f28 [ 2.242735] 1f20: c01587dc c0a86e04 00000000 00000000 c1008900 ba36048e c1008908 000000c3 [ 2.250925] 1f40: 00000002 c1080480 c1008900 c1080480 c1008908 c0f66a4c c1001f74 c1001f68 [ 2.259114] 1f60: c0158c88 c015862c c1001f9c c1001f78 c0a7e898 c0158c74 00000000 00000000 [ 2.267304] 1f80: c0a7e78c c1001f90 c10804d8 ffffffff c1001ff4 c1001fa0 c0f00d68 c0a7e694 [ 2.275493] 1fa0: ffffffff ffffffff 00000000 c0f00718 00000000 efffcb40 00000000 c0f66a4c [ 2.283683] 1fc0: ba32168e 00000000 00000000 c0f00330 00000051 10c0387d 0000113c 18000000 [ 2.291873] 1fe0: 412fc09a 10c5387d 00000000 c1001ff8 00000000 c0f009e0 00000000 00000000 Any ideas on how to fix this issue?
On Tue, Jul 3, 2018 at 6:31 PM, Fabio Estevam <festevam@gmail.com> wrote: > Hi Logan, > > On Fri, Jun 22, 2018 at 4:47 PM, Logan Gunthorpe <logang@deltatee.com> wrote: >> Clean up the extra ifdefs which defined the wr_reg64 and rd_reg64 >> functions in non-64bit cases in favour of the new common >> io-64-nonatomic-lo-hi header. >> >> To be consistent with CAAM engine HW spec: in case of 64-bit registers, >> irrespective of device endianness, the lower address should be read from >> / written to first, followed by the upper address. Indeed the I/O >> accessors in CAAM driver currently don't follow the spec, however this >> is a good opportunity to fix the code. >> >> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> >> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> >> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> >> Cc: Dan Douglass <dan.douglass@nxp.com> >> Cc: Herbert Xu <herbert@gondor.apana.org.au> >> Cc: "David S. Miller" <davem@davemloft.net> > > This is now in linux-next as commit 46e4bf08f6388 and it breaks > booting imx6 (32-bit ARM): > Any ideas on how to fix this issue? Oops, first of all the header should be hi-lo instead of lo-hi. Does it fix it? Otherwise I didn't (briefly) see what can be the issue.
On 03/07/18 11:36 AM, Andy Shevchenko wrote: >> This is now in linux-next as commit 46e4bf08f6388 and it breaks >> booting imx6 (32-bit ARM): > > >> Any ideas on how to fix this issue? > > Oops, first of all the header should be hi-lo instead of lo-hi. > Does it fix it? > > Otherwise I didn't (briefly) see what can be the issue. I had to dig to find this: but Horia had said[1] that the HW was specified to use lo-hi and I think we ended up changing to that for this commit based on his feedback. This is also mentioned in the commit message. Besides that, it looks like we are hitting an undefined instruction. So my best guess is that we are somehow now calling a proper 64bit read when we should be calling two 32-bit reads. Fabio, can you send your configuration? Thanks, Logan [1] http://lkml.kernel.org/r/VI1PR0401MB259145C2DFDB5E4084EA5DFC98D20@VI1PR0401MB2591.eurprd04.prod.outlook.com
Hi Logan, On Tue, Jul 3, 2018 at 3:01 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > Besides that, it looks like we are hitting an undefined instruction. So > my best guess is that we are somehow now calling a proper 64bit read > when we should be calling two 32-bit reads. Yes, it seems that's the case. > Fabio, can you send your configuration? This is a imx6 board built with arch/arm/configs/imx_v6_v7_defconfig Thanks
On Tue, Jul 3, 2018 at 9:01 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > On 03/07/18 11:36 AM, Andy Shevchenko wrote: >>> This is now in linux-next as commit 46e4bf08f6388 and it breaks >>> booting imx6 (32-bit ARM): >>> Any ideas on how to fix this issue? >> >> Oops, first of all the header should be hi-lo instead of lo-hi. >> Does it fix it? >> >> Otherwise I didn't (briefly) see what can be the issue. > > I had to dig to find this: but Horia had said[1] that the HW was > specified to use lo-hi and I think we ended up changing to that for this > commit based on his feedback. This is also mentioned in the commit message. Apparently I missed that part of the discussion. > Besides that, it looks like we are hitting an undefined instruction. So > my best guess is that we are somehow now calling a proper 64bit read > when we should be calling two 32-bit reads. Which is a bit strange. For arm we have been using generic definitions. Only very few architectures use own defined macros. > > Fabio, can you send your configuration? > > Thanks, > > Logan > > [1] > http://lkml.kernel.org/r/VI1PR0401MB259145C2DFDB5E4084EA5DFC98D20@VI1PR0401MB2591.eurprd04.prod.outlook.com > >
On Tue, Jul 3, 2018 at 9:06 PM, Fabio Estevam <festevam@gmail.com> wrote: > Hi Logan, > > On Tue, Jul 3, 2018 at 3:01 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > >> Besides that, it looks like we are hitting an undefined instruction. So >> my best guess is that we are somehow now calling a proper 64bit read >> when we should be calling two 32-bit reads. > > Yes, it seems that's the case. > >> Fabio, can you send your configuration? > > This is a imx6 board built with arch/arm/configs/imx_v6_v7_defconfig Btw, what is the environment did you use to build it? and what is the environment / make variable you supply (like ARCH, CROSS_COMPILE, etc)?
On Tue, Jul 3, 2018 at 3:09 PM, Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > Btw, what is the environment did you use to build it? > and what is the environment / make variable you supply (like ARCH, > CROSS_COMPILE, etc)? I build it on a Dell laptop running Ubuntu 16.04 with the following variables exported: export ARCH=arm export CROSS_COMPILE=/usr/bin/arm-linux-gnueabi-
On 03/07/18 12:11 PM, Fabio Estevam wrote: > On Tue, Jul 3, 2018 at 3:09 PM, Andy Shevchenko > <andy.shevchenko@gmail.com> wrote: > >> Btw, what is the environment did you use to build it? >> and what is the environment / make variable you supply (like ARCH, >> CROSS_COMPILE, etc)? > > I build it on a Dell laptop running Ubuntu 16.04 with the following > variables exported: > > export ARCH=arm > export CROSS_COMPILE=/usr/bin/arm-linux-gnueabi- Ok, I'm at a bit of a loss... When I look at the assembly before and after the patch, it looks pretty much the same. Additionally, the function where the undefined exception occurs (caam_jr_interrupt()) doesn't make any use of wr_reg64 or rd_reg64 which is the only thing the patch changes... Also, it looks like the nonatomic headers are doing what they are supposed to on this arch and generating two 32-bit ios and not a 64 bit one. So I have no idea what's going on here... Are we sure this is the patch causing the problem? Did you bisect? Thanks, Logan
On Tue, Jul 3, 2018 at 3:47 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > Ok, I'm at a bit of a loss... When I look at the assembly before and > after the patch, it looks pretty much the same. Additionally, the > function where the undefined exception occurs (caam_jr_interrupt()) > doesn't make any use of wr_reg64 or rd_reg64 which is the only thing the > patch changes... > > Also, it looks like the nonatomic headers are doing what they are > supposed to on this arch and generating two 32-bit ios and not a 64 bit one. > > So I have no idea what's going on here... Are we sure this is the patch > causing the problem? Did you bisect? Yes, I am sure that 46e4bf08f6388ba748 is the one causing the kernel boot issue. If I revert 46e4bf08f6388ba748 in linux-next then I can boot mx6 just fine. kernelci.org reports also confirm the same: https://storage.kernelci.org/next/master/next-20180629/arm/imx_v6_v7_defconfig/lab-baylibre-seattle/boot-imx6q-wandboard.html Thanks
On Tue, Jul 3, 2018 at 10:40 PM, Fabio Estevam <festevam@gmail.com> wrote: > On Tue, Jul 3, 2018 at 3:47 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > >> Ok, I'm at a bit of a loss... When I look at the assembly before and >> after the patch, it looks pretty much the same. Additionally, the >> function where the undefined exception occurs (caam_jr_interrupt()) >> doesn't make any use of wr_reg64 or rd_reg64 which is the only thing the >> patch changes... >> >> Also, it looks like the nonatomic headers are doing what they are >> supposed to on this arch and generating two 32-bit ios and not a 64 bit one. >> >> So I have no idea what's going on here... Are we sure this is the patch >> causing the problem? Did you bisect? > > Yes, I am sure that 46e4bf08f6388ba748 is the one causing the kernel boot issue. > > If I revert 46e4bf08f6388ba748 in linux-next then I can boot mx6 just fine. > Interesting... > kernelci.org reports also confirm the same: > > https://storage.kernelci.org/next/master/next-20180629/arm/imx_v6_v7_defconfig/lab-baylibre-seattle/boot-imx6q-wandboard.html By the way, is there any URL which contains dmesg out of kernel with this commit reverted? (It would be even better if we would have 'ignore_loglevel ' there when CONFIG_CRYPTO_DEV_FSL_CAAM_DEBUG=y)
On Tue, Jul 3, 2018 at 4:58 PM, Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > By the way, is there any URL which contains dmesg out of kernel with > this commit reverted? Yes, here is the linux-next 20180626 dmesg (which is the last linux-next that does not contain 46e4bf08f63 and it boots fine) : https://storage.kernelci.org/next/master/next-20180626/arm/imx_v6_v7_defconfig/lab-baylibre-seattle/boot-imx6q-wandboard.html Thanks
On Tue, Jul 3, 2018 at 11:10 PM, Fabio Estevam <festevam@gmail.com> wrote: > On Tue, Jul 3, 2018 at 4:58 PM, Andy Shevchenko > <andy.shevchenko@gmail.com> wrote: > >> By the way, is there any URL which contains dmesg out of kernel with >> this commit reverted? > > Yes, here is the linux-next 20180626 dmesg (which is the last > linux-next that does not contain 46e4bf08f63 and it boots fine) : > https://storage.kernelci.org/next/master/next-20180626/arm/imx_v6_v7_defconfig/lab-baylibre-seattle/boot-imx6q-wandboard.html Thanks. And here just a wild guess, if you comment out a BUG() in IRQ handler, does it boot?
On Tue, Jul 3, 2018 at 11:40 PM, Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > On Tue, Jul 3, 2018 at 11:10 PM, Fabio Estevam <festevam@gmail.com> wrote: >> On Tue, Jul 3, 2018 at 4:58 PM, Andy Shevchenko >> <andy.shevchenko@gmail.com> wrote: >> >>> By the way, is there any URL which contains dmesg out of kernel with >>> this commit reverted? >> >> Yes, here is the linux-next 20180626 dmesg (which is the last >> linux-next that does not contain 46e4bf08f63 and it boots fine) : >> https://storage.kernelci.org/next/master/next-20180626/arm/imx_v6_v7_defconfig/lab-baylibre-seattle/boot-imx6q-wandboard.html > > Thanks. > > And here just a wild guess, if you comment out a BUG() in IRQ handler, > does it boot? One more thing, on the working version can you print the value of IRQ status register? Something like pr_info_ratelimited("%s %x\n", __func__, irqstatus);
On 7/3/2018 2:10 PM, Fabio Estevam wrote: > On Tue, Jul 3, 2018 at 4:58 PM, Andy Shevchenko > <andy.shevchenko@gmail.com> wrote: > >> By the way, is there any URL which contains dmesg out of kernel with >> this commit reverted? > > Yes, here is the linux-next 20180626 dmesg (which is the last > linux-next that does not contain 46e4bf08f63 and it boots fine) : > https://storage.kernelci.org/next/master/next-20180626/arm/imx_v6_v7_defconfig/lab-baylibre-seattle/boot-imx6q-wandboard.html Is this a virtual machine? And if so, any chance we can get an image and qemu command line to run it ourselves? Thanks, Logan
On 7/3/2018 1:40 PM, Fabio Estevam wrote: >> So I have no idea what's going on here... Are we sure this is the patch >> causing the problem? Did you bisect? > > Yes, I am sure that 46e4bf08f6388ba748 is the one causing the kernel boot issue. Also, it could be helpful if you can compile with CONFIG_DEBUG_INFO and use gdb to find out what line "caam_jr_interrupt+0x120" points to in your image. When I do it on my setup, it doesn't point to a sensible line possibly due to different compilers. Thanks, Logan
On Wed, Jul 4, 2018 at 12:20 AM, Logan Gunthorpe <logang@deltatee.com> wrote: > On 7/3/2018 1:40 PM, Fabio Estevam wrote: > Also, it could be helpful if you can compile with CONFIG_DEBUG_INFO and > use gdb to find out what line "caam_jr_interrupt+0x120" points to in > your image. It is an explicit call to BUG(). That's why we see wrong instruction trap.
On 03/07/18 04:21 PM, Andy Shevchenko wrote: > On Wed, Jul 4, 2018 at 12:20 AM, Logan Gunthorpe <logang@deltatee.com> wrote: >> On 7/3/2018 1:40 PM, Fabio Estevam wrote: > >> Also, it could be helpful if you can compile with CONFIG_DEBUG_INFO and >> use gdb to find out what line "caam_jr_interrupt+0x120" points to in >> your image. > > It is an explicit call to BUG(). > That's why we see wrong instruction trap. Oh. So some IO in some other place must have changed to trigger the BUG... Logan
On Tue, Jul 3, 2018 at 5:40 PM, Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > And here just a wild guess, if you comment out a BUG() in IRQ handler, > does it boot? Not really. It fails a bit later: [ 3.863231] caam 2100000.caam: Entropy delay = 3200 [ 3.929028] caam 2100000.caam: Instantiated RNG4 SH0 [ 3.989789] caam 2100000.caam: Instantiated RNG4 SH1 [ 3.994809] caam 2100000.caam: device ID = 0x0a16010000000000 (Era 4) [ 4.001313] caam 2100000.caam: job rings = 2, qi = 0 [ 4.019569] mmcblk1: p1 p2 [ 4.021007] caam algorithms registered in /proc/crypto [ 4.030633] caam_jr 2101000.jr0: job ring error: irqstate: 00000103 [ 4.037036] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM [ 4.043751] Modules linked in: [ 4.046829] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.18.0-rc3-next-20180703-00001-g62a05d4-dirty #498 [ 4.056317] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 4.062869] PC is at caam_jr_dequeue+0x240/0x2bc [ 4.067499] LR is at 0x1 [ 4.070042] pc : [<c075e6bc>] lr : [<00000001>] psr: 60000113 [ 4.076317] sp : c1001d58 ip : ec85c000 fp : c1001da4 [ 4.081549] r10: ec858010 r9 : ec85c000 r8 : 00000000 [ 4.086783] r7 : 000001ff r6 : 00000001 r5 : 00000000 r4 : 00000000 [ 4.093318] r3 : ffffffff r2 : 00000000 r1 : 00000000 r0 : 00000001 [ 4.099856] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 4.106999] Control: 10c5387d Table: 1000404a DAC: 00000051 [ 4.112754] Process swapper/0 (pid: 0, stack limit = 0x(ptrval)) [ 4.118771] Stack: (0xc1001d58 to 0xc1002000)
On Tue, Jul 3, 2018 at 6:16 PM, Logan Gunthorpe <logang@deltatee.com> wrote: > Is this a virtual machine? And if so, any chance we can get an image and > qemu command line to run it ourselves? This is a real imx6 development board. I have never tried it on qemu. Let me see if I can reproduce it on quemu.
diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h index 4fb91ba39c36..5826acd9194e 100644 --- a/drivers/crypto/caam/regs.h +++ b/drivers/crypto/caam/regs.h @@ -10,7 +10,7 @@ #include <linux/types.h> #include <linux/bitops.h> -#include <linux/io.h> +#include <linux/io-64-nonatomic-lo-hi.h> /* * Architecture-specific register access methods @@ -136,10 +136,9 @@ static inline void clrsetbits_32(void __iomem *reg, u32 clear, u32 set) * base + 0x0000 : least-significant 32 bits * base + 0x0004 : most-significant 32 bits */ -#ifdef CONFIG_64BIT static inline void wr_reg64(void __iomem *reg, u64 data) { - if (caam_little_end) + if (!caam_imx && caam_little_end) iowrite64(data, reg); else iowrite64be(data, reg); @@ -147,35 +146,12 @@ static inline void wr_reg64(void __iomem *reg, u64 data) static inline u64 rd_reg64(void __iomem *reg) { - if (caam_little_end) + if (!caam_imx && caam_little_end) return ioread64(reg); else return ioread64be(reg); } -#else /* CONFIG_64BIT */ -static inline void wr_reg64(void __iomem *reg, u64 data) -{ - if (!caam_imx && caam_little_end) { - wr_reg32((u32 __iomem *)(reg) + 1, data >> 32); - wr_reg32((u32 __iomem *)(reg), data); - } else { - wr_reg32((u32 __iomem *)(reg), data >> 32); - wr_reg32((u32 __iomem *)(reg) + 1, data); - } -} - -static inline u64 rd_reg64(void __iomem *reg) -{ - if (!caam_imx && caam_little_end) - return ((u64)rd_reg32((u32 __iomem *)(reg) + 1) << 32 | - (u64)rd_reg32((u32 __iomem *)(reg))); - - return ((u64)rd_reg32((u32 __iomem *)(reg)) << 32 | - (u64)rd_reg32((u32 __iomem *)(reg) + 1)); -} -#endif /* CONFIG_64BIT */ - static inline u64 cpu_to_caam_dma64(dma_addr_t value) { if (caam_imx)